Re: [R] x[0]: Can '0' be made an allowed index in R?
Convert to character:

   goofy <- 1:9
   names(goofy) <- (-4):4
   goofy[as.character(0)]
   goofy["-2"]

sg

On 4/22/24 10:31 AM, Ebert,Timothy Aaron wrote:
You could have negative indices. There are two ways to do this. 1) Provide a large offset:

   Offset <- 30
   for (i in (-29):120) { print(df[i + Offset]) }

2) Use absolute values if all indices are negative:

   for (i in (-200):(-1)) { print(df[abs(i)]) }

Tim

-Original Message-
From: R-help On Behalf Of Peter Dalgaard via R-help
Sent: Monday, April 22, 2024 10:36 AM
To: Rolf Turner
Cc: R help project ; Hans W
Subject: Re: [R] x[0]: Can '0' be made an allowed index in R?
[External Email]

Heh. Did anyone bring up negative indices yet?

-pd

On 22 Apr 2024, at 10:46, Rolf Turner wrote:
See fortunes::fortune(36).
cheers, Rolf Turner
-- Honorary Research Fellow, Department of Statistics, University of Auckland
Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark. Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com
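A short sketch (an editorial addition, not from the thread) contrasting positional and name-based indexing. Positional indices must be positive (0 selects nothing), but names are arbitrary strings, so "0" and negative-looking names work fine:

```r
goofy <- 1:9
names(goofy) <- -4:4        # names are the strings "-4", "-3", ..., "0", ..., "4"
goofy[5]                    # positional indexing: the fifth element
goofy["0"]                  # name-based indexing: the element *named* "0" (same element)
goofy[as.character(-2:2)]   # a "window" of five elements around the name "0"
```

Note that goofy[0] with the integer 0 still returns a zero-length vector; only the character form selects by name.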
Re: [R] understanding predict.lm
Doh! Thanks very much.

sg

On 11/6/23 5:17 PM, John Fox wrote:
Dear Spencer,

You need the t distribution with the correct df, not the standard-normal distribution:

> pt(-z.confInt/2, df=13)
    1     2     3     4     5     6     7     8     9    10    11    12    13
0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025
> pt(-z.predInt/2, df=13)
    1     2     3     4     5     6     7     8     9    10    11    12    13
0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025

I hope this helps,
John
[R] understanding predict.lm
Hello, All:

I am unable to manually replicate predict.lm, specifically comparing se.fit with (fit[,3]-fit[,2]): I think their ratio should be 2*qnorm((1-level)/2), and that's not what I'm getting. Consider the following slight modification of the first example in help('predict.lm'):

   set.seed(1)
   x <- rnorm(15)
   y <- x + rnorm(15)
   predict(lm(y ~ x))
   new <- data.frame(x = seq(-3, 3, 0.5))
   predict(lm(y ~ x), new, se.fit = TRUE)
   pred.w.plim <- predict(lm(y ~ x), new, interval = "prediction", se.fit = TRUE)
   pred.w.clim <- predict(lm(y ~ x), new, interval = "confidence", se.fit = TRUE)
   (z.confInt <- with(pred.w.clim, (fit[,3]-fit[,2])/se.fit))
   pnorm(-z.confInt/2)
   s.pred <- sqrt(with(pred.w.plim, se.fit^2 + residual.scale^2))
   (z.predInt <- with(pred.w.plim, (fit[,3]-fit[,2])/s.pred))
   pnorm(-z.predInt/2)

** This gives me 0.01537207. I do not understand why it's not 0.025 with level = 0.95. Can someone help me understand this?

Thanks,
Spencer Graves
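For reference, a hedged sketch (an editorial addition, not part of the original post) showing that predict.lm's interval half-width is qt(), not qnorm(), times se.fit; here the residual df is 13 because the example fits 2 coefficients to 15 points:

```r
set.seed(1)
x <- rnorm(15)
y <- x + rnorm(15)
fit <- lm(y ~ x)
new <- data.frame(x = seq(-3, 3, 0.5))
cl <- predict(fit, new, interval = "confidence", se.fit = TRUE)
halfwidth <- (cl$fit[, "upr"] - cl$fit[, "lwr"]) / 2
# The ratio is qt(0.975, 13) at every new point, not qnorm(0.975):
range(halfwidth / cl$se.fit)
qt(0.975, df = fit$df.residual)
```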
Re: [R] simulating future observations from heteroscedastic fits
On 8/31/23 1:27 AM, Spencer Graves wrote:
Hello, All:
I want to simulate future observations from fits to heteroscedastic data. A simple example is as follows:

   (DF3_2 <- data.frame(y=c(1:3, 10*(1:3)), gp=factor(rep(1:2, e=3))))
   # I want to fit 4 models
   # and simulate future observations from all 4:
   fit11 <- lm(y~1, DF3_2)
   fit21 <- lm(y~gp, DF3_2)
   library(nlme)
   (fit12 <- lme(y~1, data=DF3_2, random=~1|gp))
   (fit22 <- lme(y~gp, data=DF3_2, random=~1|gp))
   library(lme4)
   (fit12r <- lmer(y~1+(1|gp), data=DF3_2, REML=FALSE))
   (fit22r <- lmer(y~gp+(1|gp), data=DF3_2, REML=FALSE))
   # I can simulate what I want for fit11 and fit21
   # as follows:
   simPred <- function(object, nSims=2){
     pred <- predict(object, DF3_2[6,], se.fit=TRUE, interval='prediction')
     with(pred, fit[1, 'fit'] + se.fit*rt(nSims, df))
   }
   simPred(fit11)
   simPred(fit21)
   # How can I do the same with either fit12 and fit22
   # or fit12r and fit22r?

I think I found it:

   simPred4 <- function(object, nSims=2){ # class(object) = "lmerMod"
     sim <- simulate(object, nsim=nSims, newdata=DF3_2[6,])
     sim
   }
   simPred4(fit12r)
   simPred4(fit22r)

Thanks,
Spencer Graves
[R] simulating future observations from heteroscedastic fits
Hello, All:

I want to simulate future observations from fits to heteroscedastic data. A simple example is as follows:

   (DF3_2 <- data.frame(y=c(1:3, 10*(1:3)), gp=factor(rep(1:2, e=3))))
   # I want to fit 4 models
   # and simulate future observations from all 4:
   fit11 <- lm(y~1, DF3_2)
   fit21 <- lm(y~gp, DF3_2)
   library(nlme)
   (fit12 <- lme(y~1, data=DF3_2, random=~1|gp))
   (fit22 <- lme(y~gp, data=DF3_2, random=~1|gp))
   library(lme4)
   (fit12r <- lmer(y~1+(1|gp), data=DF3_2, REML=FALSE))
   (fit22r <- lmer(y~gp+(1|gp), data=DF3_2, REML=FALSE))
   # I can simulate what I want for fit11 and fit21
   # as follows:
   simPred <- function(object, nSims=2){
     pred <- predict(object, DF3_2[6,], se.fit=TRUE, interval='prediction')
     with(pred, fit[1, 'fit'] + se.fit*rt(nSims, df))
   }
   simPred(fit11)
   simPred(fit21)
   # How can I do the same with either fit12 and fit22
   # or fit12r and fit22r?

Thanks,
Spencer Graves
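As a cross-check on the simPred() function above, base R's simulate() also generates parametric simulations from an lm fit. A sketch (an editorial addition, with the caveat that simulate.lm draws normal errors using the plug-in sigma estimate rather than t-scaled errors, so it understates prediction uncertainty relative to simPred):

```r
DF3_2 <- data.frame(y = c(1:3, 10*(1:3)), gp = factor(rep(1:2, each = 3)))
fit21 <- lm(y ~ gp, DF3_2)
set.seed(42)
# One column per simulated response vector, rows matching DF3_2:
simulate(fit21, nsim = 3)
```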
Re: [R] logLIk(lme(...))?
I found my problem. The following function gave llGp1, llGp2 and ll22:

   logLik_lm <- function(object){
     res <- resid(object)
     n <- length(res)
     s2MLE <- sum(res^2)/n
     lglk <- (-n/2)*(log(2*pi*s2MLE)+1)
     lglk
   }
   logLik(fitGp1)
   logLik(fitGp1) - logLik_lm(fitGp1)
   llGp1 - logLik_lm(fitGp1)
   logLik(fitGp2)
   logLik(fitGp2) - logLik_lm(fitGp2)
   llGp2 - logLik_lm(fitGp2)
   logLik(fit22)
   logLik(fit22) - logLik_lm(fit22)
   ll22 - logLik_lm(fit22)

These differences were all 0 to within roundoff error. That confirmed for me that I could safely compare logLik.lm and logLik.lme. What I thought should have been a linear operation wasn't. Please excuse the waste of your time.

Thanks,
Spencer Graves

On 8/29/23 11:15 AM, Spencer Graves wrote:
Hello, all:
I have a dataset with 2 groups. I want to estimate 2 means and 2 standard deviations. I naively think I should be able to use lme to do that, e.g., lme(y~gp, random=~1|gp, method='ML'). I think I should get the same answer as from lm(y~1, ...) within each level of group. I can get the same means, but I don't know how to extract the within-gp standard deviations, and the sum of logLik for the latter two does not equal the former.

TOY EXAMPLE:

   library(nlme)
   set.seed(1)
   lmePblm <- data.frame(y=c(rnorm(5, 1, 2), rnorm(5, 3, 5)),
                         gp=factor(rep(1:2, each=5)))
   fit22 <- lme(y~gp, lmePblm, random=~1|gp, method='ML')
   fitGp1 <- lm(y~1, lmePblm[lmePblm$gp==1, ])
   fitGp2 <- lm(y~1, lmePblm[lmePblm$gp==2, ])
   (ll22 <- logLik(fit22))
   (llGp1 <- logLik(fitGp1))
   (llGp2 <- logLik(fitGp2))
   # Why isn't (ll22 = llGp1+llGp2)?
   (ll22 - llGp1 - llGp2)
   # And secondarily, how can I get the residual standard deviations
   # within each gp from fit22?

Thanks,
Spencer Graves
[R] logLIk(lme(...))?
Hello, all:

I have a dataset with 2 groups. I want to estimate 2 means and 2 standard deviations. I naively think I should be able to use lme to do that, e.g., lme(y~gp, random=~1|gp, method='ML'). I think I should get the same answer as from lm(y~1, ...) within each level of group. I can get the same means, but I don't know how to extract the within-gp standard deviations, and the sum of logLik for the latter two does not equal the former.

TOY EXAMPLE:

   library(nlme)
   set.seed(1)
   lmePblm <- data.frame(y=c(rnorm(5, 1, 2), rnorm(5, 3, 5)),
                         gp=factor(rep(1:2, each=5)))
   fit22 <- lme(y~gp, lmePblm, random=~1|gp, method='ML')
   fitGp1 <- lm(y~1, lmePblm[lmePblm$gp==1, ])
   fitGp2 <- lm(y~1, lmePblm[lmePblm$gp==2, ])
   (ll22 <- logLik(fit22))
   (llGp1 <- logLik(fitGp1))
   (llGp2 <- logLik(fitGp2))
   # Why isn't (ll22 = llGp1+llGp2)?
   (ll22 - llGp1 - llGp2)
   # And secondarily, how can I get the residual standard deviations
   # within each gp from fit22?

Thanks,
Spencer Graves
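On the secondary question (two within-group residual SDs): a single-sigma lme fit estimates only one residual SD. One possible route, sketched here with nlme's varIdent variance structure (an editorial addition, not from the thread):

```r
library(nlme)
set.seed(1)
lmePblm <- data.frame(y = c(rnorm(5, 1, 2), rnorm(5, 3, 5)),
                      gp = factor(rep(1:2, each = 5)))
# gls with varIdent allows a separate residual SD for each level of gp:
fitV <- gls(y ~ gp, data = lmePblm,
            weights = varIdent(form = ~ 1 | gp), method = "ML")
fitV$sigma                                               # residual SD, reference group
coef(fitV$modelStruct$varStruct, unconstrained = FALSE)  # SD ratio for group 2
```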
Re: [R] Book Recommendation
   library(sos)
   (sql <- findFn('SQL'))

# This opened two tabs in the default browser on my computer just now. The second tab lists 298 different packages with help pages containing "SQL", sorted to put the package with the most matches first. The first tab lists 1900 different help pages, sorted to put the highest ranking package first.

   installPackages(sql)

# This installs the highest ranking packages, because the 'sos' code knows how to get more information about installed packages than about ones that are not installed. To get that information, I followed this "installPackages(sql)" with:

   sql

# Hadley Wickham is listed as an author on 14 of those packages. RSQL and RSQLite are numbers 22 and 50 on that list. RMySQL and RPostgreSQL are numbers 48 and 31, respectively.

If you are looking for course content, I suggest you consider using findFn with SQL and applications of greatest interest to your target audience.

Spencer Graves

p.s. DISCLAIMER: I'm the lead author and maintainer of the sos package.

On 8/28/23 1:48 PM, Hadley Wickham wrote:
You might find this chapter of R for Data Science helpful: https://r4ds.hadley.nz/databases
Hadley

On Tue, Aug 29, 2023 at 3:47 AM Stephen H. Dawson, DSL via R-help wrote:
Good Morning,
I am doing some research to develop a new course where I teach. I am looking for a book to use in the course content to teach accomplishing SQL in R. Does anyone know of a book on this topic to recommend for consideration?
Thank You,
--
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com
Re: [R] Book Recommendation
On 8/28/23 12:47 PM, Bert Gunter wrote:
I presume you are familiar with the RSQL and RSQLite packages and their vignettes.

Can't offer any help, but a point of clarification: When you say, "teach accomplishing SQL in R," do you explicitly mean using SQL syntax in R to manipulate data, or do you mean just doing SQL-like types of data manipulation in R? For the former, I assume you would be using the above-mentioned packages -- or perhaps others like them that I don't know about. For the latter, which I think would be subsumed under "data wrangling in R", there are tons of packages, tutorials, and books out there that one could search for under that rubric.

If neither of the above, further clarification might help you get a better answer.

Cheers,
Bert

On Mon, Aug 28, 2023 at 8:47 AM Stephen H. Dawson, DSL via R-help <r-help@r-project.org> wrote:
Good Morning,
I am doing some research to develop a new course where I teach. I am looking for a book to use in the course content to teach accomplishing SQL in R. Does anyone know of a book on this topic to recommend for consideration?
Thank You,
--
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com
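For the "SQL syntax in R" reading of the question, the standard entry point is the DBI interface with a backend such as RSQLite (mentioned above). A minimal sketch, assuming those packages are installed:

```r
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")   # in-memory database
dbWriteTable(con, "mtcars", mtcars)               # copy a data frame into a table
dbGetQuery(con, "SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl")
dbDisconnect(con)
```

The same DBI calls work unchanged against RMySQL or RPostgreSQL connections, which is part of why the interface is popular for teaching.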
Re: [R] Questions about R
Hi, Shaun:

I suggest you also check the Wikipedia article on "R (programming language)": https://en.wikipedia.org/wiki/R_(programming_language) That article has 115 numbered "References" that provide more information. You don't have to believe anything you read anywhere. However, you may know that almost anyone can change almost anything on Wikipedia. What stays tends to be written from a neutral point of view citing credible sources. Many Wikipedia articles contain sections on controversies, etc. I don't find anything like that in this article, with good reason.

Hope this helps.
Spencer Graves

On 8/17/23 11:48 AM, Bert Gunter wrote:
Incidentally, you might be interested in the banner shown when R starts up: "R is free software and comes with ABSOLUTELY NO WARRANTY." I believe this is standard for open source software (upon which a lot of organizations depend). In any case, that might be the most definitive and "official" answer you can get.
Bert

On Thu, Aug 17, 2023 at 9:17 AM Bert Gunter wrote:
This is a volunteer Help list for users of R, which is open source, so you can see all its code. I can answer no to your questions, unless you are using one of R's innumerable packages that interacts with the internet and to which the user may give personal information to enable the desired functionality (logins, etc.). But of course, how do you know that I am not some malevolent agent or organization wishing to mislead you for my own nefarious purposes?
Cheers,
Bert

On Thu, Aug 17, 2023 at 8:37 AM Shaun Parr wrote:
Hi there,
My name is Shaun and I work in an organisation where one of our users wishes to install the R software, and our process is to assess the safety of any software prior to authorisation. I can't seem to locate all the information that we require on the webpage, so could someone kindly advise me of the following information please?
1.
Please can you confirm what user information the software collects (e.g., name, password, e-mail address, any Personally Identifiable Information, etc.)?
2. If any is collected, please can you confirm whether the information collected by the software stays locally on the device or is transferred anywhere. If it is transferred, could you please advise where it is transferred to (e.g., your own servers, or a third-party data centre such as Amazon Web Services or Azure)?
3. Are there any third-party components installed within the software and, if so, are these also kept up-to-date?

If you could kindly advise this information, it would be really appreciated, thank you

Shaun
Re: [R] Off-topic: ChatGPT Code Interpreter
I don't know about ChatGPT, but Daniel Kahneman won the 2002 Nobel Memorial Prize in Economics,[1] even though he's not an economist, for his leadership in creating a new subfield in the intersection of human psychology and economics now called "behavioral economics".[2]

Then in 2009 Kahneman and Gary Klein published an article, "Conditions for intuitive expertise: a failure to disagree", which concluded that expert intuition is learned from frequent, rapid, high-quality feedback. People who do not learn from frequent, rapid, high-quality feedback can be beaten by simple heuristics developed by intelligent lay people.[3] That includes most professions, which Kahneman, Sibony, and Sunstein call "respect-experts". Kahneman, Sibony, and Sunstein further report that with a little data, a regression model can outperform a simple heuristic, and with massive amounts of data, artificial intelligence can outperform regression models.[4]

An extreme but real example of current reality was described in an article on "asylum roulette": among asylum judges in the same jurisdiction, with cases assigned at random, one judge approved 5 percent of cases while another approved 88 percent.[5] However, virtually all "respect-experts" are influenced in their judgements by time of day and whether their favorite sports team won or lost the previous day. That level of noise can be reduced dramatically by use of appropriate artificial intelligence.

Comments?
Spencer Graves

[1] https://en.wikipedia.org/wiki/Daniel_Kahneman
[2] https://en.wikipedia.org/wiki/Behavioral_economics
[3] https://www.researchgate.net/publication/26798603_Conditions_for_Intuitive_Expertise_A_Failure_to_Disagree
[4] Daniel Kahneman; Olivier Sibony; Cass Sunstein (2021). Noise: A Flaw in Human Judgment (Little, Brown and Company).
[5] https://en.wikipedia.org/wiki/Refugee_roulette

On 7/17/23 1:46 PM, Bert Gunter wrote:
This is an **off-topic** post about the subject line, that I thought might be of interest to the R Community. I hope this does not offend anyone.

The widely known ChatGPT software now offers what is called a "Code Interpreter," that, among other things, purports to do "data analysis." (Search for articles with details.) One quote, from the (online) NY Times, is: "Arvind Narayanan, a professor of computer science at Princeton University, cautioned that people should not become overly reliant on code interpreter for data analysis as A.I. still produces inaccurate results and misinformation. 'Appropriate data analysis requires just a lot of critical thinking about the data,' he said."

Amen. ... Maybe.

(As this is off-topic, if you wish to reply to me, probably better to do so privately.)

Cheers to all,
Bert
Re: [R] plot level, velocity, acceleration with one x axis
On 5/31/23 2:12 PM, Viechtbauer, Wolfgang (NP) wrote:
How about using the same 'mar' for all plots, but adding an outer margin?

   DAX <- EuStockMarkets[, 'DAX']
   DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
   colnames(DAX.) <- c("DAX", 'vel (%)', 'accel (%)')
   head(DAX.)
   par(mfrow=c(3,1), mar=c(1,4.5,0,2), oma=c(3,0,1,0))
   plot(DAX.[, 1], log='y', ylab='DAX', axes=FALSE)
   axis(2)
   box(col='grey')
   plot(DAX.[, 2], ylab='vel (%)', axes=FALSE)
   axis(2)
   box(col='grey')
   plot(DAX.[, 3], ylab='accel (%)', axes=FALSE)
   axis(2)
   box(col='grey')
   axis(1)

Best,
Wolfgang

That's exactly what I needed.

Thanks,
Spencer

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Spencer Graves
Sent: Wednesday, 31 May, 2023 17:45
To: Eric Berger
Cc: r-help
Subject: Re: [R] plot level, velocity, acceleration with one x axis

On 5/31/23 9:20 AM, Eric Berger wrote:
I sent you an updated response to deal with the redundant copies of the x-axis. Re-sending.

   par(mfrow=c(3,1))
   plot(DAX.[, 1], log='y', ylab='DAX', xaxt="n")
   plot(DAX.[, 2], ylab='vel (%)', xaxt="n")
   plot(DAX.[, 3], ylab='accel (%)')

I got that. The primary problem with that is that most of the vertical space is reserved for axis labels, whether they are printed or not. If I squeeze the vertical dimension of the plot, I get "figure margins too large". To control that, I need to set "mar" separately for each panel, and then the plot regions for each are not the same size. Using the "layout" function instead of "mfrow" is better, but I don't see how to make that work consistently without fixing the aspect ratio. There may be a way in the tidyverse, but I haven't found it yet. The only solution I've found so far that makes sense to me is to modify the code for plot.ts to accept a vector for the log argument, with the constraint that length(log) = either 1 or ncol(x), and returning invisibly an object that would make it feasible for a user to call axis(2, ...)
once for each vertical axis to handle cases where someone wanted a vertical scale different from linear and log. I'd want to make sure that lines.ts also works with this, because I want to add fits and predictions. Comments?

Thanks,
Spencer Graves

** With either of the following plots, if I adjust the aspect ratio by enlarging or reducing the vertical dimension of the plot, the relative sizes of the plot regions change.

   DAX <- EuStockMarkets[, 'DAX']
   DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
   colnames(DAX.) <- c("DAX", 'vel (%)', 'accel (%)')
   head(DAX.)
   plot(DAX., log='xy')

   op <- par(mfrow=c(3,1), mar=c(0, 4.1, 4.1, 2.1))
   plot(DAX.[, 1], log='y', ylab='DAX', axes=FALSE)
   axis(2)
   box(col='grey')
   par(mar=c(0, 4.1, 0, 2.1))
   plot(DAX.[, 2], ylab='vel (%)', axes=FALSE)
   axis(2)
   box(col='grey')
   par(mar=c(5.1, 4.1, 0, 2.1))
   plot(DAX.[, 3], ylab='accel (%)', axes=FALSE)
   axis(2)
   box(col='grey')
   axis(1)
   par(op)
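On the axis-labeling question discussed in this thread: one way to label a log-scaled panel in its original units is to plot with axes = FALSE and place the ticks yourself with pretty() and axis(). A sketch under the same three-panel outer-margin layout (an editorial addition, not from the thread):

```r
DAX <- EuStockMarkets[, "DAX"]
DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
colnames(DAX.) <- c("DAX", "vel (%)", "accel (%)")
op <- par(mfrow = c(3, 1), mar = c(1, 4.5, 0, 2), oma = c(3, 0, 1, 0))
plot(DAX.[, 1], log = "y", ylab = "DAX", axes = FALSE)
at1 <- pretty(range(DAX.[, 1]))   # tick positions chosen in the original units
axis(2, at = at1, labels = at1)   # labels stay in original units on the log scale
box(col = "grey")
plot(DAX.[, 2], ylab = "vel (%)", axes = FALSE); axis(2); box(col = "grey")
plot(DAX.[, 3], ylab = "accel (%)", axes = FALSE); axis(2); box(col = "grey")
axis(1)
par(op)
```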
Re: [R] plot level, velocity, acceleration with one x axis
On 5/31/23 9:20 AM, Eric Berger wrote:
I sent you an updated response to deal with the redundant copies of the x-axis. Re-sending.

   par(mfrow=c(3,1))
   plot(DAX.[, 1], log='y', ylab='DAX', xaxt="n")
   plot(DAX.[, 2], ylab='vel (%)', xaxt="n")
   plot(DAX.[, 3], ylab='accel (%)')

I got that. The primary problem with that is that most of the vertical space is reserved for axis labels, whether they are printed or not. If I squeeze the vertical dimension of the plot, I get "figure margins too large". To control that, I need to set "mar" separately for each panel, and then the plot regions for each are not the same size. Using the "layout" function instead of "mfrow" is better, but I don't see how to make that work consistently without fixing the aspect ratio. There may be a way in the tidyverse, but I haven't found it yet. The only solution I've found so far that makes sense to me is to modify the code for plot.ts to accept a vector for the log argument, with the constraint that length(log) = either 1 or ncol(x), and returning invisibly an object that would make it feasible for a user to call axis(2, ...) once for each vertical axis to handle cases where someone wanted a vertical scale different from linear and log. I'd want to make sure that lines.ts also works with this, because I want to add fits and predictions. Comments?

Thanks,
Spencer Graves

** With either of the following plots, if I adjust the aspect ratio by enlarging or reducing the vertical dimension of the plot, the relative sizes of the plot regions change.

   DAX <- EuStockMarkets[, 'DAX']
   DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
   colnames(DAX.) <- c("DAX", 'vel (%)', 'accel (%)')
   head(DAX.)
   plot(DAX., log='xy')

   op <- par(mfrow=c(3,1), mar=c(0, 4.1, 4.1, 2.1))
   plot(DAX.[, 1], log='y', ylab='DAX', axes=FALSE)
   axis(2)
   box(col='grey')
   par(mar=c(0, 4.1, 0, 2.1))
   plot(DAX.[, 2], ylab='vel (%)', axes=FALSE)
   axis(2)
   box(col='grey')
   par(mar=c(5.1, 4.1, 0, 2.1))
   plot(DAX.[, 3], ylab='accel (%)', axes=FALSE)
   axis(2)
   box(col='grey')
   axis(1)
   par(op)

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.7.7

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.3.0 tools_4.3.0 rstudioapi_0.14

On Wed, May 31, 2023 at 4:27 PM Spencer Graves wrote:
On 5/30/23 10:23 AM, Eric Berger wrote:
What if you just precede these commands as follows:

   par(mfrow=c(3,1))
   plot(DAX.[, 1], log='y', ylab='DAX')
   plot(DAX.[, 2], ylab='vel (%)')
   plot(DAX.[, 3], ylab='accel (%)')

Most of the space is consumed with two extraneous copies of the axis. We can get around that with three calls to par:

   op <- par(mfrow=c(3,1), mar=c(0, 4.1, 4.1, 2.1))
   plot(DAX.[, 1], log='y', ylab='DAX')
   par(mar=c(0, 4.1, 0, 2.1))
   plot(DAX.[, 2], ylab='vel (%)')
   par(mar=c(5.1, 4.1, 0, 2.1))
   plot(DAX.[, 3], ylab='accel (%)')
   par(op)

However, the three panels are NOT equal in size: roughly 30% vs. 44% vs. 26%. I can get closer using layout, but if I change the aspect ratio, it changes the relationship between the heights of the three panels. That's the problem I'm trying to solve.
It's also why it makes sense to me to modify plot.ts to accept a vector for the log argument, with the constraint that length(log) = either 1 or ncol(x). There may be a way to do it using ggplot2 / the tidyverse, but I'm not facile with that, and my web searches have so far failed to produce anything better than modifying plot.ts.R (and then submitting such with compatible changes to plot.ts.Rd), as I suggested earlier. ???

Thanks,
Spencer

On Tue, May 30, 2023 at 5:45 PM Spencer Graves wrote:
On 5/30/23 8:48 AM, Eric Berger wrote:
I am a bit confused as to what you are trying to achieve - and even if I could guess, it is not clear what the interpretation would be.

   head(DAX)
   1628.75 1613.63 1606.51 1621.04 1618.16 1610.61

Including the leading NA's, what would be the 6 leading terms of the 3 series that you want to plot, and what would be the Y labels that you want to appear at those levels (assuming that there was a Y label for each of them - just to understand the units you are talking about)

   DAX <- EuStockMarkets[, 'DAX']
   DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
   colnames(DAX.) <- c("DAX", 'vel (%)', 'accel (%)')
Re: [R] plot level, velocity, acceleration with one x axis
On 5/30/23 10:23 AM, Eric Berger wrote:
What if you just precede these commands as follows:

   par(mfrow=c(3,1))
   plot(DAX.[, 1], log='y', ylab='DAX')
   plot(DAX.[, 2], ylab='vel (%)')
   plot(DAX.[, 3], ylab='accel (%)')

Most of the space is consumed with two extraneous copies of the axis. We can get around that with three calls to par:

   op <- par(mfrow=c(3,1), mar=c(0, 4.1, 4.1, 2.1))
   plot(DAX.[, 1], log='y', ylab='DAX')
   par(mar=c(0, 4.1, 0, 2.1))
   plot(DAX.[, 2], ylab='vel (%)')
   par(mar=c(5.1, 4.1, 0, 2.1))
   plot(DAX.[, 3], ylab='accel (%)')
   par(op)

However, the three panels are NOT equal in size: roughly 30% vs. 44% vs. 26%. I can get closer using layout, but if I change the aspect ratio, it changes the relationship between the heights of the three panels. That's the problem I'm trying to solve.

It's also why it makes sense to me to modify plot.ts to accept a vector for the log argument, with the constraint that length(log) = either 1 or ncol(x). There may be a way to do it using ggplot2 / the tidyverse, but I'm not facile with that, and my web searches have so far failed to produce anything better than modifying plot.ts.R (and then submitting such with compatible changes to plot.ts.Rd), as I suggested earlier. ???

Thanks,
Spencer

On Tue, May 30, 2023 at 5:45 PM Spencer Graves wrote:
On 5/30/23 8:48 AM, Eric Berger wrote:
I am a bit confused as to what you are trying to achieve - and even if I could guess, it is not clear what the interpretation would be.

   head(DAX)
   1628.75 1613.63 1606.51 1621.04 1618.16 1610.61

Including the leading NA's, what would be the 6 leading terms of the 3 series that you want to plot, and what would be the Y labels that you want to appear at those levels (assuming that there was a Y label for each of them - just to understand the units you are talking about)

   DAX <- EuStockMarkets[, 'DAX']
   DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
   colnames(DAX.) <- c("DAX", 'vel (%)', 'accel (%)')
   head(DAX.)
DAX exhibits growth that is roughly exponential, so I want to plot it on a log scale: plot(DAX.[, 1], log='y', ylab='DAX') plot(DAX.[, 2], ylab='vel (%)') plot(DAX.[, 3], ylab='accel (%)') This is what I want as three panels of a single plot. I think I could get it by modifying the code for plot.ts so it accepted ylab as a vector, etc., as I previously mentioned. What do you think? Thanks, Spencer Graves On Tue, May 30, 2023 at 4:06 PM Spencer Graves wrote: On 5/30/23 6:16 AM, Eric Berger wrote: My code assumes that DAX is a ts object, as in your original post. On Tue, May 30, 2023 at 2:06 PM Eric Berger wrote: Untested but why not a <- cbind(log(DAX), exp(diff(log(DAX))), exp(diff(diff(log(DAX) colnames(a) <- c("logDAX", "vel", "accel") plot(a) Progress, but we're not there yet. a <- cbind(DAX, exp(diff(log(DAX))), exp(diff(diff(log(DAX) colnames(a) <- c("logDAX", "vel", "accel") plot(a) plot(a, axes=FALSE, log='y') axis(1) axis(2) How do I get each y axis labeled in its original units? I can use pretty to get where I want tick marks, but I don't know where to place them "at" in calling axis(2, at= ___)? (axlb1 <- pretty(range(a[, 1]))) (axlb2 <- pretty(range(log(a[, 2]), na.rm=TRUE))) (axlb3 <- pretty(range(log(a[, 3]), na.rm=TRUE))) This suggests I write my own modification of plot.ts that accepts log as a character vector of length = ncol of the ts being plotted and returns invisibly a list with the default "at" and "label" arguments required to produce the default labeling. Then a user who wants a log scale for some but not all variables can get that easily and can further modify any of those scales further if they don't like the default. ??? Thanks very much. 
Spencer Graves On Tue, May 30, 2023 at 1:46 PM Spencer Graves wrote: On 5/29/23 2:37 AM, Eric Berger wrote: How about this: a <- cbind(AirPassengers, diff(log(AirPassengers)), diff(diff(log(AirPassengers colnames(a)[2:3] <- c("percent increase", "acceleration") plot(a, xlab="year", main="AirPassengers") My real problem is more difficult: I'm analyzing CO2 data from Our World in Data (https://ourworldindata.org/co2-emissions), and I need to plot the CO2 data on a log scale but velocity and acceleration on linear scales. The following is comparable: str(DAX <- EuStockMarkets[, 'DAX']) str(DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX) colnames(DAX.)[2:3] <- c('vel', 'accel') plot(DAX.) I want the first of the three panels to plot on the log scale, but the other two on linear scales. Th
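A resize-stable alternative for the unequal-panel problem above (a sketch, not from the thread): under par(mfrow=c(3,1)) each figure region is exactly one third of the device, so giving every panel *identical* inner margins keeps the three plot regions equal no matter how the window is stretched, at the cost of small gaps between panels; the shared title and x label move to the outer margin. DAX. is rebuilt here so the sketch is self-contained.

```r
## Rebuild the series from the thread (EuStockMarkets ships with R).
DAX  <- EuStockMarkets[, "DAX"]
DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
colnames(DAX.) <- c("DAX", "vel (%)", "accel (%)")

## Identical 'mar' for every panel => equal plot regions under mfrow,
## stable under resizing; shared annotation goes in the outer margin.
op <- par(mfrow = c(3, 1),
          mar   = c(2, 4.1, 1, 2.1),   # same margins for each panel
          oma   = c(3, 0, 3, 0))       # outer room for title / x label
plot(DAX.[, 1], log = "y", ylab = "DAX")
plot(DAX.[, 2], ylab = "vel (%)")
plot(DAX.[, 3], ylab = "accel (%)")
mtext("DAX: level, velocity, acceleration", side = 3, outer = TRUE, line = 1)
mtext("year", side = 1, outer = TRUE, line = 1)
par(op)
```

The trade-off versus the zero-margin approach in the thread: each panel keeps a thin bottom/top margin, so the panels no longer touch, but their plot regions stay equal at any aspect ratio.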
Re: [R] plot level, velocity, acceleration with one x axis
On 5/30/23 8:48 AM, Eric Berger wrote: I am a bit confused as to what you are trying to achieve - and even if I could guess, it is not clear what the interpretation would be.

head(DAX)
1628.75 1613.63 1606.51 1621.04 1618.16 1610.61

Including the leading NAs, what would be the 6 leading terms of the 3 series that you want to plot, and what would be the Y labels that you want to appear at those levels (assuming that there was a Y label for each of them - just to understand the units you are talking about)?

DAX <- EuStockMarkets[, 'DAX']
DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
colnames(DAX.) <- c('DAX', 'vel (%)', 'accel (%)')
head(DAX.)

DAX exhibits growth that is roughly exponential, so I want to plot it on a log scale:

plot(DAX.[, 1], log='y', ylab='DAX')
plot(DAX.[, 2], ylab='vel (%)')
plot(DAX.[, 3], ylab='accel (%)')

This is what I want as three panels of a single plot. I think I could get it by modifying the code for plot.ts so it accepted ylab as a vector, etc., as I previously mentioned. What do you think?

Thanks, Spencer Graves

[Earlier quoted messages trimmed; the full messages appear below.]
Re: [R] plot level, velocity, acceleration with one x axis
On 5/30/23 6:16 AM, Eric Berger wrote: My code assumes that DAX is a ts object, as in your original post. On Tue, May 30, 2023 at 2:06 PM Eric Berger wrote: Untested, but why not

a <- cbind(log(DAX), exp(diff(log(DAX))), exp(diff(diff(log(DAX)))))
colnames(a) <- c("logDAX", "vel", "accel")
plot(a)

Progress, but we're not there yet.

a <- cbind(DAX, exp(diff(log(DAX))), exp(diff(diff(log(DAX)))))
colnames(a) <- c("logDAX", "vel", "accel")
plot(a)
plot(a, axes=FALSE, log='y')
axis(1)
axis(2)

How do I get each y axis labeled in its original units? I can use pretty to get where I want tick marks, but I don't know where to place them "at" in calling axis(2, at= ___):

(axlb1 <- pretty(range(a[, 1])))
(axlb2 <- pretty(range(log(a[, 2]), na.rm=TRUE)))
(axlb3 <- pretty(range(log(a[, 3]), na.rm=TRUE)))

This suggests I write my own modification of plot.ts that accepts log as a character vector of length equal to the ncol of the ts being plotted and returns invisibly a list with the default "at" and "label" arguments required to produce the default labeling. Then a user who wants a log scale for some but not all variables can get that easily and can modify any of those scales further if they don't like the default. ???

Thanks very much.

Spencer Graves

[Earlier quoted messages trimmed; the full messages appear below.]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot level, velocity, acceleration with one x axis
On 5/29/23 2:37 AM, Eric Berger wrote: How about this:

a <- cbind(AirPassengers, diff(log(AirPassengers)), diff(diff(log(AirPassengers))))
colnames(a)[2:3] <- c("percent increase", "acceleration")
plot(a, xlab="year", main="AirPassengers")

My real problem is more difficult: I'm analyzing CO2 data from Our World in Data (https://ourworldindata.org/co2-emissions), and I need to plot the CO2 data on a log scale but velocity and acceleration on linear scales. The following is comparable:

str(DAX <- EuStockMarkets[, 'DAX'])
str(DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX)))))
colnames(DAX.)[2:3] <- c('vel', 'accel')
plot(DAX.)

I want the first of the three panels to plot on the log scale, but the other two on linear scales. The obvious attempt does not work:

plot(DAX., log=c('y', '', ''))
#Error in length(log) && log != "" :
#  'length = 3' in coercion to 'logical(1)'

Trying to construct my own axes isn't easy, either:

str(logDAX <- cbind(log(DAX), diff(log(DAX)), diff(diff(log(DAX)))))
colnames(logDAX) <- c('logDAX', 'vel', 'accel')
plot(logDAX, axes=FALSE)
axis(1)
axis(2)

I'm thinking of creating my own copy of plot.ts and changing it so it accepts the log argument as a vector of length equal to the ncol of the ts object to be plotted AND returning an object that would allow a user to call axis ncol times. Suggestions?

Thanks, Spencer Graves

[Earlier quoted messages trimmed; the original post appears below.]
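The plot.ts change discussed in this message can be prototyped without touching plot.ts itself. The helper below is hypothetical (name and interface invented here for illustration, not part of any package): it recycles log to one entry per column, draws one panel per column with identical margins, and invisibly returns pretty tick positions per column so a caller could rebuild axes, roughly as proposed above.

```r
## Hypothetical sketch of the proposal: a per-column 'log' for an mts.
## Not a drop-in replacement for plot.ts; the name is invented here.
plotWithLog <- function(x, log = "", ylab = colnames(x), xlab = "time") {
  k   <- ncol(x)
  log <- rep(log, length.out = k)        # recycle: length 1 or ncol(x)
  op  <- par(mfrow = c(k, 1), mar = c(2, 4.1, 1, 2.1), oma = c(3, 0, 2, 0))
  on.exit(par(op))
  at <- vector("list", k)
  for (j in seq_len(k)) {
    plot(x[, j], log = log[j], ylab = ylab[j])
    at[[j]] <- pretty(range(x[, j], na.rm = TRUE))  # default tick positions
  }
  mtext(xlab, side = 1, outer = TRUE, line = 1)
  invisible(at)                          # so callers can redraw axes
}

DAX  <- EuStockMarkets[, "DAX"]
DAX. <- cbind(DAX, diff(log(DAX)), diff(diff(log(DAX))))
colnames(DAX.) <- c("DAX", "vel", "accel")
ticks <- plotWithLog(DAX., log = c("y", "", ""))
```

This gives the first panel a log y axis and the others linear axes with one shared x axis, which is the behavior plot(DAX., log=c('y','','')) refuses to provide.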
Re: [R] plot level, velocity, acceleration with one x axis
On 5/29/23 2:37 AM, Eric Berger wrote: How about this:

a <- cbind(AirPassengers, diff(log(AirPassengers)), diff(diff(log(AirPassengers))))
colnames(a)[2:3] <- c("percent increase", "acceleration")
plot(a, xlab="year", main="AirPassengers")

That's it. Thanks. sg

[Quoted text of the original post trimmed; it appears below.]
[R] plot level, velocity, acceleration with one x axis
Hello, All:

I want to plot level, velocity, and acceleration in three panels with only one x axis. The code below does this using layout. However, I want the three plot areas to be of equal size, and this won't do that: if I stretch the plot vertically, the relative sizes of the three panels change. There's probably a way to do this with ggplot2, but I have yet to find it. Suggestions?

Thanks, Spencer Graves

str(AirTime <- as.numeric(time(AirPassengers)))
str(AP <- as.numeric(AirPassengers))
def.par <- par(no.readonly = TRUE) # save default, for resetting...
(mat3x1 <- matrix(1:3, 3))
plot3x1 <- layout(mat3x1, heights=c(1.4, 1, 1.5))
layout.show(plot3x1)
par(mar=c(0, 4.1, 4.1, 2.1))
plot(AirTime, AP, log='y', type='l', axes=FALSE,
     main='AirPassengers', ylab='AirPassengers')
box(col='grey')
axis(2, las=1)
par(mar=c(0, 4.1, 0, 2.1))
vAP <- diff(log(AP))
plot(tail(AirTime, -1), vAP, type='l', ylab='percent increase', axes=FALSE)
box(col='grey')
axis(2, las=1)
par(mar=c(5.1, 4.1, 0, 2.1))
plot(tail(AirTime, -2), diff(vAP), type='l', ylab='acceleration',
     xlab='year', las=1)
box(col='grey')
par(def.par)
Re: [R] identify the distribution of the data
On 2/8/23 12:06 PM, Ebert, Timothy Aaron wrote: IMO, the best approach is to develop a good understanding of the individual processes that produced the observed values. The blend of those processes then determines the distribution of the observed values. This is seldom done, and often not possible to do. The alternatives depend on why you are doing this.

0) Sometimes the nature of the data suggests a distribution. You list integer values. If all observations are integers (counts, for example), then Poisson may be appropriate. With two values, maybe the binomial distribution. Continuous data might be normally distributed (Gaussian distribution). If I roll one six-sided die many times I will have a uniform distribution (assuming a fair die). I could then try the same task but roll 2 dice and add the result. I still have discrete values, but the shape is closer to Gaussian. The distribution looks more and more Gaussian as I add more dice together in each roll.

I concur: the application will often suggest a distribution, e.g., Poisson, binomial, or negative binomial for nonnegative integers, Weibull for lifetime data, etc. I love normal probability plots -- the qqnorm function. They can identify outliers, multimodality, or the need for a transformation. Continuous data that are always positive are often log-normal -- or a mixture of log-normals:

x <- rnorm(100)
X <- exp(x)
qqnorm(X, datax=TRUE, log='x')

The central limit theorem says that the distribution of almost any sum of random variables will be more nearly normal than the distributions of the individual summands. It also implies that almost any product of positive random variables will be more nearly log-normal than the distributions of the individual factors. This application to products is less well known and occasionally controversial. https://en.wikipedia.org/wiki/Gibrat%27s_law

Spencer Graves

1) Try a simulation. Draw 5 values from a normal distribution and make a histogram. Then do it again.
Is it easy to see that both samples are from the same distribution? For me, the answer is no. So increase the sample size until you are happy with a decision that any two draws are from the same distribution. For my part, at 1 million most people would not be able to detect any difference between the two histograms. This helps calibrate the people. How does your sample size compare to your choice in this exercise?

2) Given that you have sufficient data (see above), can you see the distribution in your data? Is that good enough?

3) Are you doing this as part of checking the assumptions of statistical models? In such tests for normality, we tend to assume that a failure to reject the null hypothesis is sufficient proof that the null hypothesis is true. However, in most other cases we are told that a failure to reject the null hypothesis is not sufficient to prove the null hypothesis. You need to work this out, but the importance, consequences, and alternatives of testing model assumptions form a large body of literature with (sometimes) widely divergent viewpoints.

4) There are hundreds of distributions (https://cran.r-project.org/web/views/Distributions.html), but the common ones are covered in sites like this one: https://www.stat.umn.edu/geyer/old/5101/rlook.html. Given so many choices, you can probably find one that fits your data reasonably well. How many data points you have will determine the reliability of that answer. Is that really informative for the problem you are trying to solve? Answering "what distribution do these data follow?" is not usually the goal.

Regards, Tim

-----Original Message----- From: R-help On Behalf Of Bert Gunter Sent: Wednesday, February 8, 2023 12:00 PM To: Bogdan Tanasa Cc: r-help Subject: Re: [R] identify the distribution of the data [External Email]

1. This is a statistical question, which usually is inappropriate here: this list is about R language (including packages) programming.

2.
IMO (so others may disagree), your question indicates a profound misunderstanding of basic statistical issues. Maybe you phrased it poorly or I misunderstand, but "identify the type of distribution" is basically a meaningless query. Explaining why this is so and what may be more meaningful would require a deep dive into statistics. You might try referencing a basic statistical text and/or online tutorials. Try searching on "goodness of fit", "statistical modeling", or the like.

Cheers, Bert

On Wed, Feb 8, 2023 at 8:35 AM Bogdan Tanasa wrote: Dear all, I have dataframes with numerical values such as 1, 9, 20, 51, 100, etc. Which way do you recommend for identifying the type of the distribution of the data (normal, Poisson, Bernoulli, exponential, log-normal, etc.)?

Thanks so much, Bogdan
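For what it's worth, the "goodness of fit" pointer in this thread can be made concrete with MASS::fitdistr (MASS ships with R): fit a few candidate distributions by maximum likelihood and compare them by AIC. A sketch, with simulated data and an illustrative candidate list only, not a recommendation for any particular dataset:

```r
library(MASS)   # fitdistr(): maximum-likelihood fits of univariate distributions

set.seed(1)
x <- rlnorm(500, meanlog = 1, sdlog = 0.5)   # stand-in for real data

## Fit a few candidates and compare by AIC (smaller is better,
## among the candidates actually tried).
fits <- list(lognormal = fitdistr(x, "lognormal"),
             gamma     = fitdistr(x, "gamma"),
             normal    = fitdistr(x, "normal"))
print(sapply(fits, AIC))

## A normal probability plot of log(x) should be roughly straight
## if the data are approximately log-normal.
qqnorm(log(x)); qqline(log(x))
```

As the thread warns, this only ranks the candidates you happened to try; it does not answer "what distribution do these data follow?" in any absolute sense.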
Re: [R] Plotmath isn't working for special characters
On 1/24/23 3:33 PM, David Stevens wrote: Simple expressions on plots, such as parse(text='x >= y'), have been resulting in just a placeholder box (x box y, not the symbol) in my R plot labels on Windows, R v 4.2.2. I haven't done an exhaustive test, but <= and >= have this behavior.

plot(1, 1, main=parse(text="x >= y"))

Has anyone else seen this?

What's your sessionInfo()? I got the symbol, no problem.

Spencer Graves

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.7.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

loaded via a namespace (and not attached):
[1] compiler_4.2.2 tools_4.2.2 rstudioapi_0.14
[4] remotes_2.4.2

best, David
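One quick way to narrow down such a report (a sketch, not from the thread): expression() builds the same plotmath object as parse(text=), so if the inequality renders as a box either way, the problem is the symbol font on the graphics device rather than the parsing step.

```r
## Both titles use the same plotmath operator; if '>=' shows as a
## placeholder box in one, it will in the other -- pointing at the
## device/font, not at parse().
e1 <- expression(x >= y)
e2 <- parse(text = "x >= y")
plot(1, 1, main = e1)
plot(1, 1, main = e2)
```

If only certain devices misbehave, comparing output across devices (e.g., the default screen device vs. pdf()) helps localize the font issue.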
Re: [R] R Certification
If you go to "https://cran.r-project.org" and click "Download R for macOS", you will find several versions that you can download and that have a digital signature. After downloading, you can check the integrity of the downloaded package by checking the signature: pkgutil --check-signature R-4.2.2.pkg. Beyond that, RStudio and other companies will happily sell you a maintenance contract, which will get you more polite answers than a listserv like this ;-)

Spencer Graves

On 1/2/23 11:24 AM, John Kane wrote: Hi Mukesh, Have a look at the blurb that prints at the start-up of R: "R is free software and comes with ABSOLUTELY NO WARRANTY." This is a hint that the R Project is unlikely to be issuing certificates. On Mon, 2 Jan 2023 at 08:18, Mukesh Ghanshyamdas Lekhrajani via R-help <r-help@r-project.org> wrote: Thanks, Petr! I will look at other training bodies such as Coursera, but I was just wondering if there could be a certificate from the "originators" themselves, I mean an "R" certificate from r-project itself; that would carry more importance than external / unauthorized certificate bodies. But if, as you suggest, there is no such certification provided by r-project, then the only option for me is to search elsewhere, like Coursera. I now have my answers, but if r-project ever comes up with "R Language" certifications, do keep me informed. Thanks, Mukesh 9819285174.

-----Original Message----- From: PIKAL Petr Sent: Monday, January 2, 2023 6:13 PM To: mukesh.lekhraj...@yahoo.com; R-help Mailing List Subject: RE: [R] R Certification

Hallo Mukesh, The R project is not Microsoft or Oracle, AFAIK. But if you need some certificate you could take courses on Coursera; they offer certificates.
Cheers, Petr

-----Original Message----- From: R-help On Behalf Of Mukesh Ghanshyamdas Lekhrajani via R-help Sent: Monday, January 2, 2023 1:04 PM To: 'Jeff Newmiller'; 'Mukesh Ghanshyamdas Lekhrajani via R-help'; r-help@r-project.org Subject: Re: [R] R Certification

Hello Jeff! Yes, you are right, and that's why I am asking this question: just like other governing bodies that issue certifications for their respective technologies, does r-project.org also have a learning path and then a certification? Say, Microsoft issues certificates for C#, .NET, etc. Oracle issues certificates for Java, DB, etc. These are authentic governing bodies for learning and issuing certificates. On exactly similar lines, r-project.org would also have some learning path, and then the r-project would administer the proctored exam and issue a certificate. I am not looking at any external institute to certify me on R, but the governing body itself. So, the question again is: does r-project provide a learning path and issue certificates after exams?

Thanks, Mukesh 9819285174

-----Original Message----- From: Jeff Newmiller Sent: Monday, January 2, 2023 2:26 PM To: mukesh.lekhraj...@yahoo.com; Mukesh Ghanshyamdas Lekhrajani via R-help; r-help@r-project.org Subject: Re: [R] R Certification

I think this request is like saying "I want a unicorn." There are many organizations that will enter your name into a certificate form for a fee, possibly with some credibility... but if they put "r-project.org" down as the name of the organization granting this "certificate" then you are probably getting fooled.

On December 30, 2022 8:33:09 AM PST, Mukesh Ghanshyamdas Lekhrajani via R-help wrote: Hello R Support Team, I want to do R certification. Could you help me with the list of certificates and their prices, so that I can register? I want to do the certification directly from the governing body "r-project.org" and not from any 3rd party. Please help.
Mukesh +91 9819285174

-- Sent from my phone. Please excuse my brevity.
Re: [R] [External] Amazing AI
On 12/21/22 7:50 AM, Barry Rowlingson wrote: Next year one of my R programming assignments will read like this: "Here is some R code written by a multi-million dollar AI system to compute [something]. It doesn't work. Fix the bugs, then ask the AI to write a letter to its creators apologising for how rubbish it is at coding. Collect one million dollars."

You might want to be careful about such a promise. Kahneman, Sibony, and Sunstein (2021) Noise: A Flaw in Human Judgment (Little, Brown and Company) claim that genuine expertise is acquired by learning from frequent, rapid, high-quality feedback on the quality of one's decisions. Few people have access to such feedback. They call leaders in fields without such feedback "respect-experts", and note that respect-experts have only the illusion of competence.

1. They further say that most respect-experts can be beaten by simple heuristics developed by intelligent lay people.
2. Moreover, with a modest amount of data, ordinary least squares can beat most such heuristics.
3. And if lots of data are available, AI can beat the simple heuristics.

They provide substantial quantities of research to support those claims. Regarding your million dollars, it should not be hard to write an R interface to existing AI code cited by Kahneman et al. Do you really want one of your students initiating a legal procedure to try to collect your million dollars?

A quarter century ago, my friend Soren Bisgaard told me about a colleague who had raved about AI. Soren thought, "You prefer artificial intelligence to real intelligence?"

I perceive a role for AI in identifying subtle phenomena missed by more understandable modeling techniques. Let's use the best understandable model and apply AI to the residuals from that. Then identify the variables that make the largest contributions to a useful AI model, and see if they can be added to the other model.
Spencer Graves On Sun, Dec 18, 2022 at 7:01 PM Boris Steipe wrote: This email originated outside the University. Check before clicking links or attachments. Technically not a help question. But crucial to be aware of, especially for those of us in academia, or otherwise teaching R. I am not aware of a suitable alternate forum. If this does not interest you, please simply ignore - I already know that this may be somewhat OT. Thanks. -- You very likely have heard of ChatGPT, the conversation interface on top of the GPT-3 large language model and that it can generate code. I thought it doesn't do R - I was wrong. Here is a little experiment: MY PROMPT: == Please write me some R code that will read a tsv file, remove all rows that have duplicates in the column "names", and store the median values of the remaining rows in a variable called "Medians". Of course, that calculation should only be done with columns that are numeric. Print the last variable in order from largest to smallest. AI RESPONSE: Here is some R code that should accomplish what you've described: (Note: the code is formatted in a little terminal window, complete with a button to "copy code" at the top.) # Read the tsv file data <- read.table("file.tsv", header = TRUE, sep = "\t") # Remove rows with duplicates in the "names" column data <- data[!duplicated(data$names), ] # Select only the numeric columns num_columns <- sapply(data, is.numeric) numeric_data <- data[, num_columns] # Calculate the medians for each column Medians <- apply(numeric_data, 1, median) # Sort Medians in descending order Medians <- sort(Medians, decreasing = TRUE) # Print Medians print(Medians) This code assumes that the tsv file is called "file.tsv" and is stored in the current working directory. It also assumes that the first row of the file contains the column names. MY PROMPT: == Can you provide me with a bit of sample data that I can use to test this code? AI RESPONSE: Certainly! 
Here is some sample data that you can use to test the code I provided:

name     age  height  weight
Alice     23     5.6     120
Bob       32     6.2     180
Charlie   29     5.9     160
Alice     23     5.6     120
David     25     5.8     140
Eve       27     5.5     110

You can save this data as a tsv file and use it to test the code.

--

Notes:
1) The code, as written, does not work with the test data. The reason is that the test data specifies a column "name", but the code assumes "names" (which I specified in the instructions). Once this is fixed in the test data, the code executes.
2) The instructions are not accurately translated: I had asked for row means. But changing the marg
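For reference, here is the AI's code with the mismatch from note (1) resolved, by entering the sample data directly and matching the code to the sample's "name" column. Everything else is unchanged, including the row-wise apply():

```r
# Sample data from the AI response, entered directly instead of via a tsv file
data <- data.frame(
  name   = c("Alice", "Bob", "Charlie", "Alice", "David", "Eve"),
  age    = c(23, 32, 29, 23, 25, 27),
  height = c(5.6, 6.2, 5.9, 5.6, 5.8, 5.5),
  weight = c(120, 180, 160, 120, 140, 110)
)

# Remove rows with duplicates in the "name" column (not "names")
data <- data[!duplicated(data$name), ]

# Select only the numeric columns
numeric_data <- data[, sapply(data, is.numeric)]

# Note: margin 1 gives *row* medians, exactly as in the AI's code
Medians <- apply(numeric_data, 1, median)
sort(Medians, decreasing = TRUE)
```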
Re: [R] cannot print a list with cat
On 10/24/22 7:39 AM, Steven T. Yen wrote: I have a "list" containing four elements, as shown below:

> t(mycontrol)
     tol reltol steptol gradtol
[1,] 0   0      1e-08   1e-12

Printing this in a main program causes no problem (as shown above). But, using the command t(mycontrol), the line gets ignored. Any idea?

I'm confused. I get:

> (mycontrol <- list(tol=0, reltol=0,
+    steptol=1e-8, gradtol=1e-12))
$tol
[1] 0

$reltol
[1] 0

$steptol
[1] 1e-08

$gradtol
[1] 1e-12

> t(mycontrol)
     tol reltol steptol gradtol
[1,] 0   0      1e-08   1e-12

I don't know what you mean by "main program" vs. "the command t(mycontrol)". ???

Spencer Graves

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.7

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base

loaded via a namespace (and not attached):
[1] compiler_4.2.1 tools_4.2.1

Thanks. Steven Yen

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
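The issue in the subject line, cat() refusing a list, can be shown directly. A sketch; the workarounds are standard base R, not something proposed in the thread:

```r
mycontrol <- list(tol = 0, reltol = 0, steptol = 1e-8, gradtol = 1e-12)

# cat() cannot handle a list directly:
tryCatch(cat(mycontrol), error = function(e) conditionMessage(e))
# "argument 1 (type 'list') cannot be handled by 'cat'"

# Workarounds: flatten to an atomic vector, or use print()
cat(unlist(mycontrol), "\n")
print(t(mycontrol))   # the 1 x 4 layout shown in the thread
```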
Re: [R] ggplot2 install.package
From long and unhappy experience, I routinely reboot before reporting problems like this, then upgrade all software where feasible ... including "update.packages". Some problems magically disappear after doing this. If they persist, then, as it says at the end of each email on this thread, "PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code." If you do that, it increases the chances that the help you get will actually be helpful ;-) Spencer

On 10/11/22 1:01 AM, sibylle.stoec...@gmx.ch wrote: Dear R users

Thanks a lot. I will try again. The error code was below the email, but it got mangled in transit:

Error in nchar(homeDir) : invalid multibyte string, element 1

I realised that I was able to install ggplot2; the error comes afterwards, when loading ggplot2 with library(). Kind regards Sibylle

install.packages("ggplot2", dependencies = TRUE)
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
Installiere Paket nach ‘C:/Users/Sibylle Stöckli/AppData/Local/R/win-library/4.2’ (da ‘lib’ nicht spezifiziert)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/4.2/ggplot2_3.3.6.zip'
Content type 'application/zip' length 4122980 bytes (3.9 MB)
downloaded 3.9 MB
Paket ‘ggplot2’ erfolgreich ausgepackt und MD5 Summen abgeglichen
Die heruntergeladenen Binärpakete sind in
C:\Users\Sibylle Stöckli\AppData\Local\Temp\RtmpcBknba\downloaded_packages
Error in nchar(homeDir) : invalid multibyte string, element 1

-----Original Message----- From: Ebert,Timothy Aaron Sent: Tuesday, 11 October 2022, 02:42 To: sibylle.stoec...@gmx.ch; R-help@r-project.org Subject: RE: [R] ggplot2 install.package

Can you please provide us with a copy of the error msg? I have used ggplot2 with R 4.2.1, and probably R 4.2.0, in RStudio on Windows 11 and not had trouble.
You might need to update R, or RStudio. R and RStudio need to be compatible versions. Tim

-----Original Message----- From: R-help On Behalf Of sibylle.stoec...@gmx.ch Sent: Monday, October 10, 2022 9:33 AM To: R-help@r-project.org Subject: [R] ggplot2 install.package [External Email]

Dear R members

I am working with RStudio R 4.2.0 and using ggplot2 and gridExtra. Since some Windows updates I get some error when loading ggplot2.
* It seems that I am able to install ggplot2. The problem is when loading ggplot2 with library().
* Do you have any ideas to solve the problem?

Kind regards
Sibylle

install.packages("ggplot2", dependencies = TRUE)
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding: https://cran.rstudio.com/bin/windows/Rtools/
Installiere Paket nach C:/Users/Sibylle Stöckli/AppData/Local/R/win-library/4.2 (da lib nicht spezifiziert)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/4.2/ggplot2_3.3.6.zip'
Content type 'application/zip' length 4122980 bytes (3.9 MB)
downloaded 3.9 MB
Paket ggplot2 erfolgreich ausgepackt und MD5 Summen abgeglichen
Die heruntergeladenen Binärpakete sind in
C:\Users\Sibylle Stöckli\AppData\Local\Temp\RtmpcBknba\downloaded_packages
Error in nchar(homeDir) : invalid multibyte string, element 1

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Parsing XML?
Hi, Richard et al.:

On 7/28/22 1:50 AM, Richard O'Keefe wrote: What do you mean by "a list that I can understand"? A quick tally of the number of XML elements by identifier:

   1 echoedSearchRetrieveRequest
   1 frbrGrouping
   1 maximumRecords
   1 nextRecordPosition
   1 numberOfRecords
   1 query
   1 records
   1 resultSetIdleTime
   1 searchRetrieveResponse
   1 servicelevel
   1 sortKeys
   1 startRecord
   1 wskey
   2 version
  50 leader
  50 recordData
  51 recordPacking
  51 recordSchema
 100 record
 105 controlfield
 923 datafield
1900 subfield

How did you get that? Please forgive me for being so dense. I've done several web searches and tried to work through several tutorials, etc., without so far seeing what I might do that could be informative. Even this list of "XML elements by identifier" STILL does not include things like the name of the newspaper and publisher plus start and end dates. I believe these fields are there, but I can't see how to parse them.

I earlier parsed a JSON version of essentially the same dataset. However, the JSON version seemed not to distinguish between newspapers that were still publishing and those for which the end date was unknown. My contact at the Library of Congress then suggested I parse the XML version. Thanks, Spencer

What of this information do you actually want? The elements of the list should be what?

On Thu, 28 Jul 2022 at 08:52, Spencer Graves <spencer.gra...@effectivedefense.org> wrote: Hello, All: What would you suggest I do to parse the following XML file into a list that I can understand:

XMLfile <- "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"

This is the first of the XML files containing the "U.S. Newspaper Directory" maintained by the US Library of Congress, discussed in the thread below. I've tried various things using the XML and xml2 packages.
XMLdata <- xml2::read_xml(XMLfile)
str(XMLdata)
XMLdat <- XML::xmlParse(XMLdata)
str(XMLdat)
XMLtxt <- xml2::xml_text(XMLdata)
nchar(XMLtxt)
#[1] 29415

Someplace there's a schema for this. I don't know if it's embedded in this XML file or in a separate file. If it's in a separate file, how could I describe it to my contacts at the Library of Congress so they would understand what I needed and could help me get it? Thanks, Spencer Graves

p.s. All 29415 characters in XMLtxt appear in the thread below.

Forwarded Message: Subject: [Newspapers and Current Periodicals] How can I get counts of the numbers of newspapers by year in the US, and preferably also elsewhere? A search of "U.S. Newspaper Directory, Date: Wed, 27 Jul 2022 14:59:03 + From: Kerry Huller <seri...@ask.loc.gov> To: Spencer Graves <spencer.gra...@effectivedefense.org> CC: t...@loc.gov

Newspapers and Current Periodicals Reference Librarian, Jul 27 2022, 10:59am: Hello Spencer, So, when I view the xml, I'm actually looking at it in XML editor software, so I can view the tags and it's structured neatly. I've copied and pasted the text from the beginning of the file and the first newspaper title below from my XML editor:

[The pasted XML sample was garbled in transit: its tags were stripped. What survives indicates an SRW response header declaring the namespaces http://www.loc.gov/zing/srw/, http://purl.org/oclc/terms/, http://purl.org/dc/elements/1.1/, http://www.loc.gov/zing/srw/diagnostic/, and http://www.w3.org/2001/XMLSchema-instance, plus the values "1.1", "2250", "info:srw/schema/1/marcxml", and "xml", followed by a MARCXML (http://www.loc.gov/MARC21/slim) record containing the leader "0nas a2207i 4500" and the fragments "1030438981", "180404c2015aluwr n 0 a0eng", "2018200464 DLC rda DLC eng", "2577-5316", "021110 USPS", and "711 Alabama Avenue, Selma, AL 36701".]
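One way a tally like Richard's can be produced is with xml2: read the document and count every element name. A sketch, assuming the xml2 package; the URL is the one from the thread, and the small inline document is my own illustration:

```r
library(xml2)

# Count every element name in an XML document
tally_elements <- function(src) {
  doc <- xml2::read_xml(src)
  sort(table(xml2::xml_name(xml2::xml_find_all(doc, "//*"))))
}

# tally_elements("https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml")

# Self-contained illustration on a tiny document:
tally_elements("<records><record><leader>x</leader></record><record/></records>")
```

xml_name() reports local names, which would explain why a tally like the one above shows bare identifiers such as datafield and subfield rather than prefixed ones.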
Re: [R] A humble request
stand it. DO ask your supervisor to help you select an alternative, and if you can't figure that out between you, ASK a more informative question.

On Mon, 4 Jul 2022 at 00:38, John Kane <jrkrid...@gmail.com> wrote: Spencer, the idea to source the function makes sense, but since the OP is a very new beginner perhaps you could point him towards code showing him how to do this? I have never done this AFAIR, and while I suspect a few minutes googling would show me how, it is likely to be more difficult for a noobie.

On Sun, 3 Jul 2022 at 07:25, Spencer Graves <spencer.gra...@effectivedefense.org> wrote:
>
> Muhammad Zubair Chishti:
>
> What specifically have you tried? What were the results?
>
> Please "provide commented, minimal, self-contained, reproducible code", per "the posting guide http://www.R-project.org/posting-guide.html", as indicated at the end of each email in this thread.
>
> And don't overlook the suggestion I made: Download the archived package. Do NOT try to compile it. Instead source only the function you want, and try to run it. With luck, it will work. If it doesn't, you will get a diagnostic that can help you take the next step.
>
> Spencer
>
> On 7/3/22 1:28 AM, Andrew Simmons wrote:
> > It seems like this package was archived because package "ifultools" was archived. I tried installing "ifultools" from source, but it has a seriously large number of C compilation issues. The main issue seems to be that the variable PROBLEM was never defined anywhere, and it is unclear what its definition should be. Unfortunately, as far as I can tell, this issue is unfixable. If you want to install "wmtsa", you'll have to use an older version of R. Otherwise, you can use one of the other wavelet analysis packages that Richard O'Keefe mentioned.
> >
> > On Sun, Jul 3, 2022 at 2:01 AM Muhammad Zubair Chishti <mzchis...@eco.qau.edu.pk> wrote:
> >>
> >> Dear Respected Experts and specifically Professor Richard O'Keefe,
> >> Thank you so much for your precious time and generous help. However, the problem is still there and I am just unable to resolve it due to the lack of expertise in R. Still, the hope is there. I believe that this platform can help me.
> >>
> >> Regards
> >> Muhammad Zubair Chishti
> >> School of Business, Zhengzhou University, Henan, China
> >> My Google Scholar link: https://scholar.google.com/citations?hl=en=YPqNJMwJ
> >> My ResearchGate profile: https://www.researchgate.net/profile/Muhammad-Chishti
> >>
> >> On Sun, Jul 3, 2022 at 9:11 AM Richard O'Keefe <rao...@gmail.com> wrote:
> >>
> >>> A postscript to my previous comment. I used to supervise PhD students. Let me advise you to write this issue up as a draft section for your thesis.
> >>> 1. Why I wanted to use the wmtsa package.
> >>> 2. Why I didn't.
> >>> 3. How I went about selecting a replacement.
> >>> 4. What I chose and why that's the right choice.
> >>> 5. How the analyses I wanted to do are done in package X and what difference it makes.
> >>>
> >>> Off the top of my head, the only reasons for struggling to use an old package are to try to replicate someone else's results and/or to try to use their software (built atop the dead package) with new data. Well, if you get different results, that's interesting too, and then it's time to work harder to resurrect the dead package.
> >>>
> >>> Speaking of which, an easier route might be to set up a separate environment running an old version of R that *can* run the old code and the old co
Re: [R] A humble request
On 7/2/22 12:30 PM, David Winsemius wrote:

On Jul 2, 2022, at 9:32 AM, Muhammad Zubair Chishti wrote: Dear Respected Sir, Thank you for the quick response. However, the link you shared provides the old version of wmtsa that is not working in R 4.2.0. Can you please tell me any other solution for this issue?

Your response suggests that you do not understand how one installs a source package for R. You probably need to educate yourself about how to compile packages from source. Since the process is different for Linux, MacOS, and Windows, you would need to pay attention to the specific requirements of your unnamed OS.

BUT you do not need to compile the entire package to use a function in it: You only need to source that particular function ... and any others it calls that you don't already have ;-) That simplifies the task greatly. With luck, Muhammad Zubair Chishti can source only the desired functions, and they will run and do what is desired with no further effort. If not, Muhammad can be guided by the error message(s): If they complain that a certain function is not available, Muhammad can find it and source it. If the problem is more subtle, Muhammad can use "debug", as I earlier suggested. Spencer Graves

You might find, as I did, that the wmtsa package has some dependencies that are also archived, namely pkg:splus2R (provided generously by Insightful for many years, but apparently no longer) and pkg:ifultools. They both have incompatibilities that prevent compilation from source with my Mac setup. I did find a mirror that has a binary version of splus2R that did install and load:

install.packages("splus2R", repo = "https://mirrors.vcea.wsu.edu/r-cran/", type = "binary", dependencies = TRUE)

So maybe part of your PhD training should be learning how to compile packages and address changes in the compilers being used as R evolves. That is beyond my capabilities for offering assistance. Alternately you might find a forum or a consultant with similar interests to yours.
Again, I can offer no assistance because I have never used pkg:wmtsa and don't know what it provided when it was compatible with the CRAN universe. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
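The "source just the function you need" route that Spencer recommends might look like the following sketch. The archive URL and version come from Rainer's message later in the digest; the file name to source is a placeholder you would look up after unpacking:

```r
# Download and unpack the archived source tarball -- but do NOT install it
url <- "https://cran.r-project.org/src/contrib/Archive/wmtsa/wmtsa_2.0-3.tar.gz"
tf  <- tempfile(fileext = ".tar.gz")
# download.file(url, tf)                  # fetch (commented out: needs network)
# untar(tf, exdir = tempdir())
# Then source only the individual function(s) you need, instead of
# compiling the whole package:
# source(file.path(tempdir(), "wmtsa", "R", "<file>.R"))  # placeholder name
```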
Re: [R] A humble request
Are you familiar with the "debug" function? You can download the package, source the function you want to use. If it doesn't run or doesn't seem to give a sensible answer, debug(fun) followed by fun(arg1, arg2, arg3) will start the function then stop and invite you to walk through the function line by line. You can look at what it does, and change it as you like. Doing so should help you learn R while also making it easier for you to figure out how to make the function do what you want. Hope this helps. Spencer Graves On 7/2/22 11:39 AM, Muhammad Zubair Chishti wrote: Dear Respected Professor, You are right. However, a student like me who doesn't know much R requires specific helping packages to install. Therefore, I requested here to experts to give the solution for my issue. I hope that I will get the solution from this platform. Regards Muhammad Zubair Chishti On Sat, 2 Jul 2022, 21:34 Spencer Graves, <mailto:spencer.gra...@effectivedefense.org>> wrote: If I understand correctly the rules for "archived" packages, anyone is free to download them, compile them locally -- AND resubmit any such package with themselves as the maintainer: They were archived, because they would no longer pass CRAN checks, and the designated maintainer failed to respond to the notice of problems with newer versions of R. If anything I've said here is wrong or poorly nuanced, I trust someone knowledgeable will correct me. Spencer Graves On 7/2/22 11:24 AM, Rainer Hurling wrote: > Hi Muhammad, > > Am 02.07.22 um 18:11 schrieb Muhammad Zubair Chishti: >> Dear Experts, >> I cannot find a package "wmtsa" for my R version "R 4.2.0". Kindly >> help me >> to find it or share the link with me. >> Although I tried the old version of "wmtsa" but failed. >> Thank you for your precious time. >> >> Regards >> Muhammad Zubair Chishti >> >> [[alternative HTML version deleted]] > > AFAIK there is no package wmtsa anymore. 
It has been archived on
> 2020-06-09 [1]; the latest version was wmtsa_2.0-3.tar.gz from 2017-12-06.
>
> [1] https://cran.r-project.org/src/contrib/Archive/wmtsa/
>
> HTH,
> Rainer

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
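The debug() workflow Spencer describes, shown on a toy function (in practice you would first source() the function taken from the archived package):

```r
f <- function(x, y) {
  z <- x + y
  z^2
}

debug(f)        # the next call to f() will open the interactive browser
# f(1, 2)       # step through with 'n', inspect variables, quit with 'Q'
undebug(f)      # switch single-stepping off again
isdebugged(f)   # FALSE after undebug()
f(1, 2)         # runs normally again: 9
```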
Re: [R] Is it possible to set a default working directory for all R consoles?
I use the free version of RStudio, and I routinely work with "Projects". For a new project, I first create a new project directory in Finder (on a Mac) or Windows Explorer if it does not already exist. Then in RStudio, I do File > "New Project..." > "Existing Directory". Then when I want to work on an existing project, I can do File > "Recent Projects" in RStudio. Or I can double click on the appropriate *.Rproj file in Finder or Windows Explorer. Hope this helps. Spencer Graves

On 6/17/22 10:46 AM, Bill Dunlap wrote: Is there an environment variable containing that IP address?

as.list(grep(value=TRUE, "172", Sys.getenv()))  # as.list to make printing nicer

If you know which variable is causing the problem you may be able to override it by setting an R-specific one. -Bill

On Fri, Jun 17, 2022 at 8:28 AM IAGO GINÉ VÁZQUEZ wrote: Hi all, Is there some way to set a default working directory each time R.exe is opened? I ask this because whenever I open R 4.2.0 in Windows 10 I get the following warning message, repeated four times:

Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="\\172.19.2.44\profiles\profiles\me\Documents": Access is denied

Even from cmd.exe in a C: location:

C:\Users\me>R
Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="\\172.19.2.44\profiles\profiles\me\Documents": Access is denied
(again repeated four times)

Reading https://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-are-HOME-and-working-directories_003f, first, I cannot apply the suggested solution (I cannot see such "shortcut's properties"), and second, I am not interested just in Rgui, or even RStudio, but in other terminals running R (like VSCode-radian or SublimeText-Terminus). Thanks in advance. Best wishes, Iago

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
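For the narrower question in the subject line, a default working directory for every console, one hedged sketch is a startup file. The helper name and the idea of putting it in ~/.Rprofile are my assumptions, not from the thread, and this does not fix the normalizePath() warnings themselves, which come from the unreachable network home:

```r
# Candidate ~/.Rprofile fragment: switch to a preferred directory at startup,
# for any front end that sources the startup files (Rgui, terminal R, etc.)
set_default_wd <- function(wd) {
  if (dir.exists(wd)) setwd(wd)
  invisible(getwd())
}
# e.g. set_default_wd("C:/Users/me/Documents")  # example path, not from thread
set_default_wd(tempdir())  # demonstrate with a directory that surely exists
```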
Re: [R] Exact 95% CI around the mean for LogNormal distribution
The mean of a log-Cauchy (aka log-Student's t with 1 degree of freedom) is infinite ;-) The median and other quantiles are not affected by transformations, though the mean is. If you really want anything other than a confidence interval about the mean of the logs, assuming a log-normal distribution, I think you need to think very carefully about the problem you are trying to solve, and how that might be impacted by different assumptions about the distribution of the data. And for that, I recommend normal probability plotting, i.e., qqnorm of log(X): If that looks plausibly like a straight line, you are fine with a log-normal assumption. If not, my favorite reference is Titterington, Smith and Makov (1985) Statistical Analysis of Finite Mixture Distributions (Wiley). Hope this helps. Spencer Graves On 3/16/22 5:44 PM, Bert Gunter wrote: This is largely a statistics question, so somewhat off topic here (see the posting guide linked below). Ergo the lack of a response yet. Therefore you *might* do better posting on the r-sig-ecology list. And now for a couple of statistics comments, which you are free to ignore of course, and which you may already be well aware of. Assuming the rv X is approximately lognormally distributed means that log(x) is ~ normally distributed which means that a (symmetric) CI for the mean of the log(X) (=: meanlog of X) is also approximately a CI for the median of log(X). Hence the back transform (exp()) of the meanlog CI is an approx CI for the **median** of the lognormal distribution. The median of a lognormal is **not** the same as the mean, but it generally makes more sense, as the mean of a skew distribution like the lognormal has no clear interpretation, while the median (or any quantile) still does. Your mileage may vary, of course. Cheers, Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." 
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Mar 16, 2022 at 1:10 PM varin sacha via R-help wrote: Dear R-experts, I have used the EnvStats package and the elnorm function (p. 248). I would like to calculate the exact 95% confidence intervals around the mean, not around the meanlog. Here below my R code, how can I get the exact 95% CIs around the mean ? Many thanks. library(EnvStats) x=rlnorm(10,0,1) mean(x) elnorm(x,method="mvue",ci=TRUE,ci.type="two-sided",ci.method="exact",conf.level=0.95) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
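Bert's point about the back-transformed interval, and Spencer's plotting suggestion, written out runnably (the rlnorm parameters are illustrative, matching the original poster's example):

```r
set.seed(1)
x <- rlnorm(100, meanlog = 0, sdlog = 1)

# Check the lognormal assumption, as Spencer suggests:
qqnorm(log(x)); qqline(log(x))   # roughly straight => lognormal is plausible

# CI for meanlog, back-transformed: a CI for the *median* of X, not the mean
ci_meanlog <- t.test(log(x))$conf.int
exp(ci_meanlog)

# A CI for the mean exp(meanlog + sdlog^2/2) needs lognormal-specific
# machinery, e.g. in the EnvStats package the original poster is using
```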
Re: [R] Is there a Truth Table Generator in R?
library(sos) tt <- findFn("{truth table}") installPackages(tt) tt This just now opened two sheets in my default browser. The first listed all the help pages in contributed packages with the phrase "truth table", sorted to put first the ones in packages with the most matches. The second is a summary by package. Hope this helps. Spencer Graves On 3/12/22 10:42 AM, Paul Bernal wrote: Dear friends, Hope you are doing great. I have been searching for a truth table generator in R, but everything I find has a Python implementation instead. Maybe there is in fact a truth table generator in R, but I am not searching in the right places? Any help and/or guidance will be greatly appreciated. Best regards, Paul [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
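A small truth table can also be generated directly in base R with expand.grid(), without any package; the column names here are my own choice:

```r
# All combinations of truth values, then one column per connective
tt <- expand.grid(p = c(FALSE, TRUE), q = c(FALSE, TRUE))
tt$`p AND q` <- tt$p & tt$q
tt$`p OR q`  <- tt$p | tt$q
tt$`p -> q`  <- !tt$p | tt$q   # material implication
tt
```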
Re: [R] Searching for packages
There is also sos::findFn, discussed in a vignette included in that package. When I created that package, I felt it was the fastest literature search for anything statistical: It will identify all help pages that match a search string and sort them by package. It creates an object, so related searches can be combined by union or intersection. And the print method by default creates two pages in your default browser, one listing the help pages and the second giving the package summary. See also:

https://en.wikiversity.org/wiki/Searching_R_Packages
https://journal.r-project.org/archive/2018/RJ-2018-058/RJ-2018-058.pdf

sg

On 2/28/22 1:47 PM, Ebert,Timothy Aaron wrote: Thank you for the list.

-----Original Message----- From: Bert Gunter Sent: Monday, February 28, 2022 10:47 AM To: Rui Barradas Cc: Ebert,Timothy Aaron ; serv...@shdawson.com; r-help@r-project.org Subject: Re: [R] Searching for packages [External Email]

Other places beside Google to look for R info:

https://rdrr.io/
https://www.rdocumentation.org/
https://rseek.org/

Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, Feb 28, 2022 at 5:44 AM Rui Barradas wrote: Hello, Actually, I only found it at the second try.
The RSiteSearch webpage has a search bar, I've used it search binGroup* and found it on screen 3. Rui Barradas Às 13:32 de 28/02/2022, Ebert,Timothy Aaron escreveu: binGroup2 (version 1.02) was released 2020-04-07. There was one update on 2021-03-17 (according to available.versions("binGroup2"). My last system update was (I assume) recent. My system is governed by my University. I get monthly updates, rarely more often. My version of R is 4.1. I tried Rui's suggestion (thank you) of searching using R. I read through the list several times but did not find binGroup2, though it is possible that it was there as binGroup and I need to open all the documents to find it. Tim -Original Message- From: Stephen H. Dawson, DSL Sent: Monday, February 28, 2022 7:46 AM To: Ebert,Timothy Aaron ; r-help@r-project.org Subject: Re: [R] Searching for packages [External Email] When was binGroup2 released? When is your last system update? *Stephen Dawson, DSL* /Executive Strategy Consultant/ Business & Technology +1 (865) 804-3454 https://urldefense.proofpoint.com/v2/url?u=http-3A__www.shdawson.com =DwICaQ=sJ6xIWYx-zLMB3EPkvcnVg=9PEhQh2kVeAsRzsn7AkP-g=WEV48V _ui8Oz-EeQtPAqqF1cRT83c11S15V05P3diVQoJspuLdN1R04nGi2PSLKn=J6gEolZ qXpDtRFdEzUW6R1ySBhZw3FgsLL2GoKQ1vg0= On 2/28/22 07:36, Ebert,Timothy Aaron wrote: If I want help on a package I can usually go to google and type "package.name in r" and I get what I need in the first few hits. Google typically offers suggestions if my spelling is a bit off, and I get hits that are close. I searched for the binGroup package in the usual way and found that it is no longer supported. I asked the authors only to discover that there is a binGroup2 package. If I search for "tidyv in R" I get tidyverse. Any idea why I did not get binGroup2 in the binGroup search? 
Regards, Tim __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
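Spencer's description of sos::findFn above can be made concrete with a short sketch. This is hedged: it assumes the sos package installs cleanly from CRAN, and the hit counts depend on what the search server has indexed at the time.

```r
# Hedged sketch of sos::findFn; search terms and results are illustrative.
library(sos)                          # assumes install.packages("sos") has been run
sp <- findFn("spline", maxPages = 2)  # all matching help pages, sorted by package
sm <- findFn("smooth", maxPages = 2)
# The results are data-frame-like objects; per the description above, related
# searches can be combined by union or intersection -- see the package vignette.
# print(sp)  # opens two pages in the default browser: help pages + package summary
```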
Re: [R] nlme::lme sigma parameter standard deviation or variance?
Modifying the first example in help('lme', package='nlme'): library(nlme) fm1 <- lme(distance ~ age, data = Orthodont) # random is ~ age fm1$sigma fm1.2 <- lme(2*distance ~ age, data = Orthodont) # random is ~ age fm1.2$sigma Conclusion: standard deviation, as indicated in help('lmeObject'). Hope this helps. Spencer Graves On 12/18/21 4:00 AM, Eric Berger wrote: You can run a test. Multiply all your data by a scalar, say 2. If this changes the result lme_mod$sigma by a factor of 2, then it is a standard deviation. If it changes the result by a factor of 4, then it is a variance. HTH, Eric On Sat, Dec 18, 2021 at 11:26 AM Courtney Van Den Elzen wrote: Hi R-help, I am a researcher fitting linear mixed models using the package nlme. I am wondering whether the sigma value that is output as part of the model object is a standard deviation or a variance? For example, I might fit a model lme_mod <- nlme::lme(response ~ predictor1 + predictor2, random = (~1|grouping1)) I am wondering whether lme_mod$sigma is actually a standard deviation or if it's a variance. Thanks so much, Courtney __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
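Eric's scaling test and Spencer's conclusion can be combined into one self-contained check, sketched here with nlme's built-in Orthodont data (the exact value of sigma is irrelevant; only the ratio matters):

```r
# If sigma is a standard deviation, doubling the response doubles it;
# if it were a variance, the ratio would be 4 instead of 2.
library(nlme)
fm1 <- lme(distance ~ age, data = Orthodont)      # random effects from the groupedData object
fm2 <- lme(2 * distance ~ age, data = Orthodont)  # same model, response doubled
fm2$sigma / fm1$sigma                             # 2 => standard deviation
```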
Re: [R] Date read correctly from CSV, then reformatted incorrectly by R
I've written Ecfun::asNumericDF to overcome some of the common problems with read.table, read.csv, etc.: https://www.rdocumentation.org/packages/Ecfun/versions/0.2-5/topics/asNumericDF I use it routinely to help parse numbers, dates, etc., that are read as character. I'm sure it can be improved. It's on GitHub in case anyone would like to take the time to suggest improvements: https://github.com/sbgraves237/Ecfun Hope this helps. Spencer Graves On 11/20/21 4:13 PM, Avi Gross via R-help wrote: This seems to be a topic that comes up periodically. The various ways in R and other packages for reading in data often come with methods that simply guess wrong, or encounter one or more data items in a column that really do not fit, so fields may by default fall back to a lowest common denominator of character or perhaps floating point. There are ways that some such programs can be given a hint of what you expect, or even be supplied with a way to coerce columns into what you want while being read in. But realistically, often a more practical method is to take the data.frame you read in and, before using it for other purposes, check it for validity and make any needed changes. Simplistic checks might be to see whether the number of columns read in matches expectations, or to generate an error if not. Or you may trim columns (or rows) that are not wanted. In that vein, are there existing functions available that will accept the types you want one or more columns to be, validate whether the current type is something else, and then convert if needed? I mean we have functions like as.integer(df$x) or, more flexibly, as(df$x, "integer"), and you may simply build on a set of those and create others to suit any special needs. 
Of course a good method carefully checks the results before over-writing as sometimes the result may not be the same length (as shown below) or may violate some other ideas or rules: as(c(NULL, NA, 3, 3.1, "3.1", list(1,2,"a")), "character") [1] "NA" "3" "3.1" "3.1" "1" "2" "a" So if you have dates in some format, or sometimes an unknown format, there are ways, including some others have shown, to make them into some other date format or even make multiple columns that together embody the format. What people sometimes do is assume software is perfect and should do anything they want. It is the other way around and the programmer or data creator has some responsibilities to use the right software on the right data and that may also mean sanity checks along the way to see if the data is what you expect or alter it to be what you need. -Original Message- From: R-help On Behalf Of Philip Monk Sent: Saturday, November 20, 2021 3:28 PM To: Jeff Newmiller Cc: R-help Mailing List Subject: Re: [R] Date read correctly from CSV, then reformatted incorrectly by R Thanks, Jeff. I follow what you're doing below, but know I need to read up on Date / POSIXct. Helpful direction! :) On Sat, 20 Nov 2021 at 18:41, Jeff Newmiller wrote: Beat me to it! But it is also worth noting that once converted to Date or POSIXct, timestamps should be treated as data without regard to how that data is displayed. When you choose to output that data you will have options as to the display format associated with the function you are using for output. 
My take:
dta <- read.table( text=
"Buffer 28/10/2016 19/11/2016 31/12/2016 16/01/2017 05/03/2017
100 2.437110889 -8.69674895 3.239299816 2.443183304 2.346743827
200 2.524329899 -7.688862068 3.386811734 2.680347706 2.253885237
300 2.100784256 -8.059855835 3.143786507 2.615152896 2.015645973
400 1.985608385 -10.6707206 2.894572791 2.591925038 2.057913137
500 1.824982163 -9.122519736 2.560350727 2.372226799 1.995863839
", header=TRUE, check.names=FALSE, as.is=TRUE)
dta
library(dplyr)
library(tidyr)
dt_fmt <- "%d/%m/%Y"
dta_long <- ( dta %>% pivot_longer( cols = -Buffer, names_to = "dt_chr", values_to = "LST" ) %>% mutate( dt_date = as.Date( dt_chr, format = dt_fmt ), dt_POSIXct = as.POSIXct( dt_chr, format = dt_fmt, tz = "Etc/GMT+8" ) ) )
dta_long
On November 20, 2021 10:01:56 AM PST, Andrew Simmons wrote: The as.Date function for a character-class argument will try reading in two formats (%Y-%m-%d and %Y/%m/%d). This does not look like the format you have provided, which is why it doesn't work. Try something like: x <- c("28/10/2016", "19/11/2016", "31/12/2016", "16/01/2016", "05/03/2017") as.Date(x, format = "%d/%m/%Y")
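Andrew's point about the default formats can be shown in two lines (the %d/%m/%Y format matches the dd/mm/yyyy column headers in Jeff's example above):

```r
x <- c("28/10/2016", "19/11/2016", "05/03/2017")
# as.Date(x)                      # would fail: defaults try only "%Y-%m-%d" and "%Y/%m/%d"
as.Date(x, format = "%d/%m/%Y")   # "2016-10-28" "2016-11-19" "2017-03-05"
```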
Re: [R] Date
My speculation is that Microsoft Excel recognized that as a date and saved it in the "mm/dd/yyyy" format you saw when reading it into R with dat=read.csv("myfile.csv"). "str" told you the format. You can convert that from character to Date using as.Date(dat$mydate, '%m/%d/%Y'), as documented in help('as.Date'). NOTE: The error message, "character string is not in a standard unambiguous format", is almost appropriate: in this case, it's clear that "09/16/2019" refers to month 09, day 16, and year 2019. However, if it were "09/06/2019", we would not know if it meant September 6 or 9 June of 2019. If it were "09/06/08", we would have the added possibility of the year first, followed by month and day: June 8, 2009. This ambiguity is resolved most forcefully by ISO 8601. Hope this helps. Spencer Graves On 11/4/21 5:30 PM, PIKAL Petr wrote: Hi Not sure why the date format was changed, but if I am correct R does not read dates as dates but as a character vector. You need to convert such columns to dates with as.Date. The error is probably from your use of two as.Date commands. Cheers Petr -Original Message- From: R-help On Behalf Of Val Sent: Thursday, November 4, 2021 10:43 PM To: r-help@R-project.org (r-help@r-project.org) Subject: [R] Date Hi All, I am reading a csv file and one of the columns is named "mydate", with this form: 2019-09-16. I am reading this file as dat=read.csv("myfile.csv") the structure of the data looks as follows str(dat) mydate : chr "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ... Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy When I tried to change this to a Date using as.Date(as.Date(mydate, format="%m/%d/%Y" ) I am getting this error message Error in charToDate(x) : character string is not in a standard unambiguous format My question is, 1. how can I read the file as it is (i.e., without changing the date format)? 2. why does R change the date format? 
Thank you, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
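To address Val's question 1 directly -- read the column as plain text and convert it yourself -- here is a hedged sketch ("myfile.csv" and the column name mydate are from the post; the file contents are hypothetical):

```r
# colClasses suppresses any type guessing for mydate; the conversion
# then uses the format actually present in the file.
dat <- read.csv("myfile.csv", colClasses = c(mydate = "character"))
dat$mydate <- as.Date(dat$mydate, format = "%m/%d/%Y")
str(dat$mydate)  # class "Date", printed as yyyy-mm-dd
```

As Spencer notes above, R does not reformat dates on read; the mm/dd/yyyy text was most likely written by Excel when the file was saved.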
Re: [R] about L-BFGS-B
Hello 최병권: Are you familiar with the "debug" function? "debug(glm.cmp)" followed by the problem command will put you into the environment of "glm.cmp", and you can walk through that function line by line, looking at each variable. If you need more help, "PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code." Also, am I correct that this glm.cmp function is in the COMPoissonReg package? If yes, have you tried contacting the maintainer of that package (Andrew Raim?)? If you can't solve your problem using "debug(glm.cmp)" and you have not yet asked the package maintainer, please send them "commented, minimal, self-contained, reproducible code." They might like to have that example to make the code easier to use, e.g., by giving a more informative error message. Hope this helps, Spencer Graves On 8/15/21 7:26 AM, J C Nash wrote: You have the answer in the error message: the objective function has been calculated as +/-Inf somehow. You are going to have to figure out where the function is computed and why it is not finite. JN On 2021-08-15 12:41 a.m., 최병권 wrote: Hello Dear, I am Choy from Seoul. I have a question about R. Is it possible to answer an error that occurred while executing R? I am currently doing zero-inflated regression because the data follow a Poisson distribution. However, for glm.cmp(rndclass ~ CR4), the results are normally displayed, but when an independent variable is added, the following error message is displayed: # Error in optim(par.init, loglik, method = optim.method, control = optim.control, : L-BFGS-B needs finite values of 'fn' I can't solve it, but I'd appreciate it if you could help. For reference, the estimated regression equation is attached below. # A CMP example with offset terms cmp.m3 = glm.cmp(rndclass ~ CR4 + offset(CR4), data=dta) -(1) print(cmp.m3) # A ZICMP example with offset terms. 
zicmp.m4 = glm.cmp(rndclass ~ CR4+offset(CR4), formula.nu = ~ offset(CR4), formula.p = ~ CR4+offset(CR4), data=dta) -(2) # dependent variable: summary(dta$rndc) 0 1 2 3 12398 1290 11341 109 Thank you Best Choy __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
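The debug() workflow Spencer describes works on any function; since glm.cmp needs the poster's data, a toy function stands in for this sketch:

```r
# Hedged sketch: f() plays the role of an objective that goes non-finite.
f <- function(x) {
  y <- log(x)       # -Inf for 0, NaN for negative x
  sum(y)
}
debug(f)
# f(c(1, 0, -1))    # steps through f line by line; inspect y to spot the bad values
undebug(f)
```

For the optim() error itself, a common trick is to wrap the objective so it prints the parameter vector whenever the value is not finite (e.g. `if (!is.finite(val)) print(par)` before returning), which localizes the problem quickly.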
Re: [R] SOS package: findFn does not work
Thanks for the question. Two things: ** 1. TRY THE DEVELOPMENT VERSION, OBTAINABLE AS FOLLOWS: install.packages('devtools') # if it's not installed library(devtools) install_github("sbgraves237/sos") library(sos) z <- findFn("spline", maxPages = 2) # This should work. ** 2. PLEASE PROVIDE: sessionInfo() "findFn('spline')" just worked for me using both the CRAN and development versions of sos. (2.1-0 from CRAN under Windows 10; 2.1- from GitHub under macOS 11.4; both with R 4.1.0). I need to push the GitHub version to CRAN. However, it would help me if I understood your configuration and if the GitHub version fixes the problem for you. Thanks for the report. I apologize for the inconvenience. Spencer Graves On 8/7/21 2:57 AM, hp wan wrote: Dear All, Recently, I found that the SOS package (very helpful package) does not work. When I used the "findFn" function to search something, it always said "found 0 matches" (see below). My desktop system is Win 10 and R version is R-4.1.0. Any suggestion was greatly appreciated. HP z <- findFn("spline", maxPages = 2) found 0 matches Warning message: In findFn("spline", maxPages = 2) : HIT not found in HTML; processing one page only. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sin curve question
plot(sin, to=pi) # also works but with x labeled in radians. # With x axis labeled in degrees plot(sin, to=pi, axes=FALSE) axis(2) lbls <- seq(0, 180, 30) axis(1, pi*lbls/180, lbls) This can probably be done in ggplot2, but I don't know how off the top of my head. Hope this helps. Spencer On 7/24/21 2:04 PM, Eric Berger wrote: Alternatively with base graphics N <- 500 ## number of points (arbitrary) degrees <- seq(from=0,to=180,length=N) degreesToRadians <- function(d) { pi * d / 180.0} ## vectorIzed! plot(x=degrees,y=sin(degreesToRadians(degrees)),type='l', xlab="x",ylab="sin(x)",main="sin(x) vs x\nx is in degrees") On Sat, Jul 24, 2021 at 9:52 PM Sorkin, John wrote: Try something like the following copdat$degrees <- c(1:180) John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Jul 24, 2021, at 2:41 PM, Thomas Subia via R-help wrote: Colleagues, Here is my code which plots sin(x) vs x, for angles between 0 and 180 degrees. library(ggplot2) library(REdaS) copdat$degrees <- c(0,45,90,135,180) copdat$radians <- deg2rad(copdat$degrees) copdat$sin_x <- sin(copdat$radians) ggplot(copdat,aes(x=degrees,y=sin_x))+ geom_point(size = 2)+ geom_line()+ theme_cowplot()+xlab("x")+ ylab("sin(x)")+ scale_x_continuous(breaks=seq(0,180,30))+ ggtitle("sin(x) vs x\nx is in degrees") My trig students would prefer a curved line plot similar to what can be plotted with Excel smooth line functionality. I wanted to provide a relatively simple R script using ggplot to do this without having to resort to fitting a sine curve to these points. Some guidance would be appreciated. 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
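A sketch of the ggplot2 route Spencer alludes to: rather than fitting a curve through five points, evaluate sin() on a fine grid of degrees, as in Eric's base-graphics answer (the grid size of 500 is arbitrary):

```r
library(ggplot2)
dat <- data.frame(degrees = seq(0, 180, length.out = 500))
dat$sin_x <- sin(pi * dat$degrees / 180)  # degrees -> radians before sin()
ggplot(dat, aes(degrees, sin_x)) +
  geom_line() +
  scale_x_continuous(breaks = seq(0, 180, 30)) +
  labs(x = "x", y = "sin(x)", title = "sin(x) vs x\nx is in degrees")
```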
Re: [R] MAC version 10
What web site did you go to? Have you tried: https://cran.r-project.org/bin/macosx/ Then click on "R-4.1.0.pkg". That should download as normal. Then double-click on that, etc. Spencer Graves On 7/6/21 3:52 PM, Farrah T wrote: Hello, I have Mac Version 10 Catalina and I would like to download R on this device for my summer class. I have difficulty finding the right option when I go to the website. Could you please share step-by-step guidance with me? A screenshot would be great. I downloaded a few zip files but didn't know what to do after that step either. Looking forward to hearing from you. Best, Farrah __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Hello: What, if anything, should I do regarding notes from either "load" or "attach" that say, "input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?"? I'm running R 4.0.5 under macOS 11.2.3; see "sessionInfo()" and detailed instructions below on the precise file I downloaded from the web and tried to read. I may be able to get what I want just ignoring this. However, I'd like to know how to fix it. Thanks, Spencer Graves sessionInfo() R version 4.0.5 (2021-03-31) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Big Sur 10.16 Matrix products: default LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_4.0.5 htmltools_0.5.1.1 tools_4.0.5 yaml_2.2.1 [5] tinytex_0.31 rmarkdown_2.7 knitr_1.31 digest_0.6.27 [9] xfun_0.22 rlang_0.4.10 evaluate_0.14 > search() [1] ".GlobalEnv" "file:NAVCO 1.3 List.RData" [3] "file:NAVCO 1.3 List.RData" "tools:rstudio" [5] "package:stats" "package:graphics" [7] "package:grDevices" "package:utils" [9] "package:datasets" "package:methods" [11] "Autoloads" "package:base" *** To get the file I used for this, I went to "https://www.ericachenoweth.com/research". From there I clicked "Version 1.3". This took me to https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ON9XND I then clicked the "Download" icon to the right of "NAVCO 1.3 List.tab". This gave me 5 "Download Options", one of which was "RData Format"; I selected that. This downloaded "NAVCO 1.3 List.RData", which I moved to getwd(). Then I did 'load("NAVCO 1.3 List.RData")' and 'attach("NAVCO 1.3 List.RData")'. Both of those gave me 8 repetitions of a message like "input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?" with different values substituted for "...". 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
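One common cause of this message is Latin-1 text inside an object that R expects to be UTF-8. A hedged sketch of a repair, assuming the loaded data frame is called navco (the real object name inside the RData file is unknown here, and the source encoding may not be Latin-1):

```r
load("NAVCO 1.3 List.RData")   # emits the translation warnings
# 'navco' is a hypothetical name for the loaded data frame.
chr <- vapply(navco, is.character, logical(1))
navco[chr] <- lapply(navco[chr], iconv, from = "latin1", to = "UTF-8")
```

If Latin-1 is the wrong guess, inspecting suspect strings with `Encoding()` and `charToRaw()` can help identify the actual source encoding before converting.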
Re: [R] about a p-value < 2.2e-16
On 2021-3-19 9:52 AM, Jiefei Wang wrote: After digging into the R source, it turns out that the argument `exact` has nothing to do with numeric precision. It only affects the statistical model used to compute the p-value. When `exact=TRUE` the true distribution of the statistic will be used. Otherwise, a normal approximation will be used. I think the documentation needs to be improved here: you can compute the exact p-value *only* when you do not have any ties in your data. If you have ties in your data you will get the p-value from the normal approximation no matter what value you put in `exact`. This behavior should be documented, or a warning should be given, when `exact=TRUE` and ties are present. FYI, if the exact p-value is required, the `pwilcox` function will be used to compute the p-value. There are no details on how it computes the p-value, but its C code seems to compute the probability table, so I assume it computes the exact p-value from the true distribution of the statistic, not a permutation or MC p-value. My example shows that it does NOT use Monte Carlo, because repeated calls return identical p-values. I believe the term "exact" means that it uses the permutation distribution, though I could be mistaken. If it's NOT a permutation distribution, I don't know what it is. Spencer Best, Jiefei On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang wrote: Hey, I just want to point out that the word "exact" has two meanings. It can mean the numerically accurate p-value, as Bogdan asked in his first email, or it can mean the p-value calculated from the exact distribution of the statistic (in this case, the U statistic). These two are actually not related, even though they are both called "exact". Best, Jiefei On Fri, Mar 19, 2021 at 9:31 PM Spencer Graves < spencer.gra...@effectivedefense.org> wrote: On 2021-3-19 12:54 AM, Bogdan Tanasa wrote: thanks a lot, Vivek ! 
in other words, assuming that we work with 1000 data points, shall we use EXACT = TRUE, it uses the normal approximation, while if EXACT=FALSE (for these large samples), it does not ? As David Winsemius noted, the documentation is not clear. Consider the following: set.seed(1) > x <- rnorm(100) > y <- rnorm(100, 2) > > wilcox.test(x, y)$p.value [1] 1.172189e-25 > wilcox.test(x, y)$p.value [1] 1.172189e-25 > > wilcox.test(x, y, EXACT=TRUE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, EXACT=TRUE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, exact=TRUE)$p.value [1] 4.123875e-32 > wilcox.test(x, y, exact=TRUE)$p.value [1] 4.123875e-32 > > wilcox.test(x, y, EXACT=FALSE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, EXACT=FALSE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, exact=FALSE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, exact=FALSE)$p.value [1] 1.172189e-25 > We get two values here: 1.172189e-25 and 4.123875e-32. The first one, I think, is the normal approximation, which is the same as exact=FALSE. I think that with exact=TRUE, you get a permutation distribution, though I'm not sure. You might try looking at "wilcox_test in package coin for exact, asymptotic and Monte Carlo conditional p-values, including in the presence of ties" to see if it is clearer. NOTE: R is case sensitive, so "EXACT" is a different variable from "exact". It is interpreted as an optional argument, which is not recognized and therefore ignored in this context. Hope this helps. Spencer On Thu, Mar 18, 2021 at 10:47 PM Vivek Das wrote: Hi Bogdan, You can also get the information from the link of the wilcox.test function page. "By default (if exact is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used." For more: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html Hope this helps! 
Best, VD On Thu, Mar 18, 2021 at 10:36 PM Bogdan Tanasa wrote: Dear Peter, thanks a lot. yes, we can see a very precise p-value, and that was the request from the journal. if I may ask another question please : what is the meaning of "exact=TRUE" or "exact=FALSE" in wilcox.test ? i can see that the "numerically precise" p-values are different. thanks a lot ! tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) tst$p.value [1] 8.535524e-25 tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE) tst$p.value [1] 3.448211e-25 On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder < peter.langfel...@gmail.com> wrote: I thinnk the answer is much simpler. The print method for hypothesis tests (class htest) truncates the p-values. In the above example, instead of using wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) and copying the output, just print the p-value: tst = wilcox.test(rnorm(100), rnorm(100, 2), exact
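The distinction the thread converges on can be reproduced with small, tie-free samples (a sketch; the p-values themselves depend on the random seed, so no specific values are claimed):

```r
set.seed(1)
x <- rnorm(30); y <- rnorm(30, 2)
wilcox.test(x, y, exact = TRUE)$p.value    # exact distribution of the U statistic
wilcox.test(x, y, exact = FALSE)$p.value   # normal approximation
# Note: R is case-sensitive, so EXACT=TRUE would be swallowed by '...' and
# ignored, leaving the default behavior (exact only when n < 50, no ties).
```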
Re: [R] about a p-value < 2.2e-16
On 2021-3-19 12:54 AM, Bogdan Tanasa wrote: > thanks a lot, Vivek ! in other words, assuming that we work with 1000 data > points, > > shall we use EXACT = TRUE, it uses the normal approximation, > > while if EXACT=FALSE (for these large samples), it does not ? As David Winsemius noted, the documentation is not clear. Consider the following: > set.seed(1) > x <- rnorm(100) > y <- rnorm(100, 2) > > wilcox.test(x, > y)$p.value [1] 1.172189e-25 > wilcox.test(x, y)$p.value [1] 1.172189e-25 > > wilcox.test(x, y, EXACT=TRUE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, EXACT=TRUE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, exact=TRUE)$p.value [1] 4.123875e-32 > wilcox.test(x, y, exact=TRUE)$p.value [1] 4.123875e-32 > > wilcox.test(x, y, EXACT=FALSE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, EXACT=FALSE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, exact=FALSE)$p.value [1] 1.172189e-25 > wilcox.test(x, y, exact=FALSE)$p.value [1] 1.172189e-25 > We get two values here: 1.172189e-25 and 4.123875e-32. The first one, I think, is the normal approximation, which is the same as exact=FALSE. I think that with exact=TRUE, you get a permutation distribution, though I'm not sure. You might try looking at "wilcox_test in package coin for exact, asymptotic and Monte Carlo conditional p-values, including in the presence of ties" to see if it is clearer. NOTE: R is case sensitive, so "EXACT" is a different variable from "exact". It is interpreted as an optional argument, which is not recognized and therefore ignored in this context. Hope this helps. Spencer > On Thu, Mar 18, 2021 at 10:47 PM Vivek Das wrote: > >> Hi Bogdan, >> >> You can also get the information from the link of the wilcox.test function >> page. >> >> "By default (if exact is not specified), an exact p-value is computed if >> the samples contain less than 50 finite values and there are no ties. 
>> Otherwise, a normal approximation is used.” >> >> For more: >> >> https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html >> >> Hope this helps! >> >> Best, >> >> VD >> >> >> On Thu, Mar 18, 2021 at 10:36 PM Bogdan Tanasa wrote: >> >>> Dear Peter, thanks a lot. yes, we can see a very precise p-value, and that >>> was the request from the journal. >>> >>> if I may ask another question please : what is the meaning of "exact=TRUE" >>> or "exact=FALSE" in wilcox.test ? >>> >>> i can see that the "numerically precise" p-values are different. thanks a >>> lot ! >>> >>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) >>> tst$p.value >>> [1] 8.535524e-25 >>> >>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE) >>> tst$p.value >>> [1] 3.448211e-25 >>> >>> On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder < >>> peter.langfel...@gmail.com> wrote: >>> >>>> I thinnk the answer is much simpler. The print method for hypothesis >>>> tests (class htest) truncates the p-values. In the above example, >>>> instead of using >>>> >>>> wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) >>>> >>>> and copying the output, just print the p-value: >>>> >>>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) >>>> tst$p.value >>>> >>>> [1] 2.988368e-32 >>>> >>>> >>>> I think this value is what the journal asks for. >>>> >>>> HTH, >>>> >>>> Peter >>>> >>>> On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves >>>> wrote: >>>>> I would push back on that from two perspectives: >>>>> >>>>> >>>>> 1. I would study exactly what the journal said very >>>>> carefully. If they mandated "wilcox.test", that function has an >>>>> argument called "exact". If that's what they are asking, then using >>>>> that argument gives the exact p-value, e.g.: >>>>> >>>>> >>>>> > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) >>>>> >>>>> Wilcoxon rank sum exact test >>>>> >>>>> data: rnorm(100) and rnorm(100, 2) >>>>> W = 691, p-value < 2.2e-16 >>>>> >>>>> >>>>> 2.
Re: [R] about a p-value < 2.2e-16
I would push back on that from two perspectives: 1. I would study exactly what the journal said very carefully. If they mandated "wilcox.test", that function has an argument called "exact". If that's what they are asking, then using that argument gives the exact p-value, e.g.: > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE) Wilcoxon rank sum exact test data: rnorm(100) and rnorm(100, 2) W = 691, p-value < 2.2e-16 2. If that's NOT what they are asking, then I'm not convinced what they are asking makes sense: there is no such thing as an "exact p-value" except to the extent that certain assumptions hold, and all models are wrong (but some are useful), as George Box famously said years ago.[1] Truth only exists in mathematics, and that's because it's a fiction to start with ;-) Hope this helps. Spencer Graves [1] https://en.wikipedia.org/wiki/All_models_are_wrong On 2021-3-18 11:12 PM, Bogdan Tanasa wrote: Dear all, I would appreciate having your advice on the following, please: in R, wilcox.test() provides "a p-value < 2.2e-16" when we compare the expression of sets of 1000 genes (in the genomics field). However, the journal asks us to provide the exact p-value ... would it be legitimate to write "p-value = 0"? thanks a lot, -- bogdan __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Very slow optim()
Hi, Deepayan: On 2021-03-13 01:27, Deepayan Sarkar wrote: On Sat, Mar 13, 2021 at 10:08 AM Spencer Graves wrote: TWO COMMENTS: 1. DID YOU ASSIGN THE OUTPUT OF "optim" to an object, like "est <- optim(...)"? If yes and if "optim" terminated normally, the 60,000+ parameters should be there as est$par. See the documentation on "optim". 2. WHAT PROBLEM ARE YOU TRYING TO SOLVE? I hope you will forgive me for being blunt (or perhaps bigoted), but I'm skeptical about anyone wanting to use optim to estimate 60,000+ parameters. With a situation like that, I think you would be wise to recast the problem as one in which those 60,000+ parameters are sampled from some hyperdistribution characterized by a small number of hyperparameters. Then write a model where your observations are sampled from distribution(s) controlled by these random parameters. Then multiply the likelihood of the observations by the likelihood of the hyperdistribution and integrate out the 60,000+ parameters, leaving only a small number of hyperparameters. Just a comment on this comment: I think it's perfectly reasonable to optimize 60k+ parameters with conjugate gradient. CG was originally developed to solve linear equations of the form Ax=b. If x was not large in size, one would just use solve(A, b) instead of an iterative method. Use of CG is quite common in image processing. A relatively small 300x300 image will give you 90k parameters. -Deepayan Thanks for this. If both A and b are 300x300, then x will also be 300x300. What do you do in this case if A is not square or even ill conditioned? Do you care if you get only one of many possible or approximate solutions, and the algorithm spends most of its time making adjustments in a singular subspace that would have best been avoided?
Spencer __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Very slow optim()
TWO COMMENTS: 1. DID YOU ASSIGN THE OUTPUT OF "optim" to an object, like "est <- optim(...)"? If yes and if "optim" terminated normally, the 60,000+ parameters should be there as est$par. See the documentation on "optim". 2. WHAT PROBLEM ARE YOU TRYING TO SOLVE? I hope you will forgive me for being blunt (or perhaps bigoted), but I'm skeptical about anyone wanting to use optim to estimate 60,000+ parameters. With a situation like that, I think you would be wise to recast the problem as one in which those 60,000+ parameters are sampled from some hyperdistribution characterized by a small number of hyperparameters. Then write a model where your observations are sampled from distribution(s) controlled by these random parameters. Then multiply the likelihood of the observations by the likelihood of the hyperdistribution and integrate out the 60,000+ parameters, leaving only a small number of hyperparameters. When everything is linear and all the random variables / random effects and observation errors follow normal distributions, this is the classic linear, mixed-effects situation that is routinely handled well for most such situations by the nlme package, documented in the companion book Pinheiro and Bates (2000) Mixed-Effects Models in S and S-PLUS (Springer). If the models are nonlinear but with curvature that is reasonably well behaved and the random variables / random effects and observation errors are still normal, the nlme package and Pinheiro and Bates still provide a great approach to most such situations, as far as I know. When the observations are non-normally distributed, then the best software I know is the lme4 package. I have not used it recently, but it was written and is being maintained by some of the leading experts in this area as far as I know. CONCLUSION: If you are short on time and "1" will work for you, do that. 
Obviously, you will need to do some further analysis to understand the 60,000+ parameters you estimated -- which implies by itself that you really should be using approach "2". However, if I'm short on time and need an answer, then I'd ignore "2" and hope to get something by plotting and doing other things with the 60,000+ parameters that should be in "est$par" if "optim" actually ended normally. However, if the problem is sufficiently important to justify more work, then I'd want to cast it as some kind of mixed-effects model, per "2" -- perhaps using an analysis of "1" as a first step towards "2". Hope this helps. Spencer On 2021-03-12 20:53, J C Nash wrote: optim() has no method really suitable for very large numbers of parameters. - CG as set up has never worked very well in any of its implementations (I wrote it, so am allowed to say so!). Rcgmin in the optimx package works better, as does Rtnmin. Neither are really intended for 60K parameters however. - optim::L-BFGS-B is reasonable, but my experience is that it still is not intended for more than a couple of hundred parameters. JN On 2021-03-12 9:31 p.m., Jeff Newmiller wrote: Calculate fewer of them? If you don't set up your code to save intermediate results, then you cannot see intermediate results. On March 11, 2021 8:32:17 PM PST, "毕芳妮 via R-help" wrote: Dear list, I am using optim() to estimate over 60 thousand parameters, and use the server to run the program. But it took me 5 hours and there was just no result coming out. How can I show some results that have already been calculated by optim()? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
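Jeff Newmiller's suggestion above (set up your code to save intermediate results) can be sketched concretely; this is a hedged illustration with a toy objective function, not code from the thread, and the file name is an arbitrary choice:

```r
# Toy objective, minimized at par = 1:10 (illustrative only).
fn <- function(par) sum((par - seq_along(par))^2)

# Wrap fn() so every improvement is saved to disk; a second R session
# can readRDS("optim-progress.rds") while the long run is in progress.
best <- Inf
logged_fn <- function(par) {
  val <- fn(par)
  if (val < best) {
    best <<- val  # update the best value seen so far in the global env
    saveRDS(list(value = val, par = par), "optim-progress.rds")
  }
  val
}

est <- optim(rep(0, 10), logged_fn, method = "L-BFGS-B")
est$par  # the final estimates, per comment "1" above
```

This costs one extra file write per improvement, which is negligible next to a 5-hour objective evaluation loop.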
Re: [R] forecast accuracy
Hello, Faheem: 1. You might get a better response if you "PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code", as it says at the bottom of each email in this list. That would make it easier for people to understand the problem you are trying to solve. 2. Are you familiar with the "fda" package? I assume you are, but if not, I suggest you look at it. 3. If you are familiar with the "fda" package, have you reviewed the Wikipedia article on "Functional data analysis" and Ramsay, Hooker and Graves (2009) Functional data analysis with R and Matlab (Springer)? The latter provides more extensive documentation for the "fda" package. 4. Have you written directly to the maintainer of the "fda" package? You can get his email from help("fda"). Hope this helps. Spencer Graves On 2021-02-17 06:02, Faheem Jan via R-help wrote: I am new to functional time series, so my question may be stupid. I am forecasting one year ahead with functional data. Now I want to check the forecast accuracy by calculating the mean absolute percentage error, but I am unable to do this in R. Please help me, or suggest any link which will help me to solve my problem. Regards [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
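Since the underlying question is how to compute the mean absolute percentage error in R, here is a minimal sketch; the two vectors are illustrative stand-ins for the poster's observed values and one-year-ahead forecasts:

```r
# MAPE = mean(|actual - forecast| / |actual|) * 100.
# Note it is undefined whenever an actual value is zero.
actual   <- c(100, 120, 140, 160)  # illustrative observed values
forecast <- c(110, 115, 150, 155)  # illustrative forecasts
mape <- mean(abs((actual - forecast) / actual)) * 100
mape  # about 6.11 (percent)
```

For functional forecasts, the same line applies once the forecast curves have been evaluated on the same grid as the observed curves.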
Re: [R] Is there anyone who uses both R and Python here? How do you debug? Perhaps in RStudio?
You can mix R and Python code in the same R Markdown vignette. See: https://bookdown.org/yihui/rmarkdown/language-engines.html ```{r "RcodeChunk"} # R code ``` ```{python "PythonCodeChunk"} # Python code ``` I did this a couple of years ago. I haven't used Python since. However, this is described in the book Xie, Allaire, and Grolemund (2020) R Markdown: The Definitive Guide (Chapman & Hall, available for free at the above link). Spencer Graves On 2021-01-27 10:31, Robert Knight wrote: An iterative process works well: Python to get the data desired, and then Rscript script.r from a command line. My process involves building a script in R using RStudio, PyCharm, VS Code, Kate, or some other editor, then using data input built with Python as input to Rscript. The R scripts produce Excel files or CSV data for other use. RStudio is amazing for some slow-paced academic work. The "expected a numeric but got a char" error appeared too often for my needs, and so the workflows wound up with Python building data that's already cleaned for use in R, to avoid data import troubles. My code uses a functional paradigm rather than an object-oriented paradigm. Python does more than just munge my data, since it handles many mathematical operations on it, but its ultimate purpose is to clean large amounts of data to avoid import errors in R. On Wed, Jan 27, 2021, 1:49 AM C W wrote: Hello all, I'm a long time R user, but recently also using Python. I noticed that RStudio rolled out Python through reticulate. It's great so far! My question is, how do you debug in Python? In R, I simply step through the code script in my console with cmd+enter. But you can't do that with Python, some of them are objects. Here's my example.

class person:
    def __init__(self, id, created_at, name, attend_date, distance):
        """Create a new `person`. """
        self._id = id
        self.created_at = created_at
        self.name = name
        self.attend_date = attend_date
        self.distance = distance

    @classmethod
    def get_person(self, employee):
        """Find and return a person by. """
        return person(employee['created_at'],
                      employee['id'],
                      employee['name'],
                      employee['attend_date'],
                      employee['distance'])

The error message says self._id was 'str', but expecting an 'int'. I can't do: self._id = 5 I guess it's "hidden". Can't really assign and test like that. It seems hardcore Python programmers just use a debugger, and do not understand the greatness of an interactive IDE and console. I'd still like to stay in the IDE, hopefully. So, how are R users coping with object classes? Do you just instantiate every time? What if you got 10 of these class person objects to debug? I know this may be a Python question. But, I really wanted to see from an R user's working experience. Thanks a lot, Mike [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merging multiple .csv files
Did you work through the examples in help("merge")? Also, have you looked at the "dplyr" package? It has 9 different vignettes. The lead author is Hadley Wickham, who won the 2019 COPSS Presidents' Award for work like this. Alternatively, you could manually read all 10 files, then figure out from that what the merged object should look like, with what rows and what columns. I might start with something like:

PosAll <- unique(c(table1$Pos, table2$Pos, ... ))
Combined <- data.frame(Pos=PosAll, col1=NA, col2=NA, ...)

Then create the object you want that contains a column "Pos" = PosAll. Then:

rownames(table1) <- table1$Pos
rownames(table2) <- table2$Pos
...
sel1 <- PosAll %in% table1$Pos
...
Combined[sel1, names(table1)] <- table1[PosAll[sel1], ]

CAVEAT: I haven't tried this exact code. It may not work as written. However, I've done things like this in the past. "merge" and "dplyr" have seemed too much like black magic for me. Spencer On 2020-12-15 16:13, pooja sinha wrote: I know that, but I do not want to merge them sequentially, because I may lose some rows which are present in one file while the other doesn't have them. I googled and found something called multmerge, but the code is not working for me. I used the following:

path <- "P:/Documents/Puja Desktop items/Documents/RESULTS/est meth results/Final_est_meth_data_rn6/WGBS_exon_sort_CHG/merge_csv"
filenames <- list.files(path = path)
View(filenames)
multmerge = function(path){
  filenames = list.files(path=path, full.names=TRUE)
  datalist = lapply(filenames, function(x){read.csv(file=x, header=T)})
  Reduce(function(x,y) {merge(x,y)}, datalist)
}
full_data = multmerge("~/P:/Documents/Puja Desktop items/Documents/RESULTS/est meth results/Final_est_meth_data_rn6/WGBS_exon_sort_CHG/merge_csv")

But after running full_data, the folder is empty. Thanks, Puja On Tue, Dec 15, 2020 at 4:51 PM Bert Gunter wrote: ?read.csv to read your csv files into data frames ?merge to merge them (sequentially). 
Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip) On Tue, Dec 15, 2020 at 1:36 PM pooja sinha wrote: Hi All, I have 10 .csv files containing 12 to 15 columns each, with some columns in common. I need to join all of my .csv files into one using one common column 'Pos'. The header portion of my .csv files looks as shown below: Chrom Pos Avg Stdev A15_3509.C A31_3799.C A32_3800.C A35_3804.C Gene ID Class ExNum Can anyone help me with the code for the above problem? Any help will be highly appreciated. Thanks, Puja [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
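For completeness, here is a hedged sketch of the 'multmerge' idea from the thread with the behavior Puja seems to want: merging on the common column "Pos" with `all = TRUE` (a full outer join), so rows present in only some files are not dropped. The directory name is illustrative:

```r
multmerge <- function(path) {
  filenames <- list.files(path = path, pattern = "\\.csv$", full.names = TRUE)
  datalist  <- lapply(filenames, read.csv, header = TRUE)
  # all = TRUE keeps every Pos value seen in any file, filling NA
  # where a file has no matching row.
  Reduce(function(x, y) merge(x, y, by = "Pos", all = TRUE), datalist)
}

# full_data <- multmerge("merge_csv")  # path to the folder of .csv files
```

Note the original code failed partly because it passed the bare path string (with a stray "~/" prefix) and merged on all shared columns with the default `all = FALSE`, which silently drops non-matching rows.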
Re: [R] Google Summer of Code 2021
Hello, Aarnob Guha: I'm not sure, but r-de...@r-project.org might be a better list for this question. Over two years ago, I responded to a "Call For Proposals" from the R Consortium suggesting a project to improve the capabilities for searching R packages. That proposal was not funded, but documentation of the basic idea is still available: https://en.wikiversity.org/wiki/Draft_Proposal_for_improving_the_ability_of_R_users_to_search_R_packages See also: https://en.wikiversity.org/wiki/Searching_R_Packages Hope this helps. Spencer Graves On 2020-12-08 03:09, Aarnob Guha wrote: Dear R Project for Statistical Computing, My name is Aarnob Guha and I am very enthusiastic about participating in Google Summer of Code in 2021. I've been going through past projects in the Google Summer of Code archives for this organisation and I was curious as to whether R Project for Statistical Computing has any projects planned for 2021? If so, I was curious as to what these projects might be so that I can get an idea of what I need to learn in order to adequately do well in the project if I end up doing it. If you are ok with disclosing it that is. Otherwise, it is also fine. Thank you. Also, I wasn't particularly sure which mailing list to send this email to, so I just sent it to the general 'help' mailing list. I apologise for any inconvenience caused by this. Kind Regards, Aarnob Guha. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to save Results in svg format
I often have trouble with font sizes in the svg files, because they rarely are the same as what I see on the screen. I then have to read the documentation for "par" and play with the cex, cex.axis, cex.lab, cex.main, and cex.sub parameters until I get what I want in the svg file. I may also need to play with the "mar" argument in "par". Occasionally, I may also have to research some of the other plot parameters. In some cases, I may also need a png file. I can use "png" in the same way as "svg", but the cex, etc., parameters must be set differently. I've recently learned how to open an svg file in the free open-source software GIMP and adjust the "Resolution" from the 90 pixels/in default to something more like 300 to get what I want. I recently had trouble importing an svg file into a Google Doc: the 90 pixels per inch default conversion didn't look very clean and sharp after being imported. I tried 600 pixels/in; the Google Doc appeared to accept it at first, but when I went back later I found it had subsequently malfunctioned. I tried 300 pixels/in, and I think that worked, though I'm not 100% certain. Hope this helps. Spencer Graves On 2020-12-03 23:21, David Carlson wrote: If you look at the examples on the manual pages for the upgma() and NJ() functions you will see that the results are generally sent to the plot() function. To save that graph as an .svg file you need to open a graphics device using the svg() function, plot the data, and close the graphics device. You probably need something like the following: svg("UPGMA_5x5.svg") plot(UPGMA_5x5) dev.off() or svg("NJ_119x119.svg") plot(NJ_119x119) dev.off() There are numerous arguments that set various details for the plot that you should look at (?svg). 
David L Carlson Texas A&M University On Thu, Dec 3, 2020 at 10:24 PM Anas Jamshed wrote:

# Loading the required libraries
library(ape)
library(phangorn)
library(seqinr)

# Importing the required files
align_5 <- read.alignment("C:/Users/VAMSI/align 5.fasta", format = "fasta")
align_119 <- read.alignment("C:/Users/VAMSI/align 119.fasta", format = "fasta")

# Computing the distance matrix for both UPGMA and NJ algorithm implementations
matrix_5x5 <- dist.alignment(align_5, matrix = "identity")
summary(matrix_5x5)
matrix_119x119 <- dist.alignment(align_119, matrix = "identity")
summary(matrix_119x119)

# Implementation of the UPGMA algorithm for a small matrix (5x5) and the entire matrix (119x119)
UPGMA_5x5 <- upgma(matrix_5x5)
UPGMA_119x119 <- upgma(matrix_119x119)
summary(UPGMA_5x5)
summary(UPGMA_119x119)

# Implementation of the NJ algorithm for a small matrix (5x5) and the entire matrix (119x119)
NJ_5x5 <- NJ(matrix_5x5)
NJ_119x119 <- NJ(matrix_119x119)
summary(NJ_5x5)
summary(NJ_119x119)

I have done this whole analysis but don't know how I can save my tree file in svg or some other image format. In the above script, I am applying the phylogenetic algorithms to the distance matrices which I created from the fasta files. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
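Combining the two replies above into one runnable sketch: open the svg device, adjust text sizes via par() if the on-screen defaults don't carry over well, plot, then close the device. The plotted data and the specific cex/mar values below are illustrative placeholders for the tree objects in the thread:

```r
# Open an svg device; width/height are in inches.
svg("tree.svg", width = 7, height = 5)

# Tune text sizes and margins if the svg output doesn't match the screen.
par(cex.axis = 1.1, cex.lab = 1.2, cex.main = 1.3, mar = c(4, 4, 2, 1))

plot(1:10, main = "placeholder for plot(UPGMA_5x5)")  # illustrative data

dev.off()  # the file is written only once the device is closed
```

Forgetting `dev.off()` is a common reason the output file appears empty or truncated.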
Re: [R] Inappropriate color name
Might it be appropriate to raise that question on the Talk page associated with the Wikipedia article on "Indian red (color)": https://en.wikipedia.org/wiki/Indian_red_(color) Many Wikimedians are generally sympathetic to discussions of political correctness and similar topics. If the name of that article were changed, then it should be a lot easier to pursue a similar name change elsewhere. Spencer Graves On 2020-11-17 15:25, Mitchell Maltenfort wrote: What about just amputating the final "n"? "Indian" might mean one of two things, but "India" is pretty distinct. On Tue, Nov 17, 2020 at 4:10 PM T. A. Milne via R-help wrote: Apologies to the list for continuing a thread which is clearly off-topic. However, contacting the maintainer of an R package to complain about this specific color name seems ill-considered. 1) The name "indian red" is a part of widely-used color schemes everywhere, not just in R. It's the color defined as: "The color indianred / Indian red with hexadecimal color code #cd5c5c is a shade of red. In the RGB color model #cd5c5c is comprised of 80.39% red, 36.08% green and 36.08% blue. In the HSL color space #cd5c5c has a hue of 0° (degrees), 53% saturation and 58% lightness. This color has an approximate wavelength of 611.37 nm." https://encycolorpedia.com/cd5c5c 2) The "indian" in the color name refers to ferric oxide, historically sourced from India. Per Wikipedia: "The name Indian red derives from the red laterite soil found in India, which is composed of naturally occurring iron oxides.[citation needed] The first recorded use of Indian red as a color term in English was in 1672.[3]" https://en.wikipedia.org/wiki/Indian_red_(color) Given the name refers to the locus of the ferric oxide source, it isn't obvious that any particular group should be offended by the name. -- T. 
Arthur Milne On Nov 16, 2020, at 5:46 PM, Rolf Turner wrote: On Tue, 17 Nov 2020 07:54:01 +1100 Jim Lemon wrote: Hi Elaine, There seems to be a popular contest to discover offence everywhere. I don't think that it does anything against racism, sexism or antidisestablishmentarianism. Words are plucked from our vast lexicon to comfort or insult our fellows depending upon the intent of the user. It is the intent that matters, not the poor word. Chasing the words wastes your time, blames those who use the words harmlessly, and gives the real offender time to find another epithet. Jim: This is superbly expressed. I wish that I could have said that! Your posting should go down in the annals of brilliant rhetoric, alongside Dr. Johnson's "Letter to Lord Chesterfield". cheers, Rolf You know, I wouldn’t have continued this thread (which has now wandered off topic from the original somewhat-more-technical question), but I feel now like it’s necessary to do so (and only fair, if anyone is considering moderating me after letting these posts by): That is a view commonly held by white people, and even more overwhelmingly by white men. Our field is already not as diverse as it should be for a variety of reasons, and this “pretending no one else on earth exists” kind of stuff is at least some part of the reason. The question at issue here aside, white men complaining about people finding racism or sexism everywhere they look doesn’t pass the sniff test. Most or all of these things that people are reporting as offensive are being reported by people you’re clearly not listening to. Further, impact is what matters. If I step on your foot, I apologize, regardless of whether or not it was intentional, because it’s the right thing to do. If someone tells you “that thing you’re saying is offensive or is hurting me” and you say “I didn’t mean it,” and then keep right on doing it, what does it say to the person on the receiving end of it? 
All that anyone being "blamed," as you put it, is being asked to do is to try to do better next time. -- #BlackLivesMatter || \\UTGERS, |---*O*--- ||_// the State | Ryan Novosielski - novos...@rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\of NJ | Office of Advanced Research Computing - MSB C630, Newark `' __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] WG: Packages for mathematics
Several resources for "Searching R Packages" are discussed in a Wikiversity article by that title: https://en.wikiversity.org/wiki/Searching_R_Packages I've found the "sos" package on CRAN (which has a vignette describing its use) and "https://www.rdocumentation.org" to be most useful. Many functions are included in base R. Many more are provided in contributed packages. Spencer Graves On 2020-11-16 04:02, ELLENBERGER Mihaela via R-help wrote: From: ELLENBERGER Mihaela Sent: Monday, 16 November 2020 10:24 To: cran-submissi...@r-project.org Subject: Packages for mathematics Hello, I'm an undergraduate student of Biomedical Sciences in Switzerland, learning how to work with R. Currently I'm using RStudio. I would like to know which packages I should install for mathematics (analysis). My colleagues don't all use the same maths software (some of them are using Excel, SPSS, Mathematica, etc.): - functions, series, sequences, limits, continuity - integral calculus - differential calculus - multivariate analysis Could you send me some advice, please? There are so many packages in R. Kind regards, Mihaela Ellenberger [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
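As a starting point before reaching for contributed packages, much of the analysis the student lists (e.g., numerical integration) is already in base R; a minimal sketch:

```r
# Search installed help pages for a topic, then use a base function.
help.search("integrate")            # fuzzy search across installed packages

res <- integrate(dnorm, -Inf, Inf)  # numerical integration in base R
res$value                           # total probability of a normal density, ~1
```

Similar base tools exist for differentiation (`D`, `deriv` for symbolic derivatives) and sequences/limits can often be explored directly with vectors; the "sos" package's `findFn()` extends the search to all of CRAN.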
Re: [R] analyzing results from Tuesday's US elections
On 2020-11-07 23:39, Abby Spurdle wrote: What can you tell me about plans to analyze data from this year's general election, especially to detect possible fraud? I was wondering if there's any R packages with out-of-the-box functions for this sort of thing. Can you please let us know, if you find any. I might be able to help with such an effort. I have NOT done much with election data, but I have developed tools for data analysis, including web scraping, and included them in R packages available on the Comprehensive R Archive Network (CRAN) and GitHub.[1] Do you have a URL for detailed election results? Or even better, a nice R-friendly CSV file... I recognize that the results aren't complete. And that such a file may need to be updated later. But that doesn't necessarily prevent modelling now. I asked, because I don't know of any such. With the increasingly vicious, widespread and systematic attacks on the integrity of elections in the US, I think it would be good to have a central database of election results with tools regularly scraping websites of local and state election authorities. Whenever new data were posted, the software would update the central repository and send emails to anyone interested. That could simplify data acquisition, because historical data could already be available there. And it would be one standard format for the entire US and maybe the world. This could be extremely valuable in exposing electoral fraud, thereby reducing its magnitude and effectiveness. This is a global problem, but it seems to have gotten dramatically worse in the US in recent years.[2] I'd like to join -- or organize -- a team of people working on this. If we can create the database and data analysis tools in a package like Ecfun on CRAN, I think we can interest college profs, especially those teaching statistics to political science students, who would love to involve their students in something like this. 
They could access data in real time in classes, analyze it using standard tools that we could develop, and involve their students in discussing what it means and what it doesn't. They could discuss Bayesian sequential updating and quality control concepts using data that are real and relevant to the lives of their students. It could help get students excited about both statistics and elections. Such a project may already exist. I know there are projects at some major universities that sound like they might support this. However, with the limited time I've invested in this so far, I didn't find any that seemed to provide easy access to such data and an easy way to join such a project. Ballotpedia has such data but doesn't want help in analyzing it, and asked for a few hundred dollars for data for one election cycle in Missouri, which is what I requested. I can get that for free from the web site of the Missouri Secretary of State. I thought I might next ask the Carter Center about this. However, I'm totally consumed with other priorities right now. I don't plan to do anything on this in the short term -- unless I can find collaborators. If such a central database doesn't exist -- and maybe even if it does -- I thought it might be good to make all the data available in a standard format in Wikidata, which is a project of the Wikimedia Foundation, which is also the parent organization of Wikipedia. Then I could help create software and documentation on how to scrape data from the web sites of different election organizations that have it and automatically update Wikidata while also sending emails to people who express interest in those election results. Then we could create software for analyzing such data and make that available, e.g., on Wikiversity, which is another project of the Wikimedia Foundation -- with the R code in Ecfun or some other CRAN package. 
If we start now, I think we could have something mediocre in time for various local elections that occur next year with improvements for the 2022 US Congressional elections and something even better for the 2024 US presidential elections. Thanks for asking. Spencer Graves [1] https://github.com/sbgraves237 [2] https://en.wikiversity.org/wiki/Electoral_integrity_in_the_United_States __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] analyzing results from Tuesday's US elections
Hello: What can you tell me about plans to analyze data from this year's general election, especially to detect possible fraud? I might be able to help with such an effort. I have NOT done much with election data, but I have developed tools for data analysis, including web scraping, and included them in R packages available on the Comprehensive R Archive Network (CRAN) and GitHub.[1] Penny Abernathy, who holds the Knight Chair in Journalism and Digital Media Economics at UNC-Chapel Hill, told me that the electoral fraud that disqualified the official winner of the 2018 NC-09 race for the US House was detected by a college prof who accessed the data two weeks after the election.[2] Spencer Graves [1] https://github.com/sbgraves237 [2] https://en.wikiversity.org/wiki/Local_Journalism_Sustainability_Act
Re: [R] CRAN R For Windows 3.6.3
If you have production code written in R that makes it expensive to even consider upgrading to the latest R, it may be worth paying the support fees of an organization like RStudio. Otherwise, I think it makes sense to upgrade to the latest version and hope for the best. If you encounter problems, you can ask someplace on StackExchange, on one of the R email lists like this one, or a package maintainer, as Duncan said. Spencer On 2020-09-19 12:44, Duncan Murdoch wrote: On 18/09/2020 1:39 p.m., Shapira, Leeor via R-help wrote: Can you please let me know the End of Life and End of Vendor Support dates for CRAN R for Windows 3.6.3? Thank you. R doesn't have either of those. There is no vendor support ever. It is free software; it is up to its users to support it. On the other hand, it is free software, so you can use it forever. In practice, there is de facto support from its authors in that they are very responsive to bug reports. That ends with the next release, so 3.6.3 support ended in April, 2020 when R 4.0.0 was released. Another way to think of support and end of life equivalents is to ask how long CRAN will provide the source code to packages for it. There are no time limits on that, though it can be some work to find a set and tools to build them if you are using older releases. And finally, you might want to know how long CRAN will keep updating binary packages for R 3.6.3. I think that should continue until the release of R 4.1.0, sometime around April 2021. Duncan Murdoch 
Re: [R] [External] Re: help with web scraping
Dear Rasmus Liland et al.: On 2020-07-25 11:30, Rasmus Liland wrote: On 2020-07-25 09:56 -0500, Spencer Graves wrote: Dear Rasmus et al.: It is LILAND et al., is it not? ... else it's customary to put a comma in there, isn't it? ... The APA Style recommends "Sharp et al., 2007": https://blog.apastyle.org/apastyle/2011/11/the-proper-use-of-et-al-in-apa-style.html Regarding Confucius, I'm confused. right, moving on: On 2020-07-25 04:10, Rasmus Liland wrote: Please research using Thunderbird, Claws mail, or some other sane e-mail client; they are great, I promise. Thanks. I researched it and turned off HTML. Please excuse: I noticed it was a problem, but hadn't prioritized time to research and fix it until your comment. Thanks. Before my last post, I had written code to do all that. Good! In brief, the political offices are "h3" tags. Yes, some type of header element at least, in between the various tables, everything children of the div in the element tree. I used "strsplit" to split the string at "<h3>". I then wrote a function to find "</h3>", extract the political office, and pass the rest to "XML::readHTMLTable", adding columns for party and political office. Yes, doing that for the political office is also possible, but the party is inside the table's caption tag, which ends up as the name of the table in the XML::readHTMLTable list ... However, this suppressed "<br/>" everywhere. Why is that, please explain. I don't know why the Missouri Secretary of State's web site includes "<br/>" to signal a new line, but it does. I also don't know why XML::readHTMLTable suppressed "<br/>" everywhere it occurred, but it did that. After I used gsub to replace "<br/>" with "\n", I found that XML::readHTMLTable did not strip "\n", so I got what I wanted. I thought there should be an option with something like "XML::readHTMLTable" that would not delete "<br/>" everywhere, but I couldn't find it. No, there is not, AFAIK. 
Please, if anyone else knows, please say so *echoes in the forest* If you aren't aware of one, I can gsub("<br/>", "\n", ...) on the string for each political office before passing it to "XML::readHTMLTable". I just tested this: It works. Such a great hack! IMHO, this is much more flexible than using xml2::read_html, rvest::read_table, dplyr::mutate like here[1] I have other web scraping problems in my work plan for the next few days. Maybe, idk ... I will definitely try XML::htmlTreeParse, etc., as you suggest. I wish you good luck, Rasmus [1] https://stackoverflow.com/questions/38707669/how-to-read-an-html-table-and-account-for-line-breaks-within-cells And I added my solution to this problem to this Stackoverflow thread. Thanks again, Spencer
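[For later readers of the archive: the workaround described above can be sketched as follows. This is a minimal illustration only -- the HTML string is made up, not the actual Missouri Secretary of State page -- and the claim that the "\n" survives is taken from the thread above.]

```r
## Sketch of the gsub() workaround discussed above.  The HTML here
## is an invented miniature, not the real page.
html <- '<table><tr><th>Name</th><th>Mailing Address</th></tr>
<tr><td>DOE, JANE</td><td>4476 FIVE MILE RD<br/>SENECA MO 64865</td></tr></table>'

## XML::readHTMLTable silently drops <br/>, so substitute a newline
## for it *before* parsing:
html2 <- gsub("<br/>", "\n", html, fixed = TRUE)

tb <- XML::readHTMLTable(html2, stringsAsFactors = FALSE)[[1]]
tb[1, "Mailing Address"]
## Per the thread, the "\n" is retained, so the two address lines
## stay distinguishable.
```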
Re: [R] [External] Re: help with web scraping
Dear Rasmus et al.: On 2020-07-25 04:10, Rasmus Liland wrote: > On 2020-07-24 10:28 -0500, Spencer Graves wrote: >> Dear Rasmus: >> >>> Dear Spencer, >>> >>> I unified the party tables after the >>> first summary table like this: >>> >>> url <- >>> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" >>> M_sos <- RCurl::getURL(url) >>> saveRDS(object=M_sos, file="dcp.rds") >>> dat <- XML::readHTMLTable(M_sos) >>> idx <- 2:length(dat) >>> cn <- unique(unlist(lapply(dat[idx], colnames))) >> This is useful for this application. >> >>> dat <- do.call(rbind, >>> sapply(idx, function(i, dat, cn) { >>> x <- dat[[i]] >>> x[,cn[!(cn %in% colnames(x))]] <- NA >>> x <- x[,cn] >>> x$Party <- names(dat)[i] >>> return(list(x)) >>> }, dat=dat, cn=cn)) >>> dat[,"Date Filed"] <- >>> as.Date(x=dat[,"Date Filed"], >>> format="%m/%d/%Y") >> This misses something extremely >> important for this application: The >> political office. That's buried in >> the HTML or whatever it is. I'm using >> something like the following to find >> that: >> >> str(LtGov <- gregexpr('Lieutenant Governor', M_sos)[[1]]) > Dear Spencer, > > I came up with a solution, but it is not > very elegant. Instead of showing you > the solution, hoping you understand > everything in it, I instead want to give > you some emphatic hints to see if you > can come up with a solution on your own. 
> > - XML::htmlTreeParse(M_sos) >- *Gandalf voice*: climb the tree > until you find the content you are > looking for flat out at the level of > "The Children of the Div", *uuuUUU* >- you only want to keep the table and > header tags at this level > - Use XML::xmlValue to extract the >values of all the headers (the >political positions) > - Observe that all the tables on the >page you were able to extract >previously using XML::readHTMLTable, >are at this level, shuffled between >the political position header tags, >this means you extract the political >position and party affiliation by >using a for loop, if statements, >typeof, names, and [] and [[]] to grab >different things from the list >(content or the bag itself). >XML::readHTMLTable strips away the >line break tags from the Mailing >address, so if you find a better way >of extracting the tables, tell me, >e.g. you get > > 8805 HUNTER AVE<br/>KANSAS CITY MO 64138 > >and not > > 8805 HUNTER AVEKANSAS CITY MO 64138 > > When you've completed this "programming > quest", you're back at the level of the > previous email, i.e. you have the > same tables, but with political position > and party affiliation added to them. Please excuse: Before my last post, I had written code to do all that. In brief, the political offices are "h3" tags. I used "strsplit" to split the string at "<h3>". I then wrote a function to find "</h3>", extract the political office and pass the rest to "XML::readHTMLTable", adding columns for party and political office. However, this suppressed "<br/>" everywhere. I thought there should be an option with something like "XML::readHTMLTable" that would not delete "<br/>" everywhere, but I couldn't find it. If you aren't aware of one, I can gsub("<br/>", "\n", ...) on the string for each political office before passing it to "XML::readHTMLTable". I just tested this: It works. I have other web scraping problems in my work plan for the next few days. I will definitely try XML::htmlTreeParse, etc., as you suggest. 
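[A rough translation of Rasmus's hints into code, for the archive. This is only a sketch: it assumes the page source is already in the character string M_sos and that each office's <h3> header is a sibling of the table(s) that follow it, which may not match the live site's exact structure.]

```r
## Sketch of the tree-walking approach hinted at above.
## Assumes M_sos holds the page's HTML as one character string.
doc <- XML::htmlParse(M_sos)

## The political offices are the <h3> headers.
h3 <- XML::getNodeSet(doc, "//h3")

res <- lapply(h3, function(h) {
  office <- XML::xmlValue(h)
  ## First table following this header (relative XPath from the node).
  tblNode <- XML::getNodeSet(h, "./following-sibling::table[1]")
  if (length(tblNode) == 0) return(NULL)
  tb <- XML::readHTMLTable(tblNode[[1]], stringsAsFactors = FALSE)
  tb$Office <- office   # carry the office designation along
  tb
})
```

On the real page one would still need to handle multiple party tables per office, as discussed elsewhere in this thread.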
Thanks again. Spencer Graves > > Best, > Rasmus
[R] prevent XML::readHTMLTable from suppressing <br/>
Hello, All: Thanks to Rasmus Liland, William Michels, and Luke Tierney for their help with my earlier web scraping question. With their help, I've made progress. Sadly, I still have a problem: One field has "<br/>", which gets suppressed by XML::readHTMLTable: sosURL <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" sosChars <- RCurl::getURL(sosURL) MOcan <- XML::readHTMLTable(sosChars) MOcan[[2]][1, 2] [1] "4476 FIVE MILE RDSENECA MO 64865" (Seneca <- regexpr('SENECA', sosChars)) substring(sosChars, Seneca-22, Seneca+14) [1] "4476 FIVE MILE RD<br/>SENECA MO 64865" How can I get essentially the same result but without having XML::readHTMLTable suppress "<br/>"? NOTE: I get something very similar with xml2::read_html and rvest::html_table: sosPointers <- xml2::read_html(sosChars) MOcan2 <- rvest::html_table(sosPointers) MOcan2[[2]][1, 2] [1] "4476 FIVE MILE RDSENECA MO 64865" MOcan2 does not have names, and some of the fields are automatically converted to integers, which I think is not smart in this application. Thanks, Spencer Graves
Re: [R] [External] Re: help with web scraping
Dear Rasmus: On 2020-07-24 09:16, Rasmus Liland wrote: > On 2020-07-24 08:20 -0500, luke-tier...@uiowa.edu wrote: >> On Fri, 24 Jul 2020, Spencer Graves wrote: >>> On 2020-07-23 17:46, William Michels wrote: >>>> On Thu, Jul 23, 2020 at 2:55 PM Spencer Graves >>>> wrote: >>>>> Hello, All: >>>>> >>>>> I've failed with multiple >>>>> attempts to scrape the table of >>>>> candidates from the website of >>>>> the Missouri Secretary of >>>>> State: >>>>> >>>>> https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975 >>>> Hi Spencer, >>>> >>>> I tried the code below on an older >>>> R-installation, and it works fine. >>>> Not a full solution, but it's a >>>> start: >>>> >>>>> library(RCurl) >>>> Loading required package: bitops >>>>> url <- >>>>> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" >>>>> M_sos <- getURL(url) >>> Hi Bill et al.: >>> >>> That broke the dam: It gave me a >>> character vector of length 1 >>> consisting of 218 KB. I fed that to >>> XML::readHTMLTable and >>> purrr::map_chr, both of which >>> returned lists of 337 data.frames. >>> The former retained names for all >>> the tables, absent from the latter. >>> The columns of the former are all >>> character; that's not true for the >>> latter. >>> >>> Sadly, it's not quite what I want: >>> It's one table for each office-party >>> combination, but it's lost the >>> office designation. However, I'm >>> confident I can figure out how to >>> hack that. 
>> Maybe try something like this: >> >> url <- >> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" >> h <- xml2::read_html(url) >> tbl <- rvest::html_table(h) > Dear Spencer, > > I unified the party tables after the > first summary table like this: > > url <- > "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" > M_sos <- RCurl::getURL(url) > saveRDS(object=M_sos, file="dcp.rds") > dat <- XML::readHTMLTable(M_sos) > idx <- 2:length(dat) > cn <- unique(unlist(lapply(dat[idx], colnames))) This is useful for this application. > dat <- do.call(rbind, > sapply(idx, function(i, dat, cn) { > x <- dat[[i]] > x[,cn[!(cn %in% colnames(x))]] <- NA > x <- x[,cn] > x$Party <- names(dat)[i] > return(list(x)) > }, dat=dat, cn=cn)) > dat[,"Date Filed"] <- > as.Date(x=dat[,"Date Filed"], > format="%m/%d/%Y") This misses something extremely important for this application: The political office. That's buried in the HTML or whatever it is. I'm using something like the following to find that: str(LtGov <- gregexpr('Lieutenant Governor', M_sos)[[1]]) After I figure this out, I will use something like your code to combine it all into separate tables for each office, and then probably combine those into one table for the offices I'm interested in. For my present purposes, I don't want all the offices in Missouri, only the executive positions and those representing parts of the Kansas City metro area in the Missouri legislature. Thanks again, Spencer Graves > write.table(dat, file="dcp.tsv", sep="\t", > row.names=FALSE, > quote=TRUE, na="N/A") > > Best, > Rasmus
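[Rasmus's rbind trick above generalizes to any named list of data frames with partially overlapping columns. Here is a self-contained sketch with toy data -- the names, addresses, and dates are invented, not the real candidate tables.]

```r
## Toy stand-in for XML::readHTMLTable() output: a named list of
## data frames whose columns only partially overlap.
dat <- list(
  Republican = data.frame(Name = "DOE, JANE",
                          "Date Filed" = "2/25/2020",
                          check.names = FALSE),
  Democratic = data.frame(Name = "ROE, RICHARD", City = "SENECA",
                          "Date Filed" = "3/31/2020",
                          check.names = FALSE)
)

cn <- unique(unlist(lapply(dat, colnames)))  # union of all column names

combined <- do.call(rbind, lapply(names(dat), function(nm) {
  x <- dat[[nm]]
  miss <- setdiff(cn, colnames(x))
  if (length(miss)) x[, miss] <- NA  # fill absent columns with NA
  x <- x[, cn]                       # same column order everywhere
  x$Party <- nm                      # keep the list name as the party
  x
}))
combined[, "Date Filed"] <-
  as.Date(combined[, "Date Filed"], format = "%m/%d/%Y")
```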
Re: [R] [External] Re: help with web scraping
On 2020-07-24 08:20, luke-tier...@uiowa.edu wrote: Maybe try something like this: url <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" h <- xml2::read_html(url) Error in open.connection(x, "rb") : HTTP error 404. Thanks for the suggestion, but this failed for me on the platform described in "sessionInfo" below. tbl <- rvest::html_table(h) As I previously noted, RCurl::getURL returned a single character string of roughly 218 KB, from which I've so far gotten most but not all of what I want. Unfortunately, when I fed that character vector to rvest::html_table, I got: Error in UseMethod("html_table") : no applicable method for 'html_table' applied to an object of class "character" I don't know for sure yet, but I believe I'll be able to get what I want from the single character string using, e.g., gregexpr and other functions. Thanks again, Spencer Graves Best, luke On Fri, 24 Jul 2020, Spencer Graves wrote: Hi Bill et al.: That broke the dam: It gave me a character vector of length 1 consisting of 218 KB. I fed that to XML::readHTMLTable and purrr::map_chr, both of which returned lists of 337 data.frames. The former retained names for all the tables, absent from the latter. The columns of the former are all character; that's not true for the latter. Sadly, it's not quite what I want: It's one table for each office-party combination, but it's lost the office designation. However, I'm confident I can figure out how to hack that. Thanks, Spencer Graves On 2020-07-23 17:46, William Michels wrote: Hi Spencer, I tried the code below on an older R-installation, and it works fine. 
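[The gregexpr() approach mentioned above, sketched on a made-up string for the archive -- the office names and HTML below are invented for illustration, not taken from the real page.]

```r
## Locating section headers in raw HTML with gregexpr(), as discussed
## above.  M is an invented miniature of the real page source.
M <- "<h3>Governor</h3><table>...</table><h3>Lieutenant Governor</h3><table>...</table>"

m <- gregexpr("<h3>", M, fixed = TRUE)[[1]]

## Cut the string into one chunk per office; each chunk can then be
## handed to XML::readHTMLTable() separately, keeping the office name.
starts <- as.integer(m)
ends   <- c(starts[-1] - 1L, nchar(M))
chunks <- substring(M, starts, ends)
```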
Not a full solution, but it's a start: library(RCurl) Loading required package: bitops url <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" M_sos <- getURL(url) print(M_sos) [1] "\r\n\r\n\r\n\r\n\r\n\tSOS, Missouri - Elections: Offices Filed in Candidate Filing\r\n ..." [output truncated] On Thu, Jul 23, 2020 at 2:55 PM Spencer Graves wrote: Hello, All: I've failed with multiple attempts to scrape the table of candidates from the website of the Missouri Secretary of State: https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975 I've tried base::url, base::readLines, xml2::read_html, and XML::readHTMLTable; see summary below. Suggestions? Thanks, Spencer Graves sosURL <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" str(baseURL <- base::url(sosURL)) # this might give me something, but I don't know what sosRead <- base::readLines(sosURL) # 404 Not Found sosRb <- base::readLines(baseURL) # 404 Not Found sosXml2 <- xml2::read_html(sosURL) # HTTP error 404. sosXML <- XML::readHTMLTable(sosURL) # List of 0; does not seem to be XML sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Catalina 10.15.5 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets [6] methods base loaded via a namespace (and not attached): [1] compiler_4.0.2 tools_4.0.2 curl_4.3 [4] xml2_1.3.2 XML_3.99-0.3 
Re: [R] help with web scraping
Hi Bill et al.: That broke the dam: It gave me a character vector of length 1 consisting of 218 KB. I fed that to XML::readHTMLTable and purrr::map_chr, both of which returned lists of 337 data.frames. The former retained names for all the tables, absent from the latter. The columns of the former are all character; that's not true for the latter. Sadly, it's not quite what I want: It's one table for each office-party combination, but it's lost the office designation. However, I'm confident I can figure out how to hack that. Thanks, Spencer Graves On 2020-07-23 17:46, William Michels wrote: Hi Spencer, I tried the code below on an older R-installation, and it works fine. Not a full solution, but it's a start: library(RCurl) Loading required package: bitops url <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" M_sos <- getURL(url) print(M_sos) [1] "\r\n\r\n\r\n\r\n\r\n\tSOS, Missouri - Elections: Offices Filed in Candidate Filing\r\n ..." [output truncated] On Thu, Jul 23, 2020 at 2:55 PM Spencer Graves wrote: Hello, All: I've failed with multiple attempts to scrape the table of candidates from the website of the Missouri Secretary of State: https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975 I've tried base::url, base::readLines, xml2::read_html, and XML::readHTMLTable; see summary below. Suggestions? Thanks, Spencer Graves sosURL <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" str(baseURL <- base::url(sosURL)) # this might give me something, but I don't know what sosRead <- base::readLines(sosURL) # 404 Not Found sosRb <- base::readLines(baseURL) # 404 Not Found sosXml2 <- xml2::read_html(sosURL) # HTTP error 404. 
sosXML <- XML::readHTMLTable(sosURL) # List of 0; does not seem to be XML sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Catalina 10.15.5 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets [6] methods base loaded via a namespace (and not attached): [1] compiler_4.0.2 tools_4.0.2 curl_4.3 [4] xml2_1.3.2 XML_3.99-0.3
[R] help with web scraping
Hello, All: I've failed with multiple attempts to scrape the table of candidates from the website of the Missouri Secretary of State: https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975 I've tried base::url, base::readLines, xml2::read_html, and XML::readHTMLTable; see summary below. Suggestions? Thanks, Spencer Graves sosURL <- "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975" str(baseURL <- base::url(sosURL)) # this might give me something, but I don't know what sosRead <- base::readLines(sosURL) # 404 Not Found sosRb <- base::readLines(baseURL) # 404 Not Found sosXml2 <- xml2::read_html(sosURL) # HTTP error 404. sosXML <- XML::readHTMLTable(sosURL) # List of 0; does not seem to be XML sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Catalina 10.15.5 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets [6] methods base loaded via a namespace (and not attached): [1] compiler_4.0.2 tools_4.0.2 curl_4.3 [4] xml2_1.3.2 XML_3.99-0.3
Re: [R] R 4.0.2 is released
Thank you and all the core R team. Spencer Graves On 2020-06-22 03:21, Peter Dalgaard via R-help wrote: The build system rolled up R-4.0.2.tar.gz (codename "Taking Off Again") this morning. The list below details the changes in this release. You can get the source code from http://cran.r-project.org/src/base/R-4/R-4.0.2.tar.gz or wait for it to be mirrored at a CRAN site nearer to you. Binaries for various platforms will appear in due course. For the R Core Team, Peter Dalgaard These are the checksums (md5 and SHA-256) for the freshly created files, in case you wish to check that they are uncorrupted: MD5 (AUTHORS) = b9c44f9f78cab3184ad9898bebc854b4 MD5 (COPYING) = eb723b61539feef013de476e68b5c50a MD5 (COPYING.LIB) = a6f89e2100d9b6cdffcea4f398e37343 MD5 (FAQ) = 4afa171cd982aaa60f0ba92e2e7bc5d6 MD5 (INSTALL) = 7893f754308ca31f1ccf62055090ad7b MD5 (NEWS) = 566a6bb3642e28e6bf01cf98db31137c MD5 (NEWS.0) = bfcd7c147251b5474d96848c6f57e5a8 MD5 (NEWS.1) = eb78c4d053ec9c32b815cf0c2ebea801 MD5 (NEWS.2) = 496062c138e2def06cebccddfb814ac6 MD5 (NEWS.3) = 012e7f4a80cc8ec947bf3f0ff6117ec8 MD5 (R-latest.tar.gz) = 1eac7293d5fe313a56ddabfda02b437e MD5 (README) = f468f281c919665e276a1b691decbbe6 MD5 (RESOURCES) = 529223fd3ffef95731d0a87353108435 MD5 (THANKS) = 251d20510bfc3cc93b82c5a99f7efcc6 MD5 (VERSION-INFO.dcf) = 62496d3a0fd8cc2ed644ea518c052371 MD5 (R-4/R-4.0.2.tar.gz) = 1eac7293d5fe313a56ddabfda02b437e 2cde824a7b18958e5f06b391c801c8288be0f84fa8934b7ddefef23c67e60c09 AUTHORS e6d6a009505e345fe949e1310334fcb0747f28dae2856759de102ab66b722cb4 COPYING 6095e9ffa777dd22839f7801aa845b31c9ed07f3d6bf8a26dc5d2dec8ccc0ef3 COPYING.LIB eddf87b12197c7b3b19cbc9b11c1beab95b14e3dcd715bf37d2f6a8b2a72c2a1 FAQ f87461be6cbaecc4dce44ac58e5bd52364b0491ccdadaf846cb9b452e9550f31 INSTALL ec05bba338358410fae6b34fed061605989ab3601aba1b3fcb45a610d5dd2eb9 NEWS 4e21b62f515b749f80997063fceab626d7258c7d650e81a662ba8e0640f12f62 NEWS.0 12b30c724117b1b2b11484673906a6dcd48a361f69fc420b36194f9218692d01 NEWS.1 
e80de410c77f05ff2012fa70051b89119845f734a7fa5c55857e61e4ed7d5f6e NEWS.2 7201d139947afa52b5e09d26dc01445edf444506264355b2185122bc1ed3dce0 NEWS.3 d3bceab364da0876625e4097808b42512395fdf41292f4915ab1fd257c1bbe75 R-latest.tar.gz 2fdd3e90f23f32692d4b3a0c0452f2c219a10882033d1774f8cadf25886c3ddc README 408737572ecc6e1135fdb2cf7a9dbb1a6cb27967c757f1771b8c39d1fd2f1ab9 RESOURCES c9c7cb32308b4e560a22c858819ade9de524a602abd4e92d1c328c89f8037d73 THANKS 10cc5f566a4a5ce49147e7dcfbe9180dba09ccb9efb17298b067309eb799e92e VERSION-INFO.dcf d3bceab364da0876625e4097808b42512395fdf41292f4915ab1fd257c1bbe75 R-4/R-4.0.2.tar.gz This is the relevant part of the NEWS file CHANGES IN R 4.0.2: UTILITIES: * R CMD check skips vignette re-building (with a warning) if the VignetteBuilder package(s) are not available. BUG FIXES: * Paths with non-ASCII characters caused problems for package loading on Windows PR#17833. * Using tcltk widgets no longer crashes R on Windows. * source(*, echo=TRUE) no longer fails in some cases with empty lines; reported by Bill Dunlap in PR#17769. * on.exit() now correctly matches named arguments, thanks to PR#17815 (including patch) by Brodie Gaslam. * regexpr(*, perl=TRUE) no longer returns incorrect positions into text containing characters outside of the Unicode Basic Multilingual Plane on Windows.
Re: [R] Rtools required
I just encountered something that looks exactly like that with R 4.0.0 and the latest RStudio AND with Rtools40 installed under Windows 10 (AND some manual adjustment of the path to delete references to old versions of Rtools and make sure Rtools40 was there). I got around it by having Rtools40 installed AND running "install.packages('xmlw')" inside R 4.0.0 inside a CMD prompt. That worked, even though the same command inside RStudio failed with: WARNING: Rtools is required to build an R package but no version of Rtools compatible with the currently running version of R was found. Note that the following incompatible version(s) of Rtools were found: - Rtools 3.5 (installed at C:\Rtools). I got this message after deleting references to C:\Rtools from the path and rebooting. *** I BELIEVE YOU NEED Rtools40 installed, because binaries for some packages are not (yet) available for R 4.0.0. Hope this helps. Spencer Graves On 2020-04-29 21:02, Bert Gunter wrote: Wouldn't packages that have to be built from source on installation require Rtools? Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Apr 29, 2020 at 6:48 PM Steven wrote: Thanks to all - very helpful. I searched from c:\ and found file .Renviron located in C:\Users\USER\Documents. That does it. I would like to pose an additional question, since it would also fall under the same subject line. This is an RStudio question, but since I am using the free version the RStudio folks would not help me. My students simply need to install packages and are not building any packages, so they have no reason to install Rtools. When they install (not build) packages (from CRAN or file archive) they received the same warning message saying "Rtools required to build a package". This is nonsense of course as they are, as I said, simply installing. 
I tell them to use an older RStudio version, specifically 1.1.463, that's free of that warning. Any idea? Steven Yen On 2020/4/30 06:30 AM, Fox, John wrote: Dear Steven, It's possible that Windows will hide .Renviron, but it's generally a good idea, in my opinion, in Folder Options > View to click "Show hidden files" and uncheck "hide extensions". Then .Renviron should show up (once you've created it). Best, John -Original Message- From: Bert Gunter Sent: Wednesday, April 29, 2020 5:50 PM To: Steven Cc: Fox, John ; R-help Mailing List Subject: Re: [R] Rtools required Type ?.Renviron ?R.home ?"environment variables" at the R prompt to get what I think should be the info you need (or at least useful info). Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Apr 29, 2020 at 2:37 PM Steven wrote: Thanks John. Where is file .Renviron located? It must be a hidden file. I cannot find it. On 2020/4/28 08:29 PM, Fox, John wrote: Dear Steven, Did you follow the instruction on the Rtools webpage to add PATH="${RTOOLS40_HOME}\usr\bin;${PATH}" to your .Renviron file? I hope this helps, John - John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada Web: http://socserv.mcmaster.ca/jfox On Apr 28, 2020, at 4:38 AM, Steven wrote: Dear All I updated to R-4.0.0 and also installed the latest Rtools 4.0 (to now the new default folder c:\rtools40). While compiling a package (binary) I received the following warning message saying Rtools is required. Any clues? Thanks. Steven Yen WARNING: Rtools is required to build R packages but is not currently installed. 
Please download and install the appropriate version of Rtools before proceeding: https://cran.rstudio.com/bin/windows/Rtools/
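[As a quick sanity check after installing Rtools40 and editing .Renviron as John describes, something like the following -- a sketch, run in a fresh R 4.0.x session on Windows -- confirms whether the build tools are actually on the PATH.]

```r
## Verify that Rtools40's tools are visible to R after adding
## PATH="${RTOOLS40_HOME}\usr\bin;${PATH}" to .Renviron and
## restarting R.  An empty string means the PATH edit didn't take.
Sys.which("make")

## If the pkgbuild package happens to be installed, this gives a
## more detailed diagnosis:
## pkgbuild::has_build_tools(debug = TRUE)
```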
Re: [R] R 3.6.2 is released
Ditto. Spencer Graves

On 2019-12-12 07:54, Rich Shepard wrote: On Thu, 12 Dec 2019, Peter Dalgaard via R-help wrote: The build system rolled up R-3.6.2.tar.gz (codename "Dark and Stormy Night") this morning. Peter, My thanks to all of you on the R core team. Regards, Rich

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming dates
On 2019-11-03 17:04, Peter Dalgaard wrote: On 3 Nov 2019, at 21:22 , David Winsemius wrote: On 11/3/19 11:51 AM, Bert Gunter wrote: === Hey, that's my birthday! Err, no it isn't... ;-) Is that November 3 of 2019 or March 19 of 2011 or 11 March 2019? [please excuse the typo in the earlier response] The English still use stones as a unit of mass, and most of the US still steadfastly refuses to seriously consider metrication or ISO 8601. I know an architect in the US, who has worked on several different projects every year for the past 40 years only one of which has been in metric units. Binary, octal or hex is superior to decimal, except for the fact that most humans have 10 digits on hands and feet. And decimal is vastly superior to arithmetic in mixed bases, e.g., adding miles, rods, yards, feet, inches, and 64ths. Spencer Graves __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R 3.6.1
What might be the likely outcomes of an attempt to reinstall 3.6.1? And might that depend on whether the current 3.6.1 was uninstalled first? Spencer

On 2019-07-17 07:17, Jeff Newmiller wrote: It would never make sense for such messages to reflect normal and expected operation, so hypothesizing about intentionally changing stack behavior doesn't make sense. The default format for saveRDS changed in 3.6.0. There may be bugs associated with that, but rolling back to 3.6.0 would just trade bugs. https://cran.r-project.org/doc/manuals/r-devel/NEWS.html

On July 16, 2019 8:56:28 PM CDT, Jan Galkowski wrote: Did something seriously change in R 3.6.1, at least for Windows, in terms of stack impacts? I'm encountering many problems with 00LOCK, needing to disable locking during installations. And I'm encountering "Error: C stack usage 63737888 is too close to the limit" in cases I did not before, even when all I'm doing is serializing an object to be saved with *saveRDS* or even *save.image()*. Yes, I know, I did not append a minimally complete example. Just wanted to see if it was just me, or if anyone else was seeing this. It's on Windows 7 HE and I've run *R* here for years. My inclination is to drop back to 3.6.0 if it is just me or if no one knows about this problem. Thanks, - Jan Galkowski.

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
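For reference, the NEWS item Jeff points to is the serialization change: R 3.6.0 made format version 3 the default for saveRDS() and friends (readable by R >= 3.5.0). Rolling back isn't necessary just for compatibility, since the old format can be requested per call:

```r
# Write an .rds in the pre-3.6.0 default format (version 2), so older R
# versions can still read it; then round-trip it to check nothing is lost.
obj <- list(x = 1:10, y = letters)
f <- tempfile(fileext = ".rds")
saveRDS(obj, f, version = 2)
identical(readRDS(f), obj)   # TRUE
```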
Re: [R] random truncation
PLEASE EXCUSE: This discussion has diverged from R into discussing the precise assumptions seemingly descriptive of an application that drove the initial post to this thread. A reply by Abby Spurdle seemed to me to raise questions whose answers may not be intelligible without material snipped from Spurdle's reply. I wish to thank Spurdle for the reply and apologize to those who feel this is an abuse of this list. I trust that those in the latter category will please not bother to read further. For anyone still interested in this problem, below please find my earlier analysis with corrections that attempt to respond to Spurdle's most recent concerns. Thanks, Spencer Graves

On 2019-07-12 22:31, Abby Spurdle wrote: > The distribution of the randomly truncated variable has thus four > parameters: a, b, mu and sigma. I was able to write down the likelihood > and attempted to maximise it

I read the Wikipedia article more carefully. The formula is relatively simple, and is based on the application of Bayes' Theorem. If one doesn't want to work out the integral, numerical methods can be used. However, the problem needs to be defined *precisely* first. Correct: in my case, I confess I hadn't thought this through completely before posting. I tried Rseek, as Bert Gunter suggested. That led me to the "truncreg" and "DTDA" packages, neither of which seemed to be what I wanted; thanks to Bert, Rolf, and Abby for your comments.

I'm observing a random variable Y[i] = (x[i]'b+e[i]) given Y[i]>(z[i]'c+f[i]), where the tick mark (') denotes transpose of a vector, and e and f are normally distributed with mean 0 and standard deviations s and t, respectively, i = 1:n. Thus, Y[i] follows a truncated normal distribution with mean x[i]'b and standard deviation s, with the truncation condition being that Y[i]>(z[i]'c+f[i]). I want the total of all the Y's from the untruncated distribution, i.e., including those truncated (and not observed).
I think the likelihood is the product of the density of Y[i] given x[i] and given that Y[i] is actually observed. By substituting Y[i] = (x[i]'b+e[i]) into the truncation condition Y[i]>(z[i]'c+f[i]), we get the following: (x[i]'b+e[i])>(z[i]'c+f[i]). This occurs if and only if: (x[i]'b-z[i]'c)>(f[i]-e[i]). Therefore, the probability that Y[i] is observed (and not truncated) is

Pr{Y[i] observed} = Phi((x[i]'b-z[i]'c)/sqrt(s^2+t^2)),

where Phi is the cdf of the standard normal. And then the likelihood for observation i can be written as follows:

f(y[i]|x[i], z[i], b, c, s, t) = phi((y[i]-x[i]'b)/s) / Phi((x[i]'b-z[i]'c)/sqrt(s^2+t^2)).

We may not be able to estimate "c" in this, because if one of the z[i]'s is nonzero, we can pick "c" so z[i]'c is Inf. That makes the denominator 0 and the likelihood Inf. (If all the z[i]'s are zero, we still cannot estimate "c".) However, if "b" is estimable, ignoring the truncation, then we can estimate "b", "s" and "t" given "c". And then the desired total of all the Y's, observed and unobserved, would be the sum of y[i] divided by Pr{Y[i] observed}.

This likelihood is simple enough that it can easily be programmed in R and maximized over variations in "b", "s" and "t" given "c". I can get starting values for "b" and "s" from "lm", ignoring the truncation. And I can first fit the model assuming t = s, then test whether it's different using a likelihood ratio. And I can try to estimate "c", but I should probably use values I can estimate from other sources until I'm comfortable with the estimates I get for "b", "s" and "t" given an assumed value for "c". Comments? Thanks so much. Spencer Graves

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
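The likelihood just described can be sketched in a few lines of R. This is only an illustrative implementation (all object names are made up, and "c" and "t" are held fixed, as suggested above); note that dnorm() supplies the 1/s factor that the phi() notation leaves implicit.

```r
# Negative log-likelihood of the randomly truncated model sketched above:
# log f(y[i]) = log[ phi((y[i]-x[i]'b)/s)/s ]
#             - log Phi((x[i]'b - z[i]'c)/sqrt(s^2 + t^2))
negLogLik <- function(par, y, X, Z, c.fixed, t.fixed) {
  k <- ncol(X)
  b <- par[1:k]
  s <- exp(par[k + 1])            # optimize over log(s) to keep s > 0
  mu  <- drop(X %*% b)            # x[i]'b
  cut <- drop(Z %*% c.fixed)      # z[i]'c
  -sum(dnorm(y, mu, s, log = TRUE) -
       pnorm((mu - cut) / sqrt(s^2 + t.fixed^2), log.p = TRUE))
}

# Simulated check: keep only y0 > z'c + f, then fit b and s with c, t fixed,
# starting from lm() as suggested above.
set.seed(1)
n  <- 500
X  <- cbind(1, rnorm(n))
y0 <- drop(X %*% c(2, 1)) + rnorm(n, sd = 0.5)
f  <- rnorm(n, sd = 0.5)
keep <- y0 > 1 + f                          # here z[i]'c = 1 for all i
y  <- y0[keep]
Xo <- X[keep, ]
Zo <- matrix(1, sum(keep), 1)
start <- c(coef(lm(y ~ Xo - 1)), log(sd(y)))
fit <- optim(start, negLogLik, y = y, X = Xo, Z = Zo,
             c.fixed = 1, t.fixed = 0.5)
fit$par[1:2]                                # estimates of b
```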
[R] random truncation
Hello: What do you suggest I do about modeling random truncation? I have data on a variable Y in strata S[0], S[1], ..., S[n], where Y is always observed in S[0] but is less often observed in the other strata. I assume that the probability of observing Y is a monotonically increasing function of Y and a monotonically decreasing function of d[i] = the distance from S[0] to S[i]. There is a section on "random truncation" in the Wikipedia article on "Truncated distribution".[1] It would be nice to have an R package that makes it relatively easy to model the truncation as a function of "d", and/or a publication describing someone doing this in R. (I also have a couple of other variables that influence the distribution of Y.) Thanks, Spencer Graves [1] https://en.wikipedia.org/wiki/Truncated_distribution#Random_truncation

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] density vs. mass for discrete probability functions
On 2019-03-15 08:37, peter dalgaard wrote: Mathematically, you can bring discrete and continuous distributions onto a common footing by defining probability functions as densities with respect to counting measure. You don't really need Radon-Nikodym derivatives to understand the idea, just the fact that sums can be interpreted as integrals with respect to counting measure, hence sum_{x in A} f(x) and int_A f(x) dx are essentially the same concept.

Correct. Thanks for clearing up my "mud". sg

-pd On 15 Mar 2019, at 01:43, Stefan Schreiber wrote: Dear R users, While experimenting with the dbinom() function and reading its documentation (?dbinom), it reads that "dbinom gives the density", but shouldn't it be called "mass" instead of "density"? I assume that it has something to do with keeping the function for "density" consistent across discrete and continuous probability functions - but I am not sure and was hoping someone could clarify? Furthermore, the help file for the dbinom() function references a link (http://www.herine.net/stat/software/dbinom.html) but it doesn't seem to land where it should. Maybe this could be updated? Thank you, Stefan

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
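Peter's sum-as-integral point is easy to check numerically: a discrete "density" sums to 1 over its support, just as a continuous density integrates to 1.

```r
# Density with respect to counting measure: sum over the support is 1.
sum(dbinom(0:10, size = 10, prob = 0.3))   # 1
# Density with respect to Lebesgue measure: integral over the line is 1.
integrate(dnorm, -Inf, Inf)$value          # 1 (numerically)
```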
Re: [R] 3.5.3
On 2019-03-15 08:54, Jeff Newmiller wrote: You really need to get into better touch with your feelings, man... how are we supposed to know what you want? ;-) In general you should be using the main release unless you know there is a bug in it that affects you and has specifically been addressed in the patched version. The patched version may have also introduced new bugs and has in general been tested less than the main release version. But hey... pick your own poison... maybe you want to contribute more bug reports per week... patched would be just your thing!

A "fortune"? sg

On March 15, 2019 6:29:52 AM PDT, Bill Poling wrote: Good morning. This may have already been asked; if so, my apologies. I want to upgrade to 3.5.3, however there is an alternative called patched. Do I want it or the basic 3.5.3? Thank you. https://www.google.com/search?q=difference+between+r+3.5.3+and+3.5.3+patched

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] density vs. mass for discrete probability functions
On 2019-03-14 19:43, Stefan Schreiber wrote: Dear R users, While experimenting with the dbinom() function and reading its documentation (?dbinom) it reads that "dbinom gives the density" but shouldn't it be called "mass" instead of "density"? I assume that it has something to do with keeping the function for "density" consistent across discrete and continuous probability functions - but I am not sure and was hoping someone could clarify?

The Wikipedia article on "Probability density function" gives the "Formal definition" that "the density of [a random variable] with respect to a reference measure ... is the Radon–Nikodym derivative". This sounds bizarre to people who haven't studied measure-theoretic probability, but it allows a unified treatment of continuous and discrete probabilities, as well as of distributions that are combinations of the two or neither. The "reference measure" for a discrete probability distribution is the "counting measure", which supports the use of the word "density" in this context as equivalent to "mass". For continuous distributions, the "reference measure" is routinely taken to be the "improper prior" that assigns measure 1 to any unit interval on the real line. Does that make it clear as mud? Spencer Graves https://en.wikipedia.org/wiki/Probability_density_function

Furthermore, the help file for the dbinom() function references a link (http://www.herine.net/stat/software/dbinom.html) but it doesn't seem to land where it should. Maybe this could be updated? Thank you, Stefan

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RMarkdown vignettes v. Jupyter notebooks?
Hello: What are the differences between Jupyter notebooks and RMarkdown vignettes? I'm trying to do real time monitoring of the broadcast quality of a radio station, and it seems to me that it may be easier to do that in Python than in R.[1] This led me to a recent post to "python-l...@python.org" that mentioned "Jupyter, Mathematica, and the Future of the Research Paper"[2] by Paul Romer, who won the 2018 Nobel Memorial Prize in Economics only a few days ago. In brief, this article suggests that Jupyter notebooks may replace publication in refereed scientific journals as the primary vehicle for sharing scientific research, because they make it so easy for readers to follow both the scientific and computational logic and test their own modifications. A "Jupyter Notebook Tutorial: The Definitive Guide"[3] suggested I first install Anaconda Navigator. I got version 1.9.2 of that. It opens with options for eight different "applications" including JupyterLab 0.34.9, Jupyter Notebook 5.6.0, Spyder 3.3.1 (an IDE for Python), and RStudio 1.1.456. This leads to several questions: 1. In general, what experiences have people had with Jupyter Notebooks, Anaconda Navigator, and RMarkdown vignettes in RStudio, and the similarities and differences? Do you know any references that discuss this? 2. More specifically, does it make sense to try to use RStudio from within Anaconda Navigator, or is one better off using RStudio as a separate, stand alone application -- or should one even abandon RStudio and run R instead from within a Jupyter Notebook? [I'm new to this topic, so it's possible that this question doesn't even make sense.] Thanks, Spencer Graves [1] If you have ideas for how best to do real time monitoring of broadcast quality of a radio station, I'd love to hear them. I need software that will do that, preferably something that's free, open source. The commercial software I've seen for this is not adequate for my purposes, so I'm trying to write my own. 
I have a sample script in Python that will read a live stream from a radio tuner and output a *.wav of whatever length I want, and I wrote Python eight years ago for a similar real-time application. I'd prefer to use R, but I don't know how to get started. [2] 2018-04-13: "https://paulromer.net/jupyter-mathematica-and-the-future-of-the-research-paper". This further cites a similar article in The Atlantic from 2018-04-05: "www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676". __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using S4 objects in "with"?
On 2018-09-26 15:34, Duncan Murdoch wrote: On 26/09/2018 4:16 PM, Spencer Graves wrote: Is there anything comparable to "with" for S4 objects? EXAMPLE: A "Wave" object in the tuneR package has slots "left" and "right", plus others. I'd like to be able to do something like the following:

library(tuneR)
x <- seq(0, 2*pi, length = 6)
all.equal(x, rev(x))
channel <- round(32000 * sin(440 * x))
Wobj <- Wave(left = channel, right = rev(channel))
with(Wobj, quantile(left - right))

** This last statement throws "Error ... object 'left' not found". Is there something comparable to "with" that can do this?

I don't know of anything that is "officially sanctioned". A couple of ideas: 1. Slots in S4 are stored in attributes, so with(attributes(Wobj), quantile(left - right)) works. BUT: as far as I recall, this is an undocumented implementation detail, and you aren't supposed to count on it. 2. You could write an as.list() method for the Wave class; then with(as.list(Wobj), ...) would work. This may be the "right" way to do this.

Thanks. I'd prefer to have as.list.default convert every S4 object to a list, and have with(S4_object, ...) interpret it as equivalent to with(as.list(S4_object), ...). I think I'll do it other ways for the time being. Best Wishes, Spencer Duncan Murdoch

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
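Duncan's second suggestion can be sketched generically. The helper below (s4ToList is a made-up name, not an existing function) turns any S4 object's slots into a named list that with() can see; a toy class stands in for tuneR's Wave so the example is self-contained.

```r
library(methods)

# Convert an S4 object's slots to a named list, so with() can see them.
s4ToList <- function(object) {
  sapply(slotNames(object), slot, object = object, simplify = FALSE)
}

# Toy class standing in for tuneR's Wave (which also has left/right slots).
setClass("Pair", representation(left = "numeric", right = "numeric"))
p <- new("Pair", left = c(1, 2, 3), right = c(3, 2, 1))

with(s4ToList(p), left - right)   # -2 0 2
```

A setMethod("as.list", "Pair", ...) wrapping the same helper would make with(as.list(p), ...) work directly, as Duncan describes.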
[R] using S4 objects in "with"?
Is there anything comparable to "with" for S4 objects? EXAMPLE: A "Wave" object in the tuneR package has slots "left" and "right", plus others. I'd like to be able to do something like the following: library(tuneR) x <- seq(0, 2*pi, length = 6) all.equal(x, rev(x)) channel <- round(32000 * sin(440 * x)) Wobj <- Wave(left = channel, right=rev(channel)) with(Wobj, quantile(left-right)) ** This last statement throws "Error ... object 'left' not found". Is there something comparable to "with" that can do this? Thanks, Spencer Graves __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Summarizing R script
It depends on what you want, but I've found it very useful to create packages and submit them to CRAN. See "Creating R Packages" for how to do that.[1] Part of this involves creating vignettes using Rmarkdown within RStudio. Creating R packages and routinely running "R CMD check" sounds like it would take extra time. My experience has been very much the opposite, because it dramatically reduces the bugs in my software and makes it vastly easier to find the bugs that still exist. AND I have something I can just hand to others, and they can use it. That would be exceedingly difficult otherwise. And there are publications like "R Journal" that are looking for descriptions of what you've done. I have a paper in "R Journal" describing the "sos" package; that article is a vignette in that package. This process has worked for me.[2] Spencer

[1] Available from help.start(). See also "https://cran.r-project.org/doc/manuals/r-release/R-exts.html".

[2] The "sos" package is the fastest literature search I know for anything statistical. Its availability on CRAN, combined with the R Journal article, got me invited to help organize a plenary session on "Navigating the R Package Universe" at the useR!2017 conference in Brussels last year. This is an example of how creating an R package with a vignette has helped me find an audience.

On 2018-09-26 09:06, Roger Koenker wrote: I use R CMD BATCH foo, which produces a file called foo.Rout and, provided the script includes sessionInfo(), constitutes a quite sufficient summary for my purposes; it isn't exactly pretty, but it is informative.

On Sep 26, 2018, at 3:00 PM, Spencer Brackett wrote: R users, Is anyone aware of the proper procedure for summarizing a script (your complete list of functions, arguments, and error codes) within your R console, for say a formal report or publication?
Many thanks, Best wishes, Spencer Brackett

-- Forwarded message - From: CHATTON Anne via R-help Date: Wed, Sep 26, 2018 at 6:03 AM Subject: [R] Problems to obtain standardized betas in multiply-imputed data To: r-help@r-project.org

Dear all, I am having problems obtaining standardized betas on a multiply-imputed data set. Here is the code I used:

imp = mice(data, 5, maxit=10, seed=42, print=FALSE)
FitImp <- with(imp, lm(y ~ x1 + x2 + x3))

Up to here everything is fine. But when I ask for the standardized coefficients of the multiply-imputed regressions using this command:

sdBeta <- lm.beta(FitImp)

I get the following error message:

Error in b * sx : argument non numérique pour un opérateur binaire

(that is, "non-numeric argument to a binary operator"). Can anyone help me with this please? Anne

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [FORGED] Question on Binom.Confint
On 2018-09-14 08:52, Guo, Fang (Associate) wrote: It's method="lrt" and I used the "binom" package. The ultimate answer can be obtained as follows:

> debug(binom.confint)
> binom.confint(x = 0, n = 100, tol = 1e-8, method='lrt')

Then walk through the code line by line. Reading the code, I find the following:

if (any(method == "lrt") || all.methods) {
    res.lrt <- binom.lrt(x, n, conf.level = conf.level, ...)
    res <- if (is.null(res)) res.lrt else rbind(res, res.lrt)
}

Then check the help page for "binom.lrt". That includes the following: Confidence intervals are based on profiling the binomial deviance in the neighbourhood of the MLE. If x == 0 or x == n and bayes is TRUE, then a Bayesian adjustment is made to move the log-likelihood function away from Inf. Specifically, these values are replaced by (x + 0.5)/(n + 1), which is the posterior mode of f(p|x) using Jeffreys' prior on p. Furthermore, if conf.adj is TRUE, then the upper (or lower) bound uses a 1 - alpha confidence level. Typically, the observed mean will not be inside the estimated confidence interval. If bayes is FALSE, then the Clopper-Pearson exact method is used on the endpoints. This tends to make confidence intervals at the end too conservative, though the observed mean is guaranteed to be within the estimated confidence limits. Spencer

-Original Message- From: Rolf Turner [mailto:r.tur...@auckland.ac.nz] Sent: Thursday, September 13, 2018 10:02 PM To: Guo, Fang (Associate) Cc: r-help@R-project.org Subject: Re: [FORGED] [R] Question on Binom.Confint

On 09/14/2018 08:15 AM, Guo, Fang (Associate) wrote: Hi, I have a question with the function Binom.Confint(x,n,"method"=lrt). For the likelihood ratio test, I'd like to ask how you define the upper limit when the frequency of successes is zero. Thanks!

Point 1: This question is inappropriate for this list, since it is about statistical theory and not about R syntax and programming. Point 2: Where did you find the function Binom.Confint()?
I can find no such function anywhere. I did manage to locate a function binom.confint() (note the lower case "b" and "c") but it does not have an argument "method". Please do not expect those whom you are addressing to be telepathic. Point 3: Having "method"=lrt in the call is decidedly weird. Perhaps you meant method="lrt"; this is entirely different. cheers, Rolf Turner -- Technical Editor ANZJS Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
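The adjustment quoted from the binom.lrt help can be checked directly. The sketch below is a rough reconstruction of the described procedure for x = 0 (it is not the binom package's actual code, and the package's conf.adj handling may shift the cutoff): move the estimate to the Jeffreys posterior mode, then profile the binomial deviance out to the chi-square cutoff.

```r
# Rough sketch of a likelihood-ratio upper limit for x = 0 successes:
x <- 0; n <- 100
p.adj  <- (x + 0.5) / (n + 1)      # the quoted Bayesian adjustment
cutoff <- qchisq(0.95, 1)          # deviance cutoff at 95% confidence
dev <- function(p) -2 * (dbinom(x, n, p, log = TRUE) -
                         dbinom(x, n, p.adj, log = TRUE))
# Upper limit: the p above p.adj where the deviance reaches the cutoff.
upper <- uniroot(function(p) dev(p) - cutoff,
                 c(p.adj, 0.5), tol = 1e-9)$root
upper
```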
Re: [R] Question on Binom.Confint
On 2018-09-13 20:58, David Winsemius wrote: On Sep 13, 2018, at 1:15 PM, Guo, Fang (Associate) wrote: Hi, I have a question with the function Binom.Confint(x,n,"method"=lrt). For the likelihood ratio test, I'd like to ask how you define the upper limit when the frequency of successes is zero. Thanks! First you need to tell us where you are getting Binom.Confint. Error: object 'Binom.Confint' not found

sos::findFn('Binom.Confint') found NO function named "Binom.Confint", but it did find two named "binom.confint", one in the "binom" package and the other in the "NNTbiomarker" package. The same search on "rdocumentation.org" returned the same results. The indicated command would not work in either. Spencer

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] real time monitoring of streaming audio with R?
Hello, All: Is it feasible to do real-time monitoring of streaming audio with R, writing a compressed copy of what's read to 1-hour-long MP3 files? I'm a volunteer with a community radio station (kkfi.org). My minimum requirement at the moment is to create MP3 files from what we broadcast. I have a tuner hooked to the audio input of a computer. I can listen to it in real time using the computer's audio output jack, and I can record it manually using Audacity. If I have MP3 versions of both what we send to the tower and what is actually broadcast, I can read the two into R, compare them, identify substantive differences, write appropriate descriptions to files, send emails, etc. I don't know how to sample the live stream. I know it can be done in Python, but I don't know how, and I'd prefer to use R. I suspect it can be done with ffplay, part of ffmpeg, but again I don't know how. Thanks, Spencer Graves

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
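One hedged possibility, not taken from the thread: let ffmpeg itself do the capture and the hourly MP3 segmenting, driven from R via system2(). This assumes ffmpeg is installed and on the PATH; the stream URL below is a placeholder, and the run line is left commented so the sketch is inert.

```r
# Sketch: capture a live stream into hour-long MP3 segments with ffmpeg.
stream_url <- "http://example.invalid/live"          # placeholder URL
args <- c("-i", stream_url,
          "-acodec", "libmp3lame", "-b:a", "128k",   # MP3 at 128 kbps
          "-f", "segment", "-segment_time", "3600",  # 1-hour files
          "hour_%03d.mp3")
# system2("ffmpeg", args, wait = FALSE)   # uncomment to run in background
```

The hour-long files could then be read back into R (e.g., with tuneR) for the comparison step described above.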
Re: [R] Converting chr to num
Have you considered "Ecfun::asNumericChar" (and "Ecfun::asNumericDF")? DF <- data.frame(variable = c("12.6% ", "30.9%", "61.4%", "1")) Ecfun::asNumericChar(DF$variable) [1] 0.126 0.309 0.614 1.000 If you read the documentation including the examples, you will see that many of these issues and others are handled automatically in the way that I thought was the most sensible. If you disagree, we can discuss other examples and perhaps modify the code for those functions. Spencer Graves On 2018-08-20 00:26, Rui Barradas wrote: Hello, Inline. On 20/08/2018 01:08, Daniel Nordlund wrote: See comment inline below: On 8/18/2018 10:06 PM, Rui Barradas wrote: Hello, It also works with class "factor": df <- data.frame(variable = c("12.6%", "30.9%", "61.4%")) class(df$variable) #[1] "factor" as.numeric(gsub(pattern = "%", "", df$variable)) #[1] 12.6 30.9 61.4 This is because sub() and gsub() return a character vector and the instruction becomes an equivalent of what the help page ?factor documents in section Warning: To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)). Also, I would still prefer as.numeric(sub(pattern = "%$","",df$variable)) #[1] 12.6 30.9 61.4 The pattern is more strict and there is no need to search multiple occurrences of '%'. The pattern is more strict, and that could cause the conversion to fail if the process that created the strings resulted in trailing spaces. That's true, and I had thought of that but it wasn't in the OP's problem description. The '$' could still be used with something like "%\\s*$": as.numeric(sub('%\\s*$', '', df$variable)) #[1] 12.6 30.9 61.4 Rui Barradas Without the '$' the conversion succeeds. df <- data.frame(variable = c("12.6% ", "30.9%", "61.4%")) as.numeric(sub('%$', '', df$variable)) [1] NA 30.9 61.4 Warning message: NAs introduced by coercion Dan
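Pulling the thread's pieces together, a minimal self-contained version of the conversion (trailing-whitespace-tolerant pattern included) looks like this:

```r
# Strip a trailing "%" plus any trailing whitespace and coerce to numeric.
# sub() returns character whether the column is character or factor, so the
# same line handles both.
df <- data.frame(variable = c("12.6% ", "30.9%", "61.4%"),
                 stringsAsFactors = TRUE)   # force factor, as in the thread
pct <- as.numeric(sub("%\\s*$", "", df$variable))
pct  # 12.6 30.9 61.4
```

The `"%\\s*$"` pattern is the compromise discussed above: anchored at the end of the string, but tolerant of trailing spaces left by whatever produced the data.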
Re: [R] Fwd: Quadratic programming, for loop
sos::findFn('{quadratic programming}') just identified 156 help pages in 68 packages containing the term "quadratic programming". The function mentioned by Berwin Turlach, "solve.QP", is in package "quadprog", which has not been updated since 2016-12-20. I've used quadprog successfully, but you might wish to consider some of the other options in package(s) more actively maintained. The "print" method for sos::findFn('{quadratic programming}') produced two sheets in my default browser. The first of these contained 156 rows for the 156 help pages in 68 packages, sorted by default by c('Count', 'MaxScore', 'TotalScore', 'Package', 'Score', 'Function'). The second sheet listed only the 68 packages sorted by c('Count', 'MaxScore', 'TotalScore', 'Package'). You can click on the column headers to get them sorted in different orders, if you want. "sos::findFn" is for me the fastest literature search for anything statistical. I often write the list of help pages and the package summary to an Excel file using the "writeFindFn2xls" function, then annotate the package summary with other information to help me decide which package(s) and function(s) to try. Hope this helps. Spencer Graves, lead author of "sos" On 2018-06-26 07:01, Berwin A Turlach wrote: G'day all, On Tue, 26 Jun 2018 11:16:55 +0300 Maija Sirkjärvi wrote: It seems that my Amat and dvec are incompatible. Amat is a matrix of zeros size: *2*J-3,J* and dvec is a vector of length *J*. There should be no problem, but apparently there is. [...] solve.QP solves the quadratic program: min(-d^T b + 1/2 b^T D b) where A^T b >= b_0. Note the transpose. :) If dvec is of length *J*, then b will be of length J too, and Amat should be Jx(2J-3) so that its transpose is (2J-3)xJ, making it compatible for matrix multiplication with b. Cheers, Berwin
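Berwin's dimension rule can be seen in the small example from the quadprog documentation (assuming the quadprog package is installed): Amat has one row per element of b and one column per constraint, because solve.QP applies t(Amat):

```r
# solve.QP minimizes -d'b + (1/2) b'Db subject to t(Amat) %*% b >= bvec.
library(quadprog)
Dmat <- diag(3)                  # D: 3 x 3, so b has length 3
dvec <- c(0, 5, 0)               # d: length 3, matching b
Amat <- matrix(c(-4, -3, 0,      # each COLUMN is one constraint,
                  2,  1, 0,      # so Amat is 3 (length of b) x 3 (constraints)
                  0, -2, 1), nrow = 3)
bvec <- c(-8, 2, 0)
sol <- solve.QP(Dmat, dvec, Amat, bvec = bvec)
sol$solution  # approximately 0.476 1.048 2.095
```

In the original poster's terms: with dvec of length J, Amat must have J rows and (2J-3) columns, not the other way around.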
Re: [R] Systemfit Question
I got an error when I tried to source the "structure" below beginning with "Empdata <- structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L,": The source ended with " "98", "983", "991", "995", "997"), + + + + + + )) +" I supplied ")" and got another "+". When I supplied a second ")", I got, "Error in structure(c(15L, 487L, 397L, 323L, 233L, 315L, 370L, 309L, 268L, : argument is missing, with no default". Please excuse me if I missed something obvious that I might have seen if I had read this entire thread, but it looked to me like this example was intended to be self-contained and does not seem to be so. Spencer On 2018-05-16 17:21, Ngwinui Azenui wrote: I can't get my simultaneous equations to work using systemfit. Please help. #Reproducible script Empdata<- read.csv("/Users/ngwinuiazenui/Documents/UPLOADemp.csv") View(Empdata) str(Empdata) Empdata$gnipc<-as.numeric(Empdata$gnipc) install.packages("systemfit") library("systemfit") install.packages("plm") library("plm") pdata <- plm.data(Empdata, index=c("country","year")) #Simultaneous equation 1 eqprd<- prd ~ in_emp +gnipc eqemp<- in_emp ~ prd inst <- ~ gnipc system<- list(productivity = eqprd, industrialemp = eqemp) fit2sls<- systemfit(system, "2SLS", inst = inst, data = Empdata) #I get an error code when I run the last line #Simultaneous equation 2 eqprd2<- prd ~ in_emp + lag(prd, -1) + gnipc eqemp2<- in_emp ~ prd + lag(prd, -1) + conflict system2<- list(productivity = eqprd2, industrialemp = eqemp2) inst <- ~ lag(prd, -1) + gnipc + conflict fit2sls<- systemfit(system2, "2SLS", inst = inst, data = Empdata) #I also wanted to include year and country fixed effects (string variables) in the model in "Simultaneous equation 2" EmpSur<- systemfit(prd ~ in_emp + ag_emp, method = "SUR", data = pdata) #When I run the code below, I get a table with results by country print(EmpSur) #An excerpt of the output from running
Re: [R] Package for Molecular Properties
library(sos) (mp <- findFn('{molecular properties}')) ** found 7 matches in 4 packages and opened two web pages in my default browser with (a) the 7 matches and (b) the 4 packages. The first function was something for amino acids, like you suggested. Two others returned compound and substance information from PubChem. Does this help? Spencer On 2018-05-02 19:17, bbb_...@verizon.net wrote: All Is there a package or library that will, given a nucleotide sequence 1. calculate the extinction coefficient at 260 nm (for the Beer-Lambert law) 2. calculate molecular weight 3. return its complementary sequence I was able to find several packages that can do similar calculations for an amino acid sequence for proteins but none for nucleic acids. Any pointers, etc. would be great. thank you in advance Indr
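For the third request (the complementary sequence), no special package is needed; a base-R sketch is below. For the first two (extinction coefficient, molecular weight), dedicated sequence packages such as Bioconductor's Biostrings are the usual route -- that suggestion is an assumption, not something tested here:

```r
# Reverse complement of a nucleotide sequence in base R: chartr() swaps the
# bases, then the character vector is reversed and re-joined.
revcomp <- function(seq) {
  comp <- chartr("ACGTacgt", "TGCAtgca", seq)
  vapply(strsplit(comp, ""),
         function(s) paste(rev(s), collapse = ""),
         character(1))
}
revcomp("AATGC")  # "GCATT"
```

revcomp() is vectorized over its input, so it works on a whole column of sequences at once.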
Re: [R] Learning advanced R
On 2018-03-14 11:52, Rich Shepard wrote: On Wed, 14 Mar 2018, Duncan Murdoch wrote: I'm all for learning more languages and using the one that's best for each job, but for people who don't know Python, it would be helpful to list the aspects in which it excels. When should an R user choose to write something in Python instead? Duncan, "Best" is subjective, but my view is that the language most comfortable and familiar to the developer/analyst should be the one used. In my environmental consulting business I use both R and Python. While Python has support for many statistical models, I'm more comfortable with the ones available in R. For spatial analyses (separate from spatial statistics) I've used GRASS for > 20 years, and it heavily uses Python. I also use Python (along with emacs, awk, sed, and grep) for cleaning and organizing data. For writing, I use LaTeX (a markup language) and the LyX GUI front end. Python has a lot of support for scientific and financial analyses, as does R. Considering that there are a gazillion programming languages available (and used for essential applications, such as GnuCash, written in Guile, a Scheme variant, which I use for business and personal bookkeeping), picking the "best" one is strictly a personal matter. I prefer emacs; my system and network admin friends prefer vi. In Linux, at least, there are so many options for doing a task that sometimes it's difficult to decide which to use in a given situation. If the languages you know do all you need, then learn a new one only if it's to scratch an itch. :-) My software development productivity increased by a factor of maybe 30 by using first S-Plus and then R, including writing R packages, then RStudio and writing Rmarkdown vignettes. 1. I started writing Fortran in 1963. I've written assembly language for multiple machines, Cobol, Lisp, and other languages. I started using S-Plus in the early 1990s and abandoned it for R when I needed "debug" for some S-Plus code.
Developing R packages improved my software development productivity by a factor of 10, because the discipline of creating unit tests in "\examples" made it so much easier to debug and maintain -- AND share with others. 2. I've also written some Python, though not much. I used Emacs until I found RStudio. Vi and Emacs are not tools you can give to someone who is only marginally computer literate and expect them to be productive in a reasonable period of time. By contrast, if someone knows enough to be able to install R and RStudio, I can give them some R code and be confident that they will get something useful from the experience in a relatively short period of time. You can't do that with vi and Emacs unless they already know those applications. 3. Recently, I've started writing RMarkdown vignettes, and that further increased my productivity. 3.1. Two years ago, I told a client I was going to prepare an Rmarkdown vignette to document what I did with their data. My sales guy said absolutely, we were NOT going to give the client an Rmarkdown vignette. I spent a week analyzing the data and 6 months answering questions from the team, mostly by pointing them to certain lines in the vignette, occasionally by extending it. In the middle of that, we learned that the client required our analysis to be verifiable. After that, the vignette became a primary deliverable. 3.2. More recently, another client asked me to explain principal components. This client was moderately facile with software but not with R or vector spaces. I gave him an Rmarkdown vignette that included a principal components analysis of some data he gave me, done both with a single command and step by step, supplemented with a simple discussion of a one-dimensional subspace of two-dimensional space. He was happy. 4. I invite all to review and improve the discussion in the Wikipedia article on "Software repository". It has a table with a discussion of "Selected repositories", much of which I wrote 8 years ago.
It's heavily biased toward CRAN, because that's what I know best, and I've so far been unable to find anyone with the expertise and interest to improve it. This article averaged 290 views per day over the past 90 days -- over 26,000 in the past 3 months. If you can improve that article, an audience that size might be worth talking to. Spencer Graves Best regards, Rich
Re: [R] deparseDots to get names of all arguments?
On 2018-02-20 20:52, William Dunlap wrote: > Does substitute(...()) do what you want? That's the key. Thanks very much. Spencer Graves > myFunc <- function(x, ...) substitute(...()) > myFunc(y=1/(1:10), x=sin(3:1), z=stop("Oops"), "untagged arg") > $y > 1/(1:10) > $z > stop("Oops") > [[3]] > [1] "untagged arg" > names(.Last.value) > [1] "y" "z" "" > Bill Dunlap > TIBCO Software > wdunlap tibco.com > On Tue, Feb 20, 2018 at 2:36 PM, Spencer Graves <spencer.gra...@effectivedefense.org> wrote: > Hi, All: How can I get the names of all the arguments in dots(...)? I'm able to get the name of the first argument but not the second: > deparseDots <- function(...){ deparse(substitute(...)) } > a <- 1 > b <- 2 > deparseDots(a, b) > [1] "a" > I'd like to get c('a', 'b'). > Thanks, > Spencer Graves
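Building on Bill Dunlap's substitute(...()) trick, a small sketch that returns exactly what the original question asked for -- the names of all the ... arguments, falling back to the deparsed expression where no name was given. `argNames` is a hypothetical helper name for illustration:

```r
# substitute(...()) inside a function yields a list of the unevaluated
# expressions passed through ..., with names for any tagged arguments.
argNames <- function(...) {
  exprs <- substitute(...())
  nm <- names(exprs)
  if (is.null(nm)) nm <- rep("", length(exprs))
  unname(ifelse(nm == "", vapply(exprs, deparse, character(1)), nm))
}
a <- 1
b <- 2
argNames(a, b)          # "a" "b"
argNames(x = a, b + 1)  # "x" "b + 1"
```

Note that the arguments are never evaluated, which is why even something like stop("Oops") can be passed through safely, as in Bill's example.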
[R] deparseDots to get names of all arguments?
Hi, All: How can I get the names of all the arguments in dots(...)? I'm able to get the name of the first argument but not the second: deparseDots <- function(...){ deparse(substitute(...)) } a <- 1 b <- 2 deparseDots(a, b) [1] "a" I'd like to get c('a', 'b'). Thanks, Spencer Graves > sessionInfo() R version 3.4.3 (2017-11-30) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS High Sierra 10.13.3 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_3.4.3 tools_3.4.3 yaml_2.1.16
[R] Draft proposal for Searching R Packages
Hello, All: I just posted a "Draft Proposal for improving the ability of R users to search R packages" to Wikiversity (https://en.wikiversity.org/wiki/Draft_Proposal_for_improving_the_ability_of_R_users_to_search_R_packages). You are all invited to rewrite it in any way you think is more likely to produce the most useful result. Wikimedia invites contributors to "be bold but not reckless", writing from a neutral point of view and citing credible sources. I do NOT want to do this project: I think the world will be better if it is done, and I think others are better equipped to actually do it -- or manage others doing it -- than I am. If you read this, you will see that it contains critical gaps. I hope one or more of you will fill these critical gaps or help find others who will. As indicated there, the next major deadline is April 1. This sounds like lots of time, except that the key thing that is missing in this draft proposal is principal investigator(s). Without PI(s), it won't fly. Thanks, Spencer Graves, PhD Founder EffectiveDefense.org 7300 W. 107th St. # 506 Overland Park, KS 66212 ph: 408-655-4567
[R] Scraping info from a web site?
Hi, All: What would you suggest one use to read the data on members of the US Congress and their positions on net neutrality from "https://www.battleforthenet.com/scoreboard/" into R? I found recommendations for the "rvest" package to "Easily Harvest (Scrape) Web Pages". I tried the following: URL <- 'https://www.battleforthenet.com/scoreboard/' library(rvest) Bftn <- read_html(URL) str(Bftn) List of 2 $ node:<externalptr> $ doc :<externalptr> - attr(*, "class")= chr [1:2] "xml_document" "xml_node" However, I don't know what to do with this. The "Selectorgadget" vignette with rvest suggested selecting what I wanted on the web page and pasting that as an argument into "html_node". This led me to try the following: Bftn_nodes <- html_nodes(Bftn, '.psb-unknown , #house, #senate, #senate p') str(Bftn_nodes) List of 4 $ :List of 2 ..$ node:<externalptr> ..$ doc :<externalptr> ..- attr(*, "class")= chr "xml_node" [three more similar entries] - attr(*, "class")= chr "xml_nodeset" This seems like it may be progress, but I'm still confused about what to do next. Or maybe I should be using a different package? Or posting this question someplace else like StackOverflow.com? Thanks, Spencer Graves
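The usual next step after html_nodes() is html_text() (or html_table() for tabular markup). The sketch below demonstrates the workflow on a tiny inline HTML fragment, since the right selectors for the live page are unknown; the ".member" class is an assumption standing in for whatever Selectorgadget reports:

```r
# rvest workflow in miniature: parse HTML, select nodes by CSS selector,
# then extract their text. read_html() accepts a literal HTML string, which
# makes the pattern easy to test offline.
library(rvest)
html <- '<div>
           <p class="member">Sen. A: supports</p>
           <p class="member">Rep. B: unknown</p>
         </div>'
page <- read_html(html)
members <- html_text(html_nodes(page, ".member"), trim = TRUE)
members  # "Sen. A: supports" "Rep. B: unknown"
```

For the live page, one would substitute read_html("https://www.battleforthenet.com/scoreboard/") and experiment with selectors; if the positions turn out to be rendered by JavaScript rather than present in the HTML, rvest alone will not see them and a browser-driving tool would be needed instead.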
[R] Searching R Packages
Hello, All: Might you have time to review the article I recently posted to Wikiversity on "Searching R Packages" (https://en.wikiversity.org/wiki/Searching_R_Packages)? Please edit this yourself or propose changes in the associated "Discuss" page or in an email to this list or to me. My goal in this is to invite readers to turn that article into a proposal for improving the search capabilities in R that would ultimately be funded by, e.g., The R Foundation. What do you think? Please forward this to anyone you think might be interested. Best Wishes, Spencer Graves, PhD Founder EffectiveDefense.org 7300 W. 107th St. # 506 Overland Park, KS 66212
Re: [R] Getting all possible combinations
On 2017-08-23 11:35 AM, Bert Gunter wrote: ummm, Ista, it's 2^n. or (2^n - 1) if the empty set is not considered a "combination" ;-) spencer Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip) On Wed, Aug 23, 2017 at 8:52 AM, Ista Zahn wrote: On Wed, Aug 23, 2017 at 11:33 AM, Christofer Bogaso wrote: Hi again, I am exploring if R can help me to get all possible combinations of members in a group. Let's say I have a group with 5 members: A, B, C, D, E. Now I want to generate all possible unique combinations with all possible lengths from that group, e.g. 1st combination: A 2nd combination: B . 5th combination: E 6th combination: A, B 7th combination: B, C last combination: A, B, C, D, E Ideally, I have a fairly large group, so am looking for some programmatic way to generate all possible combinations. Be careful, the number of combinations grows pretty quickly. You can calculate the number ahead of time with sum(choose(n, 1:n)) where n is the number of values in your group. --Ista Any help will be highly appreciated. Thanks for your time.
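A base-R sketch of what the original poster asked for, using combn() over every length. As Ista and Bert note, the count is 2^n - 1 (excluding the empty set), so this is only feasible for small groups:

```r
# All non-empty combinations of all lengths: combn(x, k, simplify = FALSE)
# returns a list of the k-element subsets; lapply over k collects them all.
group  <- c("A", "B", "C", "D", "E")
n      <- length(group)
combos <- unlist(lapply(seq_len(n),
                        function(k) combn(group, k, simplify = FALSE)),
                 recursive = FALSE)
length(combos)  # 31, i.e. 2^5 - 1
combos[[6]]     # "A" "B" -- the first combination of length 2
```

For a "fairly large group" this explodes quickly: 20 members already give over a million combinations, so generating them lazily or restricting the lengths is usually wiser.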
Re: [R] Convert Factor to Date
On 2017-08-22 2:04 PM, Patrick Casimir wrote:
> This is a large data set, Spencer. What about when the dates change as below?

Have you tried what I suggested? What were the results? Spencer

> COL1    COL2
> Jan-14  1-Aug-16
> Feb-14  1-Aug-16
> Mar-14  1-Aug-16
> Apr-14  1-Aug-16
> May-14  1-Aug-16
> Jun-14  1-Aug-16
> Jul-14  1-Aug-16
> Aug-14  1-Aug-16
> Sep-14  1-Aug-16
> Oct-14  1-Aug-16
> Nov-14  1-Aug-16
> Dec-14  1-Aug-16
> Jan-15  1-Aug-16
> Feb-15  1-Aug-16
> Mar-15  1-Aug-16
> Apr-15  1-Aug-16
> May-15  1-Aug-16
> Jun-15  1-Aug-16
> Jul-15  1-Aug-16
> Aug-15  1-Aug-16
> Sep-15  1-Aug-16
> Oct-15  1-Aug-16
> Nov-15  1-Aug-16
> Dec-15  1-Aug-16
> Jan-16  1-Aug-16
> Feb-16  1-Aug-16
> Mar-16  1-Aug-16
> Apr-16  1-Aug-16
> May-16  1-Aug-16
> Jun-16  1-Aug-16
> Jul-16  1-Aug-16
> Aug-16  1-Aug-16
> Sep-16  1-Aug-16
> Oct-16  1-Aug-16
Re: [R] Convert Factor to Date
On 2017-08-22 1:30 PM, Patrick Casimir wrote: Dear R Fellows, I have a dataset (data1) with 2 columns of dates showing a class of factor. How to convert them to dates? Then compare them, keeping only the greater date in a new column. Using as.Date to change the class to Date, the data becomes NA. When I specified a format with the second date, I got the desired behavior: > as.Date(factor('1-Nov-16'), '%d-%b-%y') [1] "2016-11-01" > as.Date('Nov-16', '%b-%y') [1] NA > as.Date(factor('Nov-16'), '%b-%y') [1] NA To convert the first column, I pasted "1-" in front: as.Date(paste0('1-', factor('Nov-16')), '%d-%b-%y') Hope this helps. Spencer Much Thanks COL1 COL2 Apr-16 1-Nov-16 May-16 1-Nov-16 Jun-16 1-Nov-16 Jul-16 1-Nov-16 Aug-16 1-Nov-16 Sep-16 1-Nov-16 Oct-16 1-Nov-16 Nov-16 1-Nov-16 Dec-16 1-Nov-16 Jan-17 1-Nov-16 Feb-17 1-Nov-16 Mar-17 1-Nov-16 Apr-17 1-Nov-16 May-17 1-Nov-16 Jun-17 1-Nov-16 Jul-17 1-Nov-16 Aug-17 1-Nov-16 Sep-17 1-Nov-16
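Applying the paste0("1-", ...) idea to whole columns, and keeping the greater date with pmax(), gives a minimal sketch of the full task. Note that "%b" month-name parsing is locale-dependent, an assumption this example makes:

```r
# Convert both factor columns to Date (prepending a day to the month-year
# column), then keep the later of the two dates in a new column.
data1 <- data.frame(COL1 = c("Apr-16", "Nov-16", "Sep-17"),
                    COL2 = c("1-Nov-16", "1-Nov-16", "1-Nov-16"))
d1 <- as.Date(paste0("1-", data1$COL1), "%d-%b-%y")
d2 <- as.Date(data1$COL2, "%d-%b-%y")
data1$later <- pmax(d1, d2)   # pmax() works elementwise on Dates
data1$later  # "2016-11-01" "2016-11-01" "2017-09-01"
```

This scales to a large data set without any looping, since as.Date() and pmax() are both vectorized.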
Re: [R] Rounding of problem with sum command in R
On 2017-08-22 9:26 AM, niharika singhal wrote: Hello I have a vector v=c(0.0886,0.1744455,0.1379778,0.1209769,0.1573065,0.1134463,0.2074027) when i do sum(v) or 0.0886+0.1744455+0.1379778+0.1209769+0.1573065+0.1134463+0.2074027 i am getting output as 1 No: That's only the display: > sum(v)-1 [1] 1.6e-07 hope this helps. Spencer But if i add them manually i get 1.0026 I do not want to round off my value since it affects my code further Can anyone suggest how can i avoid this. Thanks & Regards Niharika Singhal
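The point of Spencer's reply is display rounding, not an arithmetic error: R prints 7 significant digits by default, so tiny floating-point discrepancies are invisible until you print more digits or subtract. A classic, self-contained illustration:

```r
# 0.1 + 0.2 is not exactly 0.3 in binary floating point, but the default
# 7-significant-digit printing hides the difference.
s <- sum(c(0.1, 0.2))
s                      # prints 0.3
print(s, digits = 17)  # 0.30000000000000004
s == 0.3               # FALSE
all.equal(s, 0.3)      # TRUE -- tolerance-based comparison
```

So nothing needs to be "avoided": the full precision is always stored; use print(x, digits = ...) to see it, and all.equal() rather than == when comparing computed doubles.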
Re: [R] weight in lm
On 2017-08-14 5:53 AM, peter dalgaard wrote: On 14 Aug 2017, at 10:13 , Troels Ring <tr...@gvdnet.dk> wrote: Dear friends - I hope you will accept a naive question on lm: R version 3.4.1, Windows 10. I have 204 "baskets" of three types corresponding to factor F, each of size from 2 to 33 containing measurements, and need to know if the standard deviation of the measurements in each basket, sdd, is different across types, F. Plotting the observed sdd versus the sizes from 2 to 33, called "k", does show a decreasing spread as k increases towards 33. I tried lm(sdd ~ F, weight=k) and got different results than when omitting the weight argument, but would it be more correct to use sqrt(k) as weight instead? I doubt that there is a "correct" way, but theory says that if the baskets have the same SD and data are normally distributed, then the variance of the sample VARIANCE is proportional to 1/f = 1/(k-1). Weights in lm are inverse-variance, so the "natural" thing to do would seem to be to regress the square of sdd with weights (k-1). (If the distribution is not normal, the variance of the sample variance is complicated by a term that involves both n and the excess kurtosis, whereas the variance of the sample SD is complicated in any case. All according to the gospel of St. Google.) The Wikipedia article on "standard deviation" gives the more general formula. (That article does NOT give a citation for that formula. If you know one, please add it -- or post it here, to make it easier for someone else to add it.) Thanks, Peter. Spencer Graves -pd Best wishes Troels Ring Aalborg, Denmark
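Peter's suggestion can be sketched as follows. The data here are simulated, since the original baskets are not available, so this shows only the model specification, not a definitive analysis:

```r
# Regress the SQUARED basket SDs on the type factor with weights (k - 1),
# since the variance of the sample variance is proportional to 1/(k - 1)
# under normality and lm() weights are inverse-variance.
set.seed(1)
k   <- sample(2:33, 204, replace = TRUE)                # basket sizes
F   <- factor(sample(c("a", "b", "c"), 204, replace = TRUE))  # basket types
sdd <- sapply(seq_along(k), function(i) sd(rnorm(k[i])))      # basket SDs
fit <- lm(I(sdd^2) ~ F, weights = k - 1)
anova(fit)  # F-test of whether the mean variance differs across types
```

With simulated homogeneous data like this, the test should usually be non-significant; with the real baskets, the same anova() line answers the original question.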
Re: [R] Kalman filter for a time series
On 2017-07-29 11:26 PM, Staff wrote: I found an example at http://www.bearcave.com/finance/random_r_hacks/kalman_smooth.html That example is signed by "Ian Kaplan". There's a box at the bottom of the page for you to email him. shown below. But it seems the structSSM function has been removed from the KFAS library, or it never was part of KFAS. I don't know, so it won't run. Does anyone know how to fix the code so that it runs? Have you tried the vignette with KFAS? Hope this helps. Spencer Graves library(KFAS) library(tseries) library(timeSeries) library(zoo) library(quantmod) getDailyPrices = function( tickerSym, startDate, endDate ) { prices = get.hist.quote( instrument = tickerSym, start = startDate, end = endDate, quote="AdjClose", provider="yahoo", compression="d", quiet=T) prices.ts = ts(prices) return( prices.ts ) } kalmanFilter = function( x ) { t = x if (class(t) != "ts") { t = ts(t) } ssModel = structSSM( y = t, distribution="Gaussian") ssFit = fitSSM(inits=c(0.5*log(var(t)), 0.5*log(var(t))), model = ssModel ) kfs = KFS( ssFit$model, smoothing="state", nsim=length(t)) vals = kfs$a lastVal = vals[ length(vals)] return(lastVal) } Start = "2011-01-01" End = "2012-12-31" SandP = "^GSPC" windowWidth = 20 tsLength = 100 SAndP.ts = getDailyPrices( SandP, Start, End ) SAndP.ts = SAndP.ts[1:tsLength] SAndP.smoothed = rollapply( data=SAndP.ts, width=windowWidth, FUN=kalmanFilter) par(mfrow=c(1,1)) prices = coredata( SAndP.ts[windowWidth:length(SAndP.ts)]) plot(prices, col="blue", type="l") lines(coredata(SAndP.smoothed), col="magenta") par(mfrow=c(1,1))
Re: [R] Nash equilibrium and other game theory tools implemented in networks using igraph or similar
On 2017-06-28 11:36 AM, Suzen, Mehmet wrote:
> Hello Chris, I was implying that you are capable enough to implement it yourself, since you have already identified a research paper. If there is no package out there, uploading one to CRAN would help future users too. I am more than happy to help if you want to implement it from scratch.

sos::findFn("nash equilibrium") identified 4 packages: antitrust, GNE, GPGame, and nopp. I found GNE mentioned earlier in this thread but not the other three. Similar searches for "nash equilibrium networ" and "nash equilibrium networks" returned no matches.

By the way, one of the sessions at useR!2017 in Brussels next week will discuss "Navigating the R package universe". It will provide a brief overview of the tools currently available, followed by a general discussion of what people would want in tools that make it easier to find and use what they need among contributed packages, now well over 10,000. The planned discussion will briefly cover the "sos" package, RDocumentation.org, METACRAN (www.r-pkg.org), Task Views, and other tools. We're hoping that this session will facilitate the development of one or more teams to collaborate on the following:

* Creating common interfaces for different approaches to essentially the same problem, like "optimx".
* Improving Task Views.
* Improving search capabilities.

If you can make it to useR!2017, we hope to see you in this session, Wed. July 5, from 17:00 to 18:30 in the main meeting room. If you would like to help with this but can't make Brussels, please stay tuned for further announcements, or contact me after the conference for more information.

Spencer Graves

p.s. Please excuse me if I've hijacked this thread to promote this session, but I didn't see any mention of the sos package or RDocumentation.org, so it seemed appropriate.
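The search mentioned above can be reproduced as follows (assuming the sos package is installed; findFn() runs a full-text search of help pages across CRAN and opens a summary in the browser):

```r
# install.packages("sos")  # once, if not already installed
library(sos)

hits <- findFn("nash equilibrium")  # search CRAN help pages
summary(hits)                       # match counts by package
```

Note that findFn() needs internet access, and results change over time as packages are added to or archived from CRAN, so the 4 packages reported above reflect the state of CRAN in mid-2017.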
Best,
Mehmet

On 27 June 2017 at 17:45, Chris Buddenhagen <cbuddenha...@gmail.com> wrote:
> Does anyone know of some code, and examples, that implement game theory/Nash equilibrium hypothesis testing using existing packages like igraph/statnet or similar? Perhaps along the lines of this article:
>
> Zhang, Y., Aziz-Alaoui, M. A., Bertelle, C., & Guan, J. (2014). Local Nash Equilibrium in Social Networks. Scientific Reports, 4, 6224.
>
> Best,
> Chris Buddenhagen
> cbuddenha...@gmail.com
Re: [R] Nash equilibrium and other game theory tools implemented in networks using igraph or similar
On 2017-06-28 5:40 AM, Jeff Newmiller wrote:
> Showing your work so that someone else can either see something you missed or share in the joy when a rare answer comes through is what elevates such a posting from spam to shared research.

A "fortune"?

sg