Re: [R] Adding Year-Month-Day to X axis
Jim, Thank you very much! How do I instruct staxlab to label once every n days, rather than labeling every day?
Greg

> On May 5, 2018, at 6:50 PM, Jim Lemon wrote:
>
> staxlab(1,at=x_mmdd,labels=format(x_mmdd,"%Y-%m-%d"))

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
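One way to label only every n-th day is to subset both the positions and the labels before drawing the axis. The sketch below uses base axis() so that it runs without plotrix; passing the same subsetted vectors as at= and labels= to staxlab() should behave similarly, though that is untested here. The names n and idx are introduced only for illustration, and pdf(NULL) is there so the code also runs without a screen.

```r
# Data as posted in this thread.
y_duration <- c(301.59050, 387.35700, 365.64366, 317.26150, 321.71883,
                342.44950, 318.95350, 322.33233, 330.60333, 428.99516,
                297.82066)
x_mmdd <- seq(as.Date("2018-04-25"), as.Date("2018-05-05"), by = "day")

n   <- 3                                  # label every n-th day (assumed value)
idx <- seq(1, length(x_mmdd), by = n)     # positions of the dates to label

pdf(NULL)                                 # null device; drop this line interactively
plot(x_mmdd, y_duration, type = "l", xaxt = "n")
axis(1, at = as.numeric(x_mmdd[idx]),
     labels = format(x_mmdd[idx], "%Y-%m-%d"))
invisible(dev.off())
```

Changing n changes the spacing; the line itself is still drawn through all eleven points.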
Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?
--From: Duncan Murdoch  Send Time: 2018 May 6 (Sun) 04:58  To: 孙业平; David Winsemius  Cc: R Help Mailing List  Subject: Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?

On 05/05/2018 11:33 AM, 孙业平 wrote:
> [...]
> "dev.new(height = 10, width = 10)" doesn't work either. It produces a
> device with a size of [5.760417, 5.75]. My computer is an ordinary 14
> inch ThinkPad laptop. Is about 5 inches really the upper limit of the
> size of the R graphics device on a computer screen? I doubt it.

You ask questions in a very rude way. I'm going to let you figure this one out by yourself.

Duncan Murdoch

Sorry, Professor.
Re: [R] Adding Year-Month-Day to X axis
Hi Greg,
The only reason I included the staxlab function in the plotrix library was to fit all the dates onto the axis. If you want to try it:

install.packages("plotrix")

Jim

On Sun, May 6, 2018 at 9:02 AM, Gregory Coats wrote:
> Jim, Thanks for responding!
> I am using the official R 3.5.0 for Mac OS X.
> This apparently does not include library (plotrix)
>
> library(plotrix)
> Error in library(plotrix) : there is no package called ‘plotrix’
>
> Greg
>
> On May 5, 2018, at 6:50 PM, Jim Lemon wrote:
>
> Hi Greg,
> What you are getting there is a factor, interpreted as a 1:n sequence
> based on the sort order of your "dates". Here's a way to get dates on
> your x-axis in the format you want:
>
> x_mmdd<-as.Date(c("2018-04-25","2018-04-26","2018-04-27",
>  "2018-04-28","2018-04-29","2018-04-30","2018-05-01","2018-05-02",
>  "2018-05-03","2018-05-04","2018-05-05"),format="%Y-%m-%d")
> plot(x_mmdd, y_duration, type="l",xaxt="n")
> library(plotrix)
> staxlab(1,at=x_mmdd,labels=format(x_mmdd,"%Y-%m-%d"))
>
> Jim
Re: [R] Adding Year-Month-Day to X axis
"Apparently, R does not understand my Year-Month-Day"

I think, rather, you need to learn how R handles dates and times. See here to begin, perhaps:

?DateTimeClasses

There are many R resources for dealing with data over time, many of which are listed here, and others might be found by online searching.

https://cran.r-project.org/web/views/TimeSeries.html

There are also many tutorials on dealing with time data in R. Even a cursory web search should find many.

... and of course someone may respond directly to your query here (but not me, as I'm not that knowledgeable).

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, May 5, 2018 at 11:14 AM, Gregory Coats wrote:
> I am using R 3.5.0 for Mac OS X.
> Issuing these two commands yields the expected plot.
> y_duration <- c (301.59050, 387.35700, 365.64366, 317.26150,
> 321.71883, 342.44950, 318.95350, 322.33233, 330.60333, 428.99516,
> 297.82066)
> plot (y_duration, type="l")
>
> Adding Year-Month-Day values for the x axis, and then calling plot (x,y),
> yields a bizarre plot. Apparently, R does not understand my Year-Month-Day
> values.
> x_mmdd <- c (2018-04-25, 2018-04-26, 2018-04-27, 2018-04-28,
> 2018-04-29, 2018-04-30, 2018-05-01, 2018-05-02, 2018-05-03, 2018-05-04,
> 2018-05-05)
> plot (x_mmdd, y_duration, type="l")
>
> I would be enormously appreciative of your guidance.
> Greg Coats
> Virginia, USA
Re: [R] Adding Year-Month-Day to X axis
Hi Greg,
What you are getting there is a factor, interpreted as a 1:n sequence based on the sort order of your "dates". Here's a way to get dates on your x-axis in the format you want:

x_mmdd<-as.Date(c("2018-04-25","2018-04-26","2018-04-27",
 "2018-04-28","2018-04-29","2018-04-30","2018-05-01","2018-05-02",
 "2018-05-03","2018-05-04","2018-05-05"),format="%Y-%m-%d")
plot(x_mmdd, y_duration, type="l",xaxt="n")
library(plotrix)
staxlab(1,at=x_mmdd,labels=format(x_mmdd,"%Y-%m-%d"))

Jim

On Sun, May 6, 2018 at 4:14 AM, Gregory Coats wrote:
> I am using R 3.5.0 for Mac OS X.
> Issuing these two commands yields the expected plot.
> y_duration <- c (301.59050, 387.35700, 365.64366, 317.26150, 321.71883,
> 342.44950, 318.95350, 322.33233, 330.60333, 428.99516, 297.82066)
> plot (y_duration, type="l")
>
> Adding Year-Month-Day values for the x axis, and then calling plot (x,y),
> yields a bizarre plot. Apparently, R does not understand my Year-Month-Day
> values.
> x_mmdd <- c (2018-04-25, 2018-04-26, 2018-04-27, 2018-04-28, 2018-04-29,
> 2018-04-30, 2018-05-01, 2018-05-02, 2018-05-03, 2018-05-04, 2018-05-05)
> plot (x_mmdd, y_duration, type="l")
>
> I would be enormously appreciative of your guidance.
> Greg Coats
> Virginia, USA
[R] Adding Year-Month-Day to X axis
I am using R 3.5.0 for Mac OS X.
Issuing these two commands yields the expected plot.

y_duration <- c (301.59050, 387.35700, 365.64366, 317.26150, 321.71883, 342.44950, 318.95350, 322.33233, 330.60333, 428.99516, 297.82066)
plot (y_duration, type="l")

Adding Year-Month-Day values for the x axis, and then calling plot (x,y), yields a bizarre plot. Apparently, R does not understand my Year-Month-Day values.

x_mmdd <- c (2018-04-25, 2018-04-26, 2018-04-27, 2018-04-28, 2018-04-29, 2018-04-30, 2018-05-01, 2018-05-02, 2018-05-03, 2018-05-04, 2018-05-05)
plot (x_mmdd, y_duration, type="l")

I would be enormously appreciative of your guidance.
Greg Coats
Virginia, USA
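For what it is worth, with the values typed exactly as posted (no quotes), R reads each entry as arithmetic: 2018-04-25 is 2018 minus 4 minus 25. The short sketch below shows that, next to the quoted form converted with as.Date(). The three-element vectors are shortened from the post for illustration, and the names as_posted and as_dates are made up for this example.

```r
# Unquoted, each entry is ordinary subtraction, not a date.
as_posted <- c(2018-04-25, 2018-04-26, 2018-05-01)
print(as_posted)

# Quoted strings converted with as.Date() are real dates.
as_dates <- as.Date(c("2018-04-25", "2018-04-26", "2018-05-01"))
print(class(as_dates))
print(diff(as_dates))   # differences are measured in days
```

Plotting against as_dates instead of as_posted puts the points in calendar order on the x axis.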
Re: [R] Discovering patterns in textual strings
Jeff:

The previous solution I sent you was hugely inefficient and frankly kind of stupid. Here is a much better and simpler solution.

> z <- c("abc", "abc_def", "abc.def", "abc def", "abcd_ef", "abcd", "e","f")

## Create vector of patterns of same length as z, many of which are repeated
> pats <- sub("^(.+)[. _].*","\\1",z)

## Now can use tapply() to get indices if desired
## Note that the patterns label the groups
> tapply(seq_along(z),pats,I)
$abc
[1] 1 2 3 4

$abcd
[1] 5 6

$e
[1] 7

$f
[1] 8

No need to reply.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, May 5, 2018 at 12:14 AM, Bert Gunter wrote:
> "Does that help?"
>
> No. I am not your private consultant. You need to reply to the list, which
> I have cc'ed here, not just me.
>
> I am still somewhat confused by your specifications, but others may not
> be. Part of my confusion stems from your failure to provide a reproducible
> example (see e.g. the posting guide linked below). For example, I cannot
> tell from your text whether the Abc and Bce strings contain one or more
> spaces at the end. I shall assume they may but need not.
>
> Anyway, here is a reproducible example and solution that assumes that the
> substrings/patterns of interest to you occur at the beginning of the
> strings and may or may not be followed by one of "." "_" or " "(space) and
> then possibly further text which should be ignored. Assuming that you are
> familiar with regular expressions, maybe this will help to get you started
> even if I have misunderstood your specifications. If you aren't familiar
> with regex's, maybe the stringr package may provide a gentler interface
> than using R's raw regex functionality. Or maybe someone else can suggest a
> better approach (which is another reason why you should reply to the list,
> not just me).
>
> z <- c("abc",
>        "abc_def",
>        "abc.def",
>        "abc def",
>        "abcd_ef",
>        "abcd",
>        "e","f")
>
> pats <- unique(sub("^(.+)[. _]+.*", "\\1", z))
> ## gives:
> > pats
> [1] "abc"  "abcd" "e"    "f"
>
> This gives you the four separate patterns that you could then use to group
> your records, perhaps by:
>
> > lapply(pats,function(x)grep(paste0("^", x,"([_. ]|$)"), z))
> [[1]]
> [1] 1 2 3 4
>
> [[2]]
> [1] 5 6
>
> [[3]]
> [1] 7
>
> [[4]]
> [1] 8
>
> That is, indices 1-4 in z are the first group; 5 and 6 are the second; etc.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Fri, May 4, 2018 at 9:00 PM, Jeff Reichman wrote:
>> Bert
>>
>> Thank you for the link. Figured there might be something
>>
>> Regarding your questions
>>
>> This is from a large 53 Billion records. The column in question are
>> AdNames (Real Time Bidding data)
>>
>> #1. Generally yes, but not always
>>
>> #2. Separators could be underscores (_) or dots (.) as in 1.2.3_ABC ..
>>
>> #3. Yes. So "Abc 123" could be a matching string
>>
>> This would not be considered a match ...
>> abc_something
>> this.is_a long stringwithabcinthemiddle
>>
>> The sequence(s) are always at the beginning (or so it appears). Out
>> of the 54 billion records I am able to pull (SparkR sql) 948,679 unique
>> strings. It is from these unique strings that I (if possible) want to
>> identify the "key" strings.
>>
>> 1. Abc_1232.niok7j9hd
>> 2. Abc
>> 3. Abc.2#348hfk2.njilo
>> 4. Abc.2
>> 5. Abc.7
>> 6. BAdfr_kajdhf98#kjsdh
>> 7. BAdrf_gofer
>> 948679
>>
>> So I may have a thousand individual strings all of which have Abc as a
>> common string, or Badrf. So I am looking to pull "Abc," "BAdrf", etc. So
>> then I can go back and restructure the data to show that any record with
>> Abc_1232.niok7j9hd is part of the Abc "Group," or Family ???
>>
>> Does that help
>>
>> Jeff
>>
>> -Original Message-
>> From: Bert Gunter
>> Sent: Friday, May 4, 2018 5:41 PM
>> To: reichm...@sbcglobal.net
>> Cc: R-help
>> Subject: Re: [R] Discovering patterns in textual strings
>>
>> The answer is, of course, using regular expressions and/or libraries
>> therefor. However, I do not think you have defined your problem
>> sufficiently. Some questions I have:
>>
>> 1. Do possible patterns to be matched always appear at the beginning of
>> your strings?
>>
>> 2. Always together between specified separators ("_" in your example);
>> or one of several specified separators; or otherwise?
>>
>> 3. Do spaces or other nonprinting characters occur in your strings?
>>
>> e.g. would
>>
>> abc_something
>> this.is_a long stringwithabcinthemiddle
>>
>> be considered
Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?
On 05/05/2018 11:33 AM, 孙业平 wrote:
> From: Duncan Murdoch
> Send Time: 2018 May 4 (Fri) 17:24
> To: 孙业平; David Winsemius
> Cc: R Help Mailing List
> Subject: Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?
>
>> On 04/05/2018 3:04 AM, sunyeping via R-help wrote:
>>> From: David Winsemius
>>> Send Time: 2018 May 4 (Fri) 13:25
>>> To: 孙业平
>>> Cc: R Help Mailing List
>>> Subject: Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?
>>>
>>>> On May 3, 2018, at 6:28 PM, sunyeping via R-help wrote:
>>>>
>>>>> When I check the size of the plot region using dev.size("in"), a new plot region is produced and in the R console I get [1] 5.33 5.322917
>>>>
>>>> Your text is all mangled together. You failed in your duty to read the list info and the Posting guide. NO HTML!
>>>>
>>>>> If I mean to produce a plot region with size setting by dev.new(length=3,width=3), a plot region is produced, but the size is [2.281250, 5.322917], as detected by the dev.size function. If I type dev.new(length=10,width=10) I get a plot region with the size of [7.614583, 5.322917]. It seems that the width of the new plot region cannot be set, and it is always 5.322917. The length of the new plot region can be set, but it is always smaller than the values I set. What do I miss? What is the correct way of setting the dimension of the new plot region? I will be grateful for any help. Best regards,
>>>>
>>>> The size of the device is not the size of the plot region. You need to take into account the margins. See ?par
>>>>
>>>> David Winsemius
>>>> Alameda, CA, USA
>>>>
>>>> 'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law
>>>
>>> Thank you, David. I have read the par() document. Clearly the size of the plot region is smaller than or equal to the device size. However, if I produce a graphic device with dev.new (length, width) or other functions, I find the largest width of the new device is always 5.3 inches whatever the values I set, and the length of it is always smaller than what I set.
>>
>> The length and width aren't the first and second parameters for any device, and length isn't a parameter at all. Try
>>
>> dev.new(height = 10, width = 10)
>>
>> and you should get a bigger device if it will fit on your screen. If it won't fit, then you might get a smaller one, and you'll need to choose a non-screen device such as png() or pdf() instead of the default device.
>>
>> Duncan Murdoch
>
> Could you tell me how to produce a graphic device with the correct size that I set? I need this function because the graphic device cannot accommodate all of the graph I make with some plot tools such as ggtree. In a ggtree plot, part of the tree tip labels are invisible (https://www.dropbox.com/s/87gyusx7ay1xxu8/tree.pdf?dl=0) even if I set "par(mar=rep(0,4))". So I think I must plot the tree on a larger graphic device. Best regards.
>
> "dev.new(height = 10, width = 10)" doesn't work either. It produces a device with a size of [5.760417, 5.75]. My computer is an ordinary 14 inch ThinkPad laptop. Is about 5 inches really the upper limit of the size of the R graphics device on a computer screen? I doubt it.

You ask questions in a very rude way. I'm going to let you figure this one out by yourself.

Duncan Murdoch
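The suggestion earlier in this thread to use a non-screen device can be sketched as follows. A file device such as pdf() is not limited by the screen, so dev.size("in") should report the size that was requested. The 10 x 10 inch size, the temporary file name, and the placeholder plot are assumptions for illustration; a ggtree plot would be printed in place of plot(1:10).

```r
# Open a PDF device of an explicit size (inches); no screen is involved.
out <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out, width = 10, height = 10)

size_in <- dev.size("in")           # device size as R sees it
print(size_in)

plot(1:10)                          # placeholder for the real figure
invisible(dev.off())                # close the device so the file is written
```

Note the argument names: width and height, as pointed out above; length is not an argument of these devices.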
Re: [R] [Rd] source(echo = TRUE) with a iso-8859-1 encoded file gives an error
On Fri, May 04, 2018 at 10:58:26PM +, Ista Zahn wrote:
> On Fri, May 4, 2018 at 4:47 PM, Scott Kostyshak wrote:
> > I have very little knowledge about file encodings and would like to
> > learn more.
> >
> > I've read the following pages to learn more:
> >
> > http://stat.ethz.ch/R-manual/R-devel/library/base/html/Encoding.html
> >
> > https://stackoverflow.com/questions/4806823/how-to-detect-the-right-encoding-for-read-csv
> >
> > https://developer.r-project.org/Encodings_and_R.html
> >
> > The last one, in particular, has been very helpful. I would be
> > interested in any further references that you suggest.
> >
> > I attach a file that reproduces the issue I would like to learn more
> > about. I do not know if the file encoding will be correctly preserved
> > through email, so I also provide the file (temporarily) on Dropbox here:
> >
> > https://www.dropbox.com/s/3lbgebk7b5uaia7/encoding_export_issue.R?dl=0
> >
> > The file gives an error when using "source()" with the
> > argument echo = TRUE:
> >
> > > source("encoding_export_issue.R", echo = TRUE)
> > Error in nchar(dep, "c") : invalid multibyte string, element 1
> > In addition: Warning message:
> > In grepl("^[[:blank:]]*$", dep[1L]) :
> > input string 1 is invalid in this locale
> >
> > The problem comes from the "á" character in the .R file. The file
> > appears to be encoded as "iso-8859-1":
> >
> > $ file --mime-encoding encoding_export_issue.R
> > encoding_export_issue.R: iso-8859-1
> >
> > Note that for me:
> >
> > > getOption("encoding")
> > [1] "native.enc"
> >
> > so "native.enc" is used for the "encoding" argument of source().
> >
> > The following two calls succeed:
> >
> > > source("encoding_export_issue.R", echo = TRUE, encoding = "unknown")
> > > source("encoding_export_issue.R", echo = TRUE, encoding = "iso-8859-1")
> >
> > Is this file a valid "iso-8859-1" encoded file?
>
> The one you attached is not. The one linked to in dropbox is.
>
> > Why does source() fail
> > in the case of encoding set to "native.enc"? Is it because of the
> > settings to UTF-8 in my locale (see info on my system at the bottom of
> > this email).
>
> Yes.
>
> > I'm guessing it would be a bad idea to put
> >
> > options(encoding = "unknown")
> >
> > in my .Rprofile, because it is difficult to always correctly guess the
> > encoding of files?
>
> My guess is that the issue is less about the difficulty of guessing
> the encoding, and more about the time it takes to do so. That's not
> particularly relevant for the "source" function, but the encoding
> option is used by many of the file IO functions in R and so has
> implications well beyond the behavior of "source".

Ah I did not think about this possibility. Makes sense.

> > Is there a reason why setting it to "unknown" would
> > lead to more problems than leaving it set to "native.enc"?
>
> It depends on what you are actually doing. If you are on a UTF-8
> locale and working exclusively with UTF-8 files, setting
> options(encoding = "unknown") will just slow down your file IO by
> checking for the encoding every time.

Good to know.

Thank you for your response, Ista.

Scott

--
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

> > I've reproduced the above behavior on R-devel (r74677) and 3.4.3. Below
> > is my session info and locale info for my system with the 3.4.3 version:
> >
> > > sessionInfo()
> > R version 3.4.3 (2017-11-30)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Ubuntu 16.04.3 LTS
> >
> > Matrix products: default
> > BLAS: /usr/lib/libblas/libblas.so.3.6.0
> > LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > [11]
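The "invalid multibyte string" error in this thread comes down to a single byte. In ISO-8859-1 the character "á" is the one byte 0xE1, which is not a valid UTF-8 sequence on its own, so a UTF-8 locale cannot interpret it until the encoding is declared. The sketch below shows this on a one-character string rather than on the original file (which is not reproduced here); the variable names are made up for illustration.

```r
# "á" in ISO-8859-1 (latin1) is the single byte 0xE1.
latin1_bytes <- as.raw(0xe1)
x <- rawToChar(latin1_bytes)

# On its own, that byte is not valid UTF-8.
print(validUTF8(x))

# Declaring the source encoding lets R convert it to UTF-8 (two bytes).
y <- iconv(x, from = "latin1", to = "UTF-8")
print(charToRaw(y))
print(validUTF8(y))
```

Passing encoding = "iso-8859-1" to source(), as in the calls quoted above, plays the same role for a whole file: it tells R how to read the bytes instead of leaving it to assume the locale's encoding.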
Re: [R] adding overall constraint in optim()
Here is what you do for your problem:

require(BB)
Mo.vect <- as.vector(tail(head(mo,i),1))
wgt.vect <- as.vector(tail(head(moWeightsMax,i),1))
cov.mat <- cov(tail(head(morets,i+12),12))
opt.fun <- function(wgt.vect) -sum(Mo.vect %*% wgt.vect) / (t(wgt.vect) %*% (cov.mat %*% wgt.vect))
LowerBounds<-c(0.2,0.05,0.1,0,0,0)
UpperBounds<-c(0.6,0.3,0.6,0.15,0.1,0.2)
spgSolution <- spg(wgt.vect, fn=opt.fun, lower=LowerBounds, upper=UpperBounds, project="projectLinear", projectArgs=list(A=matrix(1, 1, length(wgt.vect)), b=1, meq=1))

Ravi

From: Ravi Varadhan
Sent: Saturday, May 5, 2018 12:31 PM
To: m.ash...@enduringinvestments.com; r-help@r-project.org
Subject: adding overall constraint in optim()

Hi,
You can use the projectLinear argument in BB::spg to optimize with linear equality/inequality constraints. Here is how you implement the constraint that all parameters sum to 1.

require(BB)
spg(par=p0, fn=myFn, project="projectLinear", projectArgs=list(A=matrix(1, 1, length(p0)), b=1, meq=1))

Hope this is helpful,
Ravi
[R] adding overall constraint in optim()
Hi,
You can use the projectLinear argument in BB::spg to optimize with linear equality/inequality constraints. Here is how you implement the constraint that all parameters sum to 1.

require(BB)
spg(par=p0, fn=myFn, project="projectLinear", projectArgs=list(A=matrix(1, 1, length(p0)), b=1, meq=1))

Hope this is helpful,
Ravi
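For readers without BB installed, a different technique can enforce the sum-to-1 constraint with base optim(): optimize over unconstrained values and map them to weights with a softmax. This is not the projection method described above, it forces every weight to be positive, and it does not handle box constraints such as LowerBounds/UpperBounds. The objective, the target vector, and all names below are hypothetical and exist only to make the sketch runnable.

```r
# Map any real vector to positive weights that sum to 1.
softmax <- function(theta) {
  e <- exp(theta - max(theta))   # subtract max for numerical stability
  e / sum(e)
}

# Hypothetical objective: squared distance to a made-up target weight vector.
target <- c(0.5, 0.3, 0.2)
obj <- function(w) sum((w - target)^2)

# optim() searches over theta; the constraint holds by construction.
fit <- optim(par = c(0, 0, 0),
             fn = function(theta) obj(softmax(theta)),
             method = "BFGS")

w_hat <- softmax(fit$par)
print(round(w_hat, 3))
print(sum(w_hat))
```

The trade-off is that the result is reported through the transformation, so standard errors or bounds on the weights themselves need extra care; the projection approach in BB::spg works on the weights directly.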
Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?
--From: Duncan Murdoch  Send Time: 2018 May 4 (Fri) 17:24  To: 孙业平; David Winsemius  Cc: R Help Mailing List  Subject: Re: [R] why the length and width of a plot region produced by the dev.new() function cannot be correctly set?

On 04/05/2018 3:04 AM, sunyeping via R-help wrote:
>> [...]
>> Thank you, David. I have read the par() document. Clearly the size of the plot
>> region is smaller than or equal to the device size. However, if I produce a
>> graphic device with dev.new (length, width) or other functions, I find the
>> largest width of the new device is always 5.3 inches whatever the values I
>> set, and the length of it is always smaller than what I set.
>
> The length and width aren't the first and second parameters for any device, and length isn't a parameter at all. Try
>
> dev.new(height = 10, width = 10)
>
> and you should get a bigger device if it will fit on your screen. If it won't fit, then you might get a smaller one, and you'll need to choose a non-screen device such as png() or pdf() instead of the default device.
>
> Duncan Murdoch

Could you tell me how to produce a graphic device with the correct size that I set? I need this function because the graphic device cannot accommodate all of the graph I make with some plot tools such as ggtree. In a ggtree plot, part of the tree tip labels are invisible (https://www.dropbox.com/s/87gyusx7ay1xxu8/tree.pdf?dl=0) even if I set "par(mar=rep(0,4))". So I think I must plot the tree on a larger graphic device. Best regards.

"dev.new(height = 10, width = 10)" doesn't work either. It produces a device with a size of [5.760417, 5.75]. My computer is an ordinary 14 inch ThinkPad laptop. Is about 5 inches really the upper limit of the size of the R graphics device on a computer screen? I doubt it.
[R] error in chol.default((value + t(value))/2) : , the leading minor of order 1 is not positive definite
Dear friends - I'm having trouble with nlme fitting a simplified model, shown below, which elicits the error

Error in chol.default((value + t(value))/2) : the leading minor of order 1 is not positive definite

I have seen the threads on this error, but they didn't help me solve the problem. The model runs well in brms and identifies the parameters used, even with fixed effects for TRT. Here in nlme TRT is ignored, and I guess this is not the reason for the said error.

Below is the quite clumsy simulated data set and the specification of the call to nlme. The start values are taken from fitted values in brms.

library(ggplot2)
windows(record=TRUE)

# Generate 3*10 rats: add fixed effects to the four parameters according to
# the three groups, add random effects per rat, add residual random effect.
# Parameter values taken from Sapirstein AJP 181:330-6, 1955
set.seed(1234)
Time <- seq(1,60,by=1)
A <- 275; B <- 140; g1 <- 0.1105; g2 <- .0161
N <- 30
AA <- rep(A,30)+rnorm(30,0,30)
BB <- rep(B,30)+rnorm(30,0,15)
gg1 <- rep(g1,30)+rnorm(30,0,0.01)
gg2 <- rep(g2,30)+rnorm(30,0,0.001)
TRT <- gl(3,10*60)
levels(TRT) <- c("CTRL","DIAB","HYPER")
AA1 <- AA + c(rep(0,10),rep(10,10),rep(-10,10))
BB1 <- BB + c(rep(0,10),rep(5,10),rep(-5,10))
Gg1 <- gg1 + c(rep(0,10),rep(0.01,10),rep(-0.01,10))
Gg2 <- gg2 + c(rep(0,10),rep(0.005,10),rep(-0.005,10))

getY <- function(A,B,g1,g2) {
  Y <- A*exp(-g1*Time) + B*exp(-g2*Time)
  Y <- Y + rnorm(60,0,20)
}

YY <- c()
for (i in 1:N) YY <- c(YY,getY(AA1[i],BB1[i],Gg1[i],Gg2[i]))
TT <- rep(Time,N)
RAT <- gl(N,length(Time))
dats <- data.frame(RAT,TRT,TT,YY)
Dats <- dats
names(Dats)[c(3,4)] <- c("Time","Y")
dput(Dats,"dats0505.dat")

with(Dats,plot(Time,Y,pch=19,cex=.1,col=TRT))
ggplot(data=Dats,aes(x=Time,y=Y,group=RAT,col=TRT)) + geom_line()

library(nlme)
gfr.nlme <- nlme(Y ~ A*exp(-Time*g1)+B*exp(-Time*g2),
                 data = Dats,
                 fixed = A+g1+B+g2 ~1,
                 random = A+g1+B+g2 ~1,
                 groups = ~ RAT,
                 start = c(255,115,130*1e-3,17*1e-3),
                 na.action = na.omit,
                 verbose=TRUE,
                 control = list(msVerbose = TRUE))
summary(gfr.nlme)
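One point that may be worth checking in the nlme call above: a plain numeric `start` vector is matched to the fixed effects by position, in the order they appear in the `fixed` formula (here A, g1, B, g2). The values 255, 115, 0.130, 0.017 look as if they were meant for A, B, g1, g2; that reading is an assumption based on the simulated values (275, 140, 0.1105, 0.0161), not something the post states. The sketch below only shows how to put named start values into the formula's order. It does not run the fit, and it is not verified that this removes the chol.default error.

```r
# Order of the fixed-effect parameters as written in the nlme call
fixed_order <- all.vars(A + g1 + B + g2 ~ 1)

# Start values labelled by the parameter they are ASSUMED to belong to
# (the labelling is a guess from the magnitudes, not from the post)
start_named <- c(A = 255, B = 115, g1 = 130 * 1e-3, g2 = 17 * 1e-3)

# Reorder so that position i matches the i-th parameter in `fixed`
start_ordered <- start_named[fixed_order]
print(start_ordered)

# A candidate value for the `start` argument would then be:
start_for_nlme <- unname(start_ordered)
```

With the positional order used in the post, g1 would start at 115 and B at 0.13, which is far from the scale of the simulated data.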
Re: [ESS] ess-insert-function-outline
On Fri, 04-May-2018 at 10:23AM +0200, Lionel Henry wrote:

|>
|>
|> > On 4 May 2018, at 10:05, Patrick Connolly wrote:
|> >
|> > That's the same as what's in my lisp/old directory. What am I to
|> > learn from that?
|>
|> You can copy-paste its contents into your emacs configuration file.
|> .

Many thanks, Lionel. It works fine now.

And apologies for my ignorance.

--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___    Patrick Connolly
 {~._.~}                   Great minds discuss ideas
 _( Y )_                 Average minds discuss events
(:_~*~_:)                  Small minds discuss people
 (_)-(_)                              . Eleanor Roosevelt

~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help
Re: [R] Discovering patterns in textual strings
"Does that help?"

No. I am not your private consultant. You need to reply to the list, which I have cc'ed here, not just me.

I am still somewhat confused by your specifications, but others may not be. Part of my confusion stems from your failure to provide a reproducible example (see e.g. the posting guide linked below). For example, I cannot tell from your text whether the Abc and Bce strings contain one or more spaces at the end. I shall assume they may but need not.

Anyway, here is a reproducible example and solution that assumes that the substrings/patterns of interest to you occur at the beginning of the strings and may or may not be followed by one of "." "_" or " " (space) and then possibly further text which should be ignored. Assuming that you are familiar with regular expressions, maybe this will help to get you started even if I have misunderstood your specifications. If you aren't familiar with regexes, the stringr package may provide a gentler interface than using R's raw regex functionality. Or maybe someone else can suggest a better approach (which is another reason why you should reply to the list, not just me).

z <- c("abc", "abc_def", "abc.def", "abc def", "abcd_ef", "abcd", "e","f")
pats <- unique(sub("^(.+)[. _]+.*", "\\1", z))

## gives:
> pats
[1] "abc"  "abcd" "e"    "f"

This gives you the four separate patterns that you could then use to group your records, perhaps by:

> lapply(pats,function(x)grep(paste0("^", x,"([_. ]|$)"), z))
[[1]]
[1] 1 2 3 4

[[2]]
[1] 5 6

[[3]]
[1] 7

[[4]]
[1] 8

That is, indices 1-4 in z are the first group; 5 and 6 are the second; etc.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, May 4, 2018 at 9:00 PM, Jeff Reichman wrote:

> Bert
>
> Thank you for the link. Figured there might be something
>
> Regarding your questions
>
> This is from a large data set of 53 billion records. The column in question is
> AdNames (Real Time Bidding data)
>
> #1. Generally yes, but not always
>
> #2 Separators could be underscores (_) or dots (.) as in 1.2.3_ABC ..
>
> #3 Yes. So Abc 123 could be a matching string
>
> This would not be considered a match ...
> abc_something
> this.is_a long stringwithabcinthemiddle
>
> The sequence(s) are always at the beginning (or so it appears). Out
> of the 54 billion records I am able to pull (SparkR sql) 948,679 unique
> strings. It is from these unique strings that I (if possible) want to
> identify the "key" strings.
>
> 1. Abc_1232.niok7j9hd
> 2. Abc
> 3. Abc.2#348hfk2.njilo
> 4. Abc.2
> 5. Abc.7
> 6. BAdfr_kajdhf98#kjsdh
> 7. BAdrf_gofer
> 948679
>
> So I may have a thousand individual strings, all of which have Abc as a
> common string, or Badrf. So I am looking to pull "Abc," "BAdrf", etc. so
> that I can go back and restructure the data to show that any record with
> Abc_1232.niok7j9hd is part of the Abc "Group," or Family ???
>
> Does that help
>
> Jeff
>
> -----Original Message-----
> From: Bert Gunter
> Sent: Friday, May 4, 2018 5:41 PM
> To: reichm...@sbcglobal.net
> Cc: R-help
> Subject: Re: [R] Discovering patterns in textual strings
>
> The answer is, of course, using regular expressions and/or libraries
> therefor. However, I do not think you have defined your problem
> sufficiently. Some questions I have:
>
> 1. Do possible patterns to be matched always appear at the beginning of
> your strings?
>
> 2. Always together between specified separators ("_" in your example); or
> one of several specified separators; or otherwise?
>
> 3. Do spaces or other nonprinting characters occur in your strings?
>
> e.g. would
>
> abc_something
> this.is_a long stringwithabcinthemiddle
>
> be considered matching?
> There are undoubtedly other possibilities that I've missed.
>
> You may also find it useful to check this "task view" out for
> possibilities:
> https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Fri, May 4, 2018 at 3:25 PM, Jeff Reichman wrote:
> > R Help Forum
> >
> > Is there an R library (or a way) that I can extract unique character
> > strings, or repeating patterns in textual strings. Say for example I
> > have the following records:
> >
> > Abc_1234_kjhksh_276
> > Abc
> > Abc_1234_lakdofyo_324
> > Bce_876_skdhk_*&^%*&
> > Bce
> > Bce_454
> >
> > And I would like to see the following results
> >
> > Abc
> > Abc_1234
> > Bce
> >
> > Jeff Reichman
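A note on the pattern in the reply above: because `.+` is greedy, `"^(.+)[. _]+.*"` keeps everything up to the last separator, so a record with several separators such as Abc_1234_kjhksh_276 would yield Abc_1234_kjhksh and not Abc. The sketch below is one possible variant, not the method from the thread: it assumes the "key" is simply the text before the first ".", "_" or space, and uses the sample records from the original question. The names `recs`, `first_token`, `keys` and `groups` are illustrative. It does not produce the second-level key Abc_1234 that the original poster also listed.

```r
recs <- c("Abc_1234_kjhksh_276", "Abc", "Abc_1234_lakdofyo_324",
          "Bce_876_skdhk_*&^%*&", "Bce", "Bce_454")

# Pattern from the reply: greedy, so the capture ends at the LAST separator
greedy <- sub("^(.+)[. _]+.*", "\\1", recs)

# Variant (assumption: the key is the text before the FIRST separator)
first_token <- sub("^([^. _]+).*", "\\1", recs)
keys <- unique(first_token)

# Indices of the records belonging to each key
groups <- split(seq_along(recs), factor(first_token, levels = keys))
print(keys)
print(groups)
```

Grouping with `split()` on the extracted key avoids pasting the keys back into a second regular expression, which matters if a key contains regex metacharacters.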