Re: [R] Removing a dollar sign from a character vector
The "end of string" special meaning only applies when the dollar sign is at the right end of the string (as it was in the OP attempt). That is, it is NOT generally necessary to wrap it in brackets to remove the special meaning unless it would otherwise be at the end of the pattern string.
--
Sent from my phone. Please excuse my brevity.

On February 10, 2016 10:10:40 PM PST, William Dunlap via R-help wrote:
> Use either fixed=TRUE so the 'pattern' argument is not regarded as a
> regular expression or pattern="\\$" or pattern="[$]" to remove dollar's
> special meaning in the pattern language.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
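One way to look at the "end of string" meaning discussed here is to ask R where a bare "$" matches and how long the match is. The following is only a minimal sketch; the variable names x, m, match_len and unchanged are illustrative and do not come from the thread.

```r
# Locate the match of a bare "$" pattern in the string "$100".
x <- "$100"
m <- regexpr("$", x)

# The anchor matches a position rather than a character,
# so the reported match length is zero.
match_len <- attr(m, "match.length")
print(match_len)

# Replacing a zero-length match with "" leaves the string as it was.
unchanged <- identical(sub("$", "", x), x)
print(unchanged)
```

A zero-length match is still a match, which is why substituting a non-empty replacement has a visible effect while substituting "" does not.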
Re: [R] Removing a dollar sign from a character vector
I should have said that R-3.2.3 requires the $ to be backslashed even when it is not at the end of the pattern:

> gsub("$[[:digit:]]*", "", c("$VAR", "$20/oz."))
[1] "$VAR"    "$20/oz."
> gsub("\\$[[:digit:]]*", "", c("$VAR", "$20/oz."))
[1] "VAR"  "/oz."

Modern Linuxen's tools like sed do not seem to have this requirement.

% echo '$VAR' '$20/oz.' | sed -e 's/$[0-9]*//g'
VAR /oz.
% echo '$VAR' '$20/oz.' | sed -e 's/\$[0-9]*//g'
VAR /oz.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Feb 11, 2016 at 9:30 AM, William Dunlap wrote:
> In certain programs (not current R), a pattern with stuff after a naked
> dollar sign would not match anything because dollar meant end-of-string.
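The point that R wants the dollar sign escaped wherever it appears in the pattern can be checked with grepl(). This is a small sketch under the assumption that the default and perl = TRUE engines are the ones of interest; the test string and variable names are made up for illustration.

```r
x <- "$100"

# Unescaped "$" followed by a digit: the "$" is read as an end-of-string
# anchor, nothing can follow the end of the string, so there is no match.
plain_default <- grepl("$1", x)               # default regex engine
plain_perl    <- grepl("$1", x, perl = TRUE)  # Perl-compatible engine

# Escaped, bracketed, or fixed forms match the dollar character itself.
escaped <- grepl("\\$1", x)
bracket <- grepl("[$]1", x)
literal <- grepl("$1", x, fixed = TRUE)

print(c(plain_default = plain_default, plain_perl = plain_perl,
        escaped = escaped, bracket = bracket, literal = literal))
```

Using grepl() rather than gsub() makes the difference easier to see, because a zero-length match replaced by "" looks the same as no match at all.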
Re: [R] Removing a dollar sign from a character vector
In certain programs (not current R), a pattern with stuff after a naked dollar sign would not match anything because dollar meant end-of-string.

In any case I prefer simple rules like 'backslash a dollar sign' instead of 'backslash a dollar sign at the end of the pattern but not elsewhere'.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Feb 11, 2016 at 9:01 AM, Jeff Newmiller wrote:
> The "end of string" special meaning only applies when the dollar sign is
> at the right end of the string (as it was in the OP attempt).
Re: [R] Removing a dollar sign from a character vector
> y
[1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
> gsub("$", "", y)
[1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?

"$" as a regular expression means "end of string", which has zero length - replacing "end of string" with nothing does not affect the string. Try gsub("$", "DOLLAR", "$100") to see it do something.

Use either fixed=TRUE so the 'pattern' argument is not regarded as a regular expression or pattern="\\$" or pattern="[$]" to remove dollar's special meaning in the pattern language.

Read up on regular expressions (probably there is a See Also entry in help(gsub)).

Bill Dunlap
TIBCO Software
wdunlap tibco.com
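The three suggestions above (fixed=TRUE, "\\$", "[$]") can be compared side by side on the toy vector from the question. This is a sketch only; the result names with_fixed, with_escape, with_bracket and appended are illustrative.

```r
# Toy vector copied from the original question.
y <- c("$1,000.00 ", "$1,000.00 ", "$1,000.00 ", "$2,600.00 ", "$2,600.00 ")

# Three ways to make "$" mean a literal dollar sign.
with_fixed   <- gsub("$", "", y, fixed = TRUE)  # pattern is not a regex at all
with_escape  <- gsub("\\$", "", y)              # backslash-escaped in the regex
with_bracket <- gsub("[$]", "", y)              # inside a character class
print(with_fixed)

# The unescaped anchor does match, but only the zero-length end of string,
# so a non-empty replacement is appended rather than substituted.
appended <- gsub("$", "DOLLAR", "$100")
print(appended)
```

All three literal forms should give the same strings here; which one to prefer is a matter of taste, with fixed = TRUE being the simplest when no other regex features are needed.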
Re: [R] Removing a dollar sign from a character vector
y <- as.numeric( gsub( "[$, ]", "", y ) )
--
Sent from my phone. Please excuse my brevity.

On February 10, 2016 9:39:16 PM PST, James Plante wrote:
> What I’ve got:
> # sessionInfo()
> R version 3.2.3 (2015-12-10)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.11.3 (El Capitan)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] XML_3.98-1.3 dplyr_0.4.3
>
> loaded via a namespace (and not attached):
> [1] magrittr_1.5      R6_2.1.2          assertthat_0.1    rsconnect_0.4.1.4
> [5] parallel_3.2.3    DBI_0.3.1         tools_3.2.3       Rcpp_0.12.3
>
> > str(y) # toy vector, subset of larger vector in a dataframe of ~4,600 rows.
>  chr [1:5] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>
> y is a subset of a column in a dataframe that’s too big to post. I tried
> the commands listed here on the dataframe and it didn’t work. So I’m using
> a small subset to find out where my error is. It’s being a PITA, and I’m
> trying to solve it. What I want is a vector of numbers: 1000, 1000, 1000,
> 2600, 2600.
>
> What I’ve tried:
> > y
> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
> > gsub("$", "", y)
> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?
> > gsub(".00", "", y) # note: that’s dot zero zero, replace with ""
> [1] "$10 " "$10 " "$10 " "$2, " "$2, " # WTF?
>
> I’ve also tried sapply and apply, but haven’t yet tried a loop. (These
> were done in desperation; gsub ought to work the way the help says.) I’ve
> tried lots more than is listed here, over and over, with no results. I’d be
> grateful for any guidance you can provide.
>
> Thanks in advance,
>
> Jim Plante
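The surprising result of gsub(".00", "", y) in the question has a similar cause: in a regular expression an unescaped "." matches any single character, so ",00" and "600" are removed as well as ".00". The sketch below contrasts that with an escaped dot and then applies the one-line conversion given at the top of this message; dot_any, dot_literal and amounts are illustrative names.

```r
# Toy vector copied from the original question.
y <- c("$1,000.00 ", "$1,000.00 ", "$1,000.00 ", "$2,600.00 ", "$2,600.00 ")

# Unescaped dot: any character followed by "00" is removed.
dot_any <- gsub(".00", "", y)
print(dot_any)

# Escaped dot: only a literal ".00" is removed.
dot_literal <- gsub("\\.00", "", y)
print(dot_literal)

# Strip dollar signs, commas and spaces in one character class,
# then convert what is left to numeric.
amounts <- as.numeric(gsub("[$, ]", "", y))
print(amounts)
```

Inside a character class such as "[$, ]" the dollar sign loses its special meaning, which is why the one-liner needs no backslashes.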