Re: [R] Removing a dollar sign from a character vector

2016-02-11 Thread Jeff Newmiller
The "end of string" special meaning only applies when the dollar sign is at the 
right end of the pattern string (as it was in the OP's attempt). That is, it is 
NOT generally necessary to wrap it in brackets to remove the special meaning 
unless it would otherwise be at the end of the pattern string. 
-- 
Sent from my phone. Please excuse my brevity.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Removing a dollar sign from a character vector

2016-02-11 Thread William Dunlap via R-help
I should have said that R-3.2.3 requires the $ to be backslashed even when
it is not at the end of the pattern:

  > gsub("$[[:digit:]]*", "", c("$VAR", "$20/oz."))
  [1] "$VAR"    "$20/oz."
  > gsub("\\$[[:digit:]]*", "", c("$VAR", "$20/oz."))
  [1] "VAR"  "/oz."

Tools like sed on modern Linuxen do not seem to have this requirement:
  % echo '$VAR' '$20/oz.' | sed -e 's/$[0-9]*//g'
  VAR /oz.
  % echo '$VAR' '$20/oz.' | sed -e 's/\$[0-9]*//g'
  VAR /oz.
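
Back on the R side, a small sketch of the same comparison, with the bracket form added next to the backslashed one. The variable name `x` is only for illustration, and the behaviour shown is that of R's default regular expression engine.

```r
x <- c("$VAR", "$20/oz.")

# Unescaped: "$" is still an end-of-string anchor here,
# so the literal dollar sign is never matched and x comes back unchanged.
identical(gsub("$[[:digit:]]*", "", x), x)
# [1] TRUE

# Backslashed or bracketed: the dollar sign is treated as a literal character.
gsub("\\$[[:digit:]]*", "", x)
# [1] "VAR"  "/oz."
gsub("[$][[:digit:]]*", "", x)
# [1] "VAR"  "/oz."
```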




Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Removing a dollar sign from a character vector

2016-02-11 Thread William Dunlap via R-help
In certain programs (not current R), a pattern with stuff after a naked
dollar sign would not match anything because dollar meant end-of-string.

In any case I prefer simple rules like 'backslash a dollar sign' instead of
'backslash a dollar sign at the end of the pattern but not elsewhere'.


Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Removing a dollar sign from a character vector

2016-02-10 Thread William Dunlap via R-help
   > y
   [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
   > gsub("$", "", y)
   [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?

"$" as a regular expression means "end of string", which has zero length;
replacing "end of string" with nothing does not affect the string.  Try
gsub("$", "DOLLAR", "$100") to see it do something.
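
A quick way to see the zero-length match, as a sketch in base R only (the variable name `x` is mine, not from the thread): the replacement text is inserted at the end of the string, and the literal dollar sign at the front is left alone.

```r
x <- "$100"

# "$" matches the empty position at the end of the string,
# so the replacement text is inserted there.
gsub("$", "DOLLAR", x)
# [1] "$100DOLLAR"

# Replacing that empty position with "" therefore changes nothing.
identical(gsub("$", "", x), x)
# [1] TRUE
```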

Use either fixed=TRUE so the 'pattern'  argument is not regarded as a
regular expression or pattern="\\$" or pattern="[$]" to remove dollar's special
meaning in the pattern language.
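
The three options side by side, as a minimal sketch on the toy vector from the original post. The names `a`, `b` and `d` are only for illustration.

```r
y <- c("$1,000.00 ", "$1,000.00 ", "$1,000.00 ", "$2,600.00 ", "$2,600.00 ")

a <- gsub("$", "", y, fixed = TRUE)  # pattern taken literally, not as a regex
b <- gsub("\\$", "", y)              # backslash-escaped dollar sign
d <- gsub("[$]", "", y)              # dollar sign inside a bracket expression

a
# [1] "1,000.00 " "1,000.00 " "1,000.00 " "2,600.00 " "2,600.00 "

# All three give the same result.
identical(a, b) && identical(b, d)
# [1] TRUE
```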

Read up on regular expressions (probably there is a See Also entry in
help(gsub)).
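
The same reading of the pattern as a regular expression accounts for the ".00" attempt in the quoted post: an unescaped dot matches any single character, so ",00" and "600" are removed along with ".00". A sketch of the difference, using two of the values from the thread:

```r
y <- c("$1,000.00 ", "$2,600.00 ")

gsub(".00", "", y)                # "." matches any single character
# [1] "$10 " "$2, "

gsub("\\.00", "", y)              # escaped dot matches only a literal "."
# [1] "$1,000 " "$2,600 "

gsub(".00", "", y, fixed = TRUE)  # or treat the whole pattern literally
# [1] "$1,000 " "$2,600 "
```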


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Feb 10, 2016 at 9:39 PM, James Plante  wrote:

> What I’ve got:
> # sessionInfo()
> R version 3.2.3 (2015-12-10)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.11.3 (El Capitan)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] XML_3.98-1.3 dplyr_0.4.3
>
> loaded via a namespace (and not attached):
> [1] magrittr_1.5      R6_2.1.2          assertthat_0.1    rsconnect_0.4.1.4
> [5] parallel_3.2.3    DBI_0.3.1         tools_3.2.3       Rcpp_0.12.3
>
> > str(y) #toy vector, subset of larger vector in a dataframe of ~4,600 rows.
>  chr [1:5] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
>
> y is a subset of a column in a dataframe that’s too big to post. I tried
> the commands listed here on the dataframe and it didn’t work. So I’m using
> a small subset to find out where my error is. It’s being a PITA, and I’m
> trying to solve it. What I want is a vector of numbers: 1000, 1000, 1000,
> 2600, 2600.
>
> What I’ve tried:
> > y
> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 "
> > gsub("$", "", y)
> [1] "$1,000.00 " "$1,000.00 " "$1,000.00 " "$2,600.00 " "$2,600.00 " # no change. Why?
> > gsub(".00", "", y)  # note: that’s dot zero zero, replace with ""
> [1] "$10 " "$10 " "$10 " "$2, " "$2, "  #WTF?
>
> I’ve also tried sapply and apply, but haven’t yet tried a loop. (These
> were done in desperation; gsub ought to work the way the help says.) I’ve
> tried lots more than is listed here, over and over, with no results. I’d be
> grateful for any guidance you can provide.
>
> Thanks in advance,
>
> Jim Plante

Re: [R] Removing a dollar sign from a character vector

2016-02-10 Thread Jeff Newmiller
y <- as.numeric( gsub( "[$, ]", "", y ) )
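
Spelled out on the toy vector from the original post, as a sketch: the bracket expression `[$, ]` removes dollar signs, commas and spaces in one pass, and `as.numeric()` then parses what is left. The names `cleaned` and `amounts` are mine.

```r
y <- c("$1,000.00 ", "$1,000.00 ", "$1,000.00 ", "$2,600.00 ", "$2,600.00 ")

# Inside [...] the "$" is an ordinary character, so no escaping is needed.
cleaned <- gsub("[$, ]", "", y)
cleaned
# [1] "1000.00" "1000.00" "1000.00" "2600.00" "2600.00"

amounts <- as.numeric(cleaned)
amounts
# [1] 1000 1000 1000 2600 2600
```

For a data frame the same call can be applied to the column directly, e.g. `df$amount <- as.numeric(gsub("[$, ]", "", df$amount))`, where `df` and `amount` are hypothetical names.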
-- 
Sent from my phone. Please excuse my brevity.
