Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
That works like a charm!  Thanks so much Duncan.

On Fri, Apr 29, 2011 at 6:37 PM, Duncan Murdoch wrote:

> On 29/04/2011 9:34 PM, Miao wrote:
>
>> Thanks Duncan for clarifying this.  I'm pretty much a newbie with these
>> kinds of special characters.  In R's gsub(), what regular expression
>> should I use to handle all these situations?
>>
>
> I don't know.  This might work:
>
> gsub("[\x01-\x1f\x7f-\xff]", "", x)
>
> (i.e. the range from character 1 to character 31, and 127 to 255) but I
> don't know if our regular expression matcher will accept those characters.
>
> Duncan Murdoch
>
>
>>
>> On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote:
>>
>>On 29/04/2011 7:41 PM, Miao wrote:
>>
>>Hello,
>>
>>Can anyone help on gsub() in R?  I have a string like something
>>below, and
>>wanted to delete all the strings with leading backslash,
>>including "\xa0On",
>>"\023", "\xab", and many others.   How should I write a regular
>>expression
>>pattern in gsub()?  I don't care how many characters following
>>backslash.
>>
>>
>>
>>If those are R strings, none of them contain a backslash.  In R, a
>>backslash would always be printed as \\.
>>
>>\x is the introduction to a hexadecimal encoding for a character;
>>the next two characters show the hex digits.  So your first string
>>contains a single character \xa0, the third one contains \xab, and
>>so on.
>>
>>The \023 is an octal encoding for a single character.
>>
>>Duncan Murdoch
>>
>>
>>
>>txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab
>>and be patient :"
>>
>>Thanks in advance,
>>Miao
>>
>>[[alternative HTML version deleted]]
>>
>>__
>>R-help@r-project.org  mailing list
>>
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>>
>>
>> --
>> proceed everyday
>>
>
>


-- 
proceed everyday

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, Duncan Murdoch wrote:

> On 29/04/2011 7:41 PM, Miao wrote:
>
>> Can anyone help on gsub() in R?  I have a string like something below, and
>> wanted to delete all the strings with leading backslash, including
>> "\xa0On", "\023", "\xab", and many others.   How should I write a regular
>> expression pattern in gsub()?  I don't care how many characters following
>> backslash.
>
> If those are R strings, none of them contain a backslash.  In R, a backslash
> would always be printed as \\.
>
> \x is the introduction to a hexadecimal encoding for a character; the next
> two characters show the hex digits.  So your first string contains a single
> character \xa0, the third one contains \xab, and so on.
>
> The \023 is an octal encoding for a single character.



If we were dealing with a leading backslash, I guess this would do it:

gsub("^\\\\.*", "", txt)

R would display a double backslash, but I believe that represents a single
backslash.  So if the string were saved using write.table, say, only a
single backslash would be stored.

a <- "\\This is a string."
a
[1] "\\This is a string."

gsub("^\\\\", "", a)
[1] "This is a string."

a
[1] "\\This is a string."

gsub("^\\\\.*", "", a)
[1] ""

gsub("^\\\\.*", "", c(a,"Another string","\\more"))
[1] ""               "Another string" ""

write.table(a, file="a.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)

$ cat a.txt
\This is a string.

Apparently this is not what the OP really wanted.  The OP probably wanted 
to remove characters that were not from the regular ASCII set.



Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
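
The point about one stored backslash versus two displayed ones can be checked directly. The following is only a small sketch using base R; the names `no_lead_regex` and `no_lead_fixed` are made up for illustration, and the fixed = TRUE variant is an alternative assumed here, not something proposed in the thread.

```r
a <- "\\This is a string."   # the literal "\\" stores ONE backslash

nchar("\\")                  # 1
print(a)                     # print() shows it escaped: "\\This is a string."
cat(a, "\n")                 # cat() shows the stored text: \This is a string.

# In a regular expression the backslash must be escaped again,
# which is why the R literal needs four of them.
no_lead_regex <- sub("^\\\\", "", a)

# Illustrative alternative: fixed = TRUE avoids regex escaping, but it removes
# the first backslash anywhere in the string, not only a leading one.
no_lead_fixed <- sub("\\", "", a, fixed = TRUE)

identical(no_lead_regex, no_lead_fixed)  # TRUE for this input
```

Both calls happen to agree here only because the sole backslash in `a` is the leading one.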



Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 9:34 PM, Miao wrote:
> Thanks Duncan for clarifying this.  I'm pretty much a newbie with these
> kinds of special characters.  In R's gsub(), what regular expression
> should I use to handle all these situations?


I don't know.  This might work:

gsub("[\x01-\x1f\x7f-\xff]", "", x)

(i.e. the range from character 1 to character 31, and 127 to 255) but I 
don't know if our regular expression matcher will accept those characters.


Duncan Murdoch
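
For what it's worth, here is one way the byte-range idea could be tried out. It is a sketch under assumptions, not a recipe tested in every locale: it writes the same set as a negated class of printable ASCII (space through tilde) so that the pattern itself stays plain ASCII, and it passes useBytes = TRUE so that gsub() matches byte by byte. The one-line `txt` and the name `clean` are illustrative.

```r
# Sample text from the original question, written on one line for simplicity.
txt <- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab and be patient :"

# Keep only printable ASCII (space to tilde) and drop every other byte.
# That removes bytes 0x01-0x1f and 0x7f-0xff, the same set as the ranges above.
clean <- gsub("[^ -~]", "", txt, useBytes = TRUE)

clean

# Number of bytes that were removed from the sample text.
removed <- nchar(txt, type = "bytes") - nchar(clean, type = "bytes")
removed
```

Note that this also strips tabs and newlines, since they fall in the 0x01-0x1f range; add them to the class if they should be kept.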




> On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote:
>
>> On 29/04/2011 7:41 PM, Miao wrote:
>>
>>> Hello,
>>>
>>> Can anyone help on gsub() in R?  I have a string like something below,
>>> and wanted to delete all the strings with leading backslash, including
>>> "\xa0On", "\023", "\xab", and many others.   How should I write a
>>> regular expression pattern in gsub()?  I don't care how many characters
>>> following backslash.
>>
>> If those are R strings, none of them contain a backslash.  In R, a
>> backslash would always be printed as \\.
>>
>> \x is the introduction to a hexadecimal encoding for a character;
>> the next two characters show the hex digits.  So your first string
>> contains a single character \xa0, the third one contains \xab, and
>> so on.
>>
>> The \023 is an octal encoding for a single character.
>>
>> Duncan Murdoch
>>
>>> txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab
>>> and be patient :"
>>>
>>> Thanks in advance,
>>> Miao
>
> --
> proceed everyday




Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
Thanks Duncan for clarifying this.  I'm pretty much a newbie with these
kinds of special characters.  In R's gsub(), what regular expression
should I use to handle all these situations?


On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote:

> On 29/04/2011 7:41 PM, Miao wrote:
>
>> Hello,
>>
>> Can anyone help on gsub() in R?  I have a string like something below, and
>> wanted to delete all the strings with leading backslash, including
>> "\xa0On",
>> "\023", "\xab", and many others.   How should I write a regular expression
>> pattern in gsub()?  I don't care how many characters following backslash.
>>
>
>
> If those are R strings, none of them contain a backslash.  In R, a
> backslash would always be printed as \\.
>
> \x is the introduction to a hexadecimal encoding for a character; the next
> two characters show the hex digits.  So your first string contains a single
> character \xa0, the third one contains \xab, and so on.
>
> The \023 is an octal encoding for a single character.
>
> Duncan Murdoch
>
>
>
>> txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab
>> and be patient :"
>>
>> Thanks in advance,
>> Miao
>>
>>
>
>


-- 
proceed everyday



Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 7:41 PM, Miao wrote:
> Hello,
>
> Can anyone help on gsub() in R?  I have a string like something below, and
> wanted to delete all the strings with leading backslash, including "\xa0On",
> "\023", "\xab", and many others.   How should I write a regular expression
> pattern in gsub()?  I don't care how many characters following backslash.



If those are R strings, none of them contain a backslash.  In R, a 
backslash would always be printed as \\.


\x is the introduction to a hexadecimal encoding for a character; the 
next two characters show the hex digits.  So your first string contains 
a single character \xa0, the third one contains \xab, and so on.


The \023 is an octal encoding for a single character.

Duncan Murdoch
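
A quick way to check this is to look at the raw bytes. The snippet below is only a sketch using base R (charToRaw() and nchar() with type = "bytes"); the names `s` and `has_backslash` are made up for illustration, and it assumes the strings were typed as R string literals.

```r
# Illustrative check, assuming the strings were entered as R string literals.
s <- "\xa0On"                 # \xa0 is ONE byte (hex a0), followed by "O" and "n"

charToRaw(s)                  # the bytes actually stored: a0 4f 6e
nchar(s, type = "bytes")      # 3 bytes, not 6

# A real backslash is the single byte 5c; none of the bytes in s equal it.
has_backslash <- any(charToRaw(s) == charToRaw("\\"))
has_backslash                 # FALSE

charToRaw("\023")             # octal 023 is a single byte: 13 in hex (19 decimal)
charToRaw("\xab")             # likewise a single byte: ab
```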




> txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab
> and be patient :"
>
> Thanks in advance,
> Miao
