Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 7:41 PM, Miao wrote:

Hello,

Can anyone help with gsub() in R?  I have a string like the one below, and
want to delete all the substrings with a leading backslash, including \xa0On,
\023, \xab, and many others.  How should I write the regular expression
pattern for gsub()?  I don't care how many characters follow the backslash.



If those are R strings, none of them contain a backslash.  In R, a 
backslash would always be printed as \\.


\x is the introduction to a hexadecimal encoding for a character; the 
next two characters show the hex digits.  So your first string contains 
a single character \xa0, the third one contains \xab, and so on.


The \023 is an octal encoding for a single character.
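
A quick way to check this is to look at the raw bytes. The sketch below is only illustrative: `x` is a made-up string modelled on the one in the question, and `useBytes = TRUE` is an assumption added so that the check does not depend on the session's locale.

```r
# Hypothetical example string modelled on the one in the question.
x <- "Thing\xa0On \023 ok \xab end"

# Each escape is stored as one byte, not as a backslash plus digits.
charToRaw("\xa0")   # a single byte, a0
charToRaw("\023")   # a single byte, 13 in hex (octal 023 = decimal 19)

# A real backslash has to be typed as "\\" and is still one character.
nchar("\\", type = "bytes")

# There is no backslash anywhere in x (fixed = TRUE makes "\\" a literal
# backslash rather than a regular expression).
has_backslash <- grepl("\\", x, fixed = TRUE, useBytes = TRUE)
has_backslash
```

So a pattern that looks for a literal backslash has nothing to match in a string like this one.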

Duncan Murdoch




txt <- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab and be patient :"

Thanks in advance,
Miao


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
Thanks, Duncan, for clarifying this.  I'm pretty much a newbie when it comes to
these kinds of special characters.  In R's gsub(), what regular expression
should I use to handle all these situations?




Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 9:34 PM, Miao wrote:

Thanks Duncan for clarifying this.  I'm pretty a newbie to such type of
characters and special characters.  In R's gsub() what regular
expressions shall I use to handle all these situations?


I don't know.  This might work:

gsub("[\x01-\x1f\x7f-\xff]", "", x)

(i.e. the range from character 1 to character 31, and 127 to 255) but I 
don't know if our regular expression matcher will accept those characters.
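
One way to sidestep the doubt about the matcher is to keep the pattern itself plain ASCII and let the regex engine interpret the hex escapes. The following is a sketch, not a recipe verified on every platform: it assumes `perl = TRUE` (PCRE) together with `useBytes = TRUE`, so that matching is done byte by byte, and `x` is a made-up string modelled on the one in the question.

```r
# Hypothetical example string modelled on the one in the question.
x <- "Thing\xa0On \023 ok \xab end"

# Same ranges as above (bytes 1-31 and 127-255). The doubled backslashes
# mean the regex engine, not the R parser, sees \x01, \x1f and so on.
cleaned <- gsub("[\\x01-\\x1f\\x7f-\\xff]", "", x,
                perl = TRUE, useBytes = TRUE)
cleaned
```

Because the pattern contains only ASCII characters, it should not matter whether the session runs in a Latin-1 or a UTF-8 locale; that is the reason for this variant, not a claim that the literal-byte pattern fails.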


Duncan Murdoch






Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, Duncan Murdoch wrote:


On 29/04/2011 7:41 PM, Miao wrote:


Can anyone help with gsub() in R?  I have a string like the one below, and
want to delete all the substrings with a leading backslash, including \xa0On,
\023, \xab, and many others.  How should I write the regular expression
pattern for gsub()?  I don't care how many characters follow the backslash.



If those are R strings, none of them contain a backslash.  In R, a backslash 
would always be printed as \\.


\x is the introduction to a hexadecimal encoding for a character; the next 
two characters show the hex digits.  So your first string contains a single 
character \xa0, the third one contains \xab, and so on.


The \023 is an octal encoding for a single character.



If we were dealing with a leading backslash, I guess this would do it:

gsub("^\\\\.*", "", txt)

R would display a double backslash, but I believe that represents a single 
backslash.  So if the string were saved using write.table, say, only a 
single backslash would be stored.



> a <- "\\This is a string."
> a
[1] "\\This is a string."
> gsub("^\\\\", "", a)
[1] "This is a string."
> a
[1] "\\This is a string."
> gsub("^\\\\.*", "", a)
[1] ""
> gsub("^\\\\.*", "", c(a, "Another string", "\\more"))
[1] ""               "Another string" ""
> write.table(a, file="a.txt", quote=F, row.names=F, col.names=F)


$ cat a.txt
\This is a string.

Apparently this is not what the OP really wanted.  The OP probably wanted 
to remove characters that were not from the regular ASCII set.
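
If only the non-ASCII bytes should go, and control characters such as \023 should stay, the character class can be narrowed to bytes 128-255. This is a minimal sketch under that assumption; `strip_non_ascii` is a hypothetical helper name, not a function from base R or any package, and the input vector is made up for illustration.

```r
# Hypothetical helper: drop every byte in the range 0x80-0xff.
# perl = TRUE lets PCRE read the \x escapes; useBytes = TRUE makes the
# match byte-wise, so the input need not be valid in the current locale.
strip_non_ascii <- function(s) {
  gsub("[\\x80-\\xff]", "", s, perl = TRUE, useBytes = TRUE)
}

x <- c("Thing\xa0On \023 ok \xab end", "Another string")
res <- strip_non_ascii(x)
res
```

The control character \023 is left in place here, and the plain ASCII element passes through unchanged.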



Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota



Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
That works like a charm!  Thanks so much Duncan.
