Re: [R] Need content_transformer() called by tm_map() to change non-letters to spaces

2015-04-24 Thread Jeff Newmiller
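The regex "[^a-zA-Z]" reads as "not a letter", so you can pass it as the pattern to the toSpace transformer you already have; a second transformer is not needed.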
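As a rough sketch of how that pattern could be used: the helper name non_letters_to_space and the two sample strings below are made up for illustration, and the tm part is guarded so the snippet still runs where tm is not installed. It assumes docs is an ordinary VCorpus; adapt it to however your corpus was actually built.

```r
# Plain-R core: replace every character that is not an ASCII letter with a space.
# (non_letters_to_space is a hypothetical helper name, not part of tm.)
non_letters_to_space <- function(x) gsub("[^a-zA-Z]", " ", x)

print(non_letters_to_space("a1b-c"))

# Same idea through tm, keeping the (x, pattern) shape of the original toSpace.
# The sample corpus here is invented purely for demonstration.
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
  docs <- VCorpus(VectorSource(c("R2-D2 @ home/office", "tm_map: 100% done!")))
  docs <- tm_map(docs, toSpace, "[^a-zA-Z]")
  print(content(docs[[1]]))
}
```

Note that "[^a-zA-Z]" treats accented letters as non-letters too. If the text is not plain ASCII, "[^[:alpha:]]" may be the better pattern, since it follows the locale's idea of a letter.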

On April 23, 2015 1:10:41 PM PDT, Mike  wrote:
>Hello,
>In the following code, any characters matching "/|@| \\|" will be
>changed to a space.
>> library(tm)
>> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
>> docs <- tm_map(docs, toSpace, "/|@| \\|")
>
>What code would transform all non-letters to a space?  (What goes where
>the x's are.)  It is very difficult to put all non-letters in a
>string, so I'm doing the opposite of the above.
>> toSpace_2 <- content_transformer(function xxx))
>> docs <- tm_map(docs, toSpace_2, "abcdefghijklmnopqrstuvwxyz")
>
>This needs to be done by a content_transformer() function to maintain
>the integrity of docs.
>
>Thanks

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Need content_transformer() called by tm_map() to change non-letters to spaces

2015-04-23 Thread Mike
Hello,
In the following code, any characters matching "/|@| \\|" will be changed to
a space.
> library(tm)
> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
> docs <- tm_map(docs, toSpace, "/|@| \\|")

What code would transform all non-letters to a space?  (What goes where the
x's are.)  It is very difficult to put all non-letters in a string, so I'm
doing the opposite of the above.
> toSpace_2 <- content_transformer(function xxx))
> docs <- tm_map(docs, toSpace_2, "abcdefghijklmnopqrstuvwxyz")

This needs to be done by a content_transformer() function to maintain the 
integrity of docs.

Thanks
 