Thanks, S. Ellison.

I finally had some time to test it. Thanks for your clarification.

Just one more question:

You say:

Your regexes are on multiple lines and include whitespace and linefeeds.
For example you are not testing for
" .*forum.*|.*buy.*"; you are testing for
" .*forum.*|
                      .*buy.*"


But ".*", as far as I understand, means any character, 0 or more
times. So it should cover the blanks and line breaks. Could you explain this
further? It is not clicking in my head.
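
To make the question concrete, here is a minimal sketch of the two cases. The test vector `x` is hypothetical (it is not from the original data), and the indentation of the two-line pattern is only an approximation of the original code. As far as I can tell, the point is that the line break and the spaces in the pattern come before the ".*" of the second alternative, so they are literal characters that the text itself must contain; the ".*" can absorb extra characters in the text, but it cannot remove characters from the pattern.

```r
# Hypothetical test strings; not taken from the original data.
x <- c("my forum", "buy now")

# Pattern written across two lines: the line break and the indentation
# become literal characters at the start of the second alternative.
pat_multiline <- ".*forum.*|
      .*buy.*"

# The same pattern kept on one line.
pat_single <- ".*forum.*|.*buy.*"

res_multiline <- grepl(pat_multiline, x)
res_single <- grepl(pat_single, x)

print(res_multiline)  # "buy now" contains no newline plus spaces before "buy"
print(res_single)
```

With these inputs the two-line pattern should match only "my forum", while the one-line pattern should match both strings.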




2015-10-26 7:29 GMT-05:00 S Ellison <s.elli...@lgcgroup.com>:

>
>
> > From: Omar André Gonzáles Díaz
> > Subject: [R] regex not working for some entries in for loop
> >
> > I'm using some regex in a for loop to check for some values in column
> > "source", and put a result in column "fuente".
>
> Your regexes are on multiple lines and include whitespace and linefeeds.
> For example you are not testing for
> " .*forum.*|.*buy.*"; you are testing for
> " .*forum.*|
>                       .*buy.*"
> (which among other things includes a \n)
> Don’t do that. Keep it to one line with no white space.
> If you must have line breaks in the code, form the pattern using paste, as
> in
> pat1 <- paste(c("site.*", ".*event.*", ".*free.*", ".*theguardlan.*",
>         ".*guardlink.*", ".*torture.*", ".*forum.*", ".*buy.*",
>         ".*share.*", ".*buttons.*", ".*pyme\\.lavoztx\\.com\\.*",
>         ".*amezon.*", "computrabajo.com.pe", ".*porn.*", "quality"),
>         collapse="|")
>
> spam <- grepl(pat1, sf$source, ignore.case = TRUE)
>
> Also, it's not immediately clear why you’re looping. grepl returns a
> vector of logicals; you have a vector of character strings. Consider
> replacing 'if' constructs with 'ifelse' - albeit a complicated ifelse() -
> and doing the whole thing without a loop.
>
> S Ellison
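
The loop-free approach suggested in the quoted reply could look roughly like the sketch below. Everything in it is an assumption for illustration: the small data frame standing in for `sf`, the two shortened pattern lists, and the labels "spam", "social", and "other" are hypothetical, not the values from the original script.

```r
# Hypothetical data frame standing in for the poster's `sf`.
sf <- data.frame(
  source = c("free-share-buttons.xyz", "google",
             "forum.example.com", "Facebook.com"),
  stringsAsFactors = FALSE
)

# Build each pattern with paste() so the code can span lines
# without putting whitespace or line breaks inside the regex.
pat_spam <- paste(c(".*forum.*", ".*buy.*", ".*share.*"), collapse = "|")
pat_social <- paste(c(".*facebook.*", ".*twitter.*"), collapse = "|")

# grepl() is vectorised, so nested ifelse() calls classify the whole
# column at once; the first matching condition wins.
sf$fuente <- ifelse(grepl(pat_spam, sf$source, ignore.case = TRUE), "spam",
             ifelse(grepl(pat_social, sf$source, ignore.case = TRUE), "social",
                    "other"))

print(sf)
```

Because the conditions are checked in order, a source that matches both patterns gets the label of the first one, which mirrors how a chain of if / else if tests would behave inside a loop.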
