Hi,

Here is a somewhat detailed explanation of what I want to achieve:

I have a data frame:

      id     url
urlType
1     1      www.yahoo.com                                    1
2     2      www.google.com/?search=                     2
3     3      www.google.com                                   1
4     4      www.yahoo.com/?query=                       2
5     5      www.gmail.com                                     1

I want to get all the URLs that are not of type `1` and satisfy the
condition defined by the following function:

checkBaseLine <- function(s){
for (listItem in WHITELIST){
 if(regexpr(as.character(listItem), s)[1] > -1){
return(TRUE)
}
 }
return(FALSE)
}

Here is the definition for WHITELIST:-

WHITELIST = "[?]query=, [?]search="
WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))

Now, for the given data frame I want to apply the above function for
all row values for a given column:-

That is:

It works fine when I define a condition like:
data <- data[data$urlType != 1,]

However, I want to combine two logical conditions together like:
data <- data[data$urlType != 1 & checkBaseLine(data$url),]

This would check whether the column `urlType` contains row values that !=
1, and the column `url` contains row values that satisfy the function
definition.

Any ideas how this can be done?

Thanks in advance.

Regards,
Harsh Yadav


On Thu, Jul 8, 2010 at 9:43 PM, Erik Iverson <er...@ccbr.umn.edu> wrote:

> It will be a lot easier to help you if you follow the posting guide and
> PLEASE do read the posting guide and provide commented, minimal,
> self-contained, reproducible code.
>
> You gave your function definition, which is good.  Use ?dput to give us a
> small data.frame that can accurately show what you want.
>
>
> harsh yadav wrote:
>
>> Hi all,
>>
>> I have a data frame for which I want to limit the output by checking
>> whether
>> row values for specific column meets particular conditions.
>>
>> Here are the more specific details:
>>
>> I have a function that checks whether an input string exists in a defined
>> list:-
>>
>> checkBaseLine <- function(s){
>>  for (listItem in WHITELIST){
>> if(regexpr(as.character(listItem), s)[1] > -1){
>>  return(TRUE)
>> }
>> }
>>  return(FALSE)
>> }
>>
>> Now, I have a data frame for which I want to apply the above function for
>> all row values for a given column:-
>>
>> This works fine when I define a condition like:
>> data <- data[data$urlType != 1,]
>>
>> However, I want to combine two logical conditions together like:
>> data <- data[data$urlType != 1 & checkBaseLine(data$url),]
>>
>> This would check whether the column `urlType` contains row values that !=
>> 1,
>> and the column `url` contains row values that gets evaluated using the
>> defined function.
>>
>> Any ideas how this can be done?
>>
>> Thanks in advance.
>>
>> Regards,
>> Harsh Yadav
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to