Dear R-helpers & bioconductor

Sorry for cross-posting; this is an R programming question that arises in a
Bioconductor context.
Apologies also for the long message; I have tried to make the request complete.

I am trying to write a subset method for a specific class (ExpressionSet
from Bioconductor) that allows more flexible selection than the "[" method.

The skeleton I have in mind is the following:

subset.ExpressionSet <- function(x,subset,...){

}

I will use the subset argument for rows (features, i.e. genes), as in the
default method.

Now I would like to select columns (samples) based on the phenotypic data,
which provides detailed information about the columns.

Basically, the first function I have written allows the following:

> sub1 <- subset(ExpressionSetObject, subset=NULL, V1=value1, V2=value2)
# subset=NULL takes all rows

Here there are two conditions on two variables of the data.frame
encapsulated in the ExpressionSetObject (to be complete, conditions can be
applied to more than 2 columns, since they are matched against the
phenotypic data.frame, which holds all the variables).
To simplify a little, this would return nearly the same as:
ExpressionSetObject[,V1==value1 & V2==value2]

This is nice as I can already handle any number of conditions on variable
values thanks to '...'. The first step is
conditions <- list(...)
and the conditions are then handled later in the code.

Nevertheless, those conditions are basic (one value each).

I would like to handle arbitrary conditions, such as: V1 %in% c(value1,
value2)
A simple equality would then be passed as V2==value2 instead of V2=value2.

My actual problem is that I don't know how to turn '...' into an object
containing those conditions that can be used later.

My attempt that seems to come closest is:

> foo <- function(...){
> as.expression(substitute(list(...)))
> }
> foo(x==1, y%in%1:2)
expression(list(x == 1, y %in% 1:2))

whereas I would like to have something like
list(expression(x==1), expression(y %in% 1:2))
with those expressions being evaluated later on in the context of my
specific object.


Are there any existing functions where '...' is already handled the way I
want, so that I can mimic them?
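
To make the request concrete, here is a minimal sketch of the behaviour I am after, using base R only. captureDots is a hypothetical helper name (not an existing function), and df is a toy data.frame standing in for the phenotypic data. The idea, as far as I understand it, is to drop the leading `list` symbol from the substituted call, which leaves one unevaluated call per condition; each can then be evaluated inside a data.frame with eval().

```r
# captureDots is a hypothetical helper name, used for illustration only.
captureDots <- function(...) {
  # substitute(list(...)) gives the unevaluated call list(x == 1, ...);
  # as.list() splits it into its parts and [-1L] drops the `list` symbol.
  as.list(substitute(list(...)))[-1L]
}

conds <- captureDots(x == 1, y %in% 1:2)

# Toy data standing in for the phenotypic data.frame (an assumption).
df <- data.frame(x = c(1, 1, 2), y = c(1, 3, 2))

# Evaluate every condition with the columns of df in scope, then AND them.
keep <- Reduce(`&`, lapply(conds, function(e) eval(e, df)))
```

The elements of conds are calls rather than expression vectors; eval() accepts both, and lapply(conds, as.expression) would convert them if an expression is really needed.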

Thanks for any insight.


Eric

---

For those who have Biobase available, here is my current subset function and
a demo case that illustrates it.


library(Biobase)
example(ExpressionSet) # create sample object
print(expressionSet)

# now my subset function as it is

subset.ExpressionSet <- function(x,subset=NULL,verbose=TRUE,...){
  # subset is used to subset on rows (features)
  # ... holds named conditions on columns (samples), matched against pData(x)
    stopifnot(is(x,"ExpressionSet"))
    phenoData <- pData(x)
    listCriteria <- list(...)
    if (is.null(subset)) subset <- rep(TRUE,nrow(exprs(x)))
    subset <- subset & !is.na(subset)
    # start with every column selected, then AND in each valid criterion
    keep <- rep(TRUE,nrow(phenoData))
    for (critname in names(listCriteria)) {
      if (!critname %in% colnames(phenoData)) {
        if (verbose) cat("\n*** subsetCompounds: Dropped criteria:",critname,
                         "not in phenoData of object\n")
      } else if (!is.null(listCriteria[[critname]])) {
        # a NULL criterion means "any value", so it does not restrict anything
        keep <- keep & phenoData[,critname] %in% listCriteria[[critname]]
      }
    }
    selectedColumns <- rownames(phenoData)[keep]
    out <- x[subset,selectedColumns]
    if (verbose) cat('\n',length(selectedColumns),' columns selected (',
                     paste(selectedColumns,collapse=' '),')\n',sep='')
    invisible(out)
  }

# looking at phenotypic data associated with the sample expressionSet
> pData(expressionSet)
     sex    type score
A Female Control  0.75
B   Male    Case  0.40
C   Male Control  0.73
D   Male    Case  0.42
E Female    Case  0.93
F   Male Control  0.22
G   Male    Case  0.96
H   Male    Case  0.79
I Female    Case  0.37
J   Male Control  0.63
K   Male    Case  0.26
L Female Control  0.36
M   Male    Case  0.41
N   Male    Case  0.80
O Female    Case  0.10
P Female Control  0.41
Q Female    Case  0.16
R   Male Control  0.72
S   Male    Case  0.17
T Female    Case  0.74
U   Male Control  0.35
V Female Control  0.77
W   Male Control  0.27
X   Male Control  0.98
Y Female    Case  0.94
Z Female    Case  0.32


# now the sample use
> (subset1 =subset(expressionSet,sex="Male",type="Control"))
7 columns selected (C F J R U W X)
ExpressionSet (storageMode: lockedEnvironment)
assayData: 500 features, 7 samples
  element names: exprs, se.exprs
phenoData
  sampleNames: C, F, ..., X  (7 total)
  varLabels and varMetadata description:
    sex: Female/Male
    type: Case/Control
    score: Testing Score
featureData
  featureNames: AFFX-MurIL2_at, AFFX-MurIL10_at, ..., 31739_at  (500 total)
  fvarLabels and fvarMetadata description: none
experimentData: use 'experimentData(object)'
Annotation: hgu95av2


# what I would like to allow:
(subset2 = subset(expressionSet, sex=="Male", score > 0.75)) # note the == instead of =
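
A rough sketch of the column selection I have in mind, written against a plain data.frame so that it runs without Biobase. selectSamples is a hypothetical helper (not part of Biobase), and pd holds four rows copied from the pData table above as a stand-in for pData(expressionSet). Treating an NA result as "not selected" is my own assumption.

```r
# Stand-in for pData(expressionSet): four rows copied from the table above.
pd <- data.frame(sex   = c("Female", "Male", "Male", "Male"),
                 score = c(0.75, 0.40, 0.96, 0.79),
                 row.names = c("A", "B", "G", "H"))

# selectSamples is a hypothetical helper, not part of Biobase.
selectSamples <- function(pd, ...) {
  # capture the conditions unevaluated, one call per argument
  conds <- as.list(substitute(list(...)))[-1L]
  keep <- rep(TRUE, nrow(pd))
  for (cond in conds) {
    # columns of pd are looked up first, then the caller's environment
    r <- eval(cond, pd, parent.frame())
    # assumption: a row whose condition evaluates to NA is not selected
    keep <- keep & !is.na(r) & r
  }
  rownames(pd)[keep]
}

selected <- selectSamples(pd, sex == "Male", score > 0.75)
```

The returned sample names could then be used as the column index, as in x[subset, selectedColumns] in my current function. One caveat I can see: with the signature function(x, subset, ...), the first unnamed condition would be matched to the subset argument rather than to '...', so the row argument would probably have to be passed by name or the signature changed.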


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help