Hi.

On Wed, Feb 23, 2011 at 4:02 PM, Gregory W <greg.d.w...@gmail.com> wrote:
> Hello,
>
> I was hoping to get some information about how to manipulate
> AromaUnitTotalCnBinarySets.
>
> I've already performed CRMAv2 on all the affy platforms in my study.
>
> I load the results into an R session with:
>
>
>>    tags <- "ACC,ra,-XY,BPN,-XY,AVG,FLN,-XY" ;
>>    dsT  <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, 
>> chipType=chipType) ;
>>    print(dsT)
>>    dsC       <- list() ;
>>    dsC$total <- dsT    ;
>
>
> I have normal and tumor replicates for a particular patient.  I grab
> these data by:
>
>>    normals.indx  <- match(normal.names,getNames(dsC$total))
>>    normals         <- extract(dsC$total, normals.indx)
>
> And similarly for the tumors.  This is what normals and tumors
> contains:
>
>> normals
> .   AromaUnitTotalCnBinarySet:
> .   Name: MyStudy
> .   Tags: ACC,ra,-XY,BPN,-XY,AVG,FLN,-XY
> .   Full name: MyStudy,ACC,ra,-XY,BPN,-XY,AVG,FLN,-XY
> .   Number of files: 2
> .   Names: Normal_Rep1,   Normal_Rep2
> .   Path (to the first file): totalAndFracBData/MyStudy,ACC,ra,-
> XY,BPN,-XY,AVG,FLN,-XY/GenomeWideSNP_6
> .   Total file size: 14.35 MB
> .   RAM: 0.00MB
>
>
>> tumors
> .   AromaUnitTotalCnBinarySet:
> .   Name: MyStudy
> .   Tags: ACC,ra,-XY,BPN,-XY,AVG,FLN,-XY
> .   Full name: MyStudy,ACC,ra,-XY,BPN,-XY,AVG,FLN,-XY
> .   Number of files: 2
> .   Names: Tumors_Rep1,   Tumors_Rep2
> .   Path (to the first file): totalAndFracBData/MyStudy,ACC,ra,-
> XY,BPN,-XY,AVG,FLN,-XY/GenomeWideSNP_6
> .   Total file size: 14.38 MB
> .   RAM: 0.00MB
>
>
> I see there's a function getAverageFile that allows me to do this:
>
>>     normals.average <- getAverageFile(normals)
>>     normals             <- newInstance(normals, files=list(normals.average))
>
> and similarly for tumors.  Then I can take these results and pass them
> to CbsModel:
>
>>     cbs  <- CbsModel(tumors, normals)
>
> HOWEVER, even though my replicates are from the same sample, they were
> processed using different protocols. So instead of averaging the
> normals and then averaging the tumors, i would like to take the
> respective ratios and then average the ratios.
>
> I can accomplish this in a round about way by using writeDataFrame and
> loading it back into R and then manipulating the data.  But then the
> data isn't in the right format for CbsModel since its been ratio-ized
> and is no longer two AromaUnitTotalCnBinarySets.
>
> 1-  Is there a way to perform an average of the ratios and then pass
> this into the CbsModel??

There is not-yet-public approach. The reason why it is "not yet
public" is that the directory and/or file structure may be changed in
the future.  However, here is how it goes:

Given your two AromaUnitTotalCnBinarySet:s, you can export total CN
ratios for each tumor-normal pair as:

dsT <- tumors;
dsN <- normals;
dsTCN <- exportTotalCnRatioSet(dsT, dsN, logBase=NULL, verbose=verbose);

which will generate a new AromaUnitTotalCnBinarySet (stored under
"rawCnData/") where the fullnames of the files have format
GSM226867,ref=GSM226872,ratio,total (indicating that the TCN ratios
have been calculated for GSM226867 with GSM226872 as the reference).
Note that this will export TCN on the non-log scale (we avoid
logarithms as far as possible in aroma, especially in the file
formats!).

In order to calculate the average of the T/N samples in this TCN data
set, you can do:

dfAvg <- getAverageFile(dsTCN, verbose=-10);

Importantly, each TCN file i=1,2,...,I contains TCNs calculated as
C_ij = T_ij / N_ij for loci j=1,2,...,J.  This means that when you do
the above, the average is calculated as C_j = median_i(C_ij) =
median_i(T_ij / N_ij).  If you want to calculate this on the log-ratio
scale(*), you need to do:

dfAvgLog <- getAverageFile(dsTCN, g=log2, h=function(y) 2^y, verbose=-10);

which will calculate the average as C_j = h(median_i(g(C_ij)) =
2^(median_i(log2(C_ij)).

Footnote: (*) Note that since we are using the median to calculate the
average, the two approaches will indeed give identical results, cf.

> diff <- dfAvg[,1] - dfAvgLog[,1]
> summary(diff)
       1
 Min.   :0
 1st Qu.:0
 Median :0
 Mean   :0
 3rd Qu.:0
 Max.   :0

In other words, you don't have to bother with g=..., and h=... approach.


Now we're close. The 'dfAvg' object is a file and it has a internal
fullname, e.g. .average-signals-median-mad,72ad79d42f44e60321fc0ad2fbe30a59.
 To change this name, do:

# Change the fullname (using a fullname translator)
setFullName(dfAvg, "TumorVsNormal");

Next, before you can pass it to CbsModel, you have wrap the 'dfAvg'
file up in a (single-file) data set, e.g.

dsAvg <- newInstance(dsT, list(dfAvg));

Then do:

# Segmentation model
cbs <- CbsModel(dsAvg, tags="*,TCN,avg");
cbs$.calculateRatios <- FALSE;

and so on.

/Henrik

[snip]

>
> Thank you!
> Greg
>
>
>
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
> version of the package, 2) to report the output of sessionInfo() and 
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups 
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

Reply via email to