Hi Sabrina.

Please keep responses on-list in case others have a similar question.

> How do you test if the one I created is similar to the existing core
> CDF? I don't think I include the extended or FUll probes, because I
> only included the probes with notation of core from Affy's annotation
> file


There is a function called compareCdfs() but I have never used it so  
I'm not sure what it does for a comparison.  Might be worth a look.

You could always just read the indices and see how much overlap there  
is.  It shouldn't be too hard.  A few thoughts on this:

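library(aroma.affymetrix)

# Standard core CDF and your custom CDF ("yourTags" is a placeholder)
cdf1 <- AffymetrixCdfFile$fromChipType("MoEx-1_0-st-v1",
                                       tags="coreR1,A20080718,MR")
cdf2 <- AffymetrixCdfFile$fromChipType("MoEx-1_0-st-v1",
                                       tags="yourTags")

# Cell indices per unit, flattened to one vector per unit
cells1 <- getCellIndices(cdf1)
cells1 <- lapply(cells1, unlist, use.names=FALSE)
cells2 <- getCellIndices(cdf2)
cells2 <- lapply(cells2, unlist, use.names=FALSE)

# Restrict both lists to the units they share, in the same order
commonUnits <- intersect(names(cells1), names(cells2))
cells1 <- cells1[match(commonUnits, names(cells1))]
cells2 <- cells2[match(commonUnits, names(cells2))]

# Per unit: number of cells in each CDF and number shared
overlapSummaryTable <- matrix(0, nrow=length(commonUnits), ncol=3,
    dimnames=list(commonUnits, c("cells1","cells2","overlap")))

for (i in seq_along(commonUnits)) {
  overlapSummaryTable[i,] <- c(length(cells1[[i]]),
                               length(cells2[[i]]),
                               length(intersect(cells1[[i]], cells2[[i]])))
}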

Presumably, every unit you have is a subset of the standard CDF?
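
One rough way to check that subset question is sketched below with made-up toy lists standing in for cells1 and cells2 (the unit names u1..u3 and the index values are purely illustrative, not taken from either CDF). It uses base R only, so the same two sapply() calls should work on the real per-unit lists:

```r
# Toy stand-ins for the per-unit cell index lists (hypothetical values)
cells1 <- list(u1 = c(1L, 2L, 3L, 4L), u2 = c(10L, 11L, 12L), u3 = c(20L, 21L))
cells2 <- list(u1 = c(2L, 3L), u2 = c(10L, 11L, 99L))

commonUnits <- intersect(names(cells1), names(cells2))

# TRUE when every cell of a unit in cells2 also appears in that unit in cells1
isSubset <- sapply(commonUnits, function(u) all(cells2[[u]] %in% cells1[[u]]))

# Fraction of each custom unit's cells that are found in the standard unit
fracShared <- sapply(commonUnits, function(u) mean(cells2[[u]] %in% cells1[[u]]))

print(isSubset)
print(fracShared)
```

Units where isSubset is FALSE, or where fracShared is well below 1, would be the ones worth looking at by hand.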


> The other question i have is that even if I use the existing CDF file,
> some of the gene expression or probe level expressions are really low,
> like 1.1. It seemed to me that they are just too low. These were
> obtained after the background normalization etc.
> I used Rmabackgroundcorrection


I need a little more convincing before I get too worried about these  
things.  Is there any reason to suspect there is something wrong  
here?  What does the raw (i.e. before BG adjustment) data look like  
for these units that have low expression?  What does the data for this  
unit look like after BG adjustment but before normalization?  What  
does the data for this unit look like before PLM fitting but after  
normalization?   Anything unusual in the commands you have run?  Like  
taking logs twice or BG adjusting data that has already been  
preprocessed, etc.?
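
As a rough illustration of the "logs twice" check, here is a small base R sketch on simulated intensities. The data are made up, and the 65535 ceiling assumes 16-bit raw intensities, which may not match your files. The point is only that log2 applied twice squeezes everything below log2(16) = 4, so comparing the range of your values at each stage against these bounds can hint at whether an extra log crept in:

```r
# Simulated raw probe intensities (hypothetical; assumes a 16-bit range)
set.seed(1)
raw <- round(2^rnorm(1000, mean = 7, sd = 1.5))
raw <- pmin(pmax(raw, 2), 65535)   # keep values in [2, 65535]

once  <- log2(raw)         # the usual log2 scale
twice <- log2(log2(raw))   # what an accidental second log would give

print(summary(once))
print(summary(twice))
```

Looking at summary() of the real data before BG adjustment, after BG adjustment, and after normalization in the same way should show at which step the values drop.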

Just some thoughts.

Cheers,
Mark




On 4-Sep-09, at 12:49 AM, sabrina wrote:

> Hi, Mark:
> How do you test if the one I created is similar to the existing core
> CDF? I don't think I include the extended or FUll probes, because I
> only included the probes with notation of core from Affy's annotation
> file
> Here is the display when I used the existing core CDF
>
> Path: annotationData/chipTypes/MoEx-1_0-st-v1
> Filename: MoEx-1_0-st-v1,coreR1,A20080718,MR.cdf
> Filesize: 30.53MB
> Chip type: MoEx-1_0-st-v1,coreR1,A20080718,MR
> RAM: 0.00MB
> File format: v4 (binary; XDA)
> Dimension: 2560x2560
> Number of cells: 6553600
> Number of units: 17831
> Cells per unit: 367.54
> Number of QC units: 1
>
>
> when I check one of the units
> $`6838637`
> $`6838637`$type
> [1] 1
>
> $`6838637`$direction
> [1] 1
>
>
>
> from the existing CDF:
> $`6838637`$groups$`4736172`
> $`6838637`$groups$`4736172`$x
> [1] 2273 1148  816 1391
>
> $`6838637`$groups$`4736172`$y
> [1]  832 1172 2502  540
>
> $`6838637`$groups$`4736172`$pbase
> [1] "C" "T" "A" "G"
>
> $`6838637`$groups$`4736172`$tbase
> [1] "G" "A" "T" "C"
>
> $`6838637`$groups$`4736172`$expos
> [1] 0 1 2 3
>
> $`6838637`$groups$`4736172`$direction
> [1] 1
>
> -----------------------------------------------------------------------
> Here is the one I created myself
> Path: annotationData/chipTypes/MoEx-1_0-st-v1
> Filename: MoEx-1_0-st-v1,core,2009-09-02,no1Probe,no1Exon,SS.cdf
> Filesize: 28.14MB
> Chip type: MoEx-1_0-st-v1,core,2009-09-02,no1Probe,no1Exon,SS
> RAM: 0.00MB
> File format: v4 (binary; XDA)
> Dimension: 2560x2560
> Number of cells: 6553600
> Number of units: 16442
> Cells per unit: 398.59
> Number of QC units: 0
>
> ----------------------------------------------------------------------------
> from my CDF, I have
>
> $`6838637`
> $`6838637`$type
> [1] 1
>
> $`6838637`$direction
> [1] 0
>
> .................................
> $`6838637`$groups$`4736172`
> $`6838637`$groups$`4736172`$x
> [1] 1148  816 2273 1391
>
> $`6838637`$groups$`4736172`$y
> [1] 1172 2502  832  540
>
> $`6838637`$groups$`4736172`$pbase
> [1] "A" "A" "A" "A"
>
> $`6838637`$groups$`4736172`$tbase
> [1] "T" "T" "T" "T"
>
> $`6838637`$groups$`4736172`$expos
> [1] 0 1 2 3
>
> $`6838637`$groups$`4736172`$direction
> [1] 1
>
>
>
> NOTICE that $pbase and $tbase are different. and $`6838637`$direction
> are different
> I noticed that the Number of QC units is 0 in my case, but I don't
> know what is that for.
> What other information do you need to diagnose it? What is the best
> way to verify if my CDF is overlapping with the existing one? Thanks!
>
> Sabrina
>
>
> On Sep 2, 6:17 pm, Mark Robinson <mrobin...@wehi.edu.au> wrote:
>> Hi Sabrina.
>>
>> Hmm, that does seem unexpected.  I would expect that if there are  
>> just
>> a few probes picked out, it shouldn't change that much.
>>
>> Can you give more details about this new CDF you made?  Would it be
>> similar to the existing 'core' CDF?  Presumably, there will be a  
>> large
>> overlap with the existing CDF.  Can you verify this?  And, give some
>> numbers on this.  A few things like: total number of probes for each
>> CDF, for a given unit does it have a large overlap of probes, etc.
>>
>> The one thing that comes to mind is that maybe you've included a  
>> bunch
>> of the 'full' or 'extended' probes in your new CDF and this would
>> likely lower the average.
>>
>> Cheers,
>> Mark
>>
>> On 3-Sep-09, at 5:07 AM, sabrina shao wrote:
>>
>>
>>
>>
>>
>>> Hi, all
>>> I am working on mouse Exon arrays from Affy. Because there are  
>>> probes
>>> with multiple hits on the genome, I screened out these probes and
>>> exons with only one probe and transcript clusters with one exon on  
>>> the
>>> array. After that I created a CDF file as illustrated from this  
>>> group.
>>> My problems is: when I checked the probe level expressions (not gene
>>> expression) , the average is about 4.5 , and the gene expression  
>>> level
>>> ( not surprisingly) is about similar level. And the smallest probe
>>> level was about 1.1.  I also used the CDF file I downloaded from  
>>> aroma
>>> here, using the same program, I got the gene expression level around
>>> 6.2. That is about 1.7 fold change. and the results from the new CDF
>>> that I created was just too low. I can't figure it out why. Can  
>>> anyone
>>> give me some suggestions and hints? Thanks
>>
>>> Sabrina
>>
>> ------------------------------
>> Mark Robinson, PhD (Melb)
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robin...@garvan.org.au
>> e: mrobin...@wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>> ------------------------------

------------------------------
Mark Robinson, PhD (Melb)
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robin...@garvan.org.au
e: mrobin...@wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
------------------------------





