[aroma.affymetrix] Re: exon array probe level value

sabrina Fri, 18 Sep 2009 08:45:26 -0700

Hi, Mark:


On Sep 6, 3:58 am, Mark Robinson <mrobin...@wehi.edu.au> wrote:
> Hi Sabrina.
>
> Please keep responses on-list in case others have a similar question.

I think I just accidentally hit the reply to author, sorry about that!

I solved the CDF problem; it came down to a single typo :)
>
> > The other question I have is that even if I use the existing CDF file,
> > some of the gene expression or probe level expressions are really low,
> > like 1.1. It seemed to me that they are just too low. These were
> > obtained after the background correction, normalization, etc.
> > I used RmaBackgroundCorrection
>
> I need a little more convincing before I get too worried about these  
> things.  Is there any reason to suspect there is something wrong  
> here?  What does the raw (i.e. before BG adjustment) data look like  
> for these units that have low expression?  What does the data for this  
> unit look like after BG adjustment but before normalization?  What  
> does the data for this unit look like before PLM fitting but after  
> normalization?   Anything unusual in the commands you have run?  Like  
> taking logs twice or BG adjusting data that has already been  
> preprocessed, etc.?


I am still struggling with this issue. What I am trying to figure out
is how to filter out background noise. I know that in the AROMA
process, I do background correction, then normalization, then fit the
FIRMA model at the probeset level. After I finished normalization
(ExonRmaPlm), I took a look at the gene-level and probe-level
distributions. Here is an example:

summary(pbsetSumm)  # this is for each probe on the array
# output (one column per array):
 Min.   : 0.8715   Min.   : 0.9028   Min.   : 0.9475   Min.   : 0.9499
 1st Qu.: 4.6625   1st Qu.: 4.5951   1st Qu.: 4.6517   1st Qu.: 4.6143
 Median : 6.3976   Median : 6.3863   Median : 6.3850   Median : 6.3803
 Mean   : 6.3778   Mean   : 6.3743   Mean   : 6.3652   Mean   : 6.3766
 3rd Qu.: 7.9876   3rd Qu.: 8.0125   3rd Qu.: 7.9833   3rd Qu.: 8.0051
 Max.   :14.4451   Max.   :14.3972   Max.   :14.4558   Max.   :14.4009

In the original FIRMA paper, I noticed that all probes with an
expression level < 3 (log2 scale) were filtered out, and in the
simulation study, you used two different values of Muc (7 and 10) to
mimic two different scenarios, one close to background, one far above
background.

If I filter out the probes with expression level < 3 (counting only the
first array), I will remove 16760 probes, which is about 8% of the
probes. But then the question is: if I want to detect exon skipping,
the transcript with skipped exons should have background-level
expression for that exon, am I correct? If so, then by filtering out
these so-called background noise probes, will I miss these exon
skipping events? I am a little bit confused and any suggestions are
welcome!
Thanks!
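
For reference, a minimal sketch of how such a count could be made in
base R, assuming the probe-level log2 values sit in a numeric matrix
with one row per probe and one column per array. The name probeLog2
and the simulated values are hypothetical stand-ins, not my real data
and not part of the aroma.affymetrix API:

```r
# Hypothetical stand-in for probe-level log2 intensities:
# rows are probes, columns are arrays. Real values would come from
# the analysis pipeline instead of rnorm().
set.seed(1)
probeLog2 <- matrix(rnorm(4000, mean = 6.4, sd = 2.4), ncol = 4,
                    dimnames = list(NULL, paste0("array", 1:4)))

threshold <- 3  # log2-scale cutoff discussed above

# Flag the low-intensity probes instead of deleting them, so the
# counts can be inspected before deciding on any filter.
isLow <- probeLog2 < threshold

nLow    <- colSums(isLow)   # number of probes below the cutoff, per array
fracLow <- colMeans(isLow)  # the same, as a fraction of all probes

print(data.frame(nLow = nLow, fracLow = round(fracLow, 3)))
```

Keeping the logical matrix isLow around, rather than subsetting right
away, makes it possible to check which exons the flagged probes belong
to before anything is thrown out.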

Sabrina

> Just some thoughts.
>
> Cheers,
> Mark
>
> On 4-Sep-09, at 12:49 AM, sabrina wrote:
>
>
>
> > Hi, Mark:
> > How do you test if the one I created is similar to the existing core
> > CDF? I don't think I included the extended or full probes, because I
> > only included the probes annotated as core in Affy's annotation
> > file.
> > Here is the display when I used the existing core CDF:
>
> > Path: annotationData/chipTypes/MoEx-1_0-st-v1
> > Filename: MoEx-1_0-st-v1,coreR1,A20080718,MR.cdf
> > Filesize: 30.53MB
> > Chip type: MoEx-1_0-st-v1,coreR1,A20080718,MR
> > RAM: 0.00MB
> > File format: v4 (binary; XDA)
> > Dimension: 2560x2560
> > Number of cells: 6553600
> > Number of units: 17831
> > Cells per unit: 367.54
> > Number of QC units: 1
>
> > when I check one of the units
> > $`6838637`
> > $`6838637`$type
> > [1] 1
>
> > $`6838637`$direction
> > [1] 1
>
> > from the existing CDF:
> > $`6838637`$groups$`4736172`
> > $`6838637`$groups$`4736172`$x
> > [1] 2273 1148  816 1391
>
> > $`6838637`$groups$`4736172`$y
> > [1]  832 1172 2502  540
>
> > $`6838637`$groups$`4736172`$pbase
> > [1] "C" "T" "A" "G"
>
> > $`6838637`$groups$`4736172`$tbase
> > [1] "G" "A" "T" "C"
>
> > $`6838637`$groups$`4736172`$expos
> > [1] 0 1 2 3
>
> > $`6838637`$groups$`4736172`$direction
> > [1] 1
>
> > -----------------------------------------------------------------------
> > Here is the one I created myself
> > Path: annotationData/chipTypes/MoEx-1_0-st-v1
> > Filename: MoEx-1_0-st-v1,core,2009-09-02,no1Probe,no1Exon,SS.cdf
> > Filesize: 28.14MB
> > Chip type: MoEx-1_0-st-v1,core,2009-09-02,no1Probe,no1Exon,SS
> > RAM: 0.00MB
> > File format: v4 (binary; XDA)
> > Dimension: 2560x2560
> > Number of cells: 6553600
> > Number of units: 16442
> > Cells per unit: 398.59
> > Number of QC units: 0
>
> > ----------------------------------------------------------------------------
> > from my CDF, I have
>
> > $`6838637`
> > $`6838637`$type
> > [1] 1
>
> > $`6838637`$direction
> > [1] 0
>
> > .................................
> > $`6838637`$groups$`4736172`
> > $`6838637`$groups$`4736172`$x
> > [1] 1148  816 2273 1391
>
> > $`6838637`$groups$`4736172`$y
> > [1] 1172 2502  832  540
>
> > $`6838637`$groups$`4736172`$pbase
> > [1] "A" "A" "A" "A"
>
> > $`6838637`$groups$`4736172`$tbase
> > [1] "T" "T" "T" "T"
>
> > $`6838637`$groups$`4736172`$expos
> > [1] 0 1 2 3
>
> > $`6838637`$groups$`4736172`$direction
> > [1] 1
>
> > NOTICE that $pbase and $tbase are different, and that
> > $`6838637`$direction is different as well.
> > I noticed that the Number of QC units is 0 in my case, but I don't
> > know what that is for.
> > What other information do you need to diagnose it? What is the best
> > way to verify whether my CDF overlaps with the existing one? Thanks!
>
> > Sabrina
>
> > On Sep 2, 6:17 pm, Mark Robinson <mrobin...@wehi.edu.au> wrote:
> >> Hi Sabrina.
>
> >> Hmm, that does seem unexpected.  I would expect that if there are
> >> just a few probes picked out, it shouldn't change that much.
>
> >> Can you give more details about this new CDF you made?  Would it be
> >> similar to the existing 'core' CDF?  Presumably, there will be a
> >> large overlap with the existing CDF.  Can you verify this?  And, give
> >> some numbers on this.  A few things like: total number of probes for
> >> each CDF, for a given unit does it have a large overlap of probes, etc.
>
> >> The one thing that comes to mind is that maybe you've included a
> >> bunch of the 'full' or 'extended' probes in your new CDF and this
> >> would likely lower the average.
>
> >> Cheers,
> >> Mark
>
> >> On 3-Sep-09, at 5:07 AM, sabrina shao wrote:
>
> >>> Hi, all
> >>> I am working on mouse Exon arrays from Affy. Because there are
> >>> probes with multiple hits on the genome, I screened out these probes,
> >>> as well as exons with only one probe and transcript clusters with
> >>> only one exon on the array. After that I created a CDF file as
> >>> illustrated in this group.
> >>> My problem is: when I checked the probe-level expressions (not gene
> >>> expression), the average is about 4.5, and the gene expression level
> >>> (not surprisingly) is at about the same level. The smallest probe-level
> >>> value was about 1.1. I also used the CDF file I downloaded from aroma
> >>> here, using the same program, and got a gene expression level around
> >>> 6.2. That is a difference of about 1.7 (roughly a 3-fold change if
> >>> these values are on the log2 scale), so the results from the new CDF
> >>> that I created were just too low. I can't figure out why. Can anyone
> >>> give me some suggestions and hints? Thanks
>
> >>> Sabrina
>
> >> ------------------------------
> >> Mark Robinson, PhD (Melb)
> >> Epigenetics Laboratory, Garvan
> >> Bioinformatics Division, WEHI
> >> e: m.robin...@garvan.org.au
> >> e: mrobin...@wehi.edu.au
> >> p: +61 (0)3 9345 2628
> >> f: +61 (0)3 9347 0852
> >> ------------------------------
>