Hi,

I have made a few remarks below:

On Sat, Dec 11, 2010 at 6:29 AM, Hans-Ulrich
<hans-ulrich.kl...@uni-muenster.de> wrote:
> Hi Hendrik.
>
> Sure - I would like to share my annotation files. I made a new .ugp
> file without problems. Unfortunately, a known issue arises when
> creating the new .ufl file.
> When importing the ufl information from the NetAffx annotation files
> na31, the following warning appears:
>
> Warning messages:
> 1: In updateDataColumn.AromaTabularBinaryFile(this, .con = con, .hdr =
> hdr,  :
>  4 values to be assigned were out of range [-32768,32767] and
> therefore censored to fit the range. Of these, 4 values in
> [42148,51916] were too large.
> 2: In updateDataColumn.AromaTabularBinaryFile(this, .con = con, .hdr =
> hdr,  :
>  90 values to be assigned were out of range [-32768,32767] and
> therefore censored to fit the range. Of these, 90 values in
> [33603,21059245] were too large.

Remark #1: By default, fragment lengths are stored as signed 2-byte
integers, hence the range limits.  *On a per-file basis*, this can be
changed to 4-byte integers, which doubles the file size (only to cover
those ~100 fragments).  I was surprised to see that the type is signed
(I guess there was a reason for that, or I simply forgot to override
the default).  If it were changed to unsigned, the range would be
[0,~65530] (reserving some values for NA/Inf etc.), which would cover
even more fragments.  Thus, this issue can be solved.
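
To make the range limits in Remark #1 concrete, here is a minimal
Python sketch.  It is not aroma.affymetrix code: the helper names
int_range() and censor() are made up for illustration, and the example
lengths are simply the interval endpoints quoted in the warnings above
plus two small values.  It ignores the few values an unsigned type
would reserve for NA/Inf.

```python
def int_range(n_bytes, signed):
    """Return (lowest, highest) value representable in n_bytes bytes."""
    bits = 8 * n_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def censor(values, lo, hi):
    """Clamp every value into [lo, hi], as the warning describes."""
    return [min(max(v, lo), hi) for v in values]


# Illustrative fragment lengths; the four large ones are the endpoints
# reported in the two warning messages.
lengths = [7, 679, 33603, 42148, 51916, 21059245]

signed16 = int_range(2, signed=True)     # current default storage
unsigned16 = int_range(2, signed=False)  # same file size, wider positive range
signed32 = int_range(4, signed=True)     # doubles the file size

censored_signed16 = censor(lengths, *signed16)
censored_unsigned16 = censor(lengths, *unsigned16)
censored_signed32 = censor(lengths, *signed32)
```

With signed 2-byte storage all four large lengths collapse to 32767;
with unsigned 2-byte storage only the 21059245 one is still censored;
with 4-byte storage nothing is censored.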

>
> Actually, some fragments are really very large (and others are very
> small):
>
>> summary(ufl)
>  length          length.02
>  Min.   :    7   Min.   :    7
>  1st Qu.:  432   1st Qu.:  755
>  Median :  679   Median : 1433
>  Mean   :  923   Mean   : 1922
>  3rd Qu.:  976   3rd Qu.: 2609
>  Max.   :32767   Max.   :32767
>  NA's   :29333   NA's   :15585
>
> This is different from annotation version na30. (The same issue came
> up with the Mouse Diversity Array. I wrote that in another thread.)

Yes, I recall that discussion.

> Instead of modifying the annotation files and setting fragment
> length > 2000 to NA etc., I would prefer to keep the original
> annotation and add a parameter to the function for fragment length
> normalization (max and min fragment length). What do you think?

Remark #2:
That would be doable, but it has to be introduced to the code such
that it is backward compatible.  In particular, no one should end up
using a non-censored UFL file in an older version of aroma.affymetrix,
because those large values may then cause havoc.  The best way is to
add support to the code first, and only much later start providing new
UFL files.  In the meantime, others can use their own UFL files.

Remark #3:
The question is how to filter the fragment lengths.  You suggest
adding arguments to specify the "min" and "max" fragment length, but
what action should be taken?  Previously it seems that Affymetrix has
been setting such lengths to missing values (NAs).  That's one option.
An alternative is to censor, that is, to set too-small values equal to
"min" and, analogously, too-large values equal to "max".  Maybe one
wants to have censoring in one direction and NAs in the other?
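
To help compare the options in Remark #3, here is a small Python
sketch of the two actions applied independently in each direction.
It is only an illustration, not a proposed aroma.affymetrix
interface: the function filter_lengths() and its arguments "below" and
"above" are hypothetical, None stands in for NA, and the lower limit
of 100 is an arbitrary example (2000 is the upper limit mentioned
above).

```python
def filter_lengths(lengths, min_len, max_len, below="na", above="na"):
    """Handle fragment lengths outside [min_len, max_len].

    below/above select the action for too-small/too-large values:
    "na" replaces the value with None, "censor" clamps it to the limit.
    Values that are already missing (None) stay missing.
    """
    for action in (below, above):
        if action not in ("na", "censor"):
            raise ValueError("action must be 'na' or 'censor'")
    out = []
    for x in lengths:
        if x is None:
            out.append(None)
        elif x < min_len:
            out.append(min_len if below == "censor" else None)
        elif x > max_len:
            out.append(max_len if above == "censor" else None)
        else:
            out.append(x)
    return out


example = [7, 432, 679, 2609, 51916, None]

# Both directions set to missing values.
both_na = filter_lengths(example, 100, 2000)
# Both directions censored to the limits.
both_censored = filter_lengths(example, 100, 2000,
                               below="censor", above="censor")
# Mixed: too-small values become missing, too-large values are censored.
mixed = filter_lengths(example, 100, 2000, below="na", above="censor")
```

Keeping the action for each direction as a separate argument covers
all four combinations without having to change the arguments later.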

So, it would be great if you could think through different scenarios
and see what could be useful/make sense.  That would help decide on
what arguments should be added.  Having to change existing arguments
in future releases because one didn't consider all cases will cause
problems for people.

>
> Also, I will write to the affymetrix forum and ask why they changed
> their annotation.

That would be useful to know.

/Henrik

>
> Best,
> Hans-Ulrich
>
>
> On Dec 10, 5:49 pm, Henrik Bengtsson
> <henrik.bengts...@aroma-project.org> wrote:
>> Hi.
>>
>> On Fri, Dec 10, 2010 at 3:04 AM, Hans-Ulrich
>>
>> <hans-ulrich.kl...@uni-muenster.de> wrote:
>> > Hi all.
>>
>> > I want to apply CRMAv2  + CBS to Genome Wide SNP 6.0 arrays. The
>> > segments created by CBS should have hg19 coordinates. I looked at the
>> > aroma project web site:
>> > http://www.aroma-project.org/chipTypes/GenomeWideSNP_6
>>
>> > The annotation files are based on na30 and hg18. I probably can create
>> > a new .ugp and .ufl file for hg19 based on the R scripts available on
>> > that page.
>>
>> If you create these, would you mind sharing them with others?  I can
>> make them available on the chip type page.
>>
>> See also links below.
>>
>> > However, there is no script for the .acs file.
>>
>> There is; see below.
>>
>> > So, I have two questions:
>>
>> > What is stored in these annotation files?
>> > - .ufl -- length of the fragments (?)
>> > - .ugp -- mapping from probes to genomic positions (?)
>> > - .acs -- mapping from probes to sequences (?)
>>
>> > If my guesses above are correct, can I create new .ufl and .ugp
>> > files and use the old .acs file?
>>
>> Correct, since the sequences on the actual chip never change, the
>> Aroma Cell Sequence (ACS) file does not change either.  An exception
>> is when we obtain sequences for cells for which we did not have them
>> before.
>>
>> See Section 'Building custom-made annotation data' on
>>
>>  http://aroma-project.org/howtos
>>
>> for more details on how to build UFL and UGP (and ACS).
>>
>> /Henrik
>>
>>
>>
>> > Thanks,
>> > Hans-Ulrich
>>
>> > --
>> > When reporting problems on aroma.affymetrix, make sure 1) to run the
>> > latest version of the package, 2) to report the output of sessionInfo() and
>> > traceback(), and 3) to post a complete code example.
>>
>> > You received this message because you are subscribed to the Google
>> > Groups "aroma.affymetrix" group with website http://www.aroma-project.org/.
>> > To post to this group, send email to aroma-affymetrix@googlegroups.com
>> > To unsubscribe and other options, go to
>> > http://www.aroma-project.org/forum/
