Dear Henrik

In the meantime I have created ufl and ugp files for both the 100K and
500K arrays, but not yet for the GenomeWideSNP_6 array.

Can you confirm that the following code, which I use for both 100K and
500K arrays, is correct:

# retrieving annotation files
chiptypes <- c("Mapping50K_Hind240", "Mapping50K_Xba240")
cdfs <- lapply(chiptypes, FUN=function(x) {
  AffymetrixCdfFile$byChipType(x)
})
names(cdfs) <- chiptypes
print(cdfs)

# importing data from NetAffx CSV files
csvs <- lapply(cdfs, FUN=function(cdf) {
  AffymetrixNetAffxCsvFile$byChipType(getChipType(cdf), tags=".na27")
})
print(csvs)

# allocating empty UFL (Unit Fragment Length) files
ufls <- lapply(cdfs, FUN=function(cdf) {
  AromaUflFile$allocateFromCdf(cdf, tags="na27,CS20090112")
})
print(ufls)

# importing fragment-length data into the UFL files
units <- list();
for (chipType in names(ufls)) {
   ufl <- ufls[[chipType]];
   csv <- csvs[[chipType]];
   units[[chipType]] <- importFrom(ufl, csv, verbose=-50);
}
str(units)

# allocating empty UGP (Unit Genome Position) files
ugps <- lapply(cdfs, FUN=function(cdf) {
  AromaUgpFile$allocateFromCdf(cdf, tags="na27,CS20090112")
})
print(ugps)

# importing genome-position data into the UGP files
units <- list();
for (chipType in names(ugps)) {
   ugp <- ugps[[chipType]];
   csv <- csvs[[chipType]];
   units[[chipType]] <- importFrom(ugp, csv, verbose=-50);
}
str(units)


Here is the summary for the 100K arrays:
# Summary 50K chips
> str(units)
List of 2
 $ Mapping50K_Hind240: int [1:57244] 18632 18677 1631 18713 1630 18712 18619 1639 18722 18608 ...
 $ Mapping50K_Xba240 : int [1:58960] 29181 18239 31302 19831 47750 45114 19103 39711 19772 37811 ...
>
> ufl <- AromaUflFile$byChipType(chiptypes[1], tags="na27,CS20090112");
> print(summaryOfUnits(ufl, enzymes="HindIII"))
               snp cnp affxSnp other total
enzyme1-only 56933   0       0     0 56933
missing        311   0       0    55   366
total        57244   0       0    55 57299
> ufl <- AromaUflFile$byChipType(chiptypes[2], tags="na27,CS20090112");
> print(summaryOfUnits(ufl, enzymes="XbaI"))
               snp cnp affxSnp other total
enzyme1-only 58616   0       0     0 58616
missing        344   0       0    55   399
total        58960   0       0    55 59015
>
> ugp <- AromaUgpFile$byChipType(chiptypes[1], tags="na27,CS20090112");
> print(summary(ugp, enzymes="HindIII"))
 chromosome        position
 Min.   :  1.000   Min.   :    48603
 1st Qu.:  4.000   1st Qu.: 34667112
 Median :  7.000   Median : 72677620
 Mean   :  8.402   Mean   : 80405004
 3rd Qu.: 12.000   3rd Qu.:114826216
 Max.   : 23.000   Max.   :246727435
 NA's   :363.000   NA's   :      363
> ugp <- AromaUgpFile$byChipType(chiptypes[2], tags="na27,CS20090112");
> print(summary(ugp, enzymes="XbaI"))
 chromosome        position
 Min.   :  1.000   Min.   :    93683
 1st Qu.:  4.000   1st Qu.: 34636629
 Median :  7.000   Median : 72249739
 Mean   :  8.507   Mean   : 80010574
 3rd Qu.: 12.000   3rd Qu.:114666170
 Max.   : 24.000   Max.   :246885089
 NA's   :390.000   NA's   :      390


Here is the summary for the 500K arrays:
# Summary 500K chips
> str(units)
List of 2
 $ Mapping250K_Sty: int [1:238304] 15133 175423 164715 237140 112643 189587 162193 79611 196992 73555 ...
 $ Mapping250K_Nsp: int [1:262264] 34952 76 74370 232354 3677 72977 73533 176215 161345 238482 ...
>
> ufl <- AromaUflFile$byChipType(chiptypes[1], tags="na27,CS20090112");
> print(summaryOfUnits(ufl, enzymes="StyI"))
                snp cnp affxSnp other  total
enzyme1-only 144868   0       0     0 144868
missing       93436   0       0    74  93510
total        238304   0       0    74 238378
> ufl <- AromaUflFile$byChipType(chiptypes[2], tags="na27,CS20090112");
> print(summaryOfUnits(ufl, enzymes="NspI"))
                snp cnp affxSnp other  total
enzyme1-only 261563   0       0     0 261563
missing         701   0       0    74    775
total        262264   0       0    74 262338
>
> ugp <- AromaUgpFile$byChipType(chiptypes[1], tags="na27,CS20090112");
> print(summary(ugp, enzymes="StyI"))
 chromosome        position
 Min.   :  1.000   Min.   :     2994
 1st Qu.:  4.000   1st Qu.: 31306881
 Median :  8.000   Median : 67082398
 Mean   :  9.117   Mean   : 77333484
 3rd Qu.: 13.000   3rd Qu.:114799352
 Max.   : 23.000   Max.   :247135059
 NA's   :677.000   NA's   :      677
> ugp <- AromaUgpFile$byChipType(chiptypes[2], tags="na27,CS20090112");
> print(summary(ugp, enzymes="NspI"))
 chromosome        position
 Min.   :  1.000   Min.   :    17408
 1st Qu.:  4.000   1st Qu.: 32574796
 Median :  8.000   Median : 70596240
 Mean   :  8.758   Mean   : 79224244
 3rd Qu.: 13.000   3rd Qu.:114776300
 Max.   : 23.000   Max.   :247110269
 NA's   :775.000   NA's   :      775


Could it be that the function summaryOfUnits() does not work as expected?
For ufl it prints "enzyme1-only" instead of, e.g., "NspI-only".
For ugp it gives an error: no applicable method.
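
One possible explanation for the ugp error, stated as an assumption rather than a checked fact: if summaryOfUnits() is an S3 generic with a method for the UFL class but none for the UGP class, then "no applicable method" is what R's dispatch reports. The sketch below reproduces that behaviour in base R with toy classes (ToyUfl and ToyUgp are hypothetical names, and the generic defined here is a stand-in, not the aroma.affymetrix implementation):

```r
# Toy generic; a stand-in for illustration, NOT the aroma.affymetrix code
summaryOfUnits <- function(x, ...) UseMethod("summaryOfUnits")

# A method is defined only for the toy UFL class
summaryOfUnits.ToyUfl <- function(x, ...) "summary for UFL"

# Two toy objects with hypothetical class names
toyUfl <- structure(list(), class = "ToyUfl")
toyUgp <- structure(list(), class = "ToyUgp")

# Dispatch succeeds for the class that has a method
uflResult <- summaryOfUnits(toyUfl)

# Dispatch fails for the class without one; capture the message
# instead of stopping the script
ugpError <- tryCatch(summaryOfUnits(toyUgp),
                     error = function(e) conditionMessage(e))

print(uflResult)
print(ugpError)
```

If this is what happens, the error would simply mean that the function is not meant to be called on UGP files, rather than that the ugp files are faulty.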


Is it correct that for na27 there are now 93436 missing SNPs for StyI
compared to only 75 missing SNPs for na24 (as shown on your page about
building UFL files)?
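
To put the StyI number in proportion, the share of SNP units without a fragment length can be computed per enzyme from the summaryOfUnits() tables above. This is a small base R sketch; the counts are copied from the "snp" column of those tables, and the variable names are only illustrative:

```r
# SNP counts copied from the summaryOfUnits() output above (na27)
snpTotal   <- c(HindIII = 57244, XbaI = 58960, StyI = 238304, NspI = 262264)
snpMissing <- c(HindIII = 311,   XbaI = 344,   StyI = 93436,  NspI = 701)

# Percentage of SNP units with a missing fragment length, per enzyme
pctMissing <- round(100 * snpMissing / snpTotal, 2)
print(pctMissing)

# Enzyme with the largest share of missing fragment lengths
worst <- names(which.max(pctMissing))
print(worst)
```

StyI comes out at roughly 39% missing, while the other three enzymes stay below 1%, which is why the StyI result looks suspicious.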


Can I build the ufl and ugp files in the same way for GenomeWideSNP_6?
I assume that I need to repeat everything done for ".na27" with
".cn.na27"?
Will these files be identical to the files supplied by you, or do you
use the file "GenomeWideSNP_6_build36_SNPandCN.tab", which only you
have (as described in your pages)?


Here is the sessionInfo:
> sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-unknown-linux-gnu

locale:
C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
 [1] aroma.affymetrix_0.9.4 aroma.apd_0.1.3        R.huge_0.1.6
 [4] affxparser_1.12.2      aroma.core_0.9.4       sfit_0.1.5
 [7] aroma.light_1.8.1      digest_0.3.1           matrixStats_0.1.3
[10] R.rsp_0.3.4            R.cache_0.1.7          R.utils_1.0.4
[13] R.oo_1.4.5             R.methodsS3_1.0.3
>

Best regards
Christian


On Jan 8, 2:39 pm, cstratowa <christian.strat...@vie.boehringer-ingelheim.com> wrote:
> Dear Henrik
>
> Would it be possible for you to supply the ufl and ugp files for the
> new Affymetrix xxx.na27.annot.csv files for the mapping arrays?
>
> Thank you in advance
> Best regards
> Christian