Good Morning Dave:
You do not want to use an 11.ooc file built from one genome
on a different genome. You can simply construct the 11.ooc
file for your genome:
$ blat yourGenome.2bit \
/dev/null /dev/null -tileSize=11 -makeOoc=yourGenome.11.ooc \
-repMatch=1024
Adjust the repMatch number based on the size of your genome.
1024 is used for human sequence. For example, given a 'faSize'
measurement of your genome:
$ twoBitToFa yourGenome.2bit stdout | faSize stdin
2914958544 bases (162452744 N's 2752505800 real 1439244378 upper 1313261422 lower
Using the "real" bases measurement of 2752505800, calculate
the ratio to hg19 "real" bases of 2897310462:
$ awk 'BEGIN{printf "%.6f\n", 2752505800 / 2897310462 * 1024}'
972.821510
Round down the answer to the nearest 50: repMatch=950 in this example.
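
The scaling and rounding steps can be combined in one small shell
function. This is only a sketch: the name rep_match and the two
variables are made up for illustration and are not part of the blat
suite. It assumes the hg19 "real" base count and the human repMatch
of 1024 quoted above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the blat suite: scale the human
# repMatch of 1024 by the ratio of "real" bases to hg19, then round
# down to a multiple of 50.
HG19_REAL=2897310462   # hg19 "real" bases, as quoted above
HUMAN_REPMATCH=1024    # repMatch used for human sequence

rep_match() {
    # $1 = "real" base count of your genome, as reported by faSize
    awk -v real="$1" -v ref="$HG19_REAL" -v base="$HUMAN_REPMATCH" \
        'BEGIN { printf "%d\n", int(real / ref * base / 50) * 50 }'
}

rep_match 2752505800   # the example genome above; prints 950
```

The printed value is what goes after -repMatch= in the blat -makeOoc
command. Note that for a genome with fewer than about 141 million
real bases this arithmetic rounds down to 0, so a small genome needs
a value chosen by hand.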
--Hiram
Dave Tang wrote:
> Dear list,
>
> The blatSuite.zip file (downloaded from
> http://genome-test.cse.ucsc.edu/~kent/exe/linux/) comes with a 11.ooc
> file. I couldn't find any information regarding which genome was used to
> generate this file. Richard asked a similar question here
> https://lists.soe.ucsc.edu/pipermail/genome/2004-February/003964.html:
>
> 1) What's the difference going to be between genome versions?
> Is it worth re-creating a new version or will the ooc file
> produce similar results?
>
> 2) Does it make sense to run the ooc file for the human on
> the mouse genome?
>
> Additionally on the FAQ (http://genome.ucsc.edu/FAQ/FAQblat.html#blat6),
> it was mentioned that "The 11.ooc file contains sequences determined to be
> over-represented in the genome sequence."
>
> So it was a bit confusing to me; do all genomes have these
> over-represented sequences, hence the default 11.ooc file that comes with
> the blatSuite.zip? Or should I generate my own ooc file, as has been
> pointed out in previous emails from this mailing list?
>
> Thank you very much for your help.
>
> Best,
>
> Dave