Hi Tyler,

Have you already found the description of the RepeatMasker version used 
for the RepeatMasker track?  From 
http://hgdownload.cse.ucsc.edu/goldenPath/mm9/bigZips/:

chromOut.tar.gz - RepeatMasker .out files (one file per chromosome).
     RepeatMasker was run with the -s (sensitive) setting.
     May 17 2007 (open-3-1-8) version of RepeatMasker,
     Repeat Masker library RELEASE 20061006
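
If it is useful, the L1Md annotations can be pulled out of those .out 
files with a short script.  The following is only a rough Python sketch, 
not one of the UCSC tools.  It assumes the standard RepeatMasker .out 
column layout (score, divergence, deletions, insertions, query sequence, 
begin, end, (left), strand, repeat name, class/family, ...).  The names 
RmskHit, parse_rmsk_out and l1md_hits are made up for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class RmskHit(NamedTuple):
    """One annotation line from a RepeatMasker .out file (hypothetical type)."""
    chrom: str
    start: int       # begin position in the query sequence, as given in the file
    end: int         # end position in the query sequence
    strand: str      # "+" or "-" (the file writes "C" for complement)
    rep_name: str    # matching repeat, e.g. L1Md_F2
    rep_class: str   # repeat class/family, e.g. LINE/L1
    perc_div: float  # percent divergence from the consensus


def parse_rmsk_out(lines: Iterable[str]) -> Iterator[RmskHit]:
    """Yield hits from .out lines, skipping the header and blank lines."""
    for line in lines:
        fields = line.split()
        # Data lines start with an integer Smith-Waterman score; the
        # column-header lines ("SW ...", "score ...") do not.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        strand = "-" if fields[8] == "C" else "+"
        yield RmskHit(
            chrom=fields[4],
            start=int(fields[5]),
            end=int(fields[6]),
            strand=strand,
            rep_name=fields[9],
            rep_class=fields[10],
            perc_div=float(fields[1]),
        )


def l1md_hits(lines: Iterable[str], prefix: str = "L1Md") -> List[RmskHit]:
    """Return only the hits whose repeat name starts with the given prefix."""
    return [hit for hit in parse_rmsk_out(lines) if hit.rep_name.startswith(prefix)]
```

To use it, open one of the extracted per-chromosome files and pass the 
file object to l1md_hits().  The exact path inside the tarball is not 
shown here, so check the archive contents first.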

I assume you have seen the track documentation:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=mm9&g=rmsk
including this FAQ:
http://genome.ucsc.edu/FAQ/FAQdownloads#download16

We also keep descriptions of how tracks are made in the source tree at 
src/hg/makeDb/doc/.  There is a browse-able version of the source tree 
online at http://genome-source.cse.ucsc.edu/gitweb/.  Specifically, the 
description of how mm9 tracks were made is here:

http://genome-source.cse.ucsc.edu/gitweb/?p=kent.git;a=blob;f=src/hg/makeDb/doc/mm9.txt
(repeatMasker is described around line 350)

and the perl script doRepeatMasker.pl that is used to run RepeatMasker 
is here:
http://genome-source.cse.ucsc.edu/gitweb/?p=kent.git;a=blob;f=src/hg/utils/automation/doRepeatMasker.pl

I hope this helps you find what you need.  If you have further 
questions, please contact us again at [email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 6/7/12 3:52 PM, Tyler Garvin wrote:
> Hi,
>
> I have been analysing L1 data from RepeatMasker for mouse (mm9) and
> I have been trying to develop an evolutionary tree starting with
> L1Md_A/F/F2/F3/Gf/T and backtracking up the tree. From what I have read
> in the documentation, UCSC uses some variation of the repeatmasker
> algorithm using consensus sequences from Repbase for training the HMM.
>
> Repbase only has consensus sequences for L1Md_F and L1Md_Gf,
> however, and when I align them with the UCSC repeatmasker elements I am
> finding very poor similarity between the transposons and their supposed
> consensus. I was wondering if you could tell me where to find the
> training sequences/procedure that were used to categorize these L1Md
> elements. In all honesty, it would be nice if I could have access to
> these consensus sequences for all LINE/SINE/LTR elements but I am more
> focused on the L1Md elements.
>
> Thanks,
> Tyler
>
>
> --------------------------
> USC Graduate (Fall 2012)
> Biomedical Engineering (Electrical)
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome