Hello Jim,

The program that we use to convert the original PSL alignment files (http://genome.ucsc.edu/FAQ/FAQformat.html#format2) for RefSeq Genes to genePred format (http://genome.ucsc.edu/FAQ/FAQformat.html#format9) merges blocks that are less than 8 bp apart. We have added the task of better documenting this to our to-do list.
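As a rough illustration of the merging described above, here is a minimal Python sketch. It is not the actual conversion program: the function name `merge_close_blocks`, the `min_gap` parameter, and the strict "less than" comparison are assumptions based only on the wording in this message. The coordinates are the exon starts and ends from the record proposed in the original report.

```python
def merge_close_blocks(starts, ends, min_gap=8):
    """Merge adjacent blocks whose genomic gap is smaller than min_gap bases.

    Hypothetical helper: the threshold of 8 and the strict '<' test are
    assumptions taken from the description, not from the real tool's source.
    """
    merged = []
    for start, end in sorted(zip(starts, ends)):
        if merged and start - merged[-1][1] < min_gap:
            # Gap to the previous block is too small: extend that block.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]


# Exon starts and ends from the proposed NEFL record (five blocks).
starts = [24808468, 24810988, 24811071, 24811694, 24812985]
ends = [24810465, 24811070, 24811309, 24811819, 24814131]

exons = merge_close_blocks(starts, ends)
for start, end in exons:
    print(start, end)
```

Under these assumptions, the 1 bp gap between 24811070 and 24811071 is closed, so the five proposed blocks collapse to four, which would explain why the single-base gap does not appear in the genePred record.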
However, if you really need to look at indels, you should look at the actual alignments instead of the genePred tables created from the alignments. Even if the gaps were not closed, as in the case you have pointed out, the genePred format can't represent insertions in the mRNA, since it's a genome annotation. The alignments are in the refSeqAli table.

If you have further questions, please contact us again at [email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group

Jim Robinson wrote on 11/19/11 8:50 PM:
> Hi,
>
> I think there is an error in the annotation for NEFL on hg19 in the
> refGene.txt.gz download. Specifically, it is missing a 1 base
> insertion (intron) at position 24,811,071. I think the correct record
> should be
>
> 774 NM_006158 chr8 - 24808468 24814131 24810322 24814029 5 24808468,24810988,24811071,24811694,24812985, 24810465,24811070,24811309,24811819,24814131, 0 NEFL cmpl cmpl 1,0,2,0,0
>
> best,
>
> Jim
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
