Hello,
I am seeing numerous discrepancies between the sequences and coordinates in
the insect 15-way multiple-alignment files
(http://hgdownload.cse.ucsc.edu/goldenPath/dm3/multiz15way/) primarily for
D. ananassae and D. virilis. I will try to explain this as concisely as
possible; please let me know if anything needs clarification.
For example:
1. I extract the MAF block of a popular melanogaster miRNA (dme-mir-289,
chr3L:13613907-13614035) using the maf_parse program from Adam Siepel's lab:
maf_parse -o MAF --start 13613907 --end 13614035 chr3L.maf
See the attached dme-mir-289.maf output file. Notice that the droAna3
sequence in this file is
"CAGCTCGGGTTTTAGGTTGAGTTTACAGTAAAATAAATATTTAAGTGGAGCCTGCGACTctgctactgccactgc
cactgccactgccactgccGCTCGGGGAGTCACTTGAGCGTTTGTTGGCACGTAAAAGACATCATAATTAGCATT"
and the coordinate is scaffold_13337:1466195-1466344 -.
2. When I extract the sequence of this coordinate from the droAna3.fa file
(http://hgdownload.cse.ucsc.edu/goldenPath/droAna3/bigZips/) and reverse
complement it, I get a completely different sequence than that reported (see
the file droAna3_289_maf.fa).
3. When I BLAT the sequence reported in the MAF file against droAna3.fa, the
best-scoring hit is at a different coordinate from the MAF coordinate:
scaffold_13337:21827571-21827720 -. None of the BLAT-reported coordinates
agrees with the MAF coordinate (see droAna3_blatOutput.txt for all
reported coordinates). When I extract the sequence at the best BLAT
coordinate and reverse complement it (see droAna3_289_blat.fa), the
result matches the sequence reported in the MAF file.
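The extract-and-reverse-complement operation in steps 2 and 3 can be sketched as below. This is only a minimal illustration, assuming the coordinate is read as 1-based and inclusive on the forward strand of the scaffold; the function names are hypothetical and are not part of maf_parse, BLAT, or any UCSC tool.

```python
# Complement table that preserves case, so soft-masked (lowercase) bases stay lowercase.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta_record(path, name):
    """Return the sequence of the FASTA record whose ID is `name`."""
    chunks, keep = [], False
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if keep:
                    break  # the wanted record has ended
                header = line[1:].split()
                keep = bool(header) and header[0] == name
            elif keep:
                chunks.append(line.strip())
    return "".join(chunks)


def reverse_complement(seq):
    """Reverse complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract(seq, start, end, strand="+"):
    """Extract [start, end] (1-based, inclusive, forward-strand coordinates).

    If strand is '-', the extracted forward-strand slice is reverse complemented.
    """
    sub = seq[start - 1:end]
    return reverse_complement(sub) if strand == "-" else sub
```

For the case above, usage would look something like `extract(read_fasta_record("droAna3.fa", "scaffold_13337"), 1466195, 1466344, "-")`. Note that this treats the numbers as forward-strand positions, which is itself an assumption about how the MAF coordinate should be interpreted.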
I can furnish additional examples upon request. I really appreciate any time
and effort spent helping me explain this bizarre observation.
Thanks,
Jaaved
--
Jaaved Mohammed,
Ph.D. Student of Computational Biology
Tri-Institutional Training Program in Computational Biology and Medicine
(Cornell University - Ithaca, Weill Cornell Medical College, and Memorial
Sloan-Kettering Cancer Center)
BLAT SEARCH: Searching for
"CAGCTCGGGTTTTAGGTTGAGTTTACAGTAAAATAAATATTTAAGTGGAGCCTGCGACTctgctactgccactgccactgccactgccactgccGCTCGGGGAGTCACTTGAGCGTTTGTTGGCACGTAAAAGACATCATAATTAGCATT"
in droAna3
Loaded 230993012 letters in 13749 sequences
Searched 150 bases in 1 sequences
#Coordinate Score Identity
scaffold_13337:21827571-21827720 - 150.0 100.0
scaffold_13335:2281053-2332716 - 42.0 23.0
scaffold_12929:1391758-1399351 - 41.0 34.099999999999994
scaffold_13088:276036-276187 - 38.0 59.599999999999994
scaffold_13335:476266-763105 - 37.0 0.0
scaffold_13337:12681604-12681666 + 35.0 68.5
scaffold_13337:7223810-7232150 + 35.0 21.69999999999999
scaffold_13266:15195748-15195784 + 35.0 97.3
scaffold_12943:849650-937313 - 35.0 5.299999999999997
scaffold_12929:2834580-2956121 - 35.0 2.5999999999999943
scaffold_13417:430541-430576 + 34.0 97.3
scaffold_13266:12479952-12480364 + 34.0 45.0
scaffold_13266:18784220-18784259 - 34.0 92.5
scaffold_13340:23121639-23214271 - 33.0 2.6999999999999886
scaffold_13047:1752543-1752577 - 33.0 97.2
scaffold_12929:949144-1156623 - 33.0 -11.700000000000003
scaffold_12903:586999-587315 - 33.0 45.8
scaffold_13417:6298037-6298074 + 32.0 86.2
scaffold_13340:17551471-17551504 + 32.0 97.1
scaffold_13337:12681604-12681635 + 32.0 100.0
scaffold_12903:719143-719176 + 32.0 97.1
scaffold_12613:56530-216296 + 32.0 -12.100000000000009
scaffold_13334:1254143-1254181 - 32.0 82.1
scaffold_13334:1502851-1502883 + 31.0 97.0
scaffold_13117:2712632-2712666 + 31.0 94.3
scaffold_12929:1883277-1883329 + 31.0 68.8
scaffold_13337:9208454-9215321 - 31.0 15.699999999999989
scaffold_13248:1650015-1650317 - 31.0 42.9
scaffold_13340:13726587-13726616 + 30.0 100.0
scaffold_13337:7709783-7709812 + 30.0 100.0
scaffold_13248:954292-954321 + 30.0 100.0
scaffold_13047:1343471-1343502 + 30.0 96.9
scaffold_12929:319689-319720 + 30.0 96.9
scaffold_13337:5750102-5750501 - 30.0 38.8
scaffold_13266:13182411-13182446 - 30.0 81.9
scaffold_13248:2307695-2307728 - 30.0 94.2
scaffold_13248:2106588-2106621 - 30.0 87.9
scaffold_13417:1660439-1660469 + 29.0 96.8
scaffold_13337:9231443-9231492 + 29.0 67.69999999999999
scaffold_13334:1502851-1502879 + 29.0 100.0
scaffold_13337:6861510-6861620 - 29.0 51.699999999999996
scaffold_13117:320444-320474 - 29.0 96.8
scaffold_13047:1752548-1752576 - 29.0 100.0
scaffold_12916:2613647-2613684 - 29.0 75.0
scaffold_12903:192140-192168 - 29.0 100.0
scaffold_13337:6558091-6558126 + 28.0 88.9
scaffold_13337:18342508-18342535 + 28.0 100.0
scaffold_13337:20065338-20065365 + 28.0 100.0
scaffold_13337:1594231-1594259 + 28.0 89.7
scaffold_13337:13439469-13439505 + 28.0 74.2
scaffold_13337:21675234-21675268 + 28.0 75.9
scaffold_13088:461723-461757 + 28.0 75.9
scaffold_12943:1376335-1376362 + 28.0 100.0
scaffold_12916:12871436-12871476 + 28.0 71.9
scaffold_3888:11311-11344 - 28.0 84.9
scaffold_13417:1242045-1540711 - 28.0 -34.400000000000006
scaffold_13340:8526207-8526236 - 28.0 96.7
scaffold_13340:10799221-10799264 - 28.0 81.9
scaffold_13340:22084721-22084753 - 28.0 83.9
scaffold_13337:9962504-9962531 - 28.0 100.0
scaffold_13337:19901841-19901879 - 28.0 71.0
scaffold_13088:276036-276063 - 28.0 100.0
scaffold_12916:4727105-4727135 - 28.0 80.0
scaffold_12911:3055436-3055469 - 28.0 84.9
scaffold_13417:6298051-6298077 + 27.0 100.0
scaffold_13337:9658667-9658698 + 27.0 83.4
scaffold_13335:1757013-1757041 + 27.0 96.6
scaffold_13266:11054113-11054139 + 27.0 100.0
scaffold_13266:5105646-5105677 + 27.0 83.4
scaffold_13248:954289-954315 + 27.0 100.0
scaffold_13117:4573847-4820889 + 27.0 -30.0
scaffold_13340:18084120-18084150 - 27.0 93.6
scaffold_13340:19177783-19177811 - 27.0 96.6
scaffold_13335:1157167-1157195 - 27.0 96.6
scaffold_13335:1673006-1998674 - 27.0 -33.30000000000001
scaffold_13334:1560249-1560277 - 27.0 96.6
scaffold_13266:6853608-6853634 - 27.0 100.0
scaffold_13266:16222680-16222706 - 27.0 100.0
scaffold_13117:1402900-1402926 - 27.0 100.0
scaffold_13117:4962736-4962762 - 27.0 100.0
scaffold_12943:4369076-4369124 - 27.0 64.6
scaffold_12929:616256-616285 - 27.0 76.7
scaffold_12916:7433069-7433097 - 27.0 96.6
scaffold_12903:192140-192168 - 27.0 96.6
scaffold_13340:8632701-8632726 + 26.0 100.0
scaffold_13337:20065342-20065367 + 26.0 100.0
scaffold_13335:1145626-1145651 + 26.0 100.0
scaffold_13266:5105640-5105670 + 26.0 82.8
scaffold_12916:1612972-1612997 + 26.0 100.0
scaffold_12916:11507913-11507940 + 26.0 96.5
scaffold_12916:3771405-3771434 + 26.0 81.5
scaffold_13417:115873-115898 - 26.0 100.0
scaffold_13340:4843190-4843216 - 26.0 70.4
scaffold_13337:14136049-14136076 - 26.0 96.5
scaffold_13337:20585441-20585466 - 26.0 100.0
scaffold_13337:10449161-10449195 - 26.0 71.5
scaffold_13335:56352-56379 - 26.0 88.9
scaffold_13266:7472336-7472363 - 26.0 96.5
scaffold_13248:2094727-2094756 - 26.0 93.4
scaffold_13248:2402585-2402610 - 26.0 100.0
scaffold_13248:2692260-2692287 - 26.0 96.5
scaffold_13117:361130-361155 - 26.0 100.0
scaffold_12948:441813-441840 - 26.0 96.5
scaffold_12929:1877583-1877612 - 26.0 86.3
scaffold_13340:4024427-4024451 + 25.0 100.0
scaffold_13340:2074970-2075003 + 25.0 71.5
scaffold_13340:5662583-5662608 + 25.0 73.1
scaffold_13337:4679450-4679476 + 25.0 96.3
scaffold_13337:6794688-6794712 + 25.0 100.0
scaffold_13337:7248045-7248069 + 25.0 100.0
scaffold_13335:100515-100539 + 25.0 100.0
scaffold_13334:1253915-1253939 + 25.0 100.0
scaffold_13266:9496449-9496477 + 25.0 93.2
scaffold_13047:1317306-1317334 + 25.0 93.2
scaffold_13047:1700846-1700872 + 25.0 96.3
scaffold_12916:15202316-15202344 + 25.0 93.2
scaffold_12905:103050-103081 + 25.0 73.1
scaffold_13340:8361988-8362014 - 25.0 88.5
scaffold_13340:8770360-8770391 - 25.0 89.3
scaffold_13337:5069917-5069941 - 25.0 100.0
scaffold_13337:10230343-10230374 - 25.0 73.1
scaffold_13337:20986114-20986145 - 25.0 74.1
scaffold_13248:3144056-3144080 - 25.0 100.0
scaffold_13117:1402903-1402927 - 25.0 100.0
scaffold_13117:2533369-2533393 - 25.0 100.0
scaffold_12929:1148832-1148856 - 25.0 100.0
scaffold_12929:1391760-1391790 - 25.0 90.4
scaffold_12929:1645455-1645483 - 25.0 93.2
scaffold_12916:7133441-7133468 - 25.0 60.8
scaffold_13340:20895315-20895344 + 24.0 90.0
scaffold_13340:9264933-9264963 + 24.0 72.0
scaffold_13340:6228942-6228966 + 24.0 84.0
scaffold_13266:6589589-6589612 + 24.0 100.0
scaffold_13266:9496450-9496477 + 24.0 92.9
scaffold_13248:2079756-2079800 + 24.0 57.699999999999996
scaffold_12929:2445240-2445265 + 24.0 96.2
scaffold_12916:2373928-2373951 + 24.0 100.0
scaffold_12903:717502-717525 + 24.0 100.0
scaffold_13337:12258216-12258241 - 24.0 96.2
scaffold_13334:298043-298070 - 24.0 92.9
scaffold_13266:15453081-15453106 - 24.0 96.2
scaffold_13117:3817015-3817042 - 24.0 92.9
scaffold_13417:2231610-2231636 + 23.0 92.6
scaffold_13340:4024427-4024449 + 23.0 100.0
scaffold_13340:9390855-9390877 + 23.0 100.0
scaffold_13340:9390855-9390877 + 23.0 100.0
scaffold_13335:100515-100537 + 23.0 100.0
scaffold_13266:5407090-5407112 + 23.0 100.0
scaffold_13266:5407090-5407112 + 23.0 100.0
scaffold_13266:5459228-5459251 + 23.0 79.2
scaffold_13248:2168427-2168449 + 23.0 100.0
scaffold_13047:302748-302774 + 23.0 92.6
scaffold_13045:92224-92247 + 23.0 87.5
scaffold_12916:3156393-3156417 + 23.0 96.0
scaffold_12916:6279473-6279504 + 23.0 69.3
scaffold_12916:1901359-1901392 + 23.0 68.0
scaffold_12613:498631-498653 + 23.0 100.0
scaffold_13340:5838240-5838262 - 23.0 100.0
scaffold_13337:1754019-1754043 - 23.0 96.0
scaffold_13337:9215303-9215327 - 23.0 96.0
scaffold_13335:377190-377212 - 23.0 100.0
scaffold_13335:1203056-1203080 - 23.0 96.0
scaffold_13335:1387023-1387045 - 23.0 100.0
scaffold_13266:7329866-7329888 - 23.0 100.0
scaffold_13250:1354721-1354744 - 23.0 87.5
scaffold_13248:4121931-4121953 - 23.0 100.0
scaffold_13248:3300330-3300352 - 23.0 100.0
scaffold_13117:2531356-2531382 - 23.0 92.6
scaffold_13117:1159842-1159866 - 23.0 96.0
scaffold_13047:1012834-1012856 - 23.0 100.0
scaffold_12943:937291-937313 - 23.0 100.0
scaffold_12929:264094-264116 - 23.0 100.0
scaffold_12929:2205967-2205998 - 23.0 66.69999999999999
scaffold_12916:4581046-4581070 - 23.0 96.0
scaffold_12916:7433075-7433097 - 23.0 100.0
scaffold_13340:432244-432265 + 22.0 100.0
scaffold_13340:19410730-19410751 + 22.0 100.0
scaffold_13334:357613-357634 + 22.0 100.0
scaffold_13266:6589589-6589610 + 22.0 100.0
scaffold_13266:2903276-2903301 + 22.0 78.3
scaffold_13248:883092-883113 + 22.0 100.0
scaffold_13088:461735-461756 + 22.0 100.0
scaffold_13047:302761-302782 + 22.0 100.0
scaffold_12929:1397057-1397078 + 22.0 100.0
scaffold_12929:2846663-2846686 + 22.0 87.0
scaffold_12916:3123287-3123308 + 22.0 100.0
scaffold_12916:3790409-3790430 + 22.0 100.0
scaffold_13417:581807-581830 - 22.0 95.9
scaffold_13417:1540689-1540712 - 22.0 95.9
scaffold_13340:9552068-9552091 - 22.0 95.9
scaffold_13337:19615659-19615682 - 22.0 95.9
scaffold_13335:1976305-1976326 - 22.0 100.0
scaffold_13334:918838-918859 - 22.0 100.0
scaffold_13266:16669771-16669792 - 22.0 100.0
scaffold_13117:3869474-3869495 - 22.0 100.0
scaffold_13117:3294709-3294730 - 22.0 100.0
scaffold_13117:1796459-1796480 - 22.0 100.0
scaffold_12943:619779-619800 - 22.0 100.0
scaffold_12929:264095-264116 - 22.0 100.0
scaffold_12911:2955122-2955143 - 22.0 100.0
scaffold_13340:6228939-6228959 + 21.0 100.0
scaffold_13335:2448094-2448114 + 21.0 100.0
scaffold_13334:1473317-1473337 + 21.0 100.0
scaffold_13266:1804607-1804627 + 21.0 100.0
scaffold_13266:3299716-3299736 + 21.0 100.0
scaffold_13248:2649037-2649057 + 21.0 100.0
scaffold_13248:1985118-1985138 + 21.0 100.0
scaffold_13117:4628465-4628487 + 21.0 95.7
scaffold_13117:319301-319321 + 21.0 100.0
scaffold_13047:1700846-1700866 + 21.0 100.0
scaffold_12916:4735403-4735423 + 21.0 100.0
scaffold_12911:3480916-3480936 + 21.0 100.0
scaffold_12911:3601155-3601177 + 21.0 95.7
scaffold_12911:4392467-4392489 + 21.0 95.7
scaffold_12613:56530-56550 + 21.0 100.0
scaffold_13417:638743-638763 - 21.0 100.0
scaffold_13340:9552069-9552091 - 21.0 95.7
scaffold_13340:10799236-10799258 - 21.0 95.7
scaffold_13340:11242570-11242590 - 21.0 100.0
scaffold_13335:1387025-1387045 - 21.0 100.0
scaffold_13334:1187357-1187377 - 21.0 100.0
scaffold_13117:1796460-1796480 - 21.0 100.0
scaffold_12929:1526175-1526195 - 21.0 100.0
scaffold_12911:4332024-4332044 - 21.0 100.0
scaffold_12911:4332024-4332044 - 21.0 100.0
scaffold_12903:778458-778478 - 21.0 100.0
scaffold_12903:125555-125577 - 21.0 95.7
scaffold_12903:125555-125577 - 21.0 95.7
scaffold_13340:20895311-20895332 + 20.0 95.5
scaffold_13266:4697100-4697119 + 20.0 100.0
scaffold_13248:3782772-3782791 + 20.0 100.0
scaffold_13117:4870381-4870400 + 20.0 100.0
scaffold_13117:4573847-4573866 + 20.0 100.0
scaffold_12929:2445240-2445259 + 20.0 100.0
scaffold_12929:1059973-1059992 + 20.0 100.0
scaffold_12929:578725-578744 + 20.0 100.0
scaffold_12916:15033731-15033752 + 20.0 95.5
scaffold_12916:15278076-15278095 + 20.0 100.0
scaffold_12916:15766659-15766678 + 20.0 100.0
scaffold_12903:136875-136894 + 20.0 100.0
scaffold_13417:260605-260626 - 20.0 95.5
scaffold_13340:485440-485459 - 20.0 100.0
scaffold_13340:10743802-10743821 - 20.0 100.0
scaffold_13335:763078-763097 - 20.0 100.0
scaffold_13334:1424892-1424911 - 20.0 100.0
scaffold_13266:12077776-12077795 - 20.0 100.0
scaffold_13266:14224456-14224475 - 20.0 100.0
scaffold_13266:17972900-17972921 - 20.0 95.5
scaffold_13248:1505081-1505102 - 20.0 95.5
scaffold_13248:3146461-3146482 - 20.0 95.5
scaffold_13117:3317127-3317146 - 20.0 100.0
scaffold_13117:1159847-1159866 - 20.0 100.0
scaffold_13047:1475676-1475695 - 20.0 100.0
scaffold_12916:3587634-3587653 - 20.0 100.0
scaffold_12916:12158401-12158420 - 20.0 100.0
scaffold_12911:4120675-4120694 - 20.0 100.0
scaffold_12911:4120675-4120694 - 20.0 100.0
scaffold_13248:2741108-2741126 + 19.0 100.0
scaffold_13248:2079755-2079773 + 19.0 100.0
scaffold_13117:5123621-5123639 + 19.0 100.0
scaffold_13117:5123621-5123639 + 19.0 100.0
scaffold_12943:1965461-1965481 + 19.0 95.3
scaffold_12916:12871436-12871454 + 19.0 100.0
scaffold_13335:1998658-1998676 - 19.0 100.0
scaffold_13335:1203056-1203076 - 19.0 95.3
scaffold_13334:1542232-1542250 - 19.0 100.0
scaffold_13266:1818916-1818934 - 19.0 100.0
scaffold_13266:1818916-1818934 - 19.0 100.0
scaffold_13266:9055690-9055708 - 19.0 100.0
scaffold_13266:9616293-9616311 - 19.0 100.0
scaffold_13266:10991569-10991587 - 19.0 100.0
scaffold_12943:4369076-4369094 - 19.0 100.0
scaffold_12916:4581046-4581064 - 19.0 100.0
scaffold_12916:8016120-8016138 - 19.0 100.0
scaffold_12905:103050-103067 + 18.0 100.0
scaffold_13334:1560249-1560266 - 18.0 100.0
scaffold_12929:1500304-1500321 - 18.0 100.0
scaffold_12916:12158401-12158418 - 18.0 100.0
scaffold_12916:7133454-7133471 - 18.0 100.0
scaffold_13117:4628471-4628487 + 17.0 100.0
scaffold_12903:750290-750306 + 17.0 100.0
scaffold_13335:2276551-2276567 - 17.0 100.0
scaffold_13335:173220-173236 - 17.0 100.0
scaffold_13335:173220-173236 - 17.0 100.0
scaffold_13117:3817015-3817031 - 17.0 100.0
scaffold_12943:4579645-4579661 - 17.0 100.0
scaffold_12943:1685810-1685826 - 17.0 100.0
scaffold_12943:1685810-1685826 - 17.0 100.0
scaffold_12929:1645455-1645471 - 17.0 100.0
scaffold_12916:11707625-11707641 - 17.0 100.0
scaffold_12916:146670-146686 - 17.0 100.0
scaffold_13248:1761457-1761472 + 16.0 100.0
scaffold_13099:641841-641856 + 16.0 100.0
scaffold_13248:1505081-1505098 - 16.0 94.5
scaffold_13248:1505081-1505096 - 16.0 100.0
scaffold_13047:1409116-1409131 - 16.0 100.0
scaffold_12916:5402411-5402426 - 16.0 100.0
scaffold_12943:619779-619795 - 15.0 94.2
scaffold_12929:2205980-2205996 - 15.0 94.2
scaffold_12929:264095-264111 - 15.0 94.2
scaffold_12916:14642547-14642563 - 15.0 94.2
scaffold_12929:1376251-1376266 + 14.0 93.8
scaffold_13417:1242045-1242060 - 14.0 93.8
scaffold_13047:1505642-1505653 - 12.0 100.0
scaffold_12903:201139-201148 + 10.0 100.0
scaffold_13417:260354-260362 - 9.0 100.0
scaffold_12911:4120635-4120643 - 9.0 100.0
WARNING: MAF and BLAT coordinate disagree for
MAF Coordinate BLAT Coordinate
chrom= scaffold_13337 scaffold_13337
start= 1466195 21827571
end= 1466344 21827720
strand= - -
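
One check that may be worth running on the comparison above: the UCSC MAF format description states that when the strand field of an "s" line is "-", the start is counted relative to the reverse-complemented source sequence, whereas the BLAT coordinates listed here appear to be forward-strand positions. The sketch below converts a minus-strand MAF interval to forward-strand coordinates under that convention. The function name is hypothetical, and the numbers in the usage note are toy values, not droAna3 data; the real srcSize would have to be read from the "s" line of the MAF block.

```python
def maf_minus_to_forward(start, size, src_size):
    """Convert a minus-strand MAF interval to forward-strand coordinates.

    start    -- 0-based start from the MAF "s" line, counted on the
                reverse-complemented source sequence
    size     -- number of aligned bases in the source (gaps excluded)
    src_size -- total length of the source sequence (srcSize field)

    Returns (start, end) as 1-based inclusive forward-strand positions.
    """
    fwd_start0 = src_size - (start + size)  # 0-based forward-strand start
    return fwd_start0 + 1, fwd_start0 + size


def maf_minus_slice(seq, start, size):
    """Return the bases a minus-strand MAF line refers to, given the forward sequence."""
    comp = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    revcomp = seq.translate(comp)[::-1]
    return revcomp[start:start + size]
```

With a toy source of length 100, `maf_minus_to_forward(10, 20, 100)` gives `(71, 90)`. If the two coordinate pairs in the warning differ only by this kind of transformation, the MAF and BLAT results would describe the same bases.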