Hello Sumanth,

The 15 bp sequences are probably just too short to produce a repeat match on their own. If you added some flanking sequence (taken from the genome alignment), the match would probably be found, the same as in the genome track.
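
A minimal sketch of the flanking idea, assuming the genome hit is available as 0-based, half-open coordinates (BED style) and the chromosome sequence is held in a plain Python string. The function names and the 50-base default flank are hypothetical choices for illustration, not anything prescribed in this thread.

```python
def add_flank(start, end, chrom_size, flank=50):
    """Widen a 0-based, half-open hit by `flank` bases per side.

    The result is clamped to [0, chrom_size] so it never runs off
    either end of the chromosome. `flank=50` is an arbitrary default.
    """
    if not 0 <= start < end <= chrom_size:
        raise ValueError("hit must satisfy 0 <= start < end <= chrom_size")
    return max(0, start - flank), min(chrom_size, end + flank)


def extract_with_flank(chrom_seq, start, end, flank=50):
    """Return the hit plus flanking sequence from an in-memory chromosome string."""
    new_start, new_end = add_flank(start, end, len(chrom_seq), flank)
    return chrom_seq[new_start:new_end]


if __name__ == "__main__":
    toy_chrom = "ACGT" * 250            # 1000-base toy "chromosome"
    widened = extract_with_flank(toy_chrom, 100, 115, flank=50)
    print(len(widened))                 # 15-base hit plus 50 bases on each side
```

The widened sequence, rather than the bare 15-mer, would then be the query for the repeat search.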
Perhaps noting what type of repeat this is (on the Repeat track's item detail page) and learning about its characteristics would help. Perhaps there are variable regions, which would definitely interfere with short match alignments.

It sounds like the sequence region is mapping uniquely. One option is to align to the genome first, then filter out matches to genome regions annotated as repeats (or repeats plus some other suspicious factors, like a reverse-orientation intron mapping).

Hopefully this helps,
Jen

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 6/2/10 10:28 AM, Polikepahad, Sumanth wrote:
> Hi,
>
> I have been trying to map deep-sequenced data to the mouse genome. There is
> a ~15 nt sequence (about 15000 copies) which is mapping in reverse
> orientation to the intron region of a gene. I have downloaded the intron
> database from Tables in the UCSC browser without the repeat masking option.
> But when I check the repeat masking option, the mapped regions of the above
> sequences are masked, suggesting that they might be repeat sequences.
> However, when I map them to the mouse repeat database obtained from
> www.girinst.org and also to www.repeatmasker.org, that particular
> sequence is not shown as a repeat. But it is shown as a repeat in the Hydra
> genome. Can someone suggest what I am missing here? Why does the UCSC
> browser consider that sequence a repeat when the others do not?
>
> thanks in advance.
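
The "align to genome first, then filter out repeat-annotated regions" suggestion in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: hits and repeats are given as 0-based, half-open intervals (for example, parsed from BED exports of the alignments and of the repeat annotation), and all function names and the toy data are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_repeat_index(repeats):
    """Group (chrom, start, end) repeats by chromosome and merge overlaps.

    Returns {chrom: (starts, ends)} where the merged intervals are sorted
    and disjoint, so a single binary search answers an overlap query.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in repeats:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_repeat(index, chrom, start, end):
    """True if the half-open interval [start, end) overlaps any repeat."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged repeat that begins before the hit ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def filter_hits(hits, index):
    """Keep only hits (chrom, start, end, strand) outside annotated repeats."""
    return [h for h in hits if not overlaps_repeat(index, h[0], h[1], h[2])]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    repeats = [("chr1", 90, 120), ("chr1", 110, 200), ("chr2", 55, 80)]
    hits = [("chr1", 100, 115, "-"), ("chr1", 500, 515, "+"), ("chr2", 40, 55, "+")]
    for hit in filter_hits(hits, build_repeat_index(repeats)):
        print(*hit, sep="\t")
```

Extra "suspicious factor" checks mentioned in the reply, such as a reverse-orientation intron mapping, could be added as further predicates inside `filter_hits`; they are left out here because the thread does not specify them.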
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
