Re: [Genome] Genome Digest, Vol 74, Issue 32

Hollywoodkiller Movies Thu, 26 Mar 2009 21:39:36 -0700

On Fri, Mar 27, 2009 at 7:37 AM, <[email protected]> wrote:


> Send Genome mailing list submissions to
>        [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://www.soe.ucsc.edu/mailman/listinfo/genome
> or, via email, send a message with subject or body 'help' to
>        [email protected]
>
> You can reach the person managing the list at
>        [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Genome digest..."
>
>
> Today's Topics:
>
>   1. Re: convert large genomic regions from mouse to human
>      (Hiram Clawson)
>   2. Re: SelfChain table (Atif Shahab)
>   3. Questions about Blat -fastmap option. (Royden Clark)
>   4. Re: Questions about Blat -fastmap option. (Galt Barber)
>   5. Coloring sub-parts of a track (Dhiral Phadke)
>   6. Re: Coloring sub-parts of a track (Hiram Clawson)
>   7. gladHumESOtherData (Charu Gupta Kumar)
>   8. quality score 98 (Michael Hiller)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 24 Mar 2009 22:26:34 -0700
> From: Hiram Clawson <[email protected]>
> Subject: Re: [Genome] convert large genomic regions from mouse to
>        human
> To: Jiantao Shi <[email protected]>
> Cc: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Good Evening Jiantao:
>
> The hg18 genome aligns to less than 40% of the mouse genome:
>
> http://genomewiki.ucsc.edu/index.php/Mm9_multiple_alignment
>
> It would be expected that many regions do not translate unless
> these are all highly conserved regions.
>
> Take one of your regions that does not map and carefully
> examine the human chain and net tracks to see how it breaks up.
>
> --Hiram
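
One way to pick regions to examine is to tally the reasons liftOver writes to its unmapped output. The Python sketch below assumes the unmapped file uses the usual layout of a "#"-prefixed reason line followed by the region it applies to; the function name and the sample records are made up for illustration.

```python
from collections import Counter


def tally_unmapped_reasons(lines):
    """Count how often each failure reason appears in liftOver unmapped output.

    Assumes each reason is a line starting with '#', followed by the
    BED record it applies to.
    """
    reasons = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#"):
            reasons[line.lstrip("#").strip()] += 1
    return reasons


# Hypothetical records, not taken from the original data set.
sample = [
    "#Partially deleted in new",
    "chr1\t3000000\t3002000",
    "#Deleted in new",
    "chr2\t5000000\t5002000",
    "#Partially deleted in new",
    "chr3\t7000000\t7002000",
]

for reason, count in tally_unmapped_reasons(sample).most_common():
    print(f"{count}\t{reason}")
```

For cross-species conversions it may also be worth lowering liftOver's -minMatch option (the minimum ratio of bases that must remap, 0.95 by default), since most mouse regions will not align to human over their full length.
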
>
> Jiantao Shi wrote:
> > Hi,
> >
> > I have about 3000 mouse (mm8) genomic regions (2000 bp on average)
> > identified by a ChIP-seq experiment and downloaded from a public data set.
> > I want to convert these coordinates to human (hg18). However, I got
> > nothing returned using the liftOver web server on the UCSC Genome website,
> > so I downloaded the liftOver program and ran it locally. Again, only about
> > 400 of these regions were successfully converted; the others were reported
> > as "Partially deleted in new". Any suggestions?
> >
> > Best,
> > Jiantao Shi
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 25 Mar 2009 23:10:17 +0800
> From: Atif Shahab <[email protected]>
> Subject: Re: [Genome] SelfChain table
> To: Jennifer Jackson <[email protected]>
> Cc: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> thanks!  That clarifies the issue.
>
> - atif
>
> On 3/25/2009 8:12 AM, Jennifer Jackson wrote:
> > Hello,
> > Please see our documentation
> > http://genome.ucsc.edu/goldenPath/help/chain.html
> > http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
> >
> > The coordinates for this chr20 chain are in the (-) strand. Chain file
> > formats are different from many of our other file formats in that they
> > are based on the reverse-complement strand (not converted to be
> > positive strand). However, they are still ordered smallest -> largest
> > in value. And the smallest coordinate is still zero-based. These chain
> > coordinates are converted to be positive stranded and to be a 1-based
> > start when displayed in the browser chain description page.
> >
> > The chromosome 20 length is: 62,435,964
> >
> > Doing the math to compare the browser data points to the chain file
> > data points:
> >
> > start: 62435964 - 19465389 = 42970575 (the browser end)
> > end: 62435964 - 19472380 = 42963584, +1 (to convert the zero-based
> > "smallest" coordinate) = 42963585 (the browser start)
> >
> > I hope I have addressed all of your questions, but let me know if I
> > missed anything,
> > Jennifer Jackson
> > UCSC Genome Bioinformatics Group
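
The arithmetic in this reply can be written as a small function. This is a sketch in Python under the conventions described in the message (minus-strand chain coordinates, zero-based start); the function and argument names are illustrative and not part of any UCSC tool.

```python
def minus_strand_to_browser(chrom_size, chain_start, chain_end):
    """Convert minus-strand chain coordinates to plus-strand browser coordinates.

    chain_start is zero-based and chain_end is exclusive, both counted from
    the end of the chromosome. The result is a 1-based, inclusive
    (start, end) pair on the plus strand.
    """
    plus_start_zero_based = chrom_size - chain_end
    plus_end = chrom_size - chain_start
    return plus_start_zero_based + 1, plus_end


# Values from the chr11.chainSelf row quoted in this thread.
start, end = minus_strand_to_browser(62435964, 19465389, 19472380)
print(f"chr20:{start}-{end}")
```
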
> >
> > Atif Shahab wrote:
> >>
> >> Sorry copied the wrong row.  Following is the row I find in
> >> chr11.chainSelf
> >>
> >> 725, 212246, 'chr11', 134452384, 18466024, 18469085, 'chr20',
> >> 62435964, '-', 19465389, 19472380, 80309, 77.2
> >>
> >> while what I see in UCSC browser is
> >>
> >> *Human position:* chr11:18466025-18469085 size: 3061
> >> *Strand:* -
> >> *Human position: * chr20:42963585-42970575
> >> <http://ucsc.gis.a-star.edu.sg/cgi-bin/hgTracks?db=hg18&position=chr20%3A42963585-42970575>
> >> size: 6991
> >> *Chain ID:* 80309
> >> *Score:* 212246 *Approximate Score within browser window:* 1240
> >>
> >> The browser reports the position to be
> >> chr20:42963585-42970575, while what I get from chr11.chainSelf is
> >> chr20:19465389-19472380.
> >>
> >> - atif
> >>
> >> On 3/24/2009 2:44 AM, Jennifer Jackson wrote:
> >>> Hello,
> >>>
> >>> The data reported for the region chr11:56259903-56260492 on the
> >>> track description page represents a subset of the data from the
> >>> entire chain record (chr11_chainSelf.id = 1118118). That is why the
> >>> score is approximate. The track description page explains the
> >>> display conventions in the Description section.
> >>>
> >>> We hope this helps to explain the data,
> >>> Jennifer Jackson
> >>> UCSC Genome Bioinformatics Group
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Atif Shahab wrote:
> >>>> Hi,
> >>>>
> >>>> In the UCSC browser I get the following info
> >>>>
> >>>> *Human position:* chr11:56259903-56260492 size: 590
> >>>> *Strand:* -
> >>>> *Human position: * chr20:56904056-56908860
> >>>> <http://ucsc.gis.a-star.edu.sg/cgi-bin/hgTracks?db=hg18&position=chr20%3A56904056-56908860>
> >>>> size: 4805
> >>>> *Chain ID:* 1118118
> >>>> *Score:* 38233 *Approximate Score within browser window:* 2221
> >>>>
> >>>> But when I look into the MySQL table for chr11, I find the following:
> >>>>
> >>>> 725, 212246, 'chr11', 134452384, 18466024, 18469085, 'chr20',
> >>>> 62435964, '-', 19465389, 19472380, 80309, 77.2
> >>>>
> >>>> Notice that the chr20 start/end are different from what is shown in
> >>>> the browser.  Does the browser use some formula to compute the
> >>>> start/end?
> >>>>
> >>>> Also the chain track displays boxes joined by single/double lines.
> >>>> Which table contains the info regarding the start/end positions of
> >>>> these boxes?
> >>>>
> >>>> - atif
> >>>> _______________________________________________
> >>>> Genome maillist  -  [email protected]
> >>>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 25 Mar 2009 12:17:18 -0400
> From: Royden Clark <[email protected]>
> Subject: [Genome] Questions about Blat -fastmap option.
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> Greetings UCSC Genome Group,
>
> Sorry if this question has already been answered but I could not find
> it in the archives or the faqs.
>
> The documentation for blat
> at
> http://genome.ucsc.edu/goldenPath/help/blatSpec.html
>
> states
>
>  -fastMap    Run for fast DNA/DNA remapping - not allowing introns,
>                requiring high %ID
>
> The question is what is the %ID it requires?
> Can that % be modified?
>
> Thank you
> Royden Clark
>
> IT Specialist II
> University of Virginia
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 25 Mar 2009 12:37:25 -0700 (PDT)
> From: Galt Barber <[email protected]>
> Subject: Re: [Genome] Questions about Blat -fastmap option.
> To: Royden Clark <[email protected]>
> Cc: [email protected]
> Message-ID: <pine.gso.4.64.0903251153350.25...@sundance>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
>
> I think the idea for -fastMap is that it doesn't
> try to chain alignments together, so no introns.
>
> The "high %ID" required for said fastMap is not modifiable.
> Presumably it's very high (>90%).
>
> Of course simple %-identity threshold filtering is set with -minIdentity.
>
> fastMap does not enhance sensitivity but it might
> be good for speed in situations where gaps are
> not expected.
>
> I see that it is used for liftOver from one assembly
> to the next of the same species,
> and for mapping clones to contigs.
>
> Essentially any place where you expect really long
> identical alignments with no gaps or introns.
>
> Normally blat runs a banded dynamic programming algorithm
> to finalize and extend alignments.  This is an expensive
> and slow step and does not scale well with long queries.
>
> fastMap helps blat perform better when there are these
> long identical regions by skipping those extra costly slow
> steps.  Even so, applications usually chunk the query
> into 3kb pieces and run cluster jobs.
> Blat/gfServer have a default limit query size of 40kb.
> It just runs too slowly for longer queries.
>
> Several examples of fastMap usage look like this:
>  blat -minScore=100 -minIdentity=98 -fastMap
>
> fastMap is for mapping a chunk of virtually identical
> dna onto a larger object like a clone or scaffold or chrom.
>
> It will tolerate a few substitutions but not much else.
> Certainly no long gaps.
>
> -Galt
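
The chunking step mentioned above can be sketched as follows. This is Python with illustrative names; only the 3 kb chunk size comes from the message. Keeping each chunk's offset lets hits be shifted back into the coordinates of the original query afterwards.

```python
def chunk_query(seq, chunk_size=3000):
    """Split a query sequence into consecutive chunks, keeping each chunk's offset."""
    return [
        (offset, seq[offset:offset + chunk_size])
        for offset in range(0, len(seq), chunk_size)
    ]


query = "ACGT" * 1750  # hypothetical 7 kb query sequence
for offset, piece in chunk_query(query):
    print(offset, len(piece))
```
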
>
>
> On Wed, 25 Mar 2009, Royden Clark wrote:
>
> > Greetings UCSC Genome Group,
> >
> > Sorry if this question has already been answered but I could not find
> > it in the archives or the faqs.
> >
> > The documentation for blat
> > at
> > http://genome.ucsc.edu/goldenPath/help/blatSpec.html
> >
> > states
> >
> >  -fastMap    Run for fast DNA/DNA remapping - not allowing introns,
> >                requiring high %ID
> >
> > The question is what is the %ID it requires?
> > Can that % be modified?
> >
> > Thank you
> > Royden Clark
> >
> > IT Specialist II
> > University of Virginia
> >
>
>
> ------------------------------
>
> Message: 5
> Date: Thu, 26 Mar 2009 11:58:47 -0700 (PDT)
> From: Dhiral Phadke <[email protected]>
> Subject: [Genome] Coloring sub-parts of a track
> To: [email protected]
> Cc: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=iso-8859-1
>
> Hi,
>
> I have a track that I would like to display using 2 different colors: For
> example:
>
> track type=bedGraph name="TEST1-BedGraph Format" description="BedGraph format" visibility=full color=255,0,0 priority=20
> chr19 59302000 59302300 -1.0
> chr19 59302300 59302600 -0.75
> chr19 59302600 59302900 -0.50
> chr19 59302900 59303200 -0.25
> chr19 59303200 59303500 0.0
> chr19 59303500 59303800 0.25
> chr19 59303800 59304100 0.50
> chr19 59304100 59304400 0.75
> chr19 59304400 59304700 1.00
>
> I would like to display positions 59302000 - 59303500 in red and positions
> 59303500 - 59304700 in green. Is this possible? How can this be done?
>
> Thanks a lot.
>
> -Dhiral.
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Thu, 26 Mar 2009 12:02:44 -0700
> From: Hiram Clawson <[email protected]>
> Subject: Re: [Genome] Coloring sub-parts of a track
> To: [email protected]
> Cc: [email protected], [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Try using color=0,255,0 altColor=255,0,0
>
> This will display positive values in green, and negative values in red.
> You only have control over the color via this positive/negative indication.
> You do not have control of individual colors for individual plot positions.
>
> --Hiram
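
Applied to the example track from the question, the suggested settings would look roughly like the output of this Python sketch. The helper name is made up for illustration, and only three of the nine data rows are included to keep it short.

```python
def bedgraph_track(name, description, rows, color, alt_color):
    """Build custom-track text with separate colors for positive and negative values."""
    header = (
        f'track type=bedGraph name="{name}" description="{description}" '
        f"visibility=full color={color} altColor={alt_color} priority=20"
    )
    body = [f"{chrom} {start} {end} {value}" for chrom, start, end, value in rows]
    return "\n".join([header] + body)


rows = [
    ("chr19", 59302000, 59302300, -1.0),
    ("chr19", 59303200, 59303500, 0.0),
    ("chr19", 59304400, 59304700, 1.0),
]

track_text = bedgraph_track(
    "TEST1-BedGraph Format", "BedGraph format", rows,
    color="0,255,0",      # positive values: green
    alt_color="255,0,0",  # negative values: red
)
print(track_text)
```
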
>
> Dhiral Phadke wrote:
> > Hi,
> >
> > I have a track that I would like to display using 2 different colors: For
> > example:
> >
> > track type=bedGraph name="TEST1-BedGraph Format" description="BedGraph format" visibility=full color=255,0,0 priority=20
> > chr19 59302000 59302300 -1.0
> > chr19 59302300 59302600 -0.75
> > chr19 59302600 59302900 -0.50
> > chr19 59302900 59303200 -0.25
> > chr19 59303200 59303500 0.0
> > chr19 59303500 59303800 0.25
> > chr19 59303800 59304100 0.50
> > chr19 59304100 59304400 0.75
> > chr19 59304400 59304700 1.00
> >
> > I would like to display positions 59302000 - 59303500 in red and
> > positions 59303500 - 59304700 in green. Is this possible? How can this be
> > done?
> >
> > Thanks a lot.
> >
> > -Dhiral.
> >
> >
> >
> >
>
>
>
> ------------------------------
>
> Message: 7
> Date: Thu, 26 Mar 2009 16:08:53 -0500
> From: "Charu Gupta Kumar" <[email protected]>
> Subject: [Genome] gladHumESOtherData
> To: <[email protected]>
> Message-ID: <003801c9ae57$120c6600$362532...@edu>
> Content-Type: text/plain;       charset="us-ascii"
>
> Hi,
>
> I was curious what this table data shows. What are qVal and hVal? Why is
> there a single tissue name associated with a probe?
>
> Thanks,
>
> Charu
>
> Charu Gupta Kumar
> University of Illinois at Urbana-Champaign
>
>
> ------------------------------
>
> Message: 8
> Date: Thu, 26 Mar 2009 15:04:23 -0700
> From: Michael Hiller <[email protected]>
> Subject: [Genome] quality score 98
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Dear UCSC team,
>
> I have a question concerning the "manually set" quality score 98 that
> represents missing quality scores.
> The chimp browser for chr21 or chrY does not show quality scores, which
> is fine, since there are no qual scores.
> However, for chimp chr21 and chrY the hg18 44-way alignment contains the
> qual score 0, which comes from mafAddQRows encoding both scores 0 .. <5
> and 98 as 0.
>
> s panTro2.chr21 13793045 70 + 46489110 CTTGTGTGCCACCATCCCTGACTTTGTTGATAAGGGCATCAGGCTACATCCCTCTGGTACTCAGTGGTAA
> q panTro2.chr21                        0000000000000000000000000000000000000000000000000000000000000000000000
>
> That means that any attempt to filter out bad quality from a maf will
> fail for chr21 and Y because one cannot distinguish between a real
> quality score of say 3 and missing data (98) because both end up as 0.
>
> I have the following questions/suggestions:
> 1. Is there any species where 98 represents a real quality score (I mean
> 97 < 98 < 99) or is 98 always missing data?
> 2. Would it make sense to encode score 98 in the maf as '.' like it is
> done for gaps? Then one can distinguish between bad qual and missing data.
> 3. For chimp: chrY, chrY_random and chr21 have no quality scores in the
> browser display and in the quality wib table. However, the region
> chr7:87674857-92389096 has quality score 98. And these regions in the
> hg18 44-way maf contain a 0 in the q lines. Is the region
> chr7:87674857-92389096 different from chrY or chr21? And why is it
> treated differently?
>
> I think the quality score annotation of mafs is very useful, especially
> because of many low coverage genomes.
>
> Thanks a lot for your help.
> - Michael
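
The ambiguity described in this message, and suggestion 2, can be illustrated with a short Python sketch. The treatment of 98 follows the description above. The min(score // 5, 9) formula is an assumption taken from the MAF format documentation, not from the mafAddQRows source, and the function names are illustrative.

```python
MISSING_QUALITY = 98  # placeholder for "no quality score", per the message above


def encode_current(score):
    """Encoding as described in the message: 98 and scores below 5 both become '0'."""
    if score == MISSING_QUALITY:
        return "0"
    return str(min(score // 5, 9))


def encode_proposed(score):
    """Suggestion 2: write missing data as '.' so it stays distinct from low quality."""
    if score == MISSING_QUALITY:
        return "."
    return str(min(score // 5, 9))


for score in (3, 45, 98):
    print(score, encode_current(score), encode_proposed(score))
```

With the first function a real score of 3 and the placeholder 98 are indistinguishable in the q line; with the second they are not.
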
>
>
>
>
> ------------------------------
>
>
>
> End of Genome Digest, Vol 74, Issue 32
> **************************************
>



-- 
http://www.watch-movies-online-hollywoodkiller.com
