Hi Angie,

Thanks for the advice.

I'll convert to bigBed if the current setup doesn't suffice ... I
already stuffed `I`s in the QUAL column, so that's fine, but the NNN's
in the sequence do look a bit wonky.
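
A rough sketch of that kind of placeholder stuffing, in Python, might look
like the following. It is an illustration rather than the script used in
this thread: the function names `query_length` and `stuff_placeholders` are
made up for the example, and it assumes SAM text records on stdin. The read
length is derived from the CIGAR, so "32M" yields 32 `N`s in SEQ and 32
`I`s in QUAL (`I` is Phred 40 in the usual Phred+33 encoding).

```python
import re
import sys

# CIGAR operations that consume query (read) bases, per the SAM spec.
QUERY_CONSUMING = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def query_length(cigar):
    """Number of read bases implied by a CIGAR string ('*' gives 0)."""
    if cigar == "*":
        return 0
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar)
               if op in QUERY_CONSUMING)


def stuff_placeholders(sam_line, base="N", qual="I"):
    """Replace '*' in SEQ (column 10) and QUAL (column 11) of one record."""
    line = sam_line.rstrip("\n")
    fields = line.split("\t")
    # Pass header lines and anything that is not a full record through as-is.
    if line.startswith("@") or len(fields) < 11:
        return line
    n = query_length(fields[5])
    if fields[9] == "*" and n > 0:
        fields[9] = base * n
    # QUAL must be the same length as SEQ, so size it from SEQ.
    if fields[10] == "*" and fields[9] != "*":
        fields[10] = qual * len(fields[9])
    return "\t".join(fields)


if __name__ == "__main__":
    for record in sys.stdin:
        print(stuff_placeholders(record))
```

One way to use it, assuming the script is saved under the hypothetical name
stuff.py, would be to pipe `samtools view -h in.bam` through it and then
back into `samtools view -b -o out.bam -`.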

Out of curiosity, how does "the general public" know when a new
release of your software has gone live?

Also -- thanks for fixing the issue so quickly.

-steve


On Fri, Jul 29, 2011 at 8:11 PM, Angie Hinrichs <[email protected]> wrote:
> Hi Steve,
>
> I have fixed the code, so '*' should be handled correctly on our main site 
> after the next software release (2.5-3 weeks).
>
> Using NNNN's as the sequence might cause some strange display effects in the 
> browser (I expect all bases would be marked as mismatches).  If your user is 
> going to click on an item to see item details, the page will terminate early 
> when it tries to display missing qual scores, so you might want to stuff some 
> characters there too (at least for the next few weeks), if you continue to 
> use the BAM for viewing.
>
> Since BED is already a UCSC Genome Browser custom track format, why not send 
> the GEO bed directly to the browser?  You could submit a track line followed 
> by a GEO URL(s), if there is a stable URL for the bed file(s).  Some GEO 
> short-read files are so large that they may time out the upload, but it seems 
> like it's worth a shot because of our code's alignment-centric display of 
> BAM.  (Of course, bigBed would work great and avoid the bulk upload time 
> issue.)
>
> Angie
>
> ----- Original Message -----
> From: "Steve Lianoglou" <[email protected]>
> To: "Angie Hinrichs" <[email protected]>
> Cc: [email protected]
> Sent: Friday, July 29, 2011 4:41:50 PM
> Subject: Re: [Genome] Mimimal BAM files is tripping up browser
>
> Hi Angie,
>
> Thanks for the quick response.
>
> I converted the bed to bam because I wanted to use some data I pulled
> down from GEO that is supposed to represent alignments, but was only
> given as bed files.
>
> The tools I've written to deal with NGS data work off of bam files, so
> I thought the easiest thing to do was to convert it to a bam file and
> move along -- and my collaborator likes to look at data through the
> "UCSC Genome Browser lens", which is where I got tripped up.
>
> Converting the data to bigBed didn't even cross my mind, really ... I
> regenerated bam files from the bed files, but put NNN's in the SEQ
> column for now.
>
> Thanks for the help,
>
> -steve
>
> On Fri, Jul 29, 2011 at 7:20 PM, Angie Hinrichs <[email protected]> wrote:
>> Hi Steve,
>>
>> You're right -- that is legal BAM, but our code expects to see a query 
>> sequence that corresponds to the CIGAR (in this case, it wants 32 bases 
>> corresponding to "32M").  The code assumes that BAM contains sequence 
>> alignments.  Thanks for reporting this, I will work on a fix.
>>
>> In the meantime, can you tell us why you converted bed to bam?  The bigBed 
>> format (http://genome.ucsc.edu/goldenPath/help/bigBed.html) was designed for 
>> large remote BED tracks, and might be a better solution (at least for 
>> viewing in the genome browser :).
>>
>> Thanks,
>> Angie
>>
>> ----- Original Message -----
>> From: "Steve Lianoglou" <[email protected]>
>> To: [email protected]
>> Sent: Friday, July 29, 2011 2:48:45 PM
>> Subject: Re: [Genome] Mimimal BAM files is tripping up browser
>>
>> As a follow up to this -- I think the "*" in either the sequence or
>> quality columns (columns 10 or 11) of a BAM file is what's causing the
>> error ...
>>
>> -steve
>>
>> On Fri, Jul 29, 2011 at 4:59 PM, Steve Lianoglou
>> <[email protected]> wrote:
>>> Hi,
>>>
>>> [sorry, I sent this email to the genome-mirror list previously, so I'm
>>> resending here]
>>>
>>> I've converted some bed files to bam files using bedtools.
>>>
>>> The alignments in the bam file look like so:
>>>
>>> *       16      chr1    13517   255     32M     *       0       0       *       *
>>> *       16      chr1    16275   255     32M     *       0       0       *       *
>>> *       16      chr1    16458   255     32M     *       0       0       *       *
>>> *       16      chr1    16461   255     32M     *       0       0       *       *
>>>
>>> When I add the bam file as a custom track and try to hop to the
>>> genome, I get this error:
>>>
>>> baseColorDrawSetup: *: mRNA size (0) != psl qSize (32)
>>>
>>> I've put this BAM file online to help smoke out the problem. For
>>> testing purposes, you can add it as a custom track like so:
>>>
>>> track type="bam" name="test-bam" bigDataUrl="http://cbio.mskcc.org/leslielab/files/ucsc/test.bam" genome="hg19" visibility="squish"
>>>
>>> I'm guessing that the genome browser doesn't like "*" in the QNAME
>>> (1st) column of the BAM file (or maybe one of the "*"s in another
>>> column(?)).
>>>
>>> It's easy enough for me to change whatever column to a bogus value to
>>> fix this (some pointers as to which column that would be are welcome),
>>> but as far as I can tell this is a valid bam file. Each column that
>>> has a "*" is allowed to do so according to the spec:
>>>
>>> http://samtools.sourceforge.net/SAM1.pdf
>>>
>>> While this is easy enough for me to fix on my end, I thought it would
>>> be worth reporting since it seems like a (somewhat minor) bug on your
>>> side as well (assuming such a bam file is valid, of course).
>>>
>>> Thanks,
>>> -steve
>>>
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>>  | Memorial Sloan-Kettering Cancer Center
>>>  | Weill Medical College of Cornell University
>>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>>
>>
>>
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>  | Memorial Sloan-Kettering Cancer Center
>>  | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>
>
>
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
