On 06.08.2014 15:24, Peter Cock wrote: > On Wed, Aug 6, 2014 at 2:19 PM, Wolfgang Maier > <[email protected]> wrote: >> Dear all, >> >> I've been asking this in the pysam user group, but didn't get any >> answers there, so I am giving it one more try here on this bigger list >> (it may well be a samtools api question anyway): > > You forgot the link to that discussion: > https://groups.google.com/forum/#!topic/pysam-user-group/HTFch5Ejxd0 >
Right, that's my other post (unfortunately, not really a discussion yet :)). Thanks for linking to it. >> pysam refuses to provide an iterator over reads in sam files that don't >> specify an @SQ sequence dictionary in their header (if you're asking for >> one, it raises - a somewhat misleading - "NotImplementedError: can not >> iterate over samfile without header"). >> There seems to be a good reason for this, since if you bypass the error, >> then sometimes (though, interestingly, not with every sam file) the >> Python interpreter crashes immediately when you try to retrieve the >> first read. >> >> I am trying to understand what is going on here and why, so I guess my >> question is: >> >> - is an @SQ entry in the header an absolute technical requirement to >> retrieve reads from a samfile or > > No, while SAM files ought to have @SQ lines they can often be > missing (thus useful commands like samtools reheader exist), and > if you had a SAM file of entirely unmapped reads there should be > no @SQ lines at all. > That's what I thought, but reading this again made me realize that there may be a simple pattern explaining when pysam causes Python to crash: It looks (from a limited number of cases though) like the crash is only happening with aligned sam files, not with unmapped ones, which could mean that the cause of the crash is that pysam/samtools (?) does not like to encounter sequence names when it hasn't seen them before in the header. Surprisingly, you can cure mapped sam files by putting in a single valid @SQ line, which can have a SN value that is *not* occurring among the reads. Maybe that indicates that pysam/samtools (?) is having a problem because it is trying to do something with an uninitialized data structure ? >> - is this some kind of limitation of the samtools api or >> - simply a bug in pysam > > Probably one of those - I'm not familiar enough with the samtools API > to say which. > > Peter > ------------------------------------------------------------------------------ Infragistics Professional Build stunning WinForms apps today! Reboot your WinForms applications with our WinForms controls. Build a bridge from your legacy apps to the future. http://pubads.g.doubleclick.net/gampad/clk?id=153845071&iu=/4140/ostg.clktrk _______________________________________________ Samtools-help mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/samtools-help
