Thanks, John, for the clarification! I realized that I also misspelt your surname, apologies. So I can safely continue to use my small program that removes EOF blocks from bgzipped BED archives (without headers).
Does anyone know whether this bug in htslib will be fixed?

-Pär

-----Original Message-----
From: John Marshall [mailto:[email protected]]
Sent: 5 October 2015 09:48
To: Pär Larsson <[email protected]>
Cc: [email protected]
Subject: Re: [Samtools-help] bgzip removal of EOF blocks?

On 3 Oct 2015, at 23:52, Pär Larsson <[email protected]> wrote:
> Sorry to bother you with a question relating to an older discussion
> (http://sourceforge.net/p/samtools/mailman/message/34109200/). John Marshall
> expressed concerns that removal of the 28 byte EOF block from bgzip archives
> would be unreliable for catting archives together using the 'cat' command.

Actually I expanded on concerns with *not* removing the 28 byte EOF block. In principle you shouldn't even need to remove it, but bugs in current versions of tools mean that you do, as Stathis had found.

Removing the 28 bytes enables you to produce a catted file that looks identical to one that was written all at once, which will be fine.

> Seems to work when I try (using bgzipped bed files) although tabix indices
> and archive sizes become different. It would save time if it could be done
> this way so I'm just curious if anyone might know when and how it could fail.
> Silently?

The other obvious issue with catting these files together is that you need to make sure that headers of the second and subsequent files don't cause trouble, as they will now be embedded in the concatenated file rather than at the beginning. Stathis avoided this by removing headers from all but the first input VCF file.

Embedded headers may be acceptable in BED files depending on the tools you're using, so you may be fine -- but in general it's something to consider carefully.

    John

-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.
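
For readers of the archive, the approach discussed above (drop the 28-byte EOF block from each input, cat the rest, end with a single EOF block) could be sketched roughly as follows. This is not Pär's actual program, only a minimal illustration; the function names `payload_length` and `cat_bgzf` are made up for this example. It assumes headerless inputs, as in the BED case above, and the EOF bytes are the empty-block marker given in the BGZF section of the SAM specification.

```python
import os

# The 28-byte empty BGZF block used as the end-of-file marker
# (taken from the BGZF section of the SAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def payload_length(path):
    """Return the file size, minus a trailing EOF block if one is present."""
    size = os.path.getsize(path)
    if size >= len(BGZF_EOF):
        with open(path, "rb") as fh:
            fh.seek(size - len(BGZF_EOF))
            if fh.read() == BGZF_EOF:
                return size - len(BGZF_EOF)
    return size


def cat_bgzf(inputs, output, chunk_size=1 << 20):
    """Concatenate bgzip files (hypothetical helper, headerless inputs assumed).

    Each input is copied without its trailing EOF block, and exactly one
    EOF block is written at the end of the output.
    """
    with open(output, "wb") as out:
        for path in inputs:
            remaining = payload_length(path)
            with open(path, "rb") as fh:
                while remaining > 0:
                    chunk = fh.read(min(remaining, chunk_size))
                    if not chunk:
                        break
                    out.write(chunk)
                    remaining -= len(chunk)
        out.write(BGZF_EOF)
```

For example, `cat_bgzf(["a.bed.gz", "b.bed.gz"], "all.bed.gz")` (placeholder file names). Block offsets in the combined file differ from those in the inputs, so a tabix index would need to be rebuilt on the output rather than reused.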
