On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:
>
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta
> -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception: Allocation failed, insufficient memory available, raised
> at ajstr.c:2406
>
> Whereas if I write a Bioperl script using SeqIO and the trunk() function, it
> works perfectly.
>
> I'd have thought EMBOSS would be more streamlined and memory efficient than
> Bioperl?

The problem appears to be in the buffering of input used to detect the format.

While we try to improve the performance, you can simply specify the format:

-sformat fasta

to turn off the file input buffering. Reading an unknown format requires a lot
of input to be buffered, in case a GCG ".." checksum line appears.

Hope that helps

Peter

_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
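
Applied to the command from the original report, the suggestion might look like the sketch below. The file names and the region are copied from that report. The shell variable names and the `command -v` guard are illustrative assumptions, added only so the snippet does nothing harmful on a machine without EMBOSS or the input file; they are not part of EMBOSS.

```shell
# Input, output and region as given in the original report.
seq_in=chr1.fasta
seq_out=chr1_.1.fasta
region=34415690-34415711

if command -v extractseq >/dev/null 2>&1 && [ -f "$seq_in" ]; then
    # -sformat fasta states the input format up front, so the input does not
    # have to be buffered for format detection.
    extractseq -sequence "$seq_in" -sformat fasta \
               -outseq "$seq_out" -regions "$region"
else
    # extractseq or the input file is missing: just show the command.
    echo "would run: extractseq -sequence $seq_in -sformat fasta -outseq $seq_out -regions $region"
fi
```

Whether this is enough for a whole chromosome on a 3Gb machine is not confirmed here; it only removes the format-detection buffering described above.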