Bug#548987: xapian-omega: omindex allows too little memory for the programs handling the file formats
tags 548987 + upstream pending
thanks

On Wed, Sep 30, 2009 at 01:31:06PM +0200, Rune Kock wrote:
> On Wed, Sep 30, 2009 at 11:10, Olly Betts o...@survex.com wrote:
> > I think this is probably the cause of upstream #358:
> >
> > http://trac.xapian.org/ticket/358
>
> Yes, #358 is probably two bugs, this one and a memory leak.  evoisard has omindex using 360 MB, which seems excessive, but shouldn't be a problem on his 1 GB machine.

Indexing large documents is fairly memory hungry, so I don't think this is a leak, just the C++ STL hoarding memory, probably plus some heap fragmentation.  The reporter never responded to my request for further investigation so it's hard to be totally sure, but there's only one place in omindex which explicitly allocates memory dynamically, and that is only called once.

> > > Debian bug 404528 describes a similar bug in another package, and suggests using _SC_PHYS_PAGES instead of _SC_AVPHYS_PAGES.
> >
> > _SC_PHYS_PAGES isn't ideal as other processes might be using that memory, but _SC_AVPHYS_PAGES clearly isn't suitable, and since this is mostly a catch for filters spiralling out of control, it looks like _SC_PHYS_PAGES is probably the best option, perhaps with a lower ratio than 7/8.
>
> I agree.  Or maybe even just a fixed limit of, say, 50 MB.

Some document filters will need more than 50 MB to extract a large document.  We really just want to ensure that an out-of-control filter doesn't cause problems.

I've committed a fix to upstream SVN for this.

Cheers,
    Olly

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#548987: xapian-omega: omindex allows too little memory for the programs handling the file formats
On Wed, Sep 30, 2009 at 05:54:45AM +0200, Rune Kock wrote:
> As far as I can tell, the problem is that runfilter.cc sets an rlimit of 7/8 of the free memory.  And that freemem.cc calculates that using sysconf(_SC_AVPHYS_PAGES), which doesn't include the memory that the kernel is using for temporary caching, even though that really is available.

I think this is probably the cause of upstream #358:

http://trac.xapian.org/ticket/358

> Debian bug 404528 describes a similar bug in another package, and suggests using _SC_PHYS_PAGES instead of _SC_AVPHYS_PAGES.

_SC_PHYS_PAGES isn't ideal as other processes might be using that memory, but _SC_AVPHYS_PAGES clearly isn't suitable, and since this is mostly a catch for filters spiralling out of control, it looks like _SC_PHYS_PAGES is probably the best option, perhaps with a lower ratio than 7/8.

Thanks for the report and especially the detective work.

Cheers,
    Olly
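For readers following the thread, here is a minimal sketch of how a limit like the one discussed above could be applied to a filter process. It is not the code from runfilter.cc. The function names `cap_address_space` and `run_filter_capped` are hypothetical, and the use of RLIMIT_AS is an assumption (the thread only says "an rlimit"), although an address-space cap would be consistent with the "failed to map segment" error in the report.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_AS
#include <sys/types.h>     // pid_t
#include <sys/wait.h>      // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>        // fork, execvp, _exit
#include <cstdint>

// Hypothetical helper: cap the address space of the calling process.
// ASSUMPTION: RLIMIT_AS is the limit of interest; the thread does not say.
bool cap_address_space(std::uint64_t bytes) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0) return false;
    rlim_t want = static_cast<rlim_t>(bytes);
    // Never try to raise an existing hard limit; unprivileged processes can't.
    if (lim.rlim_max != RLIM_INFINITY && lim.rlim_max < want) want = lim.rlim_max;
    lim.rlim_cur = want;
    lim.rlim_max = want;
    return setrlimit(RLIMIT_AS, &lim) == 0;
}

// Hypothetical helper: run an external filter (argv[0], looked up on PATH)
// with its address space capped at `bytes`.
// Returns the filter's exit status, 126 if the limit could not be set,
// 127 if the program could not be executed, or -1 on fork/wait failure
// or if the filter was killed by a signal.
int run_filter_capped(char* const argv[], std::uint64_t bytes) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: the limit applies only to this process and what it execs.
        if (!cap_address_space(bytes)) _exit(126);
        execvp(argv[0], argv);
        _exit(127);  // only reached if execvp failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Setting the limit in the child between fork() and exec keeps the indexer itself unrestricted. If the cap is computed from a few megabytes of free memory, the dynamic loader in the child may fail before the filter even starts, which is the symptom reported in this bug.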
Bug#548987: xapian-omega: omindex allows too little memory for the programs handling the file formats
On Wed, Sep 30, 2009 at 11:10, Olly Betts o...@survex.com wrote:
> I think this is probably the cause of upstream #358:
>
> http://trac.xapian.org/ticket/358

Yes, #358 is probably two bugs, this one and a memory leak.  evoisard has omindex using 360 MB, which seems excessive, but shouldn't be a problem on his 1 GB machine.

> > Debian bug 404528 describes a similar bug in another package, and suggests using _SC_PHYS_PAGES instead of _SC_AVPHYS_PAGES.
>
> _SC_PHYS_PAGES isn't ideal as other processes might be using that memory, but _SC_AVPHYS_PAGES clearly isn't suitable, and since this is mostly a catch for filters spiralling out of control, it looks like _SC_PHYS_PAGES is probably the best option, perhaps with a lower ratio than 7/8.

I agree.  Or maybe even just a fixed limit of, say, 50 MB.
Bug#548987: xapian-omega: omindex allows too little memory for the programs handling the file formats
Package: xapian-omega
Version: 1.0.16-1
Severity: important

I get the following error when trying to index a PDF file:

    Indexing /x.pdf as application/pdf ... pdftotext: error while loading shared libraries: libc.so.6: failed to map segment from shared object: Cannot allocate memory
    Filter for application/pdf not installed - ignoring extension pdf

As far as I can tell, the problem is that runfilter.cc sets an rlimit of 7/8 of the free memory.  And that freemem.cc calculates that using sysconf(_SC_AVPHYS_PAGES), which doesn't include the memory that the kernel is using for temporary caching, even though that really is available.

Debian bug 404528 describes a similar bug in another package, and suggests using _SC_PHYS_PAGES instead of _SC_AVPHYS_PAGES.

Here is the output of 'free' on my system:

                 total       used       free     shared    buffers     cached
    Mem:        249644     246080       3564          0          0     182332
    -/+ buffers/cache:      63748     185896
    Swap:       979956      11720     968236

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i586)

Kernel: Linux 2.6.30-1-486
Locale: LANG=en_DK.UTF-8, LC_CTYPE=en_DK.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages xapian-omega depends on:
ii  libc6                          2.9-25           GNU C Library: Shared libraries
ii  libgcc1                        1:4.4.1-1        GCC support library
ii  libstdc++6                     4.4.1-1          The GNU Standard C++ Library v3
ii  libxapian15                    1.0.16-3         Search engine library

Versions of packages xapian-omega recommends:
ii  apache2                        2.2.13-2         Apache HTTP Server metapackage
ii  apache2-mpm-worker [httpd-cgi  2.2.13-2         Apache HTTP Server - high speed th

Versions of packages xapian-omega suggests:
ii  antiword                       0.37-6           Converts MS Word files to text, PS
ii  catdoc                         0.94.2-1         MS-Word to TeX or plain text conve
pn  catdvi                         <none>           (no description available)
pn  djvulibre-bin                  <none>           (no description available)
pn  ghostscript                    <none>           (no description available)
pn  libwpd-tools                   <none>           (no description available)
pn  libwps-tools                   <none>           (no description available)
ii  perl                           5.10.0-25        Larry Wall's Practical Extraction
ii  unrtf                          0.19.3-1.1       RTF to other formats converter
ii  unzip                          6.0-1            De-archiver for .zip files
ii  xpdf-utils                     3.02-1.4+lenny1  Portable Document Format (PDF) sui

-- debconf-show failed
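To make the arithmetic in this report concrete, the sketch below computes a 7/8 limit from a page count, once from free pages and once from total pages. It is an illustration, not the code in freemem.cc. The names `scaled_limit`, `limit_from_free_memory` and `limit_from_total_memory` are hypothetical. `_SC_AVPHYS_PAGES` and `_SC_PHYS_PAGES` are glibc extensions rather than POSIX, so this assumes a Linux/glibc system.

```cpp
#include <unistd.h>  // sysconf, _SC_PAGESIZE, _SC_PHYS_PAGES, _SC_AVPHYS_PAGES
#include <cstdint>

// Hypothetical helper: convert a page count to bytes and take num/den of it.
// Returns 0 for non-positive inputs (sysconf reports failure as -1).
std::uint64_t scaled_limit(long pages, long page_size, unsigned num, unsigned den) {
    if (pages <= 0 || page_size <= 0 || den == 0) return 0;
    std::uint64_t bytes = static_cast<std::uint64_t>(pages) *
                          static_cast<std::uint64_t>(page_size);
    return bytes / den * num;  // divide first to avoid overflow on huge sizes
}

// 7/8 of currently free physical memory. Page cache is not counted as free,
// which is the behaviour this report describes as the problem.
std::uint64_t limit_from_free_memory() {
    return scaled_limit(sysconf(_SC_AVPHYS_PAGES), sysconf(_SC_PAGESIZE), 7, 8);
}

// 7/8 of total physical memory, the alternative suggested in Debian bug 404528.
std::uint64_t limit_from_total_memory() {
    return scaled_limit(sysconf(_SC_PHYS_PAGES), sysconf(_SC_PAGESIZE), 7, 8);
}
```

Assuming 4096-byte pages and that the "free" column of the output above corresponds to `_SC_AVPHYS_PAGES`, the 3564 kB free is 891 pages, and `scaled_limit(891, 4096, 7, 8)` gives 3193344 bytes, roughly 3 MB. The 249644 kB total is 62411 pages, and `scaled_limit(62411, 4096, 7, 8)` gives 223681024 bytes, roughly 213 MB.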