Package: xsane
Version: 0.996-3
Severity: wishlist

I would like to see support for the DjVu image format in xsane. DjVu is a free file format made specifically for scanned text. It supports advanced compression algorithms (wavelet) and, for scanned pages, achieves smaller file sizes than PDF. It is used e.g. by the Internet Archive. DjVu also supports embedding OCR text into an image word by word, so that each image area is linked to its text equivalent. This allows for easy searching, cut & paste etc. For a first overview, see http://en.wikipedia.org/wiki/Djvu or www.djvu.org

As a related issue, I would like to see OCR support for multipage documents. With the DjVu format, this makes scanned documents searchable without requiring the OCR output to work as a standalone document of perfect quality.

In my experience, the following works quite well: scan a document as lineart at 300 dpi to PDF, convert the PDF to DjVu using pdf2djvu, and then OCR it using ocrodjvu (a frontend to Ocropus, Tesseract and Cuneiform) as a background process without user interaction. I would like to skip the PDF stage and see OCR integrated with xsane.

I think scanning multipage texts is quite a common task, so this would help a lot of people.

Thanks!
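
For illustration, the manual workflow described above (PDF to DjVu with pdf2djvu, then OCR with ocrodjvu) could be scripted roughly as follows. This is only a sketch under assumptions: the function names, the default engine "tesseract" and the language code "eng" are made up for the example, and it assumes pdf2djvu and ocrodjvu are installed and accept the options shown (-o, --in-place, --engine, --language).

```python
import subprocess
from pathlib import Path


def build_commands(pdf_path, engine="tesseract", language="eng"):
    """Return the two command lines for the PDF -> DjVu -> OCR workflow.

    engine and language are illustrative defaults, not xsane settings.
    """
    pdf = Path(pdf_path)
    djvu = pdf.with_suffix(".djvu")  # scan.pdf -> scan.djvu
    # Step 1: convert the scanned PDF to DjVu.
    convert = ["pdf2djvu", "-o", str(djvu), str(pdf)]
    # Step 2: add a hidden text layer to the DjVu file in place.
    ocr = ["ocrodjvu", "--in-place", "--engine", engine,
           "--language", language, str(djvu)]
    return [convert, ocr]


def run_pipeline(pdf_path, **kwargs):
    """Run both steps without user interaction; stop on the first failure."""
    for cmd in build_commands(pdf_path, **kwargs):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("scan.pdf") would then leave a searchable scan.djvu next to the PDF. Building the command lists separately from running them makes it easy to inspect what would be executed before any external tool is started.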

Michael Below

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages xsane depends on:
ii  libatk1.0-0           1.28.0-1             The ATK accessibility toolkit
ii  libc6                 2.10.2-6             Embedded GNU C Library: Shared lib
ii  libcairo2             1.8.10-2             The Cairo 2D vector graphics libra
ii  libfontconfig1        2.8.0-2              generic font configuration library
ii  libfreetype6          2.3.11-1             FreeType 2 font engine, shared lib
ii  libgimp2.0            2.6.7-1.1            Libraries for the GNU Image Manipu
ii  libglib2.0-0          2.22.4-1             The GLib library of C routines
ii  libgtk2.0-0           2.18.6-1             The GTK+ graphical user interface
ii  libjpeg62             6b-16.1              The Independent JPEG Group's JPEG
ii  liblcms1              1.18.dfsg-1.2+b1     Color management library
ii  libpango1.0-0         1.26.2-1             Layout and rendering of internatio
ii  libpng12-0            1.2.43-1             PNG library - runtime
ii  libsane               1.0.20-14+b1         API library for scanners
ii  libtiff4              3.9.2-3+b1           Tag Image File Format (TIFF) libra
ii  xsane-common          0.996-3              featureful graphical frontend for
ii  zlib1g                1:1.2.3.4.dfsg-3     compression library - runtime

Versions of packages xsane recommends:
ii  cups-client           1.4.2-4              Common UNIX Printing System(tm) -
ii  epiphany-browser [ww  2.29.3-1             Intuitive GNOME web browser
ii  iceweasel [www-brows  3.5.8-1              Web browser based on Firefox
ii  konqueror [www-brows  4:4.3.4-1            KDE 4's advanced file manager, web
ii  lynx-cur [www-browse  2.8.8dev.2-1         Text-mode WWW Browser with NLS sup
ii  midori [www-browser]  0.1.8-1              fast, lightweight graphical web br
ii  opera [www-browser]   10.10.4742.gcc4.qt3  The Opera Web Browser
ii  w3m [www-browser]     0.5.2-4              WWW browsable pager with excellent

Versions of packages xsane suggests:
ii  gimp                  2.6.7-1.1            The GNU Image Manipulation Program
ii  gocr                  0.46-2.1             A command line OCR
pn  gv                    <none>               (no description available)
pn  hylafax-client | mgetty-fax  <none>        (no description available)

-- no debconf information