Package: xsane
Version: 0.996-3
Severity: wishlist

I would like to see support for the DjVu image format in XSane.

DjVu is a free file format designed specifically for scanned
documents. It supports advanced compression algorithms (wavelet-based)
and typically achieves smaller file sizes than PDF for scanned pages.
It is used, e.g., by the Internet Archive.

DjVu also supports embedding OCR text into an image word by word,
so that each image area is linked to its text equivalent. This
allows for easy searching, cut & paste, etc.

For a first overview, see http://en.wikipedia.org/wiki/Djvu or
www.djvu.org

As a related issue, I would like to see OCR support for multipage
documents. With the DjVu format, OCR makes scanned documents
searchable without the OCR output having to be a standalone
document of perfect quality.

In my experience, it works quite well to scan a document as
line art at 300 dpi to PDF, convert the PDF to DjVu using pdf2djvu,
and then OCR it using ocrodjvu (a frontend to OCRopus, Tesseract,
and Cuneiform) as a background process without user interaction. I
would like to skip the PDF stage and see OCR integrated with
XSane. I think scanning multipage texts is quite a common task, so
this would help a lot of people.
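
For reference, the manual workflow described above can be sketched
roughly as follows. This is only an illustration, not a proposal for
how xsane should implement it: the helper names build_pipeline and
run_pipeline are made up for this example, the output file name is
simply assumed to be the PDF name with a .djvu suffix, and the options
used (pdf2djvu -o, ocrodjvu --in-place and --engine) may differ
between versions of those tools.

```python
import shutil
import subprocess
from pathlib import Path


def build_pipeline(pdf_path, engine=None):
    """Return the two commands (argv lists) for PDF -> DjVu -> OCR.

    Hypothetical helper for illustration only; not part of xsane.
    """
    pdf = Path(pdf_path)
    # Assumption: write the DjVu file next to the PDF, same base name.
    djvu = pdf.with_suffix(".djvu")

    # Step 1: convert the scanned PDF to DjVu.
    convert = ["pdf2djvu", "-o", str(djvu), str(pdf)]

    # Step 2: add a hidden text layer to the DjVu file in place.
    ocr = ["ocrodjvu", "--in-place"]
    if engine is not None:
        # Optionally pick an OCR engine, e.g. "tesseract".
        ocr += ["--engine", engine]
    ocr.append(str(djvu))

    return [convert, ocr]


def run_pipeline(pdf_path, engine=None):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_pipeline(pdf_path, engine):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} is not installed")
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("scan.pdf") would then leave a searchable
scan.djvu next to the PDF, provided both tools are installed. The
intermediate PDF is exactly the step I would like xsane to make
unnecessary.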

Thanks!

Michael Below

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages xsane depends on:
ii  libatk1.0-0             1.28.0-1         The ATK accessibility toolkit
ii  libc6                   2.10.2-6         Embedded GNU C Library: Shared lib
ii  libcairo2               1.8.10-2         The Cairo 2D vector graphics libra
ii  libfontconfig1          2.8.0-2          generic font configuration library
ii  libfreetype6            2.3.11-1         FreeType 2 font engine, shared lib
ii  libgimp2.0              2.6.7-1.1        Libraries for the GNU Image Manipu
ii  libglib2.0-0            2.22.4-1         The GLib library of C routines
ii  libgtk2.0-0             2.18.6-1         The GTK+ graphical user interface 
ii  libjpeg62               6b-16.1          The Independent JPEG Group's JPEG 
ii  liblcms1                1.18.dfsg-1.2+b1 Color management library
ii  libpango1.0-0           1.26.2-1         Layout and rendering of internatio
ii  libpng12-0              1.2.43-1         PNG library - runtime
ii  libsane                 1.0.20-14+b1     API library for scanners
ii  libtiff4                3.9.2-3+b1       Tag Image File Format (TIFF) libra
ii  xsane-common            0.996-3          featureful graphical frontend for 
ii  zlib1g                  1:1.2.3.4.dfsg-3 compression library - runtime

Versions of packages xsane recommends:
ii  cups-client          1.4.2-4             Common UNIX Printing System(tm) - 
ii  epiphany-browser [ww 2.29.3-1            Intuitive GNOME web browser
ii  iceweasel [www-brows 3.5.8-1             Web browser based on Firefox
ii  konqueror [www-brows 4:4.3.4-1           KDE 4's advanced file manager, web
ii  lynx-cur [www-browse 2.8.8dev.2-1        Text-mode WWW Browser with NLS sup
ii  midori [www-browser] 0.1.8-1             fast, lightweight graphical web br
ii  opera [www-browser]  10.10.4742.gcc4.qt3 The Opera Web Browser
ii  w3m [www-browser]    0.5.2-4             WWW browsable pager with excellent

Versions of packages xsane suggests:
ii  gimp                          2.6.7-1.1  The GNU Image Manipulation Program
ii  gocr                          0.46-2.1   A command line OCR
pn  gv                            <none>     (no description available)
pn  hylafax-client | mgetty-fax   <none>     (no description available)

-- no debconf information


