On March 1, 2021 12:50:35 PM GMT+01:00, Wols Lists <antli...@youngman.org.uk> 
wrote:
>I've got a bunch of scans, let's assume they're text documents. And
>they're rather big ... I want to email them.
>
>How on earth do I convert them to TRUE b&w documents? At the moment they
>are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>to store all the colour, luminance, whatever, per pixel. But actually,
>there's only ONE BIT of information there - whether that pixel is black
>or white.
>
>I'm using imagemagick, but so far all my attempts to strip out the
>surplus information have resulted in INcreasing the file size ???
>
>So basically, how do I save an image as "one bit per pixel" like you'd
>think you'd send to a B&W printer?
>
>Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>uncompressed info for a page of A4, not 3MB.
>
>Cheers,
>Wol
>

Have you tried optical character recognition (OCR) software such as Tesseract[1]?

1. https://github.com/tesseract-ocr/tesseract
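
As a rough check on the size estimate quoted above, here is a minimal Python
sketch of the arithmetic for an uncompressed one-bit-per-pixel page. It assumes
an A4 page of 210 x 297 mm and that each pixel row is padded to a whole number
of bytes. The function name bilevel_page_bytes is made up for illustration, and
this is only the sum, not a conversion tool.

```python
import math

A4_MM = (210, 297)   # assumed page size in millimetres (width, height)
MM_PER_INCH = 25.4


def bilevel_page_bytes(dpi, size_mm=A4_MM):
    """Return (width_px, height_px, bytes) for an uncompressed 1-bit page."""
    width_px = round(size_mm[0] / MM_PER_INCH * dpi)
    height_px = round(size_mm[1] / MM_PER_INCH * dpi)
    # One bit per pixel; assume each row is padded up to a whole byte.
    row_bytes = math.ceil(width_px / 8)
    return width_px, height_px, row_bytes * height_px


if __name__ == "__main__":
    w, h, n = bilevel_page_bytes(300)
    print(f"{w} x {h} px, {n} bytes uncompressed")
```

At 300dpi this comes to about 1.1MB for a full A4 page, the same ballpark as
the 800KB estimate and still well under the 3MB JPEGs.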



--
Hund
