Re: [Gimp-developer] Some questions about tiff
On Sun, 2022-09-18 at 23:53 +0200, Adalbert Hanßen wrote:
> Attached is an example scan from XSane.

Hello!

I'm sorry to have taken a while to reply - it's been a difficult time here.

First, i opened your sample image, and i do see a histogram under Curves. However, it's not very obvious. If you switch from linear to logarithmic histogram using the rightmost icon near the top by Reset Channel, you'll see it more clearly.

> 1. reduce the size of the file by reducing the 600 dpi resolution
> which was chosen during scanning for better OCR results, keep the OCR
> result,

Use Image->Scale Image, and divide by a multiple of two. For example, your sample image is 4976x3190 pixels; in Scale Image, put a /2 after the width in the text box and press the tab key, and it'll divide by two. Or use /4 to divide by 4.

> 2. reduce the bits per pixel for the scan image plane, e.g. by
> posterizing or even binarizing,

Probably i'd keep 8 bits per pixel, but you can use Curves to reduce the amount of detail stored: drag the bottom left of the diagonal line (the curve) right by four boxes, and the top right corner of the diagonal line left by one box or half a box, making sure you can still read the text.

> 3. improve the contrast of the displayed pdf file by some contrast
> enhancing function, e.g. as it is done after applying a contrast
> curve in Gimp.
>
> I want to do all this maintaining the OCR-plane from input files.
> Manipulating sandwich PDF-files (like scans made searchable by OCR)
> is probably out of the scope of Gimp. But the functions used for the
> image plane are in it.
> gs (ghostscript) can reduce the dpi e.g. from 600 dpi (good for OCR)
> down to 150 dpi (insufficient for OCR but sufficient to display most
> documents). I wish they would also provide 200 dpi, requiring a bit
> more storage space. Unfortunately gs only handles 72 dpi (/screen),
> 150 dpi (/ebook) and 300 dpi for output. It can do this keeping the
> OCR plane.
I don't know about planes, are we flying somewhere?? You could use imagemagick or (on Linux at least) netpbm and a shell script to automate this. GhostScript isn't the tool i'd normally use.

> To my knowledge, gs can't apply any color or grey level
> transformations, not even ones that could be done with a lookup
> table.

gs is a PostScript renderer, not an image processing tool.

Hope this helps

ankh / liam / demib0y

--
Liam Quin - delightfulcomputing.com
Cancer gofundme https://www.gofundme.com/f/5u9v7-every-little-helps
Vintage pictures & texts https://www.fromoldbooks.org/
Full-time "slave" in voluntary servitude.
___
gimp-developer-list mailing list
List address: gimp-developer-list@gnome.org
List membership: https://mail.gnome.org/mailman/listinfo/gimp-developer-list
List archives: https://mail.gnome.org/archives/gimp-developer-list
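The two manual steps in the message above (dividing the size in Scale Image, and pulling in the ends of the curve) come down to simple arithmetic. Here is a minimal Python sketch of that arithmetic, not of anything Gimp does internally. The function names and the endpoint values 64 and 224 are assumptions made for illustration; the message only gives the endpoints as grid boxes in the Curves dialog, and the 600 dpi figure is taken from the quoted question.

```python
def scaled_size(width, height, dpi, divisor):
    """Return (width, height, dpi) after dividing each by an integer divisor.

    Pixel dimensions are rounded down to whole pixels.
    """
    return width // divisor, height // divisor, dpi / divisor


def endpoint_lut(black, white):
    """Build a 256-entry lookup table mapping `black` to 0 and `white` to 255.

    Values at or below `black` become 0, values at or above `white` become
    255, and everything in between is stretched linearly. The endpoints are
    hypothetical inputs chosen by the caller.
    """
    if not 0 <= black < white <= 255:
        raise ValueError("need 0 <= black < white <= 255")
    span = white - black
    lut = []
    for value in range(256):
        clamped = min(max(value, black), white)
        lut.append(round((clamped - black) * 255 / span))
    return lut


if __name__ == "__main__":
    # The sample scan from the thread: 4976x3190 pixels at 600 dpi.
    print(scaled_size(4976, 3190, 600, 2))
    print(scaled_size(4976, 3190, 600, 4))

    # Assumed endpoints, only to show the shape of the table.
    lut = endpoint_lut(64, 224)
    print(lut[0], lut[64], lut[224], lut[255])
```

Applying such a table to every pixel flattens the near-black and near-white ends, which is why the result tends to compress better while the text stays readable.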
Re: [Gimp-developer] Some questions about tiff
Attached is an example scan from XSane. There are more than just a few grey levels, and you can see the handwritten notes on top of it.

This file could easily be represented in a posterized form with 4 bits per pixel; it could probably also be represented with 2 bits per pixel, but Gimp's posterize function would need parameters to use something blue for the handwritten part, as in my second example.

Gimp does not show a histogram when using the contrast curve tool in my example.

My question arises when lots of already scanned and OCR-treated pdf files are to be optimized with three goals:

1. reduce the size of the file by reducing the 600 dpi resolution, which was chosen during scanning for better OCR results, while keeping the OCR result,
2. reduce the bits per pixel for the scan image plane, e.g. by posterizing or even binarizing,
3. improve the contrast of the displayed pdf file by some contrast enhancing function, e.g. as it is done after applying a contrast curve in Gimp.

I want to do all this while maintaining the OCR plane from the input files. Manipulating sandwich PDF files (like scans made searchable by OCR) is probably out of the scope of Gimp. But the functions used for the image plane are in it.

gs (ghostscript) can reduce the dpi, e.g. from 600 dpi (good for OCR) down to 150 dpi (insufficient for OCR but sufficient to display most documents). I wish they would also provide 200 dpi, requiring a bit more storage space. Unfortunately gs only handles 72 dpi (/screen), 150 dpi (/ebook) and 300 dpi for output. It can do this keeping the OCR plane.

To my knowledge, gs can't apply any color or grey level transformations, not even ones that could be done with a lookup table.

Regards
Adalbert

On 18.09.22 at 22:32, Liam R E Quin wrote:
> On Sun, 2022-09-18 at 20:52 +0200, Adalbert Hanßen via gimp-developer-list wrote:
> > XSane produced a color scan from a document with 600 dpi and fill color, 1.1MB file size.
>
> I normally have XSane make a png file.
> For 8-bit per channel (0 to 255) images, you can also use the XSane gimp plugin, which is a lot easier, but make sure to export the file right away so you have a copy if gimp crashes or if you make a mistake :)
>
> > When I load this file into Gimp, I get an error message about an incompatible TIFF format (additional channels without the field ExtraSamples). It gives me the choice to let the additional channel work as
> >
> > * non pre-multiplied alpha
> > * pre-multiplied alpha
> > * channel
> >
> > I see no difference whatever choice I select.
>
> If you choose Channel, it'll be visible in Gimp's Channels dialogue. Otherwise, it's most likely pre-multiplied alpha (transparency), and will most likely be "all opaque", so you can ignore it.
>
> > However: When I try to adapt colors by the contrast curve, I see no Histogram under it.
>
> How large is the image? If you used the Line Art setting in XSane, every pixel will be either 0 or 255, so the histogram is just two vertical lines, one at each end, that aren't really visible as they're right next to the edge.
>
> For a large image it can take a while for the background thread to count all the pixels in the image and fill in the histogram.
>
> > ** Is this due to the error message when loading the file? **
>
> no.
>
> ankh / liam / demib0y
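The message above talks about posterizing the scan down to 4 or 2 bits per pixel, or binarizing it. As a rough illustration of what that means for a single 8-bit value, here is a minimal Python sketch. It uses evenly spaced levels, which is an assumption made for illustration and is not claimed to be the exact rule Gimp's posterize uses; the function name is hypothetical.

```python
def posterize_value(value, bits):
    """Quantize an 8-bit value (0..255) to 2**bits evenly spaced levels.

    With bits=1 this is plain binarization to 0 or 255.
    """
    if not 0 <= value <= 255:
        raise ValueError("value must be in 0..255")
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in 1..8")
    top_index = (1 << bits) - 1          # index of the brightest level
    index = round(value * top_index / 255)
    return round(index * 255 / top_index)


if __name__ == "__main__":
    for bits in (4, 2, 1):
        levels = sorted({posterize_value(v, bits) for v in range(256)})
        print(bits, "bits ->", len(levels), "levels:", levels)
```

Run on each color channel separately, a quantizer like this treats grey print and colored handwriting the same way, which is the limitation the message points out for the blue notes.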
Re: [Gimp-developer] Some questions about tiff
On Sun, 2022-09-18 at 20:52 +0200, Adalbert Hanßen via gimp-developer-list wrote:
> XSane produced a color scan from a document with 600 dpi and fill
> color, 1.1MB file size.

I normally have XSane make a png file.

For 8-bit per channel (0 to 255) images, you can also use the XSane gimp plugin, which is a lot easier, but make sure to export the file right away so you have a copy if gimp crashes or if you make a mistake :)

> When I load this file into Gimp, I get an error message about an
> incompatible TIFF format (additional channels without the field
> ExtraSamples). It gives me the choice to let the additional channel
> work as
>
> * non pre-multiplied alpha
> * pre-multiplied alpha
> * channel
>
> I see no difference whatever choice I select.

If you choose Channel, it'll be visible in Gimp's Channels dialogue. Otherwise, it's most likely pre-multiplied alpha (transparency), and will most likely be "all opaque", so you can ignore it.

> However: When I try to adapt colors by the contrast curve, I see no
> Histogram under it.

How large is the image? If you used the Line Art setting in XSane, every pixel will be either 0 or 255, so the histogram is just two vertical lines, one at each end, that aren't really visible as they're right next to the edge.

For a large image it can take a while for the background thread to count all the pixels in the image and fill in the histogram.

> ** Is this due to the error message when loading the file? **

no.

ankh / liam / demib0y

--
Liam Quin - paligo.net, delightfulcomputing.com
Cancer gofundme https://www.gofundme.com/f/5u9v7-every-little-helps
Vintage pictures & texts https://www.fromoldbooks.org/
Full-time "slave" in voluntary servitude
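The point in the message above about a histogram that is hard to see can be illustrated with a few lines of Python. This is only a sketch of the counting and scaling idea, not Gimp's drawing code; the function names, the 100-unit bar height and the pixel counts are made up for the example.

```python
import math
from collections import Counter


def histogram(pixels):
    """Count how often each 8-bit value (0..255) occurs."""
    counts = Counter(pixels)
    return [counts.get(value, 0) for value in range(256)]


def bar_heights(hist, log_scale=False, max_height=100):
    """Scale counts to bar heights, linearly or logarithmically."""
    scale = math.log1p if log_scale else float
    top = max(scale(count) for count in hist)
    if top == 0:
        return [0] * len(hist)
    return [round(max_height * scale(count) / top) for count in hist]


if __name__ == "__main__":
    # Line-art style data: only 0 and 255 occur, so only the two
    # edge columns of the histogram are non-empty.
    line_art = [0] * 900 + [255] * 100
    hist = histogram(line_art)
    print([v for v, count in enumerate(hist) if count])

    # Mostly white paper with a little mid-grey: the grey bar vanishes
    # on a linear scale but shows up on a logarithmic one.
    page = [255] * 100000 + [128] * 50
    hist = histogram(page)
    print(bar_heights(hist)[128], bar_heights(hist, log_scale=True)[128])
```

With a logarithmic scale the rare values keep a visible bar next to the dominant paper white, which is the same reason a logarithmic view makes a scan's histogram easier to read.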