I posted this back in January on the user list, but never got a response.  The 
problem is rearing its ugly head again.  Anyone have any ideas?

Sample file here: 
https://www.dropbox.com/s/ahke8boksmk2f94/sample%20rotated%20image%20-%20Redacted.tif?dl=0


----------------------
Detecting angle of rotation
Last year, I contributed some changes to Tika to remove the dependency on 
Python and rotation.py.  Instead, Tika uses code from Tess4J to figure out how 
much a document is rotated, and then uses ImageMagick to correct the rotation.
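
For context, here is a minimal sketch of what the correction step amounts to: 
the detected skew angle is turned into an ImageMagick -rotate argument.  The 
class and method names are hypothetical, and the sign convention (rotating by 
the negative of the detected angle) and the bare "convert" executable name are 
assumptions for illustration, not a copy of Tika's actual code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class RotateCommandSketch {

    // Hypothetical helper: builds an ImageMagick command line that rotates
    // the image by the negative of the detected skew angle (assumed sign).
    public static List<String> buildRotateCommand(Path source, Path target, double skewAngle) {
        List<String> cmd = new ArrayList<>();
        cmd.add("convert"); // assumption: ImageMagick 7 installs may use "magick" instead
        cmd.add(source.toString());
        cmd.add("-rotate");
        // Locale.ROOT keeps the decimal separator a '.' regardless of system locale
        cmd.add(String.format(Locale.ROOT, "%.3f", -skewAngle));
        cmd.add(target.toString());
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildRotateCommand(
                Paths.get("in.tif"), Paths.get("out.tif"), -6.8);
        // prints: convert in.tif -rotate 6.800 out.tif
        System.out.println(String.join(" ", cmd));
    }
}
```

The point of the sketch is that a wrong angle from the detection step is 
passed straight through, so an upright page reported as -6.8 would end up 
rotated by 6.8 in the output.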

I just found some situations where this code is not working.  I don't know 
enough about the actual math behind all this, so hopefully someone can help.

Below is some test code, which is the same as what Tika is using.  The attached 
file shows a rotation of -6.8 (the unredacted version shows -10) even though it 
should be 0.   Any idea why it's not calculating correctly?

package com.torchai.service.textextractor.service;

import org.apache.tika.parser.ocr.tess4j.ImageDeskew;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class GetAngle {

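    // Returns the skew angle reported by ImageDeskew; readings strictly
    // between -1.0 and 1.0 are treated as no rotation (0.0).
    private static double getAngle(Path sourceFile) throws IOException {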
        BufferedImage bi = ImageIO.read(sourceFile.toFile());
        ImageDeskew id = new ImageDeskew(bi);
        double angle = id.getSkewAngle();
        if (angle < 1.0D && angle > -1.0D) {
            angle = 0.0D;
        } else {
            System.out.println("*** angle: " + angle);
        }

        return angle;
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("/testFiles", "apache-tika-3035541828217823624.tmp");
//        Path path = Paths.get( "/testFiles", "skewed-02_image_text.png");
        System.out.println("*** path: " + path);
        System.out.println("*** getAngle: " + getAngle(path));

    }
}
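
One way to narrow this down might be to run the same getAngle() code against 
a synthetic page whose skew is known in advance.  The sketch below is only an 
assumption about how such a test image could be generated with plain java.awt; 
the class name, the bar layout, and the sizes are all made up for illustration, 
and how the sign of the angle reported by ImageDeskew relates to the angle 
passed here would need to be checked.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class SkewedTestImage {

    // Hypothetical helper: draws evenly spaced dark bars (stand-ins for
    // lines of text) on a white page, rotated about the page center.
    public static BufferedImage create(int width, int height, double angleDegrees) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = img.createGraphics();
        try {
            // White background, painted before any rotation is applied
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);

            // Rotate everything drawn from here on around the page center
            g.rotate(Math.toRadians(angleDegrees), width / 2.0, height / 2.0);

            // Bars cover the middle half of the page, one every 40 pixels
            g.setColor(Color.BLACK);
            for (int y = height / 4; y < 3 * height / 4; y += 40) {
                g.fillRect(width / 4, y, width / 2, 6);
            }
        } finally {
            g.dispose();
        }
        return img;
    }
}
```

Writing the result out with ImageIO.write(img, "png", file) and pointing the 
test code at that file would show whether an unrotated page (angle 0) comes 
back as 0, and whether known angles come back roughly as drawn.  If those 
look right, the problem is more likely specific to the sample document.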






Peter Kronenberg  |  Senior AI Analytic ENGINEER
C: 703.887.5623
[Torch AI]<http://www.torch.ai/>
5250 W 116th Pl, Suite 200., Leawood, KS 66211
WWW.TORCH.AI<http://www.torch.ai/>

