I posted this back in January on the user list, but never got a response. The problem is rearing its ugly head again. Anyone have any ideas?
Sample file here: https://www.dropbox.com/s/ahke8boksmk2f94/sample%20rotated%20image%20-%20Redacted.tif?dl=0

----------------------

Detecting angle of rotation

Last year, I contributed some changes to Tika to remove the dependency on Python and rotation.py. Instead, Tika uses code from Tess4J to figure out how much a document is rotated, and then uses ImageMagick to correct the rotation.

I just found some situations where this code is not working. I don't know enough about the actual math behind all this, so hopefully someone can help.

Below is some test code, which is the same as what Tika is using. The attached file shows a rotation of -6.8 (the unredacted version shows -10) even though it should be 0. Any idea why it's not calculating correctly?

    package com.torchai.service.textextractor.service;

    import org.apache.tika.parser.ocr.tess4j.ImageDeskew;

    import javax.imageio.ImageIO;
    import java.awt.image.BufferedImage;
    import java.io.IOException;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class GetAngle {

        // Returns the skew angle reported by ImageDeskew, treating anything
        // strictly between -1.0 and 1.0 as no rotation.
        private static double getAngle(Path sourceFile) throws IOException {
            BufferedImage bi = ImageIO.read(sourceFile.toFile());
            ImageDeskew id = new ImageDeskew(bi);
            double angle = id.getSkewAngle();

            if (angle < 1.0D && angle > -1.0D) {
                angle = 0.0D;
            } else {
                System.out.println("*** angle: " + angle);
            }
            return angle;
        }

        public static void main(String[] args) throws IOException {
            Path path = Paths.get("/testFiles", "apache-tika-3035541828217823624.tmp");
            // Path path = Paths.get("/testFiles", "skewed-02_image_text.png");
            System.out.println("*** path: " + path);
            System.out.println("*** getAngle: " + getAngle(path));
        }
    }

Peter Kronenberg | Senior AI Analytic Engineer
C: 703.887.5623
Torch AI <http://www.torch.ai/>
5250 W 116th Pl, Suite 200, Leawood, KS 66211
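
The snapping step in getAngle (anything strictly between -1.0 and 1.0 is treated as no rotation) can be checked on its own, without Tika or an image on hand. This is only a minimal sketch: the class name SkewThreshold, the method snapToZero, and the threshold parameter are hypothetical and not part of Tika or Tess4J. It shows that a reported angle of -6.8 passes through the threshold unchanged, so the unexpected value would have to come from the skew detection itself rather than from this check.

```java
// Hypothetical helper, not part of Tika or Tess4J: isolates the
// "small angles count as zero" rule from the test code above.
final class SkewThreshold {

    // Returns 0.0 when the angle lies strictly between -threshold and
    // +threshold; otherwise returns the angle unchanged.
    static double snapToZero(double angle, double threshold) {
        if (Math.abs(angle) < threshold) {
            return 0.0D;
        }
        return angle;
    }

    public static void main(String[] args) {
        // Value reported for the redacted sample file.
        System.out.println(snapToZero(-6.8D, 1.0D)); // prints -6.8
        // A small angle is snapped to zero.
        System.out.println(snapToZero(0.5D, 1.0D));  // prints 0.0
        // The boundary itself is not snapped, matching the strict comparison.
        System.out.println(snapToZero(1.0D, 1.0D));  // prints 1.0
    }
}
```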