I'm attaching the pom. I can't remember if attachments get stripped. If they do, I'll copy+paste.
Apache Maven 3.8.7 (b89d5959fcde851dcb1c8946a785a163f14e1e29)
temurin-17-jdk-amd64

On Thu, Apr 4, 2024 at 3:16 AM Gerardo Hernandez <[email protected]> wrote:
>
> Hi Tim,
>
> Did you use the exact same pom I shared, or a custom one? If the second,
> could you please share it so I can verify if something missing on mine.
>
> Also, what jdk/maven versions are you using?
>
> Tilman, I get the expected string when printing
> System.out.println(org.apache.tika.parser.pdf.PDFParser.PASSWORD); on both
> 2.7.0 and 2.8.0+
>
> Thanks, and regards,
> Gerardo
> ________________________________
> From: Tim Allison <[email protected]>
> Sent: Wednesday, April 3, 2024 06:43 AM
> To: [email protected] <[email protected]>
> Subject: Re: AutoDetectParser not working after upgrading from 2.7.0 to 2.8.0+
>
> Y, I'm not able to repro this problem with 2.8.0 or higher. I'm seeing
> 239 parsers (probably diff from Tilman because of installed external
> parsers?).
>
> On Wed, Apr 3, 2024 at 5:09 AM Tilman Hausherr <[email protected]> wrote:
> >
> > On 03.04.2024 08:55, Gerardo Hernandez wrote:
> > > On 2.7.0, I get a list of 203 parsers, and the file is parser
> > > successfully:
> >
> > I get 227 parsers with 2.9.2. My pom.xml is somewhat different. The main
> > part is
> >
> >
> > <dependencies>
> >     <dependency>
> >         <groupId>org.apache.tika</groupId>
> >         <artifactId>tika-core</artifactId>
> >         <version>${tika.version}</version>
> >     </dependency>
> >     <dependency>
> >         <groupId>org.apache.tika</groupId>
> >         <artifactId>tika-parsers-standard-package</artifactId>
> >         <version>${tika.version}</version>
> >     </dependency>
> >     <dependency>
> >         <groupId>org.slf4j</groupId>
> >         <artifactId>slf4j-simple</artifactId>
> >         <version>${slf4j.version}</version>
> >     </dependency>
> >     <dependency>
> >         <groupId>org.bouncycastle</groupId>
> >         <artifactId>bcprov-jdk18on</artifactId>
> >         <version>${bouncycastle.version}</version>
> >     </dependency>
> > </dependencies>
> >
> > What happens if you add this on top of your code?
> >
> > System.out.println(org.apache.tika.parser.pdf.PDFParser.PASSWORD);
> >
> > it should output "org.apache.tika.parser.pdf.password". This is to test
> > if the PDF parser is in your class path.
> >
> > Tilman
> >

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>tika-test</artifactId>
  <version>1.0-SNAPSHOT</version>
  <name>Archetype - tika-test</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.release>13</maven.compiler.release>
  </properties>
  <dependencies>
    <!-- 239 for 2.8.0 -->
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-core</artifactId>
      <version>2.7.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-parsers-standard-package</artifactId>
      <version>2.7.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-slf4j-impl</artifactId>
      <version>2.17.2</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.36</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-api</artifactId>
      <version>2.17.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <version>2.17.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-1.2-api</artifactId>
      <version>2.17.2</version>
    </dependency>
  </dependencies>
</project>
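
For anyone wanting a runtime variant of the classpath check Tilman suggests in the quoted thread, here is a minimal sketch that looks the class up reflectively instead of referencing it directly. If PASSWORD is a compile-time constant, javac may inline its value, in which case the println only proves the class was there at compile time. The class name ClasspathCheck and the helper staticString are hypothetical names made up for this example; they are not part of Tika.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Optional;

// Hypothetical helper, not part of Tika: checks at runtime whether a class
// is visible on the classpath and reads one of its public static String fields.
public class ClasspathCheck {

    // Returns the field value, or empty if the class or field cannot be
    // loaded, is not static, or does not hold a String.
    static Optional<String> staticString(String className, String fieldName) {
        try {
            Class<?> cls = Class.forName(className);
            Field field = cls.getField(fieldName);
            if (!Modifier.isStatic(field.getModifiers())) {
                return Optional.empty();
            }
            Object value = field.get(null);
            if (value instanceof String) {
                return Optional.of((String) value);
            }
            return Optional.empty();
        } catch (ReflectiveOperationException | LinkageError e) {
            // Class missing, field missing, not accessible, or failed to link.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Class and field names are the ones quoted in the thread.
        Optional<String> password =
                staticString("org.apache.tika.parser.pdf.PDFParser", "PASSWORD");
        if (password.isPresent()) {
            System.out.println("PDFParser found, PASSWORD = " + password.get());
        } else {
            System.out.println("PDFParser not found on the runtime classpath");
        }
    }
}
```

Run it with the same classpath as the failing program (for example from the same Maven module), so the result reflects what AutoDetectParser would actually see.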
