[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101781#comment-16101781 ]

Tim Allison commented on TIKA-2433:
---

All thanks should go to Nick who fixed my npe so quickly. Ugh, on my part. :(

Completely understand need not to change anything now. Cheers!

> Tika 1.16 - Nullpointer Exception after update - Asking for help
>
>
> Key: TIKA-2433
> URL: https://issues.apache.org/jira/browse/TIKA-2433
> Project: Tika
> Issue Type: Bug
> Components: cli
> Affects Versions: 1.16
> Environment: Docker - Debian Stretch - Oracle Java
> +Installation in Dockerfile+
> {noformat}
> ENV TIKA_VERSION 1.16
> # also see https://github.com/LogicalSpark/docker-tikaserver/blob/master/Dockerfile
> RUN mkdir -p /opt/tika && cd /opt/tika && curl --fail http://www-eu.apache.org/dist/tika/tika-app-${TIKA_VERSION}.jar -o tika.jar \
> && curl --fail http://www-eu.apache.org/dist/tika/tika-server-${TIKA_VERSION}.jar -o tika-server.jar \
> && apt-get install -y tesseract-ocr tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu gdal-bin
> {noformat}
> +Tika.xml+
> {noformat}
>
>
>
>
> class="org.apache.tika.parser.ocr.TesseractOCRParser"/>
>
>
>
> {noformat}
> Reporter: Karl Buchta
> Fix For: 1.17
>
>
> Hi,
> i would like to kindly ask for help. We had to update to the latest Tika
> 1.16. I have no experience in Tika so far, i am just maintaining the
> configuration and application from another developer.
> Version 1.15 worked very fine for us. But right now i see following error
> (office is the name of our docker container, hence this output):
> https://github.com/apache/tika/blob/1.16/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java#L202
> {noformat}
> office | java.lang.NullPointerException
> office | at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:202)
> office | at org.apache.tika.cli.TikaCLI$TikaServer$1.run(TikaCLI.java:1153)
> {noformat}
> I have checked the source on github and have seen, that this code part was
> changed with one of the latest commits before the 1.16 release (see link
> above).
> I checked the Change.txt at https://tika.apache.org/1.16/index.html. As i
> haven't used Tika so far, and i cannot see that the CLI requirements changed
> from the release notes, i would like to ask, whether this is the case anyway.
> Do you have some hints on where to start, is this maybe due to improper cli
> usage? Or do you think there is a missing java package or dependency?
> It's hard for me to say, as the cli commands are automated and distributed
> over several layers and configuration files in the application stack, hence i
> am asking for a hint.
> Thx for any advice, best Karl.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101769#comment-16101769 ]

Karl Buchta commented on TIKA-2433:
---

Hi Tim, for sure we will in the future, but right now we are in the phase of going live with many other changes, completely new infrastructure, docker in production, major release, ... That's why we are skipping this part right now.

Thanks for the hint though, i would usually absolutely agree, but everything is burning right now :).

Best, and thx a lot for your kind and fast support, extraordinary!

Karl
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101540#comment-16101540 ]

Tim Allison commented on TIKA-2433:
---

Might recommend migrating to tika-server. As Nick pointed out, we'll be getting rid of this part of the codebase soon.
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101306#comment-16101306 ]

Karl Buchta commented on TIKA-2433:
---

Hi again, my dev lead has decided to downgrade instead of compiling. So my advice is to continue with other work now.

From my side this issue can be closed, we will wait for the 1.17 release.

Best Karl
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100627#comment-16100627 ]

Hudson commented on TIKA-2433:
--

FAILURE: Integrated in Jenkins build Tika-trunk #1338 (See [https://builds.apache.org/job/Tika-trunk/1338/])
TIKA-2433 All non-pipe modes need configuring, otherwise the Tika Server (nick: [https://github.com/apache/tika/commit/f31b7f1e281938c393f159cd4a76f3396291e7b6])
* (edit) tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100206#comment-16100206 ]

Karl Buchta commented on TIKA-2433:
---

Ok, thx for the info. Will try to compile it tomorrow with this revision, and confirm here if it works.
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100166#comment-16100166 ]

Nick Burch commented on TIKA-2433:
--

As it's in a deprecated part of the codebase, I'm not sure we'd do a release just for this fix. We've only just done 1.16, so it may be several months until 1.17 is released
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100162#comment-16100162 ]

Karl Buchta commented on TIKA-2433:
---

Nick Burch, thanks a lot for this information. Will you release this fix at some point? Otherwise we will use the source and try to compile. Will post here tomorrow again.

Awesome help and support, thank you.

Best Karl
[jira] [Resolved] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-2433.
--
    Resolution: Fixed
    Fix Version/s: 1.17

I can reproduce the problem, hopefully fixed in f31b7f1e281938c393f159cd4a76f3396291e7b6

Just be aware that the Tika App server mode will be removed in 2.x, you should swap to the proper RESTful Tika Server fairly soon
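For anyone acting on the advice in this message to move to the RESTful Tika Server, the client side might look roughly like the Ruby sketch below. It is only an illustration under stated assumptions: a tika-server instance is assumed to be reachable over plain HTTP (port 9998 is assumed in the example URL), and its `/tika` resource is assumed to accept a `PUT` of the raw document and answer with extracted text. The helper names `build_tika_request` and `extract_text` are hypothetical and do not come from the reporter's application.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds the PUT request for the server's /tika resource.
# base_url is an assumption, e.g. 'http://localhost:9998'.
def build_tika_request(base_url, data)
  uri = URI.join(base_url, '/tika')
  req = Net::HTTP::Put.new(uri)
  req['Accept'] = 'text/plain'                       # ask for plain-text output
  req['Content-Type'] = 'application/octet-stream'   # avoid Net::HTTP's form-encoded default
  req.body = data
  req
end

# Hypothetical helper: sends the document and returns the response body.
# Raises if the server does not answer with a 2xx status.
def extract_text(base_url, data)
  req = build_tika_request(base_url, data)
  uri = req.uri
  Net::HTTP.start(uri.host, uri.port) do |http|
    res = http.request(req)
    raise "Tika returned HTTP #{res.code}" unless res.is_a?(Net::HTTPSuccess)
    res.body
  end
end
```

Building the request separately from sending it keeps the HTTP details checkable without a running server; a call such as `extract_text('http://localhost:9998', File.binread(path))` would then replace the raw socket exchange.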
[jira] [Comment Edited] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100064#comment-16100064 ]

Karl Buchta edited comment on TIKA-2433 at 7/25/17 1:50 PM:

Sorry i thought this was running via cli as our other services, but this is not the case.

was (Author: karlbuchta):
Sorry i actually thought this was running via cli as our other services, but this is not the case.
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100064#comment-16100064 ]

Karl Buchta commented on TIKA-2433:
---

Sorry i actually thought this was running via cli as our other services, but this is not the case.
[jira] [Comment Edited] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100057#comment-16100057 ]

Karl Buchta edited comment on TIKA-2433 at 7/25/17 1:46 PM:

This is how we start the server:
{noformat}
java -Djava.awt.headless=true -jar /opt/tika/tika.jar --server --port 9100 -t
{noformat}
To be more exact, this is our supervisor config:
{noformat}
[program:tika]
priority=2
command=java -Djava.awt.headless=true -jar /opt/tika/tika.jar --server --port 9100 -t
autorestart=true
stopsignal = KILL
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
{noformat}
We write to and read from the server directly via Ruby sockets. This code did not change during the upgrade, so it worked with 1.15. Apart from this, we do not use the Tika App via the CLI.
{code}
require 'socket'
require 'stringio'

class Tika < ExtractorBase
  def self.run(data, binary = false)
    if binary
      # 'b' reads as binary, so we do not destroy an encoding we do not know
      data = StringIO.new(data, 'rb')
    else
      # TODO: should never be used (text mode)
      data = StringIO.new(data, 'r')
    end
    s = TCPSocket.new(ENV['TIKA_SERVER'], ENV['TIKA_PORT'])
    # send the document in 64 KiB chunks; read returns nil at end of input
    while (chunk = data.read(65536))
      s.write(chunk)
    end
    # half-close the socket so the server sees the end of the input
    s.shutdown(Socket::SHUT_WR)
    resp = ''
    loop do
      chunk = s.recv(65536)
      # test for nil before calling empty? to avoid a NoMethodError
      break if chunk.nil? || chunk.empty?
      resp << chunk
    end
    s.close
    resp
  end
end
{code}
Thank you very much for your quick reply.
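The write, half-close, read pattern described in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions, not the reporter's code: the helper name `send_and_receive` is hypothetical, and it assumes the server on the other end reads until end of input and then writes its whole response before closing.

```ruby
require 'socket'

# Hypothetical helper (not from the original comment): sends `data` to a raw
# TCP endpoint, half-closes the connection so the server sees end of input,
# and returns everything the server sends back until it closes its side.
def send_and_receive(host, port, data)
  Socket.tcp(host, port) do |sock|
    sock.write(data)   # write blocks until the whole string has been sent
    sock.close_write   # same effect as shutdown(Socket::SHUT_WR)
    sock.read          # read until the server closes the connection
  end
end
```

With the setup from the comment, a call might look like `send_and_receive(ENV['TIKA_SERVER'], ENV['TIKA_PORT'], File.binread(path))`, where `path` is whatever document should be extracted. The block form of `Socket.tcp` closes the socket even if an exception is raised.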
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100029#comment-16100029 ] Karl Buchta commented on TIKA-2433: --- I am looking it up within the application stack.
[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
[ https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100021#comment-16100021 ] Nick Burch commented on TIKA-2433: -- What are the arguments you are passing to the Tika App?
[jira] [Created] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help
Karl Buchta created TIKA-2433:

Summary: Tika 1.16 - Nullpointer Exception after update - Asking for help
Key: TIKA-2433
URL: https://issues.apache.org/jira/browse/TIKA-2433
Project: Tika
Issue Type: Bug
Components: cli
Affects Versions: 1.16
Environment: Docker - Debian Stretch - Oracle Java
+Installation in Dockerfile+
{noformat}
ENV TIKA_VERSION 1.16
# also see https://github.com/LogicalSpark/docker-tikaserver/blob/master/Dockerfile
RUN mkdir -p /opt/tika && cd /opt/tika && curl --fail http://www-eu.apache.org/dist/tika/tika-app-${TIKA_VERSION}.jar -o tika.jar \
 && curl --fail http://www-eu.apache.org/dist/tika/tika-server-${TIKA_VERSION}.jar -o tika-server.jar \
 && apt-get install -y tesseract-ocr tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu gdal-bin
{noformat}
+Tika.xml+
{noformat}
{noformat}
Reporter: Karl Buchta

Hi,

I would like to kindly ask for help. We had to update to the latest Tika 1.16. I have no experience with Tika so far; I am just maintaining the configuration and application from another developer.

Version 1.15 worked fine for us. But right now I see the following error (office is the name of our Docker container, hence this output):
https://github.com/apache/tika/blob/1.16/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java#L202
{noformat}
office | java.lang.NullPointerException
office |     at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:202)
office |     at org.apache.tika.cli.TikaCLI$TikaServer$1.run(TikaCLI.java:1153)
{noformat}
I have checked the source on GitHub and have seen that this part of the code was changed in one of the last commits before the 1.16 release (see link above).

I checked the Change.txt at https://tika.apache.org/1.16/index.html. As I haven't used Tika so far, and I cannot see from the release notes that the CLI requirements changed, I would like to ask whether this is the case anyway.

Do you have some hints on where to start? Is this maybe due to improper CLI usage? Or do you think there is a missing Java package or dependency?

It's hard for me to say, as the CLI commands are automated and distributed over several layers and configuration files in the application stack, hence I am asking for a hint.

Thanks for any advice, best Karl.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[ANNOUNCE] Apache Tika 1.16 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.16. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs.

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

Apache Tika 1.16 contains a number of improvements and bug fixes. Details can be found in the changes file:
http://www.apache.org/dist/tika/CHANGES-1.16.txt

Apache Tika is available in source form from the following download page:
http://www.apache.org/dyn/closer.cgi/tika/apache-tika-1.16-src.zip

Apache Tika is also available in binary form or for use with Maven 2 from the Central Repository:
http://repo1.maven.org/maven2/org/apache/tika/

In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site:
https://people.apache.org/keys/group/tika.asc

For more information on Apache Tika, visit the project home page:
http://tika.apache.org/

-- Tim Allison, on behalf of the Apache Tika community
[RESULT][VOTE] Release Apache Tika 1.16 Candidate #1
All,

This VOTE has passed with the following tallies:

+1 PMC
Tim Allison
Chris Mattmann
Dave Meikle
Luís Filipe Nassif
Oleg Tikhonov

+1 Community
JB Data

I'll push the dists out, update the site and send the ANNOUNCE email later today.

Thank you, all!

Cheers,
Tim

From: Tim Allison <talli...@apache.org>
To: "dev@tika.apache.org" <dev@tika.apache.org>; "u...@tika.apache.org" <u...@tika.apache.org>
Sent: Friday, July 7, 2017 10:40 PM
Subject: [VOTE] Release Apache Tika 1.16 Candidate #1

A candidate for the Tika 1.16 release is available at:
https://dist.apache.org/repos/dist/dev/tika/

The release candidate is a zip archive of the sources in:
https://github.com/apache/tika/tree/1.16-rc1

The SHA1 checksum of the archive is e6884af0209ace42bf0b9b59d72c3c5a0052055e

In addition, a staged maven repository is available here:
https://repository.apache.org/content/repositories/orgapachetika-1025

Please vote on releasing this package as Apache Tika 1.16. The vote is open for the next 72 hours and passes if a majority of at least three +1 Tika PMC votes are cast.

[ ] +1 Release this package as Apache Tika 1.16
[ ] -1 Do not release this package because...

This is my +1.

Cheers,
Tim
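For voters who want to check the archive against the SHA1 checksum quoted in the VOTE mail, a comparison could look roughly like the sketch below. The helper name `sha1_matches?` is hypothetical and the local file path is whatever the archive was saved as; this does not replace verifying the PGP signatures.

```ruby
require 'digest/sha1'

# Hypothetical helper: returns true when the SHA-1 hex digest of the file at
# `path` equals `expected_hex`, ignoring case and surrounding whitespace.
def sha1_matches?(path, expected_hex)
  actual = Digest::SHA1.file(path).hexdigest
  actual.casecmp?(expected_hex.strip)
end
```

Assuming the release candidate archive has been downloaded to a local path stored in `archive_path`, the check would be `sha1_matches?(archive_path, 'e6884af0209ace42bf0b9b59d72c3c5a0052055e')`.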
Re: [VOTE] Release Apache Tika 1.16 Candidate #1
[x]+1 Release this package as Apache Tika 1.16 Basic tests and build on Ubuntu 17.04 + Java 8 (Oracle). Thanks, Oleg On Wed, Jul 12, 2017 at 11:03 AM, Dave Meikle <dmei...@apache.org> wrote: > On 8 July 2017 at 03:40, Tim Allison <talli...@apache.org> wrote: > > > > > A candidate for the Tika 1.16 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/ > > > > The release candidate is a zip archive of the sources in: > > https://github.com/apache/tika/tree/1.16-rc1 > > > > The SHA1 checksum of the archive is > > e6884af0209ace42bf0b9b59d72c3c5a0052055e > > > > In addition, a staged maven repository is available here: > > https://repository.apache.org/content/repositories/orgapachetika-1025 > > > > Please vote on releasing this package as Apache Tika 1.16. > > The vote is open for the next 72 hours and passes if a majority of at > > least three +1 Tika PMC votes are cast. > > > > [ ] +1 Release this package as Apache Tika 1.16 > > [ ] -1 Do not release this package because... > > > > > +1 from me. Checksums and signatures good. Built and tested on various > machines using Java 8. Been run in a production workload and all good. > > Cheers, > Dave >
Re: [VOTE] Release Apache Tika 1.16 Candidate #1
On 8 July 2017 at 03:40, Tim Allison <talli...@apache.org> wrote: > > A candidate for the Tika 1.16 release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources in: > https://github.com/apache/tika/tree/1.16-rc1 > > The SHA1 checksum of the archive is > e6884af0209ace42bf0b9b59d72c3c5a0052055e > > In addition, a staged maven repository is available here: > https://repository.apache.org/content/repositories/orgapachetika-1025 > > Please vote on releasing this package as Apache Tika 1.16. > The vote is open for the next 72 hours and passes if a majority of at > least three +1 Tika PMC votes are cast. > > [ ] +1 Release this package as Apache Tika 1.16 > [ ] -1 Do not release this package because... > > +1 from me. Checksums and signatures good. Built and tested on various machines using Java 8. Been run in a production workload and all good. Cheers, Dave
Re: [VOTE] Release Apache Tika 1.16 Candidate #1
I don't think it is needed. Built on Win7, jdk1.8.0_131. Tests passed with and without tesseract 3.05. +1 from me. Regards, Luis 2017-07-10 14:10 GMT-03:00 Allison, Timothy B. <talli...@mitre.org>: > Is this worth a re-spin? > > -Original Message- > From: Allison, Timothy B. [mailto:talli...@mitre.org] > Sent: Monday, July 10, 2017 10:26 AM > To: lfcnas...@gmail.com > Cc: dev@tika.apache.org > Subject: RE: [VOTE] Release Apache Tika 1.16 Candidate #1 > > Y. I need to fix that unit test. Thank you! > > https://issues.apache.org/jira/browse/TIKA-2426 > > From: Luís Filipe Nassif [mailto:lfcnas...@gmail.com] > Sent: Monday, July 10, 2017 9:29 AM > To: u...@tika.apache.org > Cc: dev@tika.apache.org; Tim Allison <talli...@apache.org> > Subject: Re: [VOTE] Release Apache Tika 1.16 Candidate #1 > > OK, that is a Locale issue, working around... > > 2017-07-10 10:24 GMT-03:00 Luís Filipe Nassif <lfcnas...@gmail.com lfcnas...@gmail.com>>: > I got the following failure on Window7, jdk1.8.0_131, in > OOXMLParserTest.testXLSBVarious:1537. > Any ideas? > > Failed tests: > OOXMLParserTest.testXLSBVarious:1537->TikaTest.assertContains:102 > 13.1211231321 not found in: > http://www.w3.org/1999/xhtml;> > > name="extended-properties:AppVersion" content="16.0300" /> name="dc:creator" content="Allison, Timothy B." /> name="extended-properties:Company" content="" /> name="dcterms:created" content="2017-03-09T12:24:26Z" /> name="Last-Modified" content="2017-03-10T14:58:49Z" /> name="dcterms:modified" content="2017-03-10T14:58:49Z" /> name="Last-Save-Date" content="2017-03-10T14:58:49Z" /> name="protected" content="false" /> content="2017-03-10T14:58:49Z" /> content="Microsoft Excel" /> content="2017-03-10T14:58:49Z" /> content="application/vnd.ms-excel.sheet.binary.macroenabled.12" /> name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" /> > content="org.apache.tika.parser.microsoft.ooxml.OOXMLParser" > /> name="meta:author" content="Allison, Timothy B." 
/> name="meta:creation-date" content="2017-03-09T12:24:26Z" /> name="extended-properties:Application" content="Microsoft Excel" /> name="meta:last-author" content="Allison, Timothy B." /> name="Creation-Date" content="2017-03-09T12:24:26Z" /> name="Last-Author" content="Allison, Timothy B." /> name="X-TIKA:origResourceName" > content="C:\Users\tallison\Desktop\working\xlsb\" > /> name="Author" content="Allison, Timothy B." /> content="" /> > mySheet1 String > This is a string integer 13 > float 13,1211231321 > currency $ 0,003,,03.00 > percent 20% > float 2 13,12 long int > 123456789012345 longer int > 1,23456789012345E+15 Allison, Timothy B.: Allison, > Timothy B.: > test comment2 > > fraction 1/4 date > 3/9/17 comment contents Allison, > Timothy B.: Allison, Timothy B.: > test comment > > hyperlink tika_link formula > 4 2 formulaErr ERROR > formulaFloat 0,5 March April > customFormat146/1963 merchant1 > 1 3 > customFormat2 3/128 merchant2 2 > 4 text test Allison, > Timothy B.: Allison, Timothy B.: > comment1 > > > Allison, Timothy B.: Allison, Timothy B.: > comment2 > > > Allison, Timothy B.: Allison, Timothy B.: > comment3 > > the > Allison, Timothy B.: Allison, Timothy B.: > comment4 (end of row) > > the > Allison, Timothy B.: Allison, Timothy B.: > comment5 between cells > quick > comment6 > Allison, Timothy B.: Allison, Timothy B.: > comment6 actually in cell > > > Allison, Timothy B.: Allison, Timothy B.: > comment7 end of file > > > Allison, Timothy B.: Allison, Timothy B.: > comment8 end of file > > OddLeftHeader OddCenterHeader OddRightHeader EvenLeftHeader > EvenCenterHeader EvenRightHeader FirstPageLeftHeader > FirstPageCenterHeader FirstPageRightHeader OddLeftFooter > OddCenterFooter OddRightFooter EvenLeftFooter EvenCenterFooter > EvenRightFooter FirstPageLeftFooter FirstPageCenterFooter > FirstPageRightFooter test textbox http://lucene.apache.org/;>http://lucene.apache.org/ > myChartTitle > > merchant1 March April 1 3 merchant2 March April 2 4 /> 
test WordArt myChartTitle > merchant1 March April 1 3 merchant2 March April 2 4 /> myChartTitle >
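The failure quoted above expects 13.1211231321 while the extracted text contains 13,1211231321, which the thread puts down to a Locale issue. As a rough illustration only (this is not the TIKA-2426 fix, and the helper name `contains_decimal?` is made up for this sketch), a check that tolerates either decimal separator could look like this:

```ruby
# Hypothetical helper: true when `text` contains `number` written with either
# '.' or ',' as the decimal separator. `number` is given with a '.' separator.
def contains_decimal?(text, number)
  # Escape the literal number, then allow both separators where the dot was.
  pattern = Regexp.escape(number).sub('\.', '[.,]')
  text.match?(/#{pattern}/)
end
```

The other common approach is to pin the locale used when formatting, so that the output does not depend on the machine running the tests.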
RE: [VOTE] Release Apache Tika 1.16 Candidate #1
Is this worth a re-spin?

-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Monday, July 10, 2017 10:26 AM
To: lfcnas...@gmail.com
Cc: dev@tika.apache.org
Subject: RE: [VOTE] Release Apache Tika 1.16 Candidate #1

Y. I need to fix that unit test. Thank you!
https://issues.apache.org/jira/browse/TIKA-2426
RE: [VOTE] Release Apache Tika 1.16 Candidate #1
Y. I need to fix that unit test. Thank you!
https://issues.apache.org/jira/browse/TIKA-2426

From: Luís Filipe Nassif [mailto:lfcnas...@gmail.com]
Sent: Monday, July 10, 2017 9:29 AM
To: u...@tika.apache.org
Cc: dev@tika.apache.org; Tim Allison <talli...@apache.org>
Subject: Re: [VOTE] Release Apache Tika 1.16 Candidate #1

OK, that is a Locale issue, working around...
Re: [VOTE] Release Apache Tika 1.16 Candidate #1
OK, that is a Locale issue, working around...

2017-07-10 10:24 GMT-03:00 Luís Filipe Nassif <lfcnas...@gmail.com>:

> I got the following failure on Window7, jdk1.8.0_131, in
> OOXMLParserTest.testXLSBVarious:1537.
> Any ideas?
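The locale issue mentioned in this message can be illustrated with a minimal Java sketch. It is not Tika's actual formatting code; the class name `LocaleFormatDemo`, the pattern string, and the choice of `pt-BR` as the failing locale are assumptions made for illustration only. It shows why an assertion that looks for `13.1211231321` can fail on a machine whose default locale uses a comma as the decimal separator.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class, not part of Tika.
public class LocaleFormatDemo {

    // Formats a value with up to 10 fraction digits, using the
    // decimal separator of the given locale.
    static String format(double value, Locale locale) {
        DecimalFormat df = new DecimalFormat(
                "0.##########", DecimalFormatSymbols.getInstance(locale));
        return df.format(value);
    }

    public static void main(String[] args) {
        double value = 13.1211231321;
        // Locale.US uses '.' as the decimal separator.
        System.out.println(format(value, Locale.US));
        // pt-BR (assumed here) uses ',' as the decimal separator.
        System.out.println(format(value, Locale.forLanguageTag("pt-BR")));
    }
}
```

A test that compares formatted numbers is only stable across machines if the locale is pinned, either in the code under test or in the test setup.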
Re: [VOTE] Release Apache Tika 1.16 Candidate #1
I got the following failure on Windows 7, jdk1.8.0_131, in OOXMLParserTest.testXLSBVarious:1537. Any ideas?

Failed tests:
OOXMLParserTest.testXLSBVarious:1537->TikaTest.assertContains:102 13.1211231321 not found in:
http://www.w3.org/1999/xhtml
mySheet1
String This is a string
integer 13
float 13,1211231321
currency $ 0,003,,03.00
percent 20%
float 2 13,12
long int 123456789012345
longer int 1,23456789012345E+15
Allison, Timothy B.: Allison, Timothy B.: test comment2
fraction 1/4
date 3/9/17
comment contents
Allison, Timothy B.: Allison, Timothy B.: test comment
hyperlink tika_link
formula 4 2
formulaErr ERROR
formulaFloat 0,5 March April
customFormat146/1963 merchant1 1 3
customFormat2 3/128 merchant2 2 4
text test
Allison, Timothy B.: Allison, Timothy B.: comment1
Allison, Timothy B.: Allison, Timothy B.: comment2
Allison, Timothy B.: Allison, Timothy B.: comment3
the
Allison, Timothy B.: Allison, Timothy B.: comment4 (end of row)
the
Allison, Timothy B.: Allison, Timothy B.: comment5 between cells
quick
comment6
Allison, Timothy B.: Allison, Timothy B.: comment6 actually in cell
Allison, Timothy B.: Allison, Timothy B.: comment7 end of file
Allison, Timothy B.: Allison, Timothy B.: comment8 end of file
OddLeftHeader OddCenterHeader OddRightHeader
EvenLeftHeader EvenCenterHeader EvenRightHeader
FirstPageLeftHeader FirstPageCenterHeader FirstPageRightHeader
OddLeftFooter OddCenterFooter OddRightFooter
EvenLeftFooter EvenCenterFooter EvenRightFooter
FirstPageLeftFooter FirstPageCenterFooter FirstPageRightFooter
test textbox
http://lucene.apache.org/
myChartTitle
merchant1 March April 1 3 merchant2 March April 2 4
test WordArt
myChartTitle
merchant1 March April 1 3 merchant2 March April 2 4
myChartTitle
merchant1 March April 1 3 merchant2 March April 2 4
http://tika.apache.org/

2017-07-10 10:17 GMT-03:00 JB Data <jbdat...@gmail.com>:

> +1.
> No regression in my 1.15 env <http://jbigdata.fr/jbigdata/ged-02.html>.
> Test docx chart extraction (TIKA-2254): OK.
>
> @*JB*Δ <http://jbigdata.fr>
>
> 2017-07-08 22:29 GMT+02:00 Chris Mattmann <mattm...@apache.org>:
>
>> +1 from me SIGS and CHECKSUMS look good.
>>
>> Thanks Tim!
>>
>> Cheers,
>> Chris
RE: [VOTE] Release Apache Tika 1.16 Candidate #1
Doh. Thank you!

From: JB Data [mailto:jbdat...@gmail.com]
Sent: Saturday, July 8, 2017 2:34 AM
To: u...@tika.apache.org; Tim Allison <talli...@apache.org>
Cc: dev@tika.apache.org
Subject: Re: [VOTE] Release Apache Tika 1.16 Candidate #1

Warn: link https://github.com/apache/tika/tree/1.16-rc1<https://github.com/apache/tika/tree/1.15-rc1> "hrefs" to the 1.15-rc1.

@JBΔ<http://jbigdata.fr>

2017-07-08 4:40 GMT+02:00 Tim Allison <talli...@apache.org>:

A candidate for the Tika 1.16 release is available at:
https://dist.apache.org/repos/dist/dev/tika/

The release candidate is a zip archive of the sources in:
https://github.com/apache/tika/tree/1.16-rc1<https://github.com/apache/tika/tree/1.15-rc1>

The SHA1 checksum of the archive is
e6884af0209ace42bf0b9b59d72c3c5a0052055e

In addition, a staged maven repository is available here:
https://repository.apache.org/content/repositories/orgapachetika-1025

Please vote on releasing this package as Apache Tika 1.16.
The vote is open for the next 72 hours and passes if a majority of at least three +1 Tika PMC votes are cast.

[ ] +1 Release this package as Apache Tika 1.16
[ ] -1 Do not release this package because...

This is my +1.

Cheers,
Tim
Re: [VOTE] Release Apache Tika 1.16 Candidate #1
+1 from me SIGS and CHECKSUMS look good.

Thanks Tim!

Cheers,
Chris

LMC-053601:apache-tika-1.16-rc1 mattmann$ for type in "" \-app \-eval \-server; do $HOME/bin/stage_apache_rc tika$type 1.16 https://dist.apache.org/repos/dist/dev/tika/; done
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 53.5M  100 53.5M    0     0  3992k      0  0:00:13  0:00:13 --:--:-- 5122k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   836  100   836    0     0   1092      0 --:--:-- --:--:-- --:--:--  1092
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    34  100    34    0     0     96      0 --:--:-- --:--:-- --:--:--    96
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 41.6M  100 41.6M    0     0  6578k      0  0:00:06  0:00:06 --:--:-- 8297k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   836  100   836    0     0   1012      0 --:--:-- --:--:-- --:--:--  1012
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    34  100    34    0     0     46      0 --:--:-- --:--:-- --:--:--    46
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 56.4M  100 56.4M    0     0  3950k      0  0:00:14  0:00:14 --:--:-- 4742k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   836  100   836    0     0   1470      0 --:--:-- --:--:-- --:--:--  1469
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    34  100    34    0     0     65      0 --:--:-- --:--:-- --:--:--    65

LMC-053601:apache-tika-1.16-rc1 mattmann$ $HOME/bin/stage_apache_rc tika 1.16-src https://dist.apache.org/repos/dist/dev/tika/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 84.2M  100 84.2M    0     0  6563k      0  0:00:13  0:00:13 --:--:-- 5261k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   836  100   836    0     0   2129      0 --:--:-- --:--:-- --:--:--  2127
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    34  100    34    0     0     47      0 --:--:-- --:--:-- --:--:--    47

LMC-053601:apache-tika-1.16-rc1 mattmann$ ls
tika-1.16-src.zip      tika-app-1.16.jar      tika-eval-1.16.jar      tika-server-1.16.jar
tika-1.16-src.zip.asc  tika-app-1.16.jar.asc  tika-eval-1.16.jar.asc  tika-server-1.16.jar.asc
tika-1.16-src.zip.md5  tika-app-1.16.jar.md5  tika-eval-1.16.jar.md5  tika-server-1.16.jar.md5

LMC-053601:apache-tika-1.16-rc1 mattmann$ $HOME/bin/verify_gpg_sigs
Verifying Signature for file tika-1.16-src.zip.asc
gpg: assuming signed data in `tika-1.16-src.zip'
gpg: Signature made Fri Jul 7 19:27:42 2017 PDT using RSA key ID EF0CF38A
gpg: Good signature from "Tim Allison (ASF signing key) <talli...@apache.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 833C 1CC4 926C 1DDE 29BB 8731 E403 2DC4 EF0C F38A
Verifying Signature for file tika-app-1.16.jar.asc
gpg: assuming signed data in `tika-app-1.16.jar'
gpg: Signature made Fri Jul 7 19:13:16 2017 PDT using RSA key ID EF0CF38A
gpg: Good signature from "Tim Allison (ASF signing key) <talli...@apache.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 833C 1CC4 926C 1DDE 29BB 8731 E403 2DC4 EF0C F38A
Verifying Signature for file tika-eval-1.16.jar.asc
gpg: assuming signed data in `tika-eval-1.16.jar'
gpg: Signature made Fri Jul 7 19:20:17 2017 PDT using RSA key ID EF0CF38A
gpg: Good signature from "Tim Allison (ASF signing key) <talli...@apache.org>"
gpg: WARNIN
[VOTE] Release Apache Tika 1.16 Candidate #1
A candidate for the Tika 1.16 release is available at:
https://dist.apache.org/repos/dist/dev/tika/

The release candidate is a zip archive of the sources in:
https://github.com/apache/tika/tree/1.16-rc1

The SHA1 checksum of the archive is
e6884af0209ace42bf0b9b59d72c3c5a0052055e

In addition, a staged maven repository is available here:
https://repository.apache.org/content/repositories/orgapachetika-1025

Please vote on releasing this package as Apache Tika 1.16.
The vote is open for the next 72 hours and passes if a majority of at least three +1 Tika PMC votes are cast.

[ ] +1 Release this package as Apache Tika 1.16
[ ] -1 Do not release this package because...

This is my +1.

Cheers,
Tim
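For voters who want to check the SHA1 checksum quoted in the vote message, a minimal Java sketch follows. It is only one way to do the check, not the project's release tooling; the class name `Sha1Check` and the command-line argument layout are hypothetical. It streams a file through `java.security.MessageDigest` and compares the hex digest with an expected value.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of Tika or its release scripts.
public class Sha1Check {

    // Reads the stream to the end and returns its SHA-1 digest as lowercase hex.
    static String sha1Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            // Mask to an unsigned value so each byte prints as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Usage (assumed layout): java Sha1Check <archive-path> <expected-sha1>
    public static void main(String[] args) throws Exception {
        String expected = args[1];
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String actual = sha1Hex(in);
            System.out.println(actual.equalsIgnoreCase(expected) ? "OK" : "MISMATCH: " + actual);
        }
    }
}
```

Run against the downloaded source archive, the second argument would be the checksum from the vote message, e6884af0209ace42bf0b9b59d72c3c5a0052055e. A matching digest only shows the download is intact; the GPG signature check is still what ties the artifact to the release manager's key.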
Re: Tika 1.16?
FWIW, I believe there is some history on this but can’t find it right now. I thought we discussed X.Y.Z versioning at one point and decided against it. I have no objections to it however.

Aloha,
Chris

On 6/2/17, 2:39 PM, "Tyler Bui-Palsulich" wrote:

+1 to 1.15.1. It would also be nice to be able to have "cheap" security releases as they come up.

Tyler

On Jun 2, 2017 6:12 AM, "Bob Paulin" wrote:

> Would be breaking a bit from the current release numbering but I'd fully
> support moving to semantic versioning. +1 to a 1.15.1
>
> - Bob
>
> On 6/2/2017 8:06 AM, Luís Filipe Nassif wrote:
> > Maybe 1.15.1?
> >
> > On Jun 1, 2017 10:03 AM, "Bob Paulin" wrote:
> >
> >> +1
> >>
> >> On 6/1/2017 6:50 AM, Allison, Timothy B. wrote:
> >>> Given the broken OSGi and the org.json issues with 1.15, does it make
> >>> sense to aim for 1.16 fairly soon, say 3-4 weeks?
> >>>
> >>> Cheers,
> >>>
> >>> Tim
Re: Tika 1.16?
+1 to 1.15.1. It would also be nice to be able to have "cheap" security releases as they come up.

Tyler

On Jun 2, 2017 6:12 AM, "Bob Paulin" wrote:

> Would be breaking a bit from the current release numbering but I'd fully
> support moving to semantic versioning. +1 to a 1.15.1
>
> - Bob
>
> On 6/2/2017 8:06 AM, Luís Filipe Nassif wrote:
> > Maybe 1.15.1?
> >
> > On Jun 1, 2017 10:03 AM, "Bob Paulin" wrote:
> >
> >> +1
> >>
> >> On 6/1/2017 6:50 AM, Allison, Timothy B. wrote:
> >>> Given the broken OSGi and the org.json issues with 1.15, does it make
> >>> sense to aim for 1.16 fairly soon, say 3-4 weeks?
> >>>
> >>> Cheers,
> >>>
> >>> Tim
Re: Tika 1.16?
Maybe 1.15.1?

On Jun 1, 2017 10:03 AM, "Bob Paulin" wrote:

> +1
>
> On 6/1/2017 6:50 AM, Allison, Timothy B. wrote:
> > Given the broken OSGi and the org.json issues with 1.15, does it make
> > sense to aim for 1.16 fairly soon, say 3-4 weeks?
> >
> > Cheers,
> >
> > Tim
Re: Tika 1.16?
+1

On 6/1/2017 6:50 AM, Allison, Timothy B. wrote:
> Given the broken OSGi and the org.json issues with 1.15, does it make sense
> to aim for 1.16 fairly soon, say 3-4 weeks?
>
> Cheers,
>
> Tim
Tika 1.16?
Given the broken OSGi and the org.json issues with 1.15, does it make sense to aim for 1.16 fairly soon, say 3-4 weeks? Cheers, Tim