[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-26 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101781#comment-16101781
 ] 

Tim Allison commented on TIKA-2433:
---

All thanks should go to Nick, who fixed my NPE so quickly.  Ugh, on my part. :(

I completely understand the need not to change anything now.

Cheers!

> Tika 1.16 - Nullpointer Exception after update - Asking for help
> 
>
> Key: TIKA-2433
> URL: https://issues.apache.org/jira/browse/TIKA-2433
> Project: Tika
> Issue Type: Bug
> Components: cli
> Affects Versions: 1.16
> Environment: Docker - Debian Stretch - Oracle Java
> +Installation in Dockerfile+
> {noformat}
> ENV TIKA_VERSION 1.16
> # also see https://github.com/LogicalSpark/docker-tikaserver/blob/master/Dockerfile
> RUN mkdir -p /opt/tika && cd /opt/tika && curl --fail http://www-eu.apache.org/dist/tika/tika-app-${TIKA_VERSION}.jar -o tika.jar \
>  && curl --fail http://www-eu.apache.org/dist/tika/tika-server-${TIKA_VERSION}.jar -o tika-server.jar \
>  && apt-get install -y tesseract-ocr tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu gdal-bin
> {noformat}
> +Tika.xml+
> {noformat}
> 
> 
> 
> 
>  class="org.apache.tika.parser.ocr.TesseractOCRParser"/>
> 
> 
> 
> {noformat}
> Reporter: Karl Buchta
> Fix For: 1.17
>
>
> Hi,
> I would like to kindly ask for help. We had to update to the latest Tika 
> 1.16. I have no experience with Tika so far; I am just maintaining the 
> configuration and application from another developer.
> Version 1.15 worked fine for us, but right now I see the following error 
> (office is the name of our Docker container, hence the prefix in this output):
> https://github.com/apache/tika/blob/1.16/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java#L202
> {noformat}
> office | java.lang.NullPointerException
> office |  at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:202)
> office |  at org.apache.tika.cli.TikaCLI$TikaServer$1.run(TikaCLI.java:1153)
> {noformat}
> I have checked the source on GitHub and have seen that this part of the code 
> was changed in one of the last commits before the 1.16 release (see link 
> above).
> I checked the Change.txt at https://tika.apache.org/1.16/index.html. As I 
> have not used Tika so far, and I cannot see from the release notes that the 
> CLI requirements changed, I would like to ask whether this is the case anyway. 
> Do you have some hints on where to start? Is this maybe due to improper CLI 
> usage? Or do you think there is a missing Java package or dependency?
> It's hard for me to say, as the CLI commands are automated and distributed 
> over several layers and configuration files in the application stack, hence I 
> am asking for a hint.
> Thx for any advice, best Karl.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-26 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101769#comment-16101769
 ] 

Karl Buchta commented on TIKA-2433:
---

Hi Tim,

We certainly will in the future, but right now we are going live with many 
other changes: completely new infrastructure, Docker in production, a major 
release, ...

That's why we are skipping this part right now.

Thanks for the hint though. I would usually absolutely agree, but everything is 
burning right now :).

Best, and thx a lot for your kind and fast support, extraordinary!

Karl



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-26 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101540#comment-16101540
 ] 

Tim Allison commented on TIKA-2433:
---

Might recommend migrating to tika-server.  As Nick pointed out, we'll be 
getting rid of this part of the codebase soon.
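
For readers weighing that migration, here is a minimal Ruby sketch of what a client of tika-server might look like. It assumes a tika-server instance on its default port 9998 and uses the server's `PUT /tika` endpoint with an `Accept: text/plain` header. The `TIKA_SERVER_URL` variable and the helper names `build_tika_request` and `extract_text` are hypothetical and do not come from this thread.

```ruby
require 'net/http'
require 'uri'

# Assumption: tika-server runs locally on its default port (9998).
# TIKA_SERVER_URL is a hypothetical environment variable for this sketch.
TIKA_SERVER_URL = ENV.fetch('TIKA_SERVER_URL', 'http://localhost:9998')

# Build a PUT /tika request carrying the raw document bytes.
# Returns the target URI together with the prepared request.
def build_tika_request(data, base_url = TIKA_SERVER_URL)
  uri = URI.join(base_url, '/tika')
  request = Net::HTTP::Put.new(uri)
  request['Accept'] = 'text/plain' # ask for plain-text extraction
  request.body = data
  [uri, request]
end

# Send the document to tika-server and return the extracted text.
def extract_text(data, base_url = TIKA_SERVER_URL)
  uri, request = build_tika_request(data, base_url)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  raise "tika-server returned #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end
```

With a server running, `extract_text(File.binread('report.pdf'))` would return the extracted text; the file name is only an example. Compared with the raw socket mode, HTTP gives the client status codes to check instead of an abruptly closed connection.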



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-26 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101306#comment-16101306
 ] 

Karl Buchta commented on TIKA-2433:
---

Hi again,

My dev lead has decided to downgrade instead of compiling.
So I have been advised to continue with other work now.

From my side this issue can be closed, we will wait for the 1.17 release.

Best Karl




[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100627#comment-16100627
 ] 

Hudson commented on TIKA-2433:
--

FAILURE: Integrated in Jenkins build Tika-trunk #1338 (See 
[https://builds.apache.org/job/Tika-trunk/1338/])
TIKA-2433 All non-pipe modes need configuring, otherwise the Tika Server (nick: 
[https://github.com/apache/tika/commit/f31b7f1e281938c393f159cd4a76f3396291e7b6])
* (edit) tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java




[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100206#comment-16100206
 ] 

Karl Buchta commented on TIKA-2433:
---

Ok, thx for the info. Will try to compile it tomorrow with this revision, and 
confirm here if it works.



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100166#comment-16100166
 ] 

Nick Burch commented on TIKA-2433:
--

As it's in a deprecated part of the codebase, I'm not sure we'd do a release 
just for this fix. We've only just done 1.16, so it may be several months until 
1.17 is released.



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100162#comment-16100162
 ] 

Karl Buchta commented on TIKA-2433:
---

Nick Burch, thanks a lot for this information. Will you release this fix at 
some point?

Otherwise we will use the source and try to compile.
Will post here tomorrow again.

Awesome help and support, thank you.

Best Karl



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100064#comment-16100064
 ] 

Karl Buchta commented on TIKA-2433:
---

Sorry, I actually thought this was running via the CLI like our other services, 
but this is not the case.



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100057#comment-16100057
 ] 

Karl Buchta commented on TIKA-2433:
---

This is how we start the server:

{noformat}
java -Djava.awt.headless=true -jar /opt/tika/tika.jar --server --port 9100 -t
{noformat}


To be more exact, this is our supervisor config:

{noformat}
[program:tika]
priority=2
command=java -Djava.awt.headless=true -jar /opt/tika/tika.jar --server --port 9100 -t
autorestart=true
stopsignal = KILL
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
{noformat}


We write to and read from the server directly via Ruby sockets.
This code did not change during the upgrade, meaning it worked with 
1.15.
We do not use Tika via the CLI otherwise, only this way.
{code:ruby}
require 'socket'
require 'stringio'

class Tika < ExtractorBase
  CHUNK_SIZE = 65536

  def self.run(data, binary = false)
    # Binary mode ('rb') avoids destroying an encoding we do not know.
    # TODO: the text-mode branch should never be used
    data = StringIO.new(data, binary ? 'rb' : 'r')

    s = TCPSocket.new(ENV['TIKA_SERVER'], ENV['TIKA_PORT'])
    # Stream the document to the server in chunks.
    while (chunk = data.read(CHUNK_SIZE))
      s.write(chunk)
    end
    # Half-close the socket so the server sees the end of the input.
    s.shutdown(Socket::SHUT_WR)

    # Read the response until the server closes the connection.
    resp = ''
    loop do
      chunk = s.recv(CHUNK_SIZE)
      # Check for nil first: calling empty? on nil would raise.
      break if chunk.nil? || chunk.empty?
      resp << chunk
    end
    resp
  ensure
    s&.close
  end
end
{code}
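
The protocol used by the snippet above (write everything, half-close the socket, then read until the server closes) can be tried without Tika. The following is a minimal sketch, not the application's actual code: `roundtrip` and `start_stub_server` are hypothetical helpers, and the stub server simply upper-cases its input as a stand-in for a real Tika socket server.

```ruby
require 'socket'

# Hypothetical helper: send +payload+ to host:port, half-close the
# connection, and return everything the peer writes back before closing.
def roundtrip(host, port, payload)
  socket = TCPSocket.new(host, port)
  socket.write(payload)
  socket.shutdown(Socket::SHUT_WR) # signal end of input; reading stays open
  socket.read                      # blocks until the peer closes its side
ensure
  socket&.close
end

# Stand-in server for local experiments only: it reads until the client
# half-closes, replies with the upper-cased input, and closes.
def start_stub_server
  server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
  thread = Thread.new do
    client = server.accept
    body = client.read # returns once the client has half-closed
    client.write(body.upcase)
    client.close
  end
  [server, thread]
end

server, thread = start_stub_server
reply = roundtrip('127.0.0.1', server.addr[1], 'hello tika')
thread.join
server.close
puts reply # prints "HELLO TIKA"
```

The half-close matters because the server has no other way to know the document is complete: without `shutdown(Socket::SHUT_WR)` both sides would wait on each other indefinitely.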

Thank you a lot for your quick reply.



[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Karl Buchta (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100029#comment-16100029
 ] 

Karl Buchta commented on TIKA-2433:
---

I am looking it up within the application stack.






[jira] [Commented] (TIKA-2433) Tika 1.16 - Nullpointer Exception after update - Asking for help

2017-07-25 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100021#comment-16100021
 ] 

Nick Burch commented on TIKA-2433:
--

What are the arguments you are passing to the Tika App?



