[ https://issues.apache.org/jira/browse/NUTCH-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764053#action_12764053 ]
Andrzej Bialecki commented on NUTCH-335:
-
The most common cause of this is a PDF th
[ http://issues.apache.org/jira/browse/NUTCH-335?page=comments#action_12424855 ]
Siddharudh nadgeri commented on NUTCH-335:
--
I searched before, but a solution was not there, only the alternative you have given, and if
you know any PDF properties that
[ http://issues.apache.org/jira/browse/NUTCH-335?page=comments#action_12424553 ]
Stefan Neufeind commented on NUTCH-335:
---
The problem is that in most cases I've come across, the PDF is protected and
does not allow text extraction. Though