Re: Expected output

2014-06-28 Thread Lewis John Mcgibbney
Hi Kevin, On Fri, Jun 27, 2014 at 7:56 AM, dev-digest-h...@tika.apache.org wrote: Subject: Expected output Hello everyone. I have a question about the expected output for tika. I am working on integrating my python application with tika-server. One of the test files for unit test

[jira] [Commented] (TIKA-1300) Switch default PDFBox parser to NonSequentialParser

2014-06-28 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046896#comment-14046896 ] Tilman Hausherr commented on TIKA-1300: --- [~talli...@mitre.org] are there any rules

[jira] [Commented] (TIKA-1300) Switch default PDFBox parser to NonSequentialParser

2014-06-28 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046972#comment-14046972 ] Tim Allison commented on TIKA-1300: --- Don't think so. I'd recommend the 1000 zips vs 1m

Re: Expected output

2014-06-28 Thread kevin slote
Possibly nothing. That was part of my question. I was asking if data like this was to be expected. 99% of the time, tika-server returns data that is formatted more like standard CSV output. I have only ever seen metadata returned like this once before. Usually, metadata is just data about