[jira] [Commented] (TIKA-1401) occured infinite loop using tika library

2015-05-06 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530284#comment-14530284 ] Matthias Krueger commented on TIKA-1401: I'm preparing a patch for this

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2015-03-17 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366201#comment-14366201 ] Matthias Krueger commented on TIKA-1365: Quick wrapup: * HTML starting

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063699#comment-14063699 ] Matthias Krueger commented on TIKA-1365: Some more observations: * The l

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063617#comment-14063617 ] Matthias Krueger commented on TIKA-1365: {code} System.out.println

Re: Java code layout settings - do we have them documented somewhere?

2014-07-14 Thread Matthias Krueger
Hi all, a guideline on how to handle existing code that does not adhere to 4 spaces would be helpful. When submitting a patch/pull request should I include a commit reformatting the rest of the class if that's not indented with 4 spaces? Thanks Matthias On 14.07.2014 23:20, Mattmann, Chri

[jira] [Commented] (TIKA-1040) Could not delete temporary file

2014-07-09 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056219#comment-14056219 ] Matthias Krueger commented on TIKA-1040: This should be addressed when mer

[jira] [Commented] (TIKA-1361) Update MP4Parser to 1.0.2

2014-07-05 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052947#comment-14052947 ] Matthias Krueger commented on TIKA-1361: Needed to adjust the code to compile

[jira] [Created] (TIKA-1361) Update MP4Parser to 1.0.2

2014-07-05 Thread Matthias Krueger (JIRA)
Matthias Krueger created TIKA-1361: Summary: Update MP4Parser to 1.0.2 Key: TIKA-1361 URL: https://issues.apache.org/jira/browse/TIKA-1361 Project: Tika Issue Type: Improvement

[jira] [Commented] (TIKA-1332) Create "eval" code

2014-06-26 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045219#comment-14045219 ] Matthias Krueger commented on TIKA-1332: It might be good to distinguish bet

Re: [DISCUSS] 1.6 Release?

2014-06-06 Thread Matthias Krueger
Hi, I've come across TIKA-1182 and TIKA-1322. Both are trivial to fix and would be helpful to have in 1.6. TIKA-1182 is fixed in FontBox so we only need to revert Tika's temporary workaround. For TIKA-1322 I've created a pull request (https://github.com/apache/tika/pull/9). Let me know

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-05 Thread Matthias Krueger
> On June 4, 2014, 11:25 p.m., Matthias Krueger wrote: > > The Matlab MIME types used seem to be application/x-matlab-data or > > application/matlab-mat. > > > > Would it make sense to add them to the mime XML for detection? >

[jira] [Commented] (TIKA-1182) Out of memory exception when parsing TTF file

2014-06-05 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019351#comment-14019351 ] Matthias Krueger commented on TIKA-1182: I retested with the current Tika t

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Matthias Krueger
ing in the Translator interface and the catch(Exception) and rethrow IllegalStateException in Tika#translate seems a bit weird. - Matthias Krueger On June 5, 2014, 4:19 p.m., Tyler Palsulich wrote: > > --- > This is an automatical

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-04 Thread Matthias Krueger
application/matlab-mat. Would it make sense to add them to the mime XML for detection? MATLAB data file - Matthias Krueger On June 4, 2014, 10:23 p.m., Ann Burgess wrote: > > --- > This is an auto

[jira] [Created] (TIKA-1322) XML file parse errors within archives trigger Zip bomb detection

2014-06-04 Thread Matthias Krueger (JIRA)
Matthias Krueger created TIKA-1322: Summary: XML file parse errors within archives trigger Zip bomb detection Key: TIKA-1322 URL: https://issues.apache.org/jira/browse/TIKA-1322 Project: Tika

Re: unit test error for new parser

2014-06-04 Thread Matthias Krueger
Hi Annie, [ERROR] COMPILATION ERROR : [ERROR] /Users/annbryant/TIKA/tika/tika-parsers/src/main/java/org/apache/tika/parser/mat/MatParser.java:[69,23] cannot

Extended fix for TIKA-1169

2014-05-16 Thread Matthias Krueger
I came across some other .jnilib binaries which were detected as .class files and caused issues. It seems there are more Mach-O binary magic variants, depending on 32/64-bit architecture and endianness. A fix is attached. Let me know if I should rather clone the closed TIKA-1169 and attach it th
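For readers unfamiliar with the magic variants mentioned in this message, here is a minimal sketch of how they differ. It is not the attached fix and does not use any Tika API: the class `MachOMagic` and the method `classify` are hypothetical names chosen for illustration. The constants are the Mach-O magic numbers defined in Apple's `mach-o/loader.h` and `mach-o/fat.h`. Note that the universal ("fat") binary magic is the same four bytes as the Java class file magic, which is one way a native library can be mistaken for a `.class` file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not part of Tika.
public class MachOMagic {

    // Classifies the first four bytes of a file, read as a big-endian int.
    static String classify(byte[] header) {
        if (header == null || header.length < 4) {
            return "unknown";
        }
        // ByteBuffer reads big-endian by default.
        int magic = ByteBuffer.wrap(header, 0, 4).getInt();
        switch (magic) {
            case 0xFEEDFACE: return "Mach-O 32-bit, big-endian";
            case 0xCEFAEDFE: return "Mach-O 32-bit, little-endian";
            case 0xFEEDFACF: return "Mach-O 64-bit, big-endian";
            case 0xCFFAEDFE: return "Mach-O 64-bit, little-endian";
            // Same bytes as the Java class file magic, so the first
            // four bytes alone cannot tell the two formats apart.
            case 0xCAFEBABE: return "Mach-O universal binary or Java class";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // First four bytes of a typical 64-bit little-endian Mach-O file.
        byte[] sample = {(byte) 0xCF, (byte) 0xFA, (byte) 0xED, (byte) 0xFE};
        System.out.println(classify(sample));
    }
}
```

A detector that only knows one of these values would miss the other three thin variants, and any rule matching `0xCAFEBABE` alone needs a further check to separate universal binaries from class files.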

Shared MIME info update

2014-04-28 Thread Matthias Krueger
Hi all, I ran a diff on tika-mimetypes.xml and the latest Freedesktop shared MIME info DB release (http://cgit.freedesktop.org/xdg/shared-mime-info/). It seems they have diverged quite a lot. Do you see a benefit in bringing them closer together again? Or is licensing in the way (I think they d