Re: Review Request 22246: New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22246/#review44980 --- File Attachment: TIKA-1327.aburgess.140606.patch.txt - TIKA-1327.a

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess
https://reviews.apache.org/r/22246/ (Updated June 6, 2014, 11:07 p.m.) Review request for tika and Chris Mattmann.

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess
https://reviews.apache.org/r/22246/ (Updated June 6, 2014, 11:06 p.m.) Review request for tika and Chris Mattmann.

[jira] [Commented] (TIKA-1302) Let's run Tika against a large batch of docs nightly

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020362#comment-14020362 ] Hudson commented on TIKA-1302: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #29 (See [https://bu

[jira] [Commented] (TIKA-1302) Let's run Tika against a large batch of docs nightly

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020346#comment-14020346 ] Hudson commented on TIKA-1302: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #29 (See [https://bu

[jira] [Updated] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1325: -- Attachment: TIKA-1325_TimeZone.patch More code than I'd like... Let me know if this works in a standard

[jira] [Commented] (TIKA-1327) New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020129#comment-14020129 ] Ann Burgess commented on TIKA-1327: --- Code posted on Review Board at: https://reviews.apac

[jira] [Updated] (TIKA-1327) New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ann Burgess updated TIKA-1327: -- Description: New parser for Matlab .mat files. > New parser for Matlab .mat files

[jira] [Updated] (TIKA-1327) New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ann Burgess updated TIKA-1327: -- Labels: parser (was: ) > New parser for Matlab .mat files

[jira] [Created] (TIKA-1327) New parser for Matlab .mat files

2014-06-06 Thread Ann Burgess (JIRA)
Ann Burgess created TIKA-1327: - Summary: New parser for Matlab .mat files Key: TIKA-1327 URL: https://issues.apache.org/jira/browse/TIKA-1327 Project: Tika Issue Type: Improvement Compo

[jira] [Updated] (TIKA-1303) Parsing Html page (not well formed) containing two title tags results in metadata (title) to be overwritten

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1303: - Fix Version/s: (was: 1.5) > Parsing Html page (not well formed) containing two title tags results in

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020116#comment-14020116 ] Nick Burch commented on TIKA-1325: -- I'd prefer not to comment out the test, as I think it

[jira] [Comment Edited] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020025#comment-14020025 ] Tim Allison edited comment on TIKA-1325 at 6/6/14 5:05 PM: --- Looks

[jira] [Commented] (TIKA-1303) Parsing Html page (not well formed) containing two title tags results in metadata (title) to be overwritten

2014-06-06 Thread Ashish Sood (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020036#comment-14020036 ] Ashish Sood commented on TIKA-1303: --- I am currently out of the office, returning on Monda

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020025#comment-14020025 ] Tim Allison commented on TIKA-1325: --- Looks like Fontbox/TTFDataStream isn't setting a tim

Re: [DISCUSS] 1.6 Release?

2014-06-06 Thread Lewis John Mcgibbney
Hi Chris, On Fri, Jun 6, 2014 at 2:57 AM, wrote: > So there's been lots of great activity lately between Nick, Tim, Annie, Tyler, Lewis, Paul R., and me and others. We've got ~44 issues fixed in JIRA. I moved all unfixed to 1.7 and would like to roll a 1.6 RC no later than Monday. Plea

[jira] [Commented] (TIKA-1303) Parsing Html page (not well formed) containing two title tags results in metadata (title) to be overwritten

2014-06-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020021#comment-14020021 ] Lewis John McGibbney commented on TIKA-1303: Can someone mark for 1.6 fix? (or

[jira] [Commented] (TIKA-1303) Parsing Html page (not well formed) containing two title tags results in metadata (title) to be overwritten

2014-06-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020019#comment-14020019 ] Lewis John McGibbney commented on TIKA-1303: Hi [~hakram], can you possibly att

[jira] [Comment Edited] (TIKA-1250) Process loops infintely processing a CHM file

2014-06-06 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020002#comment-14020002 ] Tyler Palsulich edited comment on TIKA-1250 at 6/6/14 4:30 PM: --

[jira] [Commented] (TIKA-1250) Process loops infintely processing a CHM file

2014-06-06 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020002#comment-14020002 ] Tyler Palsulich commented on TIKA-1250: --- Hi [~g...@hilbertinc.com]. You can attach a

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019985#comment-14019985 ] Tim Allison commented on TIKA-1325: --- Will do. > Move the font metadata definitions to pr

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019980#comment-14019980 ] Nick Burch commented on TIKA-1325: -- I didn't change the date parts, I just added a test wh

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019970#comment-14019970 ] Tim Allison commented on TIKA-1325: --- Nick, Thank you for fixing this. I just tried to

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019965#comment-14019965 ] Hudson commented on TIKA-1325: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #28 (See [https://bu

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019948#comment-14019948 ] Hudson commented on TIKA-1325: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #28 (See [https://bu

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019923#comment-14019923 ] Nick Burch commented on TIKA-1258: -- I can't see anything test related in your patches? We

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019921#comment-14019921 ] Nick Burch commented on TIKA-1325: -- As of r1600917, the AFM and TTF parsers now largely us

[jira] [Commented] (TIKA-1326) MSI file detection

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019886#comment-14019886 ] Hudson commented on TIKA-1326: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #27 (See [https://bu

[jira] [Commented] (TIKA-1326) MSI file detection

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019863#comment-14019863 ] Hudson commented on TIKA-1326: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #27 (See [https://bu

[jira] [Commented] (TIKA-1326) MSI file detection

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019853#comment-14019853 ] Nick Burch commented on TIKA-1326: -- Applied, with tweaks, in r1600887. Because it's OLE2

[jira] [Commented] (TIKA-1326) MSI file detection

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019834#comment-14019834 ] Nick Burch commented on TIKA-1326: -- I was about to say "that can't possibly be right", but

[jira] [Created] (TIKA-1326) MSI file detection

2014-06-06 Thread Luis Filipe Nassif (JIRA)
Luis Filipe Nassif created TIKA-1326: Summary: MSI file detection Key: TIKA-1326 URL: https://issues.apache.org/jira/browse/TIKA-1326 Project: Tika Issue Type: Improvement Compo

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019817#comment-14019817 ] Hudson commented on TIKA-1325: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #26 (See [https://bu

[jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019809#comment-14019809 ] Hudson commented on TIKA-1325: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #26 (See [https://bu

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Michal Hlavac (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019794#comment-14019794 ] Michal Hlavac commented on TIKA-1258: - I added attachment with my first comment. It con

[jira] [Commented] (TIKA-1182) Out of memory exception when parsing TTF file

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019778#comment-14019778 ] Hudson commented on TIKA-1182: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #25 (See [https://bu

[jira] [Commented] (TIKA-1322) XML file parse errors within archives trigger Zip bomb detection

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019756#comment-14019756 ] Hudson commented on TIKA-1322: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #24 (See [https://bu

[jira] [Commented] (TIKA-1322) XML file parse errors within archives trigger Zip bomb detection

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019755#comment-14019755 ] Hudson commented on TIKA-1322: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #25 (See [https://bu

[jira] [Commented] (TIKA-1182) Out of memory exception when parsing TTF file

2014-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019754#comment-14019754 ] Hudson commented on TIKA-1182: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #25 (See [https://bu

[jira] [Created] (TIKA-1325) Move the font metadata definitions to properties

2014-06-06 Thread Nick Burch (JIRA)
Nick Burch created TIKA-1325: Summary: Move the font metadata definitions to properties Key: TIKA-1325 URL: https://issues.apache.org/jira/browse/TIKA-1325 Project: Tika Issue Type: Improvement

[jira] [Resolved] (TIKA-1182) Out of memory exception when parsing TTF file

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1182. -- Resolution: Fixed Fix Version/s: 1.6 Temporary fix reverted in r1600844. > Out of memory excepti

[jira] [Resolved] (TIKA-1322) XML file parse errors within archives trigger Zip bomb detection

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1322. -- Resolution: Fixed Fix Version/s: 1.6 Thanks, applied in r1600841. > XML file parse errors within

[jira] [Commented] (TIKA-1322) XML file parse errors within archives trigger Zip bomb detection

2014-06-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019737#comment-14019737 ] ASF GitHub Bot commented on TIKA-1322: -- Github user asfgit closed the pull request at:

[GitHub] tika pull request: TIKA-1322: Properly close XMLParser's output in...

2014-06-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/tika/pull/9

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019731#comment-14019731 ] Nick Burch commented on TIKA-1258: -- Oh... so it is... Any chance you could get it working

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Michal Hlavac (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019725#comment-14019725 ] Michal Hlavac commented on TIKA-1258: - Yes, and it's actually disabled :) So, there is

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019722#comment-14019722 ] Nick Burch commented on TIKA-1258: -- There's at least one - http://svn.apache.org/repos/as

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Michal Hlavac (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019713#comment-14019713 ] Michal Hlavac commented on TIKA-1258: - What tests do you mean? There are no unit tests

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019711#comment-14019711 ] Nick Burch commented on TIKA-1258: -- Currently, all the Tika Bundle unit tests pass, so we

[jira] [Resolved] (TIKA-1254) No warning when Tika does not find a parser.

2014-06-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1254. -- Resolution: Not a Problem Closing this as "not a bug", since Tika provides a way to check what parsers

[jira] [Updated] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Michal Hlavac (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michal Hlavac updated TIKA-1258: Attachment: vcs-diff7915698944227263145.patch > Update NetCDF dependency >

[jira] [Commented] (TIKA-1258) Update NetCDF dependency

2014-06-06 Thread Michal Hlavac (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019670#comment-14019670 ] Michal Hlavac commented on TIKA-1258: - Update of NetCDF dependency broke tika-bundle. I

Re: [DISCUSS] 1.6 Release?

2014-06-06 Thread Matthias Krueger
Hi, I've come across TIKA-1182 and TIKA-1322. Both are trivial to fix and would be helpful to have in 1.6. TIKA-1182 is fixed in FontBox so we only need to revert Tika's temporary workaround. For TIKA-1322 I've created a pull request (https://github.com/apache/tika/pull/9). Let me know