[jira] [Resolved] (TIKA-793) Invalid ASCII character (65533) when retriving MP3 metadata

2011-12-29 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-793. - Resolution: Fixed Fix Version/s: 1.1 > Invalid ASCII character (65533) when retriving MP3 metada

[jira] [Commented] (TIKA-793) Invalid ASCII character (65533) when retriving MP3 metadata

2011-12-29 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177081#comment-13177081 ] Nick Burch commented on TIKA-793: - Comment (COM/COMM) tag handling fixed in r1225480 - it us

[jira] [Commented] (TIKA-830) Tika.parseToString() causes ForkParser to try to serialize itself

2011-12-29 Thread Jukka Zitting (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177088#comment-13177088 ] Jukka Zitting commented on TIKA-830: Excellent, thanks Nick! > Tika.par

[jira] [Updated] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jeremy Anderson (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson updated TIKA-833: - Comment: was deleted (was: Fixed in POI as of r1225098.) > POI Daily beta6 as of 12/27 breaks

[jira] [Created] (TIKA-836) parsing really slow on some documents

2011-12-29 Thread Rob Tulloh (Created) (JIRA)
parsing really slow on some documents - Key: TIKA-836 URL: https://issues.apache.org/jira/browse/TIKA-836 Project: Tika Issue Type: Improvement Components: parser Affects Versions: 1.0

[jira] [Commented] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jukka Zitting (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177183#comment-13177183 ] Jukka Zitting commented on TIKA-833: It's good that we monitor changes in POI and make s

[jira] [Closed] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jeremy Anderson (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson closed TIKA-833. Resolution: Not A Problem Fix Version/s: (was: 1.1) Issue lies in base POI and is now fixe

[jira] [Commented] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jeremy Anderson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177178#comment-13177178 ] Jeremy Anderson commented on TIKA-833: -- Fixed in POI as of r1225098. >

[jira] [Reopened] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jeremy Anderson (Reopened) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson reopened TIKA-833: Re-opening to change the status so the issue stays out of the release notes. Thanks Jukka, still getting the hang o

[jira] [Created] (TIKA-835) TNEF parsing unstable

2011-12-29 Thread Rob Tulloh (Created) (JIRA)
TNEF parsing unstable - Key: TIKA-835 URL: https://issues.apache.org/jira/browse/TIKA-835 Project: Tika Issue Type: Bug Components: parser Affects Versions: 1.0 Environment: CentOS 4.x/5.x/6.x Java

[jira] [Resolved] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jeremy Anderson (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Anderson resolved TIKA-833. -- Resolution: Fixed Fix Version/s: 1.1 Fixed in POI as of r1225098. > POI D

[jira] [Commented] (TIKA-833) POI Daily beta6 as of 12/27 breaks ExcelParserTest.testExcelParserFormatting()

2011-12-29 Thread Jukka Zitting (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177248#comment-13177248 ] Jukka Zitting commented on TIKA-833: Thanks, Jeremy! > POI Daily beta6

Re: I would like to join this mailing list

2011-12-29 Thread Jukka Zitting
Hi Adam, Welcome! To subscribe, send a message to dev-subscr...@tika.apache.org. For more details, see http://tika.apache.org/mail-lists.html. BR, Jukka Zitting

[jira] [Updated] (TIKA-836) parsing really slow on some documents

2011-12-29 Thread Rob Tulloh (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Tulloh updated TIKA-836: Description: We are seeing that Tika sometimes takes a very long time to parse some content (likely PDF). Fo

[jira] [Commented] (TIKA-836) parsing really slow on some documents

2011-12-29 Thread Rob Tulloh (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177546#comment-13177546 ] Rob Tulloh commented on TIKA-836: Definitely the PDF files. Here are the feeding time break

[jira] [Commented] (TIKA-835) TNEF parsing unstable

2011-12-29 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177545#comment-13177545 ] Nick Burch commented on TIKA-835: - Without a file, it's going to be very hard for us to iden

[jira] [Commented] (TIKA-835) TNEF parsing unstable

2011-12-29 Thread Rob Tulloh (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177547#comment-13177547 ] Rob Tulloh commented on TIKA-835: - If you can tell me how to debug this, I'll be glad to try

arrayindex out of bounds exception

2011-12-29 Thread Aami
Hi, when I tried to extract a PDF using Apache Tika version 0.9, I got the following exception: Thread-0/PDFStreamEngine [WARN] java.lang.ArrayIndexOutOfBoundsException: 1 java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.pdfbox.util.TextPosition.mergeDiacritic(TextPosition.ja

[jira] [Commented] (TIKA-835) TNEF parsing unstable

2011-12-29 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177574#comment-13177574 ] Nick Burch commented on TIKA-835: - winmail.dat is a TNEF file, which POI supports through HM