[jira] Updated: (TIKA-526) OOXMLParser fails to extract text from within smart tags

2010-10-04 Thread Geoff Jarrad (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoff Jarrad updated TIKA-526: Attachment: smarttag-snippet.docx An example .docx containing smart-tagged text that is not extracted by

[jira] Created: (TIKA-526) OOXMLParser fails to extract text from within smart tags

2010-10-04 Thread Geoff Jarrad (JIRA)
OOXMLParser fails to extract text from within smart tags Key: TIKA-526 URL: https://issues.apache.org/jira/browse/TIKA-526 Project: Tika Issue Type: Bug Components: parser

[jira] Created: (TIKA-525) Mismatched start and end elements in HtmlParser

2010-10-04 Thread Geoff Jarrad (JIRA)
Mismatched start and end elements in HtmlParser Key: TIKA-525 URL: https://issues.apache.org/jira/browse/TIKA-525 Project: Tika Issue Type: Bug Components: parser Affects Versions

[jira] Commented: (TIKA-521) OutOfMemoryError Parsing XSLX File

2010-10-04 Thread Sjoerd Smeets (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917819#action_12917819 ] Sjoerd Smeets commented on TIKA-521: I'm facing the same issue. Increasing the heap size

[jira] Created: (TIKA-524) Unification of HTML output from Office, OOXML and Open Document parsers

2010-10-04 Thread Geoff Jarrad (JIRA)
Unification of HTML output from Office, OOXML and Open Document parsers Key: TIKA-524 URL: https://issues.apache.org/jira/browse/TIKA-524 Project: Tika Issue Type: Impro

[jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio

2010-10-04 Thread Dennis Adler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917814#action_12917814 ] Dennis Adler commented on TIKA-522: Hi Nick, Pardon the alias on the GMail addr... I can

[jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio

2010-10-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917788#action_12917788 ] Nick Burch commented on TIKA-522: When it goes wrong, can you capture the bytes in the buffe

[jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio

2010-10-04 Thread Dennis Adler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917781#action_12917781 ] Dennis Adler commented on TIKA-522: I've further traced the problem. It happens in MagicDe

[jira] Commented: (TIKA-490) Support for adding language profiles dynamically

2010-10-04 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917733#action_12917733 ] Chris A. Mattmann commented on TIKA-490: Hi Jan: I've been a bit bogged down lately

[jira] Commented: (TIKA-490) Support for adding language profiles dynamically

2010-10-04 Thread Jan Høydahl (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917718#action_12917718 ] Jan Høydahl commented on TIKA-490: Chris, Jukka, Are we approaching a solution on this one

[jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio

2010-10-04 Thread Dennis Adler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917680#action_12917680 ] Dennis Adler commented on TIKA-522: If it makes a difference, while we're using URL to par

[jira] Created: (TIKA-523) Add application/ms-tnef as alias to application/vnd.ms-tnef

2010-10-04 Thread JIRA
Add application/ms-tnef as alias to application/vnd.ms-tnef Key: TIKA-523 URL: https://issues.apache.org/jira/browse/TIKA-523 Project: Tika Issue Type: Improvement Compone