[jira] [Created] (TIKA-992) OpenGraph meta tags to allow multiple values

2012-09-04 Thread Markus Jelsma (JIRA)
Markus Jelsma created TIKA-992. Summary: OpenGraph meta tags to allow multiple values; Key: TIKA-992; URL: https://issues.apache.org/jira/browse/TIKA-992; Project: Tika; Issue Type: Bug

[jira] [Updated] (TIKA-992) OpenGraph meta tags to allow multiple values

2012-09-04 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated TIKA-992: Attachment: TIKA-992-1.3-1.patch. Here's a patch that improves the unit test and relies on Metadata.add()
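
The patch itself is not reproduced in this listing. As a rough illustration of why an add-style call matters for repeated tags such as og:image, here is a minimal sketch that assumes a store where set() overwrites and add() appends. The class MultiValueMetadataSketch and the example URLs are hypothetical stand-ins, not Tika's Metadata implementation and not the contents of the attachment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a multi-valued metadata store; not Tika's own class. */
public class MultiValueMetadataSketch {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Replaces whatever is stored under the name with a single value. */
    public void set(String name, String value) {
        List<String> list = new ArrayList<>();
        list.add(value);
        values.put(name, list);
    }

    /** Appends a value and keeps the ones already stored under the name. */
    public void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Returns all values for the name, or an empty list if there are none. */
    public List<String> getValues(String name) {
        return values.getOrDefault(name, new ArrayList<>());
    }

    public static void main(String[] args) {
        MultiValueMetadataSketch md = new MultiValueMetadataSketch();
        // Two og:image tags on one page: add() keeps both values.
        md.add("og:image", "http://example.com/a.png");
        md.add("og:image", "http://example.com/b.png");
        System.out.println(md.getValues("og:image"));
    }
}
```

With set() in place of add(), the second og:image value would replace the first, which is the behaviour the issue title asks to avoid.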

[jira] [Assigned] (TIKA-989) We don't extract a placeholder for documents embedded in a Word OOXML (.docx) document

2012-09-04 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned TIKA-989: --- Assignee: Michael McCandless We don't extract a placeholder for documents

[jira] [Updated] (TIKA-989) We don't extract a placeholder for documents embedded in a Word OOXML (.docx) document

2012-09-04 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated TIKA-989: Attachment: TIKA-989.patch Patch w/ test, adding placeholders for embedded files inside Word

Re: Question about XPath Matcher code MatchingContentHandler

2012-09-04 Thread Jukka Zitting
Hi, On Mon, Sep 3, 2012 at 7:50 PM, Ken Krugler kkrugler_li...@transpac.com wrote: No, the html _does_ match, which it needs to as it descends the DOM hierarchy. Note that we're dealing with SAX events instead of DOM hierarchies here. So what the startElement() method does not (and is not
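
To illustrate the SAX-versus-DOM point in the reply: a SAX handler only sees a stream of start and end events, so any notion of "where we are in the document" has to be tracked by the handler itself. The sketch below does this with a stack of open element names, using only the JDK's SAX API. SaxPathSketch and pathsOf are hypothetical names chosen for illustration; this is not the MatchingContentHandler code discussed in the thread.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that records the path of every element it sees. */
public class SaxPathSketch extends DefaultHandler {
    // Names of the elements that are currently open, root first.
    private final Deque<String> open = new ArrayDeque<>();
    private final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // No tree exists here; the path is rebuilt from the handler's own stack.
        open.addLast(qName);
        paths.add("/" + String.join("/", open));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.removeLast();
    }

    /** Parses the XML string and returns one path per start-element event. */
    public static List<String> pathsOf(String xml) {
        SaxPathSketch handler = new SaxPathSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.paths;
    }

    public static void main(String[] args) {
        System.out.println(pathsOf("<html><body><p/></body></html>"));
    }
}
```

An ancestor such as html appears in the recorded path of every descendant, which is one way to picture why it has to keep matching as the parser descends.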