[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065908#comment-15065908 ]

Hudson commented on TIKA-1815:
------------------------------

UNSTABLE: Integrated in tika-trunk-jdk1.7 #893 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/893/])
Fix for TIKA-1815 Text content from parser is empty when NamedEntityParser is enabled contributed by Thamme Gowda this closes #67 (mattmann: [http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1721058])
* trunk/CHANGES.txt
* trunk/tika-parsers/src/main/java/org/apache/tika/parser/ner/NamedEntityParser.java

> Text content from parser is empty when NamedEntityParser is enabled
> -------------------------------------------------------------------
>
>                 Key: TIKA-1815
>                 URL: https://issues.apache.org/jira/browse/TIKA-1815
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Thamme Gowda N
>            Assignee: Chris A. Mattmann
>              Labels: memex
>             Fix For: 1.12
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When the NamedEntityParser is enabled, the Tika#parseToString() and other
> parse() methods produce an empty string.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065888#comment-15065888 ]

Chris A. Mattmann commented on TIKA-1815:
-----------------------------------------

Applied, thanks [~thammegowda]!

{noformat}
[chipotle:~/tmp/tika1.12] mattmann% svn commit -m "Fix for TIKA-1815 Text content from parser is empty when NamedEntityParser is enabled contributed by Thamme Gowda this closes #67"
Sending        CHANGES.txt
Sending        tika-parsers/src/main/java/org/apache/tika/parser/ner/NamedEntityParser.java
Transmitting file data ..
Committed revision 1721058.
[chipotle:~/tmp/tika1.12] mattmann%
{noformat}
[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065887#comment-15065887 ]

ASF GitHub Bot commented on TIKA-1815:
--------------------------------------

GitHub user asfgit closed the pull request at:

    https://github.com/apache/tika/pull/67
[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065885#comment-15065885 ]

Chris A. Mattmann commented on TIKA-1815:
-----------------------------------------

Applied and all tests pass, committing:

{noformat}
[INFO] Installing /Users/mattmann/tmp/tika1.12/pom.xml to /Users/mattmann/.m2/repository/org/apache/tika/tika/1.12-SNAPSHOT/tika-1.12-SNAPSHOT.pom
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent ................................. SUCCESS [ 1.917 s]
[INFO] Apache Tika core ................................... SUCCESS [ 18.468 s]
[INFO] Apache Tika parsers ................................ SUCCESS [03:25 min]
[INFO] Apache Tika XMP .................................... SUCCESS [ 2.795 s]
[INFO] Apache Tika serialization .......................... SUCCESS [ 1.663 s]
[INFO] Apache Tika batch .................................. SUCCESS [01:57 min]
[INFO] Apache Tika application ............................ SUCCESS [ 38.660 s]
[INFO] Apache Tika OSGi bundle ............................ SUCCESS [ 21.145 s]
[INFO] Apache Tika translate .............................. SUCCESS [ 1.832 s]
[INFO] Apache Tika server ................................. SUCCESS [ 24.827 s]
[INFO] Apache Tika examples ............................... SUCCESS [ 18.857 s]
[INFO] Apache Tika Java-7 Components ...................... SUCCESS [ 2.609 s]
[INFO] Apache Tika ........................................ SUCCESS [ 0.032 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 07:36 min
[INFO] Finished at: 2015-12-20T11:08:58-08:00
[INFO] Final Memory: 102M/1708M
{noformat}
[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065535#comment-15065535 ]

ASF GitHub Bot commented on TIKA-1815:
--------------------------------------

GitHub user thammegowda closed the pull request at:

    https://github.com/apache/tika/pull/66
[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065533#comment-15065533 ]

ASF GitHub Bot commented on TIKA-1815:
--------------------------------------

GitHub user thammegowda opened a pull request:

    https://github.com/apache/tika/pull/67

    FIX for TIKA-1815 contributed by Thamme Gowda

    + Writing the text content to XML Document
    + Added Regex recogniser to default NER chain

    Closes #66 (this is a simpler version of the same).
    Fixes #TIKA-1815

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/thammegowda/tika TIKA-1815

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/67.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #67

commit a40a18e2f61f2152fa065bda193ceb74e7e60c97
Author: Thamme Gowda
Date:   2015-12-19T20:56:21Z

    FIX for TIKA-1815 contributed by Thamme Gowda

    + Writing the text content to XML Document
    + Added Regex recogniser to default NER chain
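Editorial illustration of the first item in the pull request above ("Writing the text content to XML Document"). This is only a rough sketch built on the JDK's SAX classes; it is not the code from NamedEntityParser.java, and every class and method name in it (NerTextSketch, TextCollector, parseMetadataOnly, parseWithText, extract) is hypothetical. It assumes the symptom comes from a parser that fills metadata but never forwards character events to the content handler, so a handler that collects text ends up with an empty string.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Tika.
class NerTextSketch {

    /** Collects SAX character events, as a text-extracting handler would. */
    static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        String text() {
            return text.toString();
        }
    }

    /** Assumed buggy shape: metadata is filled, but no text reaches the handler. */
    static void parseMetadataOnly(String input, ContentHandler handler,
                                  Map<String, String> metadata) throws SAXException {
        metadata.put("length", String.valueOf(input.length()));
        handler.startDocument();
        handler.endDocument();
    }

    /** Assumed fixed shape: the text is also written out as character events. */
    static void parseWithText(String input, ContentHandler handler,
                              Map<String, String> metadata) throws SAXException {
        metadata.put("length", String.valueOf(input.length()));
        handler.startDocument();
        char[] chars = input.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endDocument();
    }

    /** Runs one of the two parsers and returns whatever text the handler saw. */
    static String extract(boolean writeText, String input) {
        TextCollector collector = new TextCollector();
        Map<String, String> metadata = new HashMap<>();
        try {
            if (writeText) {
                parseWithText(input, collector, metadata);
            } else {
                parseMetadataOnly(input, collector, metadata);
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return collector.text();
    }

    public static void main(String[] args) {
        System.out.println("[" + extract(false, "Tika parses text") + "]");
        System.out.println("[" + extract(true, "Tika parses text") + "]");
    }
}
```

In this sketch both variants populate the metadata map; only the second one produces any text for the caller, which is the behaviour the issue title describes as missing.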
[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065485#comment-15065485 ]

ASF GitHub Bot commented on TIKA-1815:
--------------------------------------

GitHub user thammegowda opened a pull request:

    https://github.com/apache/tika/pull/66

    Fix for TIKA-1815 contributed by Thamme Gowda

    + Outputting the text content to XMLDocumentHandler

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/thammegowda/tika fix-TIKA-1815

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #66

commit e96da2bc28d5eef81d034e39eb05099ed5d38ac1
Author: Thamme Gowda
Date:   2015-10-30T21:47:45Z

    Add NamedEntityParser
    Add OpenNLPNERecogniser as default

commit a720507a1c1906a501470a7d5c5cec335412fcd3
Author: Thamme Gowda
Date:   2015-10-30T22:16:11Z

    Set charset for converting text to stream

commit 6b1a20e681a5d319886464ec147967c876b7e60d
Author: Thamme Gowda
Date:   2015-10-31T04:23:43Z

    Automated OpenNLP NER model downloader

commit e381ea88ebd2bb8f5adfe36d710acfce673e30aa
Author: Thamme Gowda
Date:   2015-11-04T00:31:40Z

    using a secondary parser to convert non-text streams

commit ea7871bd4afae7d18e500ffc285e58afd08f5e86
Author: Thamme Gowda
Date:   2015-11-08T07:36:48Z

    Add regex based NER

commit 084985b3612438e9ca7107fecdffd67757d04d10
Author: Thamme Gowda
Date:   2015-11-08T07:38:17Z

    Add CoreNLP NER with runtime binding

commit e4d74218ece77143d1e5245a3ef64ddf5578c310
Author: Thamme Gowda
Date:   2015-11-08T23:41:15Z

    Added support for chaining NER implementations

commit 7e6b43c83ec6cdd35ea258f52c0110ba986c82b3
Author: Thamme Gowda
Date:   2015-11-09T05:58:58Z

    charset specified

commit caba68773a287752dea43f3366e6d4309fde861c
Author: Thamme Gowda
Date:   2015-11-10T01:34:04Z

    Merge branch 'trunk' of github.com:apache/tika into trunk

commit 08b916790b279cda0201f2529ca58646dea4b2f9
Author: Thamme Gowda
Date:   2015-11-10T19:06:29Z

    Resolved Code formatting issues
    + Removed star imports
    + Removed dead code / commented code
    + Added License header to missing files

commit e07ac630d54cc79d9a7bfc9ac82332474d07434b
Author: Thamme Gowda
Date:   2015-11-16T09:05:07Z

    Add missing doc strings, fix code formatting issues

commit 96d4d7cc29d4bcd8ac0cf7a595c39b6ed64d4d19
Author: Thamme Gowda
Date:   2015-11-18T03:03:41Z

    Fix: build phase for model downloader

commit 6d0b121b8b321e8a31257fc608bb001d3fe7afb5
Author: Thamme Gowda
Date:   2015-12-11T14:33:36Z

    Merge branch 'trunk' of github.com:apache/tika into trunk

commit 66d3a10ffabf1f54cff384ce1c7325c2a3c16279
Author: Thamme Gowda
Date:   2015-12-19T18:59:26Z

    Fix : TIKA-1815 by Thamme Gowda N.
    1. Writing text content to XMLContentHandler
    2. Added RegexNERParser to Default parser chain