GitHub user thammegowda opened a pull request: https://github.com/apache/tika/pull/66

    Fix for TIKA-1815 contributed by Thamme Gowda
    + Outputting the text content to XMLDocumentHandler

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/thammegowda/tika fix-TIKA-1815

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #66

----
commit e96da2bc28d5eef81d034e39eb05099ed5d38ac1
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-10-30T21:47:45Z

    Add NamedEntityParser

    Add OpenNLPNERecogniser as default

commit a720507a1c1906a501470a7d5c5cec335412fcd3
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-10-30T22:16:11Z

    Set charset for converting text to stream

commit 6b1a20e681a5d319886464ec147967c876b7e60d
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-10-31T04:23:43Z

    Automated OpenNLP NER model downloader

commit e381ea88ebd2bb8f5adfe36d710acfce673e30aa
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-04T00:31:40Z

    using a secondary parser to convert non-text streams

commit ea7871bd4afae7d18e500ffc285e58afd08f5e86
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-08T07:36:48Z

    Add regex based NER

commit 084985b3612438e9ca7107fecdffd67757d04d10
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-08T07:38:17Z

    Add CoreNLP NER with runtime binding

commit e4d74218ece77143d1e5245a3ef64ddf5578c310
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-08T23:41:15Z

    Added support for chaining NER implementations

commit 7e6b43c83ec6cdd35ea258f52c0110ba986c82b3
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-09T05:58:58Z

    charset specified

commit caba68773a287752dea43f3366e6d4309fde861c
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-10T01:34:04Z

    Merge branch 'trunk' of github.com:apache/tika into trunk

commit 08b916790b279cda0201f2529ca58646dea4b2f9
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-10T19:06:29Z

    Resolved Code formatting issues
    + Removed star imports
    + Removed dead code / commented code
    + Added License header to missing files

commit e07ac630d54cc79d9a7bfc9ac82332474d07434b
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-16T09:05:07Z

    Add missing doc strings, fix code formatting issues

commit 96d4d7cc29d4bcd8ac0cf7a595c39b6ed64d4d19
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-11-18T03:03:41Z

    Fix: build phase for model downloader

commit 6d0b121b8b321e8a31257fc608bb001d3fe7afb5
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-12-11T14:33:36Z

    Merge branch 'trunk' of github.com:apache/tika into trunk

commit 66d3a10ffabf1f54cff384ce1c7325c2a3c16279
Author: Thamme Gowda <tgow...@gmail.com>
Date:   2015-12-19T18:59:26Z

    Fix : TIKA-1815 by Thamme Gowda N.
    1. Writing text content to XMLContentHandler
    2. Added RegexNERParser to Default parser chain

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---