[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065908#comment-15065908
 ] 

Hudson commented on TIKA-1815:
--

UNSTABLE: Integrated in tika-trunk-jdk1.7 #893 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.7/893/])
Fix for TIKA-1815 Text content from parser is empty when NamedEntityParser is 
enabled contributed by Thamme Gowda  this closes #67 
(mattmann: [http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1721058])
* trunk/CHANGES.txt
* trunk/tika-parsers/src/main/java/org/apache/tika/parser/ner/NamedEntityParser.java


> Text content from parser is empty when NamedEntityParser is enabled
> -------------------------------------------------------------------
>
>                 Key: TIKA-1815
>                 URL: https://issues.apache.org/jira/browse/TIKA-1815
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Thamme Gowda N
>            Assignee: Chris A. Mattmann
>              Labels: memex
>             Fix For: 1.12
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When the NamedEntityParser is enabled, the Tika#parseToString() and other 
> parse() methods produce an empty string.
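
For readers unfamiliar with how a symptom like this can arise, here is a minimal sketch using only the JDK's SAX classes. It assumes the extracted text reaches the caller through characters() events on a ContentHandler, so a parsing step that never emits those events leaves the collected text empty. The class and method names (EmptyTextSketch, TextCollector, parseWithoutText, parseWithText) are hypothetical and are not taken from the Tika code base or from the actual patch.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EmptyTextSketch {

    /** Collects every characters() event, roughly what a text-extracting handler does. */
    static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        String text() {
            return text.toString();
        }
    }

    /** Hypothetical parsing step that inspects the content but emits no text events. */
    static void parseWithoutText(String content, ContentHandler handler) throws SAXException {
        handler.startDocument();
        // Entities could be extracted from `content` here, but nothing is written to the handler.
        handler.endDocument();
    }

    /** The same step, with the text also forwarded to the handler. */
    static void parseWithText(String content, ContentHandler handler) throws SAXException {
        handler.startDocument();
        char[] chars = content.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endDocument();
    }

    /** Runs one of the two variants and returns whatever text the handler collected. */
    static String run(boolean forwardText, String content) {
        TextCollector collector = new TextCollector();
        try {
            if (forwardText) {
                parseWithText(content, collector);
            } else {
                parseWithoutText(content, collector);
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return collector.text();
    }

    public static void main(String[] args) {
        System.out.println("[" + run(false, "Tika in Los Angeles") + "]");
        System.out.println("[" + run(true, "Tika in Los Angeles") + "]");
    }
}
```

In this sketch the only difference between the two variants is the single characters() call, which is why the first variant hands an empty string back to the caller.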



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-20 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065888#comment-15065888
 ] 

Chris A. Mattmann commented on TIKA-1815:
-

Applied, thanks [~thammegowda]!

{noformat}
[chipotle:~/tmp/tika1.12] mattmann% svn commit -m "Fix for TIKA-1815 Text 
content from parser is empty when NamedEntityParser is enabled contributed by 
Thamme Gowda  this closes #67"
Sending        CHANGES.txt
Sending        tika-parsers/src/main/java/org/apache/tika/parser/ner/NamedEntityParser.java
Transmitting file data ..
Committed revision 1721058.
[chipotle:~/tmp/tika1.12] mattmann% 
{noformat}




[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065887#comment-15065887
 ] 

ASF GitHub Bot commented on TIKA-1815:
--

Github user asfgit closed the pull request at:

https://github.com/apache/tika/pull/67




[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-20 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065885#comment-15065885
 ] 

Chris A. Mattmann commented on TIKA-1815:
-

Applied and all tests pass, committing:

{noformat}
[INFO] Installing /Users/mattmann/tmp/tika1.12/pom.xml to 
/Users/mattmann/.m2/repository/org/apache/tika/tika/1.12-SNAPSHOT/tika-1.12-SNAPSHOT.pom
[INFO] 
[INFO] Reactor Summary:
[INFO] 
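[INFO] Apache Tika parent ................................ SUCCESS [  1.917 s]
[INFO] Apache Tika core .................................. SUCCESS [ 18.468 s]
[INFO] Apache Tika parsers ............................... SUCCESS [03:25 min]
[INFO] Apache Tika XMP ................................... SUCCESS [  2.795 s]
[INFO] Apache Tika serialization ......................... SUCCESS [  1.663 s]
[INFO] Apache Tika batch ................................. SUCCESS [01:57 min]
[INFO] Apache Tika application ........................... SUCCESS [ 38.660 s]
[INFO] Apache Tika OSGi bundle ........................... SUCCESS [ 21.145 s]
[INFO] Apache Tika translate ............................. SUCCESS [  1.832 s]
[INFO] Apache Tika server ................................ SUCCESS [ 24.827 s]
[INFO] Apache Tika examples .............................. SUCCESS [ 18.857 s]
[INFO] Apache Tika Java-7 Components ..................... SUCCESS [  2.609 s]
[INFO] Apache Tika ....................................... SUCCESS [  0.032 s]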
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 07:36 min
[INFO] Finished at: 2015-12-20T11:08:58-08:00
[INFO] Final Memory: 102M/1708M
{noformat}




[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065535#comment-15065535
 ] 

ASF GitHub Bot commented on TIKA-1815:
--

Github user thammegowda closed the pull request at:

https://github.com/apache/tika/pull/66




[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065533#comment-15065533
 ] 

ASF GitHub Bot commented on TIKA-1815:
--

GitHub user thammegowda opened a pull request:

https://github.com/apache/tika/pull/67

FIX for TIKA-1815 contributed by Thamme Gowda

+ Writing the text content to XML Document
+ Added Regex recogniser to default NER chain

Closes #66  (this is a simpler version of the same). Fixes #TIKA-1815
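
As a rough illustration of what a regex recogniser in a chain of NER implementations might look like, here is a self-contained sketch. Everything in it is an assumption made for illustration: the Recogniser interface, the RegexRecogniser class, the runChain helper, and the two example patterns are hypothetical and do not reflect the actual classes or configuration in this pull request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecogniserChainSketch {

    /** Hypothetical recogniser contract: entity type mapped to the set of matched names. */
    interface Recogniser {
        Map<String, Set<String>> recognise(String text);
    }

    /** Regex-backed recogniser holding one pattern per entity type (an assumed design). */
    static class RegexRecogniser implements Recogniser {
        private final Map<String, Pattern> patterns = new LinkedHashMap<>();

        RegexRecogniser add(String type, String regex) {
            patterns.put(type, Pattern.compile(regex));
            return this;
        }

        @Override
        public Map<String, Set<String>> recognise(String text) {
            Map<String, Set<String>> found = new LinkedHashMap<>();
            for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
                Matcher matcher = entry.getValue().matcher(text);
                while (matcher.find()) {
                    found.computeIfAbsent(entry.getKey(), k -> new LinkedHashSet<>())
                         .add(matcher.group());
                }
            }
            return found;
        }
    }

    /** Runs every recogniser in order and merges their results per entity type. */
    static Map<String, Set<String>> runChain(List<Recogniser> chain, String text) {
        Map<String, Set<String>> merged = new LinkedHashMap<>();
        for (Recogniser recogniser : chain) {
            recogniser.recognise(text).forEach((type, names) ->
                merged.computeIfAbsent(type, k -> new LinkedHashSet<>()).addAll(names));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Recogniser> chain = new ArrayList<>();
        chain.add(new RegexRecogniser()
            .add("EMAIL", "[\\w.]+@[\\w.]+\\.\\w+")
            .add("YEAR", "\\b(19|20)\\d{2}\\b"));
        System.out.println(runChain(chain, "Mail dev@tika.apache.org about the 2015 release."));
    }
}
```

Merging by entity type means that adding another recogniser to the list only widens the result sets, so a regex recogniser can sit alongside model-based ones without either overwriting the other's matches.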

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thammegowda/tika TIKA-1815

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tika/pull/67.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #67


commit a40a18e2f61f2152fa065bda193ceb74e7e60c97
Author: Thamme Gowda 
Date:   2015-12-19T20:56:21Z

FIX for TIKA-1815 contributed by Thamme Gowda

+ Writing the text content to XML Document
+ Added Regex recogniser to default NER chain






[jira] [Commented] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2015-12-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065485#comment-15065485
 ] 

ASF GitHub Bot commented on TIKA-1815:
--

GitHub user thammegowda opened a pull request:

https://github.com/apache/tika/pull/66

Fix for TIKA-1815 contributed by Thamme Gowda

+ Outputting the text content to XMLDocumentHandler

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thammegowda/tika fix-TIKA-1815

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tika/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #66


commit e96da2bc28d5eef81d034e39eb05099ed5d38ac1
Author: Thamme Gowda 
Date:   2015-10-30T21:47:45Z

Add NamedEntityParser

Add OpenNLPNERecogniser as default

commit a720507a1c1906a501470a7d5c5cec335412fcd3
Author: Thamme Gowda 
Date:   2015-10-30T22:16:11Z

Set charset for converting text to stream

commit 6b1a20e681a5d319886464ec147967c876b7e60d
Author: Thamme Gowda 
Date:   2015-10-31T04:23:43Z

Automated OpenNLP NER model downloader

commit e381ea88ebd2bb8f5adfe36d710acfce673e30aa
Author: Thamme Gowda 
Date:   2015-11-04T00:31:40Z

using a secondary parser to convert non-text streams

commit ea7871bd4afae7d18e500ffc285e58afd08f5e86
Author: Thamme Gowda 
Date:   2015-11-08T07:36:48Z

Add regex based NER

commit 084985b3612438e9ca7107fecdffd67757d04d10
Author: Thamme Gowda 
Date:   2015-11-08T07:38:17Z

Add CoreNLP NER with runtime binding

commit e4d74218ece77143d1e5245a3ef64ddf5578c310
Author: Thamme Gowda 
Date:   2015-11-08T23:41:15Z

Added support for chaining NER implementations

commit 7e6b43c83ec6cdd35ea258f52c0110ba986c82b3
Author: Thamme Gowda 
Date:   2015-11-09T05:58:58Z

charset specified

commit caba68773a287752dea43f3366e6d4309fde861c
Author: Thamme Gowda 
Date:   2015-11-10T01:34:04Z

Merge branch 'trunk' of github.com:apache/tika into trunk

commit 08b916790b279cda0201f2529ca58646dea4b2f9
Author: Thamme Gowda 
Date:   2015-11-10T19:06:29Z

Resolved Code formatting issues

+ Removed star imports
+ Removed dead code / commented code
+ Added License header to missing files

commit e07ac630d54cc79d9a7bfc9ac82332474d07434b
Author: Thamme Gowda 
Date:   2015-11-16T09:05:07Z

Add missing doc strings, fix code formatting issues

commit 96d4d7cc29d4bcd8ac0cf7a595c39b6ed64d4d19
Author: Thamme Gowda 
Date:   2015-11-18T03:03:41Z

Fix: build phase for model downloader

commit 6d0b121b8b321e8a31257fc608bb001d3fe7afb5
Author: Thamme Gowda 
Date:   2015-12-11T14:33:36Z

Merge branch 'trunk' of github.com:apache/tika into trunk

commit 66d3a10ffabf1f54cff384ce1c7325c2a3c16279
Author: Thamme Gowda 
Date:   2015-12-19T18:59:26Z

Fix : TIKA-1815 by Thamme Gowda N.

1. Writing text content to XMLContentHandler
2. Added RegexNERParser to Default parser chain



