[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078237#comment-16078237 ]
Chris A. Mattmann commented on TIKA-1988:
-----------------------------------------

Agree on #3. I'm going to take a first cut at tika-nlp. In the future, when we unify our recognisers for Object/Text, we should think about moving the NER stuff from tika-parsers into tika-nlp. I'm not going to bother now, b/c it would create a situation where people who previously had NER support in tika-app would in the future have to include tika-nlp.

The other thing I think we should seriously consider: tika-app's size ballooned, as you put it, but who cares? What if I'd gladly take a 181MB jar file if it gives me capabilities A, B, C, and D all in one box? Two thoughts there.

First, we stop worrying about keeping tika-app so small. Pros: easy, doesn't require anything special. Cons: size aficionados will be disappointed ;)

Second, we could make a tika-app-full module and tika-server-full that is tika-app plus tika-dl and tika-nlp.

Thoughts there?

> Age Detection Tika Recogniser
> -----------------------------
>
> Key: TIKA-1988
> URL: https://issues.apache.org/jira/browse/TIKA-1988
> Project: Tika
> Issue Type: New Feature
> Reporter: Madhav Sharan
> Assignee: Chris A. Mattmann
> Labels: age, machine_learning, memex, nlp, opennlp
> Fix For: 1.16
>
>
> Author age can be the first feature, and more can be added later
> --
> Integrating work done on age classification. More details about the classifier are in the repo below -
> https://github.com/USCDataScience/Age-Predictor
> The Git repo has a Java client which can be integrated into Tika