[ https://issues.apache.org/jira/browse/TIKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026783#comment-14026783 ]
Ray Gauss II commented on TIKA-1328:
------------------------------------

Leaning towards the whitelist approach, perhaps we could add an {{isTranslatable}} field / method and corresponding constructor to the {{Property}} class (with a default of false) and update the properties we want to support translation on?

> Translate Metadata and Content
> ------------------------------
>
>                 Key: TIKA-1328
>                 URL: https://issues.apache.org/jira/browse/TIKA-1328
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tyler Palsulich
>             Fix For: 1.7
>
>
> Right now, Translation is only done on Strings. Ideally, users would be able
> to "turn on" translation while parsing. I can think of a couple options:
> - Make a TranslateAutoDetectParser. Automatically detect the file type, parse
> it, then translate the content.
> - Make a Context switch. When true, translate the content regardless of the
> parser used. I'm not sure the best way to go about this method, but I prefer
> it over another Parser.
> Regardless, we need a black or white list for translation. I think black list
> would be the way to go -- which fields should not be translated (dates,
> versions, ...) Any ideas? Also, somewhat unrelated, does anyone know of any
> other open source translation libraries? If we were really lucky, it wouldn't
> depend on an online service.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
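
For illustration, the whitelist idea from the comment could look roughly like the sketch below. It is only a sketch under stated assumptions: the {{Property}} class here is a simplified, hypothetical stand-in and not Tika's real {{Property}} class, the {{translateMetadata}} helper is invented for this example, and the upper-casing function is a placeholder for a real translator.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class TranslatableSketch {

    // Hypothetical, simplified stand-in for a metadata property.
    // This is NOT Tika's actual Property class or API.
    static final class Property {
        private final String name;
        private final boolean translatable;

        // Existing-style constructor: properties are not translatable by default.
        Property(String name) {
            this(name, false);
        }

        // Proposed additional constructor: opt a property in to translation.
        Property(String name, boolean translatable) {
            this.name = name;
            this.translatable = translatable;
        }

        String getName() {
            return name;
        }

        boolean isTranslatable() {
            return translatable;
        }
    }

    // Applies the translator only to values whose property is whitelisted;
    // every other value (dates, versions, ...) is copied through unchanged.
    static Map<String, String> translateMetadata(Map<Property, String> metadata,
                                                 UnaryOperator<String> translator) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<Property, String> entry : metadata.entrySet()) {
            Property property = entry.getKey();
            String value = entry.getValue();
            out.put(property.getName(),
                    property.isTranslatable() ? translator.apply(value) : value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Property, String> metadata = new LinkedHashMap<>();
        metadata.put(new Property("title", true), "hello"); // whitelisted
        metadata.put(new Property("created"), "2014-06-10"); // default: skipped

        // Placeholder "translator": upper-casing stands in for a real service.
        System.out.println(translateMetadata(metadata, String::toUpperCase));
    }
}
```

With a default of false, existing properties keep their current behaviour, and only the properties explicitly constructed as translatable would be passed to the translator.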