[ 
https://issues.apache.org/jira/browse/TIKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-1328:
------------------------------
    Fix Version/s:     (was: 2.0.0)
                   2.0.1

> Translate Metadata and Content
> ------------------------------
>
>                 Key: TIKA-1328
>                 URL: https://issues.apache.org/jira/browse/TIKA-1328
>             Project: Tika
>          Issue Type: New Feature
>          Components: translation
>            Reporter: Tyler Bui-Palsulich
>            Priority: Major
>             Fix For: 1.17, 2.0.0-BETA, 2.0.1
>
>
> Right now, Translation is only done on Strings. Ideally, users would be able 
> to "turn on" translation while parsing. I can think of a couple of options:
> - Make a TranslateAutoDetectParser. Automatically detect the file type, parse 
> it, then translate the content.
> - Make a Context switch. When true, translate the content regardless of the 
> parser used. I'm not sure of the best way to go about this method, but I prefer 
> it over another Parser.
> Regardless, we need a black or white list for translation. I think black list 
> would be the way to go -- which fields should not be translated (dates, 
> versions, ...). Any ideas? Also, somewhat unrelated, does anyone know of any 
> other open source translation libraries? If we were really lucky, they wouldn't 
> depend on an online service.
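
The blacklist idea in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Tika code: the class `MetadataTranslationSketch`, the method `translateMetadata`, the field names, and the sample values are all hypothetical, and a plain `UnaryOperator<String>` stands in for whatever translation backend would actually be used.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

public class MetadataTranslationSketch {

    // Hypothetical helper: returns a copy of the metadata in which every value
    // is passed through the translator, except for keys on the blacklist.
    static Map<String, String> translateMetadata(Map<String, String> metadata,
                                                 Set<String> blacklist,
                                                 UnaryOperator<String> translator) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String value = blacklist.contains(entry.getKey())
                    ? entry.getValue()                     // e.g. dates, versions: keep as-is
                    : translator.apply(entry.getValue());  // everything else is translated
            result.put(entry.getKey(), value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data, invented for the example.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", "hola");
        metadata.put("created", "2014-06-10");

        // Stand-in "translator" that just upper-cases its input.
        UnaryOperator<String> fakeTranslator = s -> s.toUpperCase(Locale.ROOT);

        // prints {title=HOLA, created=2014-06-10}
        System.out.println(translateMetadata(metadata, Set.of("created"), fakeTranslator));
    }
}
```

A blacklist keeps the default permissive: any field not explicitly listed gets translated, which matches the preference stated in the description. A whitelist would only need the `contains` check inverted.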



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
