[ https://issues.apache.org/jira/browse/OAK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568742#comment-15568742 ]
Vikas Saurabh commented on OAK-4348:
------------------------------------

[~teofili], I completely missed checking this out. I think we should check how costly it is to consult Joshua. Moreover, even if it's cheap, I think the query terms should be enhanced only if some property on the index def declares that they should be; that way, we can quit early for uninteresting index defs. wdyt?

> Cross language search via SMT
> -----------------------------
>
>                 Key: OAK-4348
>                 URL: https://issues.apache.org/jira/browse/OAK-4348
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: query
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.6
>
>
> It would be interesting to investigate usage of statistical machine
> translation toolkits (like Apache Joshua) in order to enable cross language
> search, so that a query can eventually be expanded to search over translated
> terms too.
> Example:
> - enable Spanish to English translation
> - perform full text search for "hola"
> - query engine looks for translations for "hola"
> - SMT returns "hello"
> - query engine adds an additional (UNION) clause for the translated term
> - the query performed by Oak becomes "hello OR hola"
> - results for both English and Spanish terms get returned
> This of course should be configurable.
> Note that the integration may happen also via Apache Tika which provides a
> Translator API.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
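
To make the flow in the issue's example and the early-exit idea from the comment concrete, here is a minimal sketch in Java. It is not Oak code: `CrossLanguageQueryExpander`, `TermTranslator`, and the `translationEnabled` flag are hypothetical names invented for illustration. The flag stands in for whatever property an index definition might declare, and `TermTranslator` stands in for an SMT backend such as Apache Joshua or a Tika-based translator without using either project's real API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these types exist in Jackrabbit Oak.
final class CrossLanguageQueryExpander {

    /** Assumed stand-in for an SMT backend; not the Joshua or Tika API. */
    interface TermTranslator {
        List<String> translate(String term);
    }

    private final TermTranslator translator;

    CrossLanguageQueryExpander(TermTranslator translator) {
        this.translator = translator;
    }

    /**
     * Returns the full-text expression to run for the given term.
     * translationEnabled models a (hypothetical) opt-in property on the
     * index definition: when it is false, the translator is never consulted.
     */
    String expand(String term, boolean translationEnabled) {
        if (!translationEnabled) {
            return term; // quit early for index definitions that did not opt in
        }
        // LinkedHashSet keeps insertion order and drops duplicate translations.
        Set<String> alternatives = new LinkedHashSet<>();
        for (String translated : translator.translate(term)) {
            if (translated != null && !translated.isEmpty() && !translated.equals(term)) {
                alternatives.add(translated);
            }
        }
        alternatives.add(term); // always keep the original term
        return String.join(" OR ", alternatives);
    }

    public static void main(String[] args) {
        // Toy dictionary standing in for a Spanish to English SMT model.
        Map<String, List<String>> dictionary = Map.of("hola", List.of("hello"));
        TermTranslator toy = term -> dictionary.getOrDefault(term, List.of());

        CrossLanguageQueryExpander expander = new CrossLanguageQueryExpander(toy);
        System.out.println(expander.expand("hola", true));  // hello OR hola
        System.out.println(expander.expand("hola", false)); // hola
    }
}
```

Checking the flag before calling the translator means index definitions that never opted in pay nothing, regardless of how expensive the SMT lookup turns out to be.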