Hi Rosie,

Replies inline.

On Tue, Jul 3, 2018 at 6:41 AM, rosie.ole...@baesystems.com <rosie.ole...@baesystems.com> wrote:
> Our goal is to use Joshua to translate, with a reasonable speed and
> accuracy ratio, large documents.

The terms 'reasonable speed' and 'accuracy' should be defined further, as there are of course tradeoffs. Both are highly configurable, depending on the generation and use of the language model(s) within the SMT system.

> We want to integrate this into the rest of our platform, so we are
> developing a REST API to wrap around the functionality.

Sounds good.

> Our solution is to create a Node.js Express API and call Joshua. We have
> narrowed this down to two possibilities: running Joshua commands through
> the command line or running Joshua as a HTTP server, formatting the input
> document content into sentences to send to the Joshua REST endpoint.

There is an existing Python implementation demonstrating how this could be done:
https://github.com/joshua-decoder/joshua_translation_engine

> With both options we have had a number of issues.
>
> Firstly, running the commands through the command line.
>
> · The documentation is specific to Linux and bash terminals
> whereas we want to apply the functionality to a Windows Operating System,
> so we can only run bash scripts through a git-bash terminal which has been
> difficult to implement in a nodejs module. We are especially having issues
> implementing the prepare.sh script. Do you have any solutions for running
> this script, or mimic what it does, through Windows command line?

Absolutely none whatsoever; I have not used Windows for many years. I would strongly suggest that you run Joshua as a service.

> Running Joshua as a HTTP server
>
> · The documentation for using the Joshua live server suggests that
> to translate text, the content must be broken down manually into sentences
> and make separate HTTP GET requests to the server with the sentence in the
> URL. Is there any functionality in Joshua that handles the translation of a
> large block of text?
>

No. AFAIK the input is processed sentence by sentence.

> Overall we understand it will be more efficient and faster to run Joshua
> as a HTTP server, ideally with multiple languages, especially since we are
> having problems running the tokenise and normalise scripts through the
> command line. Do you have an idea of which method is best to implement
> Joshua?

See above.

> Alternatively, we have an idea of interacting directly with the methods in
> the jar file but we can’t find any documentation on using it , do you have
> any insight on this?

You shouldn't need to do this; provisioning Joshua-as-a-service will enable all of the functionality you require. Each language pack already provides this as well. See
https://cwiki.apache.org/confluence/display/JOSHUA/Language+Packs

> Finally what is the status of Joshua in the Apache Incubator?

We are in the process of graduating as a top level project.

> Is it still being developed and supported?

Yes, Joshua is being developed and maintained by the existing community.

Please keep the questions coming.

Lewis
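
P.S. For reference, here is a minimal sketch of the sentence-per-request approach discussed above, written in Python like the joshua_translation_engine example. The host, port, path and the `q` parameter name are assumptions on my part, not taken from the Joshua documentation, so check them against your own server. The regex splitter is a naive stand-in for a proper sentence segmenter. The code only builds the GET URLs; it does not contact a server.

```python
import re
from urllib.parse import urlencode

# Hypothetical values: adjust to wherever your Joshua server is listening.
JOSHUA_URL = "http://localhost:5674/translate"  # assumed host, port and path
QUERY_PARAM = "q"  # assumed name of the query parameter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_request_urls(text, base_url=JOSHUA_URL, param=QUERY_PARAM):
    """Return one GET URL per sentence, with the sentence URL-encoded."""
    return [f"{base_url}?{urlencode({param: s})}" for s in split_sentences(text)]


if __name__ == "__main__":
    for url in build_request_urls("Hola mundo. ¿Cómo estás? Muy bien."):
        print(url)
```

Your Node.js wrapper would do the equivalent: split the document, issue one request per sentence, and join the translated sentences back together in their original order.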