> On Dec 12, 2016, at 3:04 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>
> 1) If the English-German pair is recompiled to German-English (vice versa),
> do I need a separate instance to process back translation? Or can they work
> in one instance in both directions?

A whole new model needs to be trained. You need a separate model for each
direction.

> 2) Are there any documents about how to recompile a model to work vice versa,
> from German-English to English-German?
>
> On this page, under the “Project Info” title, the links “Community page” and
> “Current Documentation” are not working:
>
> http://incubator.apache.org/projects/joshua.html

See this document on running the pipeline:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65871630

> 3) Are there ways of increasing translation quality without changing
> (extending) the language model?
>
> On this page, under “How do I make Joshua produce better results?”, the link
> in the second option (Joshua directly) is not working:
>
> http://joshua.incubator.apache.org/6.0/faq.html

Yes, but it's complicated. The best way is to add data, but there are lots of
other models and parameter variations that could be tried.

> 4) How can I reduce the amount of memory each language pair instance uses
> without losing processing speed and quality?
>
> 5) To translate from German to French, do I need to translate via English
> (German to English first, and then English to French)?
>
> I mean for the case without German-French parallel data.

If you can find German–French parallel data, use that. Otherwise, pivot
through another language.

> Regards,
> Alexei
>
> 2016-12-12 17:58 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
> No, each has to be run separately. But not all are equally good, so I suggest
> starting with a few and building up.
>
> If you get KenLM working in place of BerkeleyLM, the language models will be
> shared between them if they are on the same machine. I will post instructions
> soon.
>
> Yes, each one has two language models that are interpolated.
>
>> On Dec 12, 2016, at 9:20 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>
>> Hi Matt,
>>
>> You were right about increasing memory. Spanish works fine now but needs
>> about 16 GB to run. Is it possible to use one Joshua instance for all
>> language pairs simultaneously? Right now I use one instance for each pair
>> and it takes about 4 GB, so for all 60 languages I need 240 GB of RAM and 60
>> running instances. But maybe it's possible to process all language
>> translation with one instance and use, for example, 32 GB?
>>
>> Also, I found that every language pair archive has 2 language models
>> (Berkeley and KenLM). Do I need both of them at once? Or does Joshua select
>> one of them depending on some parameters?
>>
>> Regards,
>> Alexei
>>
>> 2016-12-07 15:51 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>> I fixed the Czech link.
>>
>> For Spanish–English, what is the error? I imagine you have to provide more
>> memory. Edit the "joshua" script and double or triple the amount of memory.
>>
>>> On Dec 7, 2016, at 7:14 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>
>>> Hi Matt,
>>>
>>> Can you check the Czech-English language pack? It has a broken link.
>>> The Spanish-English pair does not work; it throws exceptions.
>>>
>>> Regards,
>>> Alexei
>>>
>>> 2016-11-28 17:30 GMT+03:00 <alru...@gmail.com>:
>>> Hi Matt, how much time (and what total price) would it take to record a
>>> video on how to make the English-German pair translate in the reverse
>>> direction (from German to English)?
>>>
>>> Regards,
>>> Alexei
>>>
>>> On Nov 28, 2016, at 17:59, Matt Post <p...@cs.jhu.edu> wrote:
>>>
>>>> Inline below:
>>>>
>>>>> On Nov 26, 2016, at 11:12 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>
>>>>> Hi Matt,
>>>>>
>>>>> We need to prepare all the infrastructure now so you can make changes in
>>>>> the future. Preparation will take time. Right now I have several
>>>>> questions about all these things.
>>>>>
>>>>> 1) Does Joshua have a language auto-detect feature? If yes, how do I use
>>>>> it? If not, is it hard to add?
>>>>>
>>>> This feature is called LID ("language ID"). It is not in Joshua currently
>>>> but we have talked about it, and it wouldn't be too difficult to add in.
>>>>> 2) On this page
>>>>>
>>>>> https://cwiki.apache.org/confluence/display/JOSHUA/Notes+on+Language+Pack+Creation
>>>>>
>>>>> the first sentence ends with a link to “Corpus”, where the language
>>>>> datasets should be located, but when I click on the link it gives me the
>>>>> English-German pack to download.
>>>>>
>>>>> Is this the correct behavior? If not, can you give me the link to such
>>>>> datasets?
>>>>>
>>>> Sorry, the link should go to http://opus.lingfil.uu.se/. I just fixed it.
>>>>
>>>>> 3) Can you record a video of your screen showing how to recompile a
>>>>> language pair to translate vice versa? That is, to make the English-German
>>>>> pair translate from German to English?
>>>>>
>>>>> Can I pay for such a video without a contract for now (or I can mark the
>>>>> PayPal payment, for example, to say that I’m paying for your assistance)?
>>>>>
>>>>> We need to do the initial setup of the whole system and check how much
>>>>> assistance we need, when, and where.
>>>>>
>>>>> 4) What kind of contract and conditions do you prefer? What is your
>>>>> hourly rate?
>>>>>
>>>> I still have to confirm with my employer that I am allowed to engage in
>>>> outside work. My hourly rate is $250. I would give you estimates ahead of
>>>> time so you could know what it would cost you.
>>>>
>>>> If that sounds good to you, can you clarify for me who the money would be
>>>> coming from? If it's a company, what is the name of the company, and where
>>>> is it incorporated? If it's a person, what is their name, and what is
>>>> their citizenship? I would need this information for my own tax purposes.
>>>>
>>>> Sincerely,
>>>> Matt
>>>>
>>>>> Regards,
>>>>>
>>>>> Alexei
>>>>>
>>>>> 2016-11-24 19:14 GMT+03:00 Aliaksei Rudak <alru...@gmail.com>:
>>>>> yes, ok,
>>>>>
>>>>> Skype chat at 9 AM EST on Friday, November 25
>>>>>
>>>>> 2016-11-24 18:33 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>> Great, let's chat at 9 AM EST?
>>>>>
>>>>>> On Nov 23, 2016, at 4:31 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>
>>>>>> I sent you a Skype request. Let's plan on Friday, whenever you have free
>>>>>> time. I am available anytime from 8:00 till 16:00 (EST). I will deploy
>>>>>> everything from my local machine to some service (I will select it
>>>>>> tomorrow) and send you access.
>>>>>>
>>>>>> Regards,
>>>>>> Alexei
>>>>>>
>>>>>> 2016-11-24 0:08 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>> I am mpost89. What time works for you? I have a little time now,
>>>>>> otherwise not till Friday. I am in the EST time zone.
>>>>>>
>>>>>> matt
>>>>>>
>>>>>>> On Nov 23, 2016, at 3:28 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Matt,
>>>>>>>
>>>>>>> My Skype is "alrudak". Can we talk by voice and discuss all the details?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Alexei
>>>>>>>
>>>>>>> 2016-11-23 22:41 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>>> That should be fairly easy to do. What about running them as Amazon AMI
>>>>>>> instances? Or do you want them to run on your own servers? Would docker
>>>>>>> containers suffice?
>>>>>>>
>>>>>>> This might be something I could do for you. Can you give me more
>>>>>>> information about your company?
>>>>>>>
>>>>>>> matt
>>>>>>>
>>>>>>>> On Nov 23, 2016, at 9:48 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> I'm trying to create my own translation service like Google Translate,
>>>>>>>> with an API to use from my mobile apps (as clients) or as a web site
>>>>>>>> where you can enter a phrase for translation (like Google did). You
>>>>>>>> said that a "Google-translate-style API" is already present in server
>>>>>>>> mode. How can I use it?
>>>>>>>>
>>>>>>>> I was able to install the server and download one language pair to
>>>>>>>> test, for example English-German. Can this language pair translate in
>>>>>>>> only one direction (English to German) and not vice versa (from German
>>>>>>>> to English)? If it's possible to translate vice versa, how can I do
>>>>>>>> this?
>>>>>>>>
>>>>>>>> If someone can help me on a paid basis, please give me their contact
>>>>>>>> details.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Alexei
>>>>>>>>
>>>>>>>> 2016-11-23 16:21 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>>>> 1. Yes, you can translate as much as you'd like. Do you mean lots of
>>>>>>>> sentences or long sentences?
>>>>>>>>
>>>>>>>> 2. Yes, that is what it does. It even offers (in server mode) a
>>>>>>>> Google-translate-style API.
>>>>>>>>
>>>>>>>> 3. There may be someone interested in helping you. What exactly are
>>>>>>>> you trying to do? What do you mean "all" language pairs?
>>>>>>>>
>>>>>>>> > On Nov 19, 2016, at 2:08 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > Hi Matt,
>>>>>>>> > Can you help me and answer several questions about the Joshua project?
>>>>>>>> >
>>>>>>>> > 1) Is it possible to translate big amounts of text with Joshua?
>>>>>>>> > (For example, 1000 characters per transaction.)
>>>>>>>> > 2) Does Joshua work like Google Translate? So you can put in a sentence
>>>>>>>> > in one language and get it translated into another language?
>>>>>>>> > 3) Can you (or your teammates) help me with deploying Joshua on
>>>>>>>> > my server and setting up all language pairs? I will pay you.
>>>>>>>> >
>>>>>>>> > Regards,
>>>>>>>> > Alexei
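
The pivoting idea from the reply to question 5 (German to French through a
third language when no German-French parallel data is available) might be
sketched roughly as below. This is only an illustration of chaining two
one-directional systems; it does not use any Joshua API. The names
make_pivot_translator, toy_de_to_en and toy_en_to_fr, and the tiny lookup
tables, are hypothetical stand-ins for two separately trained models.

```python
from typing import Callable

# A translator is any function mapping a source sentence to a target sentence.
Translator = Callable[[str], str]


def make_pivot_translator(source_to_pivot: Translator,
                          pivot_to_target: Translator) -> Translator:
    """Chain two one-directional translators through a pivot language."""
    def translate(sentence: str) -> str:
        pivot_sentence = source_to_pivot(sentence)   # e.g. German -> English
        return pivot_to_target(pivot_sentence)       # e.g. English -> French
    return translate


# Hypothetical toy "models": lookup tables standing in for real systems.
# Unknown input is passed through unchanged.
TOY_DE_EN = {"hallo welt": "hello world"}
TOY_EN_FR = {"hello world": "bonjour le monde"}


def toy_de_to_en(sentence: str) -> str:
    return TOY_DE_EN.get(sentence.lower(), sentence)


def toy_en_to_fr(sentence: str) -> str:
    return TOY_EN_FR.get(sentence.lower(), sentence)


de_to_fr = make_pivot_translator(toy_de_to_en, toy_en_to_fr)
print(de_to_fr("Hallo Welt"))  # prints "bonjour le monde"
```

In a real setup each of the two functions would call a separately running
model, so the memory cost is that of both models, and errors made in the first
step carry over into the second.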