Hi IGA,

That would be great.
There is also this collection of data for English/Japanese translation. If you collect and prepare all of this, I can then either help you build a model, or build it myself.

http://www.phontron.com/japanese-translation-data.php

Sincerely,
Matt

> On Aug 5, 2016, at 5:22 AM, IGA Tosiki <igap...@gmail.com> wrote:
>
> Hi Matt,
>
> I can convert those XML en-ja pair into other format as you point, if
> you think the pairs are useful, and if you want to do so.
>
> Regards,
> Toshiki
>
> 2016-08-05 17:53 GMT+09:00 IGA Tosiki <igap...@gmail.com>:
>> Hi Matt,
>>
>> I can share my en-ja parallel data.
>>
>> https://osdn.jp/projects/blancofw/releases/52952
>>
>> It is pair that translation en to ja for Eclipse IDE menu and
>> messages. It is translated by human and also checked by human.
>>
>> Toshiki
>>
>> 2016-08-04 22:02 GMT+09:00 Matt Post <p...@cs.jhu.edu>:
>>> Hi Toshiki,
>>>
>>> Have you been able to gather any parallel data?
>>>
>>> matt
>>>
>>>
>>>> On Jul 22, 2016, at 3:50 PM, Henry Saputra <henry.sapu...@gmail.com> wrote:
>>>>
>>>> HI Toshiki,
>>>>
>>>> For this kind of discussion, let's have it in the dev@ list.
>>>>
>>>> You can ask the question to dev@joshua.incubator.apache.org.
>>>>
>>>> Thanks,
>>>>
>>>> Henry
>>>>
>>>> On Thu, Jul 21, 2016 at 9:46 PM, IGA Tosiki <igap...@gmail.com> wrote:
>>>>
>>>>> Hi Matt,
>>>>>
>>>>> Thanks for your reply!
>>>>>
>>>>> I'm happy to read your mail, I want to help you Japanese-English language
>>>>> pack.
>>>>> And YES, I mean translation memories by TMS/XLIFF. But I may convert
>>>>> TMS to what you specified format.
>>>>>
>>>>> And also I knew English to Japanese is very difficult, but also I
>>>>> believe sample of English-Japanese language pack will attract many
>>>>> Japanese people to use Joshua.
>>>>>
>>>>> Regards,
>>>>> Toshiki
>>>>>
>>>>> 2016-07-22 12:42 GMT+09:00 Matt Post <p...@cs.jhu.edu>:
>>>>>> Hi,
>>>>>>
>>>>>> There is no Japanese--English language pack, but I would be happy to
>>>>>> build one if you could help by pointing me to data. What we need is
>>>>>> parallel data in the form of sentences that are translations of each
>>>>>> other. If you have access to this or pointers to where I could find
>>>>>> some, I would be happy to build it. There are likely standard datasets
>>>>>> available; people like Graham Neubig (http://www.phontron.com) have
>>>>>> been working on this for a while.
>>>>>>
>>>>>> What are TMS and LTIFF? Are you talking about translation memories?
>>>>>>
>>>>>> As a side note, translation between English and Japanese is very
>>>>>> difficult and tends not to be very good. One approach that helps is
>>>>>> translating from trees and forests. Joshua does not have this
>>>>>> capability at the moment.
>>>>>>
>>>>>> Sincerely,
>>>>>> matt
>>>>>>
>>>>>>
>>>>>>> On Jul 21, 2016, at 11:28 PM, IGA Tosiki <igap...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi team,
>>>>>>>
>>>>>>> I got interest about Joshua, and language pack. I am Japanese, and I
>>>>>>> want to know around Japanese language pack.
>>>>>>>
>>>>>>> Is there any plan about building Japanese-English language pack?
>>>>>>> I believe TMS or LTIFF will usefull to building such language pack. I
>>>>>>> have many OSS based TMS between English-Japanese. Is there any path
>>>>>>> using TMX or LTIFF for input of Joshua language pack?
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Toshiki Iga
>>>>>>
>>>>>
>>>
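
A minimal sketch of the conversion discussed in this thread (XML translation-memory pairs into sentence-aligned parallel data). It assumes the XML follows the TMX 1.4 layout, i.e. `<tu>` units that each hold one `<tuv xml:lang="...">` per language with a `<seg>` inside; the actual format of the blancofw release is not shown in the thread, so the element names may need adjusting. The output layout (two plain-text files with one sentence per line, aligned by line number) is likewise an assumption, and `tmx_to_pairs`, `write_parallel`, and `SAMPLE` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written sample in the assumed TMX-like layout (not real project data).
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Open File</seg></tuv><tuv xml:lang="ja"><seg>ファイルを開く</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Save</seg></tuv></tu>
</body></tmx>"""


def tmx_to_pairs(tmx_text, src="en", tgt="ja"):
    """Return a list of (source, target) segment pairs found in TMX text."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files use a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is None:
                continue
            # Flatten inline markup and collapse whitespace so that each
            # segment occupies exactly one output line.
            text = " ".join("".join(seg.itertext()).split())
            # Treat regional variants such as "en-US" as "en".
            segs[lang.split("-")[0]] = text
        # Keep only units that have non-empty text on both sides.
        if segs.get(src) and segs.get(tgt):
            pairs.append((segs[src], segs[tgt]))
    return pairs


def write_parallel(pairs, src_path, tgt_path):
    """Write pairs to two files so that line N of each file is a translation pair."""
    with open(src_path, "w", encoding="utf-8") as fs, \
         open(tgt_path, "w", encoding="utf-8") as ft:
        for s, t in pairs:
            fs.write(s + "\n")
            ft.write(t + "\n")
```

For example, `write_parallel(tmx_to_pairs(SAMPLE), "corpus.en", "corpus.ja")` would write one aligned line to each file (the file names are placeholders); the second unit in `SAMPLE` is skipped because it has no Japanese side.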