My use case for Joshua involves creating internal, scalable web services to
translate text across several language pairs.  Most of the code changes that I
(and others on my team) aim to contribute focus on Joshua's stability and
performance.  So far my work has mostly been around speeding up decoding
and training (I hope to have some significant patches incoming around
the time of the Con).

I'll go ahead and book flights to Vancouver as well.  Hope to see most of
you there.

-Kellen


I'm going to look into making a short trip. I think I'd arrive on the 11th
and then leave on Friday the 13th. Could we plan a meetup for the night of
Thursday the 12th? It'd be great to meet everyone (and having a deadline
would help me prioritize :) )

matt


> On Mar 15, 2016, at 12:24 PM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
wrote:
>
> Hi Matt,
>
> On Mon, Mar 14, 2016 at 8:26 AM, Matt Post <p...@cs.jhu.edu> wrote:
>
>> Whoa! Lewis, can you give some more detail on this talk, what you
>> proposed, and what you plan to talk about?
>>
>
> http://sched.co/6OJI
>
>
>>
>> I haven't ever been to ApacheCon, but am interested in going. I don't have
>> much of a feel for what motivates folks outside the academic research
>> community, and that would be good to have in laying out projects that might
>> interest people.
>>
>
> I agree. Would be great to meet you there. We could have a Joshua meetup.
>
>
>>
>> Regarding those projects, I have a number of them. Perhaps it would be
>> useful to flesh them out with some more detail, and perhaps post them, for
>> those who are interested. First, with respect to Tommaso's question, the
>> following:
>>
>> - Use cases. I'd really like to push machine translation as a black box,
>> where people can download and use models, not caring how they work, and
>> build on top of them. I think this could be transformative. I've just
>> added to Joshua the ability to add, store, and manage custom phrasal
>> translation rules, which would let people take a model and add their own
>> translations on top of it, perhaps correcting mistakes as they encounter
>> them. There's a JSON API for it (undocumented).
>>
>> Building this up would also require pulling together lots of different
>> test sets, evaluating changes, and so on.
>>
>> - Neural nets. This is a huge research area. I think the main advantage is
>> that it could enable releasing much smaller models. However, on the
>> downside, it's not yet clear how best to integrate these models into
>> Joshua. Fully neural attention models would require re-architecting
>> Joshua, as they are essentially a new paradigm. Adding neural components as
>> feature functions that interact with the existing decoding algorithm would
>> be an intermediate step.
>>
>
> OK. This sounds bang on as a meetup topic. Regardless of who is
> there, we could have a Webex or something similar for the incubating
> community.
>
>
>>
>> For other projects, I'd love:
>>
>> - Better documentation, developer and end-user (probably I need to write a
>> lot of this; if nothing else, knowing that people want it would be hugely
>> useful to me in prioritizing)
>
>
>> - Rewriting certain components. The tuning modules, in particular, are a
>> real mess, and should be consolidated and improved.
>>
>> - Replacing Moses components. Joshua can call out to Moses to build phrase
>> tables; it would be nice to replace this (and it wouldn't be that hard)
>> with our own Java implementations. It would also be good to add a
>> lexicalized distortion model to the phrase-based decoder.
>>
>>
> These all sound excellent and would all make very reasonable GSoC projects.
> Thanks
> Lewis
