Re: ApacheCon 2016 and Joshua

Lewis John Mcgibbney Wed, 23 Mar 2016 13:16:24 -0700

This sounds cool, folks.
I'll be there for the entire conference ... I hope.
I think we should jump on Matt's suggestion to meet up on Thursday. Time TBC.
@Matt, can you put a wiki page together so we can get an agenda together?
https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+%28Incubating%29+Home


On Tue, Mar 22, 2016 at 8:04 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:

> My use case for Joshua involves creating internal scalable web services to
> translate text across several language arcs.  Most of the code changes I
> (and others on my team) aim to contribute focus on Joshua stability and
> performance.  So far my work has been mostly around speeding up decoding
> and training (I hope to have some significant patches incoming around
> the time of the Con).
>
> I'll go ahead and book flights to Vancouver as well.  Hope to see most of
> you there.
>
> -Kellen
>
>
> I'm going to look into making a short trip. I think I'd arrive on the 11th
> and then leave on Friday the 13th. Could we plan a meetup for the night of
> Thursday the 12th? It'd be great to meet everyone (and having a deadline
> would help me prioritize :) )
>
> matt
>
>
> > On Mar 15, 2016, at 12:24 PM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:
> >
> > Hi Matt,
> >
> > On Mon, Mar 14, 2016 at 8:26 AM, Matt Post <p...@cs.jhu.edu> wrote:
> >
> >> Whoa! Lewis, can you give some more detail on this talk, what you
> >> proposed, and what you plan to talk about?
> >>
> >
> > http://sched.co/6OJI
> >
> >
> >>
> >> I haven't ever been to ApacheCon, but am interested in going. I don't have
> >> much of a feel for what motivates folks outside the academic research
> >> community, and that would be good to have in laying out projects that might
> >> interest people.
> >>
> >
> > I agree. Would be great to meet you there. We could have a Joshua meetup.
> >
> >
> >>
> >> Regarding those projects, I have a number of them. Perhaps it would be
> >> useful to flesh them out with some more detail, and perhaps post them, for
> >> those who are interested. First, with respect to Tommaso's question, the
> >> following:
> >>
> >> - Use cases. I'd really like to push machine translation as a black box,
> >> where people can download and use models, not caring how they work, and
> >> building on top of them. I think this could be transformative. I've just
> >> added to Joshua the ability to add, store, and manage custom phrasal
> >> translation rules, which would let people take a model and add their own
> >> translations on top of it, perhaps correcting mistakes as they encounter
> >> them. There's a JSON API for it (undocumented).
> >>
> >> Building this up would also require pulling together lots of different
> >> test sets, evaluating changes, and so on.
> >>
> >> - Neural nets. This is a huge research area. I think the advantages are
> >> that it could enable releasing models that are much smaller. However, on
> >> the downside, it's not clear what the best way to integrate these models
> >> into Joshua is. Fully neural attention models would require re-architecting
> >> Joshua, as they are essentially a new paradigm. Adding neural components as
> >> feature functions that interact with the existing decoding algorithm would
> >> be an intermediate step.
> >>
> >
> > OK. This sounds bang on for a meetup topic. Regardless of who is
> > there, we could have a Webex or something similar for the incubating
> > community.
> >
> >
> >>
> >> For other projects, I'd love:
> >>
> >> - Better documentation, developer and end-user (probably I need to write a
> >> lot of this; if nothing else, it would be hugely useful to me in terms of
> >> prioritizing to know that people want it)
> >
> >
> >> - Rewriting certain components. The tuning modules, in particular, are a
> >> real mess, and should be synthesized and improved.
> >>
> >> - Replacing Moses components. Joshua can call out to Moses to build phrase
> >> tables; it would be nice to get rid of this (and wouldn't be that hard)
> >> with our own Java implementations. It would also be good to add a
> >> lexicalized distortion model to the phrase-based decoder.
> >>
> >>
> > These all sound excellent and would all make very reasonable GSoC projects.
> > Thanks
> > Lewis
>



-- 
*Lewis*
