This sounds cool, folks. I'll be there for the entire conference ... I hope. I think we should jump on Matt's suggestion to meet up on Thursday; time TBC. @Matt, can you put a wiki page together so we can work out an agenda?
https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+%28Incubating%29+Home

On Tue, Mar 22, 2016 at 8:04 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:
> My use case for Joshua involves creating internal scalable web services to
> translate text across several language arcs. Most of the code changes I
> (and others on my team) aim to contribute focus on Joshua stability and
> performance. So far my work has been mostly around speeding up decoding
> and training (I hope to have some significant patches incoming around
> the time of the Con).
>
> I'll go ahead and book flights to Vancouver as well. Hope to see most of
> you there.
>
> -Kellen
>
> > I'm going to look into making a short trip. I think I'd arrive on the
> > 11th and then leave on Friday the 13th. Could we plan a meet up for the
> > night of Thursday the 12th? It'd be great to meet everyone (and having a
> > deadline would help me prioritize :) )
> >
> > matt
> >
> > On Mar 15, 2016, at 12:24 PM, Lewis John Mcgibbney
> > <lewis.mcgibb...@gmail.com> wrote:
> >
> > Hi Matt,
> >
> > On Mon, Mar 14, 2016 at 8:26 AM, Matt Post <p...@cs.jhu.edu> wrote:
> >
> >> Whoa! Lewis, can you give some more detail on this talk, what you
> >> proposed, and what you plan to talk about?
> >
> > http://sched.co/6OJI
> >
> >> I haven't ever been to ApacheCon, but am interested in going. I don't
> >> have much of a feel for what motivates folks outside the academic
> >> research community, and that would be good to have in laying out
> >> projects that might interest people.
> >
> > I agree. Would be great to meet you there. We could have a Joshua meetup.
> >
> >> Regarding those projects, I have a number of them. Perhaps it would be
> >> useful to flesh them out with some more detail, and perhaps post them,
> >> for those who are interested. First, with respect to Tommaso's question,
> >> the following:
> >>
> >> - Use cases. I'd really like to push machine translation as a black box,
> >> where people can download and use models, not caring how they work, and
> >> building on top of them. I think this could be transformative. I've just
> >> added to Joshua the ability to add, store, and manage custom phrasal
> >> translation rules, which would let people take a model and add their own
> >> translations on top of it, perhaps correcting mistakes as they encounter
> >> them. There's a JSON API for it (undocumented).
> >>
> >> Building this up would also require pulling together lots of different
> >> test sets, evaluating changes, and so on.
> >>
> >> - Neural nets. This is a huge research area. I think the advantages are
> >> that it could enable releasing models that are much smaller. However, on
> >> the down side, it's not clear what the best way to integrate these
> >> models into Joshua is. Fully neural attention models would require
> >> re-architecting Joshua, as they are essentially a new paradigm. Adding
> >> neural components as feature functions that interact with the existing
> >> decoding algorithm would be an intermediate step.
> >
> > OK. This sounds bang on for a meetup topic. Regardless of who is
> > there, we could have a Webex or something similar for the incubating
> > community.
> >
> >> For other projects, I'd love:
> >>
> >> - Better documentation, developer and end-user (probably I need to
> >> write a lot of this; if nothing else, it would be hugely useful to me in
> >> terms of prioritizing to know that people want it)
> >>
> >> - Rewriting certain components. The tuning modules, in particular, are a
> >> real mess, and should be synthesized and improved.
> >>
> >> - Replacing Moses components. Joshua can call out to Moses to build
> >> phrase tables; it would be nice to replace this (and it wouldn't be that
> >> hard) with our own Java implementations. It would also be good to add a
> >> lexicalized distortion model to the phrase-based decoder.
> >
> > These all sound excellent and would all make very reasonable GSoC
> > projects.
> > Thanks
> > Lewis
>

--
*Lewis*