Dynamite folks.

On Thu, May 12, 2016 at 5:15 PM, Tom Barber <t...@analytical-labs.com> wrote:
> I'd also like to discuss deployment for users (yeah I'm the boring guy).
>
> I saw some docker stuff in the emails earlier, I've also got the majority
> of a Juju charm ready, so users can do:
>
> juju deploy joshua-decoder
> juju action joshua-decoder/0 add-language-pack es-en-phrase
>
> for example and they'll have a kickstarted server ready for them to use.
> But clearly there must be a bunch more stuff I can do to enhance this for
> the people wanting to train it etc.
>
> Tom
>
> --------------
>
> Director Meteorite.bi - Saiku Analytics Founder
> Tel: +44(0)5603641316
>
> (Thanks to the Saiku community we reached our Kickstart
> <http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
> goal, but you can always help by sponsoring the project
> <http://www.meteorite.bi/products/saiku/sponsorship>)
>
> On 12 May 2016 at 22:30, kellen sunderland <kellen.sunderl...@gmail.com>
> wrote:
>
> > Thanks for organizing Lewis,
> >
> > Here's some topics for discussion I've been noting while working with
> > Joshua. None of these are high priority issues for me, but if we are all
> > in agreement on them it might make sense to log them.
> >
> > Boring code convention stuff: Logging with log4j, throw Runtime Exceptions
> > instead of Typed, remove all system exits (replace with RuntimeExceptions),
> > refactor some large files.
> >
> > Testing: Integrate existing unit tests, provide some good test examples so
> > others can begin adding more tests.
> >
> > Configuration: We also touched on IoC, CLI args, and configuration changes
> > that are possible.
> >
> > OO stuff: Joshua is pretty good here, but I would personally prefer more
> > granular interfaces. I wouldn't advocate radical changes, but maybe a
> > little refactoring might make sense to better align with the interface
> > segregation principle.
> > https://en.wikipedia.org/wiki/Interface_segregation_principle
> >
> > JNI reliance: We've found KenLM works really well with Joshua, but there
> > is one issue with using it. It requires many JNI calls during decoding and
> > these calls impact GC performance. In fact when a JNI call happens the GC
> > throws out any work it may have done and quits until the JNI call
> > completes. The GC will then resume and start marking objects for
> > collection from scratch. This is not ideal especially for programs with
> > large heaps (Joshua / Spark). There's a couple ways we could mitigate this
> > and I think they'd all speed up Joshua quite a lot.
> >
> > High level roadmap topics:
> >
> > * Distributed Decoding is something I'll likely continue working on.
> > There's some obvious things we can do given usage patterns of translation
> > engines that can help us out here (I think).
> > * Providing a way to optimize Joshua for low-latency, low-throughput calls
> > could be interesting for those with near real-time use cases. Providing a
> > way to optimize for high-latency, high-throughput could be interesting for
> > async/batch use cases.
> > * The machine learning optimization algorithms could be cleaned up a bit
> > (MERT/MIRA).
> > * The Vocabulary could probably be replaced with a simpler implementation
> > (without sacrificing performance).
> >
> > -Kellen
> >
> >
> >
> > On Thu, May 12, 2016 at 12:32 PM, Lewis John Mcgibbney
> > <lewis.mcgibb...@gmail.com> wrote:
> >
> > > Hi Folks,
> > > Kellen, Henri and I are going to get together tomorrow 13th around
> > > lunchtime PST to talk everything Joshua.
> > > Would be great to have others online via GChat if possible.
> > > Let's say around 11am PST for the time being.
> > > See you then folks.
> > > Thanks
> > > Lewis
> > >
> > >
> > > --
> > > *Lewis*
> > >
>
>

--
*Lewis*
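
For reference, the code-convention item quoted above (remove system exits and replace them with RuntimeExceptions) might look roughly like the sketch below. Every name in it (DecoderSettings, ConfigurationException, parseThreadCount) is made up for illustration and is not taken from the Joshua code base; it only shows the general shape of the change, where library code throws an unchecked exception and only the outermost entry point decides on an exit code.

```java
// Hypothetical example: none of these names come from the Joshua code base.
final class DecoderSettings {

    /** Unchecked exception for bad configuration; the caller decides how to react. */
    static final class ConfigurationException extends RuntimeException {
        ConfigurationException(String message) {
            super(message);
        }
    }

    /** Parses a thread count, throwing instead of calling System.exit(1). */
    static int parseThreadCount(String raw) {
        int threads;
        try {
            threads = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new ConfigurationException("thread count is not a number: " + raw);
        }
        if (threads < 1) {
            throw new ConfigurationException("thread count must be at least 1: " + raw);
        }
        return threads;
    }

    public static void main(String[] args) {
        try {
            int threads = parseThreadCount(args.length > 0 ? args[0] : "1");
            System.out.println("threads = " + threads);
        } catch (ConfigurationException e) {
            // Only the outermost entry point turns a failure into an exit code.
            System.err.println("error: " + e.getMessage());
            System.exit(2);
        }
    }
}
```

Written this way, the parsing code can be exercised from unit tests or embedded in a long-running server without a bad value terminating the whole JVM.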