Yes, tighter integration with other Apache projects sounds like a good idea to 
me. Rewriting Thrax to use a more modern tool would also be hugely helpful to 
Joshua in the long term. It is getting harder and harder to find and maintain 
(much less justify) Hadoop clusters that are separate from other research ones.


> On Jun 28, 2017, at 3:42 AM, Tommaso Teofili <tommaso.teof...@gmail.com> 
> wrote:
> 
> +1
> 
> Tommaso
> 
> On Wed, Jun 28, 2017 at 07:46, lewis john mcgibbney <
> lewi...@apache.org> wrote:
> 
>> Hi Suneel,
>> I think it's worth opening a JIRA issue and we can possibly mark it for
>> 7.X?
>> lewis
>> 
>> On Tue, Jun 27, 2017 at 9:36 PM, <
>> dev-digest-h...@joshua.incubator.apache.org> wrote:
>> 
>>> 
>>> From: Suneel Marthi <smar...@apache.org>
>>> To: dev@joshua.incubator.apache.org
>>> Cc:
>>> Bcc:
>>> Date: Fri, 23 Jun 2017 01:59:28 -0400
>>> Subject: Re: [ANNOUNCE] - Apache Joshua 6.1 incubating release
>>> Congrats on the release.
>>> 
>>> I have been a silent lurker on this channel since I first heard of Joshua
>>> last September at Amazon, Berlin.
>>> 
>>> Tommaso and I recently gave a talk at Berlin Buzzwords 2017 -
>>> 'Embracing Diversity - searching over multiple languages' [1]
>>> using Apache Joshua for Machine Translation, and Apache OpenNLP for
>>> Language detection.
>>> 
>>> I have been wondering how much of the present VLPS can be replaced by
>>> OpenNLP with Flink/Beam pipelines.
>>> I gave a talk last week at Hadoop Summit, San Jose, on 'Large Scale Text
>>> processing with Apache OpenNLP and Apache Flink' [2].
>>> 
>>> Also, Thrax, which is presently MapReduce-based, can definitely be
>>> ported over to modern streaming distributed frameworks like Flink/Kafka
>>> Streams/Beam.
>>> 
>>> 
>>> [1]
>>> https://www.youtube.com/watch?v=ZrWxySF-9KY&index=20&t=2s&list=PLq-odUc2x7i-9Nijx-WfoRMoAfHC9XzTt
>>> [2] https://www.slideshare.net/SuneelMarthi/large-scale-text-processing
>>> 
>>> 
>>> 
>> 
