Re: Does the UIMA pipeline support analysis components written as mahout map-reduce jobs
Thanks Julien, I'll have a closer look at the Mahout module supported by Behemoth.
Re: Does the UIMA pipeline support analysis components written as mahout map-reduce jobs
BTW, Behemoth has a Mahout module which can generate input vectors for clustering. Annoyingly, the Mahout classifiers have no standard interface and expect different inputs, but it wouldn't be too difficult to hack the code in the Mahout module to generate whatever input a particular Mahout classifier implementation needs.

Julien
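For the clustering case, Mahout's drivers read a SequenceFile of VectorWritable values, so the handoff from the feature-extraction layer usually looks something like the sketch below. This is only an illustration of that input format, not the Behemoth Mahout module itself; the feature map, dictionary size and output path are assumed to come from the UIMA layer.

import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.math.NamedVector;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class FeatureVectorWriter {

  /**
   * Writes one sparse vector per document into a SequenceFile<Text, VectorWritable>,
   * the format expected by Mahout's clustering drivers (e.g. k-means).
   *
   * @param docFeatures    document id -> (feature index -> weight), produced upstream
   * @param dictionarySize total number of distinct features
   */
  public static void write(Map<String, Map<Integer, Double>> docFeatures,
                           int dictionarySize, Path output, Configuration conf)
      throws IOException {
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, output, Text.class, VectorWritable.class);
    try {
      for (Map.Entry<String, Map<Integer, Double>> doc : docFeatures.entrySet()) {
        Vector vector = new RandomAccessSparseVector(dictionarySize);
        for (Map.Entry<Integer, Double> feature : doc.getValue().entrySet()) {
          vector.set(feature.getKey(), feature.getValue());
        }
        // NamedVector keeps the document id attached to the vector
        writer.append(new Text(doc.getKey()),
                      new VectorWritable(new NamedVector(vector, doc.getKey())));
      }
    } finally {
      writer.close();
    }
  }
}

As noted above, the classifiers are less uniform: each classifier implementation expects its own input preparation, so this covers only the clustering handoff.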
Re: Does the UIMA pipeline support analysis components written as mahout map-reduce jobs
Thanks for sharing your thoughts, guys. I think it would be better for me to keep the two layers separate: the UIMA pipeline can be used to extract useful features, and another layer can then use those features to build and train deep learning models (via Mahout/MapReduce jobs).

Cheers,
Som
Re: Does the UIMA pipeline support analysis components written as mahout map-reduce jobs
We tackled this same issue. Ultimately, since a UIMA process is usually concerned with a single document, it made more structural sense to wrap the UIMA task within a Mapper. That keeps the entire process within the functional programming paradigm. We were also concerned with how fragile the UIMA configuration can be, and it was easier to control when embedded within a Mapper. Similarly with Mahout, though we separated the two jobs.

Brian
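For concreteness, a rough sketch of the pattern described above: the analysis engine is built once per task from its XML descriptor, and each map() call feeds one document through it. The configuration key, the one-document-per-record assumption and the annotation-count output are illustrative placeholders, not a reference implementation.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.ResourceSpecifier;
import org.apache.uima.util.XMLInputSource;

public class UimaMapper extends Mapper<LongWritable, Text, Text, Text> {

  private AnalysisEngine engine;
  private JCas jcas;

  @Override
  protected void setup(Context context) throws IOException {
    try {
      // Descriptor path passed through the job configuration (assumed key name).
      String descriptor = context.getConfiguration().get("uima.descriptor.path");
      ResourceSpecifier spec = UIMAFramework.getXMLParser()
          .parseResourceSpecifier(new XMLInputSource(descriptor));
      engine = UIMAFramework.produceAnalysisEngine(spec);
      jcas = engine.newJCas();  // one CAS, reused for every document
    } catch (Exception e) {
      throw new IOException("Could not initialise UIMA analysis engine", e);
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    int annotationCount;
    try {
      jcas.reset();
      jcas.setDocumentText(value.toString());  // assumes one document per input record
      engine.process(jcas);
      annotationCount = jcas.getAnnotationIndex().size();
    } catch (Exception e) {
      throw new IOException("UIMA processing failed for record " + key, e);
    }
    // Placeholder output: a real job would serialise whatever annotations/features it needs.
    context.write(new Text(Long.toString(key.get())),
                  new Text(Integer.toString(annotationCount)));
  }

  @Override
  protected void cleanup(Context context) {
    if (engine != null) {
      engine.destroy();
    }
  }
}

Parsing the descriptor once in setup() also localises the fragile-configuration problem: a broken descriptor fails the task at startup rather than on every record.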
Re: Does the UIMA pipeline support analysis components written as mahout map-reduce jobs
Hi,

I suppose you could expose MapReduce jobs as UIMA components, but it would certainly be easier to do it the other way round and use e.g. Behemoth [1] to run the UIMA PEARs on MapReduce.

HTH,
Julien

[1] https://github.com/DigitalPebble/behemoth
Re: Does the UIMA pipeline support analysis components written as mahout map-reduce jobs
What do you want to do? Map-reduce is batch processing, whereas a UIMA AE works online, so this doesn't really fit. In Mahout, map-reduce is usually used for training, not e.g. for applying a trained classifier. So you would train whichever way you want (e.g. using map-reduce), but your UIMA AE would actually be a wrapper for an online classifier, not a map-reduce task.

Best,
Jens

On 02/13/2013 11:47 PM, Som Satpathy wrote:
> Hi all,
>
> I have been toying around with UIMA pipelines for some time now. I was
> wondering if UIMA can support analysis components written as Mahout
> map-reduce jobs as part of a UIMA pipeline?
>
> I would appreciate any help/hints/pointers.
>
> Thanks,
> Som
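A minimal sketch of the split described above: the model is trained offline (with Mahout map-reduce or anything else), and the UIMA annotator only loads the resulting model and applies it one document at a time. The DocumentClassifier interface, the modelPath parameter and the stub loadModel() are hypothetical placeholders for whatever trained model and type system are actually used.

import org.apache.uima.UimaContext;
import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.ResourceInitializationException;

public class ClassifierAnnotator extends JCasAnnotator_ImplBase {

  /** Hypothetical abstraction over a pre-trained (e.g. Mahout) classifier applied online. */
  public interface DocumentClassifier {
    String classify(String text);
  }

  private DocumentClassifier classifier;

  @Override
  public void initialize(UimaContext context) throws ResourceInitializationException {
    super.initialize(context);
    // Load the model that was trained offline; the parameter name is an assumption.
    String modelPath = (String) context.getConfigParameterValue("modelPath");
    classifier = loadModel(modelPath);
  }

  @Override
  public void process(JCas jcas) throws AnalysisEngineProcessException {
    // Apply the trained model to one document -- no map-reduce involved at this point.
    String label = classifier.classify(jcas.getDocumentText());
    // A real component would record the label as an annotation or document-level feature
    // structure from its own type system; omitted here to stay type-system neutral.
    getContext().getLogger().log(java.util.logging.Level.FINE, "Predicted label: " + label);
  }

  private DocumentClassifier loadModel(String modelPath) {
    // Placeholder: deserialise whatever trained Mahout model was produced offline
    // (e.g. a model directory on HDFS) and adapt it to the DocumentClassifier interface.
    return new DocumentClassifier() {
      @Override
      public String classify(String text) {
        return "UNKNOWN";  // stub so the sketch is self-contained; replace with the real model
      }
    };
  }
}

With this arrangement the map-reduce jobs exist only at training time, outside the UIMA pipeline.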