Re: Incubator Proposal: Pig

Thilo Goetz Sun, 23 Sep 2007 23:00:21 -0700

Niclas Hedhman wrote:
[...]
> 
> b) I can't say that I understand the technical merits of the proposal, and 
> just see the headline "analyzing large data sets". And I would like to know 
> the relationship with UIMA's statement "... analyze large volumes of 
> unstructured information..." and hear whether there are overlap, synergies 
> and/or collaboration in view.


Niclas,

I'm not 100% clear on where there could be synergies between
Pig and UIMA.  Map/reduce is a natural distribution
strategy for UIMA, so executing UIMA programs on top of Hadoop
seems like a good fit.  Maybe Pig can help with that and make it easier
somehow.  However, that is not clear to me from the proposal
at this time.

At the same time, I don't really think there is any overlap.
Pig is concerned with computation in a distributed environment,
while UIMA is agnostic in that respect.  On the other hand,
UIMA offers a component model to develop analysis modules and
combine them into processing chains (with an emphasis on reuse).
I do not see from the proposal that Pig is in the business of
defining a component model.

So synergies probably yes, no overlap as far as I can see.

--Thilo


