Consider cleaning up backend code

2010-04-22 Thread Richard Ding
Pig has an abstraction layer (interfaces and abstract classes) to support multiple execution engines. After PIG-1053, Hadoop is the only execution engine supported by Pig. I wonder if we should remove this layer of code, and make Hadoop THE execution engine for Pig. This will greatly simplify the bac

Re: Consider cleaning up backend code

2010-04-22 Thread Milind A Bhandarkar
+1.
- milind
On 4/22/10 11:35 AM, "Richard Ding" wrote:
> Pig has an abstraction layer (interfaces and abstract classes) to
> support multiple execution engines. After PIG-1053, Hadoop is the only
> execution engine supported by Pig. I wonder if we should remove this
> layer of code, and make

Re: Consider cleaning up backend code

2010-04-22 Thread Arun C Murthy
+1
Arun
On Apr 22, 2010, at 11:35 AM, Richard Ding wrote:
> Pig has an abstraction layer (interfaces and abstract classes) to support multiple execution engines. After PIG-1053, Hadoop is the only execution engine supported by Pig. I wonder if we should remove this layer of code, and make Hadoop

Re: Consider cleaning up backend code

2010-04-22 Thread Dmitriy Ryaboy
I kind of dig the concept of being able to plug in a different backend, though I definitely think we should get rid of the dead localmode code. Can you give an example of how this will simplify the codebase? Is it more than just GenericClass foo = new SpecificClass(), and the associated extra files

Re: Consider cleaning up backend code

2010-04-22 Thread Milind A Bhandarkar
I think it is a great idea to be able to plug in different back-ends. But the way to do that, IMHO, is to make the intermediate artifacts public (akin to making byte-code specs public). That way, independent projects can spring up that take the translated pig script, and provide a new interpret

Re: Consider cleaning up backend code

2010-04-22 Thread Arun C Murthy
I read it as getting rid of concepts parallel to hadoop in src/org/apache/pig/backend/hadoop/datastorage. Is that true?
thanks,
Arun
On Apr 22, 2010, at 1:34 PM, Dmitriy Ryaboy wrote:
> I kind of dig the concept of being able to plug in a different backend, though I definitely think we shou

RE: Consider cleaning up backend code

2010-04-22 Thread Richard Ding
@hadoop.apache.org
Subject: Re: Consider cleaning up backend code
I read it as getting rid of concepts parallel to hadoop in src/org/apache/pig/backend/hadoop/datastorage. Is that true?
thanks,
Arun
On Apr 22, 2010, at 1:34 PM, Dmitriy Ryaboy wrote:
> I kind of dig the concept of being able to plug i

Re: Consider cleaning up backend code

2010-04-22 Thread Arun C Murthy
[mailto:a...@yahoo-inc.com]
Sent: Thursday, April 22, 2010 4:14 PM
To: pig-dev@hadoop.apache.org
Subject: Re: Consider cleaning up backend code
I read it as getting rid of concepts parallel to hadoop in src/org/apache/pig/backend/hadoop/datastorage. Is that true?
thanks,
Arun
On Apr 22, 2010

Re: Consider cleaning up backend code

2010-04-22 Thread Alan Gates
A couple of years ago we had this concept that Pig, as is, should be able to run on other backends (like, say, Dryad, if it were open source). So we built this whole backend interface and (mostly) kept Hadoop-specific objects out of the front end. Recently we have modified that stand and said t

Re: Consider cleaning up backend code

2010-04-22 Thread Jianyong Dai
+1 for removing. This interface does not bring us any value when we decide to move closer to Hadoop. Writing a backend is almost writing half of Pig. I don't think this interface is attractive to most developers. Instead, I +1 for Milind's idea to make intermediate artifacts available, or provi