Re: Merging Hadoop-Eclipse project into HDT

Rahul Sharma Thu, 18 Jul 2013 22:23:34 -0700

I guess this is just the creation of a branch in the repo, so Adam, I could
take this up if you allow me. Moreover, we should also set up the HDT build
in Apache Jenkins [1]. I am able to log in, but I don't have the rights to
create a new job. Does anyone have job creation rights there? If not, then I
think Chris can give admin rights [2] for Jenkins, or do I need to open an
INFRA issue?


regards,
Rahul

[1] https://builds.apache.org/
[2] http://wiki.apache.org/general/Jenkins#How_do_I_get_an_account



On Wed, Jul 17, 2013 at 11:53 AM, Srimanth Gunturi <[email protected]> wrote:

> Hi Adam,
> Thank you for setting up a separate branch. Please share the name of the
> branch once created.
> I hope to start submitting patches soon. I will be renaming my plugins to
> org.apache.hdt.* as part of the transfer.
> Best regards,
> Srimanth
>
>
>
>
>
> On Thu, Jul 11, 2013 at 2:51 PM, Adam Berry <[email protected]> wrote:
>
> > Hey guys,
> >
> > so for background, the intent for this project was always to get to where
> > we can connect to multiple versions of Hadoop clusters from one dev tools
> > install.
> > See http://wiki.apache.org/incubator/HadoopDevelopmentToolsProposal for
> > a little reference.
> >
> > The code as it stands is basically a first pass at splitting the code
> > from Hadoop contrib, where it resided as a single plug-in, with all the
> > same client library limitations. And yes, you cannot connect to multiple
> > versions of Hadoop etc., and the logic is pretty coupled in some spots.
> >
> > Personally, I'm all for pulling in this new work for HDFS and ZooKeeper,
> > and then bringing the MR side up to the same point with the same
> > approach, as Srimanth suggested.
> >
> > I also vote for doing this in a feature branch in our git repo, and if no
> > one objects I'll get that set up so we can get to work.
> >
> > Cheers,
> > Adam
> >
> >
> > On Tue, Jul 9, 2013 at 2:35 AM, Srimanth Gunturi <[email protected]>
> > wrote:
> >
> > > Hi Rahul,
> > > Hadoop-Eclipse UI is decoupled from models and controllers.
> > > EMF models automatically provide model and controller separation.
> > > That work is already done. If we reused the available code as is, we
> > > would not have to rewrite a major bulk of the functionality.
> > >
> > > IMHO, the least amount of work in getting both projects merged is
> > > putting effort into moving MR core/UI functionality onto the above
> > > plugins following the same paradigms. If we did this, we wouldn't have
> > > to rewrite HDFS and the underlying internals/UI, as they are already
> > > working.
> > > Best regards,
> > > Srimanth
> > >
> > >
> > >
> > >
> > > On Mon, Jul 8, 2013 at 11:02 PM, Rahul Sharma <[email protected]>
> > wrote:
> > >
> > > > +1 to the idea of having some abstraction to access HDFS/ZooKeeper. We
> > > > could make different implementations for different versions. As for
> > > > complete UI decoupling, I think we would like to achieve it. My only
> > > > concern here is that this looks like a complete rewrite, as the current
> > > > version's UI is tightly coupled with logic. We should try to distribute
> > > > this across multiple releases. I will dive into hadoop-eclipse some time
> > > > this week and share my thoughts on the same :)
> > > >
> > > >
> > > > regards,
> > > > Rahul
> > > >
> > > >
> > > > On Mon, Jul 8, 2013 at 11:49 AM, Srimanth Gunturi <[email protected]>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > > With regard to contributing HDFS/ZooKeeper functionality, I was going
> > > > > through the HDT code and noted some design issues/thoughts that I
> > > > > wanted to discuss.
> > > > >
> > > > > 1) Client cannot connect to multiple versions of HDFS/MR servers
> > > > > org.apache.hdt.core.cluster.HadoopCluster, which represents a cluster,
> > > > > provides direct access to the HDFS and MR Java API. This implies that
> > > > > at any time, only one version of the HDFS and MR client libraries can
> > > > > be used (typically, whichever version gets loaded first by the
> > > > > classloader). So if there were any use case where interaction with
> > > > > multiple HDFS/MR versions is required, we would hit runtime issues.
> > > > > The client would be at the mercy of the backward/forward compatibility
> > > > > capabilities of the HDFS/MR/[any-service] clients.
> > > > >
> > > > > In the Hadoop-Eclipse project, to get around this issue, I have
> > > > > created an extension-point based abstraction, where the Eclipse
> > > > > functionality itself would never directly use
> > > > > HDFS/ZooKeeper/[service] classes. Rather, from multiple versions of
> > > > > extension point implementations, the right one would be used to talk
> > > > > to the server. This allows the UI/core (headless) functionalities to
> > > > > be free from the ever-changing versions of clients/servers.
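
The version-neutral abstraction described in point 1 can be sketched in
plain Java. This is only an illustration under stated assumptions, not the
Hadoop-Eclipse code: HdfsClient, Registry, and the version strings are
hypothetical names, and a simple map stands in for the Eclipse extension
registry. In Eclipse, each implementation would live in its own plug-in
with its own classloader, which is what would let different client library
versions coexist.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientRegistrySketch {
    /** Hypothetical version-neutral contract; the tooling codes only against this. */
    interface HdfsClient {
        String listStatus(String path);
    }

    /** Stand-in for an extension-point registry: maps a Hadoop version to a client. */
    static class Registry {
        private final Map<String, HdfsClient> byVersion = new HashMap<>();

        void register(String version, HdfsClient client) {
            byVersion.put(version, client);
        }

        HdfsClient forVersion(String version) {
            HdfsClient client = byVersion.get(version);
            if (client == null) {
                throw new IllegalArgumentException("No client registered for Hadoop " + version);
            }
            return client;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        // Each lambda stands in for a contribution built against one client library version.
        registry.register("1.1", path -> "v1.1 listing of " + path);
        registry.register("2.0", path -> "v2.0 listing of " + path);

        // The caller picks an implementation by the server's version, never by class.
        System.out.println(registry.forVersion("1.1").listStatus("/user"));
        System.out.println(registry.forVersion("2.0").listStatus("/user"));
    }
}
```

The design point is that callers never name a version-specific class, so
adding support for a new Hadoop version means registering one more
implementation rather than touching the callers.
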
> > > > >
> > > > >
> > > > > 2) No clean separation of UI and non-UI capabilities
> > > > > In HDFS, almost all functionality is non-UI (create, read, write,
> > > > > delete of files/folders). However, currently all HDT plugins are
> > > > > dependent on UI plugins (starting with org.apache.hdt.core). This goes
> > > > > against the model-view-controller (MVC) paradigm, since the Eclipse UI
> > > > > (view) is mixed in with the models and controllers. There is no reason
> > > > > why someone could not leverage or extend the core/headless/non-UI
> > > > > capabilities of various Hadoop services in Eclipse without the UI.
> > > > >
> > > > > In the Hadoop-Eclipse project, plugins are categorized into core
> > > > > (representing non-UI capabilities) and UI plugins. You can create
> > > > > connections, create/read/write/delete HDFS/ZooKeeper contents, etc.,
> > > > > without even having UI plugins. To start with, this is helpful in
> > > > > nightly JUnit tests, but it also allows others to provide their own UI
> > > > > interactions on top of ours. The models that are persisted
> > > > > (HDFS/ZooKeeper connections, metadata, etc.) are Eclipse Modeling
> > > > > Framework (EMF) <http://www.eclipse.org/modeling/emf/> models, which
> > > > > have a built-in notification mechanism. They help in a clean
> > > > > separation of Models and Controllers in MVC.
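
To make the core/UI split in point 2 concrete, here is a minimal sketch of
a headless model that notifies listeners on change. It is an assumption
for illustration only: it uses the JDK's java.beans.PropertyChangeSupport
as a stand-in for EMF's notification mechanism rather than the EMF API,
and ConnectionModel and its "status" property are hypothetical names.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

public class HeadlessModelSketch {
    /** Hypothetical core-side model: no UI imports, only change notification. */
    static class ConnectionModel {
        private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
        private String status = "DISCONNECTED";

        void addListener(PropertyChangeListener listener) {
            changes.addPropertyChangeListener(listener);
        }

        String getStatus() {
            return status;
        }

        void setStatus(String newStatus) {
            String old = status;
            status = newStatus;
            // Listeners (a UI view, a test, another plug-in) are told about the change.
            changes.firePropertyChange("status", old, newStatus);
        }
    }

    public static void main(String[] args) {
        ConnectionModel model = new ConnectionModel();
        List<String> seen = new ArrayList<>();

        // A headless consumer, such as a JUnit test, can observe the model without any UI.
        model.addListener(event -> seen.add(event.getOldValue() + " -> " + event.getNewValue()));
        model.setStatus("CONNECTED");
        System.out.println(seen);
    }
}
```

Because the model only knows about listeners, a UI plug-in becomes one
optional observer among several, which is what allows the core plug-ins to
run in tests without the UI installed.
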
> > > > >
> > > > >
> > > > > The above were some of the major ones that came to mind.
> > > > > I encourage the community to go through the Hadoop-Eclipse
> > > > > <http://people.apache.org/~srimanth/hadoop-eclipse/> project
> > > > > codebase, and discuss any issues/concerns you have.
> > > > >
> > > > > I am thinking of the best way to merge the functionalities of both
> > > > > projects, and would like to put forward a proposal.
> > > > > HDFS is the only functionality common to both projects, along with
> > > > > the underlying framework. If we can come to a consensus on which parts
> > > > > we want from where, merging the code will be a smoother effort. From
> > > > > my end of the spectrum, I was thinking it might be easier if the MR
> > > > > functionality could be merged into the HDFS/ZooKeeper functionalities,
> > > > > thus providing a union of both projects.
> > > > >
> > > > > I just wanted to get the merging process started, and look forward to
> > > > > discussing more about it.
> > > > > Best regards,
> > > > > Srimanth
> > > > >
> > > >
> > >
> >
>
