Yes, both HBase and Phoenix are packaged by Bigtop (I did the latter work),
but this is largely orthogonal to the question of whether HBase should ship
'convenience binaries' including a SQL shell, except where I hear "let's
not do that and point people to alternatives like Bigtop". Am I stating
this position correctly? If not, please pardon. If so, thanks for the
feedback.

I don't think we want a 'contrib'. We had one of those a long time ago and
got rid of it. Note that I'm not suggesting we bring Phoenix *into* HBase.
This also goes to Jon's comment about circular dependencies. There are no
dependencies at all introduced here. This would be post-build packaging of
a convenience binary only.

I was hoping to sidestep semver and multi-branching multi-versioning
concerns by making this about convenience binaries only. By definition it's
something done for the convenience of users on a best-effort basis. It
doesn't have to cover all of our build combinatorics. Otherwise neither we
nor the Phoenix project is going to be in a position to always have version
X to cover release Y of the other. We'd throw our hands up and leave the
actual integration of Phoenix with HBase to the users themselves, or to
Bigtop, or to commercial vendors. I don't think we need to hamstring
ourselves like that, but if the community thinks the 'convenience binary'
distinction doesn't matter, or doesn't matter *enough*, then ok, no need to
discuss this further. Is that the case?

> What if other projects were considered for this special treatment?
> Projects like cask and tephra have a large overlap of hbase community
> members as well.  Would we have to have criteria to determine how/when to
> include those project as well?

Do we need criteria in place to cover every related contingency whenever
something is proposed? Has there been a problem getting consensus as needed
on an ad hoc basis up to now? For all the good that surely will come of it,
I think we have also opened the door to a legislative itch with semver. I
fear we will risk a constitutional crisis whenever Hadoop upgrades a
dependency, but there *I* go conflating things. (smile) You can strike that
as not related to the matter at hand.


On Tue, Mar 17, 2015 at 7:26 PM, Sean Busbey <bus...@cloudera.com> wrote:

> I like the idea of having an out of the box solution for using phoenix on
> top of hbase, but I worry about the conflict when folks want to upgrade one
> or the other. Our instructions for replacing Hadoop jars will get
> substantially more complicated if they have to include phoenix and its
> dependencies.
>
> Two possible compromise positions:
>
> * Apache Bigtop - it already integrates the rest of the stack. My apologies
> if it's in there (or proposed and rejected), phone access limits my ability
> to check.
>
> * Sub-Project - either hbase or phoenix could start a contrib repo that did
> this out of the box combined distro. It could also try to help other
> on-ramping problems, like setting up a cluster without having to manage
> your own deployment of HDFS / ZK.
>
> As the subproject matures we'd have a lower risk way of assessing how
> coupled hbase and phoenix releases are and what kind of deployment
> efficiencies we get.
>
> --
> Sean
> On Mar 17, 2015 7:47 PM, "Jonathan Hsieh" <j...@cloudera.com> wrote:
>
> > I like Nick's approach of including a hbase (and its deps) inside of
> > phoenix releases or having the dockerfile with the components
> "installed".
> > This coupling seems more easy to manage since phoenix already has two
> > branches for 0.94 and 0.98 support -- each could include its own hbase
> and
> > choose to upgrade point versions or minor versions without introducing
> > confusion.  That approach is a clean way to deal with semver breaking
> > dependencies in the other hadoop/hbase deps discussion (vs the
> > hadoop1-hadoop2 compat stuff we had before).
> >
> > Only having phoenix binaries in the 0.98 branch may cause confusion.  It
> > would be a special case and break the new features in trunk convention
> and
> > if extended could potentially block releases of newer versions.
> >
> > If we kept the policy intact and included phoenix in trunk/master (a notion
> > that should rightfully be avoided), we would cause problems if phoenix
> > breaking API changes were introduced.  It brings in other awkward
> questions
> > such as how often would we pull in the latest phoenix? are we willing to
> > tolerate a  broken master build (we sort of do already admittedly but
> that
> > is not ideal) ?  would phoenix be able block a core hbase release?
> >
> > Are there examples of this kind of "reverse" inclusion in other projects?
> > One that seems analogous is curator to zookeeper -- and curator is a
> > separate project from zookeeper.
> >
> > What if other projects were considered for this special treatment?
> > Projects like cask and tephra have a large overlap of hbase community
> > members as well.  Would we have to have criteria to determine how/when to
> > include those project as well?
> >
> > Keeping the already large hbase project's scope and code base focused and
> > independent of new circular dependencies seems prudent.
> >
> > Jon.
> >
> >
> > On Tue, Mar 17, 2015 at 12:54 PM, Nick Dimiduk <ndimi...@gmail.com>
> wrote:
> >
> > > I've been thinking of something along these lines as well. Rather than an
> > > official Apache project, I was thinking it could be something as simple as
> > > a github managed dockerfile that stands up a HBase + Phoenix singlenode
> > > deal, see if momentum builds.
> > >
> > > Another idea is Phoenix could include HBase in its binary release, the
> > same
> > > way HBase includes Hadoop. That way there's an "out of the box"
> > > distribution for Phoenix. That would be a discussion for the Phoenix
> dev
> > > list.
> > >
> > > -n
> > >
> > > On Tuesday, March 17, 2015, Andrew Purtell <apurt...@apache.org>
> wrote:
> > >
> > > > Consider if the HBase project starts releasing new "convenience
> > > binaries",
> > > > in addition to the existing ones, in which we bundle a
> > > recent/vetted/stable
> > > > version of Phoenix, with the site file changes for loading their
> > > > coprocessors already patched in (to hbase-default.xml) For now this
> > would
> > > > be done for 0.98 only, since that's the only release line supported
> by
> > an
> > > > actively developed Phoenix version. We could also do this for 0.94
> > > releases
> > > > with Phoenix 3 if the 0.94 RM wants, but I doubt there would be any
> > > demand
> > > > for this, Phoenix 3 is inactive because that community has all moved
> to
> > > 4,
> > > > I'd imagine that carries over here.
> > > >
> > > > Advantages:
> > > >
> > > > - HBase would ship with a SQL access option. There's the Phoenix JDBC
> > > > driver of course, and we'd also bundle the psql and sqlline exec
> > wrappers
> > > > from the Phoenix binary distribution. We'd have both the jruby shell
> > and
> > > a
> > > > SQL shell, this is a powerful combination.
> > > >
> > > > - HBase ships with a library that assists users in making efficient
> > > queries
> > > > if their data is typed, but this doesn't include the server side
> > > > optimizations that the Phoenix coprocessors provide, and in that case
> > no
> > > > hand rolling is necessary.
> > > >
> > > > - HBase would ship with secondary indexes. These would not cover all
> > > > possible use cases and requirements, let's stipulate that now and
> hope
> > > this
> > > > doesn't kick off another circular discussion on that front.
> > > Unquestionably
> > > > this is a compelling Phoenix feature so some use cases obviously can
> > > > benefit, and if users find the combined distribution useful enough we
> > > don't
> > > > have to discuss secondary indexes in HBase core again.
> > > >
> > > > - We will have done the necessary integration work for the combined
> > > result
> > > > to be easy to use. Apache software cat herders will appreciate this.
> > > >
> > > > - It's totally optional, simply ignore the new binary packages if you
> > > don't
> > > > care. This is not a Grand Unification proposal.
> > > >
> > > > Concerns:
> > > >
> > > > - More work for the RM. Unquestionably.
> > > >
> > > > - Concerns about the quality of the combined convenience artifact: Is
> > > there
> > > > an implied warranty? Could we disclaim? Should we disclaim? If not,
> how
> > > > does HBase do QA on this. Related to the above concern about RM
> > > bandwidth.
> > > > Maybe Phoenix could help.
> > > >
> > > > - Increased coupling between the projects. Frankly, I think this is
> > > > already there, we just don't see it until we trip over issues that could have
> > > been
> > > > avoided with more communication between projects. Pushing on Phoenix
> > for
> > > > bits for a monthly HBase release cadence will surface issues faster
> and
> > > > improve communication between the projects. This benefits Phoenix
> with
> > > more
> > > > QA bandwidth. This benefits HBase because we see Phoenix bringing in
> a
> > > > significant number of users.
> > > >
> > > > - We may want to revisit again normalizing type support in HBase's
> > client
> > > > library and Phoenix, eventually.
> > > >
> > > > I could add more items to the advantage or concern lists but mainly
> > want
> > > to
> > > > float the idea for feedback at this time.
> > > >
> > > > Thoughts?
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet
> > Hein
> > > > (via Tom White)
> > > >
> > >
> >
> >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // HBase Tech Lead, Software Engineer, Cloudera
> > // j...@cloudera.com // @jmhsieh
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
