Re: Proposal to un-fork Sqlline

2014-02-04 Thread Julian Hyde
On Feb 3, 2014, at 5:29 PM, Xuefu Zhang wrote:

> I guess another point that I tried to make is that if the sqlline fork is
> bundled with Hive, it defeats the purpose of allowing other projects to
> share the same code line. I'd like to see the possibility of forking
> sqlline into Apache as an independent module.

Yeah, I agree in principle. But to create a small independent module in Apache 
you have to go through the whole incubator process. I think a library within 
the Hive namespace is an acceptable compromise. When that is complete I will 
look at moving it elsewhere, say commons or as a sub-project within 
http://db.apache.org/.

(Am I the first person to note that Apache effectively discourages code re-use 
because they set a very high bar for creating a project?)

Julian



Re: Proposal to un-fork Sqlline

2014-02-03 Thread Xuefu Zhang
I guess another point that I tried to make is that if the sqlline fork is
bundled with Hive, it defeats the purpose of allowing other projects to
share the same code line. I'd like to see the possibility of forking
sqlline into Apache as an independent module.



Re: Proposal to un-fork Sqlline

2014-02-03 Thread Julian Hyde
On Feb 3, 2014, at 11:15 AM, Xuefu Zhang wrote:

> I wonder whether it makes more sense to fork sqlline
> directly into Apache. Upon its completion, Hive gets rid of its copy of
> sqlline and creates a dependency on the forked sqlline instead. I guess
> this is a top-down approach and the benefits are immediate across multiple
> projects.

You’re basically suggesting that I do step 3 before 1 and 2. It makes sense, 
because it reduces risk.

I have logged https://issues.apache.org/jira/browse/HIVE-6361 with an updated 
proposal.

Julian

Re: Proposal to un-fork Sqlline

2014-02-03 Thread Xuefu Zhang
Hi Julian,

Thanks for sharing your thoughts. I'm certainly on board with code sharing
among projects. However, I don't see immediate benefits for Hive in
separating Beeline into two modules. Instead, it requires additional work
and potentially creates instability, while code sharing isn't achieved
until the proposed hive-sqlline module is promoted to an independent
project.

On the other hand, I wonder whether it makes more sense to fork sqlline
directly into Apache. Upon its completion, Hive gets rid of its copy of
sqlline and creates a dependency on the forked sqlline instead. I guess
this is a top-down approach and the benefits are immediate across multiple
projects.

Thanks,
Xuefu



Proposal to un-fork Sqlline

2014-02-03 Thread Julian Hyde
As you probably know, Hive’s SQL command-line interface Beeline was created by
forking Sqlline [1] [2]. At the time, Sqlline was a useful but low-activity
project languishing on SourceForge without an active owner. Around the same
time, I independently picked up the Sqlline code, moved it to GitHub [3], put
in place a Maven build process, and gave it some love. Now several projects are
using it, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq.
So, now we have two active forks of Sqlline.

I propose to merge these development forks.

This will achieve a few things. We should be able to fix more bugs, add more
features, and get more people using sqlline. (Just today, someone ran into a
bug where Drill was not saving/restoring command history, then noticed that it
was fixed in sqlline-1.1.3 [4] [5]. It seems that the same bug still exists in
Hive’s beeline.)

I propose the following:
1. Move the parts of the hive-beeline module that do not depend upon Hive
(about 90% of the code) into a new module in the Hive repo, hive-sqlline.
2. What remains in the hive-beeline module is Beeline.java (a derived class of 
Sqlline.java) and Hive-specific extensions. The hive-beeline module depends 
upon the hive-sqlline module.
3. Make sure that the new Hive sqlline module contains all fixes and useful 
changes from both forks.
4. Release sqlline as a Maven artifact, say {groupId=org.apache.hive,
artifactId=hive-sqlline}, and tell clients of julianhyde-sqlline to migrate to
it.
5. Longer term, consider moving hive-sqlline out of Hive, but still within 
Apache.

This achieves continuity for Hive’s users, gives the users of the non-Hive 
sqlline a version with minimal dependencies, unifies the two code lines, and 
brings everything under the Apache roof.

Please let me know if this sounds like a good proposal. I’ll log a JIRA case,
then start work on a patch.

Julian

[1] https://issues.apache.org/jira/browse/HIVE-987
[2] https://issues.apache.org/jira/browse/HIVE-3100
[3] https://github.com/julianhyde/sqlline
[4] https://github.com/julianhyde/sqlline/issues/19
[5] https://issues.apache.org/jira/browse/DRILL-327