[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread Raghotham Murthy (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859751#action_12859751 ]

Raghotham Murthy commented on HIVE-987:
---

jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java shows how to use 
Hive's JDBC driver in embedded mode. Doesn't it run as part of ant test?

 Hive CLI Omnibus Improvement ticket
 ---

 Key: HIVE-987
 URL: https://issues.apache.org/jira/browse/HIVE-987
 Project: Hadoop Hive
  Issue Type: Improvement
Reporter: Carl Steinbach
 Attachments: HIVE-987.1.patch, sqlline-1.0.8_eb.jar


 Add the following features to the Hive CLI:
 * Command History
 * ReadLine support
 ** HIVE-120: Add readline support/support for alt-based commands in the CLI
 ** Java-ReadLine is LGPL, but it depends on the GPL readline library. We 
 probably need to use JLine instead.
 * Tab completion
 ** HIVE-97: tab completion for hive cli
 * Embedded/Standalone CLI modes, and ability to connect to different Hive 
 Server instances.
 ** HIVE-818: Create a Hive CLI that connects to hive ThriftServer
 * .hiverc configuration file
 ** HIVE-920: .hiverc doesnt work
 * Improved support for comments.
 ** HIVE-430: Ability to comment desired for hive query files
 * Different output formats
 ** HIVE-49: display column header on CLI
 ** XML output format
 For additional inspiration we may want to look at the Postgres psql shell: 
 http://www.postgresql.org/docs/8.1/static/app-psql.html
 Finally, it would be really cool if we implemented this in a generic fashion 
 and spun it off as an apache-commons
 shell framework. It seems like most of the Apache Hadoop projects have their 
 own shells, and I'm sure the same is true
 for non-Hadoop Apache projects as well. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [DISCUSSION] To be (or not to be) a TLP - that is the question

2010-04-22 Thread Edward Capriolo
On Wed, Apr 21, 2010 at 10:35 PM, Jeff Hammerbacher ham...@cloudera.com wrote:

 Hive already does the work to run on multiple versions of Hadoop, and the
 release cycle is independent of Hadoop's. I don't see why it should remain a
 subproject. I'm +1 on Hive becoming a TLP.

 On Tue, Apr 20, 2010 at 2:03 PM, Zheng Shao zsh...@gmail.com wrote:

  As a Hive committer, I don't feel the benefit we get from becoming a
  TLP is big enough (compared with the cost) to make Hive a TLP.
  From Chris's comment I see that the cost is not that big, but I still
  wonder what benefit we will get from that.
 
  Also, I didn't get the idea of the joke ("In fact, one could argue that
  Pig opting not to be TLP yet is why Hive should go TLP"). I don't see
  any reason that applies to Pig but not Hive.
  We should continue the discussion here, but anything in Pig's
  discussion should also be considered here.
 
  Zheng
 
  On Mon, Apr 19, 2010 at 5:48 PM, Amr Awadallah a...@cloudera.com wrote:
   I am personally +1 on Hive being a TLP; I think it did reach the community
   adoption and maturity level required for that. In fact, one could argue that
   Pig opting not to be TLP yet is why Hive should go TLP :) (jk).
  
   The real question to ask is whether there is a volunteer to take care of the
   administrative tasks, which isn't a ton of work afaiu (I am willing to
   volunteer if nobody else is up to the task, but I am not a committer and have
   only contributed a minor patch for bash/cygwin).
  
   BTW, here is a very nice summary from Yahoo's Chris Douglas on TLP
   tradeoffs. I happen to agree with all he says, and frankly I couldn't have
   written it better myself. I highlight certain parts from his message, but I
   recommend you read the whole thing.
  
   -- Forwarded message --
   From: Chris Douglas cdoug...@apache.org
   Date: Tue, Apr 13, 2010 at 11:46 PM
   Subject: Subprojects and TLP status
   To: gene...@hadoop.apache.org, priv...@hadoop.apache.org
  
   Most of Hadoop's subprojects have discussed becoming top-level Apache
   projects (TLPs) in the last few weeks. Most have expressed a desire to
   remain in Hadoop. The salient parts of the discussions I've read tend
   to focus on three aspects: a technical dependence on Hadoop,
   additional overhead as a TLP, and visibility both within the Hadoop
   ecosystem and in the open source community generally.
  
   Life as a TLP: this is not much harder than being a Hadoop subproject,
   and the Apache preferences being tossed around- particularly
   insufficiently diverse- are not blockers. Every subproject needs to
   write a section of the report Hadoop sends to the board; almost the
   same report, sent to a new address. The initial cost is similarly
   light: copy bylaws, send a few notes to INFRA, and follow some
   directions. I think the estimated costs are far higher than they will
   be in practice. Inertia is a powerful force, but it should be
   overcome. The directions are here, and should not be intimidating:
  
   http://apache.org/dev/project-creation.html
  
   Visibility: the Hadoop site does not need to change. For each
   subproject, we can literally change the hyperlinks to point to the new
   page and be done. Long-term, linking to all ASF projects that run on
   Hadoop from a prominent page is something we all want. So particularly
   in the medium-term that most are considering: visibility through the
   website will not change. Each subproject will still be linked from the
   front page.
  
   Hadoop would not be nearly as popular as it is without Zookeeper,
   HBase, Hive, and Pig. All statistics on work in shared MapReduce
   clusters show that users vastly prefer running Pig and Hive queries to
   writing MapReduce jobs. HBase continues to push features in HDFS that
   increase its adoption and relevance outside MapReduce, while sharing
   some of its NoSQL limelight. Zookeeper is not only a linchpin in real
   workloads, but many proposals for future features require it. The
   bottom line is that MapReduce and HDFS need these projects for
   visibility and adoption in precisely the same way. I don't think
   separate TLPs will uncouple the broader community from one another.
  
   Technical dependence: this has two dimensions. First, influencing
   MapReduce and HDFS. This is nonsense. Earning influence by
   contributing to a subproject is the only way to push code changes;
   nobody from any of these projects has violated that by unilaterally
   committing to HDFS or MapReduce, anyway. And anyone cynical enough to
   believe that MapReduce and HDFS would deliberately screw over or
   ignore dependent projects because they don't have PMC members is
   plainly unsuited to community-driven development. I understand that
   these projects need to protect their users, but lobbying rights are
   not an actual benefit.
  
   Second, being a coherent part of the Hadoop ecosystem. It is (mostly)
   true that 

Hudson build is back to normal : Hive-trunk-h0.18 #421

2010-04-22 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/421/changes




RE: [DISCUSSION] To be (or not to be) a TLP - that is the question

2010-04-22 Thread Ashish Thusoo
What is the advantage of becoming a TLP to the project itself? I have heard 
that it is something that Apache wants, but considering that we are very 
comfortable with how Hive interacts with the Hadoop ecosystem as a subproject 
of Hadoop, there has to be some big incentive for the project to be a TLP, and 
nowhere have I seen how this would benefit Hive. Any thoughts on that?

Ashish



Re: [DISCUSSION] To be (or not to be) a TLP - that is the question

2010-04-22 Thread Dhruba Borthakur
I am definitely against moving Hive out of Hadoop. There is appreciable
representation of Hive inside the Hadoop PMC and, as far as I can tell, there
is no additional burden on the Hadoop PMC if Hive remains inside Hadoop.

I respect Jeff's and Amr's viewpoints, but I beg to differ.
I really do not see any benefit in moving Hive out of Hadoop.

thanks,
dhruba


[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread John Sichi (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859949#action_12859949 ]

John Sichi commented on HIVE-987:
-

@Raghu:  you are right.  I screwed up when I was testing the embedded mode; it 
actually works fine already.





[jira] Created: (HIVE-1318) External Tables: Selecting a partition that does not exist produces errors

2010-04-22 Thread Edward Capriolo (JIRA)
External Tables: Selecting a partition that does not exist produces errors
--

 Key: HIVE-1318
 URL: https://issues.apache.org/jira/browse/HIVE-1318
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Edward Capriolo
 Attachments: partdoom.q

{noformat}
dfs -mkdir /tmp/a;
dfs -mkdir /tmp/a/b;
dfs -mkdir /tmp/a/c;
create external table abc( key string, val string  )
partitioned by (part int)
location '/tmp/a/';

alter table abc ADD PARTITION (part=1)  LOCATION 'b';
alter table abc ADD PARTITION (part=2)  LOCATION 'c';

select key from abc where part=1;
select key from abc where part=70;

{noformat}




[jira] Updated: (HIVE-1318) External Tables: Selecting a partition that does not exist produces errors

2010-04-22 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1318:
--

Attachment: partdoom.q




[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread Ashish Thusoo (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859956#action_12859956 ]

Ashish Thusoo commented on HIVE-987:


I am +1 on this. I think this can open up good possibilities. I have not looked 
at the sqlline code, but how much does it depend on the actual SQL dialect? 
Plus, how easy is it to extend to HDFS-related commands? For example, the CLI 
today has commands that set conf variables. It also supports the hadoop dfs 
commands, which talk directly to HDFS. I am not sure if too many people use 
them, but I do. It would be great to get them integrated with sqlline if that 
is possible.





[jira] Created: (HIVE-1319) Alter table add partition fails if ADD PARTITION is not in upper case

2010-04-22 Thread Edward Capriolo (JIRA)
Alter table add partition fails if ADD PARTITION is not in upper case
-

 Key: HIVE-1319
 URL: https://issues.apache.org/jira/browse/HIVE-1319
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Edward Capriolo


{noformat}
dfs -mkdir /tmp/a;
dfs -mkdir /tmp/a/b;
dfs -mkdir /tmp/a/c;
create external table abc( key string, val string  )
partitioned by (part int)
location '/tmp/a/';

alter table abc ADD PARTITION (part=1)  LOCATION 'b';
alter table abc add partition (part=2)  LOCATION 'c';

select key from abc where part=1;
select key from abc where part=70;
{noformat}




[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread John Sichi (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859976#action_12859976 ]

John Sichi commented on HIVE-987:
-

sqlline is agnostic as to SQL dialect, so commands such as show/describe/dfs 
just work.

(The one exception I have found so far is the set command, which is throwing an 
NPE; probably something about the result set we return to list all the 
settings.  Shouldn't be hard to fix.)

sqlline has some of its own commands such as !help and !quit; these are always 
prefixed with bang.  Anything else it just sends through, with the exception of 
comments, which it strips off before sending.
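
The dispatch rule described above can be sketched roughly as follows. This is only an illustration in Python, not sqlline's actual code: the function name `classify_line` is made up, and the `--` comment prefix is an assumption, since the comment does not say which comment syntax sqlline recognizes. The comment handling is deliberately naive and ignores quoting.

```python
def classify_line(line, comment_prefix="--"):
    """Classify one line of shell input.

    Returns ("shell", command) for bang-prefixed commands handled locally,
    ("server", statement) for anything passed through unchanged, or
    ("empty", "") if nothing is left once the comment is removed.
    """
    # Strip a trailing comment. Assumption: '--' starts a comment.
    # Naive on purpose: a '--' inside a quoted string is not detected.
    idx = line.find(comment_prefix)
    if idx != -1:
        line = line[:idx]
    stripped = line.strip()
    if not stripped:
        return ("empty", "")
    if stripped.startswith("!"):
        # Shell-level command such as !help or !quit.
        return ("shell", stripped[1:])
    # Everything else (show, describe, dfs, ...) goes to the server as-is.
    return ("server", stripped)
```

Under these assumptions, `!help` is handled by the shell itself, while `show tables;` and `dfs -ls /tmp;` are both forwarded without the shell needing to understand them.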





[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread John Sichi (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859978#action_12859978 ]

John Sichi commented on HIVE-987:
-

Change SQLLINE_OPTS to this to use embedded mode:

SQLLINE_OPTS='-u jdbc:hive:// -d org.apache.hadoop.hive.jdbc.HiveDriver -n sa'

Options such as this can also be overridden on the command line when invoking, 
e.g. to connect to a particular server:

hive --service sqlline -u jdbc:hive://theirserver:10001/default
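
As a rough illustration of the two points above, the Python sketch below merges default options with command-line options and then tells an embedded URL from a server URL. It is not how the hive launcher script or sqlline is actually implemented. The helper names are hypothetical, and two behaviours are assumed rather than taken from the comment: that command-line values simply replace the defaults flag by flag, and that an empty host part after `jdbc:hive://` is what selects embedded mode.

```python
import shlex


def parse_options(arg_string):
    """Parse '-u url -d driver -n user' style flags into a dict."""
    tokens = shlex.split(arg_string)
    opts = {}
    i = 0
    while i + 1 < len(tokens):
        if tokens[i].startswith("-"):
            # Treat every flag as taking exactly one value (an assumption).
            opts[tokens[i]] = tokens[i + 1]
            i += 2
        else:
            i += 1
    return opts


def effective_options(defaults, command_line):
    """Command-line flags take precedence over the defaults (assumed)."""
    merged = parse_options(defaults)
    merged.update(parse_options(command_line))
    return merged


def connection_mode(url, prefix="jdbc:hive://"):
    """Return 'embedded' for a URL with no host part, else 'remote'."""
    if not url.startswith(prefix):
        raise ValueError("not a Hive JDBC URL: " + url)
    host_port = url[len(prefix):].split("/", 1)[0]
    return "embedded" if host_port == "" else "remote"


defaults = "-u jdbc:hive:// -d org.apache.hadoop.hive.jdbc.HiveDriver -n sa"
opts = effective_options(defaults, "-u jdbc:hive://theirserver:10001/default")
```

Here `opts["-u"]` ends up as the server URL while the driver and user name from the defaults are kept, and `connection_mode` reports `embedded` for the default URL and `remote` for the overriding one.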





[jira] Created: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ashish Thusoo (JIRA)
NPE with lineage in a query of union alls on joins.
---

 Key: HIVE-1320
 URL: https://issues.apache.org/jira/browse/HIVE-1320
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo


The following query generates an NPE in the lineage ctx code:

EXPLAIN
INSERT OVERWRITE TABLE dest_l1
SELECT j.*
FROM (SELECT t1.key, p1.value
  FROM src1 t1
  LEFT OUTER JOIN src p1
  ON (t1.key = p1.key)
  UNION ALL
  SELECT t2.key, p2.value
  FROM src1 t2
  LEFT OUTER JOIN src p2
  ON (t2.key = p2.key)) j;

The stack trace is:

FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
    at org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
    at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
    at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
    at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
    at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
    at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
    at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
    at org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)





[jira] Updated: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-1320:


Attachment: HIVE-1320.patch

Fixed the NPE. The cause was that we were not checking whether inp_dep is null 
in the union all code path. We have to do that for all operators that have more 
than one parent.
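
The kind of guard described here might look like the sketch below. It is written in Python for brevity and does not reproduce the patch or Hive's LineageCtx classes: apart from the name `inp_dep`, which comes from the comment, the function, the dictionary layout, and the operator names are all hypothetical. The point is only that an operator with several parents must tolerate a parent that has no dependency recorded for a column.

```python
def union_lineage(parents, parent_deps, column):
    """Merge the dependencies that each parent contributes for one column.

    parent_deps maps (parent, column) to a set of source columns. A parent
    may have no entry at all, in which case the lookup yields None.
    """
    merged = set()
    for parent in parents:
        inp_dep = parent_deps.get((parent, column))
        if inp_dep is None:
            # Without this check, merging would fail on the missing entry,
            # which mirrors the NPE in the union all code path.
            continue
        merged |= inp_dep
    return merged


# Hypothetical data: only the first join branch has lineage for 'value'.
deps = {
    ("join1", "key"): {"src1.key"},
    ("join2", "key"): {"src1.key"},
    ("join1", "value"): {"src.value"},
}
key_lineage = union_lineage(["join1", "join2"], deps, "key")
value_lineage = union_lineage(["join1", "join2"], deps, "value")
```

With the check in place, the `value` column is merged from the one parent that has an entry instead of failing on the parent that has none.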


 NPE with lineage in a query of union alls on joins.
 ---

 Key: HIVE-1320
 URL: https://issues.apache.org/jira/browse/HIVE-1320
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo
 Attachments: HIVE-1320.patch


 The following query generates an NPE in the lineage ctx code:
 EXPLAIN
 INSERT OVERWRITE TABLE dest_l1
 SELECT j.*
 FROM (SELECT t1.key, p1.value
   FROM src1 t1
   LEFT OUTER JOIN src p1
   ON (t1.key = p1.key)
   UNION ALL
   SELECT t2.key, p2.value
   FROM src1 t2
   LEFT OUTER JOIN src p2
   ON (t2.key = p2.key)) j;
 The stack trace is:
 FAILED: Hive Internal Error: java.lang.NullPointerException(null)
 java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
 at org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
 at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
 at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
 at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
 at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
 at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
 at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
 at org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
 at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
 at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)




[jira] Updated: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-1320:


   Status: Patch Available  (was: Open)
Affects Version/s: 0.6.0
Fix Version/s: 0.6.0

 NPE with lineage in a query of union alls on joins.
 ---

 Key: HIVE-1320
 URL: https://issues.apache.org/jira/browse/HIVE-1320
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo
 Fix For: 0.6.0

 Attachments: HIVE-1320.patch






[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12860061#action_12860061
 ] 

John Sichi commented on HIVE-259:
-

I couldn't see the point of having two competing UDF guide pages, so I renamed 
the XPath-specific one as such and linked it from the main one.  Just 
housekeeping to reduce confusion; I did not actually add the percentile info.


 Add PERCENTILE aggregate function
 -

 Key: HIVE-259
 URL: https://issues.apache.org/jira/browse/HIVE-259
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Venky Iyer
Assignee: Jerome Boulon
 Fix For: 0.6.0

 Attachments: HIVE-259-2.patch, HIVE-259-3.patch, HIVE-259.1.patch, 
 HIVE-259.4.patch, HIVE-259.5.patch, HIVE-259.patch, jb2.txt, Percentile.xlsx


 Compute at least the 25th, 50th, and 75th percentiles.
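
 To make the request concrete, here is a small sketch of one common way to 
 define a percentile over a set of integer values: sort them and interpolate 
 linearly between the two closest ranks. This is an assumption chosen for 
 illustration, not necessarily the definition used by the attached patches, and 
 the PercentileSketch class and its percentile method are hypothetical names.

```java
import java.util.Arrays;

final class PercentileSketch {
    // Returns the p-th percentile (p in [0, 1]) of the given values, using
    // linear interpolation between the two closest ranks of the sorted data.
    static double percentile(long[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);

        // Fractional position of the percentile within the sorted array.
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        if (lo == hi) {
            // The position falls exactly on an element.
            return sorted[lo];
        }
        // Weighted average of the two neighbouring elements.
        return (hi - pos) * sorted[lo] + (pos - lo) * sorted[hi];
    }
}
```

 For the values 1, 2, 3, 4, 5 this yields 2.0, 3.0 and 4.0 for p = 0.25, 0.5 
 and 0.75. For 10, 20, 30, 40 the median falls between two elements, so 
 percentile(values, 0.5) interpolates to 25.0.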
