[jira] Updated: (HIVE-1321) bugs with temp directories, trailing blank fields in HBase bulk load

2010-04-22 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1321:
-

Status: Patch Available  (was: Open)

Changed the test script to exercise blanks and nulls, and did some manual testing 
to verify that these got into HBase (or not) as expected.


> bugs with temp directories, trailing blank fields in HBase bulk load
> 
>
> Key: HIVE-1321
> URL: https://issues.apache.org/jira/browse/HIVE-1321
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: John Sichi
> Fix For: 0.6.0
>
> Attachments: HIVE-1321.1.patch
>
>
> HIVE-1295 had two bugs discovered during testing with production data:
> (1) extra directories may be present in the output directory depending on how 
> the cluster is configured; we need to walk down these to find the column 
> family directory
> (2) if a record ends with fields which are blank strings, the text format 
> omits the corresponding Control-A delimiters, so we need to fill in blanks 
> for these fields (instead of throwing ArrayIndexOutOfBoundsException)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1321) bugs with temp directories, trailing blank fields in HBase bulk load

2010-04-22 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1321:
-

Attachment: HIVE-1321.1.patch

> bugs with temp directories, trailing blank fields in HBase bulk load
> 
>
> Key: HIVE-1321
> URL: https://issues.apache.org/jira/browse/HIVE-1321
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: John Sichi
> Fix For: 0.6.0
>
> Attachments: HIVE-1321.1.patch
>
>
> HIVE-1295 had two bugs discovered during testing with production data:
> (1) extra directories may be present in the output directory depending on how 
> the cluster is configured; we need to walk down these to find the column 
> family directory
> (2) if a record ends with fields which are blank strings, the text format 
> omits the corresponding Control-A delimiters, so we need to fill in blanks 
> for these fields (instead of throwing ArrayIndexOutOfBoundsException)




[jira] Updated: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1320:
-

Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed. Thanks Ashish!

> NPE with lineage in a query of union alls on joins.
> ---
>
> Key: HIVE-1320
> URL: https://issues.apache.org/jira/browse/HIVE-1320
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Ashish Thusoo
>Assignee: Ashish Thusoo
> Fix For: 0.6.0
>
> Attachments: HIVE-1320.patch
>
>
> The following query generates a NPE in the lineage ctx code
> EXPLAIN
> INSERT OVERWRITE TABLE dest_l1
> SELECT j.*
> FROM (SELECT t1.key, p1.value
>   FROM src1 t1
>   LEFT OUTER JOIN src p1
>   ON (t1.key = p1.key)
>   UNION ALL
>   SELECT t2.key, p2.value
>   FROM src1 t2
>   LEFT OUTER JOIN src p2
>   ON (t2.key = p2.key)) j;
> The stack trace is:
> FAILED: Hive Internal Error: java.lang.NullPointerException(null)
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
> at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)




[jira] Commented: (HIVE-1321) bugs with temp directories, trailing blank fields in HBase bulk load

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860072#action_12860072
 ] 

John Sichi commented on HIVE-1321:
--

Also need to omit null values (\N) completely.


> bugs with temp directories, trailing blank fields in HBase bulk load
> 
>
> Key: HIVE-1321
> URL: https://issues.apache.org/jira/browse/HIVE-1321
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: John Sichi
> Fix For: 0.6.0
>
>
> HIVE-1295 had two bugs discovered during testing with production data:
> (1) extra directories may be present in the output directory depending on how 
> the cluster is configured; we need to walk down these to find the column 
> family directory
> (2) if a record ends with fields which are blank strings, the text format 
> omits the corresponding Control-A delimiters, so we need to fill in blanks 
> for these fields (instead of throwing ArrayIndexOutOfBoundsException)




[jira] Commented: (HIVE-1295) facilitate HBase bulk loads from Hive

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860064#action_12860064
 ] 

John Sichi commented on HIVE-1295:
--

Followup logged as HIVE-1321.


> facilitate HBase bulk loads from Hive
> -
>
> Key: HIVE-1295
> URL: https://issues.apache.org/jira/browse/HIVE-1295
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: John Sichi
> Fix For: 0.6.0
>
> Attachments: HIVE-1295.1.patch
>
>
> HBase supports a bulk load procedure:
> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#bulk
> We would like to add support to Hive so that users can bulk load HBase from 
> Hive without having to write any map/reduce code.
> Ideally, this could be done with a single INSERT statement targeting the 
> HBase storage handler (with an option set to request bulk load instead of 
> row-level inserts).
> However, that will take a lot of work, so this JIRA is a first step to allow 
> the bulk load files to be prepared inside of Hive via a sequence of SQL 
> statements and then pushed into HBase via the loadtable.rb script.
> Note that until HBASE-1861 is implemented, the bulk load target table can 
> only have a single column family.




[jira] Created: (HIVE-1321) bugs with temp directories, trailing blank fields in HBase bulk load

2010-04-22 Thread John Sichi (JIRA)
bugs with temp directories, trailing blank fields in HBase bulk load


 Key: HIVE-1321
 URL: https://issues.apache.org/jira/browse/HIVE-1321
 Project: Hadoop Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


HIVE-1295 had two bugs discovered during testing with production data:

(1) extra directories may be present in the output directory depending on how 
the cluster is configured; we need to walk down these to find the column family 
directory

(2) if a record ends with fields which are blank strings, the text format omits 
the corresponding Control-A delimiters, so we need to fill in blanks for these 
fields (instead of throwing ArrayIndexOutOfBoundsException)





[jira] Commented: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860062#action_12860062
 ] 

Ning Zhang commented on HIVE-1320:
--

+1 will commit after tests.

> NPE with lineage in a query of union alls on joins.
> ---
>
> Key: HIVE-1320
> URL: https://issues.apache.org/jira/browse/HIVE-1320
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Ashish Thusoo
>Assignee: Ashish Thusoo
> Fix For: 0.6.0
>
> Attachments: HIVE-1320.patch
>
>
> The following query generates a NPE in the lineage ctx code
> EXPLAIN
> INSERT OVERWRITE TABLE dest_l1
> SELECT j.*
> FROM (SELECT t1.key, p1.value
>   FROM src1 t1
>   LEFT OUTER JOIN src p1
>   ON (t1.key = p1.key)
>   UNION ALL
>   SELECT t2.key, p2.value
>   FROM src1 t2
>   LEFT OUTER JOIN src p2
>   ON (t2.key = p2.key)) j;
> The stack trace is:
> FAILED: Hive Internal Error: java.lang.NullPointerException(null)
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
> at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)




[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860061#action_12860061
 ] 

John Sichi commented on HIVE-259:
-

I couldn't see the point of having two competing UDF guide pages, so I renamed 
the XPath-specific one as such and linked it from the main one.  Just 
housekeeping to reduce confusion; I did not actually add the percentile info.


> Add PERCENTILE aggregate function
> -
>
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>Assignee: Jerome Boulon
> Fix For: 0.6.0
>
> Attachments: HIVE-259-2.patch, HIVE-259-3.patch, HIVE-259.1.patch, 
> HIVE-259.4.patch, HIVE-259.5.patch, HIVE-259.patch, jb2.txt, Percentile.xlsx
>
>
> Compute at least the 25th, 50th, and 75th percentiles




[jira] Updated: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-1320:


   Status: Patch Available  (was: Open)
Affects Version/s: 0.6.0
Fix Version/s: 0.6.0

> NPE with lineage in a query of union alls on joins.
> ---
>
> Key: HIVE-1320
> URL: https://issues.apache.org/jira/browse/HIVE-1320
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Ashish Thusoo
>Assignee: Ashish Thusoo
> Fix For: 0.6.0
>
> Attachments: HIVE-1320.patch
>
>
> The following query generates a NPE in the lineage ctx code
> EXPLAIN
> INSERT OVERWRITE TABLE dest_l1
> SELECT j.*
> FROM (SELECT t1.key, p1.value
>   FROM src1 t1
>   LEFT OUTER JOIN src p1
>   ON (t1.key = p1.key)
>   UNION ALL
>   SELECT t2.key, p2.value
>   FROM src1 t2
>   LEFT OUTER JOIN src p2
>   ON (t2.key = p2.key)) j;
> The stack trace is:
> FAILED: Hive Internal Error: java.lang.NullPointerException(null)
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
> at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)




[jira] Updated: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-1320:


Attachment: HIVE-1320.patch

Fixed the NPE. The cause was that we were not checking for inp_dep to be null 
in the union all code path. We have to do that for all operators that have more 
than one parent.
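The guard described above can be sketched as follows (an illustrative Python 
stand-in for clarity, not the actual Java in LineageCtx.mergeDependency; the 
name inp_dep comes from the comment, the rest is assumed):

```python
def merge_union_dependencies(parent_deps):
    # For an operator with more than one parent (e.g. the UNION ALL over
    # two joins in the repro query), a parent may have no lineage
    # dependency recorded yet. Skip None instead of dereferencing it,
    # which is what raised the NullPointerException in the stack trace.
    merged = set()
    for inp_dep in parent_deps:
        if inp_dep is None:
            continue
        merged |= set(inp_dep)
    return merged
```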


> NPE with lineage in a query of union alls on joins.
> ---
>
> Key: HIVE-1320
> URL: https://issues.apache.org/jira/browse/HIVE-1320
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ashish Thusoo
>Assignee: Ashish Thusoo
> Attachments: HIVE-1320.patch
>
>
> The following query generates a NPE in the lineage ctx code
> EXPLAIN
> INSERT OVERWRITE TABLE dest_l1
> SELECT j.*
> FROM (SELECT t1.key, p1.value
>   FROM src1 t1
>   LEFT OUTER JOIN src p1
>   ON (t1.key = p1.key)
>   UNION ALL
>   SELECT t2.key, p2.value
>   FROM src1 t2
>   LEFT OUTER JOIN src p2
>   ON (t2.key = p2.key)) j;
> The stack trace is:
> FAILED: Hive Internal Error: java.lang.NullPointerException(null)
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
> at 
> org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
> at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)




[jira] Created: (HIVE-1320) NPE with lineage in a query of union alls on joins.

2010-04-22 Thread Ashish Thusoo (JIRA)
NPE with lineage in a query of union alls on joins.
---

 Key: HIVE-1320
 URL: https://issues.apache.org/jira/browse/HIVE-1320
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo


The following query generates a NPE in the lineage ctx code

EXPLAIN
INSERT OVERWRITE TABLE dest_l1
SELECT j.*
FROM (SELECT t1.key, p1.value
  FROM src1 t1
  LEFT OUTER JOIN src p1
  ON (t1.key = p1.key)
  UNION ALL
  SELECT t2.key, p2.value
  FROM src1 t2
  LEFT OUTER JOIN src p2
  ON (t2.key = p2.key)) j;

The stack trace is:

FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.optimizer.lineage.LineageCtx$Index.mergeDependency(LineageCtx.java:116)
at 
org.apache.hadoop.hive.ql.optimizer.lineage.OpProcFactory$UnionLineage.process(OpProcFactory.java:396)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
at 
org.apache.hadoop.hive.ql.optimizer.lineage.Generator.transform(Generator.java:72)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:83)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5976)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)





[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859978#action_12859978
 ] 

John Sichi commented on HIVE-987:
-

Change SQLLINE_OPTS to this to use embedded mode:

SQLLINE_OPTS='-u jdbc:hive:// -d org.apache.hadoop.hive.jdbc.HiveDriver -n sa'

Options such as these can also be overridden on the command line at invocation 
time, e.g. to connect to a particular server:

hive --service beeline -u jdbc:hive://theirserver:10001/default


> Hive CLI Omnibus Improvement ticket
> ---
>
> Key: HIVE-987
> URL: https://issues.apache.org/jira/browse/HIVE-987
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Carl Steinbach
> Attachments: HIVE-987.1.patch, sqlline-1.0.8_eb.jar
>
>
> Add the following features to the Hive CLI:
> * Command History
> * ReadLine support
> ** HIVE-120: Add readline support/support for alt-based commands in the CLI
> ** Java-ReadLine is LGPL, but it depends on GPL readline library. We probably 
> need to use JLine instead.
> * Tab completion
> ** HIVE-97: tab completion for hive cli
> * Embedded/Standalone CLI modes, and ability to connect to different Hive 
> Server instances.
> ** HIVE-818: Create a Hive CLI that connects to hive ThriftServer
> * .hiverc configuration file
> ** HIVE-920: .hiverc doesnt work
> * Improved support for comments.
> ** HIVE-430: Ability to comment desired for hive query files
> * Different output formats
> ** HIVE-49: display column header on CLI
> ** XML output format
> For additional inspiration we may want to look at the Postgres psql shell: 
> http://www.postgresql.org/docs/8.1/static/app-psql.html
> Finally, it would be really cool if we implemented this in a generic fashion 
> and spun it off as an apache-commons
> shell framework. It seems like most of the Apache Hadoop projects have their 
> own shells, and I'm sure the same is true
> for non-Hadoop Apache projects as well. 




[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859976#action_12859976
 ] 

John Sichi commented on HIVE-987:
-

sqlline is agnostic as to SQL dialect, so commands such as show/describe/dfs 
just work.

(The one exception I have found so far is the set command, which is throwing an 
NPE; probably something about the result set we return to list all the 
settings.  Shouldn't be hard to fix.)

sqlline has some commands of its own, such as !help and !quit; these are always 
prefixed with a bang. Anything else it just sends through, with the exception of 
comments, which it strips off before sending.


> Hive CLI Omnibus Improvement ticket
> ---
>
> Key: HIVE-987
> URL: https://issues.apache.org/jira/browse/HIVE-987
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Carl Steinbach
> Attachments: HIVE-987.1.patch, sqlline-1.0.8_eb.jar
>
>
> Add the following features to the Hive CLI:
> * Command History
> * ReadLine support
> ** HIVE-120: Add readline support/support for alt-based commands in the CLI
> ** Java-ReadLine is LGPL, but it depends on GPL readline library. We probably 
> need to use JLine instead.
> * Tab completion
> ** HIVE-97: tab completion for hive cli
> * Embedded/Standalone CLI modes, and ability to connect to different Hive 
> Server instances.
> ** HIVE-818: Create a Hive CLI that connects to hive ThriftServer
> * .hiverc configuration file
> ** HIVE-920: .hiverc doesnt work
> * Improved support for comments.
> ** HIVE-430: Ability to comment desired for hive query files
> * Different output formats
> ** HIVE-49: display column header on CLI
> ** XML output format
> For additional inspiration we may want to look at the Postgres psql shell: 
> http://www.postgresql.org/docs/8.1/static/app-psql.html
> Finally, it would be really cool if we implemented this in a generic fashion 
> and spun it off as an apache-commons
> shell framework. It seems like most of the Apache Hadoop projects have their 
> own shells, and I'm sure the same is true
> for non-Hadoop Apache projects as well. 




[jira] Created: (HIVE-1319) Alter table add partition fails if ADD PARTITION is not in upper case

2010-04-22 Thread Edward Capriolo (JIRA)
Alter table add partition fails if ADD PARTITION is not in upper case
-

 Key: HIVE-1319
 URL: https://issues.apache.org/jira/browse/HIVE-1319
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Edward Capriolo


{noformat}
dfs -mkdir /tmp/a;
dfs -mkdir /tmp/a/b;
dfs -mkdir /tmp/a/c;
create external table abc( key string, val string  )
partitioned by (part int)
location '/tmp/a/';

alter table abc ADD PARTITION (part=1)  LOCATION 'b';
alter table abc add partition (part=2)  LOCATION 'c';

select key from abc where part=1;
select key from abct where part=70;
{noformat}




[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread Ashish Thusoo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859956#action_12859956
 ] 

Ashish Thusoo commented on HIVE-987:


I am +1 on this. I think this can open up good possibilities. I have not looked 
at the sqlline code, but how much does it depend on the actual SQL dialect? 
Also, how easy is it to extend to HDFS-related commands? For example, the CLI 
today has commands that can set configuration variables, and it also supports 
the hadoop dfs commands, which talk directly to HDFS. I am not sure how many 
people use them, but I do. It would be great to get them integrated with 
sqlline if that is possible.


> Hive CLI Omnibus Improvement ticket
> ---
>
> Key: HIVE-987
> URL: https://issues.apache.org/jira/browse/HIVE-987
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Carl Steinbach
> Attachments: HIVE-987.1.patch, sqlline-1.0.8_eb.jar
>
>
> Add the following features to the Hive CLI:
> * Command History
> * ReadLine support
> ** HIVE-120: Add readline support/support for alt-based commands in the CLI
> ** Java-ReadLine is LGPL, but it depends on GPL readline library. We probably 
> need to use JLine instead.
> * Tab completion
> ** HIVE-97: tab completion for hive cli
> * Embedded/Standalone CLI modes, and ability to connect to different Hive 
> Server instances.
> ** HIVE-818: Create a Hive CLI that connects to hive ThriftServer
> * .hiverc configuration file
> ** HIVE-920: .hiverc doesnt work
> * Improved support for comments.
> ** HIVE-430: Ability to comment desired for hive query files
> * Different output formats
> ** HIVE-49: display column header on CLI
> ** XML output format
> For additional inspiration we may want to look at the Postgres psql shell: 
> http://www.postgresql.org/docs/8.1/static/app-psql.html
> Finally, it would be really cool if we implemented this in a generic fashion 
> and spun it off as an apache-commons
> shell framework. It seems like most of the Apache Hadoop projects have their 
> own shells, and I'm sure the same is true
> for non-Hadoop Apache projects as well. 




[jira] Updated: (HIVE-1318) External Tables: Selecting a partition that does not exist produces errors

2010-04-22 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1318:
--

Attachment: partdoom.q

> External Tables: Selecting a partition that does not exist produces errors
> --
>
> Key: HIVE-1318
> URL: https://issues.apache.org/jira/browse/HIVE-1318
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Edward Capriolo
> Attachments: partdoom.q
>
>
> {noformat}
> dfs -mkdir /tmp/a;
> dfs -mkdir /tmp/a/b;
> dfs -mkdir /tmp/a/c;
> create external table abc( key string, val string  )
> partitioned by (part int)
> location '/tmp/a/';
> alter table abc ADD PARTITION (part=1)  LOCATION 'b';
> alter table abc ADD PARTITION (part=2)  LOCATION 'c';
> select key from abc where part=1;
> select key from abct where part=70;
> {noformat}




[jira] Created: (HIVE-1318) External Tables: Selecting a partition that does not exist produces errors

2010-04-22 Thread Edward Capriolo (JIRA)
External Tables: Selecting a partition that does not exist produces errors
--

 Key: HIVE-1318
 URL: https://issues.apache.org/jira/browse/HIVE-1318
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Edward Capriolo
 Attachments: partdoom.q

{noformat}
dfs -mkdir /tmp/a;
dfs -mkdir /tmp/a/b;
dfs -mkdir /tmp/a/c;
create external table abc( key string, val string  )
partitioned by (part int)
location '/tmp/a/';

alter table abc ADD PARTITION (part=1)  LOCATION 'b';
alter table abc ADD PARTITION (part=2)  LOCATION 'c';

select key from abc where part=1;
select key from abct where part=70;

{noformat}




[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859949#action_12859949
 ] 

John Sichi commented on HIVE-987:
-

@Raghu:  you are right.  I screwed up when I was testing the embedded mode; it 
actually works fine already.


> Hive CLI Omnibus Improvement ticket
> ---
>
> Key: HIVE-987
> URL: https://issues.apache.org/jira/browse/HIVE-987
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Carl Steinbach
> Attachments: HIVE-987.1.patch, sqlline-1.0.8_eb.jar
>
>
> Add the following features to the Hive CLI:
> * Command History
> * ReadLine support
> ** HIVE-120: Add readline support/support for alt-based commands in the CLI
> ** Java-ReadLine is LGPL, but it depends on GPL readline library. We probably 
> need to use JLine instead.
> * Tab completion
> ** HIVE-97: tab completion for hive cli
> * Embedded/Standalone CLI modes, and ability to connect to different Hive 
> Server instances.
> ** HIVE-818: Create a Hive CLI that connects to hive ThriftServer
> * .hiverc configuration file
> ** HIVE-920: .hiverc doesnt work
> * Improved support for comments.
> ** HIVE-430: Ability to comment desired for hive query files
> * Different output formats
> ** HIVE-49: display column header on CLI
> ** XML output format
> For additional inspiration we may want to look at the Postgres psql shell: 
> http://www.postgresql.org/docs/8.1/static/app-psql.html
> Finally, it would be really cool if we implemented this in a generic fashion 
> and spun it off as an apache-commons
> shell framework. It seems like most of the Apache Hadoop projects have their 
> own shells, and I'm sure the same is true
> for non-Hadoop Apache projects as well. 
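As a rough sketch of what the tab-completion item (HIVE-97) involves in any CLI, Python's standard readline module gives the general shape; this is a generic illustration with an assumed toy keyword list, not Hive's JLine-based design:

```python
import readline

KEYWORDS = ["SELECT", "SHOW", "DESCRIBE", "TABLES", "FROM"]  # toy vocabulary

def complete(text, state):
    """Return the state-th keyword matching the typed prefix, else None."""
    matches = [k for k in KEYWORDS if k.startswith(text.upper())]
    return matches[state] if state < len(matches) else None

readline.set_completer(complete)
readline.parse_and_bind("tab: complete")
# input("hive> ")  # pressing TAB would now cycle through matching keywords
```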

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [DISCUSSION] To be (or not to be) a TLP - that is the question

2010-04-22 Thread Dhruba Borthakur
I am definitely against moving Hive out of Hadoop. There is appreciable
representation of Hive inside the Hadoop PMC and, as far as I can tell, there
is no additional burden on the Hadoop PMC in having Hive remain inside Hadoop.

I respect Jeff's and Amr's viewpoints, but I beg to differ. I really do not
see any benefit in moving Hive out of Hadoop.

thanks,
dhruba

On Thu, Apr 22, 2010 at 10:09 AM, Ashish Thusoo wrote:

> What is the advantage of becoming a TLP to the project itself? I have heard
> that it is something that Apache wants, but considering that we are very
> comfortable with how Hive interacts with the Hadoop ecosystem as a subproject
> of Hadoop, there has to be some big incentive for the project to be a TLP,
> and nowhere have I seen how this would benefit Hive. Any thoughts on that?
>
> Ashish
>
> 
> From: Jeff Hammerbacher [mailto:ham...@cloudera.com]
> Sent: Wednesday, April 21, 2010 7:35 PM
> To: hive-dev@hadoop.apache.org
> Cc: Ashish Thusoo
> Subject: Re: [DISCUSSION] To be (or not to be) a TLP - that is the question
>
> Hive already does the work to run on multiple versions of Hadoop, and the
> release cycle is independent of Hadoop's. I don't see why it should remain a
> subproject. I'm +1 on Hive becoming a TLP.
>
> On Tue, Apr 20, 2010 at 2:03 PM, Zheng Shao <zsh...@gmail.com> wrote:
> As a Hive committer, I don't feel the benefit we get from becoming a
> TLP is big enough (compared with the cost) to make Hive a TLP.
> From Chris's comment I see that the cost is not that big, but I still
> wonder what benefit we will get from that.
>
> Also I didn't get the idea of the joke ("In fact, one could argue that
> Pig opting not to be TLP yet is why Hive should go TLP"). I don't see
> any reason that applies to Pig but not Hive.
> We should continue the discussion here, but anything in Pig's
> discussion should also be considered here.
>
> Zheng
>
> On Mon, Apr 19, 2010 at 5:48 PM, Amr Awadallah <a...@cloudera.com> wrote:
> > I am personally +1 on Hive being a TLP, I think it did reach the
> community
> > adoption and maturity level required for that. In fact, one could argue
> that
> > Pig opting not to be TLP yet is why Hive should go TLP :) (jk).
> >
> > The real question to ask is whether there is a volunteer to take care of
> the
> > "administrative" tasks, which isn't a ton of work afaiu (I am willing to
> > volunteer if no body else up to the task, but I am not a committer and
> only
> > contributed a minor patch for bash/cygwin).
> >
> > BTW, here is a very nice summary from Yahoo's Chris Douglas on TLP
> > tradeoffs. I happen to agree with all he says, and frankly I couldn't have
> > written it better myself. I highlight certain parts from his message, but I
> > recommend you read the whole thing.
> >
> > -- Forwarded message --
> > From: Chris Douglas <cdoug...@apache.org>
> > Date: Tue, Apr 13, 2010 at 11:46 PM
> > Subject: Subprojects and TLP status
> > To: gene...@hadoop.apache.org,
> priv...@hadoop.apache.org
> >
> > Most of Hadoop's subprojects have discussed becoming top-level Apache
> > projects (TLPs) in the last few weeks. Most have expressed a desire to
> > remain in Hadoop. The salient parts of the discussions I've read tend
> > to focus on three aspects: a technical dependence on Hadoop,
> > additional overhead as a TLP, and visibility both within the Hadoop
> > ecosystem and in the open source community generally.
> >
> > Life as a TLP: this is not much harder than being a Hadoop subproject,
> > and the Apache preferences being tossed around- particularly
> > "insufficiently diverse"- are not blockers. Every subproject needs to
> > write a section of the report Hadoop sends to the board; almost the
> > same report, sent to a new address. The initial cost is similarly
> > light: copy bylaws, send a few notes to INFRA, and follow some
> > directions. I think the estimated costs are far higher than they will
> > be in practice. Inertia is a powerful force, but it should be
> > overcome. The directions are here, and should not be intimidating:
> >
> > http://apache.org/dev/project-creation.html
> >
> > Visibility: the Hadoop site does not need to change. For each
> > subproject, we can literally change the hyperlinks to point to the new
> > page and be done. Long-term, linking to all ASF projects that run on
> > Hadoop from a prominent page is something we all want. So particularly
> > in the medium-term that most are considering: visibility through the
> > website will not change. Each subproject will still be linked from the
> > front page.
> >
> > Hadoop would not be nearly as popular as it is without Zookeeper,
> > HBase, Hive, and Pig. All statistics on work in shared MapReduce
> > clusters show that users vastly prefer running Pig and Hive queries to
> > writing MapReduce jobs. HBase continues to push features in HDF

RE: [DISCUSSION] To be (or not to be) a TLP - that is the question

2010-04-22 Thread Ashish Thusoo
What is the advantage of becoming a TLP to the project itself? I have heard
that it is something that Apache wants, but considering that we are very
comfortable with how Hive interacts with the Hadoop ecosystem as a subproject
of Hadoop, there has to be some big incentive for the project to be a TLP, and
nowhere have I seen how this would benefit Hive. Any thoughts on that?

Ashish


From: Jeff Hammerbacher [mailto:ham...@cloudera.com]
Sent: Wednesday, April 21, 2010 7:35 PM
To: hive-dev@hadoop.apache.org
Cc: Ashish Thusoo
Subject: Re: [DISCUSSION] To be (or not to be) a TLP - that is the question

Hive already does the work to run on multiple versions of Hadoop, and the 
release cycle is independent of Hadoop's. I don't see why it should remain a 
subproject. I'm +1 on Hive becoming a TLP.

On Tue, Apr 20, 2010 at 2:03 PM, Zheng Shao <zsh...@gmail.com> wrote:
As a Hive committer, I don't feel the benefit we get from becoming a
TLP is big enough (compared with the cost) to make Hive a TLP.
From Chris's comment I see that the cost is not that big, but I still
wonder what benefit we will get from that.

Also I didn't get the idea of the joke ("In fact, one could argue that
Pig opting not to be TLP yet is why Hive should go TLP"). I don't see
any reason that applies to Pig but not Hive.
We should continue the discussion here, but anything in Pig's
discussion should also be considered here.

Zheng

On Mon, Apr 19, 2010 at 5:48 PM, Amr Awadallah <a...@cloudera.com> wrote:
> I am personally +1 on Hive being a TLP, I think it did reach the community
> adoption and maturity level required for that. In fact, one could argue that
> Pig opting not to be TLP yet is why Hive should go TLP :) (jk).
>
> The real question to ask is whether there is a volunteer to take care of the
> "administrative" tasks, which isn't a ton of work afaiu (I am willing to
> volunteer if no body else up to the task, but I am not a committer and only
> contributed a minor patch for bash/cygwin).
>
> BTW, here is a very nice summary from Yahoo's Chris Douglas on TLP
> tradeoffs. I happen to agree with all he says, and frankly I couldn't have
> written it better myself. I highlight certain parts from his message, but I
> recommend you read the whole thing.
>
> -- Forwarded message --
> From: Chris Douglas <cdoug...@apache.org>
> Date: Tue, Apr 13, 2010 at 11:46 PM
> Subject: Subprojects and TLP status
> To: gene...@hadoop.apache.org, 
> priv...@hadoop.apache.org
>
> Most of Hadoop's subprojects have discussed becoming top-level Apache
> projects (TLPs) in the last few weeks. Most have expressed a desire to
> remain in Hadoop. The salient parts of the discussions I've read tend
> to focus on three aspects: a technical dependence on Hadoop,
> additional overhead as a TLP, and visibility both within the Hadoop
> ecosystem and in the open source community generally.
>
> Life as a TLP: this is not much harder than being a Hadoop subproject,
> and the Apache preferences being tossed around- particularly
> "insufficiently diverse"- are not blockers. Every subproject needs to
> write a section of the report Hadoop sends to the board; almost the
> same report, sent to a new address. The initial cost is similarly
> light: copy bylaws, send a few notes to INFRA, and follow some
> directions. I think the estimated costs are far higher than they will
> be in practice. Inertia is a powerful force, but it should be
> overcome. The directions are here, and should not be intimidating:
>
> http://apache.org/dev/project-creation.html
>
> Visibility: the Hadoop site does not need to change. For each
> subproject, we can literally change the hyperlinks to point to the new
> page and be done. Long-term, linking to all ASF projects that run on
> Hadoop from a prominent page is something we all want. So particularly
> in the medium-term that most are considering: visibility through the
> website will not change. Each subproject will still be linked from the
> front page.
>
> Hadoop would not be nearly as popular as it is without Zookeeper,
> HBase, Hive, and Pig. All statistics on work in shared MapReduce
> clusters show that users vastly prefer running Pig and Hive queries to
> writing MapReduce jobs. HBase continues to push features in HDFS that
> increase its adoption and relevance outside MapReduce, while sharing
> some of its NoSQL limelight. Zookeeper is not only a linchpin in real
> workloads, but many proposals for future features require it. The
> bottom line is that MapReduce and HDFS need these projects for
> visibility and adoption in precisely the same way. I don't think
> separate TLPs will uncouple the broader community from one another.
>
> Technical dependence: this has two dimensions. First, influencing
> MapReduce and HDFS. This is nonsense. Earning influence by
> contributing to a subproject is the only way to push code changes.

Hudson build is back to normal : Hive-trunk-h0.18 #421

2010-04-22 Thread Apache Hudson Server
See 




Re: [DISCUSSION] To be (or not to be) a TLP - that is the question

2010-04-22 Thread Edward Capriolo
On Wed, Apr 21, 2010 at 10:35 PM, Jeff Hammerbacher wrote:

> Hive already does the work to run on multiple versions of Hadoop, and the
> release cycle is independent of Hadoop's. I don't see why it should remain
> a
> subproject. I'm +1 on Hive becoming a TLP.
>
> On Tue, Apr 20, 2010 at 2:03 PM, Zheng Shao  wrote:
>
> > As a Hive committer, I don't feel the benefit we get from becoming a
> > TLP is big enough (compared with the cost) to make Hive a TLP.
> > From Chris's comment I see that the cost is not that big, but I still
> > wonder what benefit we will get from that.
> >
> > Also I didn't get the idea of the joke ("In fact, one could argue that
> > Pig opting not to be TLP yet is why Hive should go TLP"). I don't see
> > any reason that applies to Pig but not Hive.
> > We should continue the discussion here, but anything in Pig's
> > discussion should also be considered here.
> >
> > Zheng
> >
> > On Mon, Apr 19, 2010 at 5:48 PM, Amr Awadallah  wrote:
> > > I am personally +1 on Hive being a TLP, I think it did reach the
> > community
> > > adoption and maturity level required for that. In fact, one could argue
> > that
> > > Pig opting not to be TLP yet is why Hive should go TLP :) (jk).
> > >
> > > The real question to ask is whether there is a volunteer to take care
> of
> > the
> > > "administrative" tasks, which isn't a ton of work afaiu (I am willing
> to
> > > volunteer if no body else up to the task, but I am not a committer and
> > only
> > > contributed a minor patch for bash/cygwin).
> > >
> > > BTW, here is a very nice summary from Yahoo's Chris Douglas on TLP
> > > tradeoffs. I happen to agree with all he says, and frankly I couldn't have
> > > written it better myself. I highlight certain parts from his message, but I
> > > recommend you read the whole thing.
> > >
> > > -- Forwarded message --
> > > From: Chris Douglas 
> > > Date: Tue, Apr 13, 2010 at 11:46 PM
> > > Subject: Subprojects and TLP status
> > > To: gene...@hadoop.apache.org, priv...@hadoop.apache.org
> > >
> > > Most of Hadoop's subprojects have discussed becoming top-level Apache
> > > projects (TLPs) in the last few weeks. Most have expressed a desire to
> > > remain in Hadoop. The salient parts of the discussions I've read tend
> > > to focus on three aspects: a technical dependence on Hadoop,
> > > additional overhead as a TLP, and visibility both within the Hadoop
> > > ecosystem and in the open source community generally.
> > >
> > > Life as a TLP: this is not much harder than being a Hadoop subproject,
> > > and the Apache preferences being tossed around- particularly
> > > "insufficiently diverse"- are not blockers. Every subproject needs to
> > > write a section of the report Hadoop sends to the board; almost the
> > > same report, sent to a new address. The initial cost is similarly
> > > light: copy bylaws, send a few notes to INFRA, and follow some
> > > directions. I think the estimated costs are far higher than they will
> > > be in practice. Inertia is a powerful force, but it should be
> > > overcome. The directions are here, and should not be intimidating:
> > >
> > > http://apache.org/dev/project-creation.html
> > >
> > > Visibility: the Hadoop site does not need to change. For each
> > > subproject, we can literally change the hyperlinks to point to the new
> > > page and be done. Long-term, linking to all ASF projects that run on
> > > Hadoop from a prominent page is something we all want. So particularly
> > > in the medium-term that most are considering: visibility through the
> > > website will not change. Each subproject will still be linked from the
> > > front page.
> > >
> > > Hadoop would not be nearly as popular as it is without Zookeeper,
> > > HBase, Hive, and Pig. All statistics on work in shared MapReduce
> > > clusters show that users vastly prefer running Pig and Hive queries to
> > > writing MapReduce jobs. HBase continues to push features in HDFS that
> > > increase its adoption and relevance outside MapReduce, while sharing
> > > some of its NoSQL limelight. Zookeeper is not only a linchpin in real
> > > workloads, but many proposals for future features require it. The
> > > bottom line is that MapReduce and HDFS need these projects for
> > > visibility and adoption in precisely the same way. I don't think
> > > separate TLPs will uncouple the broader community from one another.
> > >
> > > Technical dependence: this has two dimensions. First, influencing
> > > MapReduce and HDFS. This is nonsense. Earning influence by
> > > contributing to a subproject is the only way to push code changes;
> > > nobody from any of these projects has violated that by unilaterally
> > > committing to HDFS or MapReduce, anyway. And anyone cynical enough to
> > > believe that MapReduce and HDFS would deliberately screw over or
> > > ignore dependent projects because they don't have PMC members is
> > > plainly unsuited to community-driven development. I understand that
> > > t

[jira] Commented: (HIVE-987) Hive CLI Omnibus Improvement ticket

2010-04-22 Thread Raghotham Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859751#action_12859751
 ] 

Raghotham Murthy commented on HIVE-987:
---

jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java shows how to use
Hive's JDBC driver in embedded mode. Doesn't it run as part of ant test?

> Hive CLI Omnibus Improvement ticket
> ---
>
> Key: HIVE-987
> URL: https://issues.apache.org/jira/browse/HIVE-987
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Carl Steinbach
> Attachments: HIVE-987.1.patch, sqlline-1.0.8_eb.jar
>
>
> Add the following features to the Hive CLI:
> * Command History
> * ReadLine support
> ** HIVE-120: Add readline support/support for alt-based commands in the CLI
> ** Java-ReadLine is LGPL, but it depends on GPL readline library. We probably 
> need to use JLine instead.
> * Tab completion
> ** HIVE-97: tab completion for hive cli
> * Embedded/Standalone CLI modes, and ability to connect to different Hive 
> Server instances.
> ** HIVE-818: Create a Hive CLI that connects to hive ThriftServer
> * .hiverc configuration file
> ** HIVE-920: .hiverc doesnt work
> * Improved support for comments.
> ** HIVE-430: Ability to comment desired for hive query files
> * Different output formats
> ** HIVE-49: display column header on CLI
> ** XML output format
> For additional inspiration we may want to look at the Postgres psql shell: 
> http://www.postgresql.org/docs/8.1/static/app-psql.html
> Finally, it would be really cool if we implemented this in a generic fashion 
> and spun it off as an apache-commons
> shell framework. It seems like most of the Apache Hadoop projects have their 
> own shells, and I'm sure the same is true
> for non-Hadoop Apache projects as well. 
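The command-history and .hiverc items above reduce to one small pattern: replay a startup file at launch, then record each interactive line. A hedged Python sketch (the file name and the "--" comment syntax come from the list above; every other name is hypothetical, not the proposed Hive code):

```python
import os

def load_rc(history, path=".hiverc"):
    """Replay commands from a startup file, skipping blanks and '--' comments."""
    if not os.path.exists(path):
        return  # a missing rc file is not an error
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("--"):
                history.append(line)

# Simulate a startup file and replay it into an empty command history.
with open(".hiverc", "w") as f:
    f.write("set x=1;\n-- a comment\n\nshow tables;\n")
history = []
load_rc(history)
print(history)  # ['set x=1;', 'show tables;']
```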

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.