[jira] Updated: (HIVE-1211) Tapping logs from child processes

2010-06-29 Thread bc Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bc Wong updated HIVE-1211:
--

Attachment: HIVE-1211-2.patch

> Tapping logs from child processes
> -
>
> Key: HIVE-1211
> URL: https://issues.apache.org/jira/browse/HIVE-1211
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Logging
>Reporter: bc Wong
>Assignee: bc Wong
> Fix For: 0.6.0
>
> Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch
>
>
> Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to 
> the parent's stdout/stderr. There is little one can do to sort out which 
> log is from which query.
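
One way to picture the idea of "tapping" a child's output is to copy each line to a sink with a per-query tag. The sketch below is illustrative only and is not taken from the attached patches; the class name `ChildLogTapSketch`, the `tap` method, and the `[queryId]` prefix format are all assumptions. In real use the input stream would be something like `Process.getInputStream()`; here a byte array stands in for it.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hive.
class ChildLogTapSketch {

  // Copies every line read from the child's stream to 'out',
  // prefixed with the (assumed) query id so lines can be told apart.
  static void tap(InputStream childOutput, String queryId, PrintStream out) {
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(childOutput, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        out.println("[" + queryId + "] " + line);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Stand-in for the stdout of a real child process.
    InputStream fake = new ByteArrayInputStream(
        "map 0%\nmap 100%\n".getBytes(StandardCharsets.UTF_8));
    tap(fake, "query-1", System.out);
  }
}
```

In practice each child stream would probably be drained on its own thread so that a full pipe buffer cannot block the child; that part is left out of the sketch.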

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1211) Tapping logs from child processes

2010-06-29 Thread bc Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bc Wong updated HIVE-1211:
--

Status: Patch Available  (was: Open)

Updated patch on 0.6 branch.

> Tapping logs from child processes
> -
>
> Key: HIVE-1211
> URL: https://issues.apache.org/jira/browse/HIVE-1211
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Logging
>Reporter: bc Wong
>Assignee: bc Wong
> Fix For: 0.6.0
>
> Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch
>
>
> Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to 
> the parent's stdout/stderr. There is little one can do to sort out which 
> log is from which query.




[jira] Commented: (HIVE-1443) Add support to turn off bucketing with ALTER TABLE

2010-06-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883792#action_12883792
 ] 

Namit Jain commented on HIVE-1443:
--

+1


will commit if the tests pass

> Add support to turn off bucketing with ALTER TABLE
> --
>
> Key: HIVE-1443
> URL: https://issues.apache.org/jira/browse/HIVE-1443
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Paul Yang
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1443.1.patch
>
>
> Currently, there is an alter table command that can change the bucketing / 
> sort columns, as well as the number of buckets (ALTER TABLE table_name 
> CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name, ...)] INTO 
> num_buckets BUCKETS). However, this command does not provide a means to 
> disable bucketing. This proposes to introduce a syntax like
> ALTER TABLE src NOT CLUSTERED;
> that would turn off bucketing.




[jira] Updated: (HIVE-1443) Add support to turn off bucketing with ALTER TABLE

2010-06-29 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1443:


Status: Patch Available  (was: Open)

> Add support to turn off bucketing with ALTER TABLE
> --
>
> Key: HIVE-1443
> URL: https://issues.apache.org/jira/browse/HIVE-1443
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Paul Yang
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1443.1.patch
>
>
> Currently, there is an alter table command that can change the bucketing / 
> sort columns, as well as the number of buckets (ALTER TABLE table_name 
> CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name, ...)] INTO 
> num_buckets BUCKETS). However, this command does not provide a means to 
> disable bucketing. This proposes to introduce a syntax like
> ALTER TABLE src NOT CLUSTERED;
> that would turn off bucketing.




[jira] Updated: (HIVE-1443) Add support to turn off bucketing with ALTER TABLE

2010-06-29 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1443:


Attachment: HIVE-1443.1.patch

> Add support to turn off bucketing with ALTER TABLE
> --
>
> Key: HIVE-1443
> URL: https://issues.apache.org/jira/browse/HIVE-1443
> Project: Hadoop Hive
>  Issue Type: Improvement
>Reporter: Paul Yang
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1443.1.patch
>
>
> Currently, there is an alter table command that can change the bucketing / 
> sort columns, as well as the number of buckets (ALTER TABLE table_name 
> CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name, ...)] INTO 
> num_buckets BUCKETS). However, this command does not provide a means to 
> disable bucketing. This proposes to introduce a syntax like
> ALTER TABLE src NOT CLUSTERED;
> that would turn off bucketing.




[jira] Updated: (HIVE-1434) Cassandra Storage Handler

2010-06-29 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1434:
--

Attachment: hive-1434-1.txt

Just a start (to prove that I am doing something with this ticket).

> Cassandra Storage Handler
> -
>
> Key: HIVE-1434
> URL: https://issues.apache.org/jira/browse/HIVE-1434
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Attachments: hive-1434-1.txt
>
>
> Add a Cassandra storage handler.




[jira] Created: (HIVE-1443) Add support to turn off bucketing with ALTER TABLE

2010-06-29 Thread Paul Yang (JIRA)
Add support to turn off bucketing with ALTER TABLE
--

 Key: HIVE-1443
 URL: https://issues.apache.org/jira/browse/HIVE-1443
 Project: Hadoop Hive
  Issue Type: Improvement
Reporter: Paul Yang
Assignee: Paul Yang
 Fix For: 0.7.0


Currently, there is an alter table command that can change the bucketing / sort 
columns, as well as the number of buckets (ALTER TABLE table_name CLUSTERED BY 
(col_name, col_name, ...) [SORTED BY (col_name, ...)] INTO num_buckets 
BUCKETS). However, this command does not provide a means to disable bucketing. 
This proposes to introduce a syntax like

ALTER TABLE src NOT CLUSTERED;

that would turn off bucketing.




[jira] Commented: (HIVE-1434) Cassandra Storage Handler

2010-06-29 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883743#action_12883743
 ] 

Jeremy Hanna commented on HIVE-1434:


I guess this is the Hive version of CASSANDRA-913. I saw hammer in the hall at 
the Hadoop Summit, and he said there was a Hive ticket on this now.

> Cassandra Storage Handler
> -
>
> Key: HIVE-1434
> URL: https://issues.apache.org/jira/browse/HIVE-1434
> Project: Hadoop Hive
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> Add a Cassandra storage handler.




[jira] Resolved: (HIVE-1439) Alter the number of buckets for a table

2010-06-29 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang resolved HIVE-1439.
-

Fix Version/s: 0.6.0
   Resolution: Not A Problem

> Alter the number of buckets for a table
> ---
>
> Key: HIVE-1439
> URL: https://issues.apache.org/jira/browse/HIVE-1439
> Project: Hadoop Hive
>  Issue Type: New Feature
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Paul Yang
>Assignee: Paul Yang
> Fix For: 0.6.0
>
>
> Add an alter table command to change the number of buckets for a table.
> e.g.
> {code}
> ALTER TABLE mytabl SET NUMBUCKETS 64;
> {code}




[jira] Commented: (HIVE-1439) Alter the number of buckets for a table

2010-06-29 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883732#action_12883732
 ] 

Paul Yang commented on HIVE-1439:
-

This is actually supported with ALTER TABLE table_name CLUSTERED BY (col_name, 
col_name, ...) [SORTED BY (col_name, ...)] INTO num_buckets BUCKETS. However, 
that command doesn't support turning off bucketing. I will file a JIRA for that.

> Alter the number of buckets for a table
> ---
>
> Key: HIVE-1439
> URL: https://issues.apache.org/jira/browse/HIVE-1439
> Project: Hadoop Hive
>  Issue Type: New Feature
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Paul Yang
>Assignee: Paul Yang
> Fix For: 0.6.0
>
>
> Add an alter table command to change the number of buckets for a table.
> e.g.
> {code}
> ALTER TABLE mytabl SET NUMBUCKETS 64;
> {code}




[jira] Updated: (HIVE-1428) ALTER TABLE ADD PARTITION fails with a remote Thrift metastore

2010-06-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated HIVE-1428:
-

Attachment: TestHiveMetaStoreRemote.java

I have tried writing a unit test, based on the existing one in 
TestHiveMetastore, that starts a Thrift server. Unfortunately, I have hit 
problems because some of the Ant properties, such as build.dir, are not expanded 
in the conf seen by the server, which leads to an error. I am attaching the test 
file. Can someone suggest how I can resolve this, and whether any other test has 
attempted something similar?

Here is the exception I see:
{noformat}
2Codec, fs.checkpoint.size=67108864}
[junit] MetaException(message:Got exception: java.io.FileNotFoundException 
File file:/test/data/warehouse/compdb.db does not exist.)
[junit] at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_database_result.read(ThriftHiveMetastore.java:2751)
[junit] at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_database(ThriftHiveMetastore.java:127)
[junit] at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_database(ThriftHiveMetastore.java:104)
[junit] at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:270)
{noformat}

Here is the value in the conf as seen by the server:
 hive.metastore.warehouse.dir=file://${build.dir}/test/data/warehouse/

> ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
> --
>
> Key: HIVE-1428
> URL: https://issues.apache.org/jira/browse/HIVE-1428
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.6.0, 0.7.0
>Reporter: Paul Yang
> Attachments: HIVE-1428.patch, TestHiveMetaStoreRemote.java
>
>
> If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD 
> PARTITION commands will fail with an error similar to the following:
> [prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE 
> mytable add partition(datestamp = '20091101', srcid = '10',action) location 
> '/user/pradeepk/mytable/20091101/10';"
> 10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found 
> in the classpath. Usage of hadoop-site.xml is deprecated. Instead use 
> core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of 
> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> Hive history 
> file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
> FAILED: Error in metadata: org.apache.thrift.TApplicationException: 
> get_partition failed: unknown result
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> [prade...@chargesize:~/dev/howl]
> This is due to a check that tries to retrieve the partition to see if it 
> exists. If it does not, an attempt is made to pass a null value from the 
> metastore. Since thrift does not support null return values, an exception is 
> thrown.




Build failed in Hudson: Hive-trunk-h0.19 #486

2010-06-29 Thread Apache Hudson Server
See 

Changes:

[namit] HIVE-1440. Bug in RCFiles with local work (map-join or sort-merge join)
(He Yongqiang via namit)

--
[...truncated 13845 lines...]
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2

Build failed in Hudson: Hive-trunk-h0.18 #486

2010-06-29 Thread Apache Hudson Server
See 

Changes:

[namit] HIVE-1440. Bug in RCFiles with local work (map-join or sort-merge join)
(He Yongqiang via namit)

--
[...truncated 13782 lines...]
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2

Hudson build is back to normal : Hive-trunk-h0.17 #484

2010-06-29 Thread Apache Hudson Server
See 




[jira] Commented: (HIVE-1096) Hive Variables

2010-06-29 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883561#action_12883561
 ] 

HBase Review Board commented on HIVE-1096:
--

Message from: "Edward Capriolo" 


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 260
bq.  > 
bq.  >
bq.  > I like the word "interpolate" too, but I think more people are 
probably familiar with "substitute". Please change the name to 
HIVESUBSTITUTEVARIABLES.

I do not want the feature to be called different things across the code base. 
"Replace" here and "interpolate" there will be confusing for everyone. You 
originally suggested interpolate: "Driver.interpolateCommandVariables()".


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 561
bq.  > 
bq.  >
bq.  > No need to test a boolean valued method for equality to true. The 
logic can also be simplified as follows:
bq.  > 
bq.  > if (!getBoolVar(ConfVars.HIVEVARIABLEINTERPOLATE)) {
bq.  >   return expr;
bq.  > }
bq.  > l4j.info("Interpolation is on");
bq.  > ...
bq.  > 
bq.  > Also, please move this out of the substitution function and into the 
Driver, i.e. Driver.compile() calls VariableSubstitution.substitute() iff 
HIVESUBSTITUTEVARIABLES == true.

Several components use substitute (SetProcessor, Driver, File Processor), so 
having the interpolation on/off logic in each class is redundant. This way is 
better because it supports information hiding.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 562
bq.  > 
bq.  >
bq.  > This log message is going to generate a lot of noise, and is 
unnecessary since you can determine the value using the 'set' command. Please 
remove it.

I think it is very important to see the command before and after substitution. 
We can set this to debug; I do not think it is noise.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 587
bq.  > 
bq.  >
bq.  > I thought the plan was to also introduce a "conf:" prefix for 
referencing configuration properties, and that any value not prefixed by 
system/env/conf would map to a new variable namespace? Please add the conf 
prefix and introduce some state in the SessionState for storing variables (i.e. 
non System/Env/Conf properties).

After pondering this, I think the "conf:" prefix is confusing and hurts 
backwards compatibility. Right now "hive -hiveconf x=y" is equivalent to 
"set x=y". I do not think we want to introduce another switch such as 
"--hivevarconf". In most cases users want conf access in order to set properties 
that can be picked up by classes. Hadoop and Hive conf is the way it is; adding 
something else will not fix the problem and will confuse people.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/ql/src/test/queries/clientpositive/set_processor_namespaces.q, 
line 4
bq.  > 
bq.  >
bq.  > You need a test for the "env" namespace. I think this is impossible 
to do here, so you probably need to add a unit test.

Env variables are not changeable and are system dependent. I do not think there 
is a way to test these.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 595
bq.  > 
bq.  >
bq.  > Please remove this log call. It will generate a lot of noise.

I think it is very important to see the command before and after substitution. 
We can set this to debug; I do not think it is noise.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 50
bq.  > 
bq.  >
bq.  > The variables varPat and MAX_SUBST both exist as private variables 
in the parent class. I think that redefining them in HiveConf will result in 
lots of confusion down the road, plus they don't really belong here. Please 
move this to a "VariableSubstitution" class located in ql.

I see your point. Then again, Hadoop does the substitution inside the conf 
class. Also, ql is becoming the 'uber package'. At this point, what isn't ql?


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > 
trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java, line 
93
bq.  > 

Re: Review Request: HIVE-1096: Hive Variables

2010-06-29 Thread Edward Capriolo


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 260
> > 
> >
> > I like the word "interpolate" too, but I think more people are probably 
> > familiar with "substitute". Please change the name to 
> > HIVESUBSTITUTEVARIABLES.

I do not want the feature to be called different things across the code base. 
"Replace" here and "interpolate" there will be confusing for everyone. You 
originally suggested interpolate: "Driver.interpolateCommandVariables()".


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 561
> > 
> >
> > No need to test a boolean valued method for equality to true. The logic 
> > can also be simplified as follows:
> > 
> > if (!getBoolVar(ConfVars.HIVEVARIABLEINTERPOLATE)) {
> >   return expr;
> > }
> > l4j.info("Interpolation is on");
> > ...
> > 
> > Also, please move this out of the substitution function and into the 
> > Driver, i.e. Driver.compile() calls VariableSubstitution.substitute() iff 
> > HIVESUBSTITUTEVARIABLES == true.

Several components use substitute (SetProcessor, Driver, File Processor), so 
having the interpolation on/off logic in each class is redundant. This way is 
better because it supports information hiding.


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 562
> > 
> >
> > This log message is going to generate a lot of noise, and is 
> > unnecessary since you can determine the value using the 'set' command. 
> > Please remove it.

I think it is very important to see the command before and after substitution. 
We can set this to debug; I do not think it is noise.


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 587
> > 
> >
> > I thought the plan was to also introduce a "conf:" prefix for 
> > referencing configuration properties, and that any value not prefixed by 
> > system/env/conf would map to a new variable namespace? Please add the conf 
> > prefix and introduce some state in the SessionState for storing variables 
> > (i.e. non System/Env/Conf properties).

After pondering this, I think the "conf:" prefix is confusing and hurts 
backwards compatibility. Right now "hive -hiveconf x=y" is equivalent to 
"set x=y". I do not think we want to introduce another switch such as 
"--hivevarconf". In most cases users want conf access in order to set properties 
that can be picked up by classes. Hadoop and Hive conf is the way it is; adding 
something else will not fix the problem and will confuse people.


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/ql/src/test/queries/clientpositive/set_processor_namespaces.q, line 4
> > 
> >
> > You need a test for the "env" namespace. I think this is impossible to 
> > do here, so you probably need to add a unit test.

Env variables are not changeable and are system dependent. I do not think there 
is a way to test these.


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 595
> > 
> >
> > Please remove this log call. It will generate a lot of noise.

I think it is very important to see the command before and after substitution. 
We can set this to debug; I do not think it is noise.


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 50
> > 
> >
> > The variables varPat and MAX_SUBST both exist as private variables in 
> > the parent class. I think that redefining them in HiveConf will result in 
> > lots of confusion down the road, plus they don't really belong here. Please 
> > move this to a "VariableSubstitution" class located in ql.

I see your point. Then again, Hadoop does the substitution inside the conf 
class. Also, ql is becoming the 'uber package'. At this point, what isn't ql?


> On 2010-06-29 00:50:22, Carl Steinbach wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java, 
> > line 93
> > 
> >
> > I don't think you want to perform variable substitution at this point. 
> > It makes it impossible to create nested variables.

I will check it out.


- Edward


---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/229/#review

[jira] Commented: (HIVE-1096) Hive Variables

2010-06-29 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883452#action_12883452
 ] 

Carl Steinbach commented on HIVE-1096:
--



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   This code does not belong in common. Please move it to ql.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   The variables varPat and MAX_SUBST both exist as private variables in the 
parent class. I think that redefining them in HiveConf will result in lots of 
confusion down the road, plus they don't really belong here. Please move this 
to a "VariableSubstitution" class located in ql.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   I like the word "interpolate" too, but I think more people are probably 
familiar with "substitute". Please change the name to HIVESUBSTITUTEVARIABLES.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   Please fix the formatting problems in this method (spaces before and after 
parens, no space between literals and operators, etc). Please run checkstyle 
and verify that you are not introducing any new checkstyle errors.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   No need to test a boolean valued method for equality to true. The logic can 
also be simplified as follows:

   if (!getBoolVar(ConfVars.HIVEVARIABLEINTERPOLATE)) {
     return expr;
   }
   l4j.info("Interpolation is on");
   ...

   Also, please move this out of the substitution function and into the Driver, 
i.e. Driver.compile() calls VariableSubstitution.substitute() iff 
HIVESUBSTITUTEVARIABLES == true.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   This log message is going to generate a lot of noise, and is unnecessary 
since you can determine the value using the 'set' command. Please remove it.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   Presumably there aren't going to be any properties in the HiveConf starting 
with "system:" or "env:". Please move the "env:" check before the conf check, 
and move these string comparisons outside of the try/catch blocks, e.g.:

   private static final String SYSTEM_VAR_PREFIX = "system:";
   private static final String ENV_VAR_PREFIX = "env:";

   if (var.startsWith(SYSTEM_VAR_PREFIX)) {
     try {
       val = System.getProperty(var.substring(SYSTEM_VAR_PREFIX.length()));
     } catch (SecurityException se) {
       ...
     }
   } else if (var.startsWith(ENV_VAR_PREFIX)) {
     val = System.getenv(var.substring(ENV_VAR_PREFIX.length()));
   } else {
     val = ss.getConf().get(var);
   }



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   I thought the plan was to also introduce a "conf:" prefix for referencing 
configuration properties, and that any value not prefixed by system/env/conf 
would map to a new variable namespace? Please add the conf prefix and introduce 
some state in the SessionState for storing variables (i.e. non System/Env/Conf 
properties).
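
   A minimal sketch of the lookup order described above, assuming plain maps as 
stand-ins for the HiveConf and for the variable state that would live in the 
SessionState. The class name VariableLookupSketch, the CONF_VAR_PREFIX constant 
and the setConf/setVar helpers are hypothetical and are not part of the patch:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of prefix-based variable lookup; not the actual patch. */
public class VariableLookupSketch {
  private static final String SYSTEM_VAR_PREFIX = "system:";
  private static final String ENV_VAR_PREFIX = "env:";
  private static final String CONF_VAR_PREFIX = "conf:"; // assumed name

  // Stand-ins (assumptions): one map for the HiveConf, one for the
  // per-session variable namespace that SessionState would hold.
  private final Map<String, String> conf = new HashMap<String, String>();
  private final Map<String, String> sessionVars = new HashMap<String, String>();

  public void setConf(String name, String value) {
    conf.put(name, value);
  }

  public void setVar(String name, String value) {
    sessionVars.put(name, value);
  }

  /** Returns the value for a variable reference, or null if it is undefined. */
  public String lookup(String var) {
    if (var.startsWith(SYSTEM_VAR_PREFIX)) {
      try {
        return System.getProperty(var.substring(SYSTEM_VAR_PREFIX.length()));
      } catch (SecurityException | IllegalArgumentException e) {
        // Unreadable property or empty name: treat as undefined.
        return null;
      }
    } else if (var.startsWith(ENV_VAR_PREFIX)) {
      return System.getenv(var.substring(ENV_VAR_PREFIX.length()));
    } else if (var.startsWith(CONF_VAR_PREFIX)) {
      return conf.get(var.substring(CONF_VAR_PREFIX.length()));
    }
    // No recognized prefix: fall through to the session variable namespace.
    return sessionVars.get(var);
  }
}
```

   With this layout an unprefixed name such as DT never collides with a 
configuration property, because configuration properties are only reachable 
through the "conf:" prefix.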



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


   Please remove this log call. It will generate a lot of noise.



trunk/conf/hive-default.xml


   Please change the name to "hive.substitute.variables", and the description 
to "Enable variable substitution".



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java


   Check HIVESUBSTITUTEVARIABLES here.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


   Please create static final variables for "system:" and "env:" and call 
length() on these variables instead of using magic numbers.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


   I don't think you want to perform variable substitution at this point. It 
makes it impossible to create nested variables.
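
   To illustrate the point about nesting, here is a rough sketch in which "set" 
stores the raw text and expansion happens only when the value is used. The 
class NestedVariableSketch, its pattern and its limit of 20 passes are 
assumptions made for the example, not the code under review:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of deferred (use-time) variable substitution. */
public class NestedVariableSketch {
  // Assumed for this sketch: ${name}, where name contains no '}', '$' or space.
  private static final Pattern VAR_PAT = Pattern.compile("\\$\\{([^\\}\\$ ]+)\\}");
  private static final int MAX_SUBST = 20;

  private final Map<String, String> vars = new HashMap<String, String>();

  /** Stores the raw value; any ${...} inside it is left unexpanded. */
  public void set(String name, String rawValue) {
    vars.put(name, rawValue);
  }

  /** Expands one defined reference per pass, for at most MAX_SUBST passes. */
  public String substitute(String expr) {
    String result = expr;
    for (int pass = 0; pass < MAX_SUBST; pass++) {
      Matcher m = VAR_PAT.matcher(result);
      String next = null;
      while (m.find()) {
        String val = vars.get(m.group(1));
        if (val != null) {
          next = result.substring(0, m.start()) + val + result.substring(m.end());
          break;
        }
      }
      if (next == null) {
        return result; // nothing left that can be resolved
      }
      result = next;
    }
    throw new IllegalStateException(
        "Variable substitution depth exceeded " + MAX_SUBST + " for: " + expr);
  }
}
```

   Because the stored value keeps its ${DT} reference, a later change to DT is 
picked up the next time the outer variable is used. Substituting inside "set" 
would freeze the value at definition time instead.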



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


   This logic for mapping variable names to values has been repeated several 
times. Please move it to a dedicated method in the VariableSubstitution class.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


   Please run checkstyle and fix any checkstyle errors that you have introduced.



trunk/ql/src/t

[jira] Commented: (HIVE-1096) Hive Variables

2010-06-29 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883453#action_12883453
 ] 

HBase Review Board commented on HIVE-1096:
--

Message from: "Carl Steinbach" 

---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/229/#review288
---



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


This code does not belong in common. Please move it to ql.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


The variables varPat and MAX_SUBST both exist as private variables in the 
parent class. I think that redefining them in HiveConf will result in lots of 
confusion down the road, plus they don't really belong here. Please move this 
to a "VariableSubstitution" class located in ql.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


I like the word "interpolate" too, but I think more people are probably 
familiar with "substitute". Please change the name to HIVESUBSTITUTEVARIABLES.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Please fix the formatting problems in this method (spaces before and after 
parens, no space between literals and operators, etc). Please run checkstyle 
and verify that you are not introducing any new checkstyle errors.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


No need to test a boolean valued method for equality to true. The logic can 
also be simplified as follows:

if (!getBoolVar(ConfVars.HIVEVARIABLEINTERPOLATE)) {
  return expr;
}
l4j.info("Interpolation is on");
...

Also, please move this out of the substitution function and into the 
Driver, i.e. Driver.compile() calls VariableSubstitution.substitute() iff 
HIVESUBSTITUTEVARIABLES == true.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


This log message is going to generate a lot of noise, and is unnecessary 
since you can determine the value using the 'set' command. Please remove it.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Presumably there aren't going to be any properties in the HiveConf starting 
with "system:" or "env:". Please move the "env:" check before the conf check, 
and move these string comparisons outside of the try/catch blocks, e.g.:

private static final String SYSTEM_VAR_PREFIX = "system:";
private static final String ENV_VAR_PREFIX = "env:";

if (var.startsWith(SYSTEM_VAR_PREFIX)) {
  try {
    val = System.getProperty(var.substring(SYSTEM_VAR_PREFIX.length()));
  } catch (SecurityException se) {
    ...
  }
} else if (var.startsWith(ENV_VAR_PREFIX)) {
  val = System.getenv(var.substring(ENV_VAR_PREFIX.length()));
} else {
  val = ss.getConf().get(var);
}



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


I thought the plan was to also introduce a "conf:" prefix for referencing 
configuration properties, and that any value not prefixed by system/env/conf 
would map to a new variable namespace? Please add the conf prefix and introduce 
some state in the SessionState for storing variables (i.e. non System/Env/Conf 
properties).



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Please remove this log call. It will generate a lot of noise.



trunk/conf/hive-default.xml


Please change the name to "hive.substitute.variables", and the description 
to "Enable variable substitution".



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java


Check HIVESUBSTITUTEVARIABLES here.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


Please create static final variables for "system:" and "env:" and call 
length() on these variables instead of using magic numbers.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


I don't think you want to perform variable substitution at this point. It 
makes it impossible to create nested variables.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


This logic for mapping variable names to values has been repea

[jira] Updated: (HIVE-1096) Hive Variables

2010-06-29 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1096:
-

Status: Open  (was: Patch Available)

> Hive Variables
> --
>
> Key: HIVE-1096
> URL: https://issues.apache.org/jira/browse/HIVE-1096
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Fix For: 0.6.0, 0.7.0
>
> Attachments: 1096-9.diff, hive-1096-10-patch.txt, 
> hive-1096-11-patch.txt, hive-1096-2.diff, hive-1096-7.diff, hive-1096-8.diff, 
> hive-1096.diff
>
>
> From mailing list:
> --Amazon Elastic MapReduce version of Hive seems to have a nice feature 
> called "Variables." Basically you can define a variable via command-line 
> while invoking hive with -d DT=2009-12-09 and then refer to the variable via 
> ${DT} within the hive queries. This could be extremely useful. I can't seem 
> to find this feature even on trunk. Is this feature currently anywhere in the 
> roadmap?--
> This could be implemented in many places.
> A simple place to put this is in Driver.compile or Driver.run: we can do 
> string substitutions at that level, and code further downstream need not be 
> affected.
> There could be some benefits to doing this further downstream (parser, 
> plan), but based on the simple needs we may not need to overthink this.
> I will get started on implementing in compile unless someone wants to discuss 
> this more.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Review Request: HIVE-1096: Hive Variables

2010-06-29 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/229/#review288
---



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


This code does not belong in common. Please move it to ql.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


The variables varPat and MAX_SUBST both exist as private variables in the 
parent class. I think that redefining them in HiveConf will result in lots of 
confusion down the road, plus they don't really belong here. Please move this 
to a "VariableSubstitution" class located in ql.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


I like the word "interpolate" too, but I think more people are probably 
familiar with "substitute". Please change the name to HIVESUBSTITUTEVARIABLES.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Please fix the formatting problems in this method (spaces before and after 
parens, no space between literals and operators, etc). Please run checkstyle 
and verify that you are not introducing any new checkstyle errors.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


No need to test a boolean valued method for equality to true. The logic can 
also be simplified as follows:

if (!getBoolVar(ConfVars.HIVEVARIABLEINTERPOLATE)) {
  return expr;
}
l4j.info("Interpolation is on");
...

Also, please move this out of the substitution function and into the 
Driver, i.e. Driver.compile() calls VariableSubstitution.substitute() iff 
HIVESUBSTITUTEVARIABLES == true.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


This log message is going to generate a lot of noise, and is unnecessary 
since you can determine the value using the 'set' command. Please remove it.



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Presumably there aren't going to be any properties in the HiveConf starting 
with "system:" or "env:". Please move the "env:" check before the conf check, 
and move these string comparisons outside of the try/catch blocks, e.g.:

private static final String SYSTEM_VAR_PREFIX = "system:";
private static final String ENV_VAR_PREFIX = "env:";

if (var.startsWith(SYSTEM_VAR_PREFIX)) {
  try {
    val = System.getProperty(var.substring(SYSTEM_VAR_PREFIX.length()));
  } catch (SecurityException se) {
    ...
  }
} else if (var.startsWith(ENV_VAR_PREFIX)) {
  val = System.getenv(var.substring(ENV_VAR_PREFIX.length()));
} else {
  val = ss.getConf().get(var);
}



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


I thought the plan was to also introduce a "conf:" prefix for referencing 
configuration properties, and that any value not prefixed by system/env/conf 
would map to a new variable namespace? Please add the conf prefix and introduce 
some state in the SessionState for storing variables (i.e. non System/Env/Conf 
properties).



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Please remove this log call. It will generate a lot of noise.



trunk/conf/hive-default.xml


Please change the name to "hive.substitute.variables", and the description 
to "Enable variable substitution".



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java


Check HIVESUBSTITUTEVARIABLES here.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


Please create static final variables for "system:" and "env:" and call 
length() on these variables instead of using magic numbers.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


I don't think you want to perform variable substitution at this point. It 
makes it impossible to create nested variables.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


This logic for mapping variable names to values has been repeated several 
times. Please move it to a dedicated method in the VariableSubstitution class.



trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java


Please run checkstyle and fix any checkstyle errors that you have 
introdu