[jira] Updated: (HIVE-79) Print number of raws inserted to table(s) when the query is finished.

2009-01-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony updated HIVE-79:
--

Attachment: patch_79_1.txt

This patch logs the inserted row count to the Hive query log.
The logged format will be:
TaskEnd TASK_ROWS_INSERTED="tmp_suresh_12:181687,tmp_suresh_13:181687"

Made changes to the semantic analyzer to keep track of the id-to-table-name map.

HiveHistory converts the id back to the table name and writes it to the structured query log.
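
For readers who want to consume this log entry, here is a minimal sketch of how the TASK_ROWS_INSERTED value shown above could be parsed into per-table counts. It assumes only the format quoted in this message; the function name parse_rows_inserted is hypothetical and is not part of the patch or of Hive.

```python
import re

def parse_rows_inserted(log_line):
    """Return {table_name: row_count} from a TaskEnd log line, or {} if absent."""
    match = re.search(r'TASK_ROWS_INSERTED="([^"]*)"', log_line)
    if not match or not match.group(1):
        return {}
    counts = {}
    for pair in match.group(1).split(","):
        # Split on the last colon so the count is always the final field.
        table, _, rows = pair.rpartition(":")
        counts[table] = int(rows)
    return counts

line = 'TaskEnd TASK_ROWS_INSERTED="tmp_suresh_12:181687,tmp_suresh_13:181687"'
print(parse_rows_inserted(line))
```

The comma-separated table:count layout is taken from the example line above; any other fields on a real TaskEnd line are simply ignored by this sketch.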

> Print number of raws inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: patch_79_1.txt
>
>
> It is good to print the number of rows inserted into each table at the end of 
> the query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HIVE-260) hive cli should not output the line by default

2009-01-29 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao resolved HIVE-260.
-

Resolution: Invalid

"-S" option does remove that line.


> hive cli should not output the line by default
> --
>
> Key: HIVE-260
> URL: https://issues.apache.org/jira/browse/HIVE-260
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Zheng Shao
>Priority: Blocker
>
> This is at the beginning of hive cli output:
> Hive history file=/tmp/zshao/hive_job_log_zshao_200901291532_-1964746650.txt
> We should remove it.




[jira] Assigned: (HIVE-223) when using map-side aggregates - perform single map-reduce group-by

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-223:
---

Assignee: Namit Jain

> when using map-side aggregates - perform single map-reduce group-by
> ---
>
> Key: HIVE-223
> URL: https://issues.apache.org/jira/browse/HIVE-223
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Joydeep Sen Sarma
>Assignee: Namit Jain
>
> Today, even when we do map-side aggregates, we run multiple map-reduce jobs. 
> However, the reason for doing multiple map-reduce group-bys (for single 
> group-bys) was the fear of skews. When we are doing map-side aggregates, 
> skews should not exist for the most part. There can be two reasons for skews:
> - a large number of entries for a single grouping set - map-side aggregates 
> should take care of this
> - badness in the hash function that sends too much stuff to one reducer - we 
> should be able to take care of this by having good hash functions (and prime 
> number reducer counts)
> So I think we should be able to do a single-stage map-reduce when doing 
> map-side aggregates.
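
The argument in the description can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Hive's implementation: the function names, the use of CRC32 as the hash, and the choice of 7 (a prime) reducers are all hypothetical. Each mapper pre-aggregates its own rows, so a heavily repeated key contributes one record per mapper to the shuffle instead of one record per row.

```python
import zlib
from collections import Counter, defaultdict

def map_side_aggregate(keys):
    """Partial aggregation inside one mapper: one count per distinct key."""
    return Counter(keys)

def reducer_for(key, num_reducers):
    # CRC32 is used only because it is stable across runs; any well-mixed hash works.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def single_stage_count(mapper_inputs, num_reducers=7):
    """Count rows per key in one map-reduce stage; also report shuffled records."""
    reducers = defaultdict(Counter)
    shuffled = 0
    for keys in mapper_inputs:
        for key, partial in map_side_aggregate(keys).items():
            reducers[reducer_for(key, num_reducers)][key] += partial
            shuffled += 1
    totals = Counter()
    for counts in reducers.values():
        totals.update(counts)
    return dict(totals), shuffled

# Key "a" is heavily skewed, yet only one record per mapper reaches the shuffle.
mappers = [["a"] * 1000 + ["b"], ["a"] * 500 + ["c", "c"]]
totals, shuffled = single_stage_count(mappers)
print(totals, shuffled)
```

In this toy input, 1503 rows shrink to 4 shuffled records, which is the sense in which map-side aggregation takes care of the first source of skew named above. The second source, a poor hash function, is not addressed by pre-aggregation.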




[jira] Updated: (HIVE-262) outer join gets some duplicate rows in some scenarios

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-262:


Attachment: patch262.2.txt

Forgot to update the parse result files.

> outer join gets some duplicate rows in some scenarios
> -
>
> Key: HIVE-262
> URL: https://issues.apache.org/jira/browse/HIVE-262
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.2.0
>
> Attachments: patch.262.1.txt, patch262.2.txt
>
>
> SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 
> 10) RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);
> returns duplicate rows for outer join




[jira] Updated: (HIVE-262) outer join gets some duplicate rows in some scenarios

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-262:


Status: Open  (was: Patch Available)

> outer join gets some duplicate rows in some scenarios
> -
>
> Key: HIVE-262
> URL: https://issues.apache.org/jira/browse/HIVE-262
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.2.0
>
> Attachments: patch.262.1.txt, patch262.2.txt
>
>
> SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 
> 10) RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);
> returns duplicate rows for outer join




[jira] Updated: (HIVE-262) outer join gets some duplicate rows in some scenarios

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-262:


Status: Patch Available  (was: Open)

> outer join gets some duplicate rows in some scenarios
> -
>
> Key: HIVE-262
> URL: https://issues.apache.org/jira/browse/HIVE-262
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.2.0
>
> Attachments: patch.262.1.txt, patch262.2.txt
>
>
> SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 
> 10) RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);
> returns duplicate rows for outer join




[jira] Updated: (HIVE-262) outer join gets some duplicate rows in some scenarios

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-262:


Fix Version/s: 0.2.0
Affects Version/s: 0.2.0
   Status: Patch Available  (was: Open)

> outer join gets some duplicate rows in some scenarios
> -
>
> Key: HIVE-262
> URL: https://issues.apache.org/jira/browse/HIVE-262
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.2.0
>
> Attachments: patch.262.1.txt
>
>
> SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 
> 10) RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);
> returns duplicate rows for outer join




[jira] Updated: (HIVE-262) outer join gets some duplicate rows in some scenarios

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-262:


Attachment: patch.262.1.txt

> outer join gets some duplicate rows in some scenarios
> -
>
> Key: HIVE-262
> URL: https://issues.apache.org/jira/browse/HIVE-262
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.2.0
>
> Attachments: patch.262.1.txt
>
>
> SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 
> 10) RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);
> returns duplicate rows for outer join




[jira] Commented: (HIVE-106) Join operation fails for some queries

2009-01-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668711#action_12668711
 ] 

Namit Jain commented on HIVE-106:
-

Josh, can you provide the data files for the activities and users tables, for 
which the query was failing?

> Join operation fails for some queries
> -
>
> Key: HIVE-106
> URL: https://issues.apache.org/jira/browse/HIVE-106
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Josh Ferguson
>Assignee: Namit Jain
>Priority: Critical
>
> The Tables Are
> CREATE TABLE activities 
> (actor_id STRING, actee_id STRING, properties MAP) 
> PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT) 
> CLUSTERED BY (actor_id, actee_id) INTO 32 BUCKETS 
> ROW FORMAT DELIMITED 
> COLLECTION ITEMS TERMINATED BY '44'
> MAP KEYS TERMINATED BY '58'
> STORED AS TEXTFILE;
> Detailed Table Information:
> Table(tableName:activities,dbName:default,owner:Josh,createTime:1228208598,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:actor_id,type:string,comment:null),
>  FieldSchema(name:actee_id,type:string,comment:null), 
> FieldSchema(name:properties,type:map,comment:null)],location:/user/hive/warehouse/activities,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[actor_id,
>  
> actee_id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null),
>  FieldSchema(name:application,type:string,comment:null), 
> FieldSchema(name:dataset,type:string,comment:null), 
> FieldSchema(name:hour,type:int,comment:null)],parameters:{})
> CREATE TABLE users 
> (id STRING, properties MAP) 
> PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT) 
> CLUSTERED BY (id) INTO 32 BUCKETS 
> ROW FORMAT DELIMITED 
> COLLECTION ITEMS TERMINATED BY '44'
> MAP KEYS TERMINATED BY '58'
> STORED AS TEXTFILE;
> Detailed Table Information:
> Table(tableName:users,dbName:default,owner:Josh,createTime:1228208633,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:string,comment:null),
>  
> FieldSchema(name:properties,type:map,comment:null)],location:/user/hive/warehouse/users,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null),
>  FieldSchema(name:application,type:string,comment:null), 
> FieldSchema(name:dataset,type:string,comment:null), 
> FieldSchema(name:hour,type:int,comment:null)],parameters:{})
> A working query is
> SELECT activities.* FROM activities WHERE activities.dataset='poke' AND 
> activities.properties['verb'] = 'Dance';
> A non working query is
> SELECT activities.*, users.* FROM activities LEFT OUTER JOIN users ON 
> activities.actor_id = users.id WHERE activities.dataset='poke' AND 
> activities.properties['verb'] = 'Dance';
> The Exception Is
> java.lang.RuntimeException: Hive 2 Internal error: cannot evaluate index 
> expression on string
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeIndexEvaluator.evaluate(ExprNodeIndexEvaluator.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:67)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:262)
>   at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:257)
>   at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:477)
>   at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:507)
>   at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:489)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:

[jira] Created: (HIVE-262) outer join gets some duplicate rows in some scenarios

2009-01-29 Thread Namit Jain (JIRA)
outer join gets some duplicate rows in some scenarios
-

 Key: HIVE-262
 URL: https://issues.apache.org/jira/browse/HIVE-262
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain


SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 10) 
RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);


returns duplicate rows for outer join




[jira] Commented: (HIVE-260) hive cli should not output the line by default

2009-01-29 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668706#action_12668706
 ] 

Zheng Shao commented on HIVE-260:
-

No. The number of reducers is not there.




> hive cli should not output the line by default
> --
>
> Key: HIVE-260
> URL: https://issues.apache.org/jira/browse/HIVE-260
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Zheng Shao
>Priority: Blocker
>
> This is at the beginning of hive cli output:
> Hive history file=/tmp/zshao/hive_job_log_zshao_200901291532_-1964746650.txt
> We should remove it.




[jira] Commented: (HIVE-82) Augment build.xml with a target to build the forrest docs and javadocs

2009-01-29 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668700#action_12668700
 ] 

Edward Capriolo commented on HIVE-82:
-

We can also use the Hive-Web-Interface to display the javadoc. If we create a 
folder $HIVE_HOME/doc the hive web server can load it as a static context.

> Augment build.xml with a target to build the forrest docs and javadocs
> --
>
> Key: HIVE-82
> URL: https://issues.apache.org/jira/browse/HIVE-82
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Build Infrastructure
>Reporter: Jeff Hammerbacher
>
> See hadoop's build.xml, especially the targets "docs" and "javadoc-dev"




[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function

2009-01-29 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668699#action_12668699
 ] 

Edward Capriolo commented on HIVE-259:
--

The 95th percentile is very often used in Internet Service Provider billing, so 
it might be useful. 

The percentile calculation is a sort followed by picking an element. The syntax 
could be like:

* PERCENTILE(column, .99) 
* PERCENTILE(column, .50)

In this manner you could do any percentile.
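
As a rough sketch of the "sort, then pick an element" idea, here is the nearest-rank method in Python. This is one of several common percentile definitions and is an assumption on my part; the comment above does not say which definition the proposed PERCENTILE function would use, and the function below is purely illustrative.

```python
import math

def percentile(values, fraction):
    """Nearest-rank percentile: sort, then pick the element at rank ceil(fraction * n)."""
    if not values:
        raise ValueError("percentile of an empty column is undefined")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(values)
    rank = math.ceil(fraction * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]

column = [15, 20, 35, 40, 50]
print(percentile(column, .50))  # 35
print(percentile(column, .99))  # 50
```

A single sort serves any number of requested fractions, which is why a general PERCENTILE(column, p) costs little more than a fixed set of quartiles. Interpolating definitions would return values between elements rather than an element of the column.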

> Add PERCENTILE aggregate function
> -
>
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>
> Compute at least the 25th, 50th, and 75th percentiles




[jira] Updated: (HIVE-30) Hive web interface

2009-01-29 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-30:


Attachment: hive-30-9.patch

Newest patch. (Not an svn stat this time, DOH!)

> Hive web interface
> --
>
> Key: HIVE-30
> URL: https://issues.apache.org/jira/browse/HIVE-30
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Reporter: Jeff Hammerbacher
>Assignee: Edward Capriolo
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: HIVE-30-5.patch, HIVE-30-6.patch, hive-30-7.patch, 
> hive-30-9.patch, HIVE-30-A.patch, HIVE-30.patch, HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's 
> completed.
> Edward Capriolo has expressed some interest in driving this process.




[jira] Commented: (HIVE-260) hive cli should not output the line by default

2009-01-29 Thread Ashish Thusoo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668681#action_12668681
 ] 

Ashish Thusoo commented on HIVE-260:


Does this show up along with the message that outputs the number of reducers, 
etc.?

If so, would this not just go away when running the CLI in silent mode?


> hive cli should not output the line by default
> --
>
> Key: HIVE-260
> URL: https://issues.apache.org/jira/browse/HIVE-260
> Project: Hadoop Hive
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Zheng Shao
>Priority: Blocker
>
> This is at the beginning of hive cli output:
> Hive history file=/tmp/zshao/hive_job_log_zshao_200901291532_-1964746650.txt
> We should remove it.




[jira] Created: (HIVE-261) union all query hangs

2009-01-29 Thread Hao Liu (JIRA)
union all query hangs
-

 Key: HIVE-261
 URL: https://issues.apache.org/jira/browse/HIVE-261
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Hao Liu


we have this query:

SELECT a.u, b.id FROM (
 SELECT a1.u, a1.id as id FROM t_1 a1 WHERE a1.date = '2009-01-01' UNION ALL
 SELECT a2.u, a2.id as id FROM t_2 a2 WHERE a2.date = '2009-01-01' UNION ALL
 ...
 SELECT aN.u, aN.id as id FROM t_N aN WHERE aN.date = '2009-01-01'
) a 
JOIN t b ON a.id = b.id WHERE b.date='2009-01-01' 
GROUP BY a.u, b.id

When we union more than 20 tables, the query hangs. It looks like something 
is wrong in the compiler.




[jira] Created: (HIVE-260) hive cli should not output the line by default

2009-01-29 Thread Zheng Shao (JIRA)
hive cli should not output the line by default
--

 Key: HIVE-260
 URL: https://issues.apache.org/jira/browse/HIVE-260
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.2.0
Reporter: Zheng Shao
Priority: Blocker


This is at the beginning of hive cli output:

Hive history file=/tmp/zshao/hive_job_log_zshao_200901291532_-1964746650.txt

We should remove it.





[jira] Assigned: (HIVE-79) Print number of raws inserted to table(s) when the query is finished.

2009-01-29 Thread Suresh Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Antony reassigned HIVE-79:
-

Assignee: Suresh Antony

> Print number of raws inserted to table(s) when  the query is finished.
> --
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Logging
>Reporter: Suresh Antony
>Assignee: Suresh Antony
>Priority: Minor
> Fix For: 0.2.0
>
>
> It is good to print the number of rows inserted into each table at the end of 
> the query. 
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100




[jira] Created: (HIVE-259) Add PERCENTILE aggregate function

2009-01-29 Thread Venky Iyer (JIRA)
Add PERCENTILE aggregate function
-

 Key: HIVE-259
 URL: https://issues.apache.org/jira/browse/HIVE-259
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Venky Iyer


Compute at least the 25th, 50th, and 75th percentiles




[jira] Resolved: (HIVE-96) improve CLI error messages by changing ANTLR error reporter to print out custom string for production rather than just the production name (which may be obscure)

2009-01-29 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo resolved HIVE-96.
---

   Resolution: Duplicate
Fix Version/s: 0.2.0
 Hadoop Flags: [Reviewed]

Duplicate of HIVE-119.

> improve CLI error messages by changing ANTLR error reporter to print out 
> custom string for production rather than just the production name (which may 
> be obscure)
> -
>
> Key: HIVE-96
> URL: https://issues.apache.org/jira/browse/HIVE-96
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Clients
>Reporter: Pete Wyckoff
> Fix For: 0.2.0
>
>
> Annotate each rule in the grammar with a more humanly readable message to be 
> printed rather than just the rule name which may not be easy for the user to 
> understand.




[jira] Commented: (HIVE-14) selecting fields from a complex object column in transform clause is throwing a Parse Error.

2009-01-29 Thread Ashish Thusoo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668628#action_12668628
 ] 

Ashish Thusoo commented on HIVE-14:
---

Prasad,

can you verify that this works and close it out?

Thanks,
Ashish

> selecting fields from a complex object column in transform clause is throwing 
> a Parse Error.
> 
>
> Key: HIVE-14
> URL: https://issues.apache.org/jira/browse/HIVE-14
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Prasad Chakka
>Priority: Critical
>
> For a thrift table whose columns are complex objects, the following query 
> throws an error:
> from ( from cdx select transform(cdx.a.b) as tx using 'mapper' cluster by tx 
> ) mo insert into output select tx;
> The error is thrown on the second '.' of the expression 'cdx.a.b'.




[jira] Updated: (HIVE-253) rand() gets precomputated in compilation phase

2009-01-29 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-253:
---

Affects Version/s: 0.2.0
Fix Version/s: 0.2.0

Marking this for 0.2.0 version.

> rand() gets precomputated in compilation phase
> --
>
> Key: HIVE-253
> URL: https://issues.apache.org/jira/browse/HIVE-253
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Zheng Shao
>Assignee: Ashish Thusoo
>Priority: Blocker
> Fix For: 0.2.0
>
>
> SELECT * FROM t WHERE rand() < 0.01;
> Hive will say: "No need to submit job", because the condition evaluates to 
> false.
> The rand() function is special in the sense that every time it evaluates to a 
> different value. We should disallow computing the value in the compiling 
> phase.
> One way to do that is to add an annotation in the UDFRand and check that in 
> the compiling phase.
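
The annotation idea in the description can be sketched as follows. This is a toy model in Python, not Hive code: the deterministic decorator stands in for the proposed annotation on UDFRand, and fold_at_compile_time stands in for the compiler's constant-folding step. Both names are hypothetical.

```python
import random

def deterministic(flag):
    """Decorator standing in for the proposed annotation on a UDF."""
    def mark(func):
        func.deterministic = flag
        return func
    return mark

@deterministic(False)
def rand():
    return random.random()

@deterministic(True)
def absolute(x):
    return abs(x)

def fold_at_compile_time(func, *constant_args):
    """Precompute a call on constant arguments only if the function is deterministic."""
    if getattr(func, "deterministic", False):
        return ("constant", func(*constant_args))
    # Non-deterministic (or unannotated) functions must be evaluated per row at run time.
    return ("deferred", func)

print(fold_at_compile_time(absolute, -3))   # ('constant', 3)
print(fold_at_compile_time(rand)[0])        # deferred
```

Treating unannotated functions as non-deterministic is a conservative default chosen for this sketch. With a check like this, the filter rand() < 0.01 would be left in the plan and evaluated for every row instead of being collapsed to a single true or false at compile time.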




[jira] Assigned: (HIVE-253) rand() gets precomputated in compilation phase

2009-01-29 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo reassigned HIVE-253:
--

Assignee: Ashish Thusoo

> rand() gets precomputated in compilation phase
> --
>
> Key: HIVE-253
> URL: https://issues.apache.org/jira/browse/HIVE-253
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Zheng Shao
>Assignee: Ashish Thusoo
>Priority: Blocker
> Fix For: 0.2.0
>
>
> SELECT * FROM t WHERE rand() < 0.01;
> Hive will say: "No need to submit job", because the condition evaluates to 
> false.
> The rand() function is special in the sense that every time it evaluates to a 
> different value. We should disallow computing the value in the compiling 
> phase.
> One way to do that is to add an annotation in the UDFRand and check that in 
> the compiling phase.




[jira] Commented: (HIVE-30) Hive web interface

2009-01-29 Thread Ashish Thusoo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668624#action_12668624
 ] 

Ashish Thusoo commented on HIVE-30:
---

Hi Edward,

It seems like the latest patch contains the output of svn stat instead of svn 
diff. 

Thanks,
Ashish


> Hive web interface
> --
>
> Key: HIVE-30
> URL: https://issues.apache.org/jira/browse/HIVE-30
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Reporter: Jeff Hammerbacher
>Assignee: Edward Capriolo
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: HIVE-30-5.patch, HIVE-30-6.patch, hive-30-7.patch, 
> HIVE-30-A.patch, HIVE-30.patch, HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's 
> completed.
> Edward Capriolo has expressed some interest in driving this process.




[jira] Resolved: (HIVE-219) Map-side aggregates output one row per reducer when not grouping

2009-01-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-219.
-

   Resolution: Duplicate
Fix Version/s: 0.2.0
 Assignee: Namit Jain

Marking as a duplicate of HIVE-256.

> Map-side aggregates output one row per reducer when not grouping
> 
>
> Key: HIVE-219
> URL: https://issues.apache.org/jira/browse/HIVE-219
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: David Phillips
>Assignee: Namit Jain
>Priority: Blocker
> Fix For: 0.2.0
>
>
> Example: SELECT count(1) FROM table;




[jira] Updated: (HIVE-30) Hive web interface

2009-01-29 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-30:


Attachment: hive-30-7.patch

This patch:
* allows the WAR location to be specified in hive-site.conf (changed to HiveConf)
* makes the class path refer to hadoop.root
rather than a hardcoded version, i.e. 0.19.0

> Hive web interface
> --
>
> Key: HIVE-30
> URL: https://issues.apache.org/jira/browse/HIVE-30
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Web UI
>Reporter: Jeff Hammerbacher
>Assignee: Edward Capriolo
>Priority: Minor
> Fix For: 0.2.0
>
> Attachments: HIVE-30-5.patch, HIVE-30-6.patch, hive-30-7.patch, 
> HIVE-30-A.patch, HIVE-30.patch, HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's 
> completed.
> Edward Capriolo has expressed some interest in driving this process.




FW: svn commit: r738914 - /hadoop/hive/branches/branch-0.2/

2009-01-29 Thread Ashish Thusoo
FYI...

Created branch-0.2

Ashish

From: athu...@apache.org [athu...@apache.org]
Sent: Thursday, January 29, 2009 8:25 AM
To: hive-comm...@hadoop.apache.org
Subject: svn commit: r738914 - /hadoop/hive/branches/branch-0.2/

Author: athusoo
Date: Thu Jan 29 16:25:07 2009
New Revision: 738914

URL: http://svn.apache.org/viewvc?rev=738914&view=rev
Log:
Created 0.2 branch

Added:
hadoop/hive/branches/branch-0.2/
  - copied from r738913, hadoop/hive/trunk/