Re: Is anybody working on the globally order by of hive ?

2010-06-12 Thread Jeff Hammerbacher
See https://issues.apache.org/jira/browse/HIVE-1402. On Fri, Jun 11, 2010 at 1:22 PM, John Sichi jsi...@facebook.com wrote: If someone is interested in adding parallel ORDER BY to Hive (using TotalOrderPartitioner), here's a good starting point:

Re: Is anybody working on the globally order by of hive ?

2010-06-12 Thread Jeff Zhang
Great, I can work on this issue. On Sat, Jun 12, 2010 at 2:02 PM, Jeff Hammerbacher ham...@cloudera.com wrote: See https://issues.apache.org/jira/browse/HIVE-1402. On Fri, Jun 11, 2010 at 1:22 PM, John Sichi jsi...@facebook.com wrote: If someone is interested in adding parallel ORDER BY

Is anybody working on the globally order by of hive ?

2010-06-11 Thread Jeff Zhang
Hi all, According to the Hive wiki, Hive does not have a global ORDER BY feature; Hive's SORT BY only sorts within each reducer. Our team thinks global ORDER BY is an important feature for users, so I am wondering whether anybody is working on it. I am very interested in being involved. -- Best Regards Jeff

Re: Is anybody working on the globally order by of hive ?

2010-06-11 Thread Edward Capriolo
On Fri, Jun 11, 2010 at 5:24 AM, Jeff Zhang zjf...@gmail.com wrote: Hi all, According to the Hive wiki, Hive does not have a global ORDER BY feature; Hive's SORT BY only sorts within each reducer. Our team thinks global ORDER BY is an important feature for users, so I am wondering whether anybody

Re: Is anybody working on the globally order by of hive ?

2010-06-11 Thread Ning Zhang
Good idea Edward. It would definitely be better if it is what it sounds like. Btw Jeff, ORDER BY is supported in trunk with certain limitations in strict mode (it has to have a LIMIT). I will be able to update the wiki when I come back. Thanks, Ning -- Sent from my blackberry

RE: Is anybody working on the globally order by of hive ?

2010-06-11 Thread John Sichi
If someone is interested in adding parallel ORDER BY to Hive (using TotalOrderPartitioner), here's a good starting point: http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad The goal would be to take that manual two-step sample-then-sort process and turn it into an automatic plan within Hive. I
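
The two-step sample-then-sort idea mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique (sample the keys, derive range boundaries, route each key to the reducer that owns its range, sort each range independently); it is not Hive or Hadoop code, and the function names `sample_split_points`, `range_partition`, and `parallel_order_by` are hypothetical names chosen for this sketch.

```python
import bisect
import random


def sample_split_points(keys, num_reducers, sample_size=100, seed=0):
    """Step 1 (sample): pick num_reducers - 1 boundaries from a random sample."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    if not sample:
        return []
    # Evenly spaced quantiles of the sorted sample become the range boundaries.
    return [sample[len(sample) * i // num_reducers] for i in range(1, num_reducers)]


def range_partition(keys, split_points):
    """Step 2a (partition): send each key to the reducer whose range contains it."""
    partitions = [[] for _ in range(len(split_points) + 1)]
    for key in keys:
        partitions[bisect.bisect_right(split_points, key)].append(key)
    return partitions


def parallel_order_by(keys, num_reducers):
    """Step 2b (sort): sort every partition on its own, then read them in order."""
    split_points = sample_split_points(keys, num_reducers)
    partitions = range_partition(keys, split_points)
    # Each sorted() call stands in for one reducer and could run in parallel.
    return [key for part in partitions for key in sorted(part)]
```

Because every key in partition i is smaller than every key in partition i + 1, concatenating the per-reducer outputs in partition order gives a globally sorted result without a single-reducer bottleneck. The quality of the sample only affects how evenly the work is balanced, not correctness.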