Re: List Bucketing DDL Patch

2012-07-27 Thread Namit Jain
Yes, that patch would become quite big if done in a single shot.

Moreover, the skew information can be used by a variety of use-cases.

1. List Bucketing
2. Skew Joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html
3. Another variant of skew joins:
https://issues.apache.org/jira/browse/HIVE-3286

So, the skew information is not limited to list bucketing.

It might therefore be simpler to split the work into DDL and DML support.

The DDL will be common to all the use-cases that want to use or store skew
information.

Each use-case can implement the DML/Query separately.


Thanks,
-namit


On 7/28/12 7:07 AM, Carl Steinbach c...@cloudera.com wrote:

 Since we are close to releasing the first patch (DDL).

In a comment on the design doc you said that the first phase would involve
implementing this feature for a single-column end-to-end (DML+DDL). Has
that plan changed?

Thanks.

Carl

On Wed, Jul 25, 2012 at 12:31 AM, Gang Tim Liu g...@fb.com wrote:

 Dear all hive developers,

 Please review the documentation:

 https://cwiki.apache.org/confluence/display/Hive/ListBucketing

 We are close to releasing the first patch (DDL).

 We will continue to update the wiki with new information and, in the
 meantime, would like to collect your feedback.

 Thanks

 Tim





Re: List Bucketing DDL Patch

2012-07-27 Thread Gang Tim Liu
Yes, Namit has a great summary. Thanks.

On 7/27/12 9:09 PM, Namit Jain nj...@fb.com wrote:

Yes, that patch would become quite big if done in a single shot.

Moreover, the skew information can be used by a variety of use-cases.

1. List Bucketing
2. Skew Joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html
3. Another variant of skew joins:
https://issues.apache.org/jira/browse/HIVE-3286

So, the skew information is not limited to list bucketing.

It might therefore be simpler to split the work into DDL and DML support.

The DDL will be common to all the use-cases that want to use or store skew
information.

Each use-case can implement the DML/Query separately.


Thanks,
-namit


On 7/28/12 7:07 AM, Carl Steinbach c...@cloudera.com wrote:

 Since we are close to releasing the first patch (DDL).

In a comment on the design doc you said that the first phase would involve
implementing this feature for a single-column end-to-end (DML+DDL). Has
that plan changed?

Thanks.

Carl

On Wed, Jul 25, 2012 at 12:31 AM, Gang Tim Liu g...@fb.com wrote:

 Dear all hive developers,

 Please review the documentation:

 https://cwiki.apache.org/confluence/display/Hive/ListBucketing

 We are close to releasing the first patch (DDL).

 We will continue to update the wiki with new information and, in the
 meantime, would like to collect your feedback.

 Thanks

 Tim






Re: List Bucketing DDL Patch

2012-07-26 Thread Namit Jain
Note that we are also planning to use the same syntax to specify skew for
optimizing joins.

https://cwiki.apache.org/Hive/skewed-join-optimization.html


Thanks,
-namit



On 7/25/12 1:01 PM, Gang Tim Liu g...@fb.com wrote:

Dear all hive developers,

Please review the documentation:

https://cwiki.apache.org/confluence/display/Hive/ListBucketing

We are close to releasing the first patch (DDL).

We will continue to update the wiki with new information and, in the
meantime, would like to collect your feedback.

Thanks

Tim




List Bucketing DDL Patch

2012-07-25 Thread Gang Tim Liu
Dear all hive developers,

Please review the documentation:

https://cwiki.apache.org/confluence/display/Hive/ListBucketing

We are close to releasing the first patch (DDL).

We will continue to update the wiki with new information and, in the
meantime, would like to collect your feedback.

Thanks

Tim