[jira] Updated: (HIVE-1194) sorted merge join

2010-03-04 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1194:
---

Attachment: hive-1194-2010-3-4.patch

Attached a new patch.

> sorted merge join
> -----------------
>
> Key: HIVE-1194
> URL: https://issues.apache.org/jira/browse/HIVE-1194
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: He Yongqiang
> Fix For: 0.6.0
>
> Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, 
> hive-1194-2010-3-2.patch, hive-1194-2010-3-3-2.patch, 
> hive-1194-2010-3-3.patch, hive-1194-2010-3-4.patch
>
>
> If the input tables are sorted on the join key and a mapjoin is being 
> performed, it is useful to exploit the sorted property of the tables.
> This can lead to substantial CPU savings. It also needs to work across 
> bucketed map joins.
> Since the sorted properties of a table are not currently enforced, a new 
> parameter can be added to request the sort-merge join.
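
As a rough illustration of the idea in the description above (not the code in the attached patches), a merge join over two inputs that are already sorted on the join key can be sketched as follows. The function name `sort_merge_join` and the list-of-pairs row layout are assumptions made for this example; Hive's operators, bucketing, and file handling are not modeled here.

```python
from typing import Any, Iterator, List, Tuple

Row = Tuple[Any, Any]  # (join_key, value); a hypothetical row layout


def sort_merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[Any, Any, Any]]:
    """Inner-join two lists of (key, value) rows that are ALREADY sorted by key.

    Each input is scanned once, so no hash table has to be built for
    either side. The caller is responsible for the inputs being sorted;
    this sketch does not check it.
    """
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key is behind; advance left
        elif lk > rk:
            j += 1          # right key is behind; advance right
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for a in range(i, i_end):
                for b in range(j, j_end):
                    yield (lk, left[a][1], right[b][1])
            i, j = i_end, j_end


if __name__ == "__main__":
    left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    right_rows = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
    for joined in sort_merge_join(left_rows, right_rows):
        print(joined)
```

Because nothing in this sketch verifies the sort order, unsorted input silently produces wrong results, which is one reason an explicit opt-in parameter is a reasonable safeguard while sortedness is not enforced.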

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1194) sorted merge join

2010-03-03 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1194:
---

Attachment: hive-1194-2010-3-3-2.patch

A new patch; this one adds reportProgress.




[jira] Updated: (HIVE-1194) sorted merge join

2010-03-03 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1194:
---

Attachment: hive-1194-2010-3-3.patch

A new patch integrates Namit and Siying's comments. Thanks Namit and Siying!




[jira] Updated: (HIVE-1194) sorted merge join

2010-03-02 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1194:
---

Attachment: hive-1194-2010-3-2.2.patch




[jira] Updated: (HIVE-1194) sorted merge join

2010-03-02 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1194:
---

Attachment: hive-1194-2010-3-2.patch

A new patch that adds more test cases and fixes some bugs.
@namit,
I agree, that will make the code clearer. Can we do that in a follow-up JIRA? 
It requires a code refactoring that may break the existing mapjoin, etc.




[jira] Updated: (HIVE-1194) sorted merge join

2010-02-28 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1194:
---

Attachment: hive-1194-2010-02-28.patch

For early review only.
I will test it more and add more test cases.
