[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547958#comment-13547958 ]

Hudson commented on HIVE-3171:
------------------------------

Integrated in Hive-trunk-hadoop2 #54 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/54/])
    HIVE-3171. Bucketed sort merge join doesn't work when multiple files exist for small alias (Navis Ryu via cws) (Revision 1382098)

     Result = ABORTED
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1382098
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapperContext.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/util/ObjectPair.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_1.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_2.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_3.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_4.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_5.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_6.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_6.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java

> Bucketed sort merge join doesn't work when multiple files exist for small alias
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-3171
>                 URL: https://issues.apache.org/jira/browse/HIVE-3171
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>            Reporter: Joey Echeverria
>            Assignee: Navis
>              Labels: bucketing, joins, partitioning
>             Fix For: 0.10.0
>
>         Attachments: HIVE-3171.1.patch.txt, HIVE-3171.2.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the table. However, if you add a second partition, Hive attempts to do a regular map-side join which can fail because the tables are too large. Hive ought to be able to still do the bucketed sort merge join with partitions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
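For readers following the issue title, here is a minimal Python sketch of why several files for the small alias matter to a sort merge join. It is an illustration under stated assumptions, not the code from the HIVE-3171 patch: the function names (key_groups, sort_merge_join) and the row layout (key, value) are hypothetical. A merge join needs each side as one key-sorted stream, so when a bucket of the small alias is spread over several sorted files (for example one per partition), those files first have to be merged into a single sorted stream.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def key_groups(rows):
    """Yield (key, [values]) for consecutive rows sharing a key.

    rows must already be sorted by key (element 0 of each row).
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, [value for _, value in group]


def sort_merge_join(big_rows, small_files):
    """Inner join of one big-table bucket with several small-alias files.

    Hypothetical helper: every input is a list of (key, value) rows
    sorted by key. The small-side files are k-way merged first so the
    join sees a single sorted stream for the small alias.
    """
    small_iter = key_groups(heapq.merge(*small_files, key=itemgetter(0)))
    big_iter = key_groups(big_rows)
    small = next(small_iter, None)
    big = next(big_iter, None)
    joined = []
    while small is not None and big is not None:
        if small[0] < big[0]:
            small = next(small_iter, None)   # small side is behind
        elif small[0] > big[0]:
            big = next(big_iter, None)       # big side is behind
        else:
            # Matching key: emit the cross product of both groups.
            joined.extend((big[0], b, s) for b in big[1] for s in small[1])
            small = next(small_iter, None)
            big = next(big_iter, None)
    return joined


if __name__ == "__main__":
    big_bucket = [(1, "b1"), (2, "b2"), (4, "b4")]
    small_part_a = [(1, "a1"), (3, "a3")]   # e.g. bucket file from partition 1
    small_part_b = [(2, "c2"), (4, "c4")]   # e.g. bucket file from partition 2
    print(sort_merge_join(big_bucket, [small_part_a, small_part_b]))
```

Reading the two small-side files one after the other instead of merging them would give the sequence 1, 3, 2, 4, which is not sorted, so a plain merge join over it would miss matches.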
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451254#comment-13451254 ]

Hudson commented on HIVE-3171:
------------------------------

Integrated in Hive-trunk-h0.21 #1655 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1655/])
    HIVE-3171. Bucketed sort merge join doesn't work when multiple files exist for small alias (Navis Ryu via cws) (Revision 1382098)

     Result = FAILURE
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1382098
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapperContext.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/util/ObjectPair.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_1.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_2.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_3.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_4.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_5.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_6.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_6.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450762#comment-13450762 ]

Namit Jain commented on HIVE-3171:
----------------------------------

@Carl, let me know if you are busy. I also ran the tests, and they ran fine. I can commit it.
I need this patch for some other jiras, and the sooner I get it in, the easier it is for me.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449425#comment-13449425 ]

Carl Steinbach commented on HIVE-3171:
--------------------------------------

+1. Will commit if tests pass.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449275#comment-13449275 ]

Navis commented on HIVE-3171:
-----------------------------

I've addressed comments and just finished full test. Will update patch shortly.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448872#comment-13448872 ]

Namit Jain commented on HIVE-3171:
----------------------------------

@Navis, are you planning to work on this? Most of the comments are minor, so it would be useful to get it in.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443929#comment-13443929 ]

Carl Steinbach commented on HIVE-3171:
--------------------------------------

@Navis: I added a couple small comments on phabricator. Thanks.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442517#comment-13442517 ]

Namit Jain commented on HIVE-3171:
----------------------------------

I know the policy, but in most of the cases, I think we don't commit our own patches.
Anyway, it is not a big deal - I don't feel very comfortable about it, but have no reservations if you want to take that path.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442354#comment-13442354 ]

Carl Steinbach commented on HIVE-3171:
--------------------------------------

@Namit: The Apache Hive Project Bylaws are located here:

https://cwiki.apache.org/confluence/display/Hive/Bylaws

The bylaws explicitly prohibit committers from voting on their own patches, but they don't prohibit committers from committing their own patches as long as another committer has already reviewed and +1'd the patch. It's also worth noting that the bylaws stipulate that the +1 vote must be sent to dev@hive, and that 24 hours must elapse between the first +1 vote and the point at which the patch is committed.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440920#comment-13440920 ]

Namit Jain commented on HIVE-3171:
----------------------------------

@Carl, @Navis, we mostly don't commit our patches. There have been a few exceptions, but I think we should try to stick to this policy: don't commit our own patches.

Also, can you hold off for a few hours - I wanted to take a pass at this patch. Should be done today itself.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440886#comment-13440886 ]

Navis commented on HIVE-3171:
-----------------------------

@Carl: I've not yet received my ASF account. Also, is it OK to commit it myself? (I thought that the author and the committer should be different.)
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440869#comment-13440869 ]

Carl Steinbach commented on HIVE-3171:
--------------------------------------

@Navis: Since the tests passed can you please commit this yourself? Thanks.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440629#comment-13440629 ]

Carl Steinbach commented on HIVE-3171:
--------------------------------------

@Navis: Can you please attach the most recent version of the patch to this ticket? Thanks.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440508#comment-13440508 ]

Carl Steinbach commented on HIVE-3171:
--------------------------------------

+1. Will commit if tests pass.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439390#comment-13439390 ]

Namit Jain commented on HIVE-3171:
----------------------------------

comments
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439286#comment-13439286 ]

Navis commented on HIVE-3171:
-----------------------------

@Carl: Fixed bug (thanks) and passed tests mentioned above. I'll update patch after passing full test.
[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437661#comment-13437661 ]

Navis commented on HIVE-3171:
-----------------------------

I think the patch does not reflect various changes made to FetchWork. I'll start working on this when HIVE-2925 is visible on trunk.