Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- (Updated 2011-09-10 21:10:06.178279) Review request for hive and John Sichi. Changes --- Added order-by to queries for test determinism. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66ee0be data/files/lineitem.txt PRE-CREATION data/files/tbl.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 5053576 ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 7a00c00 ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java bec8787 ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java dcdfb9e ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 699519b ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1194/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- (Updated 2011-09-09 01:14:16.218940) Review request for hive and John Sichi. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs (updated) - ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 699519b ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java dcdfb9e ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66ee0be data/files/lineitem.txt PRE-CREATION data/files/tbl.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 5053576 ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 7a00c00 ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java bec8787 ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION Diff: https://reviews.apache.org/r/1194/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
On 2011-08-05 21:20:21, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java, line 172 https://reviews.apache.org/r/1194/diff/2/?file=30443#file30443line172 See recent changes in corresponding CompactIndexHandler code for HIVEOPTINDEXFILTER; need the same here (or better, factor out common code here and elsewhere). On a related note, you may be able to use the same technique instead of isQueryInsertToTable; this would be preferable since it's nice to be able to use the index rewrite in cases where it's a normal INSERT table with index being used for GROUP BY on SELECT from some other table. I have factored out the common code in all Index handler classes and placed it in IndexUtils file. I also removed the code for isQueryInsertToTable and am setting the HIVEOPTINDEXFILTER to false instead. On 2011-08-05 21:20:21, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java, line 153 https://reviews.apache.org/r/1194/diff/2/?file=30449#file30449line153 Shouldn't this be the same as COUNT(*)? Yes it is. I missed to change this part from the previous code. - Prajakta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/#review1303 --- On 2011-08-03 10:31:42, Prajakta Kalmegh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- (Updated 2011-08-03 10:31:42) Review request for hive and John Sichi. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java a57f9cf ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 8295687 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 699519b ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1194/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/#review1303 --- ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/1194/#comment2955 Can't you just look up AGGREGATES in the map? ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/1194/#comment2953 Add a helper method to avoid duplicating the code in the else block below. ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/1194/#comment2954 Can't you just look up AGGREGATES in the map? ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/1194/#comment2956 See recent changes in corresponding CompactIndexHandler code for HIVEOPTINDEXFILTER; need the same here (or better, factor out common code here and elsewhere). On a related note, you may be able to use the same technique instead of isQueryInsertToTable; this would be preferable since it's nice to be able to use the index rewrite in cases where it's a normal INSERT table with index being used for GROUP BY on SELECT from some other table. ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java https://reviews.apache.org/r/1194/#comment2957 @params here don't match actual params ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/1194/#comment2958 Shouldn't this be the same as COUNT(*)? ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q https://reviews.apache.org/r/1194/#comment2980 Besides EXPLAIN, you should include a few queries against a non-empty table verifying that you get the correct results both with and without the optimization applied. Remember to include an ORDER BY for test determinism. ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q https://reviews.apache.org/r/1194/#comment2978 Isn't this set redundant? - John On 2011-08-03 10:31:42, Prajakta Kalmegh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- (Updated 2011-08-03 10:31:42) Review request for hive and John Sichi. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java a57f9cf ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 8295687 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 699519b ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1194/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java, line 61 https://reviews.apache.org/r/1194/diff/1/?file=27052#file27052line61 Please run ant checkstyle and fix all the formatting discrepancies it reports for your new files. Done! The code is still having checkstyle formatting errors only for places where we have used LinkedHashMap, HashMap and ArrayList. The error states Declaring variables, return values or parameters of type 'HashMap' is not allowed. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java, line 184 https://reviews.apache.org/r/1194/diff/1/?file=27052#file27052line184 Don't you need to reuse the compact implementation here so that the index can be used for WHERE (not just GROUP BY)? The AggregateIndexHandler now extends from CompactIndexHandler instead of TableBasedIndexHandler. We override only analyzeIndexDefinition(...) and getIndexBuilderMapRedTask(...) methods from CompactIndexHandler. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 700 https://reviews.apache.org/r/1194/diff/1/?file=27054#file27054line700 This method is redundant now. Removed. Sorry to have missed that. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java, line 252 https://reviews.apache.org/r/1194/diff/1/?file=27056#file27056line252 I can't think of a case where it would be worse. Ok. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java, line 164 https://reviews.apache.org/r/1194/diff/1/?file=27057#file27057line164 Actually group-by is now preserved in all cases. Forgot to change a few comments after fixing the bug. Done! On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java, line 66 https://reviews.apache.org/r/1194/diff/1/?file=27058#file27058line66 Please use HTML bullet syntax for Javadoc (otherwise it all gets run together into one line when rendered). Done! On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java, line 93 https://reviews.apache.org/r/1194/diff/1/?file=27060#file27060line93 Shouldn't this be BIGINT? Also, I think you're supposed to use a TypeInfoFactory for this purpose. Yes. Changed it to bigint. Also changed it in AggregateIndexHandler where I had declared the type to be int. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java, line 603 https://reviews.apache.org/r/1194/diff/1/?file=27062#file27062line603 Not sure why this new constructor is needed...after using it, all you do is get the table out of it. The only other constructor option for tableSpec needs the ASTNode as one of its parameters. Since we need to construct a new tableSpec using only the index table name, and we do not have a ASTNode for this, I need this constructor. If you have any other way in mind, please let me know. That would be helpful. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q, line 27 https://reviews.apache.org/r/1194/diff/1/?file=27063#file27063line27 This should *not* be using the index, since the index is built on count(l_shipdate), and l_shipdate may contain nulls, whereas the query is referencing count(1), which is insensitive to nulls. Yes true. :( I have now changed the count(1) queries with count(l_shipdate) in ql_rewrite_gbtoidx.q file. Also, verified that the count(1) queries are not using the index. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q, line 61 https://reviews.apache.org/r/1194/diff/1/?file=27063#file27063line61 Need additional tests to verify all the cases where the optimization should *not* be used: * when configuration disables it * when index partitions do not cover table partitions (I still don't see the code for this case) * ... all the other conditions checked for in the code ... Added new queries to verify that optimization is not used in case of: * when configuration disables it * ... all the other conditions checked for in the code ... About * when index partitions do not cover table partitions , still pending (also the code for it). I will upload the new patch once this is done. - Prajakta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/#review1212
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java, line 61 https://reviews.apache.org/r/1194/diff/1/?file=27052#file27052line61 Please run ant checkstyle and fix all the formatting discrepancies it reports for your new files. Prajakta Kalmegh wrote: Done! The code is still having checkstyle formatting errors only for places where we have used LinkedHashMap, HashMap and ArrayList. The error states Declaring variables, return values or parameters of type 'HashMap' is not allowed. Best practice is to only use interfaces (Map/List) except at the point of instantiation where you select a concrete class. Hive violates this in a number of places, and sometimes that forces you to violate it in new code too; but otherwise, please follow this one. On 2011-07-28 21:40:30, John Sichi wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java, line 603 https://reviews.apache.org/r/1194/diff/1/?file=27062#file27062line603 Not sure why this new constructor is needed...after using it, all you do is get the table out of it. Prajakta Kalmegh wrote: The only other constructor option for tableSpec needs the ASTNode as one of its parameters. Since we need to construct a new tableSpec using only the index table name, and we do not have a ASTNode for this, I need this constructor. If you have any other way in mind, please let me know. That would be helpful. I'm asking why you even need to construct a new tableSpec instance. All you do with it is reference ts.tableHandle. And to create that tableHandle, you can just do db.getTable(tableName). So I don't see the purpose of the tableSpec instance. - John --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/#review1212 --- On 2011-07-26 14:44:01, Prajakta Kalmegh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- (Updated 2011-07-26 14:44:01) Review request for hive and John Sichi. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2ca63b3 ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6 ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1194/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/#review1212 --- ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/1194/#comment2711 Please run ant checkstyle and fix all the formatting discrepancies it reports for your new files. ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/1194/#comment2695 Don't you need to reuse the compact implementation here so that the index can be used for WHERE (not just GROUP BY)? ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java https://reviews.apache.org/r/1194/#comment2696 This method is redundant now. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java https://reviews.apache.org/r/1194/#comment2698 I can't think of a case where it would be worse. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/1194/#comment2699 Actually group-by is now preserved in all cases. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java https://reviews.apache.org/r/1194/#comment2700 Please use HTML bullet syntax for Javadoc (otherwise it all gets run together into one line when rendered). ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java https://reviews.apache.org/r/1194/#comment2701 indentation ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java https://reviews.apache.org/r/1194/#comment2703 Shouldn't this be BIGINT? Also, I think you're supposed to use a TypeInfoFactory for this purpose. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java https://reviews.apache.org/r/1194/#comment2702 indentation ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java https://reviews.apache.org/r/1194/#comment2704 typo: Repace ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java https://reviews.apache.org/r/1194/#comment2707 Not sure why this new constructor is needed...after using it, all you do is get the table out of it. ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q https://reviews.apache.org/r/1194/#comment2709 This should *not* be using the index, since the index is built on count(l_shipdate), and l_shipdate may contain nulls, whereas the query is referencing count(1), which is insensitive to nulls. ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q https://reviews.apache.org/r/1194/#comment2710 Need additional tests to verify all the cases where the optimization should *not* be used: * when configuration disables it * when index partitions do not cover table partitions (I still don't see the code for this case) * ... all the other conditions checked for in the code ... - John On 2011-07-26 14:44:01, Prajakta Kalmegh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- (Updated 2011-07-26 14:44:01) Review request for hive and John Sichi. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2ca63b3 ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6 ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION
Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1194/ --- Review request for hive and John Sichi. Summary --- This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2ca63b3 ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6 ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1194/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/505/#review339 --- ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java https://reviews.apache.org/r/505/#comment684 Suggestion is to make this configurable (via IDXPROPERTIES) to save space when column is known NOT NULL. (Also later to allow for specification of other aggregates.) ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteParseContextGenerator.java https://reviews.apache.org/r/505/#comment686 Indentation is messed up here. ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteParseContextGenerator.java https://reviews.apache.org/r/505/#comment685 Please eliminate all TODO's, and don't use printStackTrace. ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteParseContextGenerator.java https://reviews.apache.org/r/505/#comment687 Instead of downcasting over and over, you should probably be doing it just once in the calling method (and asserting that you got the right type since otherwise generateOperatorTree is not going to have the desired effect. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java https://reviews.apache.org/r/505/#comment688 Hive naming convention for variables is camelCase, not under_score. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java https://reviews.apache.org/r/505/#comment690 I see query_has_distinct being written but never read. Why do we need it? I don't think we want to be relying on the parse tree at all. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java https://reviews.apache.org/r/505/#comment689 Don't swallow exceptions like this. - John On 2011-03-13 20:00:28, Prajakta Kalmegh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/505/ --- (Updated 2011-03-13 20:00:28) Review request for hive. Summary --- New Review starting from patch 3. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 46739b7 ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 308d985 ql/src/java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 916b235 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 50db44c ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteIndexSubqueryCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteIndexSubqueryProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteRemoveGroupbyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteRemoveGroupbyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 04f560f ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/505/diff Testing --- Thanks, Prajakta
Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/505/ --- Review request for hive. Summary --- New Review starting from patch 3. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 46739b7 ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 308d985 ql/src/java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 916b235 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 50db44c ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteParseContextGenerator.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteIndexSubqueryCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteIndexSubqueryProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteRemoveGroupbyCtx.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteRemoveGroupbyProcFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 04f560f ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/505/diff Testing --- Thanks, Prajakta
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java, line 19 https://reviews.apache.org/r/392/diff/1/?file=10598#file10598line19 For HIVE-1644, the HMC team is creating a new package org.apache.hadoop.hive.ql.optimizer.index. I think any optimizer code related to indexing should go in there. Prajakta Kalmegh wrote: I understand we just need to place our code in the 'org.apache.hadoop.hive.ql.optimizer.index' package. Right? Correct. But for generic optimizer code (e.g. RewriteParseContextGenerator, which is unrelated to indexing) can stay where it is. On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java, line 51 https://reviews.apache.org/r/392/diff/1/?file=10598#file10598line51 This code is currently tied to the compact index representation. We mentioned earlier that we'll need a new index representation (summary) instead in order to implement the counts correctly (we should leave the compact representation as is). So: * until the summary representation is added, we can't enable this * in general, it would be good to find a way to make this pluggable; for example, the bitmap index representation can also be utilized by counting the bits, but the rewrite expression would be slightly different Prajakta Kalmegh wrote: We have a couple of designs ready for the new index handler. I will post them on JIRA. When you say pluggable does it mean that we should be able to change the rewrite expression depending on which index handler is used? For example, we will get different rewrite expressions if we use either the new index handler or the bitmap index handler. Our optimization code should work with any rewrite expression. Is this understanding correct? Correct. If you look at what's going on in the latest HIVE-1644 patch, they are extending the HiveIndexHandler interface with an optional method for dealing with a WHERE clause. You can do the same for aggregation. It may take a few iterations to get it generic enough, but it helps that we already have the bitmap work in progress so that's two cases to generalize from. On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java, line 291 https://reviews.apache.org/r/392/diff/1/?file=10597#file10597line291 Need to abstract out this dependency on _c convention. Prajakta Kalmegh wrote: Can you suggest a way using which we can get rid of this dependency? Actually, isn't just checking internalToAlias.get != null already good enough? If so we can just eliminate the _c check. - John --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/#review217 --- On 2011-02-03 16:45:15, John Sichi wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/ --- (Updated 2011-02-03 16:45:15) Review request for hive. Summary --- Preliminary review. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - http://svn.apache.org/repos/asf/hive/trunk/build.xml 1067048 http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyCtx.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteIndexSubqueryCtx.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteIndexSubqueryProcFactory.java PRE-CREATION
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/#review217 --- http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java https://reviews.apache.org/r/392/#comment432 I suggest hive.optimize.index.groupby for this property. http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java https://reviews.apache.org/r/392/#comment433 Typo: Unknown http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyCtx.java https://reviews.apache.org/r/392/#comment435 I don't understand why these state variables are maintained as conf variables rather than just data members of a class. Could you explain why that is necessary? http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyCtx.java https://reviews.apache.org/r/392/#comment434 Why is the exception being ignored here? http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/392/#comment437 Need Apache headers on all new files. http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/392/#comment436 Use expr instead of engfd http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/392/#comment440 Typo: groub-by http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/392/#comment506 What about nested function invocations? http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/392/#comment442 Eliminate commented-out code http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java https://reviews.apache.org/r/392/#comment443 Need to abstract out this dependency on _c convention. http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java https://reviews.apache.org/r/392/#comment507 For HIVE-1644, the HMC team is creating a new package org.apache.hadoop.hive.ql.optimizer.index. I think any optimizer code related to indexing should go in there. http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java https://reviews.apache.org/r/392/#comment444 This code is currently tied to the compact index representation. We mentioned earlier that we'll need a new index representation (summary) instead in order to implement the counts correctly (we should leave the compact representation as is). So: * until the summary representation is added, we can't enable this * in general, it would be good to find a way to make this pluggable; for example, the bitmap index representation can also be utilized by counting the bits, but the rewrite expression would be slightly different http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java https://reviews.apache.org/r/392/#comment445 TODO? http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java https://reviews.apache.org/r/392/#comment446 Might want to use a finer level than LOG.info http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java https://reviews.apache.org/r/392/#comment447 Just a sanity check to avoid huge payloads coming back from thrift. http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java https://reviews.apache.org/r/392/#comment448 Hmm, no, I think we should fail hard here. If the underlying problem is fatal (e.g. the metastore went down), we should not try to hide it. - John On 2011-02-03 16:45:15, John Sichi wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/ --- (Updated 2011-02-03 16:45:15) Review request for hive. Summary --- Preliminary review. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - http://svn.apache.org/repos/asf/hive/trunk/build.xml 1067048
Re: Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyCtx.java, line 121 https://reviews.apache.org/r/392/diff/1/?file=10596#file10596line121 I don't understand why these state variables are maintained as conf variables rather than just data members of a class. Could you explain why that is necessary? The intent was to have these variables available as part of the current configuration. I think that was unnecessary. We will change it to data members of RewriteCanApplyCtx class. On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java, line 187 https://reviews.apache.org/r/392/diff/1/?file=10597#file10597line187 What about nested function invocations? Yes we missed that. We will upload a new patch with this change. On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java, line 291 https://reviews.apache.org/r/392/diff/1/?file=10597#file10597line291 Need to abstract out this dependency on _c convention. Can you suggest a way using which we can get rid of this dependency? On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java, line 19 https://reviews.apache.org/r/392/diff/1/?file=10598#file10598line19 For HIVE-1644, the HMC team is creating a new package org.apache.hadoop.hive.ql.optimizer.index. I think any optimizer code related to indexing should go in there. I understand we just need to place our code in the 'org.apache.hadoop.hive.ql.optimizer.index' package. Right? On 2011-02-27 15:49:28, John Sichi wrote: http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java, line 51 https://reviews.apache.org/r/392/diff/1/?file=10598#file10598line51 This code is currently tied to the compact index representation. We mentioned earlier that we'll need a new index representation (summary) instead in order to implement the counts correctly (we should leave the compact representation as is). So: * until the summary representation is added, we can't enable this * in general, it would be good to find a way to make this pluggable; for example, the bitmap index representation can also be utilized by counting the bits, but the rewrite expression would be slightly different We have a couple of designs ready for the new index handler. I will post them on JIRA. When you say pluggable does it mean that we should be able to change the rewrite expression depending on which index handler is used? For example, we will get different rewrite expressions if we use either the new index handler or the bitmap index handler. Our optimization code should work with any rewrite expression. Is this understanding correct? - Prajakta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/#review217 --- On 2011-02-03 16:45:15, John Sichi wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/ --- (Updated 2011-02-03 16:45:15) Review request for hive. Summary --- Preliminary review. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - http://svn.apache.org/repos/asf/hive/trunk/build.xml 1067048 http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyCtx.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteIndexSubqueryCtx.java
Review Request: HIVE-1694: Accelerate GROUP BY execution using indexes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/392/ --- Review request for hive. Summary --- Preliminary review. This addresses bug HIVE-1694. https://issues.apache.org/jira/browse/HIVE-1694 Diffs - http://svn.apache.org/repos/asf/hive/trunk/build.xml 1067048 http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyCtx.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteCanApplyProcFactory.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteGBUsingIndex.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteIndexSubqueryCtx.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteIndexSubqueryProcFactory.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteParseContextGenerator.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteRemoveGroupbyCtx.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/RewriteRemoveGroupbyProcFactory.java PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 1067048 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/fatal.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION Diff: https://reviews.apache.org/r/392/diff Testing --- Thanks, John