[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user maryannxue closed the pull request at: https://github.com/apache/phoenix/pull/281 ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r153594426 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,171 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Map; +import java.util.Properties; + +import org.apache.phoenix.query.BaseTest; +import org.apache.phoenix.query.QueryServices; +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.apache.phoenix.util.ReadOnlyProps; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import com.google.common.collect.Maps; + +public class CostBasedDecisionIT extends BaseUniqueNamesOwnClusterIT { --- End diff -- 3 tests is better than 1, but more would be better. Here are some ideas: - test that DELETE is optimized through cost-based as expected - same for UPSERT SELECT - how about a UNION query? - same for some join queries - how about a case where one plan would have a filter on a single leading column while another would have a filter on the first two leading columns, but the former plan would scan less data. In theory, the cost-based approach would choose the former. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r153574645 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- I'm still not seeing this change. The only existing method that should change in QueryOptimizer is this private one: private List getApplicablePlans(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory, boolean stopAtBestPlan) throws SQLException { The only change necessary should be to call a new method that orders the plans according to cost. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r153404439 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,176 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Map; +import java.util.Properties; + +import org.apache.phoenix.query.BaseTest; +import org.apache.phoenix.query.QueryServices; +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.apache.phoenix.util.ReadOnlyProps; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import com.google.common.collect.Maps; + +public class CostBasedDecisionIT extends BaseTest { --- End diff -- Thanks for pointing this out, @JamesRTaylor! Changed in my latest CI as suggested. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r153378386 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- QueryOptimize.optimize() is not the only entry point to the optimizer. For example, DeleteCompiler calls both getApplicablePlans() and getBestPlan(). I think it'd be best if you hid this logic at the end of the private getApplicablePlans since all calls channel through that. That way all optimizer decisions will use cost if it's enabled. You could conditional order the plans based on their cost if cost-based is enabled here: return hintedPlan != null ? plans : isCostBasedEnabled ? orderPlansBasedOnCost(plans) : orderPlansBestToWorst(select, plans, stopAtBestPlan); ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r153377074 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,176 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Map; +import java.util.Properties; + +import org.apache.phoenix.query.BaseTest; +import org.apache.phoenix.query.QueryServices; +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.apache.phoenix.util.ReadOnlyProps; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import com.google.common.collect.Maps; + +public class CostBasedDecisionIT extends BaseTest { + +@BeforeClass +public static void doSetup() throws Exception { +Mapprops = Maps.newHashMapWithExpectedSize(1); +props.put(QueryServices.STATS_GUIDEPOST_WIDTH_BYTES_ATTRIB, Long.toString(20)); +props.put(QueryServices.STATS_UPDATE_FREQ_MS_ATTRIB, Long.toString(5)); +props.put(QueryServices.USE_STATS_FOR_PARALLELIZATION, Boolean.toString(true)); +props.put(QueryServices.COST_BASED_OPTIMIZER_ENABLED, Boolean.toString(true)); +setUpTestDriver(new ReadOnlyProps(props.entrySet().iterator())); +} + +@AfterClass +public static void tearDownMiniCluster() throws Exception { --- End diff -- This method shouldn't be necessary ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r153377003 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,176 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Map; +import java.util.Properties; + +import org.apache.phoenix.query.BaseTest; +import org.apache.phoenix.query.QueryServices; +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.apache.phoenix.util.ReadOnlyProps; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; + +import com.google.common.collect.Maps; + +public class CostBasedDecisionIT extends BaseTest { --- End diff -- Derive from BaseUniqueNamesOwnClusterIT instead so that it runs when mvn verify is run (it's based on the annotation which this won't have) ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151829519 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- Thank you both for the input. I've change the logic for hinted plans as well as added new tests. Please review again. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151623809 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- I agree with @katameru. If you plug in your optimization code to impact the sort order done by QueryOptimizer.orderPlansBestToWorst() when getApplicablePlans() is called, we'd keep the behavior that the hints would override the optimizer (whether cost-based is enabled or not). I still think we need more tests. We've found that if the tests don't get added with the initial commit, they unfortunately never get added. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user katameru commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151620890 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- I'd argue that the observable behavior should stay the same, so a hinted query plan should be used above all others. Currently we could just sidestep the cost calculations in favor of a hinted plan, but I think that having a way to influence the optimizer via hints could be very useful. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151580119 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- Thanks for the feedback, @katameru! We need to define a strategy how hints work in the cost-based optimizer mode. Your input would be very much appreciated! Since this is the first issue in which I've brought "cost-based" in, the idea is to plug-in the framework and pieces of code here and there needed to make it work in general. We can see there's still a lot to do to refine this framework, and as I dig further into other issues under PHOENIX-4313. Right now I think adding a option to disable cost-based optimizer (pls see my latest ci) should be enough. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user katameru commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151421537 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); +if (plans.size() == 1) { +return plans.get(0); +} + +// Get the best plan based on their costs. Costs will be ZERO if stats are not +// available, thus the first plan will be returned. +Cost minCost = null; +QueryPlan bestPlan = null; +for (QueryPlan plan : plans) { +Cost cost = plan.getCost(); +if (minCost == null || cost.compareTo(minCost) < 0) { +minCost = cost; +bestPlan = plan; +} +} +return bestPlan; --- End diff -- This will ignore any index hints used in the query. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151013728 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,50 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Properties; + +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.junit.Test; + +public class CostBasedDecisionIT extends ParallelStatsEnabledIT { --- End diff -- I think we need more tests for this. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151015089 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/AggregatePlan.java --- @@ -112,7 +114,26 @@ private AggregatePlan(StatementContext context, FilterableStatement statement, T public Expression getHaving() { return having; } - + +@Override +public Cost getCost() throws SQLException { +Long byteCount = getEstimatedBytesToScan(); +if (byteCount == null) { +return Cost.ZERO; +} + --- End diff -- What does parallelLevel represent? Is that the level of parallelization done on the client? Is that derivable from thread pool size or cluster size? For now, maybe have a constant and doc it a bit. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151014919 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/AggregatePlan.java --- @@ -112,7 +114,26 @@ private AggregatePlan(StatementContext context, FilterableStatement statement, T public Expression getHaving() { return having; } - + +@Override +public Cost getCost() throws SQLException { +Long byteCount = getEstimatedBytesToScan(); +if (byteCount == null) { +return Cost.ZERO; +} + +int parallelLevel = 10; --- End diff -- How about passing in groupBy instead of just groupBy.isUngroupedAggregate() as I think we'd want to get more info eventually out of the groupBy (for example, if it's ordered or unordered which impacts memory usage substantially)? ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151016226 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/AggregatePlan.java --- @@ -112,7 +114,26 @@ private AggregatePlan(StatementContext context, FilterableStatement statement, T public Expression getHaving() { return having; } - + +@Override +public Cost getCost() throws SQLException { +Long byteCount = getEstimatedBytesToScan(); +if (byteCount == null) { +return Cost.ZERO; +} + +int parallelLevel = 10; --- End diff -- How about passing in groupBy instead of just groupBy.isUngroupedAggregate() as I think we'd want to get more info eventually out of the groupBy (for example, if it's ordered or unordered which impacts memory usage substantially)? This might help keep the interfaces a little more stable. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151015944 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/optimize/QueryOptimizer.java --- @@ -91,8 +91,23 @@ public QueryPlan optimize(PhoenixStatement statement, SelectStatement select, Co } public QueryPlan optimize(QueryPlan dataPlan, PhoenixStatement statement, List targetColumns, ParallelIteratorFactory parallelIteratorFactory) throws SQLException { -Listplans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, true); -return plans.get(0); +List plans = getApplicablePlans(dataPlan, statement, targetColumns, parallelIteratorFactory, false); --- End diff -- We'll need to have a new master boolean config option (like phoenix.optimize.costBased in QueryServices) that controls whether or not we're doing cost based optimization to make sure users can disable if they want to keep the existing behavior the same. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151015329 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/ClientAggregatePlan.java --- @@ -87,6 +89,25 @@ public ClientAggregatePlan(StatementContext context, FilterableStatement stateme } @Override +public Cost getCost() throws SQLException { +Long byteCount = getEstimatedBytesToScan(); +if (byteCount == null) { +return Cost.ZERO; +} + +int parallelLevel = 1; --- End diff -- So parallelLevel is 1 because aggregation is done client-side instead of pushed to the cluster, right? Would be good to doc the thinking behind calculating the cost throughout. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151012514 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,50 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Properties; + +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.junit.Test; + +public class CostBasedDecisionIT extends ParallelStatsEnabledIT { + +@Test +public void testBug4288() throws Exception { +Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); +Connection conn = DriverManager.getConnection(getUrl(), props); +conn.setAutoCommit(true); +try { +conn.createStatement().execute("CREATE TABLE test4288 (\n" + +"rowkey VARCHAR PRIMARY KEY,\n" + +"c1 VARCHAR,\n" + +"c2 VARCHAR)"); +conn.createStatement().execute("CREATE LOCAL INDEX test4388_c1_idx ON test4288 (c1)"); + --- End diff -- How about getting the explain plan here and verify that it's doing a full table scan (i.e. before there are any stats available)? ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151014126 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/HashJoinPlan.java --- @@ -290,6 +291,25 @@ public FilterableStatement getStatement() { return statement; } +@Override +public Cost getCost() throws SQLException { +Long byteCount = getEstimatedBytesToScan(); +if (byteCount == null) { +return Cost.ZERO; +} + +Cost cost = new Cost(0, 0, byteCount); +Cost lhsCost = delegate.getCost(); +if (keyRangeExpressions != null) { +lhsCost = lhsCost.multiplyBy(0.01); --- End diff -- Why the 0.01? More comments would be good. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151012292 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/CostBasedDecisionIT.java --- @@ -0,0 +1,50 @@ +package org.apache.phoenix.end2end; + +import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES; +import static org.junit.Assert.assertTrue; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.util.Properties; + +import org.apache.phoenix.util.PropertiesUtil; +import org.apache.phoenix.util.QueryUtil; +import org.junit.Test; + +public class CostBasedDecisionIT extends ParallelStatsEnabledIT { + +@Test +public void testBug4288() throws Exception { +Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); +Connection conn = DriverManager.getConnection(getUrl(), props); +conn.setAutoCommit(true); +try { +conn.createStatement().execute("CREATE TABLE test4288 (\n" + --- End diff -- Make sure to generate a unique name for your table (see other tests for examples) ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/phoenix/pull/281#discussion_r151014001 --- Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/BaseQueryPlan.java --- @@ -500,7 +500,10 @@ public ExplainPlan getExplainPlan() throws SQLException { if (context.getScanRanges() == ScanRanges.NOTHING) { return new ExplainPlan(Collections.singletonList("DEGENERATE SCAN OVER " + getTableRef().getTable().getName().getString())); } - + +// Initialize dummy iterator to get the stats. --- End diff -- Comment that you need the stats prior to optimization as they help drive cost based decisions. ---
[GitHub] phoenix pull request #281: PHOENIX-4288 Indexes not used when ordering by pr...
GitHub user maryannxue opened a pull request: https://github.com/apache/phoenix/pull/281 PHOENIX-4288 Indexes not used when ordering by primary key 1. Add class Cost. 2. Add method getCost() in QueryPlan. 3. Let QueryOptimizer choose the best plan based on Cost; meanwhile if stats are not available the QueryOptimizer will keep the original behavior. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maryannxue/phoenix phoenix-4388 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/phoenix/pull/281.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #281 commit f75c1b095683c153f8749107c5505304309dcb36 Author: maryannxueDate: 2017-11-07T21:40:37Z PHOENIX-4288 Indexes not used when ordering by primary key ---