[jira] [Reopened] (HIVE-20552) Get Schema from LogicalPlan faster

Eric Wohlstadter (JIRA) Mon, 29 Oct 2018 19:25:11 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Eric Wohlstadter reopened HIVE-20552:
-------------------------------------

Reopened. The patch is failing for me with:
{code:java}
Caused by: java.lang.NullPointerException: null^M
        at 
org.apache.hadoop.hive.ql.parse.QBSubQuery.<init>(QBSubQuery.java:489)^M
        at 
org.apache.hadoop.hive.ql.parse.SubQueryUtils.buildSubQuery(SubQueryUtils.java:249)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.subqueryRestrictionCheck(CalcitePlanner.java:3141)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSubQueryRelNode(CalcitePlanner.java:3322)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:3379)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:3434)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4970)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1722)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1670)^M
        at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)^M
        at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)^M
        at 
org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)^M
        at 
org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1431)^M
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogicalPlan(CalcitePlanner.java:393)^M
        at 
org.apache.hadoop.hive.ql.parse.ParseUtils.parseQueryAndGetSchema(ParseUtils.java:554)^M
        at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:254)^M
        at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:206)^M
        at 
org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)^M
        at 
org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)^M
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)^M
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)^M
        at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)^M
        at 
org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)^M
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)^M
        at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)^M
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)^M
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)^M
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)^M
        ... 16 more{code}

> Get Schema from LogicalPlan faster
> ----------------------------------
>
>                 Key: HIVE-20552
>                 URL: https://issues.apache.org/jira/browse/HIVE-20552
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.2.0
>
>         Attachments: HIVE-20552.1.patch, HIVE-20552.2.patch, 
> HIVE-20552.3.patch
>
>
> To get the schema of a query faster, it currently needs to compile, optimize, 
> and generate a TezPlan, which creates extra overhead when only the 
> LogicalPlan is needed.
> 1. Copy the method \{{HiveMaterializedViewsRegistry.parseQuery}}, making it 
> \{{public static}} and putting it in a utility class. 
> 2. Change the return statement of the method to \{{return 
> analyzer.getResultSchema();}}
> 3. Change the return type of the method to \{{List<FieldSchema>}}
> 4. Call the new method from \{{GenericUDTFGetSplits.createPlanFragment}} 
> replacing the current code which does this:
> {code}
>  if(num == 0) {
>  //Schema only
>  return new PlanFragment(null, schema, null);
>  }
> {code}
> moving the call earlier in \{{getPlanFragment}} ... right after the HiveConf 
> is created ... bypassing the code that uses \{{HiveTxnManager}} and 
> \{{Driver}}.
> 5. Convert the \{{List<FieldSchema>}} to 
> \{{org.apache.hadoop.hive.llap.Schema}}.
> 6. return from \{{getPlanFragment}} by returning \{{new PlanFragment(null, 
> schema, null)}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Reopened] (HIVE-20552) Get Schema from LogicalPlan faster

Reply via email to