soumyakanti3578 commented on code in PR #5131: URL: https://github.com/apache/hive/pull/5131#discussion_r1576566058
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:
##########
@@ -1745,9 +1748,72 @@ public RelNode apply(RelOptCluster cluster, RelOptSchema relOptSchema, SchemaPlu
       if (LOG.isDebugEnabled()) {
         LOG.debug("Plan after post-join transformations:\n" + RelOptUtil.toString(calcitePlan));
       }
+      perfLogger.perfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER);
+
+      if (conf.getBoolVar(ConfVars.TEST_CBO_PLAN_SERIALIZATION_DESERIALIZATION_ENABLED)) {
+        calcitePlan = testSerializationAndDeserialization(perfLogger, calcitePlan);
+      }
+
+      return calcitePlan;
+    }
+
+    @Nullable
+    private RelNode testSerializationAndDeserialization(PerfLogger perfLogger, RelNode calcitePlan) {
+      if (!isSerializable(calcitePlan)) {
+        return calcitePlan;
+      }
+      perfLogger.perfLogBegin(this.getClass().getName(), "plan serializer");
+      String calcitePlanJson = serializePlan(calcitePlan);
+      perfLogger.perfLogEnd(this.getClass().getName(), "plan serializer");
+
+      if (stringSizeGreaterThan(calcitePlanJson, PLAN_SERIALIZATION_DESERIALIZATION_STR_SIZE_LIMIT)) {
+        return calcitePlan;
+      }
+
+      if (LOG.isDebugEnabled()) {
+        LOG.debug("Size of calcite plan: {}", calcitePlanJson.getBytes(Charset.defaultCharset()).length);
+        LOG.debug("JSON plan: \n{}", calcitePlanJson);
+      }
+
+      try {
+        perfLogger.perfLogBegin(this.getClass().getName(), "plan deserializer");
+        RelNode fromJson = deserializePlan(calcitePlan.getCluster(), calcitePlanJson);
+        perfLogger.perfLogEnd(this.getClass().getName(), "plan deserializer");
+
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Base plan: \n{}", RelOptUtil.toString(calcitePlan));
+          LOG.debug("Plan from JSON: \n{}", RelOptUtil.toString(fromJson));
+        }
+
+        calcitePlan = fromJson;
+      } catch (IOException e) {
+        throw new RuntimeException(e);
+      }
+

Review Comment:
The only reason I have the method `testSerializationAndDeserialization` here, instead of a unit test, is so that integration tests can exercise it with just a property set in the qfile.
My idea was to enable `TEST_CBO_PLAN_SERIALIZATION_DESERIALIZATION_ENABLED` either in each individual qfile or in `hive-site.xml` for a whole driver such as `TestTezTPCDS30TBPerfCliDriver`. That would ensure we test this against a larger and more diverse set of queries, rather than only a few scenarios in unit tests. Let me know if you would still like me to add a unit test and remove this method from here, and I will do it. :)
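
To make the shape of the idea concrete, here is a rough, self-contained sketch of a flag-gated round trip. It is only an illustration: `RoundTripSketch`, `maybeRoundTrip`, the toy `serialize`/`deserialize` helpers, and the list-of-strings "plan" are all hypothetical stand-ins, not the Hive or Calcite APIs used in the PR.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hive.
final class RoundTripSketch {

  // Toy serializer: a "plan" is an ordered list of operator names, one per line.
  static String serialize(List<String> plan) {
    return String.join("\n", plan);
  }

  // Toy deserializer: the inverse of serialize for non-empty plans.
  static List<String> deserialize(String text) {
    return Arrays.asList(text.split("\n"));
  }

  // Mirrors the control flow described above: do nothing unless the flag is set,
  // skip plans whose serialized form exceeds the size limit, otherwise hand back
  // the deserialized copy so later stages run against it.
  static List<String> maybeRoundTrip(boolean enabled, List<String> plan, int sizeLimit) {
    if (!enabled) {
      return plan;
    }
    String text = serialize(plan);
    if (text.length() > sizeLimit) {
      return plan;
    }
    return deserialize(text);
  }

  public static void main(String[] args) {
    List<String> plan = Arrays.asList("HiveProject", "HiveFilter", "HiveTableScan");
    List<String> result = maybeRoundTrip(true, plan, 1024);
    // If the round trip is lossless, the copy is equal to the original plan.
    System.out.println("round trip preserved plan: " + result.equals(plan));
  }
}
```

Because the deserialized copy replaces the original when the flag is on, any query that runs with the property set exercises the round trip end to end, which is the point of wiring it into the planner rather than a standalone unit test.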