Will do.

On Wed, Aug 15, 2018 at 4:55 PM Julian Hyde <jh...@apache.org> wrote:

> I see now.
>
> I think the problem only occurs when you call
> AbstractRelNode.recomputeDigest().
>
> The first time the digest is computed, the input RelNodes have a digest
> (and desc) as it has been set in AbstractRelNode’s constructor:
>
>   this.digest = getRelTypeName() + "#" + id;
>   this.desc = digest;
>
> Explain writer uses the “desc” field to identify inputs, but maybe it
> should use id or type + id. Or maybe the “desc” field should be final.
>
> By the way, the comment
>
>   // Substring uses the same underlying array of chars, so saves a bit
>   // of memory.
>
> was true until JDK 1.6 but is no longer true.
>
> Can you log a JIRA case please.
>
> Julian
>
>
> > On Aug 15, 2018, at 2:37 PM, Laurent Goujon <laur...@dremio.com> wrote:
> >
> > Sorry, I should have mentioned the method too: HepPlanner#buildFinalPlan
> > (when running RelOptRulesTest#testWindowInParenthesis())
> >
> > On Wed, Aug 15, 2018 at 2:36 PM Laurent Goujon <laur...@dremio.com> wrote:
> >
> >> It looks to happen when building the final plan: the hep planner goes
> >> recursively to each node to recompute the digest. In that relnode tree,
> >> there's no more HepRelVertex nodes, and the digest now includes the whole
> >> input(s) description.
> >>
> >> On Wed, Aug 15, 2018 at 2:33 PM Julian Hyde <jh...@apache.org> wrote:
> >>
> >>> When I run that test I get
> >>>
> >>> LogicalProject(input=HepRelVertex#10,$0=$9)
> >>>
> >>> Have you screwed something up?
> >>>
> >>>> On Aug 15, 2018, at 2:23 PM, Laurent Goujon <laur...@dremio.com> wrote:
> >>>>
> >>>> Just ran RelOptRulesTest with a breakpoint in
> >>>> AbstractRelNode#computeDigest() and I'm able to observe this kind of
> >>>> digest:
> >>>>
> >>>> "LogicalProject(input=rel#6:LogicalWindow(input=rel#0:LogicalTableScan(table=[CATALOG,
> >>>> SALES, EMP]),window#0=window(partition {0} order by [0] range between
> >>>> UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])),$0=$9)"
> >>>>
> >>>> On Wed, Aug 15, 2018 at 2:09 PM Laurent Goujon <laur...@dremio.com> wrote:
> >>>>
> >>>>> Here's one (partial) example (truncated because it contains potentially
> >>>>> sensitive info, and I didn't obfuscate it or try to reproduce locally
> >>>>> with non-sensitive data):
> >>>>>
> >>>>> "rel#8643738:LogicalProject.NONE.ANY([]).[](input=rel#8643736:LogicalUnion.NONE.ANY([]).[](input#0=rel#8643702:LogicalUnion.NONE.ANY([]).[](input#0=rel#8643668:LogicalUnion.NONE.ANY([]).[](input#0=rel#8643634:LogicalProject.NONE.ANY([]).[](input=rel#8643632:LogicalAggregate.NONE.ANY([]).[](input=rel#8643630:LogicalAggregate.NONE.ANY([]).[](input=rel#8643628:LogicalProject.NONE.ANY([]).[](input=rel#8643626:LogicalFilter.NONE.ANY([]).[](input=rel#8643624:LogicalProject.NONE.ANY([]).[](input=rel#8643622:LogicalProject.NONE.ANY([]).[](input=rel#8643842:MultiJoin.NONE.ANY([]).[](input#0=rel#8643838:LogicalProject.NONE.ANY([]).[](input=rel#8643615:MultiJoin.NONE.ANY([]).[](input#0=rel#8643603:LogicalProject.NONE.ANY([]).[](input=rel#8643601:SampleCrel.NONE.ANY([]).[](input=rel#8639853:ScanCrel.NONE.ANY([]).[](table="...
> >>>>>
> >>>>> The Logical* relnodes don't override the computeDigest method, so this
> >>>>> is basically whatever AbstractRelNode#computeDigest is doing:
> >>>>>
> >>>>> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/AbstractRelNode.java#L415
> >>>>>
> >>>>> Laurent
> >>>>>
> >>>>>
> >>>>> On Wed, Aug 15, 2018 at 1:57 PM Julian Hyde <jh...@apache.org> wrote:
> >>>>>
> >>>>>> I thought the digest only included the IDs of the inputs, not the
> >>>>>> digest of the inputs. Am I mistaken?
> >>>>>>
> >>>>>> Could you give an example of a large description & digest?
> >>>>>>
> >>>>>>> On Aug 15, 2018, at 1:46 PM, Laurent Goujon <laur...@dremio.com> wrote:
> >>>>>>>
> >>>>>>> Hi folks,
> >>>>>>>
> >>>>>>> I'm looking for some guidance here before opening JIRAs/pull
> >>>>>>> requests.
> >>>>>>>
> >>>>>>> I'm examining a memory dump during a planning operation, and a
> >>>>>>> significant amount of memory is taken by strings used for RelNode
> >>>>>>> digest and description (some strings being around 130kb). In that
> >>>>>>> particular case, the relnode tree is particularly deep, and since
> >>>>>>> the digest is basically built recursively, the deeper/wider the
> >>>>>>> tree, the longer the digest.
> >>>>>>>
> >>>>>>> The easy solution would be to not go deep when adding inputs to the
> >>>>>>> digest, and instead of adding the input description, to add only
> >>>>>>> their type, id and traits (and also not recurse). Would this break
> >>>>>>> parts of calcite, or cause other inconvenience because some
> >>>>>>> use-cases rely on digest/description to be basically the whole tree
> >>>>>>> in a textual form?
> >>>>>>>
> >>>>>>> Laurent
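
To illustrate the size difference discussed in this thread, here is a minimal toy sketch. It does not use Calcite classes: DigestSketch, Node, deepDigest, shallowDigest and chain are all hypothetical names made up for this example, and the digest format only loosely imitates the strings quoted above. It compares a digest that embeds each input's full digest with one that refers to inputs by type and id only (traits are left out to keep the toy small).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are NOT Calcite classes, and the digest
// format below is an assumption that loosely imitates the thread.
class DigestSketch {
  static final class Node {
    final int id;
    final String type;
    final List<Node> inputs = new ArrayList<>();

    Node(int id, String type, Node... inputs) {
      this.id = id;
      this.type = type;
      for (Node input : inputs) {
        this.inputs.add(input);
      }
    }
  }

  /** Recursive form: each input contributes its own full digest. */
  static String deepDigest(Node n) {
    StringBuilder sb = new StringBuilder("rel#" + n.id + ":" + n.type + "(");
    for (int i = 0; i < n.inputs.size(); i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append("input#").append(i).append('=')
          .append(deepDigest(n.inputs.get(i)));
    }
    return sb.append(')').toString();
  }

  /** Shallow form: each input contributes only its type and id. */
  static String shallowDigest(Node n) {
    StringBuilder sb = new StringBuilder("rel#" + n.id + ":" + n.type + "(");
    for (int i = 0; i < n.inputs.size(); i++) {
      if (i > 0) {
        sb.append(',');
      }
      Node input = n.inputs.get(i);
      sb.append("input#").append(i).append('=')
          .append(input.type).append('#').append(input.id);
    }
    return sb.append(')').toString();
  }

  /** Builds a single-input chain: a Scan leaf under `depth` Project nodes. */
  static Node chain(int depth) {
    Node node = new Node(0, "Scan");
    for (int i = 1; i <= depth; i++) {
      node = new Node(i, "Project", node);
    }
    return node;
  }

  public static void main(String[] args) {
    // Print both digest lengths for increasingly deep chains.
    for (int depth : new int[] {1, 10, 100}) {
      Node root = chain(depth);
      System.out.println("depth=" + depth
          + " deep=" + deepDigest(root).length()
          + " shallow=" + shallowDigest(root).length());
    }
  }
}
```

In this toy, the deep digest of the root grows with the size of the whole subtree, while the shallow digest depends only on the node's direct inputs. Whether anything in Calcite relies on the full-tree form is exactly the open question raised above, and the sketch does not answer it.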