Thanks Julian,

After despairing for a day over the countless places where we invoke in code 
`instance()` and having to go over them one-by one, I made a break through 
today when I added instrumentation to count where we call instance() from at 
runtime. Turns out the culprit is use of one deprecated API: 
AbstractRelNode.getRows():

  @SuppressWarnings("deprecation")
  public final double getRows() {
    return estimateRowCount(RelMetadataQuery.instance());
  }

This alone was called directly 2995340 times, and causing indirectly about ~1M 
more calls. These are Hive code calls, not Calcite issues. I made changes to 
use the estimateRowCount(RelMetadataQuery) call instead, and the biggest 
offenders all had handy mq instances to pass in. I’m waiting on tests, but it 
looks very promising.

Thanks,
~Remus

On 5/25/17, 11:57 AM, "Julian Hyde" <[email protected]> wrote:

    This is a hard problem. We need to make a lot of metadata calls
    (directly and indirectly), we want it to be efficient (hence we want
    some caching) but we want the metadata to change when we make
    improvements. The RelMetadataQuery system, including generated code,
    is a carefully thought out solution to that problem.
    
    I don't recommend adding thread-level caching. The metadata goes out
    of date when you apply transformation rules.
    
    The idea is to create a RelMetadataQuery instance when you want a
    "transaction" giving you read-consistent view of the metadata.
    Probably inside an onMatch method, and used for all metadata calls
    from that method.
    
    If RelMetadataQuery.instance() is being called from within a metadata
    handler that is bad news. Eliminate all such calls and see whether the
    instance count goes down.
    
    On Thu, May 25, 2017 at 11:43 AM, Remus Rusanu <[email protected]> 
wrote:
    > Hi all,
    >
    > Investigating performance of Hive query compile I found some problems 
around memorization in GeneratedMetadataHandler_xxx classes. These classes are 
generated from JaninonRelMetadataProvider and the generated code does 
memorization, anchored on the ‘map’ field of the RelMetadataQuery ‘mq’ 
parameter. My measurements show that these calls explode into deep recursive 
stacks. I was measuring a complex query (49 joins, plenty of expressions) and 
some of the numbers look staggering. Take for instance 
GeneratedMetadataHandler_RowCount.getRowCount:
    > -          It is called 9303 times as top call
    > -          It generates 165754 total recursive calls, up to a nest level 
18
    > -          The memo cache is hit 18065 (successful found key)
    > -          The memo is populated 147689 times (missed key)
    > -          The function gets no less than 22186 distinct RelMetadataQuery 
`mq` instances (!!).
    >
    > This situation repeats for each GeneratedMetadataHandler_XXX code:
    > Class
    >
    >  Top Calls
    >
    >  Total Calls
    >
    >  Memo Cache Hits
    >
    >  Memo Cache Miss
    >
    >  Distinct RelMetadataQuery instances
    >
    > getMaxRowCount
    >
    > 489
    >
    > 105262
    >
    > 6615
    >
    > 98647
    >
    > 489
    >
    > getDistinctRowCount
    >
    > 6828
    >
    > 74286
    >
    > 6179
    >
    > 68107
    >
    > 5828
    >
    > areColumnsUnique
    >
    > 19197
    >
    > 149708
    >
    > 13367
    >
    > 136341
    >
    > 2139
    >
    > getColumnOrigins
    >
    > 250
    >
    > 3021
    >
    > 39
    >
    > 2982
    >
    > 24
    >
    > getCumulativeCost
    >
    > 2249
    >
    > 9267
    >
    > 4660
    >
    > 4607
    >
    > 1
    >
    > getNonCumulativeCost
    >
    > 3559
    >
    > 3559
    >
    > 1285
    >
    > 3559
    >
    > 18
    >
    > getSelectivity
    >
    > 15636
    >
    > 35727
    >
    > 1114
    >
    > 34613
    >
    > 4984
    >
    > getUniqueKeys
    >
    > 26311
    >
    > 111715
    >
    > 5212
    >
    > 106503
    >
    > 3266
    >
    >
    > Looking at this, the root problems seems to be the fact that the code 
uses often the construct `RelMetadataQuery.instance()` to obtain a reference to 
a needed object. But each call to `instance()` returns a new object, and this 
new object has a new, clean, memorization `map` field. So we have a very poor 
memo cache hit ratio, but far worse is the effect of repeating ad-nauseam 
recursive calls on deep trees. I made an experiment where I modified the code 
in `RelMetadataQuery.instance()` to reference a threadLocal `map` field and the 
difference is like night vs. day:
    >
    > Class
    >
    >  Top Calls
    >
    >  Total Calls
    >
    >  Memo Cache Hits
    >
    >  Memo Cache Miss
    >
    > getRowCount
    >
    > 8179
    >
    > 14426
    >
    > 12920
    >
    > 1506
    >
    > getMaxRowCount
    >
    > 489
    >
    > 2535
    >
    > 1327
    >
    > 1208
    >
    > getDistinctRowCount
    >
    > 3138
    >
    > 7947
    >
    > 4939
    >
    > 3008
    >
    > areColumnsUnique
    >
    > 1103
    >
    > 3191
    >
    > 1495
    >
    > 1696
    >
    > getColumnOrigins
    >
    > 250
    >
    > 1273
    >
    > 205
    >
    > 1068
    >
    > getCumulativeCost
    >
    > 2249
    >
    > 6755
    >
    > 4288
    >
    > 2467
    >
    > getNonCumulativeCost
    >
    > 2274
    >
    > 2274
    >
    > 0
    >
    > 2274
    >
    > getSelectivity
    >
    > 539
    >
    > 562
    >
    > 27
    >
    > 535
    >
    > getUniqueKeys
    >
    > 2635
    >
    > 5248
    >
    > 2437
    >
    > 2811
    >
    >
    > Gone are the deep recursive calls and explosion into +100k calls. I’m 
seeing 2x - 10x compile time improvements.
    >
    > I’m asking here if there is some reason behind the frequent replacement 
of the RelMetadataQuery object being used (and hence a clean mem cache), or is 
just some unintended consequence? I am making now changes on Hive side to 
address this (HIVE-16757), if the cache reset effect is accidental we should 
address this as well in Calcite.
    >
    > BTW I think `RelMetadataQuery.instance()` should be named 
`RelMetadataQuery.createInstance()` to be clear what the effect is.
    >
    > Thanks,
    > ~Remus
    
    

Reply via email to