Thanks Konstantin, your suggestions are very helpful. I have filed them as issues in the vectorize_engine repo: https://github.com/zhangh43/vectorize_engine/issues
On Wed, Dec 4, 2019 at 10:08 PM Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote:

> On 04.12.2019 12:13, Hubert Zhang wrote:
>>> 3. Why do you have to implement your own plan_tree_mutator instead of using expression_tree_mutator?
>>
>> I also want to replace plan nodes, e.g. Agg -> CustomScan (with a VectorAgg implementation). expression_tree_mutator cannot be used to mutate plan nodes such as Agg, am I right?
>
> Oh, sorry, I see.
>
>>> 4. As far as I understand, you now always try to replace SeqScan with your custom vectorized scan. But that makes sense only if there are quals for the scan or aggregation is performed. In other cases batch+unbatch just adds extra overhead, doesn't it?
>>
>> Probably there is extra overhead for the heap format with a query like 'select i from t;' that has no qual, projection, or aggregation. But with a column store, VectorScan could read batches directly, with no additional batching cost. A column store is the better choice for OLAP queries.
>
> Generally, yes. But will that hold for a query with a lot of joins?
>
> select * from T1 join T2 on (T1.pk=T2.fk) join T3 on (T2.pk=T3.fk) join T4 ...
>
> How can batching improve performance in this case? Also, if the query contains a LIMIT clause or cursors, batching can cause fetching of useless records (which will never be requested by the client).
>
> Can we conclude that it would be better to use the vector engine for OLAP queries and the row engine for OLTP queries?
>
>>> 5. Throwing and catching exceptions for queries which cannot be vectorized seems to be neither the safest nor the most efficient way of handling such cases. Maybe it is better to return an error code from plan_tree_mutator and propagate this error upstairs?
>>
>> Yes. As for efficiency, another way is to let some plan nodes be vectorized, leave the other nodes non-vectorized, and add a batch/unbatch layer between them (is this what you meant by "propagate this error upstairs"?).
>> As you mentioned, this could introduce additional overhead. Are there any other good approaches? And what do you mean by "not safest"? PG_CATCH will receive the ERROR, and we fall back to the original non-vectorized plan.
>
> The problem with catching and ignoring exceptions has been discussed many times on -hackers. Unfortunately, the Postgres PG_TRY/PG_CATCH mechanism is not an analog of the exception mechanisms of higher-level languages like C++ or Java: it does not perform stack unwinding. If some resources (files, locks, memory, ...) were obtained before the error was thrown, they are not reclaimed. Only rollback of the transaction is guaranteed to release all resources, and that is what actually happens during normal error processing. But if you catch and ignore the exception, trying to continue execution, it can cause many problems.
>
> Maybe in your case it is not a problem, because you know for sure where the error can happen: it is thrown by plan_tree_mutator, and it looks like no resources are obtained by that function. But in any case the overhead of setjmp is much higher than that of explicit checks of return codes. So checking return codes will not add any noticeable overhead beyond the code complication of the extra checks, and those can be hidden in macros which are used in any case (like MUTATE).
>
>>> 7. How can a vectorized scan be combined with parallel execution (it is already supported since 9.6, isn't it)?
>>
>> We didn't implement it yet, but the idea is the same as in the non-parallel case: copy the current parallel scan and implement a vectorized Gather, keeping VectorTupleTableSlot as their interface. Our basic idea is to reuse most of the current PG executor logic, make it vectorized, and then tune performance gradually.
>
> Parallel scan scatters pages between the parallel workers. To fill a VectorTupleSlot with data you may need more than one page (unless you decide that it can fetch tuples only from a single page).
> So it has to somehow take the specifics of parallel scan into account. Also, there are special nodes for parallel execution, so if we want to provide parallel execution for vectorized operations we need to substitute those nodes with custom nodes as well.
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

--
Thanks
Hubert Zhang