[Maria-developers] MariaDB Developers Unconference, New York
A final reminder about the MariaDB Developers Unconference taking place 23 - 24 February in New York, just before OpenWorks. We're almost at capacity, so if you haven't signed up yet, please do so soon. Details about where to sign up and how to suggest sessions are at https://mariadb.org/2019-developers-unconference-new-york/ Hope to see you there! ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
[Maria-developers] Review for ANALYZE TABLE with Sampling
Hi Sergei! Can you review that you are happy with the storage engine API changes? I'veustructured the commit to be as small as possible to achieve the desired outcome. In my tests, we are now twice as fast as MySQL for a 10 mil row table with 13 columns. Vicențiu Forwarded Message From: vicentiu@mariadb.orgTo: commits@mariadb.orgSubject: 53730224efd: Improve histogram collection performance by samplingDate: Sun, 10 Feb 2019 20:09:49 +0200 (EET) revision-id: 53730224efd987f97a6cc968ff5214ee499d84e0 (mariadb-10.4.1- 163-g53730224efd)parent(s): 3c305d3f1951f1667f84e48ddd98674c6318c39dauthor: Vicențiu Ciorbarucommitter: Vicențiu Ciorbarutimestamp: 2019-02-10 19:54:50 +0200message: Improve histogram collection performance by sampling Histogram collection is done by sampling a percentage of rows from the table,not looking at all individual ones. The default implementation, to keep the server's Engine IndepenentStatistics component working uses Bernoulli sampling. It does a simpletable scan, but only accepts rows that pass a dice roll. Thisimplementation is done as a storage engine interface method, so as toallow faster and/or more efficient implementations for storage enginesinternally. The number of rows collected is capped to a minimum of 5 andincreases logarithmically with a coffecient of 4096. The coffecient ischosen so that we expect an error of less than 3% in our estimationsaccording to the paper:"Random Sampling for Histogram Construction: How much is enough?”– Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya, ACM SIGMOD, 1998. This interface is also a precursor to allowing SELECT ... FROM tablewith sampling to work. Performance wise, for a table of 10 million rows and 13 columns, 6 int,6 doubles, one string, the previous analyze table statistics took1 minute 20 seconds to collect all data. Post implementation, thetime is less than 6 seconds. This was run on an InnoDB table, NVME SSD withapproximately 2GB/s read speed and 500MB/s write speed. --- mysql-test/main/selectivity.result| 8 +++ mysql- test/main/selectivity_innodb.result | 8 +++-- -- sql/handler.cc| 14 +++ sql/handler.h | 40 ++- sql/sql_statistics.cc | 32 +++-- 5 files changed, 86 insertions(+), 16 deletions(-) diff --git a/mysql-test/main/selectivity.result b/mysql- test/main/selectivity.resultindex 00907235ecc..6d09c1c9b62 100644--- a/mysql-test/main/selectivity.result+++ b/mysql- test/main/selectivity.result@@ -1290,9 +1290,9 @@ explain extended select * from t1, t2, t1 as t3 where t1.b=t2.c and t2.d=t3.a and t3.b<5 and t1.a < 2000; id select_type table typepossibl e_keys key key_len ref rowsfilteredExtra-1 SIMPLE t1 ALL NULLNULLNULLNULL262144 100.00 Using where+1 SIMPLE t1 ALL NULLNULLNULLNULL262117 100.00 Using where 1 SIMPLE t2 ref c,d c 5 test.t1.b 5 100.00 -1 SIMPLE t3 ALL NULL NULLNULLNULL262144 100.00 Using where; Using join buffer (flat, BNL join)+1 SIMPLE t3 ALL NULLNULLNULL NULL262117 100.00 Using where; Using join buffer (flat, BNL join) Warnings: Note1003select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS `d`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b` from `test`.`t1` join `test`.`t2` join `test`.`t1` `t3` where `test`.`t2`.`c` = `test`.`t1`.`b` and `test`.`t3`.`a` = `test`.`t2`.`d` and `test`.`t3`.`b` < 5 and `test`.`t1`.`a` < 2000 select * from t1, t2, t1 as t3@@ -1307,9 +1307,9 @@ explain extended select * from t1, t2, t1 as t3 where t1.b=t2.c and t2.d=t3.a and t3.b<5 and t1.a < 2000; id select_ typetable typepossible_keys key key_len ref rows filteredExtra-1 SIMPLE t3 ALL NULLNULLNULL NULL262144 0.00Using where+1 SIMPLE t3 ALL NULL NULLNULLNULL262117 0.00Using where 1 SIMPLE t2 ref c,d d 5 test.t3.a 7 100.00 -1 SIMPLE t1 ALL NULLNULLNULLNULL262144 2.00 Using where; Using join buffer (flat, BNL join)+1 SIMPLE t1 ALL NULLNULLNULLNULL262117 2.00Using where; Using join buffer (flat, BNL join) Warnings: Note 1003select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS `d`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b` from `test`.`t1` join `test`.`t2` join `test`.`t1` `t3` where `test`.`t1`.`b` = `test`.`t2`.`c` and `test`.`t2`.`d` = `test`.`t3`.`a` and `test`.`t3`.`b` < 5 and `test`.`t1`.`a` < 2000 select * from t1, t2, t1 as t3diff --git a/mysql-test/main/selectivity_innodb.result b/mysql-test/main/selectivity_innodb.resultindex 93917065722..0b20a40f69f
Re: [Maria-developers] Some more input on optimizer_trace
On Sun, Feb 10, 2019 at 02:14:51PM +0200, Sergey Petrunia wrote: > Hi Varun, > > I've did some adjustments MDEV-18489 and pushed the patch into > 10.4-optimizer-trace, please check it out. > > Also I have filed MDEV-18527 and MDEV-18528. > > Some input on the code: > > > --- 10.4-optimizer-trace-orig/sql/sql_select.cc > > +++ 10.4-optimizer-trace-cl/sql/sql_select.cc > One more thing - Json_value_context is not really a context anymore. (the original intent was that "one gets a value context object if the Json_writer's current state is such that it currently expects a value (and not a name)" but apparently it is not used this way. Which is fine, but then I guess the object should be renamed. (to _helper? or something like that?) BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
[Maria-developers] Some more input on optimizer_trace
Hi Varun, I've did some adjustments MDEV-18489 and pushed the patch into 10.4-optimizer-trace, please check it out. Also I have filed MDEV-18527 and MDEV-18528. Some input on the code: > --- 10.4-optimizer-trace-orig/sql/sql_select.cc > +++ 10.4-optimizer-trace-cl/sql/sql_select.cc > @@ -15983,12 +16250,26 @@ optimize_cond(JOIN *join, COND *conds, >that occurs in a function set a pointer to the multiple equality >predicate. Substitute a constant instead of this field if the >multiple equality contains a constant. > -*/ > -DBUG_EXECUTE("where", print_where(conds, "original", QT_ORDINARY);); > +*/ > + > +Opt_trace_context *const trace = >opt_trace; > +Json_writer *writer= trace->get_current_json(); > +Json_writer_object trace_wrapper(writer); > +Json_writer_object trace_cond(writer, "condition_processing"); > +trace_cond.add("condition", join->conds == conds ? "WHERE" : "HAVING") > + .add("original_condition", conds); > + > +Json_writer_array trace_steps(writer, "steps"); > + DBUG_EXECUTE("where", print_where(conds, "original", QT_ORDINARY);); Small question: Why was DBUG_EXECUTE shifted right? A bigger question: the code seems unnecesarily verbose: 1. > +Opt_trace_context *const trace = >opt_trace; 2. > +Json_writer *writer= trace->get_current_json(); 3. > +Json_writer_object trace_wrapper(writer); 4. > +Json_writer_object trace_cond(writer, "condition_processing"); Can we save the space by just calling the constructors: Json_writer_object trace_wrapper(thd); Json_writer_object trace_cond(thd, "condition_processing"); ? This applies here and in many other places. Alternative, we could use: Json_writer_object trace_wrapper(thd); Json_writer_object trace_cond(trace_wrapper, "condition_processing"); .. which makes the nesting clearer (and we could also add debug safety check: it is invalid to operate on trace_wrapper until trace_cond hasn't been end()'ed) > --- 10.4-optimizer-trace-orig/sql/opt_range.cc > +++ 10.4-optimizer-trace-cl/sql/opt_range.cc > > +void TRP_ROR_UNION::trace_basic_info(const PARAM *param, > + Json_writer_object *trace_object) const > +{ > + Opt_trace_context *const trace = >thd->opt_trace; > + Json_writer* writer= trace->get_current_json(); > + trace_object->add("type", "index_roworder_union"); > + Json_writer_array ota(writer, "union_of"); The name 'ota' makes sense in MySQL codebase (where it is a contraction of Optimizer_trace_array), but is really confusing in MariaDB codebase. Please change everywhere to "smth_trace" or something like that (jwa in the worst case). > @@ -2654,12 +2833,18 @@ int SQL_SELECT::test_quick_select(THD *t > > if (cond) > { > - if ((tree= cond->get_mm_tree(, ))) > + { > +Json_writer_array trace_range_summary(writer, > + "setup_range_conditions"); > +tree= cond->get_mm_tree(, ); > + } Does this ever produce anything meaningful, other than empty: "setup_range_conditions": [], In MySQL, the possible contents of this array are: "impossible_condition": { "cause": "comparison_with_null_always_false" } /* impossible_condition */ "impossible_condition": { "cause": "null_field_in_non_null_column" } /* impossible_condition */ But in MariaDB we don't have this (should we?) Also, why_use_underscores_in_value_of_cause? It is a quoted string where spaces are allowed. MySQL seems to have figured this out at some point and have a few cause strings using spaces. > --- 10.4-optimizer-trace-orig/sql/opt_table_elimination.cc > +++ 10.4-optimizer-trace-cl/sql/opt_table_elimination.cc > @@ -522,7 +524,8 @@ eliminate_tables_for_list(JOIN *join, >List *join_list, >table_map tables_in_list, >Item *on_expr, > - table_map tables_used_elsewhere); > + table_map tables_used_elsewhere, > + Json_writer_array* eliminate_tables); Please change the name to something indicating it's just a trace. BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp