[Maria-developers] GSoC [MDEV-6017]

2019-05-27 Thread Alexey Mogilyovkin
Hello. This is the thread where I would like to publish my weekly reports
every Monday.
During preparatory weeks I set up development environment. I briefly
examined testing framework, debugging process and now I am ready to start
work.
Work progress will be available on my github repository in branch
"MDEV-6017" https://github.com/ZeroICQ/server/tree/MDEV-6017
___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


[Maria-developers] Proposal for automatic code formatting (clang-format)

2019-05-27 Thread Eugene Kosov
Hi.

I'm proposing to add some tool that automatically formats source code.

Main purpose is overall productivity. It's possible to save time on manual
formatting. Also, reading a uniformly formatted code is easier.

Some new languages has one correct coding style and code formatters.
Examples are Go (https://golang.org/cmd/gofmt/) and (to be done) Rust
(https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide/guide.md).
I suggest to read 'Motivation - why use a formatting tool' from second link.

There is no single correct coding style in C and C++ worlds: different projects
use different styles if any at all. But there is a tool which does formatting
and supports different styles: clang-format.

It's a binary available for every major platform. It's configurable: just put
.clang-format file with formatting options in source directory and formatter
will use it for current and nested directories.

I actually do not propose here to use exactly clang-format but as I know it's
the best of its kind. Here is a demo https://youtu.be/uZI_Qla4pNA?t=1893

Tool is configurable but it will not act exactly as you might expect and this is
a pay for uniformity.

So, let's add .clang-format to root directory, start using and polishing style.
Here is some info on how to add it to some text editors 
https://clang.llvm.org/docs/ClangFormat.html

I do not(!) suggesting to reformat whole project. At least for now.

Here is my PR to add .clang-format files to server and InnoDB: 
https://github.com/MariaDB/server/pull/1270

I think that eventually .clang-format usage will lead to reformatting
everything and having some project-wide precommit hook which will properly
format your diffs.

Do you have any objections or questions on having such a tool?
Let's discuss, please.

-- 
Eugene


___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] Regarding approach used for Early NULL filtering in the range optimizer(MDEV-15777)

2019-05-27 Thread Sergey Petrunia
On Fri, May 24, 2019 at 04:39:51PM -0700, Igor Babaev wrote:
> On 05/24/2019 01:41 PM, Sergey Petrunia wrote:
> > On Fri, May 24, 2019 at 10:08:04AM -0700, Igor Babaev wrote:
> > > On 05/24/2019 04:03 AM, Varun Gupta wrote:
> > > > Hi Igor,
> > > > After discussing with Sergey, we came up with these conclusions as to
> > > > why we used the approach of going through all the keyuses in the KEYUSE
> > > > array
> > > > 
> > > > Cases to consider:
> > > > we have an index on column a
> > > > 
> > > > 1) a OP const
> > > > where OP can be ( > > > This case is already handled by the range optimizer. We do not create a
> > > > NULL rejecting predicate for such conditions.
> > > > 
> > > > 2) eq_join conditions (tbl1.a = tbl2.col2)
> > > > This is the specific case we have tried to implement in MDEV-15777, we
> > > > create NULL rejecting predicates for the keyuses for each table and then
> > > > feed these predicates to range optimizer
> > > There is no logical explanation for checking null rejection of fields used
> > > in equalities with the help of array keyuse.
> > The logic here is as follows:
> > 
> > We already collect the set of KEYUSE::null_rejecting attributes. Its
> > meaning is very close to what the patch needs.
> The meaning of this argument is to be used a key value for ref access by an
> index i1.
> You need range access by a different index i2. There is no logical
> connection
> between these two indexes. The usage of i1 can be blocked by IGNORE clause
> and this case
> there won't be any keyuses for it. But the equality from which the
> null-rejection can be
> deduced is still there. You need range scan to reduce the number of records
> produced
> by the left operand of a join operation. And you should not care weather it
> should be the
> argument of any ref access.
> Here is the situation:
> The field of interest for range access is t1.a and you have two conditions
>t1.a=t2.a with index i1 on t2.a
>t1.b=t3.b with index i2 on t3.b
> The use prohibited to use i1. t1.a apparently remains null-rejecting. Yet it
> cannot be not detected
> as such by a look through array of keyuses.

I don't follow the logic. What MDEV-15777 does is to basically look at KEYUSE
object, which represent

  tbl.key_col = other_column

and feed this into the range optimizer:
  
  tbl.key_col IS NOT NULL

If the query CAN use a certain index:
- we will have a KEYUSE object for it
- we will feed the IS NOT NULL into the range optimizer.

if the query CANNOT use a certain index (because of IGNORE INDEX), then
- we won't have KEYUSE object for that index (and so won't construct an
  IS NOT NULL condition for it)
- range optimizer will not build ranges for it (so it doesn't really matter 
  if IS NOT NULL was constructed or not)

Can you provide a full example?

> > 
> > Because of this, the part of the patch for MDEV-15777 which analyzes the 
> > KEYUSE
> > array is very small: 122 lines including the comment (I counted
> > make_null_rejecting_conds() and add_cond()).
> > 
> > I think, the CPU overhead is small, too.
> > 
> > > > 3) a < func(b,c)
> > > > we do not handle this case, because:
> > > > 
> > > > a) It is harder to process. We would need to process the whole WHERE
> > > > clause to infer
> > > > non-nullability of the condition. Example:
> > > > 
> > > >(t1.f2+1 < t1.f1+t1.f3) OR ... OR t1.f1 is null
> > > > 
> > > > here the left part of the OR condition doesn't allow f1 to be NULL, but
> > > > the
> > > > right condition allows it. We would also need to take into account 
> > > > tables
> > > > inside outer joins. We would need to go through Item classes and add 
> > > > code
> > > > which say "Item_xxx() will compute to false when its argument Y is null"
> > > > (there is item->not_null_tables() currently, but not 
> > > > not_null_columns()).
> > > Have you actually looked at the implementations of the virtual function
> > > not_null_tables()?
> > > Do you need me to provide the implementation of the virtual function
> > > not_null_columns()?
> > It will require multiple implementations. At the moment we have:
> > 
> > nm mysqld --demangle | grep 'Item.*::not_null_tables'  | wc -l
> > 21
> > 
> > Another question: do you intend to collect not_null_columns() for all 
> > columns,
> > or just columns that are a part of some index?
> > 
> > What would a good data structure to store a set of tablename.column_name ?
> It will be just a bitmap created for each table that can be used for many
> other purposes.
> If you want to filter out those that are not part of any index it can be
> easily done
> (even once for any table share)

My point was that constructing such a bitmap is a lot heavier than the
not_null_table_map().

table.h declares a field bitmap class:

  typedef Bitmap Field_map;

but it's huge:

(gdb) p sizeof(Field_map)
  $52 = 544

and note that we will not be able to just put one Field_map into the TABLE
object:

If we think about how to compute not_null_columns() for Item_cond_and or
Item

Re: [Maria-developers] [Commits] 53d0e921794: MDEV-11094: Blackhole table updates on slave fail when row annotation is enabled

2019-05-27 Thread andrei . elkin
Sujatha, howdy.

Thank you for spelling it out for me!

> Hello Andrei,
...
>>>
>>> Analysis:
>>> =
>>> Enabling "replicate_annotate_row_events" on slave, Tells the slave to write
>>> annotate rows events received from the master to its own binary log. The
>>> received annotate events are applied after the Gtid event as shown below.
>>> thd->query() will be set to the actual query received from the master, 
>>> through
>>> annotate event. Annotate_rows event should not be deleted after the event is
>>> applied as the thd->query will be used to generate new Annotate_rows event
>>> during applying the subsequent Rows events.
>> [ here ]
>>The Table_map and Rows_log_event that follow do not touch thd->query().
>
> The Table_map and Rows_log_event that follow "Annotate_rows" event do
> make use of "thd->query()".

Frankly I did not expect that. Perhaps it would have not been a bad idea
to make Annotate_rows_log_event::do_apply_event() logging of
itself, which still makes sense even if the slave applier will filter
out its group's Rows_log_event:s so in this case the slave side event group
would consists of mere Annotate.

The above is a mere remark though, I have to accept what we have.

>>
>> I also suggest to check out mysqlbinlog run. There seems to be no test
>> for that though, so one needs to be written. The goal would be to prove
>> a created binlog output is as expected (e.g to include Annotate).
>
>
> Sure, I will add a test case.

Super! The patch is approved then.

Cheers,

Andrei

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] [Commits] 42129dae63c: MDEV-18741: Optimizer trace: multi-part key ranges are printed incorrectly

2019-05-27 Thread Sergey Petrunia
Hi Varun,

One cosmetic comment below. Ok to push after it is addressed.

On Fri, May 24, 2019 at 03:51:00PM +0530, Varun wrote:
> revision-id: 42129dae63c4331b01f9a504c9f3ef1c0c1f857a 
> (mariadb-10.4.4-126-g42129dae63c)
> parent(s): 592fe954ef82be1bc08b29a8e54f7729eb1e1343
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2019-05-24 14:43:44 +0530
> message:
> 
> MDEV-18741: Optimizer trace: multi-part key ranges are printed incorrectly
> 
> Changed the function append_range_all_keyparts to use sel_arg_range_seq_init 
> / sel_arg_range_seq_next to produce ranges.
> Also adjusted to print format for the ranges, now the ranges are printed as:
> (keypart1_min, keypart2_min,..)  OP (keypart1_name,keypart2_name, ..) OP 
> (keypart1_max,keypart2_max, ..)
> 
> Also added more tests for range and index merge access for optimizer trace
...
> diff --git a/sql/opt_range_mrr.cc b/sql/opt_range_mrr.cc
> index 3a25da3edb2..5644e456dd4 100644
> --- a/sql/opt_range_mrr.cc
> +++ b/sql/opt_range_mrr.cc
> @@ -53,6 +53,8 @@ typedef struct st_sel_arg_range_seq
>int i; /* Index of last used element in the above array */
>
>bool at_start; /* TRUE <=> The traversal has just started */
> +  /* TRUE if last checked tree->key can be used for ROR-scan */
> +  bool is_ror_scan;
>  } SEL_ARG_RANGE_SEQ;
>  
> 

This comment made sense at the previous location, but here it doesn't.

Here the meaning is something like "Iteration functions will set this to FALSE"
if ranges being traversed do not allow to construct a ROR-scan".



BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog



___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp