[Maria-developers] Cost analysis: optimizer statistics vs real dataset properties

2012-08-08 Thread Sergei Petrunia
So, we have done:
alter table nation add index(n_name);

and now we have:

explain extended
select sql_calc_found_rows
   s_name, s_address
from nation straight_join supplier
where s_suppkey in (select ps_suppkey from partsupp
where ps_partkey in (select p_partkey from part
 where p_name like 'forest%')
  and ps_availqty >
  (select 0.5 * sum(l_quantity)
   from lineitem
   where l_partkey = ps_partkey
 and l_suppkey = ps_suppkey
 and l_shipdate >= date('1994-01-01')
 and l_shipdate < date('1994-01-01') +
 interval '1' year ))
and s_nationkey = n_nationkey
and n_name = 'canada'
order by s_name
limit 10;
+----+--------------------+-------------+--------+---------------+---------------------+---------+-----------------------------------------+------+-----------------------------------------------------------+
| id | select_type        | table       | type   | possible_keys | key                 | key_len | ref                                     | rows | Extra                                                     |
+----+--------------------+-------------+--------+---------------+---------------------+---------+-----------------------------------------+------+-----------------------------------------------------------+
|  1 | PRIMARY            | nation      | ref    |               | n_name              | 26      | const                                   |    1 | Using where; Using index; Using temporary; Using filesort |
|  1 | PRIMARY            | supplier    | ref    |               | i_s_nationkey       | 5       | nation.n_nationkey                      |  251 |                                                           |
|  1 | PRIMARY            | <subquery2> | eq_ref |               | distinct_key        | 4       | func                                    |    1 |                                                           |
|  2 | MATERIALIZED       | part        | range  |               | p_name              | 58      | NULL                                    | 2387 | Using where; Using index                                  |
|  2 | MATERIALIZED       | partsupp    | ref    |               | PRIMARY             | 4       | part.p_partkey                          |    1 | Using where                                               |
|  4 | DEPENDENT SUBQUERY | lineitem    | ref    |               | i_l_suppkey_partkey | 10      | partsupp.ps_partkey,partsupp.ps_suppkey |    3 | Using where                                               |
+----+--------------------+-------------+--------+---------------+---------------------+---------+-----------------------------------------+------+-----------------------------------------------------------+


=== nation ===
- 1 row scanned; matches the real dataset. OK.
- "using where" doesn't filter anything out.

=== supplier ===
- dataset has 412 rows for CANADA  
- rec_per_key: 10K suppliers / 25 countries = 400

=== <subquery2> ===
- eq_ref; we know that there's always a match.
- "using where" won't filter anything out.

=== part ===
- dataset: the WHERE matches 2378 rows for (p_name like 'forest%'), close to
  the 2387 that EXPLAIN estimates.  OK.
- "using where" won't filter anything out.

=== partsupp ===
- rec_per_key: partsupp has 800K rows and 200K distinct ps_partkey values, which
  gives rec_per_key=4  (while EXPLAIN shows 1!)

- select count(*) from partsupp 
  where ps_partkey in (select p_partkey from part where p_name like 'forest%');
  gives 9552 rows.

(rec_per_key=4) * 2378 = 9512 , close to 9552.

- "using where" won't filter anything out (NOT TAKING scalar-subquery into 
account)
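The rec_per_key arithmetic above can be double-checked in a few lines. This is a back-of-the-envelope sketch, not optimizer code; the 10,000-supplier, 800K-row and 200K-distinct-key figures are the TPC-H scale-factor-1 numbers quoted in these notes:

```python
def rec_per_key(total_rows, distinct_keys):
    # The estimate the optimizer derives from index statistics:
    # average number of rows per distinct key value.
    return total_rows / distinct_keys

# supplier: 10,000 suppliers over 25 nations -> estimate 400 rows per
# nation; the real dataset has 412 for CANADA.
print(rec_per_key(10_000, 25))            # 400.0

# partsupp: 800K rows, 200K distinct ps_partkey values -> rec_per_key = 4
# (while EXPLAIN shows 1).  Scaled by the 2378 matching part rows this
# predicts 9512 partsupp rows, close to the real count of 9552.
rpk = rec_per_key(800_000, 200_000)
print(rpk * 2378)                         # 9512.0
```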



+----+--------------------+----------+--------+---------------+---------------------+---------+-----------------------------------------+------+----------+-----------------------------------------------------------+
| id | select_type        | table    | type   | possible_keys | key                 | key_len | ref                                     | rows | filtered | Extra                                                     |
+----+--------------------+----------+--------+---------------+---------------------+---------+-----------------------------------------+------+----------+-----------------------------------------------------------+
|  1 | PRIMARY            | nation   | ref    |               | n_name              | 26      | const                                   |    1 |   100.00 | Using where; Using index; Using temporary; Using filesort |
|  1 | PRIMARY            | supplier | ref    |               | i_s_nationkey       | 5       | nation.n_nationkey                      |  251 |   100.00 |                                                           |
|  1 | PRIMARY            | partsupp | ref    |               | i_ps_suppkey        | 4       | supplier.s_suppkey                      |   34 |   100.00 | Using where                                               |
|  1 | PRIMARY            | part     | eq_ref |               | PRIMARY             | 4       | partsupp.ps_partkey                     |    1 |   100.00 | Using where; FirstMatch(supplier)                         |
|  4 | DEPENDENT SUBQUERY | lineitem | ref    |               | i_l_suppkey_partkey | 10      | partsupp.ps_partkey,partsupp.ps_suppkey |    3 |   100.00 | Using where                                               |
+----+--------------------+----------+--------+---------------+---------------------+---------+-----------------------------------------+------+----------+-----------------------------------------------------------+

[Maria-developers] Cost analysis: comparison of the two approaches

2012-08-08 Thread Sergei Petrunia
So, we're comparing two plans that were mentioned earlier.

FirstMatch plan has:

  (gdb) p *current_read_time
  $426 = 360931
  (gdb) p *current_record_count
  $427 = 4500

which is much greater than that of the SJ-Materialization plan:

  (gdb) print current_record_count
$634 = 4500
  (gdb) print current_read_time
$635 = 8663.1088587744125

Let's check whether the difference can be overcome by adding the cost of the
scalar-subquery predicate.

Cost of one execution of the scalar-subquery predicate is 4.5 ~= 5.
The SJ-Materialization plan will execute the subquery predicate 2378 times:

  (gdb) p join->best_read
$575 = 6229.5578587744112
  (gdb) p join->record_count
$576 = 2387

which gives an added cost of

  2378 * 5 = 11890.

8663 + 11890 = 20553, which is still much less than 360931.

Hence, just taking the scalar-subquery cost into account is not sufficient
for the query plan to change.
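Spelled out as arithmetic over the numbers above:

```python
# Charge the SJ-Materialization plan for every execution of the scalar
# subquery predicate, then compare with the FirstMatch plan's cost.
SUBQUERY_COST = 5        # one execution costs ~4.5, rounded up
EXECUTIONS    = 2378     # times the predicate runs under SJ-Materialization

sj_mat      = 8663 + EXECUTIONS * SUBQUERY_COST
first_match = 360931

print(sj_mat)                  # 20553
print(sj_mat < first_match)    # True -- the plan choice would not change
```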

PS.
psergey's opinion: too many unaccounted-for WHERE clauses. To be elaborated
further...

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] Review request: SHOW EXPLAIN, attempt 2

2012-08-08 Thread Sergei Petrunia
Hi Sergei,

On Tue, Aug 07, 2012 at 01:37:33PM +0200, Sergei Golubchik wrote:
> On Jul 17, Sergei Petrunia wrote:
> > > > === modified file 'sql/sql_join_cache.cc'
> > > > --- sql/sql_join_cache.cc   2012-03-24 17:21:22 +
> > > > +++ sql/sql_join_cache.cc   2012-05-16 20:59:03 +
> > > > @@ -2236,7 +2236,7 @@ enum_nested_loop_state JOIN_CACHE::join_
> > > >
> > > >while (!(error= join_tab_scan->next()))   
> > > >{
> > > > -if (join->thd->killed)
> > > > +if (join->thd->check_killed())
> > > 
> > > I would recommend to rename thd->killed (for example, to thd->killed_flag)
> > > just to make sure no code uses thd->killed the old way,
> > > and - more importantly - that we won't get new thd->killed uses in
> > > merges from MySQL.  
> > 
> > psergey:
> > I've tried renaming. This required changes in 160 places and resulted in a
> > 1600-line patch:
> > http://s.petrunia.net/scratch/psergey-rename-killed-to-killed_flag.diff
> > 
> > Did you really intend to make such a big change?
> 
> Yes. Looking at this big patch I see that we have lots of places where
> thd->killed is checked directly, instead of using thd->check_killed(),
> that is, there can be long delays where APC requests won't be served
> (despite the fact that KILL works there).
> 
> But if you feel uneasy doing this, let's create a separate jira task for
> "refactor KILL to work via APC" and I will do it.

Filed https://mariadb.atlassian.net/browse/MDEV-442

> > > > === modified file 'sql/sql_lex.h'
> > > > --- sql/sql_lex.h   2012-05-21 13:30:25 +
> > > > +++ sql/sql_lex.h   2012-07-10 17:23:00 +
> > > > @@ -2344,7 +2357,8 @@ struct LEX: public Query_tables_list
> > > >char *backup_dir;/* For RESTORE/BACKUP */
> > > >char* to_log; /* For PURGE MASTER 
> > > > LOGS TO */
> > > >char* x509_subject,*x509_issuer,*ssl_cipher;
> > > > -  String *wild;
> > > > +  String *wild; /* Wildcard in SHOW {something} LIKE 'wild'*/ 
> > > > +  Item *show_explain_for_thread; /* id in SHOW EXPLAIN FOR id */
> > > 
> > > Why do you extend LEX even more?
> > > KILL, for example, uses LEX::value_list
> > > 
> > psergey:
> > KILL does NOT use INFORMATION_SCHEMA. This is why it is able to use
> > LEX::value_list.
> 
> Could you please explain, why using INFORMATION_SCHEMA prevents you
> from using LEX::value_list ?
>
Ok, we've figured there's no valid reason. Started to use LEX::value_list.

> > > > === modified file 'sql/sql_parse.cc'
> > > > --- sql/sql_parse.cc2012-05-21 18:54:41 +
> > > > +++ sql/sql_parse.cc2012-07-10 17:23:00 +
> > > > @@ -2143,6 +2144,32 @@ mysql_execute_command(THD *thd)
> > > >  execute_show_status(thd, all_tables);
> > > >  break;
> > > >}
> > > > +  case SQLCOM_SHOW_EXPLAIN:
> > > > +  {
> > > > +if (!thd->security_ctx->priv_user[0] &&
> > > > +check_global_access(thd,PROCESS_ACL))
> > > > +  break;
> > > > +
> > > > +/*
> > > > +  The select should use only one table, it's the SHOW EXPLAIN 
> > > > pseudo-table
> > > > +*/
> > > > +if (lex->sroutines.records || lex->query_tables->next_global)
> > > > +{
> > > > +  my_error(ER_NOT_SUPPORTED_YET, MYF(0), "Usage of subqueries or 
> > > > stored "
> > > > +   "function calls as part of this statement");
> > > 
> > > Ok. Here I'd suggest to create a new error message
> > > 
> > > ER_MSG_SUBQUERIES_OR_ROUTINES
> > >   "Usage of subqueries or stored function calls as part of this statement"
> > > 
> > > and use it as
> > > 
> > >my_error(ER_NOT_SUPPORTED_YET, MYF(0), 
> > > ER(ER_MSG_SUBQUERIES_OR_ROUTINES));
> > > 
> > > also in SQLCOM_KILL and SQLCOM_CREATE_EVENT cases.
> > > 
> > psergey:
> > I've changed SHOW EXPLAIN code not to use this error (it uses
> > ER_SET_CONSTANTS_ONLY now). Is that ok?
> 
> Yes, thanks.
> 
> > > > === modified file 'sql/sql_select.cc'
> > > > --- sql/sql_select.cc   2012-06-04 15:26:11 +
> > > > +++ sql/sql_select.cc   2012-07-05 20:28:30 +
> > > > @@ -928,6 +979,13 @@ bool JOIN::prepare_stage2()
> > > >  }
> > > >  
> > > >  
> > > > +int JOIN::optimize()
> > > > +{
> > > > +  int res= optimize_inner();
> > > > +  if (!res)
> > > > +have_query_plan= QEP_AVAILABLE;
> > > 
> > > Why wouldn't you simply set have_query_plan from the old JOIN::optimize,
> > > I mean, why did you have to move it to optimize_inner() ?
> > > 
> > > I thought it's because in some cases you need to invoke optimize_inner()
> > > directly, without setting have_query_plan. But you don't seem to do it
> > > anywhere.
> > > 
> > psergey:
> > There are multiple exits from JOIN::optimize, and I wanted to cover them all
> > with one statement.
> 
> I wouldn't do it that way, but it's your task.
> Okay.
> 
> > > > === modified file 'sql/sql_show.cc'
> > > > --- sql/sql_show.cc 2012-05-21 18:54:41 +
> > > > +++ sql/sql_show.cc 2012-07-10 17:23:00 +
> > > > @@ -1998,6 +1998,12

Re: [Maria-developers] Cost analysis: FirstMatch plan

2012-08-08 Thread Sergei Petrunia
On Wed, Aug 08, 2012 at 07:04:39PM +0400, Sergei Petrunia wrote:
> Please find below cost analysis for the plan with FirstMatch strategy.
> 
> Debugging with optimizer_switch='semijoin=off', query:

Correction. I meant 'materialization=off'.

BR
 Sergei



[Maria-developers] Cost analysis: Materialization plan

2012-08-08 Thread Sergei Petrunia
Debugging with optimizer_switch='materialization=on', query:

explain extended
select sql_calc_found_rows
   s_name, s_address
from nation straight_join supplier
where s_suppkey in (select ps_suppkey from partsupp
where ps_partkey in (select p_partkey from part
 where p_name like 'forest%')
  and ps_availqty >
  (select 0.5 * sum(l_quantity)
   from lineitem
   where l_partkey = ps_partkey
 and l_suppkey = ps_suppkey
 and l_shipdate >= date('1994-01-01')
 and l_shipdate < date('1994-01-01') +
 interval '1' year ))
and s_nationkey = n_nationkey
and n_name = 'canada'
order by s_name
limit 10;
+----+--------------------+-------------+--------+---------------+---------------------+---------+-----------------------------------------+------+----------+-----------------------------------------------------------+
| id | select_type        | table       | type   | possible_keys | key                 | key_len | ref                                     | rows | filtered | Extra                                                     |
+----+--------------------+-------------+--------+---------------+---------------------+---------+-----------------------------------------+------+----------+-----------------------------------------------------------+
|  1 | PRIMARY            | nation      | ALL    |               | NULL                | NULL    | NULL                                    |   25 |   100.00 | Using where; Using temporary; Using filesort              |
|  1 | PRIMARY            | supplier    | ref    |               | i_s_nationkey       | 5       | nation.n_nationkey                      |  180 |   100.00 |                                                           |
|  1 | PRIMARY            | <subquery2> | eq_ref |               | distinct_key        | 4       | func                                    |    1 |   100.00 |                                                           |
|  2 | MATERIALIZED       | part        | range  |               | p_name              | 58      | NULL                                    | 2387 |   100.00 | Using where; Using index                                  |
|  2 | MATERIALIZED       | partsupp    | ref    |               | PRIMARY             | 4       | part.p_partkey                          |    1 |   100.00 | Using where                                               |
|  4 | DEPENDENT SUBQUERY | lineitem    | ref    |               | i_l_suppkey_partkey | 10      | partsupp.ps_partkey,partsupp.ps_suppkey |    4 |   100.00 | Using where                                               |
+----+--------------------+-------------+--------+---------------+---------------------+---------+-----------------------------------------+------+----------+-----------------------------------------------------------+

** SEMI-JOIN pre-optimization **

=== PRE-1. part ===
(gdb) p position->records_read
  $564 = 2387
(gdb) p position->read_time
  $565 = 497.33246183668518
(gdb) p current_record_count / TIME_FOR_COMPARE
  $566 = 477.38
(gdb) p current_read_time
  $567 = 974.73246183668516

=== PRE-2. partsupp ===

(gdb) p position->records_read
  $570 = 1
(gdb) p position->read_time
  $571 = 2390.4263969377266
(gdb) p current_record_count / TIME_FOR_COMPARE
  $572 = 477.38
(gdb) p current_read_time
  $573 = 3842.5588587744119

## (part, partsupp) is the best plan to materialize the subquery.
  (gdb) p join->best_read
$575 = 6229.5578587744112
  (gdb) p join->record_count
$576 = 2387
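The PRE-1/PRE-2 figures follow the optimizer's prefix-cost recurrence: new_record_count = record_count * records_read, and new_read_time = read_time + table_read_time + new_record_count / TIME_FOR_COMPARE, with TIME_FOR_COMPARE = 5 as in the source. A sketch replaying the gdb readings (last-digit noise comes from the values being rounded when printed):

```python
TIME_FOR_COMPARE = 5  # cost of comparing one row, as in the server source

def extend(record_count, read_time, records_read, table_read_time):
    # One step of the join-prefix cost recurrence.
    record_count *= records_read
    read_time += table_read_time + record_count / TIME_FOR_COMPARE
    return record_count, read_time

rec, cost = 1, 0.0
rec, cost = extend(rec, cost, 2387, 497.33246183668518)   # PRE-1: part
print(rec, cost)    # cost ~974.732, matches $567
rec, cost = extend(rec, cost, 1, 2390.4263969377266)      # PRE-2: partsupp
print(rec, cost)    # cost ~3842.559, matches $573
```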

## Extra expenses for semi-join
(gdb) print lookup_cost
  $582 = 0.050003
(gdb) print write_cost
  $584 = 0.050003

** MAIN join optimization **
=== 1. nation ===
(gdb) p position->records_read
  $587 = 25
(gdb) p position->read_time
  $588 = 1
(gdb) p current_record_count / TIME_FOR_COMPARE
  $589 = 5
(gdb) p current_read_time
  $590 = 6

=== 2. supplier ===
(gdb) p position->records_read
  $593 = 180
(gdb) p position->read_time
  $594 = 4525
(gdb) p current_record_count / TIME_FOR_COMPARE
  $595 = 900
(gdb) p current_read_time
  $596 = 5431

## Here it considers the "part, partsupp" extension, with huge cost;
## this time I don't skip it:

=== 3.A part ===
(gdb) p position->records_read
  $599 = 2387
(gdb) p position->read_time
  $600 = 2237996.0782650835
(gdb) p current_record_count / TIME_FOR_COMPARE
  $601 = 2148300
(gdb) p current_read_time
  $602 = 4391727.078265084

=== 4.A partsupp ===
(gdb) p position->records_read
  $605 = 1
(gdb) p position->read_time
  $606 = 10741500
(gdb) p current_record_count / TIME_FOR_COMPARE
  $607 = 2148300
(gdb) p current_read_time
  $608 = 17281527.078265086
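Steps 3.A/4.A follow the same prefix-cost recurrence (new cost = old cost + table read time + new record count / TIME_FOR_COMPARE, with TIME_FOR_COMPARE = 5), starting from the (nation, supplier) prefix of 4500 rows and cost 5431:

```python
TIME_FOR_COMPARE = 5

def extend(record_count, read_time, records_read, table_read_time):
    # One step of the join-prefix cost recurrence.
    record_count *= records_read
    read_time += table_read_time + record_count / TIME_FOR_COMPARE
    return record_count, read_time

rec, cost = 4500, 5431.0                                  # after supplier
rec, cost = extend(rec, cost, 2387, 2237996.0782650835)   # 3.A: part
print(rec, cost)    # 10741500, ~4391727.08 -- matches $601 * 5 and $602
rec, cost = extend(rec, cost, 1, 10741500)                # 4.A: partsupp
print(rec, cost)    # 10741500, ~17281527.08 -- matches $608
```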

> advance_sj_state
  > Firstmatch_picker::check_qep
### Picks FirstMatch with
(gdb) print read_time
  $615 = 12984927.078265084
(gdb) print rec_count
  $616 = 4500
  < Firstmatch_picker::check_qep

  > Sj_materialization_picker::check_qep
(gdb) p prefix_cost.total_cost()
  $622 = 5431   # OK, same as $596

(gdb) p mat_info->materialization_cost.total_cost()
  $623 = 3007.1088587744121  
## This one is weird.  It shoul

[Maria-developers] Cost analysis: FirstMatch plan

2012-08-08 Thread Sergei Petrunia
Hi Timour, 

Please find below cost analysis for the plan with FirstMatch strategy.

Debugging with optimizer_switch='semijoin=off', query:

explain extended
select sql_calc_found_rows
   s_name, s_address
from nation straight_join supplier
where s_suppkey in (select ps_suppkey from partsupp
where ps_partkey in (select p_partkey from part
 where p_name like 'forest%')
  and ps_availqty >
  (select 0.5 * sum(l_quantity)
   from lineitem
   where l_partkey = ps_partkey
 and l_suppkey = ps_suppkey
 and l_shipdate >= date('1994-01-01')
 and l_shipdate < date('1994-01-01') +
 interval '1' year ))
and s_nationkey = n_nationkey
and n_name = 'CANADA'
order by s_name
limit 10;

+----+--------------------+----------+--------+---------------+---------------------+---------+-----------------------------------------+------+-----------------------------------------------------------+
| id | select_type        | table    | type   | possible_keys | key                 | key_len | ref                                     | rows | Extra                                                     |
+----+--------------------+----------+--------+---------------+---------------------+---------+-----------------------------------------+------+-----------------------------------------------------------+
|  1 | PRIMARY            | nation   | ALL    |               | NULL                | NULL    | NULL                                    |   25 | Using where; Using temporary; Using filesort              |
|  1 | PRIMARY            | supplier | ref    |               | i_s_nationkey       | 5       | nation.n_nationkey                      |  180 |                                                           |
|  1 | PRIMARY            | partsupp | ref    |               | i_ps_suppkey        | 4       | supplier.s_suppkey                      |   39 | Using where                                               |
|  1 | PRIMARY            | part     | eq_ref |               | PRIMARY             | 4       | partsupp.ps_partkey                     |    1 | Using where; FirstMatch(supplier)                         |
|  4 | DEPENDENT SUBQUERY | lineitem | ref    |               | i_l_suppkey_partkey | 10      | partsupp.ps_partkey,partsupp.ps_suppkey |    4 | Using where                                               |
+----+--------------------+----------+--------+---------------+---------------------+---------+-----------------------------------------+------+-----------------------------------------------------------+

=== 1. nation ===
(gdb) p position->records_read 
  $367 = 25
(gdb) p position->read_time
  $368 = 1
(gdb) p current_record_count / TIME_FOR_COMPARE 
  $369 = 5
(gdb) p current_read_time
  $370 = 6

=== 2. supplier ===

(gdb) p position->records_read
  $377 = 180
(gdb) p position->read_time
  $378 = 4525
(gdb) p current_record_count / TIME_FOR_COMPARE
  $379 = 900
(gdb) p current_read_time
  $380 = 5431

## here it considers "part, partsupp" extension, with cost 
## 12,984,927. I skip that, because that's not the optimal plan

=== 3. partsupp ===

(gdb) p position->records_read
  $411 = 39
(gdb) p position->read_time
  $412 = 18
(gdb) p current_record_count / TIME_FOR_COMPARE
  $413 = 35100
(gdb) p current_read_time
  $414 = 220531
 
=== 4. part ===
(gdb) p position->records_read
  $421 = 1
(gdb) p position->read_time
  $422 = 175500
(gdb) p current_record_count / TIME_FOR_COMPARE
  $423 = 35100
(gdb) p current_read_time
  $424 = 431131
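Steps 1-4 replayed with the optimizer's prefix-cost recurrence (TIME_FOR_COMPARE = 5). One caveat: position->read_time for partsupp is printed as "18" in the gdb log above, which looks truncated; 180000 is the value that makes the printed total $414 = 220531 come out, so that is what this sketch assumes:

```python
TIME_FOR_COMPARE = 5

def extend(record_count, read_time, records_read, table_read_time):
    # new cost = old cost + table read time + new fanout / TIME_FOR_COMPARE
    record_count *= records_read
    read_time += table_read_time + record_count / TIME_FOR_COMPARE
    return record_count, read_time

rec, cost = 1, 0.0
rec, cost = extend(rec, cost, 25, 1)        # 1. nation   -> (25, 6)
rec, cost = extend(rec, cost, 180, 4525)    # 2. supplier -> (4500, 5431)
rec, cost = extend(rec, cost, 39, 180000)   # 3. partsupp -> (175500, 220531)
rec, cost = extend(rec, cost, 1, 175500)    # 4. part     -> (175500, 431131)
print(rec, cost)                            # 175500 431131.0
```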

> advance_sj_state()
  > Firstmatch_picker::check_qep
    > optimize_wo_join_buffering

 ### It doesn't make any best_access_path() calls here, because no table is
 ###  using join buffering.

 ### It forgets to add (current_record_count / TIME_FOR_COMPARE), and ends
 ###   up with a slightly lower cost of
 (gdb) print cost
 $425 = 360931
    < optimize_wo_join_buffering
  < Firstmatch_picker::check_qep

  ## FirstMatch is chosen as the way we will eliminate duplicates.
  
  ## At the end of advance_sj_state, we have:

  (gdb) p *current_read_time
$426 = 360931
  (gdb) p *current_record_count
$427 = 4500
< advance_sj_state()


And this is what is chosen as the final query plan.

BR
 Sergei



Re: [Maria-developers] what compiler should i use when building mariadb?

2012-08-08 Thread Rich Prohaska
We did not build gcc 4.7.1 correctly.  When we built gcc 4.7.1, we
used the default assembler that ships in centos 5.  We needed to build
gcc 4.7.1 with a newer binutils, so we built binutils 2.22 and then
built gcc 4.7.1 with the binutils 2.22 assembler and linker.  This
compiler now generates the CFI directives that are needed to compile
mariadb.

On Mon, Aug 6, 2012 at 3:44 PM, Kristian Nielsen
 wrote:
> Rich Prohaska  writes:
>
>> I suspect that we have a tool chain problem here at tokutek.
>> We can build maria 5.5 with gcc 4.7.1 and binutils 2.22 on a centos
>> 6.2 machine (we built gcc and binutils ourselves).
>> We can not build it on a centos 5.8 machine with the same gcc 4.7.1
>> and binutils 2.22 (we also built gcc and binutils ourselves).
>> So, we are doing additional experiments to find the root cause of this 
>> problem.
>
> Aha, I see.
>
> One possible explanation is that some missing dependency caused your
> gcc/binutils build on the centos 5.8 to be configured without support for
> DWARF debug information. This could result in an error such as you saw.
> (But it is just a guess, and I may be completely wrong).
>
>  - Kristian.



Re: [Maria-developers] MDEV-318 IF (NOT) EXIST clauses for ALTER TABLE

2012-08-08 Thread Alexey Botchkov

30.07.2012 18:27, Sergei Golubchik wrote:

Hi, Alexey!

First: you didn't include all my comments in your reply.
Should I assume that you simply agree with all the omitted comments and
will change the patch accordingly?

That's right.


Why did you do that in a separate function? You have basically
duplicated all the checks: normal ALTER TABLE code already checks for
duplicate column/key names and for non-existent columns/keys on DROP.

That code is still in place after your patch, so, basically, you'll
check for duplicates and non-existent names twice. What's the point of
that?

I'd expect you to take check_exists flag into account
in the existing checking code and convert errors into warnings
as appropriate.

Naturally, I started doing it that way initially.
Then I realized the code looked rather ugly: these checks are scattered
around the code, and the parts of the code doing them are used not only
for ALTER TABLE, but also for CREATE and other internal needs.

=== modified file 'sql/sql_table.cc'
--- sql/sql_table.cc2012-04-05 21:07:18 +
+++ sql/sql_table.cc2012-07-11 14:28:40 +
@@ -5788,6 +5788,226 @@ is_index_maintenance_unique (TABLE *tabl
   
   
   /*

+  Preparation for table creation
+
+  SYNOPSIS
+handle_if_exists_option()
I didn't spend much time on this, but grepping for ER_DUP_FIELDNAME and
ER_DUP_KEYNAME (which, I suppose, should show all the checks) didn't
turn up that many places: only sql_table.cc, and only
mysql_prepare_create_table().


That's not enough. You also should grep for ER_SAME_NAME_PARTITION,
ER_DROP_PARTITION_NON_EXISTENT, ER_CANT_DROP_FIELD_OR_KEY,
ER_BAD_FIELD_ERROR




There we need to be sure these check_if_exist flags aren't set.

Yes, but because it's not set by default, it should be always unset,
unless you set it explicitly in sql_yacc.yy, right?


Yes. Still, I can imagine someone setting that value, and we'd get
mysterious bugs as a result.



And the loops should be done in a different way in these places so we
can remove items from the list in the process.

It is already supported by the current code - both for fields and keys.


Not really.
For fields it works like this:

mysql_alter_table()
  calls mysql_prepare_alter_table(); there the new alter_info is formed
  from all the old fields that weren't dropped, plus all the added
  fields. No duplicate-field checks yet.
  Then it calls compare_tables(),
    which calls mysql_prepare_create_table(),
      which finally discovers the duplicated fields. The code to remove
      a field with the 'if_exists' mark from the list looks ugly there.
      Worse, there is no point removing anything from the list there at
      all, as it only gets a temporary copy of the Alter_info; later in
      the code we call the same mysql_prepare_create_table() with the
      real data.

For partitions, the code does not lend itself to removing something from
the list at all. And we would have to pass the 'check_exists' flag down
to remote functions like partition_info::check_partition_info().


And one more thing: it's easy to imagine a user who runs ALTER IF EXISTS
queries often.


But even if 'new_field' already exists, ADD FIELD IF EXISTS new_field
still does a lot. The list of calls is:

  mysql_create_table_no_lock()  - creates the .frm file with a temporary name
  mysql_rename_table()
  mysql_rename_table()          - once more
  quick_rm_table()              - removes the old .frm
  query_cache_invalidate()
  reopen_table()

Using a separate function as I did, it's easy to notice that the current
ALTER is a no-op and return at once.




Thus I decided to do it in a separate function, called exclusively from
mysql_alter_table().

In my opinion, the function won't affect the performance of the existing
functions at all. And it's easy to understand what it does and why, so
the code duplication actually produces less error-prone code here.

Well, did I convince you the separate function is nicer?

I'm sorry. Not quite.
If the current code is structured badly, it does not mean that we should
add a special function with duplicate checks for every new task.

If you think the current code is ugly, please, do a little refactoring
and move these checks to separate functions, that you can easily extend
for your purposes.




So I see two options: seriously revise the way mysql_alter_table() works,
or keep the IF EXISTS handling in a separate function.
BTW, I don't feel the separate function is that bad: I don't duplicate
that much code there besides the loops, and we do such loops in many
places anyway.


Best regards.
HF

