Re: [Maria-developers] parallel replication monitoring

2013-11-15 Thread Giuseppe Maxia
Hi Kristian,

Thanks for the explanation. Let me elaborate a bit about the questions:
* how many workers are used: I know how many workers I have *enabled* by 
@@slave_parallel_threads. But how many are really used? I see in 
mysql.gtid_slave_pos that the number of rows grows and shrinks while the 
workers are replicating, but this does not give me any useful information on 
the effectiveness of the operations. Just to give you an idea, in Tungsten 
replication, I can see for every thread how many requests were processed. This 
can help me fine tuning the process, and add or remove threads to reach optimal 
performance.
* which database they are running is important even if the parallel replication 
is not split by schema. As a DBA, I need to know at a glance where the action 
is occurring. SHOW PROCESSLIST gives me that information, but it doesn't tell 
me the GTID, which is the info that gives me the pulse of the replication 
progress. What bothers me is that I can get either the GTID list or where the 
action is happening. Having both in one place should not be difficult (a I_S or 
P_S view, for example). I see in MariaDB implementation some of the lack of 
integration that I have seen in MySQL 5.6.

I hope to see better integration and more tools in the future. As it is now, 
parallel replication seems to be a very powerful engine with brake and steering 
wheel but with a blackened out dashboard. Makes for some adventurous driving!

Cheers

Giuseppe

On November 15, 2013 at 08:49:29 , Kristian Nielsen (kniel...@knielsen-hq.org) 
wrote:

Giuseppe Maxia g.ma...@gmail.com writes:  

 I am looking for:  
 * how many workers are used;  

The number of workers created is determined by the value of  
@@slave_parallel_threads. That many worker threads are in the replication  
worker thread pool, shared among all multi-source master connections.  

Workers are scheduled by the SQL threads dynamically, depending on what  
parallelism is available in the events being read from the relay logs at any  
given point in time. Workers for which there is currently no work available  
will appear as Waiting for work from SQL thread in SHOW PROCESSLIST.  

 * what query they are running.  

I think this should appear in SHOW PROCESSLIST, same as for the SQL thread in  
non-parallel replication.  

 * in which schema they are working;  

That is not really meaningful? A worker thread can be working in multiple  
schemas at the same time (multi-table update...).  

The parallel replication feature works like this:  

- On the master, binlog events are marked for whether they can be replicated  
in parallel. Two events can be replication in parallel if 1) their GTID  
events have the same group commit ID (cid=XXX in mysqlbinlog output), or 2)  
if their GTID has different replication domain (first number in the  
GTID=D-S-N tripple), and the slave is operating in GTID mode.  

- On the slave, the SQL thread reads events from the relay log, and  
dynamically schedules the same or a new worker to execute the events so  
read, depending on whether parallelism is possible or not. If the pool of  
threads is exhausted, the SQL thread waits for a worker thread to become  
idle.  

So unlike MySQL multi-threaded slave, worker threads in MariaDB have no  
identity. Any worker thread can potentially be scheduled to work on any event  
from any master connection.  

 What can I use to see the status of parallel replication?  

Nowhere, really :-(  

The parallel replication feature has been basically rushed ruthlessly to  
release, with no time taken for any polish like this or other similar  
reasonable concerns...  

I did briefly think of what could be made available, but I did not come up  
with any immediate ideas. The scheduling of parallel replication work really  
is very dynamic in nature, just like any other application load. Maybe the  
performance schema or something like that could be used, but I am not familiar  
with how it works. Suggestions definitely welcome.  

Giuseppe, btw, thanks for once again taking the time to look and comment on my  
work!  

- Kristian.  
___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] parallel replication monitoring

2013-11-15 Thread Peter Laursen
useful to have in I_S or P_S.  I vote for I_S. P_S may be disabled by
admin due to its memory and cpu overhead.

-- Peter
-- Webyog


On Fri, Nov 15, 2013 at 11:20 AM, Kristian Nielsen kniel...@knielsen-hq.org
 wrote:

 Giuseppe Maxia g.ma...@gmail.com writes:

  * which database they are running is important even if the parallel
  replication is not split by schema. As a DBA, I need to know at a glance
  where the action is occurring. SHOW PROCESSLIST gives me that
 information,
  but it doesn't tell me the GTID, which is the info that gives me the
 pulse
  of the replication progress. What bothers me is that I can get either the
  GTID list or where the action is happening. Having both in one place
 should
  not be difficult (a I_S or P_S view, for example).

 Right, I think I see.

 So it could be useful to have in I_S or P_S a table that gives for each
 worker
 stuff like:

  - The GTID of the event group currently executing, or NULL if idle
  - Status (executing, waiting for prior transaction before starting,
 waiting
for prior transaction before committing, stuff like that)
  - The current database (USE xxx)
  - Currently executing query
  - Whether this worker was scheduled in parallel with something else, and
 if
so, why that was possible (group commit id or replication domain id)
  - Total number of events and event groups executed by worker thread
  - Possibly time spent idle, time spent executing, and time spent waiting
 for
prior transactions to commit (if such times can be obtained without too
high performance overhead).

 That is a nice input, thanks. Something like this would definitely be nice
 to
 have. I've put this into Jira:

 https://mariadb.atlassian.net/browse/MDEV-5296

 On top of this, I think we could also add some statistics for the SQL
 thread. Like, how often did it have to wait for a worker to become free to
 schedule a potentially parallel transaction (might indicate a too-low
 --slave-parallel-threads). And how many transactions could / could not be
 scheduled in parallel (could indicate the need to tune the master to
 provide
 more parallelism in the binlog).

 What would be really nice is to have two numbers in SHOW SLAVE STATUS. One
 is
 the wall-clock time since START SLAVE. The other is the total time spent by
 workers on executing events for this master connection (excluding waiting
 for
 other replication threads). The ratio between these two numbers would
 immediately give an indication of how effective parallel replication is at
 utilising the machine, same as the cpu% numbers in the `top` Linux utility.

  I see in mysql.gtid_slave_pos that the number of rows grows and shrinks
  while the workers are replicating, but this does not give me any useful
  information on the effectiveness of the operations

 Agree, that is mostly useless information. All except the rows with maximal
 sub_id are justa garbage rows, the deletion of which has been delayed to
 avoid
 lock contention between transactions.

  I hope to see better integration and more tools in the future. As it is
 now,
  parallel replication seems to be a very powerful engine with brake and
  steering wheel but with a blackened out dashboard. Makes for some
  adventurous driving!

 Hehe, a very nice and succinct description ;-)

 Thanks,

  - Kristian.

 ___
 Mailing list: https://launchpad.net/~maria-developers
 Post to : maria-developers@lists.launchpad.net
 Unsubscribe : https://launchpad.net/~maria-developers
 More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] parallel replication monitoring

2013-11-15 Thread Giuseppe Maxia
Kristian,
Thanks for taking action with this issue. Another nice touch would be having 
the default value of slave_parallel_threads=1 instead of 0, and doing the right 
thing, as explained in this article:
http://datacharmer.blogspot.it/2013/11/parallel-replication-off-by-one.html

Cheers

Giuseppe


On November 15, 2013 at 11:20:36 , Kristian Nielsen (kniel...@knielsen-hq.org) 
wrote:

Giuseppe Maxia g.ma...@gmail.com writes:  

 * which database they are running is important even if the parallel  
 replication is not split by schema. As a DBA, I need to know at a glance  
 where the action is occurring. SHOW PROCESSLIST gives me that information,  
 but it doesn't tell me the GTID, which is the info that gives me the pulse  
 of the replication progress. What bothers me is that I can get either the  
 GTID list or where the action is happening. Having both in one place should  
 not be difficult (a I_S or P_S view, for example).  

Right, I think I see.  

So it could be useful to have in I_S or P_S a table that gives for each worker  
stuff like:  

- The GTID of the event group currently executing, or NULL if idle  
- Status (executing, waiting for prior transaction before starting, waiting  
for prior transaction before committing, stuff like that)  
- The current database (USE xxx)  
- Currently executing query  
- Whether this worker was scheduled in parallel with something else, and if  
so, why that was possible (group commit id or replication domain id)  
- Total number of events and event groups executed by worker thread  
- Possibly time spent idle, time spent executing, and time spent waiting for  
prior transactions to commit (if such times can be obtained without too  
high performance overhead).  

That is a nice input, thanks. Something like this would definitely be nice to  
have. I've put this into Jira:  

https://mariadb.atlassian.net/browse/MDEV-5296  

On top of this, I think we could also add some statistics for the SQL  
thread. Like, how often did it have to wait for a worker to become free to  
schedule a potentially parallel transaction (might indicate a too-low  
--slave-parallel-threads). And how many transactions could / could not be  
scheduled in parallel (could indicate the need to tune the master to provide  
more parallelism in the binlog).  

What would be really nice is to have two numbers in SHOW SLAVE STATUS. One is  
the wall-clock time since START SLAVE. The other is the total time spent by  
workers on executing events for this master connection (excluding waiting for  
other replication threads). The ratio between these two numbers would  
immediately give an indication of how effective parallel replication is at  
utilising the machine, same as the cpu% numbers in the `top` Linux utility.  

 I see in mysql.gtid_slave_pos that the number of rows grows and shrinks  
 while the workers are replicating, but this does not give me any useful  
 information on the effectiveness of the operations  

Agree, that is mostly useless information. All except the rows with maximal  
sub_id are justa garbage rows, the deletion of which has been delayed to avoid  
lock contention between transactions.  

 I hope to see better integration and more tools in the future. As it is now,  
 parallel replication seems to be a very powerful engine with brake and  
 steering wheel but with a blackened out dashboard. Makes for some  
 adventurous driving!  

Hehe, a very nice and succinct description ;-)  

Thanks,  

- Kristian.  
___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] parallel replication monitoring

2013-11-15 Thread Jean Weisbuch

Hi,

I think it makes sense to have it this way as its consistent with other 
options in MySQL/MariaDB which makes it easier to check/bench the 
overhead of a function.
Its also usually noted on the documentation that using this parameter is 
not recommanded on a normal usecase (eg. 
https://mariadb.com/kb/en/segmented-key-cache/).



Regards.

On 15/11/2013 11:31, Giuseppe Maxia wrote:

Kristian,
Thanks for taking action with this issue. Another nice touch would be
having the default value of slave_parallel_threads=1 instead of 0, and
doing the right thing, as explained in this article:
http://datacharmer.blogspot.it/2013/11/parallel-replication-off-by-one.html

Cheers

Giuseppe


___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


[Maria-developers] parallel replication monitoring

2013-11-14 Thread Giuseppe Maxia

What can I use to see the status of parallel replication?
I am looking for:
* how many workers are used;
* in which schema they are working;
* what query they are running.

So far, I only see mysql.gtid_slave_pos and information_schema.processlist.

Anything else available?

Thanks in advance

Giuseppe___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp