Re: [Maria-developers] mdev6027 RLIKE: "." no longer matching new line (default_regex_flags)

2014-04-23 Thread Sergei Golubchik
Hi, Alexander!

On Apr 22, Sergei Golubchik wrote:
> On Apr 17, Alexander Barkov wrote:
> > Hello Serg,
> > 
> > Please review a patch implementing a new system variable
> > default_regex_flags, to address the remaining incompatibilities
> > between PCRE and the old regex library.

Ah, something else.
Please, make sure this new variable is documented.

Regards,
Sergei

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] mdev6027 RLIKE: "." no longer matching new line (default_regex_flags)

2014-04-23 Thread Alexander Barkov

Hi Jan, Sergei,


On 04/23/2014 11:25 AM, Sergei Golubchik wrote:
> Hi, Alexander!
>
> On Apr 22, Sergei Golubchik wrote:
> > On Apr 17, Alexander Barkov wrote:
> > > Hello Serg,
> > >
> > > Please review a patch implementing a new system variable
> > > default_regex_flags, to address the remaining incompatibilities
> > > between PCRE and the old regex library.
>
> Ah, something else.
> Please, make sure this new variable is documented.

Yeah, I just finished writing a description for Jan :)

> Regards,
> Sergei

Jan, can you please update the manual?




A new system variable default_regex_flags was added,
to set the default behaviour of the PCRE regex engine.

Scope: global, session.

Affected functions and operators: RLIKE, REGEXP_SUBSTR, REGEXP_REPLACE.

Possible values: any combination of zero or more of the following 
options, comma separated:


DOTALL
DUPNAMES
EXTENDED
EXTRA
MULTILINE
UNGREEDY

Default value: empty (all options are off).

Example:

SET default_regex_flags='';
SET default_regex_flags='DOTALL';
SET default_regex_flags='DOTALL,DUPNAMES,EXTENDED,EXTRA,MULTILINE,UNGREEDY';


The meaning of the values:

Value      Pattern equivalent  Meaning
---------  ------------------  -------------------------------------------------------
DOTALL     (?s)                . matches anything including NL
DUPNAMES   (?J)                Allow duplicate names for subpatterns
EXTENDED   (?x)                Ignore white space and # comments
EXTRA      (?X)                Extra features (e.g. error on unknown escape character)
MULTILINE  (?m)                ^ and $ match newlines within data
UNGREEDY   (?U)                Invert greediness of quantifiers

See here for the list of the equivalent PCRE options:
https://mariadb.com/kb/en/pcre-regular-expressions/#option-setting
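
The correspondence in the table can be written down as a small lookup. This is only a minimal Python sketch of the equivalence: the helper name inline_prefix is hypothetical, and nothing in this thread says the server applies the flags by prepending text to the pattern.

```python
# Flag name -> PCRE inline option letter, copied from the table above.
PCRE_INLINE = {
    "DOTALL": "s",
    "DUPNAMES": "J",
    "EXTENDED": "x",
    "EXTRA": "X",
    "MULTILINE": "m",
    "UNGREEDY": "U",
}


def inline_prefix(flags: str) -> str:
    """Hypothetical helper: 'DOTALL,MULTILINE' -> '(?sm)', '' -> ''."""
    names = [name.strip().upper() for name in flags.split(",") if name.strip()]
    unknown = [name for name in names if name not in PCRE_INLINE]
    if unknown:
        raise ValueError("unknown flag(s): " + ", ".join(unknown))
    letters = "".join(PCRE_INLINE[name] for name in names)
    return "(?" + letters + ")" if letters else ""


print(inline_prefix("DOTALL,MULTILINE"))
```

In other words, a session with default_regex_flags='DOTALL,MULTILINE' should behave as if each pattern started with the option group this helper returns.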


Examples:

# The default behaviour (multiline match is off)

mysql> SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     0 |
+-----------------------+

# Enabling the multiline option using the PCRE option syntax:

mysql> SELECT 'a\nb\nc' RLIKE '(?m)^b$';
+---------------------------+
| 'a\nb\nc' RLIKE '(?m)^b$' |
+---------------------------+
|                         1 |
+---------------------------+


# Enabling the multiline option using default_regex_flags

mysql> SET default_regex_flags='MULTILINE';
mysql> SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     1 |
+-----------------------+
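
The three results above can be reproduced outside the server. A rough Python sketch, assuming Python's re module is an acceptable stand-in for PCRE on this particular pattern; rlike is a hypothetical helper, and re.search is used because RLIKE matches anywhere in the string:

```python
import re


def rlike(subject: str, pattern: str, default_flags: int = 0) -> int:
    """Hypothetical stand-in for RLIKE: 1 if pattern matches anywhere, else 0."""
    return 1 if re.search(pattern, subject, default_flags) else 0


subject = "a\nb\nc"

default_result = rlike(subject, "^b$")                # multiline off
inline_result = rlike(subject, "(?m)^b$")             # inline option syntax
session_result = rlike(subject, "^b$", re.MULTILINE)  # like default_regex_flags='MULTILINE'
print(default_result, inline_result, session_result)
```

The third call passes the flag separately from the pattern, which is the role default_regex_flags plays for a session.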


The goal of the new variable is to simplify writing PCRE patterns,
and to provide a way to configure the default behaviour of the PCRE
engine so that it is more compatible with the old regex engine used in
MariaDB-5.5 and MySQL.

Note that, unlike the old regex engine, PCRE does not match a newline
character with dot (.) by default. Those who need better compatibility
with the old regex engine might consider adding this option to
/etc/my.cnf:

[mysqld]
default-regex-flags=DOTALL
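
What the DOTALL setting changes can be seen in a small Python sketch. Python's re is not PCRE, but its default treatment of "." and its re.DOTALL flag behave the same way for this case; the sample string is made up for illustration.

```python
import re

text = "first line\nsecond line"

# Default: '.' stops at the newline, so the pattern cannot span both lines.
without_dotall = re.search("first.*second", text)

# With DOTALL (the '(?s)' option), '.' matches the newline as well.
with_dotall = re.search("first.*second", text, re.DOTALL)

print(without_dotall is None, with_dotall is not None)
```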



Thanks.



Re: [Maria-developers] mdev6027 RLIKE: "." no longer matching new line (default_regex_flags)

2014-04-23 Thread Ian Gilfillan

On 23/04/2014 09:28, Alexander Barkov wrote:
> Jan, can you please update the manual?


Sure, will do. Thanks for the detailed writeup. Which release is this 
variable scheduled to be included in?


It's Ian, not Jan btw, although all comes from one and the same root 
name originally ;)


ian



Re: [Maria-developers] [GSoC] Accepted student ready to work : )

2014-04-23 Thread Pablo Estrada
Hello Sergei,

> as for (2) - I'd say the most important part is to figure out how to
> select the subset of tests to run, not how to integrate it in buildbot.
>

Definitely! I was just reading into buildbot to spend time while I got
access to the data! : )


> I have a bzr repository with the test data (text files, basically
> database dumps) and a pretty hairy perl script that processes them and
> outputs result files. Then a gnuplot script that reads these result
> files and shows graphs.



> There were 9 script runs, that is 9 result files and 9 graphs.
> I'll attach them too.
>
> The model I used was:
>
> 1. For every new revision buildbot runs all tests on all platforms
>
> 2. But they are sorted specially. No matter in what order tests are
> executed, this doesn't change the result - the set of failed tests.  So,
> my script was emulating that, changing the order in which tests were
> run. With the goal to have failures happen as early as possible.
>
> That's how to interpret the graphs. For example, on v5 you can see that
> more than 90% of all test failures happen within the first 20K tests
>

Yup, I understand what you mean here... I can grasp the concepts, but I am
still having some trouble understanding some of the terms that you use in
the graphs. I can see that recall is related to the percentage of failures
that have been encountered, and I guess that cutoff has to do with how many
files you analyze before starting reordering... Also, I can see by your
comments a bit of your thought process, but I have a few questions.

I also have some questions regarding what you did with some of the data, to
get some ideas on how to do it myself. Also regarding how the buildbot
organizes builds and how they correspond to code changes and to test runs.

If it's not too inconvenient, do you think we could set up a Google Hangout
or a Skype call on Monday to go over a few questions quickly?

If you are busy, we can do it through email or on IRC. I can also take a
dive into the bzr repo, and look at what you did, as well as read up all
the info regarding the data; but it would be real helpful if you could lend
me a hand : )

Thank you very much.
Best
Pablo


Re: [Maria-developers] mdev6027 RLIKE: "." no longer matching new line (default_regex_flags)

2014-04-23 Thread Alexander Barkov



On 04/23/2014 11:44 AM, Ian Gilfillan wrote:
> On 23/04/2014 09:28, Alexander Barkov wrote:
> > Jan, can you please update the manual?
>
> Sure, will do. Thanks for the detailed writeup. Which release is this
> variable scheduled to be included in?
>
> It's Ian, not Jan btw, although all comes from one and the same root
> name originally ;)

Oops. Sorry for the typo :)

> ian



Re: [Maria-developers] [GSoC] Accepted student ready to work : )

2014-04-23 Thread Elena Stepanova

Hi Pablo,

On 4/23/2014 12:07 PM, Pablo Estrada wrote:
> Hello Sergei,
>
> Yup, I understand what you mean here... I can grasp the concepts, but I
> am still having some trouble understanding some of the terms that you
> use in the graphs. I can see that recall is related to the percentage of
> failures that have been encountered, and I guess that cutoff has to do
> with how many files you analyze before starting reordering... Also, I
> can see by your comments a bit of your thought process, but I have a few
> questions.
>
> I also have some questions regarding what you did with some of the data,
> to get some ideas on how to do it myself. Also regarding how the
> buildbot organizes builds and how they correspond to code changes and to
> test runs.


The buildbot dump that I sent to you will give you an idea of how 
buildbot organizes builds etc. There is also a buildbot master 
configuration file in LP 
(http://bazaar.launchpad.net/~maria-captains/mariadb-tools/trunk/view/head:/buildbot/maria-master.cfg) 
which might be useful.




> If it's not too inconvenient, do you think we could set up a Google
> Hangout or a Skype call on Monday to go over a few questions quickly?
>
> If you are busy, we can do it through email or on IRC. I can also take a
> dive into the bzr repo, and look at what you did, as well as read up all
> the info regarding the data; but it would be real helpful if you could
> lend me a hand : )


I suggest doing it in a different order. Please do take a dive into the
bzr repo, look at what Sergei did, and read all the info.
I hope that after you have done that there won't be so many questions left,
so a voice session won't be necessary and an email or two will cover the
rest. It will be more efficient in the end, it will help us keep track of
the information and ideas, and besides, Sergei is currently on vacation
anyway.


Regards,
Elena



> Thank you very much.
> Best
> Pablo




Re: [Maria-developers] a Query_log_event charset bug in parallel replication

2014-04-23 Thread Kristian Nielsen
Hi nanyi607rao, sorry for the delayed response, in part due to the Easter 
holidays.

"nanyi607rao"  writes:

> If the character set changes between Query_log_events, worker threads may
> apply them with the wrong character set. The code leading to this problem is
> in Query_log_event::do_apply_event:
>   if (charset_inited)
>   {
>     if (rli->cached_charset_compare(charset))
>     {
>       /* Verify that we support the charsets found in the event. */
>       if (!(thd->variables.character_set_client=
>             get_charset(uint2korr(charset), MYF(MY_WME))) ||
>           !(thd->variables.collation_connection=
>             get_charset(uint2korr(charset+2), MYF(MY_WME))) ||
>           !(thd->variables.collation_server=
>             get_charset(uint2korr(charset+4), MYF(MY_WME))))
>       {
>
> There is a charset[6] in rli, which caches the last Query_log_event's charset
> in serial replication. But in parallel replication this leads to mistakes,
> because every worker thread can read and set rli->charset[6], so
> rli->charset[6] is not any particular worker thread's last Query_log_event
> charset, yet it affects every worker thread's
> thd->variables.character_set_* setting.
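
The problem described above can be replayed without any real threads. A minimal Python sketch, assuming cached_charset_compare reports a change (and updates the cache) only when the event's charset differs from the cached one; the class names and the "latin1" starting value are hypothetical, and the two calls stand for one possible interleaving of two workers.

```python
class CharsetCache:
    """Models the cached charset: remembers the charset of the last event."""

    def __init__(self):
        self.cached = None

    def compare(self, charset):
        """Return True and update the cache if charset differs from it."""
        if self.cached != charset:
            self.cached = charset
            return True
        return False


class Worker:
    """Models one worker thread with its own session charset."""

    def __init__(self, cache, session_charset="latin1"):
        self.cache = cache
        self.session_charset = session_charset

    def apply_event(self, event_charset):
        # The session charset is only switched when the cache reports a change.
        if self.cache.compare(event_charset):
            self.session_charset = event_charset
        return self.session_charset


shared = CharsetCache()  # one cache for all workers, as in rli
worker1, worker2 = Worker(shared), Worker(shared)

used_by_worker1 = worker1.apply_event("utf8")  # cache miss: session switched
used_by_worker2 = worker2.apply_event("utf8")  # cache hit: session left untouched
print(used_by_worker1, used_by_worker2)
```

The second worker applies a utf8 event with its session still set to latin1, because the shared cache only reflects what the first worker did.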

Right, I see. Probably the cached_charset just needs to be moved into the
rpl_group_info struct.

I have filed a bug for this that I will fix:

https://mariadb.atlassian.net/browse/MDEV-6156

Thanks a lot for finding this, as usual your input on parallel replication is
most useful, keep it coming!

 - Kristian.



Re: [Maria-developers] bb01.mariadb.net

2014-04-23 Thread Kristian Nielsen
Daniel Bartholomew  writes:

> Serg and Knielsen: I've got the box mostly configured I think, but
> could be (very) wrong, so I wanted to give you a chance to make sure
> everything looks good on it before I go ahead and add it to the

It looks like it already mostly works.

I noticed an error in a build though:

https://buildbot.askmonty.org/buildbot/builders/kvm-tarbake-jaunty-x86/builds/5603

The "kernel" log has this:

---
/dev/sda1 contains a file system with errors, check forced.
/dev/sda1: Entry 'pack-names' in /home/buildbot/.bzr/repository (3415) has an 
incorrect filetype (was 1, should be 2).


/dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
---

One likely reason is a failed rsync step. This is the tarbake build; unlike
most other virtual machines, this one is actually modified during a build. So
it's quite possible that the rsync would go wrong if a tarbake build happened
to run in the meantime.

So just re-sync that particular VM, and that might solve the problem.

Thanks for setting this up!

 - Kristian.



Re: [Maria-developers] bb01.mariadb.net

2014-04-23 Thread Daniel Bartholomew
On Wed, Apr 23, 2014 at 10:00 AM, Kristian Nielsen
 wrote:
> Daniel Bartholomew  writes:
>
>> Serg and Knielsen: I've got the box mostly configured I think, but
>> could be (very) wrong, so I wanted to give you a chance to make sure
>> everything looks good on it before I go ahead and add it to the
>
> It looks like it already mostly works.
>
> I noticed an error in a build though:
>
> https://buildbot.askmonty.org/buildbot/builders/kvm-tarbake-jaunty-x86/builds/5603
>
>
>
> The "kernel" log has this:
>
> ---
> /dev/sda1 contains a file system with errors, check forced.
> /dev/sda1: Entry 'pack-names' in /home/buildbot/.bzr/repository (3415) has an 
> incorrect filetype (was 1, should be 2).
>
>
> /dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
> (i.e., without -a or -p options)
> ---
>
> One likely reason is a failed rsync step. This is the tarbake build; unlike
> most other virtual machines, this one is actually modified during a build. So
> it's quite possible that the rsync would go wrong if a tarbake build happened
> to run in the meantime.
>
> So just re-sync that particular VM, and that might solve the problem.
>
> Thanks for setting this up!

Yes. The issue was that I had pre-synced the most popular VMs over to
bb01, and then yesterday I started a sync of the remaining,
less-used VMs. Unfortunately it picked up the tarbake VM, and a new
build started while it was being synced, like you guessed. I've shut down
bb01 while I re-sync that VM. Then I'll re-enable it (and then restart
the other rsync with the tarbake excluded).

Thanks!

-- 
Daniel Bartholomew, MariaDB Release Manager
MariaDB | http://mariadb.com



Re: [Maria-developers] a Query_log_event charset bug in parallel replication

2014-04-23 Thread Kristian Nielsen
Kristian Nielsen  writes:

> I have filed a bug for this that I will fix:
>
> https://mariadb.atlassian.net/browse/MDEV-6156

I've now pushed a fix for this bug to 10.0.

Thanks again for your help!

 - Kristian.



Re: [Maria-developers] a Query_log_event charset bug in parallel replication

2014-04-23 Thread nanyi607rao
hi, kristian

> Right, I see. Probably the cached_charset just needs to be moved into the
> rpl_group_info struct.

> I have filed a bug for this that I will fix:

>https://mariadb.atlassian.net/browse/MDEV-6156

I'm afraid there are still some cases that would go wrong if cached_charset
is moved into the rpl_group_info struct.

A worker thread can keep many rpl_group_info structs in its rgi_free_list,
and those rgis can be reused many times.
Consider this case:
a worker thread executes three transactions: trans1, trans2 and trans3.

trans1's charset is utf8 and it uses rgi1
trans2's charset is latin1 and it uses rgi2
trans3's charset is utf8 and it uses rgi1

So when the worker thread starts to execute trans3, rgi1->cached_charset == utf8,
but thd->variables.character_set_* are latin1. They would not be changed
to utf8, because rgi->cached_charset_compare returns 0.
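
The trans1/trans2/trans3 case can be replayed in a few lines. A minimal Python sketch, assuming the cache check switches the session charset only when the rgi's cached value differs from the event's charset; the Rgi class, the session dict and apply_transaction are hypothetical stand-ins for rpl_group_info, thd->variables and the apply step.

```python
class Rgi:
    """Models a reusable rpl_group_info with its own cached charset."""

    def __init__(self):
        self.cached_charset = None


def apply_transaction(session, rgi, charset):
    """Switch the session charset only if rgi's cache says it changed."""
    if rgi.cached_charset != charset:
        rgi.cached_charset = charset
        session["charset"] = charset
    return session["charset"]


session = {"charset": None}  # stands for the worker thread's thd->variables
rgi1, rgi2 = Rgi(), Rgi()

results = [
    apply_transaction(session, rgi1, "utf8"),    # trans1
    apply_transaction(session, rgi2, "latin1"),  # trans2
    apply_transaction(session, rgi1, "utf8"),    # trans3: rgi1 cache still says utf8
]
print(results)
```

The third entry shows trans3 running with the latin1 session left behind by trans2, since the cache in rgi1 and the thread's session state have drifted apart.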

Thanks


Re: [Maria-developers] a Query_log_event charset bug in parallel replication

2014-04-23 Thread Kristian Nielsen
"nanyi607rao"  writes:

>> I have filed a bug for this that I will fix:
>
>>https://mariadb.atlassian.net/browse/MDEV-6156
>
> I'm afraid there are still some cases that would go wrong if cached_charset
> is moved into the rpl_group_info struct.

Yes, you are absolutely right, the patch I pushed for this is completely
wrong, just as you explained :-(

Sorry about this, I will try to come up with a better fix.

Thanks,

 - Kristian.
