Re: [Maria-developers] Patch for parallel replication DBUG_ASSERT check on DROP TABLE et al
Michael Widenius writes:

> Do you happen to know of any failures caused by the patch?

As I mentioned on IRC, the assertion triggers during rpl_mdev6020, for example here:

https://buildbot.askmonty.org/buildbot/builders/work-amd64-valgrind/builds/9199/steps/test/logs/stdio

As we discussed, one possible reason is a race, because did_mark_start_commit is set only after waking up the next thread.

 - Kristian.

___
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp
Re: [Maria-developers] Patch for parallel replication DBUG_ASSERT check on DROP TABLE et al
Hi!

On Thu, Aug 18, 2016 at 12:24 PM, Kristian Nielsen wrote:
> Monty,
>
> Apparently you pushed this patch into 10.0, even though I explained that it
> is incorrect, and why. That's not cool, and you can even see it failing in
> Buildbot now.

I pushed the patch because I didn't see (probably missed) a review from you for more than a day. I was also going away for a few days, and I wanted Elena to have my code in 10.0 while she was testing things that could trigger the assert.

As this was a DBUG_ASSERT and could not cause a problem for anyone in production, I didn't think it was critical to wait for the review before pushing.

> Can you please fix it ASAP?

Of course. Still, I don't know of any case in Buildbot where the patch has caused any issues. I checked with Elena and she couldn't find anything either that she could attribute to the patch.

Do you happen to know of any failures caused by the patch?

Regards,
Monty
Re: [Maria-developers] Sachin weekly report
Hello Sergei!

I am stuck on one problem. Consider a table t1 (a blob, b blob, unique(a,b)). A query like

    select * from t1 where a=12 and b=23;

works, but a query like

    select * from t1 where (a=12 or a=45) and (b=23 or b=45);

does not. Update and delete using the long unique index also do not work; even a simple

    delete from t1 where a=1 and b=3;

fails. The reason is that these queries use the test_quick_select function, which does not recognize the hash index: in get_mm_parts it always returns false, because it compares db_row_hash_1 to a and b. I solved this problem, but now I have these questions:

1. How do I make sure that both columns a and b appear in the WHERE clause? Currently I crawl through tree->key[i]->next_key_part until next_key_part is NULL. This works for a simple case like a=1 and b=2, but I am not sure how to handle conditions like ((a=1 or a=34) and (b=34 or b=33)) or (b=34) — in the part before the final OR it is okay to use the hash, but for the trailing b=34 we must not use the hash index, and I do not know how to detect this.

2. When is the SEL_TREE evaluated?

3. Where should I evaluate the hash? I can compute it, but the main problem is that I need to preserve the original conditions like a=1 and b=3, because the hash does not guarantee that equal hashes mean equal values.

4. There is one more problem in test_quick_select. There is this code:

    if ((range_trp= get_key_scans_params(&param, tree, FALSE, TRUE, best_read_time)))

   In the case of a hash key, we must calculate the hash before reaching this function; after that it will work. But I think at this point the tree is not yet evaluated — I do not know.

Please review the branch https://github.com/SachinSetiya/server/tree/unique_index_where_up_de and tell me what I should do. Please also review the original branch unique_index_sachin — I added more test cases, and reinsert works too.

On Sun, Aug 14, 2016 at 1:46 PM, Sergei Golubchik wrote:
>
> Hi, Sachin!
>
> I'm reviewing.
> Here I only comment on your reply to the previous review.
> > >> +longlong Item_func_hash::val_int()
> > >> +{
> > >> +  unsigned_flag= true;
> > >> +  ulong nr1= 1, nr2= 4;
> > >> +  CHARSET_INFO *cs;
> > >> +  for (uint i= 0; i < arg_count; i++)
> > >> +  {
> > >> +    String *str= args[i]->val_str();
> > >> +    if (args[i]->null_value)
> > >> +    {
> > >> +      null_value= 1;
> > >> +      return 0;
> > >> +    }
> > >> +    cs= str->charset();
> > >> +    uchar l[4];
> > >> +    int4store(l, str->length());
> > >> +    cs->coll->hash_sort(cs, l, sizeof(l), &nr1, &nr2);
> > > looks good, but use my_charset_binary for the length.
> > did not get it :(
>
> You use
>
>     cs->coll->hash_sort(cs, l, sizeof(l), &nr1, &nr2);
>
> to sort the length, the byte value of str->length().
> This is binary data, you should have
>
>     cs= my_charset_binary;

I cannot find any variable named my_charset_binary, but I used my_charset_utf8_bin — is that ok?

> but you have
>
>     cs= str->charset();
>
> for example, it can be latin1_general_ci, and then 'a' and 'A' will be
> considered equal. But 'a' is the string length 97, while 'A' is string
> length 65. String lengths are binary data, they do not consist of
> "letters" or "characters", so you need to use the binary charset for
> lengths, but str->charset() for actual string data.
>
> > >> +    cs->coll->hash_sort(cs, (uchar *)str->ptr(), str->length(), &nr1, &nr2);
>
> ^^^ here cs= str->charset() is correct.
>
> > >> +  }
> > >> +  return (longlong) nr1;
> > >> +}
> > >> +
> > >> +
> > >> diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
> > >> index e614692..4872d20 100644
> > >> --- a/sql/sql_yacc.yy
> > >> +++ b/sql/sql_yacc.yy
> > >>   | COMMENT_SYM TEXT_STRING_sys { Lex->last_field->comment= $2; }
> > >> + | HIDDEN
> > >> +   {
> > >> +     LEX *lex= Lex;
> > > no need to do that ^^^ you can simply write Lex->last_field->field_visibility= ...
> > changed, but in sql_yacc.yy this type of code is used everywhere
>
> Yes :) Many years ago Lex was defined to be current_thd->lex, and on some
> platforms current_thd (which is pthread_getspecific) is expensive, so we
> were trying to avoid using it when possible.
> That's why Lex was saved in a local variable.
>
> But now (also for many years), Lex is YYTHD->lex, where YYTHD is the
> argument of the MYSQLyyparse function, not pthread_getspecific.
> So there is no need to avoid Lex anymore. Nobody bothered to fix the old
> code, though.
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and secur...@mariadb.org
Re: [Maria-developers] Patch for parallel replication DBUG_ASSERT check on DROP TABLE et al
Monty,

Apparently you pushed this patch into 10.0, even though I explained that it is incorrect, and why. That's not cool, and you can even see it failing in Buildbot now.

Can you please fix it ASAP?

 - Kristian.

Kristian Nielsen writes:

> Monty writes:
>
>> Can you please check if this looks correct?
>
>> diff --git a/sql/mdl.cc b/sql/mdl.cc
>> index 28d2006..3b1f9f2 100644
>> --- a/sql/mdl.cc
>> +++ b/sql/mdl.cc
>> @@ -17,6 +17,7 @@
>>  #include "sql_class.h"
>>  #include "debug_sync.h"
>>  #include "sql_array.h"
>> +#include "rpl_rli.h"
>>  #include
>>  #include
>>  #include
>> @@ -442,6 +443,7 @@ class MDL_lock
>>    virtual void notify_conflicting_locks(MDL_context *ctx) = 0;
>>
>>    virtual bitmap_t hog_lock_types_bitmap() const = 0;
>> +  bool check_if_conflicting_replication_locks(MDL_context *ctx);
>
> This should probably be #ifdef NDEBUG, since you're only using it in debug
> builds?
>
>>
>>    /** List of granted tickets for this lock. */
>>    Ticket_list m_granted;
>> @@ -2290,6 +2292,44 @@ void MDL_scoped_lock::notify_conflicting_locks(MDL_context *ctx)
>>    }
>>  }
>>
>> +/**
>> +  Check if there is any conflicting lock that could cause this thread
>> +  to wait for another thread which is not ready to commit.
>> +  This is always an error, as the upper level of parallel replication
>> +  should not allow scheduling of a conflicting DDL until all earlier
>> +  transactions have committed.
>> +
>> +  This function is only called for a slave using parallel replication
>> +  and trying to get an exclusive lock for the table.
>> +*/
>> +
>> +bool MDL_lock::check_if_conflicting_replication_locks(MDL_context *ctx)
>> +{
>> +  Ticket_iterator it(m_granted);
>> +  MDL_ticket *conflicting_ticket;
>> +
>> +  while ((conflicting_ticket= it++))
>> +  {
>> +    if (conflicting_ticket->get_ctx() != ctx)
>> +    {
>> +      MDL_context *conflicting_ctx= conflicting_ticket->get_ctx();
>> +
>> +      /*
>> +        If the conflicting thread is another parallel replication
>> +        thread for the same master and it's not in the commit stage, then
>> +        the current transaction has started too early and something is
>> +        seriously wrong.
>> +      */
>> +      if (conflicting_ctx->get_thd()->rgi_slave &&
>> +          conflicting_ctx->get_thd()->rgi_slave->rli ==
>> +          ctx->get_thd()->rgi_slave->rli &&
>> +          !conflicting_ctx->get_thd()->rgi_slave->did_mark_start_commit)
>> +        return 1;                               // Fatal error
>
> No, this is not correct. For example, the threads might be in different
> domains, or using different multi-source connections. You need a check like
> in thd_report_wait_for(). _If_ T1 and T2 are in the same parallel
> replication domain, _and_ T1 goes before T2, _and_ there is a DDL lock
> conflict but T1 has not started to commit, then it indicates a potential
> problem.
>
> Also, the comments are too vague; "something is seriously wrong" is not a
> good description. What is wrong? Please describe the issue being checked
> more precisely. What is the role of each of the two threads involved? What
> is the situation being checked? Why is it wrong?
>
> Also, use some temporaries for conflicting_ctx->get_thd()->rgi_slave etc., so
> the condition becomes more readable.
>
>> +    }
>> +  }
>> +  return 0;
>> +}
>> +
>>
>>  /**
>>    Acquire one lock, waiting for conflicting locks to go away if needed.
>> @@ -2355,6 +2395,11 @@ MDL_context::acquire_lock(MDL_request *mdl_request, ulong lock_wait_timeout)
>>    if (lock->needs_notification(ticket) && lock_wait_timeout)
>>      lock->notify_conflicting_locks(this);
>>
>> +  DBUG_ASSERT(mdl_request->type != MDL_INTENTION_EXCLUSIVE ||
>> +              !(get_thd()->rgi_slave &&
>> +                get_thd()->rgi_slave->is_parallel_exec &&
>> +                lock->check_if_conflicting_replication_locks(this)));
>> +
>>    mysql_prlock_unlock(&m_rwlock);
>>
>>    will_wait_for(ticket);
>> diff --git a/sql/mdl.h b/sql/mdl.h
>> index c4d792a..13de602 100644
>> --- a/sql/mdl.h
>> +++ b/sql/mdl.h
>> @@ -910,7 +910,6 @@ class MDL_context
>>    */
>>    MDL_wait_for_subgraph *m_waiting_for;
>>  private:
>> -  THD *get_thd() const { return m_owner->get_thd(); }
>>    MDL_ticket *find_ticket(MDL_request *mdl_req,
>>                            enum_mdl_duration *duration);
>>    void release_locks_stored_before(enum_mdl_duration duration, MDL_ticket *sentinel);
>> @@ -919,6 +918,7 @@ class MDL_context
>>                     MDL_ticket **out_ticket);
>>
>>  public:
>> +  THD *get_thd() const { return m_owner->get_thd(); }
>>    void find_deadlock();
>>
>>    ulong get_thread_id() const { return thd_get_thread_id(get_thd()); }

 - Kristian.
Re: [Maria-developers] [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
Rich Prohaska writes:

> I have tokudb kill query working with parallel replication. Needed to
> attach the mysql thd pointer to each lock request, and use the thd as a key
> into the pending lock requests to find the one being used by a particular
> thd. Once found, this lock request is immediately completed with a lock
> timeout result. Also needed to rearrange the lock wait code so that
> the callback is called when not holding any tokudb locks.

Ok, sounds great!

> This solves 1 of 10 parallel replication stalls with tokudb. The remainder
> require the tokudb lock timer to pop.
>
> The problem is that the parallel replicator attempts to kill the query for
> a mysql slave thread that is waiting for a prior txn to complete. This
> does not have any effect on tokudb. The workaround for now is to use an
> extremely small tokudb lock timeout so that the stalls are short.

Hm, that sounds like a problem with the parallel replication code. It should be normal for a thread to be killed while "waiting for prior txn", and that should work. I will look into this.

Is your latest kill query code on github?

Thanks,

 - Kristian.