Re: [Wikitech-l] Using MySQL as a NoSQL
Hi!

> I have recently encountered this text in which the author claims very
> high MySQL speedups for simple queries

It is not that he speeds up simple queries (you'd notice that maybe if you used InfiniBand, and even then it wouldn't matter much :) He just avoided hitting some expensive critical sections that make scaling on multicore systems problematic.

> It looks interesting. There are some places where MediaWiki could take
> that shortcut if available.

It wouldn't be a shortcut if you had to establish another database connection besides the existing one.

> I wonder if we have such a CPU bottleneck, though.

No, not really. Our average DB response time is 1.3 ms, measured on the client (do note, this is not the median and is affected by heavy queries more).

Domas
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Using MySQL as a NoSQL
Hi!

> A: It's easy to get fast results if you don't care about your reads
> being atomic (*), and I find it hard to believe they've managed to get
> atomic reads without going through MySQL.

MySQL's upper layers know nothing much about transactions; it is all engine-specific, with BEGIN and COMMIT processing deferred to the table handlers. It would be incredibly easy for them to implement repeatable-read snapshots :) (if that's what you mean by an atomic read)

> (*) Among other possibilities, just use MyISAM.

How is that applicable to any discussion?

Domas
Re: [Wikitech-l] Using MySQL as a NoSQL
On 12/24/2010 10:01 AM, Domas Mituzas wrote:
> > I wonder if we have such a CPU bottleneck, though.
>
> No, not really. Our average (do note, this isn't the median and is
> affected by heavy queries more) DB response time is 1.3 ms (measuring
> on the client).

This could also reduce memory usage by not using memcached (as often), which, I understand, is a bigger problem.
Re: [Wikitech-l] Using MySQL as a NoSQL
Hi!

> This could also reduce memory usage by not using memcached (as often)
> which, I understand, is a bigger problem.

No, it is not. First of all, our memcached and database access times are not that far apart: 0.7 ms vs 1.3 ms (again, the memcached figure is a static response time, whereas the database average is impacted by calculations). On the other hand, we don't store in memcached what is stored in the database, and we don't store in the database what is stored in memcached. Think of these as two separate systems, not as complementing each other too much. We use memcached to offload the application cluster, not the database cluster. And the database cluster already has over a terabyte of RAM (replicas and whatnot), whereas our memcached lives in a puny 158 GB arena.

I described some of the fundamental differences in how we use memcached in http://dom.as/uc/workbook2007.pdf, pages 11-13. Nothing much has changed since then.

Domas
Re: [Wikitech-l] Using MySQL as a NoSQL
Hi,

-----Original Message-----
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Domas Mituzas
Sent: 24 December 2010 09:09
To: Wikimedia developers
Subject: Re: [Wikitech-l] Using MySQL as a NoSQL

> Hi!
>
> > A: It's easy to get fast results if you don't care about your reads
> > being atomic (*), and I find it hard to believe they've managed to
> > get atomic reads without going through MySQL.
>
> MySQL upper layers know nothing much about transactions, it is all
> engine-specific - BEGIN and COMMIT processing is deferred to table
> handlers. It would be incredibly easy for them to implement repeatable
> read snapshots :) (if that's what you mean by atomic read)

It seems from my tinkering that MySQL query cache handling is circumvented by HandlerSocket. So if you update/insert/delete via HandlerSocket and then query via SQL, you're not guaranteed to see the changes unless you use SQL_NO_CACHE.

> > (*) Among other possibilities, just use MyISAM.
>
> How is that applicable to any discussion?
>
> Domas

Jared
Re: [Wikitech-l] Using MySQL as a NoSQL
Hi!

> It seems from my tinkering that MySQL query cache handling is
> circumvented by HandlerSocket.

On busy systems (I assume we are talking about busy systems, since the discussion is about HS) the query cache is usually eliminated anyway: either by compiling it out, or by patching the code not to take the qcache mutexes unless it really, really is enabled. In the worst case, it is simply disabled. :)

> So if you update/insert/delete via HandlerSocket, then query via SQL
> you're not guaranteed to see the changes unless you use SQL_NO_CACHE.

You are probably right. Again, nobody cares about the qcache at those performance boundaries.

Domas
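As an aside, the SQL_NO_CACHE workaround mentioned above is just a hint added after the SELECT keyword. The sketch below is purely illustrative (the helper name and the naive string handling are mine, not anything from MediaWiki or HandlerSocket); it shows where the hint goes so a SQL read skips the query cache and sees rows written through HandlerSocket:

```python
def with_no_cache(select_sql: str) -> str:
    """Insert MySQL's SQL_NO_CACHE hint into a SELECT statement.

    HandlerSocket writes bypass the query cache, so a subsequent SQL
    read may return a stale cached result unless it carries this hint.
    Naive sketch: assumes the statement begins with the SELECT keyword
    and does not already have a cache hint.
    """
    head, sep, tail = select_sql.partition("SELECT")
    if not sep:
        raise ValueError("not a SELECT statement")
    return head + "SELECT SQL_NO_CACHE" + tail


print(with_no_cache("SELECT page_title FROM page WHERE page_id = 1"))
# SELECT SQL_NO_CACHE page_title FROM page WHERE page_id = 1
```

(Or, as Domas notes, on a busy server the query cache is likely disabled outright, in which case none of this matters.)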
Re: [Wikitech-l] Using MySQL as a NoSQL
On Fri, Dec 24, 2010 at 4:08 AM, Domas Mituzas <midom.li...@gmail.com> wrote:
> Hi!
>
> > A: It's easy to get fast results if you don't care about your reads
> > being atomic (*), and I find it hard to believe they've managed to
> > get atomic reads without going through MySQL.
>
> MySQL upper layers know nothing much about transactions, it is all
> engine-specific - BEGIN and COMMIT processing is deferred to table
> handlers. It would be incredibly easy for them to implement repeatable
> read snapshots :) (if that's what you mean by atomic read)

I suppose it's possible in theory, but in any case, it's not what they're doing. They *are* going through MySQL, via the HandlerSocket plugin. I wonder if they'd get much different performance by just using prepared statements and read-committed isolation, with the transactions spanning multiple requests. The tables would only get locked once per transaction, right? Or do I just have no idea what I'm talking about?

> > (*) Among other possibilities, just use MyISAM.
>
> How is that applicable to any discussion?

It was an example of a way to get fast results if you don't care about your reads being atomic.
Re: [Wikitech-l] Using MySQL as a NoSQL
Domas Mituzas wrote:
> > It looks interesting. There are some places where mediawiki could
> > take that shortcut if available.
>
> It wouldn't be a shortcut if you had to establish another database
> connection besides the existing one.

I was assuming usage of pfsockopen(), of course.
Re: [Wikitech-l] Using MySQL as a NoSQL
-----Original Message-----
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Domas Mituzas
Sent: 24 December 2010 13:42
To: Wikimedia developers
Subject: Re: [Wikitech-l] Using MySQL as a NoSQL

> Hi!
>
> > It seems from my tinkering that MySQL query cache handling is
> > circumvented by HandlerSocket.
>
> On busy systems (I assume we talk about busy systems, as the
> discussion is about HS) the query cache is usually eliminated anyway.
> Either by compiling it out, or by patching the code not to use qcache
> mutexes unless it really really is enabled. In the worst case, it is
> simply disabled. :)
>
> > So if you update/insert/delete via HandlerSocket, then query via SQL
> > you're not guaranteed to see the changes unless you use SQL_NO_CACHE.
>
> You are probably right. Again, nobody cares about qcache at those
> performance boundaries.
>
> Domas

Ah, interesting. The only reason I took a look at it was that you don't have to faff about with encoding/escaping values* the way you have to with SQL. SQL injection vulnerabilities don't exist.

* And the protocol handles binary values, which you normally have to faff about getting in and out of MySQL with the various PHP APIs.

It does seem a bit specialised; it could work as a persistent cache, maybe as a session handler.

Jared
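The escaping-free, binary-safe property being described comes from HandlerSocket's wire format: requests are tab-separated tokens terminated by a newline, and low bytes inside a value are escaped at the framing layer rather than quoted SQL-style. The sketch below builds request lines per my reading of the plugin's protocol description (0x00-0x0F escaped as 0x01 followed by the byte shifted up by 0x40; the example table and column names are made up); treat the details as assumptions, not a reference implementation:

```python
def hs_escape(token: bytes) -> bytes:
    """Escape one HandlerSocket token: bytes 0x00-0x0F become 0x01
    followed by (byte + 0x40), so a tab or newline inside a binary
    value can never break the framing -- which is why no SQL-style
    quoting or escaping is needed."""
    out = bytearray()
    for b in token:
        if b <= 0x0F:
            out += bytes((0x01, b + 0x40))
        else:
            out.append(b)
    return bytes(out)


def hs_request(*tokens: bytes) -> bytes:
    """Join escaped tokens with tabs and terminate with a newline."""
    return b"\t".join(hs_escape(t) for t in tokens) + b"\n"


# Open index id 0 on test.page via the PRIMARY key, naming the columns
# we want back (table/column names here are hypothetical):
open_req = hs_request(b"P", b"0", b"test", b"page", b"PRIMARY",
                      b"page_id,page_title")

# Then fetch the row whose key equals 42 (1 key value, limit 1, offset 0):
find_req = hs_request(b"0", b"=", b"1", b"42", b"1", b"0")
```

Because values are framed rather than interpolated into a query string, binary blobs go over the wire untouched, which is the property pointed at above.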
Re: [Wikitech-l] Using MySQL as a NoSQL
-----Original Message-----
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Jared Williams
Sent: 24 December 2010 16:18
To: 'Wikimedia developers'
Subject: Re: [Wikitech-l] Using MySQL as a NoSQL

> [snip]
>
> Does seem a bit specialised, could have a persistent cache, maybe as a
> session handler.

Maybe a session handler, even.

Jared
Re: [Wikitech-l] Using MySQL as a NoSQL
Hi!

> I was assuming usage of pfsockopen(), of course.

Though the protocol is slightly cheaper, you still have to do the TCP handshake :)

Domas
Re: [Wikitech-l] Alternative to opendir() functions?
In the HISTORY file:

* glob() is horribly unreliable and doesn't work on some systems,
  including free.fr shared hosting. No longer using it in
  Language::getLanguageNames()

-X!

On Dec 24, 2010, at 12:24 PM, Brion Vibber wrote:
> Glob works too I think.
>
> -- brion
>
> On Dec 23, 2010 12:06 PM, Ilmari Karonen <nos...@vyznev.net> wrote:
> > On 12/22/2010 12:16 AM, Platonides wrote:
> > > We are only using opendir for getting a full directory list.
> >
> > That's a good point. Perhaps what we need is simply a utility method
> > to list all files in a directory. In fact, I just realized that PHP
> > already has one: it's called scandir(). Its only flaw IMO is that it
> > doesn't automatically skip the current and parent dir entries, but
> > you could always do something like
> >
> >   $files = array_diff( scandir( $dir ), array( '.', '..' ) );
> >
> > to accomplish that cleanly (or use preg_grep() to remove all dotfiles
> > if you prefer).
> >
> > -- Ilmari Karonen
Re: [Wikitech-l] [Xmldatadumps-l] dataset1, xml dumps
The new host Dataset2 is now up and running and serving XML dumps. Those of you paying attention to DNS entries should see the change within the hour. We are not generating new dumps yet but expect to do so soon.

Ariel
Re: [Wikitech-l] [Xmldatadumps-l] dataset1, xml dumps
Hi,

That is great news. Thank you for all the hard work you have done on this, and most of all Seasons Greetings, Merry Christmas, and Happy New Year! :)

best regards,
Jamie

----- Original Message -----
From: Ariel T. Glenn <ar...@wikimedia.org>
Date: Friday, December 24, 2010 10:42 am
Subject: Re: [Xmldatadumps-l] [Wikitech-l] dataset1, xml dumps
To: Wikimedia developers <wikitech-l@lists.wikimedia.org>
Cc: xmldatadump...@lists.wikimedia.org

> The new host Dataset2 is now up and running and serving XML dumps.
> Those of you paying attention to DNS entries should see the change
> within the hour. We are not generating new dumps yet but expect to do
> so soon.
>
> Ariel
>
> _______________________________________________
> Xmldatadumps-l mailing list
> xmldatadump...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l