[Wikitech-l] war on Cite/{{cite}}
Hello,

I understand the need for Cite; that's why it is still there :) But...

- We format the Cite references list on every 100th request to the backend, yet it takes 8.15% of backend response time. (Thanks to the parser cache; without it, Cite formatting would take 815% of cluster time, though developers should understand I'm not exactly right with this hyperbole ;-)
- When parsing articles like one of today's most popular, [[en:Rod_Blagojevich_corruption_charges]], it takes 20s to produce the page, and 17s of that is spent on the Cite block, mostly executing {{cite}}. That makes every editor wait for ages to get a page displayed, and due to the cache stampede after invalidation it causes considerable stress on the site (look at the numbers mentioned above).
- This 8% is in real time, which includes waiting for search, databases, and simply CPU contention, which we end up having today. CPU-time-wise it is way higher, so it can actually have a 20% CPU time impact on our application farm. That's at least $100k worth of hardware (and rising), even if new/modern, just for citation formatting.

So, a checklist of what can be done (simple to complex):

[ ] - Simplification of {{cite}}
[ ] - Separate cache for Cite, to avoid reparsing on minor edits that don't involve citations. I have no idea how much this would win, but there is a theoretical chance of stripping 1% or so. ;)
[ ] - Offload some templates like {{cite}} to actual PHP extensions (can of worms, but, oh well, it can be a standardized process too)
[ ] - Implement a proper scripting engine like Lua for metatemplates (http://pecl.php.net/package/lua - another can of worms, though yet again, it can be managed via a trusted set of people, on the top20 wikis or so).
[ ] - Frustrated operations guy adding something like ( return ; ) in some random extension, and syncing the live hack. Obviously there would be some HAHA YOU THOUGHT I COULDN'T DO THIS comments in there.

I for one can directly participate in at least two of these options. ;-)

Unfortunately, {{cite}} is the only template I can profile/account for now. We don't have proper per-template profiling, but I wish to get it some day. Then we'd have more war on ... topics ;-D

Generally, templates are a major part of our parsing, and that's over 50% of our current cluster CPU load. As we actually managed to hit 100% last week, something that hasn't happened for a while, some work has to be done here. Of course, new hardware will help for a while, but I for one have huge personal satisfaction saving donation money. ;-)

CHEERS!
-- Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
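The "separate cache for Cite" item in the checklist above could look roughly like the following. This is only a minimal sketch in Python, not MediaWiki code: the class name CitationCache, the function slow_render and the in-memory dictionary are all assumptions made for illustration. The idea is that rendered citations are keyed by a hash of their own wikitext, so an edit that does not touch a citation never re-renders it.

```python
import hashlib


class CitationCache:
    """Cache rendered citations, keyed by a hash of the citation wikitext."""

    def __init__(self, render):
        self._render = render   # the expensive formatter (hypothetical stand-in)
        self._store = {}        # a real deployment would use a shared cache instead
        self.misses = 0

    def get(self, wikitext):
        key = hashlib.sha1(wikitext.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._render(wikitext)
        return self._store[key]


def slow_render(wikitext):
    # Placeholder for the real template expansion; just wraps the text.
    return "<cite>" + wikitext.strip("{}") + "</cite>"


cache = CitationCache(slow_render)
refs = ["{{cite book|title=Foo}}", "{{cite web|title=Bar}}"]

first_pass = [cache.get(r) for r in refs]
# A minor edit elsewhere on the page leaves the citations unchanged,
# so rendering the page again hits the cache for every reference.
second_pass = [cache.get(r) for r in refs]
```

Whether this wins anything in practice depends on how often identical citation text is re-rendered, which the thread does not measure.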
Re: [Wikitech-l] The never-dying topic: category intersection
Marcus Buck wrote: I just read the last category intersection discussion from December to see what the latest state of it is. While doing that, I saw that the last message in that thread was this post from Roan Kattouw, providing his extension. Oddly, nobody reacted to it. After 65 posts on that thread somebody posted a solution and nobody reacted. Can this extension be seen live anywhere?

Yes, actually, at http://mixesdb.com/db/index.php/Special:AdvancedSearch (the site I was hired to write it for in the first place).

Roan Kattouw (Catrope)
Re: [Wikitech-l] The never-dying topic: category intersection
2009/1/31 Roan Kattouw roan.katt...@home.nl:
Marcus Buck wrote: Can this extension be seen live anywhere?
Yes, actually, at http://mixesdb.com/db/index.php/Special:AdvancedSearch (the site I was hired to write it for in the first place).

Win! So what's in the way of this going live on Wikimedia? (Commons first?)

- d.
Re: [Wikitech-l] The never-dying topic: category intersection
David Gerard wrote: Win! So what's in the way of this going live on Wikimedia? (Commons first?)

As I said before, the extension was written especially for MixesDB, and has all kinds of features WMF wikis don't need or don't want for performance reasons. Also, the UI is pretty crude (note that all the pretty colors and help stuff were added by the MixesDB guy). The most useful part is the code that builds, maintains and searches a category index, but it hasn't been updated to include Andrew's hack for short words yet, and it should probably be ported to use Lucene for use on WMF wikis.

Roan Kattouw (Catrope)
Re: [Wikitech-l] war on Cite/{{cite}}
On Sat, Jan 31, 2009 at 2:03 PM, Domas Mituzas wrote: Hello, I understand the need for cite, that's why it is still there :) But... (...)

What about converting these to <ref> tags?

Unfortunately, {{cite}} is the only template I can profile/account for now, we don't have proper per-template profiling, but I wish to get one some day. Then we'd have more war on ... topics ;-D

Stub templates, for example :D

Generally, templates are major part of our parsing, and that's over 50% of our current cluster CPU load.

Wow. Can you compare that load with the load caused by solely using <ref> tags?

Marco
Re: [Wikitech-l] war on Cite/{{cite}}
A long while ago I remember looking at the parser and realizing that the recursive template expansion and argument handling led the parser to run all branches of #if and #switch statements before deciding which one to include. In other words, given {{#if: something | statements_A | statements_B }}, the parser was fully expanding both statements_A and statements_B before checking #if to decide which one to keep. Obviously that is inefficient and in the case of very complicated conditional templates potentially very expensive.

The parser has changed so much since I last worked with it that I am having difficulty figuring out if this is still true. Hopefully, someone already went through and improved the branch handling logic, but if not, I would suggest that this would also be a good generalized target for improving template operation.

-Robert Rohde
Re: [Wikitech-l] war on Cite/{{cite}}
Domas Mituzas wrote: So, a checklist what can be done ( simple to complex ) [ ] - Simplification of {{cite}}

Short of significant improvements to the parser or requiring people to ask Domas before editing the template, I can

[ ] - Separate cache for Cite, to avoid reparsing on minor edits, that don't involve citations. I have no idea how much this would win, but there is theoretical chance of stripping 1% or so. ;) [ ] - Offload some templates like {{cite}} to actual PHP extensions (can of worms, but, oh well, can be standardized process too)

I've actually considered something like this in the past: basically creating a Cite 2.0 extension, where all the main cite options would be in the ref tags themselves, with pre-defined templates written in PHP for web citations, book citations, etc. This would greatly reduce the amount of stuff that needs to be done using the Cite wiki-templates and run through the parser. You would have something like:

<ref author=Foo title=Bar type=book>Pages 1-10</ref>

Any parameters in the ref tag would be converted to HTML output using the book template in the extension rather than a thousand parser functions in some meta-template, and only the content of the tag (the page numbers in this case) would have to be run through the parser, so it would also be backwards-compatible with the current templates until they can all be migrated. The main downside to this is that it requires someone to file a Bugzilla request every time a template needs changing.

-- Alex (wikipedia:en:User:Mr.Z-man)
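Alex's "Cite 2.0" idea, where the attributes of the ref tag are formatted by a predefined template in code rather than by wiki meta-templates, can be sketched as follows. This is an illustration in Python rather than PHP, and every name in it (FORMATS, format_ref, the layout of the two citation styles) is hypothetical; no such extension is described in the thread beyond the example tag.

```python
from html import escape

# Hypothetical predefined citation formats, one per "type" attribute.
FORMATS = {
    "book": "{author}. <i>{title}</i>. {content}",
    "web": "{author}. {title}. {content}",
}


def format_ref(attrs, content):
    """Turn ref-tag attributes plus tag content into citation HTML."""
    fields = {name: escape(value) for name, value in attrs.items()}
    template = FORMATS[fields.pop("type")]
    # In the proposal only `content` would go through the wiki parser;
    # here it is simply escaped.
    return template.format(content=escape(content), **fields)


book_html = format_ref(
    {"author": "Foo", "title": "Bar", "type": "book"},
    "Pages 1-10",
)
```

The formatting cost here is one dictionary lookup and one string substitution per reference, which is the contrast with "a thousand parser functions" that the proposal is after. The downside noted in the message still applies: changing a format means changing code.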
Re: [Wikitech-l] Drafts extension in testing
Hi,

There seems to be an issue with the extension. When I saved a draft after editing a section, the draft was treated as a draft of the corresponding section number (it was listed as Article#Section name in the list of drafts). If that section was then removed, clicking on the saved draft gave the error that section number 6 doesn't exist, and the draft could not be restored. After changing the page to again contain at least 6 sections, restoring the draft was possible, at the cost of removing the new section that had replaced it.

I believe that this is not really user friendly, even if it is intended behaviour. (You click on a named section and receive a raw section number in the error message, without any help message or any possibility of restoring the text of your draft if someone changes the page in the meantime in an unexpected way.)

Best regards, Bence Damokos

On Wed, Jan 21, 2009 at 7:26 PM, Platonides platoni...@gmail.com wrote:
Alex wrote: A possible option would be to have a checkbox (probably on Special:Drafts itself, to avoid cluttering the edit page and to avoid accidental clicks) to mark drafts as public. This would be especially useful when combined with bug 17067, the ability to create drafts of protected pages: a user could make a draft, mark it as public, then link to it for an admin to add to the page.

I worry it goes beyond what Drafts attempted to do. So now you start having queues of Drafts; someone seeing the public draft shouldn't delete others' drafts when saving, but perhaps the original draft should be marked as 'Foo did an edit from this'. Should the history mark the draft author somewhere?

Welcome to the Wikimedia developer life, Trevor. :)
Re: [Wikitech-l] war on Cite/{{cite}}
Would storing an intermediate template improve things? I mean, keep a template but with the inner templates substed, depending on the original parameters.

Robert Rohde wrote: A long while ago I remember looking at the parser and realizing that the recursive template expansion and argument handling led the parser to run all branches of #if and #switch statements before deciding which one to include. In other words, given {{#if: something | statements_A | statements_B }}, the parser was fully expanding both statements_A and statements_B before checking #if to decide which one to keep. Obviously that is inefficient and in the case of very complicated conditional templates potentially very expensive.

The new preprocessor doesn't follow unused branches (or so we were told ;).

http://en.wikipedia.org/wiki/Template:Citation/core screams for having loops
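The difference between the old behaviour Robert Rohde describes and the behaviour attributed here to the new preprocessor is eager versus lazy branch expansion. The sketch below is plain Python and makes no claim about how the MediaWiki parser is actually written; expand_eager, expand_lazy and make_branch are hypothetical names. Branches are passed as callables so that it is visible which ones get expanded.

```python
def expand_eager(condition, branch_a, branch_b):
    # Both branches are fully expanded before the condition is consulted.
    a = branch_a()
    b = branch_b()
    return a if condition.strip() else b


def expand_lazy(condition, branch_a, branch_b):
    # Only the branch that will be kept is ever expanded.
    return branch_a() if condition.strip() else branch_b()


calls = []


def make_branch(name):
    def branch():
        calls.append(name)          # record that this branch was expanded
        return "expanded " + name
    return branch


calls.clear()
eager_result = expand_eager("something", make_branch("A"), make_branch("B"))
eager_calls = list(calls)

calls.clear()
lazy_result = expand_lazy("something", make_branch("A"), make_branch("B"))
lazy_calls = list(calls)
```

Both variants return the same text; the eager one simply pays for a branch it then throws away, which is what makes deeply nested conditional templates expensive.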
Re: [Wikitech-l] war on Cite/{{cite}}
On Sat, Jan 31, 2009 at 1:28 PM, Alex mrzmanw...@gmail.com wrote: The main downside to this is that it requires someone to file a Bugzilla request every time a template needs changing.

What about throwing them in MediaWiki: space, similar to editnotices? At least then they could be cached to hell and back in the message cache.

-Chad
[Wikitech-l] ordered lists starting at a certain number
Gentlemen,

In wikitext I want to do

<ol start=6101> <li>a <li>b </ol>

but http://www.w3.org/TR/html401/struct/lists.html says that is deprecated. In fact I really want to just use # and have that start at 6101. OK, I'll just hard wire them into the page:

6101. a
6102. b
Re: [Wikitech-l] war on Cite/{{cite}}
Chad wrote: What about throwing them in MediaWiki: space, similar to editnotices? At least then they could be cached to hell and back in the message cache.

I considered that as well, but I'm not sure how much that will actually help. Looking at http://en.wikipedia.org/wiki/Joe%20the%20Plumber?action=purge&forceprofile=true it took 21.796 seconds to load, most of which seems to be from Parser::recursiveTagParse, and about 90% of that is from Cite::referencesFormat-parse. Even if the templates themselves are heavily cached, it still has to run all the conditionals and formatting through the parser. Heavy caching might help if there are lots of refs with the same content on multiple pages, but I don't think that's very common.

-- Alex (wikipedia:en:User:Mr.Z-man)
Re: [Wikitech-l] war on Cite/{{cite}}
I understand the need for cite, that's why it is still there :) But... (...)

What about converting these to <ref> tags?

Unfortunately, most of those are designed to format the refs to a proper standard that we use (Harvard/MLA standard, IIRC) and are designed to be easily updated when we change our standards (e.g. recently the pages value changed in one of the cite templates, and a bot went through and fixed them all).
Re: [Wikitech-l] war on Cite/{{cite}}
On Sat, Jan 31, 2009 at 5:37 PM, Alex mrzmanw...@gmail.com wrote: Even if the templates themselves are heavily cached, it still has to run all the conditionals and formatting through the parser. Heavy caching might help if there's lots of refs with the same content on multiple pages, but I don't think that's very common.

Throw a caching layer on top of it. Do a final expansion until final substitution at the {{cite book}} etc level. Then you've got less to recursively parse.

-Chad
Re: [Wikitech-l] – Fixing {val}
On Sat, Jan 31, 2009 at 7:12 PM, Platonides platoni...@gmail.com wrote: {{val}} is just a presentational template. It's trivial to create an equivalent, fixed, parserfunction. We do not want to create a new parser function for every presentational template people come up with. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] ordered lists starting at a certain number
Aryeh Gregor wrote:
On Sat, Jan 31, 2009 at 4:23 PM, jida...@jidanni.org wrote: Gentlemen, In wikitext I want to do <ol start=6101> <li>a <li>b </ol> but http://www.w3.org/TR/html401/struct/lists.html says that is deprecated.
It's been un-deprecated in HTML5, for what that's worth. I don't know whether XHTML2 has done so as well.

IMHO it should be allowed to do

<ol start=6101>
# a
# b
</ol>

instead of having to revert to HTML for this (otherwise not too uncommon) action.
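What is being asked for amounts to letting "#" list items inherit a start number from the enclosing tag. A rough Python sketch of that conversion follows; it is not how the MediaWiki parser handles lists, and render_ordered_list is a hypothetical helper that only deals with flat, single-level "#" items.

```python
from html import escape


def render_ordered_list(wikitext_lines, start=1):
    """Render flat '#' list items as an HTML ordered list with a start number."""
    items = [line[1:].strip() for line in wikitext_lines if line.startswith("#")]
    parts = ['<ol start="%d">' % start]
    parts += ["<li>%s</li>" % escape(item) for item in items]
    parts.append("</ol>")
    return "\n".join(parts)


html_list = render_ordered_list(["# a", "# b"], start=6101)
```

With the inputs from the message, a browser would number the two items 6101 and 6102, which is the result the hard-wired workaround produces by hand.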
Re: [Wikitech-l] war on Cite/{{cite}}
Aryeh Gregor wrote:
On Sat, Jan 31, 2009 at 8:03 AM, Domas Mituzas midom.li...@gmail.com wrote: [ ] - Implement proper scripting engine like Lua for metatemplates (http://pecl.php.net/package/lua - another can of worms, though yet again, can be managed via trusted set of people, on top20 wikis or so).
This seems like it's the only solution from your list that would be generally applicable to similar future scenarios. I don't think the users would have to be particularly trusted -- just make sure that the runtime of the programs is limited, and that it's properly sandboxed (is the Lua PECL extension sandboxed?).

That would be like adding a dependency on the Lua extension for reusers, as the core templates would be implemented in Lua. And I don't think it's worth reimplementing a Lua interpreter in PHP...
[Wikitech-l] Fixing text encoding corruption
Just as an FYI, wiki.freeculture.org has had mis-encoded UTF-8 for the better part of the past four years. This is because we used the old Latin 1 schemas. Now we don't have these problems anymore. I wrote up my notes at http://wiki.freeculture.org/Fixing_text_encoding_corruption , but here they are for y'all's convenience:

1. Freeze writes to the main wiki
2. Dump freecult_wikidb to dump.sql
3. Create a fresh MW install (just for the table schemas) in freecult_wikidb2
4. Create a temporary empty DB, and import dump.sql to it
5. In the temporary DB, ALTER TABLE on the text table so it has the same columns as freecult_wikidb2's text table
6. Dump wikidb3 and have certainty that the column names will line up (but don't copy the sucky old schema)
   * mysqldump --no-create-info --add-locks --complete-insert freecult_wikidb3 > sql
7. Import that into freecult_wikidb2, skipping the tables that are missing
   * mysql -f freecult_wikidb2 < sql
   * WATCH for errors other than skipping missing table
8. php maintenance/rebuildall.php
   * If this fails with key errors, just drop the recentchanges table and recreate it with the wikidb2 schema

I've poured enough of my life into this issue, but if someone else wants to take this up and document it better, by all means go ahead!

-- Asheesh.

-- It is often the case that the man who can't tell a lie thinks he is the best judge of one. -- Mark Twain, Pudd'nhead Wilson's Calendar
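For readers unfamiliar with the symptom, the kind of corruption that typically results from UTF-8 text passing through a Latin 1 layer can be reproduced in a few lines of Python. This only illustrates the general effect and its reversal; it is an assumption that this is exactly what happened on wiki.freeculture.org, and it does not replace the database procedure above.

```python
original = "Café"

# The text is stored as UTF-8 bytes ...
stored_bytes = original.encode("utf-8")

# ... but read back as if each byte were a Latin 1 character.
misread = stored_bytes.decode("latin-1")

# Reversing the wrong step recovers the original text, as long as
# no bytes were lost or altered along the way.
repaired = misread.encode("latin-1").decode("utf-8")
```

The two-byte UTF-8 sequence for "é" shows up as two separate Latin 1 characters, which is why affected pages display pairs like "Ã©" in place of accented letters.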
Re: [Wikitech-l] – Fixing {val}
This discussion is getting sidetracked. The real complaint here is that {{#expr:(0.7 * 1000 * 1000) mod 1000}} is giving 69 when it should give 70. This is NOT a formatting issue, but rather it is a bug in the #expr parser function, presumably caused by some kind of round-off error.

-Robert Rohde

On Sat, Jan 31, 2009 at 2:27 PM, Greg L greg_l_at_wikipe...@comcast.net wrote:

Yes, {val} is a tool for making attractive and convenient scientific notation. The look of {{tl|val}} was discussed at length on both WT:MOSNUM and WT:MOS and achieved broad support for how it works and renders numbers. It delimits numbers with narrow spaces that aren't really spaces; they use CSS span tags to move characters. Thus, the significands can be copied and pasted into Excel where they will be treated as real numbers without the need to first hand-delete spaces.

The problem with {val} is outlined here at… http://en.wikipedia.org/w/index.php?title=User_talk:Jimbo_Wales&oldid=260819871#Developer_support_for_parser_function

In a nutshell, about 5 to 10% of the time, {val} gives rounding errors. For instance, the expression {{val|0.55007|e=6}} will return a significand of 0.550069. This is the product of the buggy math-based parser functions it must use. To date, notwithstanding that Jimbo is solidly behind this, and that Erik supports the production of the required parser function, no volunteer developer has stepped up to the plate with a character-counting parser function.

Greg

On Jan 31, 2009, at 2:17 PM, Platonides wrote:

Greg L wrote: All, Can anyone figure out how to fix {{tl|val}} so an expression like {{val|0.55007|e=6}} …works properly? Greg L

You can't figure out what it should do just from the description. I imagine you mean http://en.wikipedia.org/wiki/Template:Val "Set of templates that can be used to easily present values in scientific notation, including uncertainty". Another ugly template...
Re: [Wikitech-l] -- Fixing {val}
Aryeh Gregor wrote:
On Sat, Jan 31, 2009 at 7:12 PM, Platonides wrote: {{val}} is just a presentational template. It's trivial to create an equivalent, fixed, parserfunction.
We do not want to create a new parser function for every presentational template people come up with.

I know, that's the problem with such an approach. Although it could be worthwhile to parserify a set of stable core templates. Not only would they be faster, they would be more readable.
Re: [Wikitech-l] – Fixing {val}
On Sat, Jan 31, 2009 at 8:33 PM, Robert Rohde raro...@gmail.com wrote: This discussion is getting side tracked. The real complaint here is that {{#expr:(0.7 * 1000 * 1000) mod 1000}} is giving 69 when it should give 70. This is NOT a formatting issue, but rather it is bug in the #expr parser function, presumably caused by some kind of round-off error.

$ php -r 'echo (0.7 * 1000 * 1000) % 1000 . "\n";'
69
$ php -r 'echo (int)(0.7 * 1000) . "\n";'
699

The issue is bog-standard floating-point error. If PHP has a decent library for exact-precision arithmetic, we could probably use that. Otherwise, template programmers will have to learn how floating-point numbers work just like all other programmers in the universe.
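The same class of error, and the exact-precision alternative, can be shown in Python. The input below is a different, well-known case rather than the 0.7 example from the message, because the result for that one depends on the platform's floating-point handling. The decimal module stands in for whatever exact-arithmetic library would be used in PHP; that substitution is an assumption for illustration only.

```python
from decimal import Decimal

# Binary floating point: 0.29 is stored slightly below 0.29, so the
# product lands just under 29 and truncation drops it to 28.
float_product = 0.29 * 100
float_result = int(float_product)

# Decimal arithmetic on the literal text keeps the value exact.
exact_product = Decimal("0.29") * 100
exact_result = int(exact_product)
```

The point for template authors is that truncating or taking a remainder of a floating-point product is fragile; either round explicitly or work from the decimal text.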
Re: [Wikitech-l] -- Fixing {val}
Aryeh, this reaction of “We do not want to create a new parser function for every presentational template people come up with” is understandable. However, I understand that a character-counting parser function in another form has been in the works for a long time but hasn’t proven to be reliable enough to be released into the wild. If someone could finally develop a bullet-proof character-counting parser function, I’m quite certain that a number of valuable uses could be found for it. That is why I encourage the writing of a parser function over the effort of writing a developer’s version of a template that doesn’t work very well. The only reason {val} doesn’t work well is because it must rely upon math-based parser functions that produce rounding errors. Having said that… The MOS and MOSNUM community has waited seven months for a version of {val} that works well for all numbers—even ones that are really big. Any developer who is willing to tackle this issue, regardless of whether it is a parser function or a revised version of {val}, would be most welcome. However, both Jimbo Wales (in particular) as well as Erik seemed to think the best way to leverage developer effort would be to produce the character-counting parser function as this would enable the production of template tools we haven’t even conceived of yet. On Jan 31, 2009, at 5:30 PM, Platonides wrote: Aryeh Gregor wrote: On Sat, Jan 31, 2009 at 7:12 PM, Platonides wrote: {{val}} is just a presentational template. It's trivial to create an equivalent, fixed, parserfunction. We do not want to create a new parser function for every presentational template people come up with. I know, that's the problem of such approach. Although it could be worth to parserify a set of stable core templates. Not only would they be faster, they would be more readable. 
Re: [Wikitech-l] -- Fixing {val}
On Sat, Jan 31, 2009 at 8:53 PM, greg_l_at_wikipedia greg_l_at_wikipe...@comcast.net wrote:
> Aryeh, this reaction of “We do not want to create a new parser function for every presentational template people come up with” is understandable. However, I understand that a character-counting parser function in another form has been in the works for a long time but hasn't proven to be reliable enough to be released into the wild.

It would be trivial to write up such a function, and in fact plenty of people have. I could add it right now in five minutes. The question is whether it's desirable to make templates into more of a full-fledged programming language than they already are. There's been reluctance on many people's part to do that. Personally, I think they're close enough anyway so that you may as well give them some basic string functions like {{#len:}}, if the Lua proposal isn't accepted.

> The only reason {val} doesn't work well is because it must rely upon math-based parser functions that produce rounding errors.

As I said in my other response, the exact same errors occur in PHP, and the same type of error occurs in all programming languages. If you aren't familiar with floating-point calculations, see: http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems

In a real programming language, of course, there would be workarounds like defining new data types, whereas in template programming that would be tricky.
Re: [Wikitech-l] war on Cite/{{cite}}
^_^ Wikipedia is already a horrible place to copy templates from. Unlike Wikipedia, most other MW installations don't bother turning on Tidy, and Wikipedia abuses that /feature/ way too much.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://nadir-seen-fire.com]
-Nadir-Point (http://nadir-point.com)
-Wiki-Tools (http://wiki-tools.com)
-MonkeyScript (http://monkeyscript.nadir-point.com)
-Animepedia (http://anime.wikia.com)
-Narutopedia (http://naruto.wikia.com)
-Soul Eater Wiki (http://souleater.wikia.com)

Aryeh Gregor wrote:
> On Sat, Jan 31, 2009 at 8:19 PM, Platonides platoni...@gmail.com wrote:
>> That would be like adding a dependency on Lua extension for reusers, as the core templates will be implemented in Lua.
> Yes, that would be the major disadvantage I can see. In practice, nobody can reuse large chunks of Wikipedia content on shared hosting anyway, since it's way too big, but it would be a serious obstacle for people who want to reuse only parts of Wikipedia.
Re: [Wikitech-l] ordered lists starting at a certain number
On Sat, Jan 31, 2009 at 10:12 PM, Daniel Friesen dan_the_...@telus.net wrote:
> Someone needs to read a good WP article before they start mentioning (X)HTML version numbers: http://en.wikipedia.org/wiki/XHTML

Both HTML5 and XHTML2 are successors to HTML4. That's all that's really relevant here. HTML5 has un-deprecated the start attribute of ol, so nobody should be worrying about HTML4's deprecation of it. (XHTML2 does appear to have removed the attribute, so I guess you could worry about it if you plan to move to XHTML2 in the future. But probably nobody is going to use XHTML2, and MediaWiki almost certainly isn't.)
Re: [Wikitech-l] – Fixing {val}
On Sat, Jan 31, 2009 at 5:43 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> On Sat, Jan 31, 2009 at 8:33 PM, Robert Rohde raro...@gmail.com wrote:
>> This discussion is getting sidetracked. The real complaint here is that {{#expr:(0.7 * 1000 * 1000) mod 1000}} is giving 69 when it should give 70. This is NOT a formatting issue, but rather it is a bug in the #expr parser function, presumably caused by some kind of round-off error.
>
> $ php -r 'echo (0.7 * 1000 * 1000) % 1000 . "\n";'
> 69
> $ php -r 'echo (int)(0.7 * 1000) . "\n";'
> 699
>
> The issue is bog-standard floating-point error. If PHP has a decent library for exact-precision arithmetic, we could probably use that. Otherwise, template programmers will have to learn how floating-point numbers work just like all other programmers in the universe.

In r46671 I have added an explicit test for floating-point numbers that are within 1 part in 10^10 of integers before performing round-off-sensitive conversions and comparisons. This should eliminate these errors in many cases.

-Robert Rohde
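The idea of snapping values that are "within 1 part in 10^10 of integers" before truncating can be illustrated roughly as follows. This is a sketch of the general technique in Python, not the code committed in r46671; the function names and the treatment of values near zero are assumptions made for the example.

```python
REL_TOL = 1e-10  # "1 part in 10^10", as described in the message above

def snap_to_integer(x, rel_tol=REL_TOL):
    """Return the nearest integer (as a float) if x is within rel_tol of it, else x."""
    nearest = round(x)
    # Relative comparison; values that round to zero are left untouched (assumption).
    if nearest != 0 and abs(x - nearest) <= rel_tol * abs(nearest):
        return float(nearest)
    return x

def safe_trunc(x):
    """Truncate toward zero, but only after snapping near-integers."""
    return int(snap_to_integer(x))

print(safe_trunc((0.1 + 0.7) * 10))  # 8, where plain int() would give 7
print(safe_trunc(7.5))               # 7, genuinely fractional values are unaffected
```

A tolerance like this trades one class of surprises for another: a value that really is 7.9999999999 would also be treated as 8, which is presumably why the message says "in many cases" rather than "always".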
[Wikitech-l] Character-counting parser function
As I understand it, there is rightfully little interest in the developer community to write a new parser function for every single template need to come along. Therefore, when it comes to a template like {{val}}, which now generates rounding errors about 5–10% of the time because of the math-based parser functions it must use, it would be nice if the template-authoring community could have a character-counting parser function that is not only suitable for {{val}}, but which could be a general-purpose parser function that could be used for a great variety of purposes.

What {{val}} tries to do at its fundamental level is described here: http://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style_(dates_and_numbers)/Archive_94#Grouping_of_digits_after_the_decimal_point_.28next_attempt.29

Is there a developer whom I can have the author of {{val}} e-mail to see if you two can arrive at a relatively easy-to-make parser function that A) meets the basic needs of {{val}}, and B) has sufficient utility to be useful for other character-counting needs?
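To make the request above concrete: grouping the digits after the decimal point needs only character operations, so no floating-point conversion (and hence no rounding error) is involved. The following is a hedged sketch in Python; the function name, the plain-space separator and the fixed group size of three are assumptions, and the actual grouping rules of {{val}} are the ones in the linked discussion, which may differ.

```python
def group_fraction_digits(number, sep=" ", size=3):
    """Group the digits after the decimal point, working only on characters."""
    whole, point, frac = number.partition(".")
    # Slice the fractional digits into chunks of `size` characters.
    groups = [frac[i:i + size] for i in range(0, len(frac), size)]
    return whole + point + sep.join(groups)

print(group_fraction_digits("6.02214179"))  # 6.022 141 79
print(len("6.02214179"))                    # 10, what a character-counting function would report
```

Because the input is treated as a string throughout, the result is the same however many digits the number has.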
Re: [Wikitech-l] war on Cite/{{cite}}
Hoi,
Let us please appreciate what is being said here: Wikipedia is a horrible place to copy templates from. We pride ourselves on being open source, and the current templates make us as bad as the worst proprietary vendor. We have what is effectively an API and it is not documented at all.
Thanks,
GerardM

2009/2/1 Daniel Friesen dan_the_...@telus.net
> ^_^ Wikipedia is already a horrible place to copy templates from. Unlike Wikipedia, most other MW installations don't bother turning on Tidy, and Wikipedia abuses that /feature/ way too much.
>
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire)
>
> Aryeh Gregor wrote:
>> On Sat, Jan 31, 2009 at 8:19 PM, Platonides platoni...@gmail.com wrote:
>>> That would be like adding a dependency on Lua extension for reusers, as the core templates will be implemented in Lua.
>> Yes, that would be the major disadvantage I can see. In practice, nobody can reuse large chunks of Wikipedia content on shared hosting anyway, since it's way too big, but it would be a serious obstacle for people who want to reuse only parts of Wikipedia.
Re: [Wikitech-l] war on Cite/{{cite}}
How is the API not documented? Between the docs on MediaWiki.org and the fact that every parameter is documented (with examples), I'd say it's highly documented.

-Chad

On Feb 1, 2009 12:18 AM, Gerard Meijssen gerard.meijs...@gmail.com wrote:
> Hoi,
> Let us please appreciate what is being said here: Wikipedia is a horrible place to copy templates from. We pride ourselves on being open source, and the current templates make us as bad as the worst proprietary vendor. We have what is effectively an API and it is not documented at all.
> Thanks,
> GerardM
>
> 2009/2/1 Daniel Friesen dan_the_...@telus.net
>> ^_^ Wikipedia is already a horrible place to copy templates from. Unlike Wikipedia most other M...
Re: [Wikitech-l] war on Cite/{{cite}}
On Sat, Jan 31, 2009 at 9:16 PM, Gerard Meijssen gerard.meijs...@gmail.com wrote:
> Hoi,
> Let us please appreciate what is being said here: Wikipedia is a horrible place to copy templates from. We pride ourselves on being open source, and the current templates make us as bad as the worst proprietary vendor. We have what is effectively an API and it is not documented at all.
> Thanks,
> GerardM

Actually, I think Daniel had a somewhat different point. Wikimedia uses Tidy, which does a good job at closing dangling format tags. A very substantial fraction of our templates actually have dangling divs, and tables, and other bad syntax that Tidy is covering up for us. Anyone who has ever tried to copy Wikimedia templates into a wiki with Tidy turned off (the default setting) knows that many of our templates will actually return a lot of junk.

Strictly speaking it should be the editors' job to properly close tables and divs, etc., but because Tidy is so good at it they don't have to, which makes our wikicode less portable.

-Robert Rohde
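The "dangling divs and tables" problem described above is easy to detect mechanically on the HTML a template produces. The following Python sketch only illustrates that check; it is not how Tidy works, it looks at rendered HTML rather than wikitext, and the class and function names are invented for the example.

```python
from html.parser import HTMLParser

class UnclosedTagFinder(HTMLParser):
    """Track <div> and <table> tags that are opened but never closed."""
    TRACKED = {"div", "table"}

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recently opened tag of the same name, if there is one.
        if tag in self.open_tags:
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

def unclosed_tags(markup):
    """Return the tracked tags still open at the end of the markup."""
    finder = UnclosedTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.open_tags

print(unclosed_tags('<div class="box"><table><tr><td>x</td></tr></table>'))  # ['div']
```

A wiki running without Tidy would emit that stray div as-is, which is the "junk" reusers end up seeing.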
Re: [Wikitech-l] war on Cite/{{cite}}
> How is the api not documented? Between the docs on Mediawiki.org and the fact that every parameter is documented (with examples), I'd say its highly documented.

I think he means on wiki: most people probably won't know to look for information on how to use it at the main/official MediaWiki wiki, and just go by the scraps they can find on whatever local wiki they are on (in this case en.wiki).
Re: [Wikitech-l] war on Cite/{{cite}}
Then that's solely enwiki's fault for having poor docs. If developers have documented where expected (in code and on mw.org) then they've done their part.

-Chad

On Feb 1, 2009 12:33 AM, K. Peachey p858sn...@yahoo.com.au wrote:
>> How is the api not documented? Between the docs on Mediawiki.org and the fact that every paramet...
> I think he means on wiki: most people probably won't know to look for information on how to use it at the main/official MediaWiki wiki, and just go by the scraps they can find on whatever local wiki they are on (in this case en.wiki).
Re: [Wikitech-l] Character-counting parser function
Output a big red error when given numbers that will encounter a floating-point error? Perhaps also provide an #expr equivalent, limited in the number of times it can be used, that uses a bignum library rather than normal numbers and can be used in cases where that big red error shows up.

~Daniel Friesen (Dantman, Nadir-Seen-Fire)

Robert Rohde wrote:
> On Sat, Jan 31, 2009 at 9:39 PM, Tim Starling tstarl...@wikimedia.org wrote:
>> greg_l_at_wikipedia wrote:
>>> As I understand it, there is rightfully little interest in the developer community to write a new parser function for every single template need to come along. Therefore, when it comes to a template like {{val}}, which now generates rounding errors about 5–10% of the time because of the math-based parser functions it must use, it would be nice if the template-authoring community could have a character-counting parser function that is not only suitable for {{val}}, but which could be a general-purpose parser function that could be used for a great variety of purposes.
>> I would rather have an application-specific number formatting function, rather than a character-counting function. It could be similar to PHP's number_format(). Wikitext is a terrible programming language, slow to execute and hard to understand. It's much better to write in PHP.
> We already have {{formatnum:}} with a very limited functionality that presumably could be extended. Though I would like to re-emphasize that Greg's complaint principally arises because of floating-point round-off errors in #expr that are difficult for normal editors to predict or plan for, and that should be addressed irrespective of other work to improve number formatting.
>
> -Robert Rohde