[Forwarding to the list after using the wrong mail account in my first reply]

---------- Forwarded message ----------
Subject: Re: [SMW-devel] Semantic MediaWiki and Parser Function Initialization
Date: Saturday, 16 August 2008
From: DanTMan <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]

Yes, the style is fine. ^_^ In fact, Tim Starling directly commented that
extensions are able and allowed to do that.

~Daniel Friesen(Dantman, Nadir-Seen-Fire) of:
-The Nadir-Point Group (http://nadir-point.com)
--Its Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
--Games-G.P.S. (http://ggps.org)
-And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)

Markus Krötzsch wrote:
> On Friday, 15 August 2008, Daniel Friesen wrote:
>> PHP objects are dynamically extensible. You can add variables to the
>> ParserOutput just by using $parser->mOutput->varname.
>
> I know, but is this considered good style? Is it safe when extensions start
> making extensions to MediaWiki objects? Well, with a proper naming scheme,
> why not ... I never really considered using this.
>
>> Also, you might want to note that ParserOutput also has addHeadItem, which
>> should help avoid adding head items in the wrong place.
>
> Yes, we use that. But we also need head items in cases where only HTML is
> generated (special pages), so we additionally add HTML head items if they
> did not yet get added by the parser.
>
> -- Markus
>
>> Markus Krötzsch wrote:
>>> On Friday, 15 August 2008, Daniel Friesen wrote:
>>>> Sub-parsers.
>>>> In what kind of case does this kind of thing happen for you?
>>>
>>> Normally, sub-parses happen with (a clone of) the current parser, e.g.
>>> when using a <gallery>. But I am not aware of any guideline stating
>>> that extensions are not allowed to create or clone new parser objects
>>> and use them with any title they like. So anything could happen.
>>>
>>>> When one thing is being parsed, there is one parser doing that task. I
>>>> don't know of many cases where multiple parsers exist (unless an
>>>> extension is doing something screwy).
>>>
>>> We have observed the use of multiple parsers, or of one parser with
>>> multiple title objects (this distinction is not really relevant for us),
>>> in between SMW calls on various wikis. We use hooks during parsing to
>>> set the title of the page that is currently processed, so we notice
>>> when titles change and we have to reset the data (in a long PHP run
>>> many titles may be processed, and there is no guarantee that a
>>> save-hook is called before the next page starts processing).
>>>
>>> Initially in 1.2, we reset the data and title only once during parsing,
>>> and not all hooks set the title again. This led to nasty bugs where
>>> data was stored for the wrong title (in one case we even had annotated
>>> special pages!). Since the title for storing was only set within hooks
>>> of the parser (using getTitle() of the supplied parser), the only
>>> explanation is that some other parser fired those hooks while a
>>> different title object was being parsed, and that this happened before
>>> we saved the current data to the DB.
>>>
>>> Now we make sure that each and every hook call first sets the proper
>>> current title and only afterwards saves data of any kind. This at least
>>> ensures that no data ever ends up under the wrong title, but data can
>>> still be lost.
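[Editor's note: the title-tracking pattern Markus describes could be sketched roughly as below. This is a hedged illustration only; the class and member names are invented for the example and are not SMW's actual code.]

```php
// Sketch of the pattern described above: every parser hook first checks
// the parser's current title before touching buffered data, and resets
// the buffer when the title changes, so annotations are never stored
// under the wrong page. All names here are illustrative, not SMW's.
class ExampleParseData {
	private static $currentTitle = null;
	private static $data = array();

	public static function onSomeParserHook( &$parser, &$text ) {
		$title = $parser->getTitle();
		if ( self::$currentTitle === null
			|| !$title->equals( self::$currentTitle ) ) {
			// A different title is now being parsed: reset the buffer
			// (this is exactly where data for the old title can be lost
			// if its save-hook never fired).
			self::$currentTitle = $title;
			self::$data = array();
		}
		// ... collect annotations from $text into self::$data ...
		return true;
	}
}
```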
>>> Again it happened that titles changed between parsing and storing
>>> (leading to loss of data, since the change of title also led to
>>> clearing the internal data buffer). So we now use a second buffer to
>>> store the data already parsed for the *previous* title, just in case it
>>> turns out that the next saving method actually wants to save this data!
>>> But this is just a hack: we are blindly moving from hook to hook,
>>> parsing data here and there, not knowing for which cases there will be
>>> a save-hook later on. It is all very frustrating, and race conditions
>>> are still possible.
>>>
>>> Even now we still experience cases where apparently random data is lost
>>> when we create update jobs for all pages: some pages just lose their
>>> properties, but they are different pages each time we try. And of
>>> course this affects at most 10 pages each time, on a densely annotated
>>> wiki with 7000 articles (semanticweb.org).
>>>
>>> After your report this morning, I also removed setting the title in
>>> ParserBeforeStrip. Maybe this reduces the number of wrongly set titles.
>>>
>>>> Have you tried making use of the ParserOutput? That seems like a
>>>> to-the-point thing; there should only be one of those per parse.
>>>
>>> I have not yet really found a way to use it properly. Can it hold
>>> additional data somewhere?
>>>
>>> Not only the semantic data but also other "globals" are affected by
>>> similar problems. We use globals to add CSS and JavaScript to pages
>>> based on whether they were needed on a page. It turned out that jobs
>>> are executed while viewing a special page, in between the time when the
>>> page is parsed and when the output HTML is created. Hence any job would
>>> actually have to capture the current globals and restore them after
>>> doing any parsing, or otherwise the job's parsers will "use up" the
>>> script data needed by the special page.
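[Editor's note: the "second buffer" workaround described above might look something like the sketch below. This is an illustrative reconstruction under stated assumptions, not SMW's real implementation; all names are invented.]

```php
// Sketch of the two-buffer hack: keep the data parsed for the
// *previous* title around, in case a save-hook for it still arrives
// after a new title has started parsing. Names are illustrative.
class ExampleDataBuffer {
	private static $currentTitle = null;
	private static $currentData = array();
	private static $previousTitle = null;
	private static $previousData = array();

	public static function setTitle( Title $title ) {
		if ( self::$currentTitle !== null
			&& !$title->equals( self::$currentTitle ) ) {
			// Do not discard the old data yet: a save-hook for the
			// previous title may still be called after this point.
			self::$previousTitle = self::$currentTitle;
			self::$previousData = self::$currentData;
			self::$currentData = array();
		}
		self::$currentTitle = $title;
	}

	public static function getDataFor( Title $title ) {
		if ( self::$currentTitle && $title->equals( self::$currentTitle ) ) {
			return self::$currentData;
		}
		if ( self::$previousTitle && $title->equals( self::$previousTitle ) ) {
			return self::$previousData; // rescued from the second buffer
		}
		return array(); // race lost: the data is gone
	}
}
```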
>>> Again, one could add further protection to make sure scripts are only
>>> "consumed" by the page that created them, but these are all just
>>> workarounds for the basic problem: if you need to preserve data between
>>> hooks, how can you make sure that the data is not stored forever, yet
>>> still remains available long enough until you need it?
>>>
>>> -- Markus
>>>
>>>> Markus Krötzsch wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> it's always refreshing to get some thorough code critique from you in
>>>>> the morning -- thanks for caring! I have added you to our
>>>>> contributors' list, and I would much appreciate your ideas on some
>>>>> further hacks that I am well aware of; see below.
>>>>>
>>>>>> Anyone want to explain to me why the ParserBeforeStrip hook is being
>>>>>> used to register parser functions?
>>>>>
>>>>> In defence of my code: it works. Up to the introduction of
>>>>> ParserFirstCallInit it was also one of the few hooks that reliably
>>>>> (at least in my experience) got called before any parser function
>>>>> would be needed.
>>>>>
>>>>>> That is a poor place for it, as well as unreliable. Which I can see
>>>>>> by how the function being called is a major hack, relying on the
>>>>>> first call returning the callback name when already set.
>>>>>
>>>>> Well, I have seen worse hacks (only some of which were in my code;
>>>>> but see the remarks below on a major problem I still see there). But
>>>>> point taken for this issue too.
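[Editor's note: registration via ParserFirstCallInit, the approach being recommended over the ParserBeforeStrip hack, follows the pattern sketched below. The hook name and the MediaWiki registration calls are real; the function names and the parser-function key 'example' are invented for illustration.]

```php
// Sketch of registering a parser function via ParserFirstCallInit.
// The hook fires exactly once, when the parser is first initialised,
// so no re-registration trickery in ParserBeforeStrip is needed.
$wgHooks['ParserFirstCallInit'][] = 'wfExampleInit';

function wfExampleInit( &$parser ) {
	// Map the wikitext function {{#example:...}} to its PHP callback.
	$parser->setFunctionHook( 'example', 'wfExampleRender' );
	return true;
}

function wfExampleRender( &$parser, $arg = '' ) {
	// Illustrative callback: just echo the argument back, escaped.
	return htmlspecialchars( $arg );
}
```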
>>>>>> Since I took the liberty of fixing up Semantic Forms, please see it
>>>>>> as a reference on how to correctly add parser functions to the
>>>>>> parser:
>>>>>> http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/SemanticForms/includes/SF_ParserFunctions.php?view=markup
>>>>>
>>>>> Great, I have added similar code to SMW now.
>>>>>
>>>>> To stay with this topic, I feel that the whole parser hooking
>>>>> business is bound to be one large hack. As a parser extension that
>>>>> stores data, you need to hook into several places in MW, hoping that
>>>>> they are somehow called in the expected order and that nobody
>>>>> overwrites your data in between hooks. We have to store the parsed
>>>>> data somewhere, and this place needs to be globally accessible, since
>>>>> the parser offers no local storage to us (none that would not be
>>>>> cloned into unrelated sub-parsers anyway). But parsing is not global:
>>>>> it happens in many parsers, or in many, possibly nested, runs of one
>>>>> parser. The current code has evolved to prevent many of the problems
>>>>> this creates, but it lacks a unified approach to handling the
>>>>> situation.
>>>>>
>>>>> Many things can still go wrong. There is no way of finding out
>>>>> whether we are running in the main parsing method of a wiki page's
>>>>> text, or are just being called on some page footer or sub-parsing
>>>>> action triggered by some extension. Jobs and extensions cross-fire
>>>>> with their own parsing calls, often using different Title objects.
>>>>>
>>>>> Do you have any insights on how to improve the runtime data
>>>>> management in SMW, so that we can collect data belonging to one
>>>>> article in multiple hooks, not have it overwritten by other
>>>>> sub-hooks, and still not get memory leaks on very long runs? We
>>>>> cannot keep all data indefinitely just because we are unsure whether
>>>>> we are still in a sub-parser and will need the data later on.
>>>>> But if we only store the *current* data, we need to find out which
>>>>> title is actually being parsed, with the goal of storing or updating
>>>>> its data in the DB.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Markus

--
Markus Krötzsch
Semantic MediaWiki  http://semantic-mediawiki.org
http://korrekt.org  [EMAIL PROTECTED]
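[Editor's note: the ParserOutput ideas raised at the top of the thread -- stashing extension data on the (dynamically extensible) ParserOutput via $parser->mOutput->varname, and using addHeadItem -- could be sketched as below. The property name smwExampleData and the head-item key are invented for illustration; the MediaWiki methods used are real.]

```php
// Sketch: keep per-parse data on the ParserOutput itself, so it travels
// with the parse it belongs to instead of living in a global, and add
// head items through addHeadItem so they end up with the right output.
function wfExampleCollect( &$parser, &$text ) {
	$output = $parser->getOutput(); // the parser's ParserOutput ($parser->mOutput)

	if ( !isset( $output->smwExampleData ) ) {
		// PHP objects accept dynamic properties; prefix the name to
		// avoid clashing with other extensions doing the same thing.
		$output->smwExampleData = array();
	}
	// ... append parsed annotations to $output->smwExampleData ...

	$output->addHeadItem(
		'<link rel="stylesheet" href="/extensions/Example/example.css"/>',
		'example-css' // key, so the item is only added once per parse
	);
	return true;
}
```

Note that this only covers parser-generated pages; as Markus points out above, special pages that emit HTML without a parse still need their head items added separately.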
_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel