[Forwarding to the list after using the wrong mail account in my first reply]

----------  Forwarded Message  ----------

Subject: Re: [SMW-devel] Semantic MediaWiki and Parser Function Initialization
Date: Saturday, 16 August 2008
From: DanTMan <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]

Yes, the style is fine. ^_^ In fact, Tim Starling commented directly that 
extensions are able and allowed to do that.

~Daniel Friesen(Dantman, Nadir-Seen-Fire) of:
-The Nadir-Point Group (http://nadir-point.com)
--Its Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
--Games-G.P.S. (http://ggps.org)
-And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)

Markus Krötzsch wrote:
> On Friday, 15 August 2008, Daniel Friesen wrote:
>   
>> PHP objects are dynamically extensible. You can add variables to the
>> ParserOutput just by using $parser->mOutput->varname.
>>     
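>>
>> A minimal sketch (the member name is made up for the example; prefix it
>> with your extension's name so it cannot clash with other extensions):
>>
>>     // inside a parser hook or parser function callback
>>     $parser->mOutput->mMyExtData = array( 'some key' => 'some value' );
>>
>>     // later, anything holding the same ParserOutput can read it back
>>     if ( isset( $parserOutput->mMyExtData ) ) {
>>         $data = $parserOutput->mMyExtData;
>>     }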
>
> I know, but is this considered good style? Is it safe once several 
> extensions start adding their own members to MediaWiki objects? Well, with a 
> proper naming scheme, why not ... I had never really considered using this.
>
>   
>> Also, you might want to note that ParserOutput has addHeadItem(), which
>> should avoid adding head items in the wrong place.
>>     
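>>
>> Something like this (if I remember the signature correctly, the second
>> argument is an optional key that keeps the item from being added twice):
>>
>>     $parser->mOutput->addHeadItem(
>>         '<link rel="stylesheet" href="extensions/MyExt/MyExt.css"/>',
>>         'myext-css'
>>     );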
>
> Yes, we use that. But we also need head items in cases where only HTML is 
> generated (special pages), hence we add HTML head items ourselves if they 
> did not yet get added by the parser.
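>
> For the special-page case that is roughly (a sketch; the stylesheet path is 
> illustrative, and the first argument is just a key for deduplication):
>
>     global $wgOut;
>     $wgOut->addHeadItem( 'smw-css',
>         '<link rel="stylesheet" href="extensions/SemanticMediaWiki/skins/SMW_custom.css"/>' );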
>
> -- Markus
>
>   
>>
>> Markus Krötzsch wrote:
>>     
>>> On Friday, 15 August 2008, Daniel Friesen wrote:
>>>       
>>>> Sub-parsers. In what kind of case does this kind of thing happen for
>>>> you?
>>>>         
>>> Normally, sub-parses happen with (a clone of) the current parser, e.g.
>>> when using a <gallery>. But I am not aware of any guideline that states
>>> that extensions are not allowed to create or clone new parser objects and
>>> use them with any title they like. So anything could happen.
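>>>
>>> For illustration, nothing stops an extension from doing something like
>>> the following (sketch), at which point all parser hooks fire again, now
>>> with a different title:
>>>
>>>     global $wgParser;
>>>     $myParser = clone $wgParser;
>>>     $myOutput = $myParser->parse( $someText, $someOtherTitle,
>>>         new ParserOptions() );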
>>>
>>>       
>>>> When one thing is being parsed, there is one parser doing that task. I
>>>> don't know of many cases where multiple parsers exist (unless an
>>>> extension is doing something screwy).
>>>>         
>>> We have observed the use of multiple parsers, or of one parser with
>>> multiple title objects (the distinction is not really relevant for us),
>>> in between SMW calls on various wikis. We use hooks during parsing to set
>>> the title of the page that is currently processed, so that we notice when
>>> titles change and have to reset the data (in a long PHP run many titles
>>> may be processed, and there is no guarantee that some save-hook is called
>>> before the next page starts processing).
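>>>
>>> Simplified, every hook of ours starts like this (the names are invented
>>> for the sketch, not our actual internals):
>>>
>>>     function smwfSomeParserHook( &$parser /* , ... */ ) {
>>>         global $smwgCurrentTitle;
>>>         $title = $parser->getTitle();
>>>         if ( $smwgCurrentTitle === null ||
>>>              $smwgCurrentTitle->getPrefixedText() !== $title->getPrefixedText() ) {
>>>             smwfClearStoredData(); // another title is parsed now
>>>             $smwgCurrentTitle = $title;
>>>         }
>>>         // ... actual work ...
>>>         return true;
>>>     }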
>>>
>>> Initially in 1.2, we reset the data and title just once during parsing,
>>> and not all hooks set the title again. This led to nasty bugs where data
>>> was stored for the wrong title (in one case we even had annotated special
>>> pages!). Since the title for storing was only set within hooks of the
>>> parser (using getTitle() of the supplied parser), the only explanation is
>>> that some other parser fired those hooks while a different title object
>>> was being parsed, and that this happened before we saved the current data
>>> to the DB.
>>>
>>> Now we make sure that each and every hook call first sets the proper
>>> current title and only then saves data of any kind. This at least ensures
>>> that no data ever ends up under the wrong title, but data can still be
>>> lost. Again it happened that titles changed between parsing and storing
>>> (leading to loss of data, since the change of title also led to clearing
>>> the internal data buffer). So we now use a second buffer to store the
>>> data already parsed for the *previous* title, just in case it turns out
>>> that the next saving method actually wants to save this data! But this is
>>> just a hack: we are blindly moving from hook to hook, parsing data here
>>> and there, not knowing for which cases there will be a save-hook later
>>> on. It is all very frustrating, and race conditions are still possible.
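>>>
>>> In pseudo-code, the second buffer amounts to this (names invented for
>>> the sketch):
>>>
>>>     if ( $newTitle->getPrefixedText() !== $currentTitle->getPrefixedText() ) {
>>>         $previousTitle = $currentTitle; // keep one generation around,
>>>         $previousData  = $currentData;  // in case its save-hook still
>>>         $currentTitle  = $newTitle;     // arrives later
>>>         $currentData   = array();
>>>     }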
>>>
>>> Even now we still experience cases where apparently random data is lost
>>> when we create update jobs for all pages: some pages just lose their
>>> properties, but they are different pages each time we try. And of course
>>> this affects at most 10 pages per run on a densely annotated wiki with
>>> 7000 articles (semanticweb.org).
>>>
>>> With your report this morning in mind, I also removed setting the title
>>> in ParserBeforeStrip. Maybe this reduces the number of wrongly set titles.
>>>
>>>       
>>>> Have you tried making use of the ParserOutput? That seems like the most
>>>> to-the-point place; there should only be one of those per parse.
>>>>         
>>> I have not really found a way to use it properly yet. Can it hold
>>> additional data somewhere?
>>>
>>> Not only the semantic data but also other "globals" are affected by
>>> similar problems. We use globals to add CSS and JavaScript to pages,
>>> based on whether a page needs them. It turned out that jobs are executed
>>> when viewing a special page, in between the time when the page is parsed
>>> and when the output HTML is created. Hence any job would actually have to
>>> capture the current globals and restore them after doing any parsing, or
>>> otherwise the job's parsers will "use up" the script data needed by the
>>> special page. Again, one could add further protection to make sure
>>> scripts are only "consumed" by the page that created them, but these are
>>> all just workarounds for the basic problem: if you need to preserve data
>>> between hooks, how can you make sure that the data is not stored forever
>>> and yet remains available long enough until you need it?
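>>>
>>> Maybe the right direction would be to keep such items on the
>>> ParserOutput and copy them into the OutputPage only at the end, e.g. via
>>> the OutputPageParserOutput hook (untested sketch, invented names):
>>>
>>>     // while parsing: remember the scripts with the page's own output
>>>     $parser->mOutput->mMyExtHeadItems['myext-js'] = $scriptTag;
>>>
>>>     // $wgHooks['OutputPageParserOutput'][] = 'myextfCopyHeadItems';
>>>     function myextfCopyHeadItems( &$out, $parserOutput ) {
>>>         if ( isset( $parserOutput->mMyExtHeadItems ) ) {
>>>             foreach ( $parserOutput->mMyExtHeadItems as $key => $item ) {
>>>                 $out->addHeadItem( $key, $item );
>>>             }
>>>         }
>>>         return true;
>>>     }
>>>
>>> But I have not tried whether this survives the job interleaving
>>> described above.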
>>>
>>> -- Markus
>>>
>>>       
>>>>
>>>> Markus Krötzsch wrote:
>>>>         
>>>>> Hi Daniel,
>>>>>
>>>>> it's always refreshing to get some thorough code critique from you in
>>>>> the morning -- thanks for caring! I have added you to our contributors'
>>>>> list, and I would much appreciate your ideas on some further hacks that
>>>>> I am well aware of; see below.
>>>>>
>>>>>           
>>>>>> Anyone want to explain to me why the ParserBeforeStrip hook is being
>>>>>> used to register parser functions?
>>>>>>             
>>>>> In defence of my code: it works. Up to the introduction of
>>>>> ParserFirstCallInit, it was also one of the few hooks that reliably got
>>>>> called (at least in my experience) before any parser function would be
>>>>> needed.
>>>>>
>>>>>           
>>>>>> That is a poor place for it, as well as unreliable. I can see that by
>>>>>> how the function being called is a major hack, relying on the first
>>>>>> call returning the callback name when it is already set...
>>>>>>             
>>>>> Well, I have seen worse hacks (only part of which were in my code, but
>>>>> see the remarks below on a major problem I still see there). But point
>>>>> taken for this issue too.
>>>>>
>>>>>           
>>>>>> Since I took the liberty of fixing up Semantic Forms, please see it as
>>>>>> a reference on how to correctly add Parser Functions to the parser:
>>>>>> http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/SemanticForms/includes/SF_ParserFunctions.php?view=markup
>>>>>>             
>>>>> Great, I added similar code to SMW now.
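>>>>>
>>>>> For the archives, the pattern boils down to this (slightly simplified;
>>>>> the callback names are illustrative, not the exact SMW ones):
>>>>>
>>>>>     $wgHooks['ParserFirstCallInit'][] = 'smwfRegisterParserFunctions';
>>>>>
>>>>>     function smwfRegisterParserFunctions( &$parser ) {
>>>>>         $parser->setFunctionHook( 'ask', 'smwfProcessInlineQuery' );
>>>>>         return true; // let other extensions register too
>>>>>     }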
>>>>>
>>>>>
>>>>> To stay with this topic, I feel that the whole parser hooking business
>>>>> is bound to be one large hack. As a parser extension that stores data,
>>>>> you need to hook into several places in MW, hoping that they are
>>>>> somehow called in the expected order and that nobody overwrites your
>>>>> data in between hooks. We have to store the parsed data somewhere, and
>>>>> this place needs to be globally accessible, since the parser offers no
>>>>> local storage to us (none that would not be cloned into unrelated
>>>>> sub-parsers anyway). But parsing is not global and happens in many
>>>>> parsers, or in many, possibly nested, runs of one parser. The current
>>>>> code has evolved to prevent many of the problems this creates, but it
>>>>> lacks a unified approach towards handling the situation.
>>>>>
>>>>> Many things can still go wrong. There is no way of finding out whether
>>>>> we run in the main parsing method of a wiki page text, or if we are
>>>>> just called on some page footer or sub-parsing action triggered by some
>>>>> extension. Jobs and extensions cross-fire with their own parsing calls,
>>>>> often using different Title objects.
>>>>>
>>>>> Do you have any insights on how to improve the runtime data management
>>>>> in SMW so that we can collect data belonging to one article in multiple
>>>>> hooks, not have it overwritten by other sub-hooks, and still not get
>>>>> memory leaks on very long runs? We cannot keep all data indefinitely
>>>>> just because we are unsure whether we are still in a sub-parser and
>>>>> will need the data later on. But if we only store the *current* data,
>>>>> we need to find out which title is actually being parsed, with the goal
>>>>> of storing or updating its data in the DB.
>>>>>
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Markus
>>>>>