I too see the long preg_match regex error, even when just running
runJobs.php to clear the job queue.  Perhaps it has something to do with
Unicode pages with lots of semantic data (the only connection I can see
between Temlakos' wiki and mine).  Please let me know how I can help us all
track down the source of this glitch.

-Robert

On Fri, Jul 11, 2008 at 8:07 PM, Temlakos <[EMAIL PROTECTED]> wrote:

> Markus Krötzsch wrote:
> > On Freitag, 11. Juli 2008, Temlakos wrote:
> >
> >> Everyone:
> >>
> >> Several weeks ago, I finally figured out how to install SMW's
> >> maintenance scripts as symlinks in my server's wiki maintenance
> >> subdirectory so that I could run them.
> >>
> >> But when I ran SMW_refreshData.php, I got multiple warnings saying that
> >> a call to preg_match failed on an overly long regular expression. The
> >> implicated file was my custom "historical date" file. And after multiple
> >> such "warnings," the execution of the file finally ended with one word:
> >> "Killed."
> >>
> >> Markus, I believe you have a copy of the historical-date file
> >> (SMW_DV_HxDate.php). The longest regular expression (regex) in it is
> >> $screenpat, and my file calls preg_match with that string in order to
> >> screen out date texts that are not in a form that the script would
> >> recognize. I do that to ensure that any annotated date that passed that
> >> test would be sure to represent a valid date, so long as month names
> >> were spelled correctly, etc.
> >>
> >> But if a long regex is creating a problem, then I must solve it today,
> >> before I update my wiki. Otherwise, SMW_refreshData.php will kill itself
> >> again, and it will leave the job unfinished.
> >>
> >> How long can a regex be and not cause a problem with the execution of
> >> SMW_refreshData?
> >>
> >> The regex strings in the file are $screenpat and $format1, $format2,
> >> $format3, $format4, and $format5.
> >>
> >> These strings have 219, 83, 89, 84, 85, and 55 characters, respectively.
> >>
> >> Any assistance would be appreciated. Furthermore, if anyone else hopes
> >> to use the Historical Date script, then I can't have it creating a
> >> problem every time someone wants to run SMW_refreshData.php.
> >>
> >
> > I never encountered a similar problem. We also have long regexps in SMW, and
> > the lengths you gave do not sound impressively long to me either. Are all
> > regexps static, with no variables of possibly unexpected content?
> > Can a web search help you on your warning/error messages?
> >
> > Of course, "Killed" sounds like an emergency brake due to the shortage of
> > some resource (such as memory). Does the problem occur when you start on
> > that very page (using -v and then -s <id> as options for refreshData)?
> >
> > SMW_refreshData.php as such does not do many things that would be different
> > from normal page editing, though it calls functions in a slightly different
> > program context (I just fixed some bugs in SemanticCalendar, which relied on
> > the global $wgTitle, which is not guaranteed to contain anything during
> > parsing in general and refreshData in particular). Besides these things, it
> > would normally use the same code as when writing a page. Of course, the php
> > command on a server may behave differently from the php module in Apache
> > (and the admissible length of regexps appears to be rather specific to PHP).
> >
> > Anyway, you can use the option -v to see which page id causes the problem,
> > and then use the option -s <id+1> to continue after that page. This way, you
> > skip one page but can still refresh the rest. Use the MediaWiki API or the
> > database to find out which page causes the problem, and check whether it
> > works normally when read/edited on the web.
> >
> > Markus
> >
> >
> > P.S. It seems that this is a discussion for the developers' list ...
> >
> Duly noted. I will now publish this to the development list as well,
> though the other users might want to see my answers.
>
> I have no insight on the problem of long regex strings. My regexes are
> static, first of all. The warning messages created such confusion and
> went by so fast that I didn't have a chance to see where the "kill"
> occurred before it happened. This might or might not be significant: I
> did not at first run refreshData and restrict it to refreshing type and
> property pages only. Instead I ran it on the entire database, using the
> -v option. That's why I saw all those warnings.
>
> Terry A. Hurlbut
>
> PS: Thank you for detailing the -s option. I did not at first see it
> in the commented documentation in the file.
>
> TAH
>
>
> -------------------------------------------------------------------------
> Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
> Studies have shown that voting for your favorite open source project,
> along with a healthy diet, reduces your potential for chronic lameness
> and boredom. Vote Now at http://www.sourceforge.net/community/cca08
> _______________________________________________
> Semediawiki-devel mailing list
> Semediawiki-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
>
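
For anyone else chasing this, here is a rough sketch of the workflow Markus
describes above: note the page id that -v reports at the point of failure,
restart with -s <id+1>, and ask the MediaWiki API which page that id belongs
to. It is written in Python purely for convenience in building the strings.
The id 1234 and the api.php address are made-up placeholders, and the helper
names are mine, not part of SMW or MediaWiki. The only script options used
are the -v and -s mentioned in this thread.

```python
from urllib.parse import urlencode


def resume_args(failed_page_id):
    """Command-line arguments to restart SMW_refreshData.php just past a failing page.

    Uses only the -v and -s options described in the quoted message.
    """
    return ["php", "SMW_refreshData.php", "-v", "-s", str(failed_page_id + 1)]


def page_lookup_url(api_url, page_id):
    """Build a MediaWiki API query URL asking for basic info on a page id."""
    query = urlencode({
        "action": "query",
        "pageids": page_id,
        "prop": "info",
        "format": "json",
    })
    return api_url + "?" + query


if __name__ == "__main__":
    failed_id = 1234  # hypothetical: the id shown by -v when the run failed
    print(" ".join(resume_args(failed_id)))
    # Hypothetical wiki address; substitute the path to your own api.php.
    print(page_lookup_url("http://example.org/w/api.php", failed_id))
```

Opening the resulting URL in a browser should show the page's title, which
can then be viewed and edited on the web to check whether the same warning
appears outside the maintenance script.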



-- 
Roses are red,Violets are blue,I'm schizophrenic,and so am I.