Markus Krötzsch wrote:
> On Friday, 11 July 2008, Temlakos wrote:
>
>> Everyone:
>>
>> Several weeks ago, I finally figured out how to install SMW's
>> maintenance scripts as symlinks in my server's wiki maintenance
>> subdirectory so that I could run them.
>>
>> But when I ran SMW_refreshData.php, I got multiple warnings saying that
>> a call to preg_match failed on an overly long regular expression. The
>> implicated file was my custom "historical date" file. And after multiple
>> such "warnings," the execution of the file finally ended with one word:
>> "Killed."
>>
>> Markus, I believe you have a copy of the historical-date file
>> (SMW_DV_HxDate.php). The longest regular expression (regex) in it is
>> $screenpat, and my file calls preg_match with that string in order to
>> screen out date texts that are not in a form that the script would
>> recognize. I do that to ensure that any annotated date that passes that
>> test is sure to represent a valid date, so long as month names
>> are spelled correctly, etc.
>>
>> But if a long regex is creating a problem, then I must solve it today,
>> before I update my wiki. Otherwise, SMW_refreshData.php will kill itself
>> again, and it will leave the job unfinished.
>>
>> How long can a regex be and not cause a problem with the execution of
>> SMW_refreshData?
>>
>> The regex strings in the file are $screenpat and $format1, $format2,
>> $format3, $format4, and $format5.
>>
>> These strings have 219, 83, 89, 84, 85, and 55 characters, respectively.
>>
>> Any assistance would be appreciated. Furthermore, if anyone else hopes
>> to use the Historical Date script, then I can't have it creating a
>> problem every time someone wants to run SMW_refreshData.php.
>>
>
> I have never encountered a similar problem. We also have long regexps in
> SMW, and the lengths you gave do not sound impressively long to me
> either. Are all regexps static, with no variables of possibly unexpected
> content? Can a web search on your warning/error messages help?
>
> Of course, "Killed" sounds like an emergency stop due to the shortage of
> some resource (such as memory). Does the problem occur when you start on
> that very page (using -v and then -s <id> as options for refreshData)?
>
> SMW_refreshData.php as such does not do many things that would be
> different from normal page editing, though it calls functions in a
> slightly different program context (I just fixed some bugs in
> SemanticCalendar, which relied on the global $wgTitle, which is not
> guaranteed to contain anything during parsing in general and during
> refreshData in particular). Apart from that, it would normally use the
> same code as when writing a page. Of course, the php command on a server
> may behave differently from the PHP module in Apache (and the admissible
> length of regexps appears to be rather specific to PHP).
>
> Anyway, you can use the option -v to see which page id causes the
> problem, and then use the option -s <id+1> to continue after that page.
> This way, you skip one page but can still refresh the rest. Use the
> MediaWiki API or the database to find out which page causes the problem,
> and check whether it works normally when read/edited on the web.
>
> Markus
>
>
> P.S. It seems that this is a discussion for the developers' list ...
>

Duly noted. I will now publish this to the development list as well,
though the other users might want to see my answers.

I have no insight on the problem of long regex strings. My regexes are
static, first of all. The warning messages created such confusion and
went by so fast that I didn't have a chance to see where the "kill"
occurred before it happened.

This might or might not be significant: I did not first run refreshData
restricted to type and property pages only. Instead I ran it on the
entire database, using the -v option. That's why I saw all those
warnings.

Terry A. Hurlbut

PS: Thank you for detailing the -s option. I did not at first see that
in the commented documentation in the file.

TAH

_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
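
Since the warnings scrolled by too fast to read, one way to narrow the problem down might be to check each static pattern once, outside of SMW_refreshData.php. The following is only a minimal sketch: the helper find_broken_patterns() is hypothetical and not part of SMW or SMW_DV_HxDate.php, and the patterns are placeholders, because the real $screenpat and $format1 to $format5 are not shown in this thread. It relies only on the documented behaviour that preg_match() returns false when it fails.

```php
<?php
// Hypothetical helper (not part of SMW): reports which named patterns
// preg_match() cannot use, so a bad pattern is listed once by name
// instead of producing one warning per refreshed page.
function find_broken_patterns(array $patterns): array
{
    $broken = [];
    foreach ($patterns as $name => $pattern) {
        // preg_match() returns 1 or 0 on success and false on failure.
        // An empty subject is enough to force the pattern to be compiled;
        // "@" hides the PHP warning so that only our own report is shown.
        $result = @preg_match($pattern, '');
        if ($result === false) {
            $broken[$name] = preg_last_error();
        }
    }
    return $broken;
}

// Placeholder patterns for illustration only (assumptions, not the
// actual contents of SMW_DV_HxDate.php).
$patterns = [
    'screenpat' => '/^\d{1,4}\s+(BC|AD)$/',
    'format1'   => '/^(\d{1,2})\s+(\w+)\s+(\d{1,4})$/',
    'broken'    => '/(unclosed/',   // deliberately invalid
];

foreach (find_broken_patterns($patterns) as $name => $code) {
    echo "Pattern '$name' failed (preg_last_error() = $code)\n";
}
```

If every real pattern passes such a check, the regex length itself is probably not the cause, and the "Killed" message would point back to a resource limit on one particular page, which the -v and -s options can isolate.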