Hello,

I'm working on a wiki and attempting to use regular expressions to parse the wiki code.

I was trying to figure out how performance is affected by splitting the string up (since there are areas I don't want parsed) and by "joining" two conditions together into one pattern.

Splitting the input up line by line seemed to hurt performance a bit, even though it matched fewer items. That is probably due to memory disjointness.

Merging two conditions, for example (\*\*|Bob), hurt much more: the merged pattern took more than twice as long as each pattern takes individually. To check whether combining was at least faster than the alternative, I ran the \*\* match and the Bob match separately over the entire string, and together they took 1/4 of the time the merged regex took. I would have thought that a merged pattern would let the regex engine scan for a * or a B and go on from there without much trouble, and that going through the string only once would be faster because it would cause fewer cache misses.

I ran these tests with "compiled" regexes on Mono. Strangely enough, when I re-ran them non-compiled I got the same performance numbers. Does Mono not have regex compiling yet?

If anyone has suggestions on how to approach parsing a wiki, any help would be greatly appreciated. A few notes on how I'm doing it:

- I'm parsing the wiki into a tree of elements, so that output could potentially go to things other than HTML. One option is re-outputting to wiki code as a cleanup step, since the regexes eat up a little bit of junk and accept a few things that are 'ok' but not in the spec. Another is an application, so that the wiki could be used offline.
- I'm using Lua as a scripting language so that various things can be handled programmatically. It could allow for easy extension of the wiki's abilities.
- It's all coded in C# (well, with Lua as a little glue for putting together the main pages and other stuff).
- I've looked into using Jay and other parser generators, but those look like overkill. If anyone thinks a parser/lexer could work better or faster, please let me know.

Thanks!

-- Thomas Harning Jr.
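
The merged-versus-separate comparison described above could be reproduced with something like the following. This is only a minimal sketch, not the original test code: the class name RegexTiming, the sample input, the line count, and the iteration count are all assumptions made for illustration. It times the merged pattern against the two separate patterns, once without and once with RegexOptions.Compiled, and reports the match counts so the two approaches can be checked against each other.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

static class RegexTiming
{
    // Runs every pattern over the whole input, `iterations` times.
    // Returns the elapsed milliseconds; matchCount is the number of
    // matches found by all patterns together in a single pass.
    public static long Time(string input, RegexOptions options, int iterations,
                            out int matchCount, params string[] patterns)
    {
        var regexes = new Regex[patterns.Length];
        for (int i = 0; i < patterns.Length; i++)
            regexes[i] = new Regex(patterns[i], options);

        matchCount = 0;
        var watch = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
        {
            matchCount = 0;
            foreach (var regex in regexes)
                matchCount += regex.Matches(input).Count;
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    // Hypothetical sample text: each line contains "**" twice and "Bob" once.
    public static string BuildSample(int lines)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < lines; i++)
            sb.Append("Some **bold** text mentioning Bob on line ").Append(i).Append('\n');
        return sb.ToString();
    }

    static void Main()
    {
        string input = BuildSample(20000);
        foreach (var options in new[] { RegexOptions.None, RegexOptions.Compiled })
        {
            int merged, separate;
            long mergedMs = Time(input, options, 10, out merged, @"(\*\*|Bob)");
            long separateMs = Time(input, options, 10, out separate, @"\*\*", "Bob");
            Console.WriteLine("{0}: merged {1} ms ({2} matches), separate {3} ms ({4} matches)",
                options, mergedMs, merged, separateMs, separate);
        }
    }
}
```

Construction of the Regex objects is kept outside the timed loop, so any compilation cost is excluded and only matching is measured. If the None and Compiled rows come out the same, that would be consistent with the compiled option having no effect on that runtime, though a single run on one input is not conclusive.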
_______________________________________________ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list