Erez Schatz <moonb...@gmail.com> writes:

> Care to share your script here? I'd love to see what you came up with.

OK. I've stripped out the application-specific data from the script;
here it is in its redacted form:

,{
    ,y/<form/ g/<head/ s/(.|\n)*>(Expected Page Title)<(.|\n)*updated.*>([0-9]+\/[0-9]+\/[0-9]+)<(.|\n)*/:Title: \2\n:Date: \4\n/
    s/<form//
    ,y/<form/ v/<head/ y/<h2/ v|/h2| d
    ,y/<form/ v/<head/ s/<h2/\n/
    ,y/<form/ v/<head/ y/<h2/ {
        g|/h2| y/<tr/ {
            v/<td/ s/([ >]| )*([^<]*)<\/h2>(.|\n)*/\n\n\2\n==========================\n/
            g/<td/ g/Entry#/ y/<td/ {
                v|/td| c/\n/
            }
            g/<td/ v/Entry#/ y/<td/ v|/td| d
            g/<td/ y/<td/ {
                g|/td| v/colspan="3"/ g/>(Header Foo)/ y/(Header Foo)/ {
                    v|/td| c/:/
                    g|/td| s/ : /:/
                    g|/td| x/(\n|<[^<>]+>\??| )+/ c/ /
                    g|/td| a/\n/
                }
                g|/td| v/colspan="3"/ g/>(Header Bar)/ y/(Header Bar)/ {
                    v|/td| c/:/
                    g|/td| i/:/
                    g|/td| x/( |\n|<[^<>]+>\??| )+/ c/ /
                    g|/td| a/\n/
                }
                g|/td| v/colspan="3"/ g/>(Entry#|Field Foo|Filed Bar|Field Baz|Field Quux|Field Snarf|Field Barf)/ y/(Entry#|Field Foo|Filed Bar|Field Baz|Field Quux|Field Snarf|Field Barf)/ {
                    v|/td| c/:/
                    g|/td| s/[:.?]*/:/
                    g|/td| x/( |\n|<[^<>]+>| )+/ c/ /
                    g|/td| a/\n/
                }
                g|/td| v/colspan="3"/ g/inch/ {
                    s/[^>]*>( |\n|<[^<>]+>| )*(([a-zA-Z0-9\-]+ )+inch ?([a-zA-Z0-9\-]+ )*[a-zA-Z0-9\-]+)( |\n|<[^<>]+>| )*/:Inches: \2\n/
                }
                g|/td| v/colspan="3"/ g/Stock/ {
                    i/:Density:/
                    x/( |\n|<?[^<>]+>| )+/ c/ /
                    a/\n/
                }
                g|/td| g/colspan="3"/ {
                    s/[^>]*>/:Details:/
                    y/[^>]*colspan="3"[^>]*>/ x/( |\n|<[^<>]+>\??| )+/ c/ /
                    a/\n:Notes: \n\n/
                }
                g|/td| g|checkbox| d
            }
        }
    }
}
,x/<h2|<tr|<td/d

That last line is a bit of a hack. I needed it because there didn't
appear to be any way to delete the "<h2" delimiters from within the
,y/<h2/. But it works because those HTML tags do not appear in the
final reStructuredText.

A lot of the complexity of the script comes from the need to keep the
changes in sequence. I really hope that the implementation of the sam
language in Acme doesn't impose the same requirement to keep changes in
order; it's a Real Pain(TM).

One of the things that still perplexes me is the apparent necessity of
the s command.
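
As far as I can tell, the sticking point is that s can reuse
parenthesized subexpressions (the \2 and \4 in the script above), while
c only inserts its text literally. Here is a rough Python sketch of that
difference. It is not sam: Python's re module only approximates sam's
regular expressions, and the names sam_s and sam_x_c are made up for
illustration.

```python
import re


def sam_s(dot: str, pattern: str, template: str) -> str:
    """Rough stand-in for sam's s/pattern/template/ applied to dot.

    Only the first match is rewritten, and the template may refer to
    parenthesized subexpressions as \\1, \\2, ...
    """
    return re.sub(pattern, template, dot, count=1)


def sam_x_c(dot: str, pattern: str, text: str) -> str:
    """Rough stand-in for x/pattern/ c/text/ applied to dot.

    Every match is rewritten, but the text is inserted literally, so a
    \\1 in it stays a backslash followed by a digit.
    """
    return re.sub(pattern, lambda match: text, dot)


if __name__ == "__main__":
    date_pattern = r">([0-9]+/[0-9]+/[0-9]+)<"

    # s keeps the captured date; x ... c cannot refer back to it.
    print(repr(sam_s(">12/31/2009<", date_pattern, r":Date: \1")))
    print(repr(sam_x_c(">12/31/2009<", date_pattern, r":Date: \1")))

    # s touches only the first "<h2"; x ... c touches all of them.
    print(repr(sam_s("a<h2b<h2c", "<h2", "-")))
    print(repr(sam_x_c("a<h2b<h2c", "<h2", "-")))
```

Where the replacement is a fixed string, the two forms come out the same
apart from first-match versus every-match; it is the substitutions that
carry matched text into the output that seem to have no plain x and c
equivalent.
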
The sam paper claims that the s command isn't necessary. But I couldn't
find any way to do the edits without resorting to it. If you could
figure out how to replace the s commands with a combination of other
sam commands, I'd be quite impressed indeed!

--
+---------------------------------------------------------------+
|Smiley <smi...@icebubble.org> PGP key ID: BC549F8B |
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA 3489 DAB7 555A BC54 9F8B|
+---------------------------------------------------------------+