On Sat, Mar 06, 2004 at 05:41:21PM -0800, Dan Quinlan wrote:
> Given how we decode MIME data, I think this might make sense:

As an FYI, this exact discussion came up a month or so ago. There was a
ticket about it, and some sa-users (or dev?) chatter. I tried to dig up
a pointer to it, but didn't find it in my archive or in bugzilla,
strangely. :(

> * header - stays the same
> * body - decoded and rendered text (unchanged)
> * decoded - decoded text (by default, see below), not rendered (new
>   type, similar to the old "rawbody")
> * raw - pristine body, no changes, (raw means raw, whee)
> * full - pristine message, headers plus body (mostly for checksum tests)

Sounds fine to me, but see below. BTW: the current full is essentially
the same except the headers get slightly cooked. Specifically,
whitespace folding is removed so that the RE matches would be simpler
to write.

> For each test, make the default form of the data be a reference to an
> array like how body currently works.

Sure.

> Next, a set of modifiers for each:
>
> one common modifier for all 4 types:
> - a 'join' (or 'string' or whatever) modifier to return the entire
>   data in a single string, performance-be-damned

ok.

> one modifier for "decoded":
> - a regex to select decoded versions of specific content-types, any
>   possible content type: text, application, image, etc. "decoded"
>   would default to the same set of types that are ultimately
>   rendered as body, of course

I don't really like the modifier idea for this, and I think the same
thing would need to work for "raw" as well. "raw" normal would be the
whole pristine body, versus "raw" with a modifier or whatever which
could search for specific parts, etc. Things that you can search/deal
with now trivially: Content-Type (just the type, no reason to use the
full header), Attachment Name (/\.zip$/ for instance), decoded section
offset (start string at offset X bytes), and decoded section length
(only let me match against the first Y decoded bytes). Doing both raw
and decoded in this form, btw, would be sufficient for ticket 3010 as
well imo.

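To make the per-part selectors above a bit more concrete, here is a rough sketch in Python (not SpamAssassin's Perl, and not anything that exists in SA). The function name `select_parts` and all of its parameters are hypothetical, made up for illustration; it only shows how content type, attachment name, offset, and length could narrow down either the decoded or the raw form of each MIME section.

```python
import re
from email import message_from_bytes
from email.message import Message
from typing import Iterator, Optional


def select_parts(
    msg: Message,
    content_type: Optional[str] = None,  # regex on the type only, e.g. r"^text/"
    name: Optional[str] = None,          # regex on the attachment name, e.g. r"\.zip$"
    offset: int = 0,                     # start at this byte of the section
    length: Optional[int] = None,        # return at most this many bytes
    decoded: bool = True,                # False yields the still-encoded ("raw") section
) -> Iterator[bytes]:
    """Yield the selected slice of every leaf MIME part that matches.

    Hypothetical helper: an illustration of the selectors discussed in
    the thread, not a SpamAssassin API.
    """
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no data of their own
        if content_type and not re.search(content_type, part.get_content_type(), re.I):
            continue
        if name and not re.search(name, part.get_filename() or "", re.I):
            continue
        data = part.get_payload(decode=decoded)
        if isinstance(data, str):
            # The undecoded payload comes back as str; turn it back into bytes.
            data = data.encode("ascii", "surrogateescape")
        end = None if length is None else offset + length
        yield data[offset:end]


# A tiny two-part message: plain text plus a base64-encoded "zip" attachment.
SAMPLE = (
    b"From: a@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="b"\r\n'
    b"\r\n"
    b"--b\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello world\r\n"
    b"--b\r\n"
    b'Content-Type: application/zip; name="x.zip"\r\n'
    b'Content-Disposition: attachment; filename="x.zip"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"UEsDBAoAAAAAAA==\r\n"
    b"--b--\r\n"
)
```

With this, `select_parts(message_from_bytes(SAMPLE), name=r"\.zip$", length=4)` looks only at the first four decoded bytes of the zip attachment, while passing `decoded=False` returns the first four base64 characters of the same section instead. A rule engine could then run its regex against each yielded slice.
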
BTW: In talking with Dan about this earlier today off-list, I was
strongly opposed because we need to get a release out sooner rather
than later. But after thinking about it some more, it's really not a
"minor version" type of change, so we should straighten this out now
or wait for 4.0...

--
Randomly Generated Tagline:
"You just reciprocate the small one ..." - Peter Sagerson
