Re: Email::Address::XS
On Tuesday 14 February 2017 21:26:34 p...@cpan.org wrote:
> On Saturday 28 January 2017 21:48:55 Ricardo Signes wrote:
> > * p...@cpan.org [2017-01-14T15:32:57]
> >
> > > So lets move. This is implemented in my pull request:
> > > https://github.com/rjbs/Email-MIME/pull/35
> >
> > Done!
>
> I have there open question about header_to_class_map. Can you look at
> it?

And the last part, which implements Email::Address::XS support in
Email::MIME via the new module Email::MIME::Header::AddressList, is in
this pull request: https://github.com/rjbs/Email-MIME/pull/38

Feel free to comment on it, or to suggest a better idea or improvements
if it does not look OK...
Re: Email::Address::XS
On Monday 23 May 2016 19:05:39 p...@cpan.org wrote:
> Hello!
>
> I created new perl module Email::Address::XS for parsing and formatting
> email groups or addresses. Parser is borrowed from dovecot and that part
> implemented in C/XS.
>
> Source code is currently at:
> https://github.com/pali/Email-Address-XS
>
> Email::Address::XS has backward compatible API with old Email::Address
> module (which has security problem CVE-2015-7686) and my new module is
> intended to replace old Email::Address.
>
> This module supports not only single list of addresses, but also named
> groups of addresses (according to RFC 2822).
>
> I tried to make source code readable, documented and also fast (thanks
> to dovecot parser written in C; not in perl regex).
>
> It contains also lot of examples and test cases to check that parser and
> formatter is correct.
>
> See pod documentation and unit tests:
> https://github.com/pali/Email-Address-XS/blob/master/lib/Email/Address/XS.pm
> https://github.com/pali/Email-Address-XS/blob/master/t/Email-Address-XS.t
>
> Thanks to named group support I would like to extend Email::MIME module
> to allow passing directly Email::Address::XS objects, not only string
> headers to make MIME encoding and decoding from applications easier.
>
> What do you think about it?

Back to my original email about Email::Address::XS...

I fixed the last known C/XS related bugs, and the automatic tests pass on
Travis-CI and AppVeyor with different perl versions.

Finally, Email::Address::XS is available on CPAN:
https://metacpan.org/pod/Email::Address::XS

The module is compatible back to perl 5.6.0 and works fine on Linux,
Windows, FreeBSD and HP-UX.

If you find any problems with it, let me know.
Re: Email::Address::XS
On Saturday 28 January 2017 21:48:55 Ricardo Signes wrote:
> * p...@cpan.org [2017-01-14T15:32:57]
>
> > So lets move. This is implemented in my pull request:
> > https://github.com/rjbs/Email-MIME/pull/35
>
> Done!

I have there open question about header_to_class_map. Can you look at
it?
Re: Email::Address::XS
* p...@cpan.org [2017-01-14T15:32:57]
> So lets move. This is implemented in my pull request:
> https://github.com/rjbs/Email-MIME/pull/35

Done!

--
rjbs
Re: Email::Address::XS
On Saturday 14 January 2017 21:32:57 p...@cpan.org wrote:
> On Sunday 04 September 2016 00:24:56 Ricardo Signes wrote:
> > If we never *store* objects, but only produce them as requested, then
> > I think the total needed changes are -- but I'm sure I'll miss
> > things -- as follows:
> >
> > * allow header_str and header args to Email::MIME->create to include
> >   objects, which are immediately asked to encode themselves for
> >   storage
> > * add header_as_obj that takes a header name and, optionally, a class
> >   name and offset (an offset so you can ask for an object of the nth
> >   Received header)
> > * a registry used by header_as_obj to get a default class name from
> >   header name
>
> So lets move. This is implemented in my pull request:
> https://github.com/rjbs/Email-MIME/pull/35
>
> Default class name for header name is retrieved from hash:
> Email::MIME::Header::header_to_class_map
>
> Comments and review is welcome!

rjbs, can you review my pull request?
Re: Email::Address::XS
On Sunday 04 September 2016 00:24:56 Ricardo Signes wrote:
> If we never *store* objects, but only produce them as requested, then
> I think the total needed changes are -- but I'm sure I'll miss
> things -- as follows:
>
> * allow header_str and header args to Email::MIME->create to include
>   objects, which are immediately asked to encode themselves for
>   storage
> * add header_as_obj that takes a header name and, optionally, a class
>   name and offset (an offset so you can ask for an object of the nth
>   Received header)
> * a registry used by header_as_obj to get a default class name from
>   header name

So lets move. This is implemented in my pull request:
https://github.com/rjbs/Email-MIME/pull/35

Default class name for header name is retrieved from hash:
Email::MIME::Header::header_to_class_map

Comments and review is welcome!
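
The registry idea in the third bullet can be illustrated with a small, self-contained sketch. It is not the code from the pull request: the hash contents, the lower-cased keys, the helper name class_for_header and the class My::Header::Something are all assumptions made for illustration. Only the idea (explicit class first, then a default looked up by header name) comes from the thread.

```perl
use strict;
use warnings;

# Hypothetical registry: lower-cased header name => class name.
# The thread only names the hash (header_to_class_map); these entries
# and the lower-casing convention are assumptions.
my %header_to_class_map = (
    'from' => 'Email::MIME::Header::AddressList',
    'to'   => 'Email::MIME::Header::AddressList',
    'cc'   => 'Email::MIME::Header::AddressList',
    'bcc'  => 'Email::MIME::Header::AddressList',
);

# Resolve the class for a header: an explicit class wins,
# otherwise fall back to the registry (undef if nothing is registered).
sub class_for_header {
    my ($name, $class) = @_;
    return $class if defined $class;
    return $header_to_class_map{ lc $name };
}

# An application can register its own, application-specific header.
$header_to_class_map{'x-something'} = 'My::Header::Something';

print class_for_header('Cc'), "\n";
print class_for_header('X-Something'), "\n";
print class_for_header('To', 'My::Other::Class'), "\n";
```

Keeping the lookup separate from object construction means a caller can still override the class per call, which is what the optional class argument to header_as_obj is for.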
Re: Email::Address::XS
On Sunday 04 September 2016 00:24:56 Ricardo Signes wrote:
> If we never *store* objects, but only produce them as requested, then
> I think the total needed changes are -- but I'm sure I'll miss
> things -- as follows:
>
> * allow header_str and header args to Email::MIME->create to include
>   objects, which are immediately asked to encode themselves for
>   storage
> * add header_as_obj that takes a header name and, optionally, a class
>   name and offset (an offset so you can ask for an object of the nth
>   Received header)

As a prerequisite for the offset, a small change to Email::Simple is
needed. Currently Email::Simple cannot return just the nth value of the
header with a specified name. Support for this is in a new pull request:
https://github.com/rjbs/Email-Simple/pull/16

> * a registry used by header_as_obj to get a default class name from
>   header name
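
To make the "nth value" requirement concrete, here is a minimal sketch on a toy header store. It does not use Email::Simple and does not reflect how the pull request implements the feature; the data layout and the helper name header_nth are assumptions for illustration only.

```perl
use strict;
use warnings;

# Toy header store: ordered [name, value] pairs, in message order.
my @headers = (
    [ 'Received', 'from mx2.example.net by mx1.example.net' ],
    [ 'Subject',  'hello' ],
    [ 'Received', 'from client.example.org by mx2.example.net' ],
);

# Return the value of the nth (0-based) header with the given name,
# matched case-insensitively, or undef if there is no such header.
sub header_nth {
    my ($name, $n) = @_;
    my @values = map  { $_->[1] }
                 grep { lc($_->[0]) eq lc($name) } @headers;
    return $values[$n];
}

print header_nth('Received', 1), "\n";
```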
Re: Email::Address::XS
On Monday 05 September 2016 10:25:11 p...@cpan.org wrote:
> On Saturday 03 September 2016 18:24:56 Ricardo Signes wrote:
> > The Email::MIME changes look like they could be broken up into
> > several PRs, some of which would be obviously good to apply
> > immediately, like removals of dead code and pointers to bad
> > modules.
>
> If you think that some of those changes can be merged immediately,
> please specify commits and I create new pull request for them. Btw,
> I'm preparing another big patch series for Encode::MIME::Header
> module (call encode("MIME-Header", ...)) which will fix remaining
> bugs. So if you know about some in that, let me know ASAP, so I can
> fix it in my patch series ;-)

A new Encode was released with my fixes to MIME-Header...
https://metacpan.org/pod/Encode::MIME::Header

> ..Which means that removing pointer to that module will not be
> needed..

Now I have created a pull request for Email::MIME:
https://github.com/rjbs/Email-MIME/pull/32

It should contain only code cleanup and fixes, no Email::Address::XS...
Look at it and if there are some problems, let me know!
Re: Email::Address::XS
On Wednesday 28 September 2016 15:29:28 Ricardo Signes wrote:
> * p...@cpan.org [2016-09-18T11:40:53]
>
> > Currently passing string values of From/To/Cc/Bcc/... headers into
> > header_str() method is broken in Email::MIME. That is because
> > Email::MIME currently uses Email::Address for generating those
> > header values (which is broken) and then MIME encode those broken
> > outputs.
> >
> > Email::Address::XS has (looks like) correctly implemented formatter
> > and so it is needed to correctly MIME encode From/To/Cc/Bcc
> > headers.
>
> I suggest making Email::MIME use Email::Address::XS if it is
> available, and adding Email::Address::XS to the recommended prereqs
> of Email::MIME. The right behavior will be easy to get, and usually
> be installed, but it will be possible to proceed with less correct
> behavior if you haven't got a compiler (for some sad reason).
>
> Part of the question is: how wrong do things go, in what
> circumstances, if Email::Address is substituted for
> Email::Address::XS.

The first problem is CVE-2015-7686. If you pass an "unsafe" string into
Email::Address then perl starts eating CPU for a very long period.

The next problem with Email::Address is that it eats the names of email
groups. Which means that if you ask for or pass the MIME-decoded string
version of this raw header:

To: undisclosed-recipients:;

You will just get:

To:

Another problem is parsing a name/phrase which contains a MIME encoded
string which "looks like" an email address. E.g. for MIME header:

To: =?UTF-8?B?PG15Pg==?=

You will get MIME-decoded string:

To: ,

Because of the first problem alone (CVE-2015-7686), I would really
suggest to totally avoid using Email::Address. If you process "unsafe"
email from an attacker on some server, you get a perfect DoS attack.

I think that returning the original header (not MIME decoded) or
croaking is better than using Email::Address.

> > As compromise could be: Whole Email::MIME will not depends on
> > module Email::Address::XS. But if somebody want to pass Unicode
> > string (via header_str) to Email::MIME then MIME encoding will be
> > done via Email::MIME::Header::AddressList (which will use
> > Email::Address::XS). So if caller encodes manually From/To/Cc/...
> > headers and pass them via header_raw() then Email::Address::XS
> > will not be needed.
>
> Specifically, I think, a non-ASCII string. I'm guessing that
> most/many users are really just passing in fixed ASCII strings, so
> this rule wouldn't affect them at all. Users passing in non-ASCII
> would start getting a "automatic encoding of non-ASCII $field header
> requires " error. Seems okay.

Ok.

> > And can be Email::MIME::Header::AddressList part of Email-MIME
> > distribution (even if only this module will depends on XS)?
>
> I guess so. We need to mark this stuff experimental for a while, I
> think, too.

No problem.

I think we have now discussed everything needed... I will implement
patches for Email::MIME and will see how usable it is and how it
works...
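
The rule agreed on above (ASCII values need nothing extra, non-ASCII values need the XS module, otherwise croak) might be sketched as follows. This is only an illustration of the decision logic, not the Email::MIME implementation: the helper names needs_mime_encoding, try_load and check_address_header are hypothetical, and the wording of the error message is my own, since the message quoted in the thread is incomplete.

```perl
use strict;
use warnings;

# True when the string contains any character outside 7-bit ASCII.
sub needs_mime_encoding {
    my ($str) = @_;
    return $str =~ /[^\x00-\x7F]/ ? 1 : 0;
}

# Try to load an optional module at run time; true on success.
sub try_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Decide how an address header value would have to be handled.
# Dies for non-ASCII input when the optional XS module is missing.
sub check_address_header {
    my ($field, $value) = @_;
    return 'ascii: no extra module needed'
        unless needs_mime_encoding($value);
    return 'non-ascii: encode via Email::Address::XS'
        if try_load('Email::Address::XS');
    die "cannot encode non-ASCII $field header without Email::Address::XS\n";
}

print check_address_header('To', 'Alice <alice@example.com>'), "\n";
```

Loading the module lazily, only on the non-ASCII path, is what keeps a compiler-free installation working for callers who pass plain ASCII or pre-encoded raw headers.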
Re: Email::Address::XS
* p...@cpan.org [2016-09-18T11:40:53]
> Currently passing string values of From/To/Cc/Bcc/... headers into
> header_str() method is broken in Email::MIME. That is because
> Email::MIME currently uses Email::Address for generating those header
> values (which is broken) and then MIME encode those broken outputs.
>
> Email::Address::XS has (looks like) correctly implemented formatter and
> so it is needed to correctly MIME encode From/To/Cc/Bcc headers.

I suggest making Email::MIME use Email::Address::XS if it is available,
and adding Email::Address::XS to the recommended prereqs of Email::MIME.
The right behavior will be easy to get, and usually be installed, but it
will be possible to proceed with less correct behavior if you haven't
got a compiler (for some sad reason).

Part of the question is: how wrong do things go, in what circumstances,
if Email::Address is substituted for Email::Address::XS.

> As compromise could be: Whole Email::MIME will not depends on module
> Email::Address::XS. But if somebody want to pass Unicode string (via
> header_str) to Email::MIME then MIME encoding will be done via
> Email::MIME::Header::AddressList (which will use Email::Address::XS). So
> if caller encodes manually From/To/Cc/... headers and pass them via
> header_raw() then Email::Address::XS will not be needed.

Specifically, I think, a non-ASCII string. I'm guessing that most/many
users are really just passing in fixed ASCII strings, so this rule
wouldn't affect them at all. Users passing in non-ASCII would start
getting a "automatic encoding of non-ASCII $field header requires "
error. Seems okay.

> And can be Email::MIME::Header::AddressList part of Email-MIME
> distribution (even if only this module will depends on XS)?

I guess so. We need to mark this stuff experimental for a while, I
think, too.

--
rjbs
Re: Email::Address::XS
On Sunday 18 September 2016 17:26:11 Ricardo Signes wrote:
> * p...@cpan.org [2016-09-17T19:05:51]
>
> > $class->from_mime_string() will take raw MIME encoded string and
> > returns new object of $class (which will have decoded string
> > parts)
> >
> > $object->as_mime_string() will convert (Unicode) $object
> > into raw MIME encoded string
> >
> > It is OK for you?
>
> That all sounded fine. I think the paragraph I left quoted
> overspecifies a bit. Whether the object is storing things decoded
> or not isn't any of our concern as long as it has those two methods.
> But I think we're on the same page.

OK!

> The Email::Address::XS use should be optional, as right now people
> can install Email::MIME in an compiler-free environment. We can add
> it as a recommended prereq.

Currently passing string values of From/To/Cc/Bcc/... headers into
header_str() method is broken in Email::MIME. That is because
Email::MIME currently uses Email::Address for generating those header
values (which is broken) and then MIME encode those broken outputs.

Email::Address::XS has (looks like) correctly implemented formatter and
so it is needed to correctly MIME encode From/To/Cc/Bcc headers.

I started working on the Email::MIME::Header::AddressList module (which
will have from_mime_string() and as_mime_string() methods for
From/To/Cc/Bcc headers) and this module cannot work without
Email::Address::XS.

So what to do with the currently broken (incorrectly MIME encoded)
From/To/Cc/Bcc/... headers which Email::MIME generates? I do not see any
option other than a dependency on Email::Address::XS.

As compromise could be: Whole Email::MIME will not depends on module
Email::Address::XS. But if somebody want to pass Unicode string (via
header_str) to Email::MIME then MIME encoding will be done via
Email::MIME::Header::AddressList (which will use Email::Address::XS). So
if caller encodes manually From/To/Cc/... headers and pass them via
header_raw() then Email::Address::XS will not be needed. But when the
caller passes a Unicode string for From/To/Cc/.. headers via header_str,
then Email::MIME will load Email::MIME::Header::AddressList, which
depends on Email::Address::XS... Is that acceptable?

And can be Email::MIME::Header::AddressList part of Email-MIME
distribution (even if only this module will depends on XS)?
Re: Email::Address::XS
* p...@cpan.org [2016-09-17T19:05:51]
> $class->from_mime_string() will take raw MIME encoded string and returns
> new object of $class (which will have decoded string parts)
>
> $object->as_mime_string() will convert (Unicode) $object into raw MIME
> encoded string
>
> It is OK for you?

That all sounded fine. I think the paragraph I left quoted overspecifies
a bit. Whether the object is storing things decoded or not isn't any of
our concern as long as it has those two methods. But I think we're on
the same page.

The Email::Address::XS use should be optional, as right now people can
install Email::MIME in an compiler-free environment. We can add it as a
recommended prereq.

--
rjbs
Re: Email::Address::XS
On Saturday 17 September 2016 00:37:40 Ricardo Signes wrote:
> * p...@cpan.org [2016-09-12T03:26:52]
>
> > And as I wrote if Email::MIME is not good place, then what about
> > other modules like Email::MIME::Header::Address (or invent other
> > name) which will use Address parse/format functions and will also
> > do that MIME encode/decode procedure? We can maybe add classes
> > also for other headers (like you suggested for DKIM signatures,
> > etc...).
>
> I had started to write a lot of reply on the previous parts of your
> email, but I think that this is the only part that really matters in
> the end. Yes, I think some thing like that is sufficient. In the
> end, I think what's best is:
>
> * a thing that can take a raw (encoded) header string and give you an
>   object
> * ...which is an object with access to the header's structured data
> * ...which you can turn back into a raw header to store as needed
>
> With that facility, people can plug in (header => class)
> configuration and things just go. We can start off suggesting, for
> example, an address one.

Ok, so the first step can be support for passing blessed objects with an
as_mime_string() method into $email->header_str_set(). The object's
as_mime_string() will be responsible for producing the correct
MIME-encoded header value.

Next, I believe we agreed on an $email->header_as_obj($name, $class)
method which will return an object of the given class for the named
header. It can use e.g. the $class->from_mime_string() method for
creating the object. (Plus there will be some registration mechanism to
predefine the $name => $class mapping, so the $class argument does not
need to be mandatory.)

I think these two parts should be enough for the Email::MIME API from
the perspective of a user of Email::MIME.

And I would propose a new module (e.g. Email::MIME::Header::AddressList)
which will be in the Email::MIME distribution and will represent a list
of Email::Address::XS objects with its own implementation of the
->as_mime_string() and ->from_mime_string() methods. That could be used
for passing lists/groups of Email::Address::XS objects into Email::MIME
and also for getting them via the header_as_obj() API.

$class->from_mime_string() will take raw MIME encoded string and returns
new object of $class (which will have decoded string parts)

$object->as_mime_string() will convert (Unicode) $object into raw MIME
encoded string

It is OK for you?
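
The two-method contract described above can be shown with a toy class for an unstructured header. This is a sketch under stated assumptions: the class name My::Header::Text and its text accessor are hypothetical, and it is not Email::MIME::Header::AddressList. It relies only on the core Encode module and its "MIME-Header" encoding, which the thread itself refers to.

```perl
use strict;
use warnings;
use Encode ();

package My::Header::Text;    # hypothetical class, for illustration only

sub new {
    my ($class, $text) = @_;
    return bless { text => $text }, $class;
}

# Build an object from the raw (MIME-encoded) header value.
sub from_mime_string {
    my ($class, $raw) = @_;
    return $class->new(Encode::decode('MIME-Header', $raw));
}

# Produce the raw (MIME-encoded) header value for storage.
sub as_mime_string {
    my ($self) = @_;
    return Encode::encode('MIME-Header', $self->{text});
}

# Accessor for the decoded (Unicode) text.
sub text { $_[0]{text} }

package main;

my $obj  = My::Header::Text->new("Gr\x{fc}\x{df}e");   # Unicode text
my $raw  = $obj->as_mime_string;                        # ASCII-only raw value
my $back = My::Header::Text->from_mime_string($raw);    # decoded again

print "$raw\n";
```

Anything with these two methods could be plugged into a (header => class) registry; whether the object stores its data decoded or encoded is then an internal detail, as noted in the reply.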
Re: Email::Address::XS
* p...@cpan.org [2016-09-12T03:26:52]
> And as I wrote if Email::MIME is not good place, then what about other
> modules like Email::MIME::Header::Address (or invent other name) which
> will use Address parse/format functions and will also do that MIME
> encode/decode procedure? We can maybe add classes also for other headers
> (like you suggested for DKIM signatures, etc...).

I had started to write a lot of reply on the previous parts of your
email, but I think that this is the only part that really matters in the
end. Yes, I think some thing like that is sufficient. In the end, I
think what's best is:

* a thing that can take a raw (encoded) header string and give you an
  object
* ...which is an object with access to the header's structured data
* ...which you can turn back into a raw header to store as needed

With that facility, people can plug in (header => class) configuration
and things just go. We can start off suggesting, for example, an address
one.

--
rjbs
Re: Email::Address::XS
On Sunday 11 September 2016 18:58:42 Ricardo Signes wrote:
> * p...@cpan.org [2016-09-05T04:25:11]
> > I do not want to add ->as_mime_header (or other function) which will do
> > MIME encoding/decoding into Email::Address::XS. That module is for
> > formatting and parsing email addresses headers, not for MIME
> > encoding/decoding. Same as it is for Email::Address module.
>
> The best way to know how to properly encode a structured header field is
> to know both its structure and the way that the structure is meant to be
> encoded. For example, to know that a `mailboxes` structure may have a
> display-name, which would be words, which can be encoded, and may have
> an addr-spec, which is not words, and so cannot be encoded.
>
> Parsing and encoding are not separable concerns, here, because to know
> whether to decode a part, one must know what part it is, which means it
> has been parsed. You can only properly decode the structured data by
> knowing the relationship between structure and encoding. Then, you can
> only encode the data by knowing the same. This means that any interface
> between Email::MIME and some structured field representation has a much
> more complex API, if Email::MIME is responsible for the encoding and
> decoding. It doesn't just need a map of field name to class name, but
> also instructions on how structured data are encoded and decoded.
>
> This seems like it becomes a nasty mess of deep coupling. Am I making
> some fundamental mistake, here?
>
> Anyway, isn't it certain that people who parse addresses from headers
> will want to get a decoded form? Surely this will be common:
>
>   my @recipients = map {; $_->phrase_str // $_->address }
>                    Address->parse( $email->header('To') );
>
> The Dovecot parser (as I recall) does not decode encoded-words, so
> phrase_str is easy to write. Does every consumer of Address need to know
> how to decode?
>
> Further, won't people want to write:
>
>   my $to = Address->new("김정은", "k...@example.biz");
>
> ...and then pass that object on to something that knows what to do with
> it? Does every possible consumer of Address need to know how to encode
> an Address object?

Yes, this is what I already wrote. A user of the Email::MIME and Address
modules just wants to construct a Unicode Address object and pass it
into the Email module, without calling any encode/decode functions.
Something like this:

  my $to = Address->new("김정은", "k...@example.biz");
  $email->header("To", $to);

$to is internally stored as Unicode (no MIME) and $email in the end must
be MIME-encoded. So either $email or something in between must do that
MIME encoding.

  $to->format()       # will produce '"김정은" '
  $email->as_string() # will contains 'To: =?UTF-8?B?6rmA7KCV7J2A?= '

But as I wrote, I do not want to add MIME encode/decode functions into
Address classes and I think that the whole MIME encode/decode procedure
should be in one place (something like encode("MIME-Header", $str)).

Currently encoding and decoding of MIME words is broken in perl,
Encode::MIME::Header has bugs, so I do not want to see every module
doing its own decoding and encoding (and doing it badly). Email::MIME is
a good place where a correct encode/decode can be implemented... Btw, I
have prepared patches for Encode::MIME::Header so I hope the bugs will
be out of perl...

And as I wrote if Email::MIME is not good place, then what about other
modules like Email::MIME::Header::Address (or invent other name) which
will use Address parse/format functions and will also do that MIME
encode/decode procedure? We can maybe add classes also for other headers
(like you suggested for DKIM signatures, etc...).

The Dovecot parser does not do anything with MIME. Upper 8-bit
characters stay unchanged.
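
The point that only the display name may be encoded, never the addr-spec, can be sketched with the core Encode module. This is a simplified illustration, not the formatter of Email::Address::XS or Email::MIME: the helper name mailbox_as_mime_string is hypothetical, the address user@example.com is a placeholder, and quoting of ASCII phrases that contain special characters is deliberately left out.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Format one mailbox as a raw header value. Only the display name
# (phrase) is MIME-encoded; the addr-spec is passed through unchanged.
# NOTE: ASCII phrases with specials would need RFC 5322 quoting, which
# this sketch omits.
sub mailbox_as_mime_string {
    my ($phrase, $address) = @_;
    return $address unless defined $phrase && length $phrase;
    my $encoded = $phrase =~ /[^\x00-\x7F]/
        ? encode('MIME-Header', $phrase)
        : $phrase;
    return "$encoded <$address>";
}

# The three Hangul characters used in the thread, written as escapes.
my $raw = mailbox_as_mime_string("\x{ae40}\x{c815}\x{c740}", 'user@example.com');
print "$raw\n";
```

The caller works with Unicode only; the encoding step happens once, at the boundary where the header value is stored, which is the division of labour argued for above.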
Re: Email::Address::XS
* p...@cpan.org [2016-09-05T04:25:11]
> I do not want to add ->as_mime_header (or other function) which will do
> MIME encoding/decoding into Email::Address::XS. That module is for
> formatting and parsing email addresses headers, not for MIME
> encoding/decoding. Same as it is for Email::Address module.

The best way to know how to properly encode a structured header field is
to know both its structure and the way that the structure is meant to be
encoded. For example, to know that a `mailboxes` structure may have a
display-name, which would be words, which can be encoded, and may have
an addr-spec, which is not words, and so cannot be encoded.

Parsing and encoding are not separable concerns, here, because to know
whether to decode a part, one must know what part it is, which means it
has been parsed. You can only properly decode the structured data by
knowing the relationship between structure and encoding. Then, you can
only encode the data by knowing the same. This means that any interface
between Email::MIME and some structured field representation has a much
more complex API, if Email::MIME is responsible for the encoding and
decoding. It doesn't just need a map of field name to class name, but
also instructions on how structured data are encoded and decoded.

This seems like it becomes a nasty mess of deep coupling. Am I making
some fundamental mistake, here?

Anyway, isn't it certain that people who parse addresses from headers
will want to get a decoded form? Surely this will be common:

  my @recipients = map {; $_->phrase_str // $_->address }
                   Address->parse( $email->header('To') );

The Dovecot parser (as I recall) does not decode encoded-words, so
phrase_str is easy to write. Does every consumer of Address need to know
how to decode?

Further, won't people want to write:

  my $to = Address->new("김정은", "k...@example.biz");

...and then pass that object on to something that knows what to do with
it? Does every possible consumer of Address need to know how to encode
an Address object?

--
rjbs
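
Why the order "parse first, then decode" matters can be seen with the encoded word =?UTF-8?B?PG15Pg==?= mentioned elsewhere in this thread, which decodes to the text <my>. The sketch below uses the core Encode module; the address user@example.com is a placeholder, and the regular expression is a toy splitter standing in for a real parser, not something to use on real mail.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $phrase_raw = '=?UTF-8?B?PG15Pg==?=';   # encoded form of the text "<my>"
my $address    = 'user@example.com';       # placeholder address
my $raw        = "$phrase_raw <$address>";

# Wrong order: decode the whole header value, then try to parse it.
# The decoded phrase now looks like a second address in angle brackets.
my $decoded_first = decode('MIME-Header', $raw);

# Right order: split into phrase and addr-spec first (toy regex in
# place of a real parser), then decode only the phrase.
my ($phrase_part, $addr_part) = $raw =~ /\A(.*?)\s*<([^<>]+)>\z/;
my $phrase = decode('MIME-Header', $phrase_part);

print "decoded first: $decoded_first\n";
print "phrase: $phrase, address: $addr_part\n";
```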
Re: Email::Address::XS
On Saturday 03 September 2016 18:24:56 Ricardo Signes wrote:
> > Look at my proposal just for first version and lets change parts which
> > are not OK for you. I do not believe that everything is totally wrong.
>
> Okay!
>
> First, I think we should just leave Email::Simple alone. In general, I
> think the cases for using Email::Simple are very few, and almost nobody
> should ever use it. Giving it new and ostensibly MIME-related features
> seems unnecessary. Having said that, I'm not going to look at the
> Email::Simple changes in depth. (We definitely don't want to make
> installing Email::Simple require loading Email::Address::List::XS, I'll
> note.)
>
> I think that ->format is probably not a great name choice, as it might
> exist other places too easily. For example, Email::Address has a
> ->format, but I don't think it will be suitable for this, as it doesn't
> encode properly. This

For Email::Simple it is correct. And for the _raw functions in
Email::MIME too. ->format from Email::Address (and also
Email::Address::XS, as its replacement) correctly generates that header.
But it expects that the input fields are ASCII, so MIME encoding must be
done by the caller who is creating that /Email::Address(::XS)?/ object.

> is why I originally suggested something almost guaranteed not to clash,
> like ->as_mime_header. We can assume that programmers won't have to call
> this very often, only the innards of Email::MIME, so it's okay if it's a
> bit wordy.

I do not want to add ->as_mime_header (or other function) which will do
MIME encoding/decoding into Email::Address::XS. That module is for
formatting and parsing email addresses headers, not for MIME
encoding/decoding. Same as it is for Email::Address module.

And because the whole MIME encoding/decoding is done in Email::MIME, I
think that encoding/decoding of Email::Address should be done in
Email::MIME too. Maybe the code can be moved to some submodule, e.g.
Email::MIME::, but still part of the Email-MIME distribution.

> The Email::MIME changes look like they could be broken up into several
> PRs, some of which would be obviously good to apply immediately, like
> removals of dead code and pointers to bad modules.

If you think that some of those changes can be merged immediately,
please specify commits and I create new pull request for them. Btw,
I'm preparing another big patch series for Encode::MIME::Header
module (call encode("MIME-Header", ...)) which will fix remaining
bugs. So if you know about some in that, let me know ASAP, so I can
fix it in my patch series ;-)

..Which means that removing pointer to that module will not be
needed..

> Primarily, I don't like the special weight given to the addrlist header.
> While it's likely to be the most common one, I think that implementing
> it as a special case rather than an application of the general case, is
> going to lead to problems. (Just yesterday I spent much of the day on
> DKIM, and it was clear that Authentication-Results and Domain-Signature
> could both usefully have special objects.)

Ok.

> > [...]
> > So easy extensible API needs to have one method which do that. Now I
> > have only idea with something like this:
> >
> >   my $addrlist = $email->header_to_obj("Cc", "Email::Address::List::XS");
> >
> > That will convert header "Cc" to object Email::Address::List::XS and
> > MIME decode parts which needs to be decoded.
> >
> > (Maybe class name could be optional and some mapping table for most
> > common headers could be prepared)
>
> I think this is all plausible. The parts that are important to me are:
>
> * objects working for all headers equally well

That just means creating classes for the needed headers.

> * a registry of common field-name-to-class-name mappings

The problem is what counts as "common". From RFC5322? From some subset
of RFC5322? From later RFCs which update RFC5322? Or all RFCs which
define some structured header?

> > That method still needs to be know how to MIME decode object
> > Email::Address::List::XS...
>
> I'm not sure what you mean, here. Do you mean that if we've stored a
> header entry as an object that has an as-mime-encoded-string method, we
> also end up needing a means to get it as-decoded-string? I'm afraid I
> just don't understand the sentence.

Ok, I will try to explain it differently. You already pointed to the
module Email::Address. That module represents the structure of the From
header (which contains one email address) as specified in RFC5322 (or
2822 or 822). But it does not do any MIME encoding/decoding. So if
somebody fills Unicode strings into an Email::Address object and you
want to include that header (which is represented by the Email::Address
object), you first need to call the MIME encoder/decoder on the ->phrase
and ->comment members of that object.

If you create a new class which will represent another structured header
defined in RFC5322, then it again will not deal with MIME encoding and
decoding. But it will have different members (not ->phrase and
->comment) which will be needed
Re: Email::Address::XS
I know I'm taking a long time between replies. Thanks for being patient. I've been rotating through "out of town" and "catching up with work backlog from being out of town," basically, and in the leftover time, I don't have any brain left for anything much at all. This week is another "all work all week" week, but maybe the week after things will even out for a while. * p...@cpan.org [2016-08-25T03:40:20] > On Wednesday 24 August 2016 22:55:05 Ricardo Signes wrote: > > > > I don't understand "you have no idea about arbitrary object." Obviously you > > would get a type of object based on the header in question. > > Then you need to create mapping from header name to object name. Plus > this does not solve problems for extended/application specific header > (X-Something) which can be used for type which application wants. Yes, you need that mapping, and you extend it on an application-specific basis. > > This reads like, "Look, just use the API that you don't like because I > > already > > It is not like it... I apologize, this was an impolite response. > I would rather know what is wrong with it? And which part? Both > Email::Simple & Email::MIME? Or only some subpart of it? And both > getting and setting headers? Or only getting them? What I meant was: we were talking about questions of API design, and you moved to implementation, which I think is premature. > Do not take me wrong, but to check that API is usable, you need to > implement at least some POC and try to use it yourself. If it does not > meet everything needed, then you need to rework it. And this is what now > did. I agree that you need to test an API to determine whether it is sufficient, but it's also possible to see something is insufficient before trying. Since I don't think the header_addrlist API is sufficient, it seems like implementation is jumping the gun, to me. > Look at my proposal just for first version and lets change parts which > are not OK for you. 
I do not believe that everything is totally wrong. Okay! First, I think we should just leave Email::Simple alone. In general, I think the cases for using Email::Simple are very few, and almost nobody should ever use it. Giving it new and ostensibly MIME-related features seems unnecessary. Having said that, I'm not going to look at the Email::Simple changes in depth. (We definitely don't want to make installing Email::Simple require loading Email::Address::List::XS, I'll note.) I think that ->format is probably not a great name choice, as it might exist other places too easily. For example, Email::Address has a ->format, but I don't think it will be suitable for this, as it doesn't encode properly. This is why I originally suggested something almost guaranteed not to clash, like ->as_mime_header. We can assume that programmers won't have to call this very often, only the innards of Email::MIME, so it's okay if it's a bit wordy. The Email::MIME changes look like they could be broken up into several PRs, some of which would be obviously good to apply immediately, like removals of dead code and pointers to bad modules. Primarily, I don't like the special weight given to the addrlist header. While it's likely to be the most common one, I think that implementing it as a special case rather than an application of the general case, is going to lead to problems. (Just yesterday I spent much of the day on DKIM, and it was clear that Authentication-Results and Domain-Signature could both usefully have special objects.) > [...] > So easy extensible API needs to have one method which do that. Now I > have only idea with something like this: > > my $addrlist = $email->header_to_obj("Cc", "Email::Address::List::XS"); > > That will convert header "Cc" to object Email::Address::List::XS and > MIME decode parts which needs to be decoded. > > (Maybe class name could be optional and some mapping table for most > common headers could be prepared) I think this is all plausible. 
The parts that are important to me are: * objects working for all headers equally well * a registry of common field-name-to-class-name mappings > That method still needs to be know how to MIME decode object > Email::Address::List::XS... I'm not sure what you mean, here. Do you mean that if we've stored a header entry as an object that has an as-mime-encoded-string method, we also end up needed a means to get it as-decoded-string? I'm afraid I just don't understand the sentence. Your changes to Email::Simple don't store objects, but produce them on demand. I'm thinking of: https://github.com/rjbs/Email-Simple/compare/master...pali:master#diff-8816e211b9069c6bfa4cc4c82b7410b3R224 If we never *store* objects, but only produce them as requested, then I think the total needed changes are -- but I'm sure I'll miss things -- as follows: * allow header_str and header args to Email::MIME->create to include objects, which are immediately asked to encode themselves for storage * add header_as_obj
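
A rough sketch of the registry idea discussed above, assuming objects are produced on demand and never stored. Everything here is hypothetical and not part of Email::MIME: the My::AddrList stand-in class, its from_mime_string constructor, the register_header_class helper, and the plain hash used in place of an email object. Only the name header_as_obj comes from the message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a structured header class; a real one might wrap
# Email::Address::XS. The only contract assumed here is from_mime_string().
package My::AddrList;
sub from_mime_string {
    my ($class, $raw) = @_;
    # Naive comma split, for illustration only; this is not an RFC 2822 parser.
    my @addrs = grep { length }
                map  { my $s = $_; $s =~ s/^\s+|\s+$//g; $s }
                split /,/, $raw;
    return bless { addrs => \@addrs }, $class;
}
sub addresses { @{ $_[0]{addrs} } }

package main;

# Registry of common field names (lowercased) to class names.
my %header_class = map { $_ => 'My::AddrList' } qw(from to cc bcc);

# Applications extend the registry for their own fields (X-Something etc.).
sub register_header_class {
    my ($name, $class) = @_;
    $header_class{ lc $name } = $class;
}

# Produce an object on demand; nothing is written back into %$headers.
# An explicit $class overrides the registry for this one call.
sub header_as_obj {
    my ($headers, $name, $class) = @_;
    $class ||= $header_class{ lc $name }
        or die "no class registered for header '$name'\n";
    my $raw = $headers->{ lc $name };
    return unless defined $raw;
    return $class->from_mime_string($raw);
}
```

With this shape, the same call works for every header, and an application-specific field only needs one register_header_class call rather than a new accessor method.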
Re: Email::Address::XS
On Wednesday 24 August 2016 22:55:05 Ricardo Signes wrote: > * p...@cpan.org [2016-08-23T03:56:24] > > > > Also it must be possible to get either named groups from Original-Cc > > > > header or only list of addresses. And I think with your proposal API it > > > > is not possible. You would need to call some "downgrade" function and > > > > then "upgrade" it to another object or so... > > > > > > Why would this not be possible? There is some object storing the > > > mailboxes > > > structure, and it provides methods that answer the questions one needs to > > > ask. > > > > Because you have no idea about arbitrary object. If you want to e.g. > > decode Email::Address::XS object, you must decode only ->phrase() and > > ->comment() parts! Not others. > > I don't understand "you have no idea about arbitrary object." Obviously you > would get a type of object based on the header in question. Then you need to create mapping from header name to object name. Plus this does not solve problems for extended/application specific header (X-Something) which can be used for type which application wants. > > Anyway, lets move forward. I already implemented something and send > > information in email with subject "Email::Simple & Email::MIME with > > Email::Address::XS" to pep mailing list... > > > > I think this is good approach to provide usable API + ability to extend > > code for other objects... > > This reads like, "Look, just use the API that you don't like because I already > wrote some code." That's not going to sway me. It is not like it... I would rather know what is wrong with it? And which part? Both Email::Simple & Email::MIME? Or only some subpart of it? And both getting and setting headers? Or only getting them? Do not take me wrong, but to check that API is usable, you need to implement at least some POC and try to use it yourself. If it does not meet everything needed, then you need to rework it. And this is what now did. 
And I thought we discuss about it for a long time without trying to implement something. Look at my proposal just for first version and lets change parts which are not OK for you. I do not believe that everything is totally wrong. > What happens when someone wants a Date object for the Date header? Do we add > header_date? Then header_rcvd for Received headers, and so on? This > interface > leads to either a proliferation of these things or to some line where we say, > "well *these* headers are important enough and *these* are not." On the other > hand, a generic mechanism is generic. Hm... here we are dealing with problem: I want header XYX from Email::MIME, but I want it as object of class ABC. Right? So easy extensible API needs to have one method which do that. Now I have only idea with something like this: my $addrlist = $email->header_to_obj("Cc", "Email::Address::List::XS"); That will convert header "Cc" to object Email::Address::List::XS and MIME decode parts which needs to be decoded. (Maybe class name could be optional and some mapping table for most common headers could be prepared) That method still needs to be know how to MIME decode object Email::Address::List::XS... But fill free to propose something different/better for this problem. > You can always publish your work as a subclass, if you think it that popular > acclaim will convince me I'm wrong. I would rather fix it, instead creating fork or subclass.
Re: Email::Address::XS
* p...@cpan.org [2016-08-23T03:56:24]
> > > Also it must be possible to get either named groups from Original-Cc
> > > header or only list of addresses. And I think with your proposal API it
> > > is not possible. You would need to call some "downgrade" function and
> > > then "upgrade" it to another object or so...
> >
> > Why would this not be possible? There is some object storing the mailboxes
> > structure, and it provides methods that answer the questions one needs to
> > ask.
>
> Because you have no idea about arbitrary object. If you want to e.g.
> decode Email::Address::XS object, you must decode only ->phrase() and
> ->comment() parts! Not others.

I don't understand "you have no idea about arbitrary object." Obviously you
would get a type of object based on the header in question.

> Anyway, lets move forward. I already implemented something and send
> information in email with subject "Email::Simple & Email::MIME with
> Email::Address::XS" to pep mailing list...
>
> I think this is good approach to provide usable API + ability to extend
> code for other objects...

This reads like, "Look, just use the API that you don't like because I already
wrote some code." That's not going to sway me.

What happens when someone wants a Date object for the Date header? Do we add
header_date? Then header_rcvd for Received headers, and so on? This interface
leads to either a proliferation of these things or to some line where we say,
"well *these* headers are important enough and *these* are not." On the other
hand, a generic mechanism is generic.

You can always publish your work as a subclass, if you think that popular
acclaim will convince me I'm wrong.

--
rjbs
Re: Email::Address::XS
* p...@cpan.org [2016-08-23T03:50:03]
> That is really bad API :-(

This is the *low level* API which of course the user does not call. They'd
want the version of Email::MIME that gives them these header objects on the
fly. I suggested there'd be a means to "upgrade" all the known headers so that
->header_thing('To') would give you an object. If that worked, it would
probably be useful to have an analog to ->new that did this while
initializing.

--
rjbs
Re: Email::Address::XS
On Monday 22 August 2016 22:34:39 Ricardo Signes wrote: > * p...@cpan.org [2016-08-20T06:01:16] > > Email::MIME is module which automatically do any MIME encoding/decoding > > without user interaction, so that decoding must be done automatically > > and without such "upgrade" function. > > So do you mean that whenever someone reads the header with a specific method, > the header is parsed just-in-time? > > If so, this seems like something very easy to add in an Email::MIME subclass > to > show it off. > > Of course, it's also easy to take the hypothetical code behind > "upgrade_headers($email)" to do something just like this, on a > per-known-header > basis. > > > I do not want to do that decode manually or call some "upgrade" function > > which you propose... Reading one email header should not change internal > > email structure. > > I never suggested that anything was changed by virtue of being read, but > rather > that one could explictly upgrade structures if desired. > > > Also it must be possible to get either named groups from Original-Cc > > header or only list of addresses. And I think with your proposal API it > > is not possible. You would need to call some "downgrade" function and > > then "upgrade" it to another object or so... > > Why would this not be possible? There is some object storing the mailboxes > structure, and it provides methods that answer the questions one needs to ask. Because you have no idea about arbitrary object. If you want to e.g. decode Email::Address::XS object, you must decode only ->phrase() and ->comment() parts! Not others. Other objects will have different methods which needs to be encoded/decoded. And MIME module (in this case Email::MIME) must known which of them needs to be encoded/decoded. Anyway, lets move forward. I already implemented something and send information in email with subject "Email::Simple & Email::MIME with Email::Address::XS" to pep mailing list... 
I think this is good approach to provide usable API + ability to extend code for other objects...
Re: Email::Address::XS
On Monday 22 August 2016 22:26:09 Ricardo Signes wrote:
> Here's a verbose form:
>
>   # Get an email.
>   my $email = get_some_email_mime();
>
>   # Get the header -- the (unfolded) raw bytes.
>   my $cc_hdr = $email->header_raw('Original-CC');
>
>   # parse it into an object
>   my $cc_obj = parse_mailboxes( $cc_hdr );
>
>   # put that object into the header:
>   $email->header_set('Original-CC', $cc_obj);
>
>   # get the raw mime-encoded bytes again:
>   my $cc_hdr2 = $email->header_raw('Original-CC');
>
>   # get a list of sub-object from the object's imaginary interface:
>   my @boxes = $email->header_obj('Original-CC')->boxes;

That is really bad API :-(

User of Email::MIME is really not interested in getting RAW header and then
manually converting it to some object (provided by parse_mailboxes), then
putting it back to Email::MIME object...

Email::MIME is there for doing whole MIME encoding/decoding and basically
user should not need to call any RAW method (only in case when he needs to
manually encode/decode MIME parts).

And I would expect from Email::MIME to do that encoding/decoding also for
From, To, CC... headers...
Re: Email::Address::XS
* p...@cpan.org [2016-08-20T06:01:16]
> Email::MIME is module which automatically do any MIME encoding/decoding
> without user interaction, so that decoding must be done automatically
> and without such "upgrade" function.

So do you mean that whenever someone reads the header with a specific method,
the header is parsed just-in-time?

If so, this seems like something very easy to add in an Email::MIME subclass
to show it off.

Of course, it's also easy to take the hypothetical code behind
"upgrade_headers($email)" to do something just like this, on a
per-known-header basis.

> I do not want to do that decode manually or call some "upgrade" function
> which you propose... Reading one email header should not change internal
> email structure.

I never suggested that anything was changed by virtue of being read, but
rather that one could explicitly upgrade structures if desired.

> Also it must be possible to get either named groups from Original-Cc
> header or only list of addresses. And I think with your proposal API it
> is not possible. You would need to call some "downgrade" function and
> then "upgrade" it to another object or so...

Why would this not be possible? There is some object storing the mailboxes
structure, and it provides methods that answer the questions one needs to ask.

--
rjbs
Re: Email::Address::XS
* p...@cpan.org [2016-08-18T17:35:10]
> On Thursday 18 August 2016 23:21:28 Ricardo Signes wrote:
> > As you say, 1 and 2 are dealt with. For 3 or 4, you want to have an
> > object in the header slot, rather than a string. Once you've done
> > that, you use its methods. If the object's To field stores a
> > string, you "upgrade" it with something like:
> >
> >   $email->header(To => mailbox_headers_from( $email->header('To') ) );
> >
> > ...and it seems like one would quickly amass some sort of routine
> > like:
> >
> >   upgrade_headers($email);
> >
> > ...that would upgrade all the headers it knows about.
>
> Can you describe (or write code) how you imagine that I get header
> "Original-Cc" in form of addresses in list of named groups from email
> which is stored in string scalar? I'm not sure that I understand how you
> mean it...

Here's a verbose form:

  # Get an email.
  my $email = get_some_email_mime();

  # Get the header -- the (unfolded) raw bytes.
  my $cc_hdr = $email->header_raw('Original-CC');

  # parse it into an object
  my $cc_obj = parse_mailboxes( $cc_hdr );

  # put that object into the header:
  $email->header_set('Original-CC', $cc_obj);

  # get the raw mime-encoded bytes again:
  my $cc_hdr2 = $email->header_raw('Original-CC');

  # get a list of sub-object from the object's imaginary interface:
  my @boxes = $email->header_obj('Original-CC')->boxes;

--
rjbs
Re: Email::Address::XS
On Thursday 18 August 2016 23:35:10 p...@cpan.org wrote:
> On Thursday 18 August 2016 23:21:28 Ricardo Signes wrote:
> > > If I create Email::MIME object from input string, I would like to
> > > get:
> > >
> > > 1) Raw (ASCII) string representation of To: field
> > >
> > > 2) Unicode string representation of To: field
> > >
> > > 3) List of Email::Address::XS objects which are in To: field
> > >
> > > 4) List of named groups with Email::Address::XS objects of To:
> > > field
> > >
> > > For 1) and 2) I can use ->header_raw and ->header_str methods.
> > > For 3) and 4) are needed new method(s). Ideally if caller is
> > > able to get original MIME encoded objects (where in ->phrase
> > > part of address object is still MIME encoded) and also if
> > > objects strings are Unicode.
> >
> > As you say, 1 and 2 are dealt with. For 3 or 4, you want to have
> > an object in the header slot, rather than a string. Once you've
> > done that, you use its methods. If the object's To field stores a
> > string, you "upgrade" it with something like:
> >
> >   $email->header(To => mailbox_headers_from( $email->header('To') ) );
> >
> > ...and it seems like one would quickly amass some sort of routine
> > like:
> >
> >   upgrade_headers($email);
> >
> > ...that would upgrade all the headers it knows about.
>
> Can you describe (or write code) how you imagine that I get header
> "Original-Cc" in form of addresses in list of named groups from email
> which is stored in string scalar? I'm not sure that I understand how
> you mean it...

I still do not know how you thought about it, but I understand it as
unusable API...

Currently with Email::Address::XS I can achieve it by:

 1 use Email::MIME;
 2 use Email::Address::XS qw(parse_email_groups);
 3 my $str = join "", <>;
 4 my $email = Email::MIME->new($str);
 5 my @groups = parse_email_groups($email->header_raw("Original-Cc"));
 6 foreach (0 .. int($#groups / 2)) {
 7     $groups[2 * $_] = mime_decode($groups[2 * $_]);
 8     next unless defined $groups[2 * $_ + 1];
 9     foreach (@{$groups[2 * $_ + 1]}) {
10         $_->phrase(mime_decode($_->phrase));
11         $_->comment(mime_decode($_->comment));
12     }
13 }
14 # @groups now contains:
15 # ('group1' => [ $obj1, $obj2 ], 'group2' => [ $obj3 ])

(where mime_decode is equivalent to Encode::decode("MIME-Header", ...))

And for easier usage I need to have one Email::MIME method which will do
everything in lines 5-13. Method should get header name (in that case it is
Original-CC) and return the MIME decoded final @groups list.

Email::MIME is module which automatically does any MIME encoding/decoding
without user interaction, so that decoding must be done automatically
and without such "upgrade" function.

I do not want to do that decode manually or call some "upgrade" function
which you propose... Reading one email header should not change internal
email structure.

Also it must be possible to get either named groups from Original-Cc
header or only list of addresses. And I think with your proposal API it
is not possible. You would need to call some "downgrade" function and
then "upgrade" it to another object or so...

This can be elegantly solved by my proposal, when there will be two
additional Email::MIME methods. First header_addr which just returns list
of addresses objects and second header_grps which returns named groups
with addresses objects (that one is in lines 5-13).
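
For readers who want to try the decoding step from lines 5-13 in isolation, here is a self-contained sketch that uses only core modules. The My::Addr class is a hypothetical stand-in with just the phrase and comment accessors the loop relies on; real code would get Email::Address::XS objects from parse_email_groups instead. The loop is written with an explicit step of two, which is a small variation on the original so that an empty list stays empty and an undefined group name is passed through.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in; only the two accessors used below are modelled.
package My::Addr;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub phrase  { my $self = shift; $self->{phrase}  = shift if @_; return $self->{phrase} }
sub comment { my $self = shift; $self->{comment} = shift if @_; return $self->{comment} }

package main;

# Decode RFC 2047 encoded words; undef is passed through untouched.
sub mime_decode {
    my ($str) = @_;
    return defined $str ? decode('MIME-Header', $str) : undef;
}

# Takes the flat (name => [objects], ...) list. Group names are decoded in
# the returned copy; address objects are modified in place, and only their
# phrase and comment parts are touched.
sub decode_groups {
    my @groups = @_;
    for (my $i = 0; $i < @groups; $i += 2) {
        $groups[$i] = mime_decode($groups[$i]);
        my $addrs = $groups[$i + 1];
        next unless defined $addrs;
        for my $addr (@$addrs) {
            $addr->phrase(mime_decode($addr->phrase));
            $addr->comment(mime_decode($addr->comment));
        }
    }
    return @groups;
}
```

The point the message makes is visible here: the caller of a generic decoder has to know which parts of each object are MIME-encoded, which is why the method doing this belongs next to the class that owns those parts.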
Re: Email::Address::XS
On Thursday 18 August 2016 23:21:28 Ricardo Signes wrote:
> > If I create Email::MIME object from input string, I would like to
> > get:
> >
> > 1) Raw (ASCII) string representation of To: field
> >
> > 2) Unicode string representation of To: field
> >
> > 3) List of Email::Address::XS objects which are in To: field
> >
> > 4) List of named groups with Email::Address::XS objects of To:
> > field
> >
> > For 1) and 2) I can use ->header_raw and ->header_str methods. For
> > 3) and 4) are needed new method(s). Ideally if caller is able to
> > get original MIME encoded objects (where in ->phrase part of
> > address object is still MIME encoded) and also if objects strings
> > are Unicode.
>
> As you say, 1 and 2 are dealt with. For 3 or 4, you want to have an
> object in the header slot, rather than a string. Once you've done
> that, you use its methods. If the object's To field stores a
> string, you "upgrade" it with something like:
>
>   $email->header(To => mailbox_headers_from( $email->header('To') ) );
>
> ...and it seems like one would quickly amass some sort of routine
> like:
>
>   upgrade_headers($email);
>
> ...that would upgrade all the headers it knows about.

Can you describe (or write code) how you imagine that I get header
"Original-Cc" in form of addresses in list of named groups from email
which is stored in string scalar? I'm not sure that I understand how you
mean it...
Re: Email::Address::XS
* p...@cpan.org [2016-08-08T17:41:04]
> Here we are need to deal with objects which internally needs to be MIME
> encoded and objects which mustn't.

I don't think this matters. If an object is passed in, it must be able to
produce a MIME encoded form. Even if you say:

  header_str => [ header => $obj_based_on_text ]

...the object will be required to have an as_mime_header method, which must
produce an encoded form. The point was to make the behavior of objects
unambiguous no matter where you put them, not to multiply the possible
semantics of objects by two. The only difference between header and
header_str, I said, would be how they treated plain strings.

> If I create Email::MIME object from input string, I would like to get:
>
> 1) Raw (ASCII) string representation of To: field
>
> 2) Unicode string representation of To: field
>
> 3) List of Email::Address::XS objects which are in To: field
>
> 4) List of named groups with Email::Address::XS objects of To: field
>
> For 1) and 2) I can use ->header_raw and ->header_str methods. For 3)
> and 4) are needed new method(s). Ideally if caller is able to get
> original MIME encoded objects (where in ->phrase part of address object
> is still MIME encoded) and also if objects strings are Unicode.

As you say, 1 and 2 are dealt with. For 3 or 4, you want to have an object in
the header slot, rather than a string. Once you've done that, you use its
methods. If the object's To field stores a string, you "upgrade" it with
something like:

  $email->header(To => mailbox_headers_from( $email->header('To') ) );

...and it seems like one would quickly amass some sort of routine like:

  upgrade_headers($email);

...that would upgrade all the headers it knows about.

> I can accept that both "header" and "header_str" will work with objects,
> but I think that my suggestion about do not encoding "phrase" string
> part of object passed to "header" is useful...

I agree that if you provide an object, Email::MIME should not try to further
encode anything, and that it should trust the object to provide its own
encoded form. (Email::MIME will line-fold, though, as discussed.)

--
rjbs
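
The rule described in this message (objects always encode themselves; header and header_str differ only for plain strings) might be sketched as follows. This is an illustration under assumptions, not Email::MIME's code: header_value_for_storage and the My::PreEncoded stand-in class are hypothetical names, as_mime_header is the method name proposed in the thread, and line folding is left out.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Scalar::Util qw(blessed);

# Hypothetical object that already holds its MIME-encoded form.
package My::PreEncoded;
sub new            { my ($class, $raw) = @_; return bless \$raw, $class }
sub as_mime_header { ${ $_[0] } }

package main;

# $kind is 'header' (plain strings are stored verbatim) or 'header_str'
# (plain strings are text and get MIME-encoded). Objects are handled the
# same way for both: they supply their own encoded form and are trusted.
sub header_value_for_storage {
    my ($kind, $value) = @_;
    if (blessed $value) {
        die "object cannot as_mime_header\n"
            unless $value->can('as_mime_header');
        return $value->as_mime_header;
    }
    return $kind eq 'header_str' ? encode('MIME-Header', $value) : $value;
}
```

Because the object branch ignores $kind, an address object whose phrase is already encoded behaves identically in either list; whether an object holds text or pre-encoded data becomes the object's concern rather than the caller's.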
Re: Email::Address::XS
On Wednesday 03 August 2016 00:36:11 Ricardo Signes wrote: > * p...@cpan.org [2016-08-02T17:03:07] > > > I can imagine, that people could be confused about header_str > > meaning. It has suffix _str and I would expect it needs (Unicode) > > string, not object... Name "header" is better as it does not say > > it needs string. > > People will want to be able to pass non-ASCII strings in as subject, > meaning that header is not suitable for the "one true list of > fields." Passing in a pre-encoded value is pretty sure to be the > exception, not the rule. > > In other words, I think this would be more sensible: > > header_str => [ > Foo => raw_mime($header_raw), > Bar => "Text string to be encoded", > Baz => $message_id_object, > ], > > The alternative, using header, is: > > header => [ > Foo => $header_raw, > Bar => mime_encode("Text string to be encoded"), > Baz => $message_id_object, > ], Here we are need to deal with objects which internally needs to be MIME encoded and objects which mustn't. Image: I have $address object which phrase is already MIME encoded and I want to pass it into Cc header. So I would suggest to use header => [...] syntax only for 7bit strings or object with 7bit strings (e.g. when caller is responsible for encoding all strings or object strings into 7bit MIME). And syntax header_str => [...] for Unicode strings (or objects which have Unicode strings). Which means: if I pass $address object into "header", then phrase of $address must be already MIME encoded (or ASCII only) and when I pass it into "header_str" then it will be automatically MIME encoded. It is OK? > > And there is another problem still not solved. From $email object > > it is needed also to read From/To/Cc headers and user (caller) of > > Email::MIME module is sometimes interested in de-composited > > addresses objects (e.g. when want to parse each email address in > > CC field) and sometimes interested only in one string > > representation (e.g. 
want to write header to STDOUT)... > > > > With explicit $email->header_str() $email->header_addr() and also > > $email->header_grps() calls user get type which wants. I cannot > > imagine without 3 different calls how to achieve it. > > Here is the first idea that comes to mind: > > ->header_str always returns a text string. ok > ->header_raw always returns a byte string. ok > Pardon the arbitrary name, but: > > $email->header_frob($field); > > Read only, always returns an object that can ->as_mime_string. For > fields that were set without an object, it returns an unstructured > just-in-time proxy. Headers set with "raw" return the same kind of > object I proposed above for passing a raw header into header_str. > Headers set with header_str get the kind of thing that mime_encode() > returns. Possibly/probably if you have set the From header with > header_str, you get the object currently being produced, just for > brief use, in Email::MIME::Encode. If I create Email::MIME object from input string, I would like to get: 1) Raw (ASCII) string representation of To: field 2) Unicode string representation of To: field 3) List of Email::Address::XS objects which are in To: field 4) List of named groups with Email::Address::XS objects of To: field For 1) and 2) I can use ->header_raw and ->header_str methods. For 3) and 4) are needed new method(s). Ideally if caller is able to get original MIME encoded objects (where in ->phrase part of address object is still MIME encoded) and also if objects strings are Unicode. Any idea for 3) and 4)? Note that caller should be able to get list of those objects (maybe as one object?) of arbitrary header name. Not only for some hardcoded. As there are new headers like Original-From which contains addresses... > > But if you still prefer that there should be only one function > > which accept both objects and strings, lets define its name, how > > should it act on different types of strings + header names. 
And > > also how user of Email::MIME can receive for arbitrary header > > Unicode string value... > > I believe I'm happy with my suggestion above that both header and > header_str can work with objects, with the difference being the > behavior on plain old strings. I can accept that both "header" and "header_str" will work with objects, but I think that my suggestion about do not encoding "phrase" string part of object passed to "header" is useful... > I realize I have expanded it in the course of this email. Do you > think it is unworkable in some way? I think once we solve problem how to get objects from email which is created from string (or file), it can be usable...
Re: Email::Address::XS
* p...@cpan.org [2016-08-02T17:03:07] > I can imagine, that people could be confused about header_str meaning. > It has suffix _str and I would expect it needs (Unicode) string, not > object... Name "header" is better as it does not say it needs string. People will want to be able to pass non-ASCII strings in as subject, meaning that header is not suitable for the "one true list of fields." Passing in a pre-encoded value is pretty sure to be the exception, not the rule. In other words, I think this would be more sensible: header_str => [ Foo => raw_mime($header_raw), Bar => "Text string to be encoded", Baz => $message_id_object, ], The alternative, using header, is: header => [ Foo => $header_raw, Bar => mime_encode("Text string to be encoded"), Baz => $message_id_object, ], Of course, there's no reason that both header and header_str can't accept these objects, and the user can pick whichever is more convenient, right? The difference between header and header_str becomes only the behavior for plain strings. > > * if you know exactly octets you, the user, want in the header field, > > use "header", but this is likely rare > > Do you mean $email->header_raw_set()? > > I think it is not rare to encode header (to MIME) externally and then > pass ASCII 7bit string to $email. At least I see this usage for From > header (in previous version of Email::MIME encoding of From/To/Cc > headers was totally broken). I mean both "header" in the initializer and header_set and header_raw_set, which are equivalent. > > unchanged are probably in error at least insofar as they let you put > > non-7-bit-clean data in your headers. This should probably be > > fatal: > > > > header_str => [ Date => "\N{SMILING FACE WITH HORNS}" ] > > Here is problem: Should Email::MIME understand meaning of email headers? I think its level of understanding is roughly appropriate, although imperfect. 
It's meant to prevent you passing in a string of addresses that are naively correct but actually need encoding. It's better if people use something structured for headers where this is complex, though. > Here we see that header_str does not say (or specify) which string must > be specified as parameter. Unicode string? Arbitrary 8bit string? 7bit > ASCII string? Or ASCII subset visible characters? It says, in the docs for create: This method creates a new MIME part. The "header_str" parameter is a list of headers pairs to include in the message. The value for each pair is expected to be a text string that will be MIME-encoded as needed. A similar "header" parameter can be provided in addition to or instead of "header_str". Its values will be used verbatim. *text string*, not byte string. > I think we should unify API for it. And ideally describe into > documentation how to correctly use it. Agreed. > That /mostly/ with special exceptions for Message-Id or Date is wrong. I don't think I agree. I think that the behaviors on address list headers is useful. Ideally, people use methods to produce objects for structured headers. email_addr_list() for example. The current behavior is roughly to saying: bare strings for these headers are implicitly parsed into objects that then encode things. That's roughly how the message list headers are implemented. That the Date field is bogus is unfortunate. I imagine that really there are only about 3 things to worry about: * mailbox and mailbox list * fields that do not allow encoded words (and so must be 7-bit clean) * fields that are sequences of words If people know how to produce the already-encoded form, they can do so already. If they don't, but know what the decoded string would look like, the current system can continue to improve over time. 
In other words: if you say "I have this structured data and it isn't yet encoded, please encode it for me," we need to understand it exactly enough to know how to encode it, so this behavior is necessary if header_str is going to work for structured fields. > 1) Function name say what it accept I am not very swayed by this. Users can be surprised once for a brief moment when they see [ header_str => [ From => $object ] ] and then they know forever. On the other hand, having multiple sets of headers to write is annoying every time. > 2) No problem with meaning which type of string is accepted (subset > ASCII, ASCII or Unicode as described above) This is already unambiguous. _str forms always expect character strings. > 3) Possible performance optimization (less objects are created) How? > And there is another problem still not solved. From $email object it is > needed also to read From/To/Cc headers and user (caller) of Email::MIME > module is sometimes interested in de-composited addresses objects (e.g. > when want to parse each email address in CC field) and sometimes > interested only in one string representation (e.g. want to write header > to STDOUT)
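
The suggestion earlier in this message, that non-7-bit-clean text in a field which does not allow encoded words should probably be fatal, could look roughly like this. The field list and the check_header_str name are assumptions made for illustration; the thread only names Date and Message-Id as examples of such fields.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Assumed, illustrative list of fields that must not contain encoded words.
my %no_encoded_words = map { lc($_) => 1 } qw(Date Message-ID);

# Returns the text unchanged when it is acceptable for the field, and
# croaks when a must-be-7-bit field is given non-ASCII characters.
sub check_header_str {
    my ($name, $text) = @_;
    if ($no_encoded_words{ lc $name } && $text =~ /[^\x00-\x7f]/) {
        croak "header '$name' cannot carry non-ASCII text";
    }
    return $text;
}
```

Under this sketch, passing "\N{SMILING FACE WITH HORNS}" as a Date value dies early, while the same string as a Subject passes through to be MIME-encoded later.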
Re: Email::Address::XS
On Tuesday 02 August 2016 01:00:02 Ricardo Signes wrote:
> * p...@cpan.org [2016-07-12T11:43:02]
>
> > On Monday 04 July 2016 01:52:41 Ricardo Signes wrote:
> > > I'd stick to header_str, I think, but I'm not sure. At any rate:
> > > yes.
> >
> > And this is what I do not like... to pass objects to function with
> > name header_str. That name sounds like it takes string, not object
> > (or more objects)...
>
> Either we can add a new name, so people end up having to give
> "header_str" and "header_obj" or we can say "in general everything
> uses header_str, which follows these simple rules." I would rather
> do that.

I can imagine, that people could be confused about header_str meaning.
It has suffix _str and I would expect it needs (Unicode) string, not
object... Name "header" is better as it does not say it needs string.

> > > > Still do not know how to handle non-MIME headers correctly in
> > > > Email::MIME module. We can either create blacklist of non-MIME
> > > > headers and extend it every time when somebody report problem
> > > > or create whitelist of MIME headers... Or let caller to decide
> > > > if his header must be MIME-encoded or not.
> > >
> > > I'm sorry, I don't understand. Could you elaborate?
> >
> > If passed pair (header-name, header-value) needs to be MIME encoded
> > or not. Currently there is blacklist in Email::MIME for header
> > names which are never MIME encoded (like Message-Id, Date, ...)
> > when passing as header_str.
>
> So, I'd assume we'd go forward with:
>
> * if you know exactly octets you, the user, want in the header field,
>   use "header", but this is likely rare

Do you mean $email->header_raw_set()? I think it is not rare to encode
header (to MIME) externally and then pass ASCII 7bit string to $email.
At least I see this usage for From header (in previous version of
Email::MIME encoding of From/To/Cc headers was totally broken).

> * if you want to provide a string for a field that's pretty much just
>   a string, use header_str and if it requires special handling, we do
>   our best, which should get better over time

I fully agree.

> * but if things are complicated, use an object that represents the
>   structured data

Yes.

> I don't like the idea that this will be broken further by adding the
> object behavior, though.
>
>   $email->header_str_set($field => $email->header($field));
>
> ...should not break things.
>
> > > "header_str" is "text string" which means it will get encoded.
> >
> > Not exactly, there are exceptions (Message-Id, Date, ...) plus
> > special behaviour for addresses headers.
>
> Those /mostly/ still get encoded, but we know that the strings are
> meant to be structured, so we try to deconstruct them and encode
> them correctly. I think those fields that get passed through
> unchanged are probably in error at least insofar as they let you put
> non-7-bit-clean data in your headers. This should probably be
> fatal:
>
>   header_str => [ Date => "\N{SMILING FACE WITH HORNS}" ]

Here is problem: Should Email::MIME understand meaning of email headers?

--> If yes, then for Date should be accepted only valid Date header
    (according to RFC!) and so Unicode string is disallowed.

--> If not, then Email::MIME should not distinguish between header Date
    and X-MyOwnDate. And so it should be allowed to MIME encode string
    for headers.

But Email::MIME currently do something between...

Here we see that header_str does not say (or specify) which string must
be specified as parameter. Unicode string? Arbitrary 8bit string? 7bit
ASCII string? Or ASCII subset visible characters?

I think we should unify API for it. And ideally describe into
documentation how to correctly use it. That /mostly/ with special
exceptions for Message-Id or Date is wrong.

> > Addresses and groups are really something different as previous
> > types (strings). And if we treat them as objects, I would rather
> > see e.g. header_obj (or other different name) instead mixing it
> > again with header_str (which already have exceptions :-(). This is
> > my initial reason for header_addr/grps to distinguish it.
>
> My feeling is that Perl programmers are used to polymorphic
> interfaces, and that multiplying the number of ways to specify
> headers is a needless confusion. What is the benefit to the end user
> of splitting things up?

I see at least 3 benefits:

1) Function name say what it accept

2) No problem with meaning which type of string is accepted (subset
   ASCII, ASCII or Unicode as described above)

3) Possible performance optimization (less objects are created)

And there is another problem still not solved. From $email object it is
needed also to read From/To/Cc headers and user (caller) of Email::MIME
module is sometimes interested in de-composited addresses objects (e.g.
when want to parse each email address in CC field) and sometimes
interested only in one string representation (e.g. want to write header
to STDOUT)... With explicit $email
Re: Email::Address::XS
* p...@cpan.org [2016-07-12T11:43:02]
> On Monday 04 July 2016 01:52:41 Ricardo Signes wrote:
> >
> > I'd stick to header_str, I think, but I'm not sure. At any rate:
> > yes.
>
> And this is what I do not like... to pass objects to function with name
> header_str. That name sounds like it takes string, not object (or more
> objects)...

Either we can add a new name, so people end up having to give "header_str"
and "header_obj" or we can say "in general everything uses header_str, which
follows these simple rules." I would rather do that.

> > > Still do not know how to handle non-MIME headers correctly in
> > > Email::MIME module. We can either create blacklist of non-MIME
> > > headers and extend it every time when somebody report problem or
> > > create whitelist of MIME headers... Or let caller to decide if his
> > > header must be MIME-encoded or not.
> >
> > I'm sorry, I don't understand. Could you elaborate?
>
> If passed pair (header-name, header-value) needs to be MIME encoded or
> not. Currently there is blacklist in Email::MIME for header names which
> are never MIME encoded (like Message-Id, Date, ...) when passing as
> header_str.

So, I'd assume we'd go forward with:

* if you know exactly octets you, the user, want in the header field, use
  "header", but this is likely rare

* if you want to provide a string for a field that's pretty much just a
  string, use header_str and if it requires special handling, we do our
  best, which should get better over time

* but if things are complicated, use an object that represents the
  structured data

I don't like the idea that this will be broken further by adding the object
behavior, though.

  $email->header_str_set($field => $email->header($field));

...should not break things.

> > "header_str" is "text string" which means it will get encoded.
>
> Not exactly, there are exceptions (Message-Id, Date, ...) plus special
> behaviour for addresses headers.

Those /mostly/ still get encoded, but we know that the strings are meant to
be structured, so we try to deconstruct them and encode them correctly. I
think those fields that get passed through unchanged are probably in error at
least insofar as they let you put non-7-bit-clean data in your headers. This
should probably be fatal:

  header_str => [ Date => "\N{SMILING FACE WITH HORNS}" ]

> Addresses and groups are really something different as previous types
> (strings). And if we treat them as objects, I would rather see e.g.
> header_obj (or other different name) instead mixing it again with
> header_str (which already have exceptions :-(). This is my initial
> reason for header_addr/grps to distinguish it.

My feeling is that Perl programmers are used to polymorphic interfaces, and
that multiplying the number of ways to specify headers is a needless
confusion. What is the benefit to the end user of splitting things up?

-- 
rjbs
Re: Email::Address::XS
On Monday 04 July 2016 01:52:41 Ricardo Signes wrote:
> * p...@cpan.org [2016-07-03T08:39:22]
>
> > On Friday 01 July 2016 02:51:31 Ricardo Signes wrote:
> > > What if we defined a role (here, just a well-known name) called
> > > Email::MIME::Header::Value, which is used to signal that a
> > > particular method, say "as_mime_header", should be used to
> > > stringify?
> >
> > In this case, do we need role at all? Is not existence of method
> > "as_mime_header" enough?
>
> That method alone is fine with me.
>
> > And all this would be passed via header or header_str?
>
> I'd stick to header_str, I think, but I'm not sure. At any rate:
> yes.

And this is what I do not like... to pass objects to function with name
header_str. That name sounds like it takes string, not object (or more
objects)...

> > If address(), addrlist() and addrgroup() returns those objects
> > (with as_mime_header() method) it could be usable...
> >
> > But I was thinking about using same syntax in Email::MIME for
> > passing addrlist/addrgroup as is in Email::Address::XS
> > format_email_addresses and format_email_groups functions.
>
> I'm afraid I don't understand what you mean.

Syntax/API of passing email groups to function format_email_groups from
Email::Address::XS module. E.g. to call format_email_groups and
Email::MIME functions with same syntax of
objects/structures/arguments..

> > In my opinion folding and unfolding should be done in
> > Email::Simple. I'm not huge fan of adding folding and CRLF code
> > into Email::Address::XS as it has nothing to do with it. That
> > module parse and format one line of list of addresses.
>
> I agree. I think if we start with the API described above, and leave
> the folding for the message to perform, we'll be okay. We can
> certainly find out by writing the code!
>
> > > What do you think of this all?
> >
> > Still do not know how to handle non-MIME headers correctly in
> > Email::MIME module. We can either create blacklist of non-MIME
> > headers and extend it every time when somebody report problem or
> > create whitelist of MIME headers... Or let caller to decide if his
> > header must be MIME-encoded or not.
>
> I'm sorry, I don't understand. Could you elaborate?

If passed pair (header-name, header-value) needs to be MIME encoded or
not. Currently there is blacklist in Email::MIME for header names which
are never MIME encoded (like Message-Id, Date, ...) when passing as
header_str.

> > Basically we need unambiguous way to specify:
> >
> > * ascii string which will never be MIME-encoded (error for unicode
> >   char)
> > * unicode string which will be MIME-encoded if contains
> >   unicode char
> > * addresses/groups - but again with ability to
> >   specify if do MIME-encode
>
> We have that, right?
>
> "header" is "already encoded", which is another way of saying "do not
> encode this."

Yes.

> "header_str" is "text string" which means it will get encoded.

Not exactly, there are exceptions (Message-Id, Date, ...) plus special
behaviour for addresses headers.

> The address or groups thing falls under "object." I had assumed it
> would, itself, know how to become MIME encoded. This is important,
> because the semantics of what gets encoded differ per field type.
> So, as_mime_header is the encoded form. If you want to offer an
> unencoded form, as_string seems like the obvious method.

Addresses and groups are really something different as previous types
(strings). And if we treat them as objects, I would rather see e.g.
header_obj (or other different name) instead mixing it again with
header_str (which already have exceptions :-(). This is my initial
reason for header_addr/grps to distinguish it.
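[Editor's sketch] The first two cases from the list quoted in the message above (an ASCII string that is never encoded, and a Unicode string that is encoded only when needed) could be kept apart by letting the caller pick the function, roughly as below. The names raw_header_value and text_header_value are hypothetical and not part of Email::MIME; the only library call assumed is encode() with the 'MIME-Header' encoding from the core Encode module. The address/group case is left out.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helpers for illustration only; Email::MIME has no
# functions with these names.

# Case 1: ASCII string that is never MIME-encoded.
# A non-ASCII character is treated as a caller error.
sub raw_header_value {
    my ($value) = @_;
    die "non-ASCII character in raw header value\n"
        if $value =~ /[^\x00-\x7F]/;
    return $value;
}

# Case 2: Unicode string that is MIME-encoded only if it contains
# a non-ASCII character; plain ASCII is passed through unchanged.
sub text_header_value {
    my ($value) = @_;
    return $value unless $value =~ /[^\x00-\x7F]/;
    return encode('MIME-Header', $value);
}

print 'Message-Id: ', raw_header_value('<1234@example.org>'), "\n";
print 'Subject: ',    text_header_value("Caf\x{e9} menu"),    "\n";
```

With this split the decision is made by which function the caller uses, not by a list of header names, which is the behaviour argued for above.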
Re: Email::Address::XS
* p...@cpan.org [2016-07-03T08:39:22]
> On Friday 01 July 2016 02:51:31 Ricardo Signes wrote:
>
> > What if we defined a role (here, just a well-known name) called
> > Email::MIME::Header::Value, which is used to signal that a particular
> > method, say "as_mime_header", should be used to stringify?
>
> In this case, do we need role at all? Is not existence of method
> "as_mime_header" enough?

That method alone is fine with me.

> And all this would be passed via header or header_str?

I'd stick to header_str, I think, but I'm not sure. At any rate: yes.

> If address(), addrlist() and addrgroup() returns those objects (with
> as_mime_header() method) it could be usable...
>
> But I was thinking about using same syntax in Email::MIME for passing
> addrlist/addrgroup as is in Email::Address::XS format_email_addresses
> and format_email_groups functions.

I'm afraid I don't understand what you mean.

> In my opinion folding and unfolding should be done in Email::Simple. I'm
> not huge fan of adding folding and CRLF code into Email::Address::XS as
> it has nothing to do with it. That module parse and format one line of
> list of addresses.

I agree. I think if we start with the API described above, and leave the
folding for the message to perform, we'll be okay. We can certainly find out
by writing the code!

> > What do you think of this all?
>
> Still do not know how to handle non-MIME headers correctly in
> Email::MIME module. We can either create blacklist of non-MIME headers
> and extend it every time when somebody report problem or create
> whitelist of MIME headers... Or let caller to decide if his header must
> be MIME-encoded or not.

I'm sorry, I don't understand. Could you elaborate?

> Basically we need unambiguous way to specify:
>
> * ascii string which will never be MIME-encoded (error for unicode char)
> * unicode string which will be MIME-encoded if contains unicode char
> * addresses/groups - but again with ability to specify if do MIME-encode

We have that, right?

"header" is "already encoded", which is another way of saying "do not encode
this."

"header_str" is "text string" which means it will get encoded.

The address or groups thing falls under "object." I had assumed it would,
itself, know how to become MIME encoded. This is important, because the
semantics of what gets encoded differ per field type. So, as_mime_header is
the encoded form. If you want to offer an unencoded form, as_string seems
like the obvious method.

-- 
rjbs
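[Editor's sketch] The message above separates an encoded form (as_mime_header) from an unencoded form (as_string) and notes that what gets encoded depends on the field type. For an address field that would mean encoding only the display phrase and never the address itself. The class My::Mailbox below is hypothetical and only illustrates that split; it is not the Email::Address::XS implementation, and it skips escaping of quotes inside the phrase. The only library call assumed is Encode::encode() with the core 'MIME-Header' encoding.

```perl
use strict;
use warnings;
use Encode ();

{
    package My::Mailbox;    # hypothetical class, for illustration only

    sub new {
        my ($class, %args) = @_;
        return bless {
            phrase  => $args{phrase},
            address => $args{address},
        }, $class;
    }

    # Unencoded form: the phrase stays a string of Perl characters.
    sub as_string {
        my ($self) = @_;
        return qq{"$self->{phrase}" <$self->{address}>};
    }

    # Encoded form: only the phrase is MIME-encoded, and only when it
    # contains a non-ASCII character; the address is left untouched.
    sub as_mime_header {
        my ($self) = @_;
        my $phrase = $self->{phrase};
        $phrase = $phrase =~ /[^\x00-\x7F]/
            ? Encode::encode('MIME-Header', $phrase)
            : qq{"$phrase"};
        return "$phrase <$self->{address}>";
    }
}

my $plain = My::Mailbox->new(phrase => 'Alice', address => 'alice@example.org');
print $plain->as_mime_header, "\n";    # "Alice" <alice@example.org>
```

A caller that wants to show the header to a person would use as_string, while code that writes the message to the wire would use as_mime_header.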
Re: Email::Address::XS
On Friday 01 July 2016 02:51:31 Ricardo Signes wrote:
> My coworkers have returned to the other side of the world! I
> attended YAPC! I had a vacation! I am back.
>
> * p...@cpan.org [2016-06-01T12:44:01]
>
> > On Tuesday 31 May 2016 02:42:48 Ricardo Signes wrote:
> > > * p...@cpan.org [2016-05-28T16:48:40]
> > >
> > > > Basically yes. From caller perspective I want to pass email
> > > > address object and let Email::MIME to do MIME encoding
> > > > correctly. Something like this:
> > > >
> > > >   my $email = Email::MIME->create(
> > > >     header_addr => [ ... ],
> > > >   );
> > >
> > > I think that requiring people to break headers up even further
> > > into to add a "header_addr" argument is a bit much. And why
> > > header_grps?
>
> So, you had some responses to this which were quite helpful.
>
> My suggestion was meant to be something like "why not make
> Email::MIME understand some kind of object as the value in a
> header?" I think this is still right.
>
> Your main responses were (please correct me if I am misunderstanding
> them):
>
> 1. it should be possible and easy to supply a list of address
>    objects
> 2. it should be possible to have a named group, but not
>    required

Both 1. and 2. are required and 1. should be easy to use by caller. 2.
is not so common, but still email module should be able to support such
thing.

> 3. we don't want ambiguity in how objects passed to
>    (header_str => [...]) are interpreted

Yes!

> What if we defined a role (here, just a well-known name) called
> Email::MIME::Header::Value, which is used to signal that a particular
> method, say "as_mime_header", should be used to stringify?

In this case, do we need role at all? Is not existence of method
"as_mime_header" enough?

> When building the header, the code will do something like:
>
>   $string = $name . ": "
>           . ($value->DOES('Email::MIME::Header::Value')
>               ? $value->as_mime_header
>               : "$value");
>
> No existing object will become confused by this change, only objects
> which do the new role.
>
> Then Email::Address::XS could provide some helper routines, so you
> could write any of:
>
>   From => 'r...@cpan.org'
>
>   From => Email::Address::XS->new(...)
>
>   From => address('r...@cpan.org', 'Ricardo SIGNES')
>
>   From => addrlist( address('r...@cpan.org', 'Ricardo SIGNES'), ... )
>
>   From => addrgroup( Humans => address('r...@cpan.org', 'Rik'), ... )

And all this would be passed via header or header_str?

If address(), addrlist() and addrgroup() returns those objects (with
as_mime_header() method) it could be usable...

But I was thinking about using same syntax in Email::MIME for passing
addrlist/addrgroup as is in Email::Address::XS format_email_addresses
and format_email_groups functions.

> It might be best to make the first code sample actually do:
>
>   $string = $name . ": "
>           . ($value->DOES('Email::MIME::Header::Value')
>               ? $value->as_mime_header($name, $mycrlf) # <-- changed
>               : "$value");
>
> ...to let the object do folding. I'm not sure about that one. I'd
> want to double-check whether there's a reason to not always do the
> folding of the post-stringified form in Email::MIME.

In my opinion folding and unfolding should be done in Email::Simple. I'm
not huge fan of adding folding and CRLF code into Email::Address::XS as
it has nothing to do with it. That module parse and format one line of
list of addresses.

I think that automatically adding CRLF into output string is not good
idea. But maybe it could make sense to tell Email::Address::XS module to
"prepare" output string in format that it can be split by greedy
algorithm for lines which are XX chars long (maybe 72 by default?). But
now I'm not sure if it is possible... Email::Address::XS know nothing
about MIME or encodings. It just format what caller pass it...

> Anyway, this avoids adding multiple more places to set headers and
> makes the API extensible for other header types like Message-ID,
> etc, in the future.
>
> What do you think of this all?

Still do not know how to handle non-MIME headers correctly in
Email::MIME module. We can either create blacklist of non-MIME headers
and extend it every time when somebody report problem or create
whitelist of MIME headers... Or let caller to decide if his header must
be MIME-encoded or not.

And I think that it is better to let caller decide and do not do any
magic (decide based on header name if MIME-encode or not). As adding any
entry into black/white-list could cause breaking existing SW when update
to new Email::MIME module.

My attempt with header_addr and header_grps achieve it. Passing objects
with own as_mime_header() method could work too (problem is with that
folding).

But reason why I proposed header_addr and header_grps is ability to
directly pass input to / output from Email::Address::XS functions
(without
Re: Email::Address::XS
My coworkers have returned to the other side of the world! I attended YAPC!
I had a vacation! I am back.

* p...@cpan.org [2016-06-01T12:44:01]
> On Tuesday 31 May 2016 02:42:48 Ricardo Signes wrote:
> > * p...@cpan.org [2016-05-28T16:48:40]
> >
> > > Basically yes. From caller perspective I want to pass email address
> > > object and let Email::MIME to do MIME encoding correctly. Something
> > > like this:
> > >
> > >   my $email = Email::MIME->create(
> > >     header_addr => [ ... ],
> > >   );
> >
> > I think that requiring people to break headers up even further into
> > to add a "header_addr" argument is a bit much. And why header_grps?

So, you had some responses to this which were quite helpful.

My suggestion was meant to be something like "why not make Email::MIME
understand some kind of object as the value in a header?" I think this is
still right.

Your main responses were (please correct me if I am misunderstanding them):

1. it should be possible and easy to supply a list of address objects
2. it should be possible to have a named group, but not required
3. we don't want ambiguity in how objects passed to (header_str => [...])
   are interpreted

What if we defined a role (here, just a well-known name) called
Email::MIME::Header::Value, which is used to signal that a particular method,
say "as_mime_header", should be used to stringify?

When building the header, the code will do something like:

  $string = $name . ": "
          . ($value->DOES('Email::MIME::Header::Value')
              ? $value->as_mime_header
              : "$value");

No existing object will become confused by this change, only objects which do
the new role.

Then Email::Address::XS could provide some helper routines, so you could write
any of:

  From => 'r...@cpan.org'

  From => Email::Address::XS->new(...)

  From => address('r...@cpan.org', 'Ricardo SIGNES')

  From => addrlist( address('r...@cpan.org', 'Ricardo SIGNES'), ... )

  From => addrgroup( Humans => address('r...@cpan.org', 'Rik'), ... )

It might be best to make the first code sample actually do:

  $string = $name . ": "
          . ($value->DOES('Email::MIME::Header::Value')
              ? $value->as_mime_header($name, $mycrlf) # <-- changed
              : "$value");

...to let the object do folding. I'm not sure about that one. I'd want to
double-check whether there's a reason to not always do the folding of the
post-stringified form in Email::MIME.

Anyway, this avoids adding multiple more places to set headers and makes the
API extensible for other header types like Message-ID, etc, in the future.

What do you think of this all?

-- 
rjbs
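[Editor's sketch] The dispatch described in the message above can be tried out without any CPAN modules. The class My::Header::AddrList and the function header_line are hypothetical stand-ins, not code from Email::MIME or Email::Address::XS. Two details are assumptions added here: the check uses can('as_mime_header') instead of a role (the simplification agreed later in the thread), and a blessed() guard is added so that plain strings are never asked for a method.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    package My::Header::AddrList;    # hypothetical value class

    sub new {
        my ($class, @pairs) = @_;    # each item is [ phrase, address ]
        return bless { pairs => [@pairs] }, $class;
    }

    # The well-known method name suggested in the thread.
    sub as_mime_header {
        my ($self) = @_;
        return join ', ',
            map { qq{"$_->[0]" <$_->[1]>} } @{ $self->{pairs} };
    }
}

# Build one header line. Objects that offer as_mime_header render
# themselves; every other value is stringified as before.
sub header_line {
    my ($name, $value) = @_;
    my $body = (blessed($value) && $value->can('as_mime_header'))
        ? $value->as_mime_header
        : "$value";
    return "$name: $body";
}

print header_line(From => 'sender@example.org'), "\n";
print header_line(To => My::Header::AddrList->new(
    [ 'Alice', 'alice@example.org' ],
    [ 'Bob',   'bob@example.org' ],
)), "\n";
```

Existing callers that pass plain strings are unaffected, since only blessed values with the new method take the other branch. Folding is left to the caller here.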
Re: Email::Address::XS
On Tuesday 31 May 2016 02:42:48 Ricardo Signes wrote:
> * p...@cpan.org [2016-05-28T16:48:40]
>
> > Basically yes. From caller perspective I want to pass email address
> > object and let Email::MIME to do MIME encoding correctly. Something
> > like this:
> >
> >   my $email = Email::MIME->create(
> >     header_addr => [ ... ],
> >   );
>
> I think that requiring people to break headers up even further into
> to add a "header_addr" argument is a bit much. And why header_grps?

In most cases you do not send emails with named group of addresses. So
it make sense to make it more easier to specify just list of addresses
which does not belong to any named group.

But in some cases you can compose & send email with list of addresses
which are in some named group. So some extended syntax is needed.

> How about object that represents the group?

For those who want to compose email is easier to specify addresses in
perl list, not creating such objects or similar... It also leads to more
readable code.

> Then the existing
> header and header_str arguments can start silently accepting these
> objects and doing the right thing.

Yes, that can be implemented. But there is problem with stringification
and other such things of object. Currently when I passed some variable
into header or header_str I was sure that string value of that variable
was used. When some object magic is used then it will not be such easy
to check if string value (stringified) or object value is used.

So this is reason why I proposed header_addr and header_grps to make it
clear what you want to pass. Also it is similar to list/group API of
Email::Address::XS module.

If you think that API for named group support in Email::Address::XS is
not good or can be improved, let me know. What I would like to see is
same/similar usage named group addresses in Email::MIME and
Email::Address::XS.
Re: Email::Address::XS
* p...@cpan.org [2016-05-28T16:48:40]
> Basically yes. From caller perspective I want to pass email address
> object and let Email::MIME to do MIME encoding correctly. Something like
> this:
>
>   my $email = Email::MIME->create(
>     header_addr => [ ... ],
>   );

I think that requiring people to break headers up even further into to add a
"header_addr" argument is a bit much. And why header_grps?

How about object that represents the group? Then the existing header and
header_str arguments can start silently accepting these objects and doing the
right thing.

-- 
rjbs
Re: Email::Address::XS
On Saturday 28 May 2016 22:33:02 Ricardo Signes wrote:
> > Thanks to named group support I would like to extend Email::MIME
> > module to allow passing directly Email::Address::XS objects, not
> > only string headers to make MIME encoding and decoding from
> > applications easier.
> >
> > What do you think about it?
>
> I'm not sure what you're suggesting. Do you mean:
>
>   Email::MIME->create(..., header => [ To => $addr_xs, ... ]);
>
> ...as opposed to:
>
>   Email::MIME->create(..., header => [ To => $addr_xs->as_string, ... ]);
>
> ? Could you elaborate?

Basically yes. From caller perspective I want to pass email address
object and let Email::MIME to do MIME encoding correctly. Something like
this:

  my $email = Email::MIME->create(
    header_addr => [
      From => Email::Address::XS->new(Name => 'user@host'),
      To => [
        Email::Address::XS->new(Name2 => 'user2@host'),
        Email::Address::XS->new(Name3 => 'user3@host'),
      ],
    ],
  );

Currently Email::MIME module takes UTF-8 formatted To (or Cc) header,
construct from it Email::Address object, then MIME encode phrase part
and after that format header back to string line.

If I pass Email::Address::XS object directly to Email::MIME, then one
step of decomposition (from ->as_string back to Email::Address object)
will not be needed.

Also in same way I would like to pass named group of email addresses, e.g:

  my $email = Email::MIME->create(
    header_grps => [
      To => [
        $group_name => [ $address1_obj, $address2_obj ],
      ],
    ],
  );

Currently Email::MIME from all named groups, because it uses
Email::Address parser and it does not support it. My Email::Address::XS
supports also named groups of addresses, so above syntax can be
implemented via Email::Address::XS module.
Re: Email::Address::XS
* p...@cpan.org [2016-05-23T13:05:39]
> I created new perl module Email::Address::XS for parsing and formatting
> email groups or addresses. Parser is borrowed from dovecot and that part
> implemented in C/XS.

Cool!

> Thanks to named group support I would like to extend Email::MIME module
> to allow passing directly Email::Address::XS objects, not only string
> headers to make MIME encoding and decoding from applications easier.
>
> What do you think about it?

I'm not sure what you're suggesting. Do you mean:

  Email::MIME->create(..., header => [ To => $addr_xs, ... ]);

...as opposed to:

  Email::MIME->create(..., header => [ To => $addr_xs->as_string, ... ]);

? Could you elaborate?

-- 
rjbs