Joe,

That's exactly the idea.

I envision to, from, cc, connecting host (src_ip of the last hop), subject,
time and possibly an option to iterate over the headers, adding
discretionary key/value pairs for things like SpamAssassin scores, etc.

I plan to keep things simple, so I don't intend to add things like SPF,
DKIM, etc., but I am keen to consider them.

Happy to call it ExtractMailAttachment. I considered this type of more
explicit name previously but settled on "parse" just because syslog adopted
parse as well (although ListenSyslog is also capable of parsing).

Will raise a JIRA to track.

Cheers

On 19 May 2016 12:12, "Joe Witt" <joe.w...@gmail.com> wrote:

> Andre
>
> I like the idea. I'd suggest having 'ListenSMTP' go ahead and create
> a good set of FlowFile attributes for things like
> to/from/cc/subject/number of attachments/time/etc... that make sense
> for a given e-mail. The body of the flowfile would be the entire
> message which I believe would include the attachments themselves which
> is fair game. If you did need/want to split out the attachments in
> your flow then I'd say the 'ParseEmail' idea is good but perhaps call
> it 'SplitEmail' or 'ExtractEmailAttachment' or something like that.
>
> Thanks
> Joe
>
> On Wed, May 18, 2016 at 7:43 PM, Andre F de Miranda <af...@fucs.org> wrote:
> > All,
> >
> > I have been considering writing a "ListenSMTP" processor and was wondering
> > *what is the best way of dealing with multiple attachments*.
> >
> > Looking in here
> >
> > https://mail-archives.apache.org/mod_mbox/nifi-users/201602.mbox/%3ccaljk9a5ulcitnfo0dlsvd5d-jkcsqm+rqjxuruzwgrdbqad...@mail.gmail.com%3E
> >
> > I can read Joe suggesting not using attributes to store large volumes of
> > data, so far so good, however, as far as I understand a flowfile can only
> > contain one "content".
> >
> > Currently, the way I envision this is a modular design that taps into
> > the pattern set by ListenSyslog / ParseSyslog:
> >
> > ListenSMTP - A processor that only provides an SMTP interface
> >
> > ParseEmail - A processor that reads the flowfile holding the email body
> > and splits it into 1 or more flowfiles containing the attached mime objects.
> >
> > The advantage here is that people can use FetchFile, or create a GetIMAP
> > processor, to parse messages.
> >
> > Would anyone have a different view on how to achieve this?
> >
> > I thank you in advance
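
For illustration only, the two steps discussed in this thread (deriving attributes from the headers, then splitting out the attachments) might look roughly like the sketch below. It uses Python's standard email package rather than NiFi's Java API, so it is not the proposed processor code. The function name split_email and the attribute keys (email.from, email.header.*, email.attachment.count) are hypothetical, chosen only for this example. The connecting host is left out because it comes from the SMTP session, not from the message bytes.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw, extra_headers=()):
    """Return (attributes, attachments) for one raw RFC 5322 message.

    attributes  -- dict of small string values taken from the headers
    attachments -- list of (filename, content_type, payload_bytes) tuples
    All attribute key names here are hypothetical.
    """
    msg = message_from_bytes(raw, policy=policy.default)

    attrs = {}
    # Fixed set of common headers; missing ones are simply skipped.
    for name in ("From", "To", "Cc", "Subject", "Date"):
        value = msg.get(name)
        if value is not None:
            attrs["email." + name.lower()] = str(value)

    # Optional, caller-chosen headers (e.g. a spam score header).
    for name in extra_headers:
        value = msg.get(name)
        if value is not None:
            attrs["email.header." + name.lower()] = str(value)

    # Each attachment would become its own flowfile content in the
    # ParseEmail / ExtractMailAttachment idea; here it is just a tuple.
    attachments = [
        (part.get_filename(), part.get_content_type(),
         part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]
    attrs["email.attachment.count"] = str(len(attachments))
    return attrs, attachments


# Usage example with a small hand-built message.
demo = EmailMessage()
demo["From"] = "alice@example.com"
demo["To"] = "bob@example.com"
demo["Subject"] = "Report"
demo["X-Spam-Score"] = "1.5"
demo.set_content("See attached.")
demo.add_attachment(b"\x00\x01", maintype="application",
                    subtype="octet-stream", filename="a.bin")

attrs, attachments = split_email(demo.as_bytes(),
                                 extra_headers=["X-Spam-Score"])
print(attrs)
print([(name, ctype, len(data)) for name, ctype, data in attachments])
```

Keeping only short header values in the attribute dict and returning attachment bytes separately mirrors the point made earlier in the thread about not storing large volumes of data in attributes.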