Github user trixpan commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/827#discussion_r74414087

    --- Diff: nifi-nar-bundles/nifi-email-bundle/nifi-email-processors/src/main/java/org/apache/nifi/processors/email/ListenSMTP.java ---
    @@ -13,89 +13,52 @@
      * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      * See the License for the specific language governing permissions and
      * limitations under the License.
    -*/
    + */
     package org.apache.nifi.processors.email;
    -import javax.net.ssl.SSLContext;
    -import javax.net.ssl.SSLSocket;
    -import javax.net.ssl.SSLSocketFactory;
     import java.io.IOException;
    -import java.io.InputStream;
    +import java.io.OutputStream;
     import java.net.InetSocketAddress;
     import java.net.Socket;
     import java.util.ArrayList;
    -import java.util.Collection;
     import java.util.Collections;
    -import java.util.HashMap;
     import java.util.HashSet;
     import java.util.List;
    -import java.util.Map;
     import java.util.Set;
    -import java.util.concurrent.LinkedBlockingQueue;
     import java.util.concurrent.TimeUnit;
    -import java.util.concurrent.atomic.AtomicBoolean;
    -
    -import org.apache.commons.lang3.StringUtils;
    -
    -import org.subethamail.smtp.server.SMTPServer;
    -
    -
    -import org.apache.nifi.annotation.lifecycle.OnStopped;
    -import org.apache.nifi.annotation.lifecycle.OnUnscheduled;
    -import org.apache.nifi.flowfile.attributes.CoreAttributes;
    -import org.apache.nifi.processor.DataUnit;
    +import java.util.concurrent.atomic.AtomicInteger;
    -import org.apache.nifi.annotation.lifecycle.OnScheduled;
    -import org.apache.nifi.components.PropertyDescriptor;
    -import org.apache.nifi.processor.AbstractProcessor;
    -import org.apache.nifi.processor.ProcessorInitializationContext;
    -import org.apache.nifi.processor.Relationship;
    -import org.apache.nifi.processor.util.StandardValidators;
    +import javax.net.ssl.SSLContext;
    +import javax.net.ssl.SSLSocket;
    +import javax.net.ssl.SSLSocketFactory;
    +import org.apache.commons.io.IOUtils;
     import org.apache.nifi.annotation.behavior.InputRequirement;
    -import org.apache.nifi.annotation.behavior.WritesAttribute;
    -import org.apache.nifi.annotation.behavior.WritesAttributes;
     import org.apache.nifi.annotation.documentation.CapabilityDescription;
     import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.annotation.lifecycle.OnStopped;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
    +import org.apache.nifi.processor.DataUnit;
     import org.apache.nifi.processor.ProcessContext;
     import org.apache.nifi.processor.ProcessSession;
    +import org.apache.nifi.processor.ProcessSessionFactory;
    +import org.apache.nifi.processor.Relationship;
     import org.apache.nifi.processor.exception.ProcessException;
    -import org.apache.nifi.components.ValidationContext;
    -import org.apache.nifi.components.ValidationResult;
    -import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.io.OutputStreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
     import org.apache.nifi.ssl.SSLContextService;
    -
    -import org.apache.nifi.processors.email.smtp.event.SmtpEvent;
    -import org.apache.nifi.processors.email.smtp.handler.SMTPResultCode;
    -import org.apache.nifi.processors.email.smtp.handler.SMTPMessageHandlerFactory;
    +import org.subethamail.smtp.server.SMTPServer;
     @Tags({"listen", "email", "smtp"})
     @InputRequirement(InputRequirement.Requirement.INPUT_FORBIDDEN)
    -@CapabilityDescription("This processor implements a lightweight SMTP server to an arbitrary port, " +
    -        "allowing nifi to listen for incoming email. " +
    -        "" +
    -        "Note this server does not perform any email validation. If direct exposure to the internet is sought," +
    -        "it may be a better idea to use the combination of NiFi and an industrial scale MTA (e.g. Postfix)")
    -@WritesAttributes({
    -        @WritesAttribute(attribute = "mime.type", description = "The value used during HELO"),
    -        @WritesAttribute(attribute = "smtp.helo", description = "The value used during HELO"),
    -        @WritesAttribute(attribute = "smtp.certificates.*.serial", description = "The serial numbers for each of the " +
    -                "certificates used by an TLS peer"),
    -        @WritesAttribute(attribute = "smtp.certificates.*.principal", description = "The principal for each of the " +
    -                "certificates used by an TLS peer"),
    -        @WritesAttribute(attribute = "smtp.from", description = "The value used during MAIL FROM (i.e. envelope)"),
    -        @WritesAttribute(attribute = "smtp.to", description = "The value used during RCPT TO (i.e. envelope)"),
    -        @WritesAttribute(attribute = "smtp.src", description = "The source IP of the SMTP connection")})
    --- End diff --

I understand you may not want to add those in IMAP and POP, but they are intrinsically different processors. Listeners accept connections from outside entities, which may be outside the control of the DFM; Getters (in theory) only establish connections to targets defined by the DFM.

It is therefore critical for the chain of custody of a flow (an inbound email may be a medical record...) that we record information from the ingestion of the data through to its delivery. Attributes help with that.

Note that this isn't something particular to this processor; it is a pattern used all across NiFi.

HandleHttpRequest records TLS client details and things like:

```
http.remote.host    The hostname of the requestor
http.remote.addr    The hostname:port combination of the requestor
http.remote.user    The username of the requestor
```

ListenRELP

```
relp.sender    The sending host of the messages.
relp.port      The sending port the messages were received over.
```

ListenTCP & ListenUDP

```
tcp.sender    The sending host of the messages.
udp.sender    The sending host of the messages.
```

ListenSyslog records both the Hostname and the Sender.
```
syslog.hostname    The hostname of the Syslog message.
syslog.sender      The hostname of the Syslog server that sent the message.
```

You may not be fully aware, but the sender of a syslog message is the server IP, while the hostname is a field of the syslog message. Think of the sender as the MAIL FROM envelope, while the hostname is the "From" of a message.

Why is this recorded? Because the sender may just be forwarding messages on behalf of a certain host, and you want to know that. Otherwise evilmachine.com may send messages on behalf of clueless.org and you would never know.

Add that to the fact that @bbende allowed for decoupled processing (ParseSyslog), and you find a pattern where Listeners record "wire" level details and parsers deal with payload content.

Hence:

ListenSMTP (ingest and record envelope data)
-> ExtractEmailAttributes (deals exclusively with Header data)
-> ExtractAttachments (deals only with attachments)
-> ExtractTNEFAttachments (handles attachments that happen to be TNEF files)

Hope this helps to clarify.
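To make the listener/parser split concrete, here is a minimal plain-Java sketch of what "recording envelope data" could look like. It is an illustration under stated assumptions, not the ListenSMTP implementation: the `EnvelopeAttributes` class and both method names are hypothetical, only the `smtp.*` attribute names come from the diff above, and joining multiple recipients with a comma is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of NiFi or of this pull request.
final class EnvelopeAttributes {

    private EnvelopeAttributes() {
    }

    // Builds the attribute map a listener could attach to a FlowFile.
    // Only connection ("wire") and envelope data is recorded here; message
    // headers and attachments are left to downstream processors.
    static Map<String, String> fromEnvelope(String helo, String mailFrom,
                                            List<String> rcptTo, String remoteAddress) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("smtp.helo", helo);
        attributes.put("smtp.from", mailFrom);
        // Assumption: multiple recipients are joined with a comma.
        attributes.put("smtp.to", String.join(",", rcptTo));
        attributes.put("smtp.src", remoteAddress);
        return Collections.unmodifiableMap(attributes);
    }

    // Returns true when the envelope sender differs from the header "From",
    // i.e. the message may have been sent on behalf of someone else.
    // Simplification: both arguments are bare addresses, compared ignoring case.
    static boolean sentOnBehalfOf(String envelopeFrom, String headerFrom) {
        return !envelopeFrom.trim().equalsIgnoreCase(headerFrom.trim());
    }

    public static void main(String[] args) {
        Map<String, String> attributes = fromEnvelope(
                "mail.evilmachine.com",
                "relay@evilmachine.com",
                Arrays.asList("bob@example.com", "carol@example.com"),
                "192.0.2.10");
        attributes.forEach((name, value) -> System.out.println(name + " = " + value));

        // The header "From" would come from a downstream parser, not the listener.
        System.out.println("on behalf of: "
                + sentOnBehalfOf(attributes.get("smtp.from"), "someone@clueless.org"));
    }
}
```

Without the recorded envelope attributes, the comparison in `sentOnBehalfOf` would have nothing to compare the header against once the message has left the listener.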