Github user trixpan commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/827#discussion_r74414087
  
    --- Diff: nifi-nar-bundles/nifi-email-bundle/nifi-email-processors/src/main/java/org/apache/nifi/processors/email/ListenSMTP.java ---
    @@ -13,89 +13,52 @@
      *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      *  See the License for the specific language governing permissions and
      *  limitations under the License.
    -*/
    + */
     package org.apache.nifi.processors.email;
     
    -import javax.net.ssl.SSLContext;
    -import javax.net.ssl.SSLSocket;
    -import javax.net.ssl.SSLSocketFactory;
     import java.io.IOException;
    -import java.io.InputStream;
    +import java.io.OutputStream;
     import java.net.InetSocketAddress;
     import java.net.Socket;
     import java.util.ArrayList;
    -import java.util.Collection;
     import java.util.Collections;
    -import java.util.HashMap;
     import java.util.HashSet;
     import java.util.List;
    -import java.util.Map;
     import java.util.Set;
    -import java.util.concurrent.LinkedBlockingQueue;
     import java.util.concurrent.TimeUnit;
    -import java.util.concurrent.atomic.AtomicBoolean;
    -
    -import org.apache.commons.lang3.StringUtils;
    -
    -import org.subethamail.smtp.server.SMTPServer;
    -
    -
    -import org.apache.nifi.annotation.lifecycle.OnStopped;
    -import org.apache.nifi.annotation.lifecycle.OnUnscheduled;
    -import org.apache.nifi.flowfile.attributes.CoreAttributes;
    -import org.apache.nifi.processor.DataUnit;
    +import java.util.concurrent.atomic.AtomicInteger;
     
    -import org.apache.nifi.annotation.lifecycle.OnScheduled;
    -import org.apache.nifi.components.PropertyDescriptor;
    -import org.apache.nifi.processor.AbstractProcessor;
    -import org.apache.nifi.processor.ProcessorInitializationContext;
    -import org.apache.nifi.processor.Relationship;
    -import org.apache.nifi.processor.util.StandardValidators;
    +import javax.net.ssl.SSLContext;
    +import javax.net.ssl.SSLSocket;
    +import javax.net.ssl.SSLSocketFactory;
     
    +import org.apache.commons.io.IOUtils;
     import org.apache.nifi.annotation.behavior.InputRequirement;
    -import org.apache.nifi.annotation.behavior.WritesAttribute;
    -import org.apache.nifi.annotation.behavior.WritesAttributes;
     import org.apache.nifi.annotation.documentation.CapabilityDescription;
     import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.annotation.lifecycle.OnStopped;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
    +import org.apache.nifi.processor.DataUnit;
     import org.apache.nifi.processor.ProcessContext;
     import org.apache.nifi.processor.ProcessSession;
    +import org.apache.nifi.processor.ProcessSessionFactory;
    +import org.apache.nifi.processor.Relationship;
     import org.apache.nifi.processor.exception.ProcessException;
    -import org.apache.nifi.components.ValidationContext;
    -import org.apache.nifi.components.ValidationResult;
    -import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.io.OutputStreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
     import org.apache.nifi.ssl.SSLContextService;
    -
    -import org.apache.nifi.processors.email.smtp.event.SmtpEvent;
    -import org.apache.nifi.processors.email.smtp.handler.SMTPResultCode;
    -import org.apache.nifi.processors.email.smtp.handler.SMTPMessageHandlerFactory;
    +import org.subethamail.smtp.server.SMTPServer;
     
     @Tags({"listen", "email", "smtp"})
     @InputRequirement(InputRequirement.Requirement.INPUT_FORBIDDEN)
    -@CapabilityDescription("This processor implements a lightweight SMTP server to an arbitrary port, " +
    -        "allowing nifi to listen for incoming email. " +
    -        "" +
    -        "Note this server does not perform any email validation. If direct exposure to the internet is sought," +
    -        "it may be a better idea to use the combination of NiFi and an industrial scale MTA (e.g. Postfix)")
    -@WritesAttributes({
    -        @WritesAttribute(attribute = "mime.type", description = "The value used during HELO"),
    -        @WritesAttribute(attribute = "smtp.helo", description = "The value used during HELO"),
    -        @WritesAttribute(attribute = "smtp.certificates.*.serial", description = "The serial numbers for each of the " +
    -                "certificates used by an TLS peer"),
    -        @WritesAttribute(attribute = "smtp.certificates.*.principal", description = "The principal for each of the " +
    -                "certificates used by an TLS peer"),
    -        @WritesAttribute(attribute = "smtp.from", description = "The value used during MAIL FROM (i.e. envelope)"),
    -        @WritesAttribute(attribute = "smtp.to", description = "The value used during RCPT TO (i.e. envelope)"),
    -        @WritesAttribute(attribute = "smtp.src", description = "The source IP of the SMTP connection")})
    --- End diff --
    
    I understand you may not want to add those to the IMAP and POP processors, but Listeners and Getters are intrinsically different kinds of processor:
    
    Listeners accept connections from outside entities, which may be outside the control of the DFM; Getters (in theory) only establish connections to targets defined by the DFM.
    
    It is therefore critical for the chain of custody of a flow (an inbound email may be a medical record...) that we record information from the ingestion of the data through to its delivery. Attributes help with that.
    
    Note that this isn't particular to this processor; it is a pattern used all across NiFi.
    
    HandleHttpRequest records TLS client details and things like:
    
    ```
    http.remote.host    The hostname of the requestor
    http.remote.addr    The hostname:port combination of the requestor
    http.remote.user    The username of the requestor
    ```
    
    ListenRELP
    
    ```
    relp.sender The sending host of the messages.
    relp.port   The sending port the messages were received over.
    ```
    
    ListenTCP & ListenUDP
    
    ```
    tcp.sender  The sending host of the messages.
    udp.sender  The sending host of the messages.
    ```
    
    ListenSyslog records both the Hostname and the Sender.
    
    ```
    syslog.hostname     The hostname of the Syslog message.
    syslog.sender       The hostname of the Syslog server that sent the message.
    ```
    
    You may not be fully aware of this, but the sender of a syslog message is the IP of the server that delivered it, while the hostname is a field inside the syslog message. Think of the sender as the envelope MAIL FROM, while the hostname is the "From" of a message.
    
    Why is this recorded? 
    
    Because the sender may just be forwarding messages on behalf of a certain host, and you want to know that. Otherwise evilmachine.com may send messages on behalf of clueless.org and you would never know.
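    
    To make that concrete, here is a minimal sketch (plain Java, not NiFi code) of a downstream check that compares the two attributes. The class and method names are hypothetical, and it assumes both values have already been normalised to the same form (e.g. both hostnames), which a real flow would have to take care of itself:
    
    ```java
    import java.util.HashMap;
    import java.util.Map;
    
    // Hypothetical helper, not part of NiFi: flags messages whose wire-level
    // sender differs from the hostname claimed inside the message.
    public class RelayCheck {
    
        // Returns true only when both attributes are present and differ.
        // Missing attributes are treated as "cannot tell" and return false.
        public static boolean sentOnBehalfOfAnotherHost(Map<String, String> attributes) {
            String sender = attributes.get("syslog.sender");
            String hostname = attributes.get("syslog.hostname");
            if (sender == null || hostname == null) {
                return false;
            }
            // Assumption: both values are comparable hostnames.
            return !sender.trim().equalsIgnoreCase(hostname.trim());
        }
    
        public static void main(String[] args) {
            Map<String, String> attributes = new HashMap<>();
            attributes.put("syslog.sender", "evilmachine.com");
            attributes.put("syslog.hostname", "clueless.org");
            System.out.println(sentOnBehalfOfAnotherHost(attributes));
        }
    }
    ```
    
    Without the sender attribute recorded by the listener, there would be nothing to compare the hostname against.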
    
    Add that to the fact that @bbende allowed for decoupled processing (ParseSyslog) and you find a pattern where Listeners record "wire"-level details and parsers deal with payload content. Hence ListenSMTP (ingests and records envelope data) -> ExtractEmailAttributes (deals exclusively with Header data) -> ExtractAttachments (deals only with attachments) -> ExtractTNEFAttachments (handles attachments that happen to be TNEF files).
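    
    As a rough illustration of the listener side of that split, the sketch below builds the envelope attributes without touching headers or body. The attribute names come from the `@WritesAttributes` block removed in the diff above; the class and method names are hypothetical and the values are made up. In a real processor I would expect the map to be attached to the FlowFile with `ProcessSession.putAllAttributes`, but that part is left out to keep this self-contained:
    
    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    
    // Hypothetical helper, not the processor's actual code: collects only
    // "wire"-level (connection and envelope) details for a received message.
    public class EnvelopeAttributes {
    
        public static Map<String, String> fromEnvelope(String helo, String mailFrom,
                                                       String rcptTo, String sourceIp) {
            // LinkedHashMap keeps insertion order, which makes the output stable.
            Map<String, String> attributes = new LinkedHashMap<>();
            attributes.put("smtp.helo", helo);     // value used during HELO
            attributes.put("smtp.from", mailFrom); // MAIL FROM (envelope)
            attributes.put("smtp.to", rcptTo);     // RCPT TO (envelope)
            attributes.put("smtp.src", sourceIp);  // source IP of the connection
            return attributes;
        }
    
        public static void main(String[] args) {
            Map<String, String> attributes = fromEnvelope(
                    "mail.example.org", "alice@example.org",
                    "bob@example.net", "192.0.2.10");
            attributes.forEach((name, value) -> System.out.println(name + " = " + value));
        }
    }
    ```
    
    Header parsing and attachment handling stay in the downstream processors, so the listener never needs to understand the message content.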
    
    Hope this helps to clarify.


