[ 
https://issues.apache.org/jira/browse/NIFI-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394989#comment-17394989
 ] 

ASF subversion and git services commented on NIFI-8773:
-------------------------------------------------------

Commit bf52973d628b092902f389e638448a6cd458d6a2 in nifi's branch 
refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=bf52973 ]

NIFI-8773: Implemented Line Start Pattern in TailFile

Each message encountered in the tailed file will be buffered (up to some 
configurable max) until the subsequent message arrives. At that point, the 
previous message will be flushed.

This closes #5251

Signed-off-by: David Handermann <exceptionfact...@apache.org>


> Allow TailFile to hold off on ingesting lines of text if the full 
> (multi-line) message is not available
> -------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-8773
>                 URL: https://issues.apache.org/jira/browse/NIFI-8773
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.15.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When using TailFile, there are times when multi-line messages are written to 
> a file. For example, we may have something like:
> {code}
> <1> My Message
> <2> My Message
> <3> My Message
>    A continuation of my message
> {code}
> If TailFile runs at this point, it will ingest these 4 lines of text as a FlowFile.
> The next lines to be written, though, may be something like:
> {code}
>   Another continuation of my message
>   A final continuation
> <4> Another Message
> <5> Yet another Message
> {code}
> And we may want to avoid pulling in lines "<3> My Message" and "   A 
> continuation of my message" until the full message is available to 
> consume.
> We should enable this capability by allowing for a new property that 
> specifies a Regular Expression to run against the start of a line. If we read 
> a line from the file and it matches that Regex, then we know the previous 
> message is complete. Otherwise, the previous message may not be complete and 
> should be buffered (up to some configurable limit, in order to avoid 
> exhausting the Java heap).
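
The buffering rule described above can be sketched roughly as follows. This is a
minimal illustration only, not the code from the commit: the class name
LineStartBuffer, the offer() method, and the maxBufferChars parameter are
hypothetical, and flushing the buffer once the limit is exceeded is an assumed
policy that the issue text does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not NiFi's actual TailFile implementation.
final class LineStartBuffer {
    private final Pattern lineStart;    // regex tested against the start of each line
    private final int maxBufferChars;   // assumed cap on buffered characters
    private final StringBuilder pending = new StringBuilder();

    LineStartBuffer(String lineStartRegex, int maxBufferChars) {
        this.lineStart = Pattern.compile(lineStartRegex);
        this.maxBufferChars = maxBufferChars;
    }

    // Accepts one line (without its trailing newline) and returns any
    // messages that are now known to be complete.
    List<String> offer(String line) {
        List<String> complete = new ArrayList<>();
        // lookingAt() only matches at the beginning of the line.
        boolean startsNewMessage = lineStart.matcher(line).lookingAt();
        if (startsNewMessage && pending.length() > 0) {
            // A new message has begun, so the buffered one is complete.
            complete.add(pending.toString());
            pending.setLength(0);
        }
        if (pending.length() > 0) {
            pending.append('\n');
        }
        pending.append(line);
        // Assumed overflow policy: emit whatever is buffered once the cap is exceeded.
        if (pending.length() > maxBufferChars) {
            complete.add(pending.toString());
            pending.setLength(0);
        }
        return complete;
    }
}
```

With a pattern such as "<\\d+>" and the example lines from the description,
"<3> My Message" and its continuation lines stay buffered until
"<4> Another Message" is offered, at which point they are returned together as
one message.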



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
