Flow File Packager v3. You can find the source here:

https://github.com/apache/nifi/blob/main/nifi-commons/nifi-flowfile-packager/src/main/java/org/apache/nifi/util/FlowFilePackagerV3.java

It's a serialization format used for writing a flowfile (content and
attributes) to a stream (network, file, etc.). It's a simple binary
format: effectively the attributes serialized as key/value pairs, followed
by the content. A byte-size marker is written at the start of each field,
so that a deserializer knows how many bytes to read into a byte array (or
equivalent) for each value.
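
To make the layout concrete, here is a minimal Python sketch of a
length-prefixed writer and reader in the same spirit. The field widths
(2-byte big-endian lengths for the attribute count, keys and values, and an
8-byte content length) and the "NiFiFF3" magic header are my assumptions
about the format, and the sketch skips the handling the real packager has
for very long fields. The function names are made up for illustration. The
Java source linked above is the authoritative reference.

```python
import io
import struct

MAGIC = b"NiFiFF3"  # assumed magic header; verify against FlowFilePackagerV3.java


def _write_field(out, data: bytes) -> None:
    # Assumed layout: 2-byte big-endian length marker, then the raw bytes.
    if len(data) >= 0xFFFF:
        raise ValueError("field too long for this simplified sketch")
    out.write(struct.pack(">H", len(data)))
    out.write(data)


def package(attributes: dict, content: bytes) -> bytes:
    """Serialize attributes as key/value pairs, followed by the content."""
    if len(attributes) >= 0xFFFF:
        raise ValueError("too many attributes for this simplified sketch")
    out = io.BytesIO()
    out.write(MAGIC)
    out.write(struct.pack(">H", len(attributes)))  # number of attributes
    for key, value in attributes.items():
        _write_field(out, key.encode("utf-8"))
        _write_field(out, value.encode("utf-8"))
    out.write(struct.pack(">Q", len(content)))  # assumed 8-byte content size
    out.write(content)
    return out.getvalue()


def _read_exact(stream, n: int) -> bytes:
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended in the middle of a field")
    return data


def _read_field(stream) -> bytes:
    (length,) = struct.unpack(">H", _read_exact(stream, 2))
    return _read_exact(stream, length)


def unpackage(stream):
    """Read one packaged flowfile from the stream: (attributes, content)."""
    if _read_exact(stream, len(MAGIC)) != MAGIC:
        raise ValueError("not a packaged flowfile (bad magic header)")
    (count,) = struct.unpack(">H", _read_exact(stream, 2))
    attributes = {}
    for _ in range(count):
        key = _read_field(stream).decode("utf-8")
        value = _read_field(stream).decode("utf-8")
        attributes[key] = value
    (size,) = struct.unpack(">Q", _read_exact(stream, 8))
    return attributes, _read_exact(stream, size)
```

Because every field carries its own size, several packaged flowfiles can
be written back to back on one stream and read out one at a time by calling
unpackage() repeatedly.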

Flow File Packager v3 is primarily used in the MergeContent processor (for
bundling) and the UnpackContent processor (for extraction). The
(deprecated) PostHTTP processor and the ListenHTTP processor also support
this format somewhat transparently, which enables two NiFi systems to send
a serialized flowfile across the wire using HTTP.
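
As a rough illustration of the HTTP case, the Python sketch below builds
(but does not send) a POST request carrying an already-packaged flowfile.
The "application/flowfile-v3" content type is my assumption about what
the receiving side expects, and the host, port and "contentListener" path
are hypothetical placeholders for wherever a ListenHTTP processor happens
to be listening.

```python
import urllib.request

# Assumed MIME type for the v3 format; check it against your NiFi version.
FLOWFILE_V3_CONTENT_TYPE = "application/flowfile-v3"


def build_post(url: str, packaged: bytes) -> urllib.request.Request:
    """Build a POST request whose body is a packaged flowfile."""
    return urllib.request.Request(
        url,
        data=packaged,
        headers={"Content-Type": FLOWFILE_V3_CONTENT_TYPE},
        method="POST",
    )


# Hypothetical endpoint and placeholder payload, for illustration only.
request = build_post("http://nifi.example.com:8080/contentListener", b"...")
# urllib.request.urlopen(request) would actually send it.
```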

You might see this format name as "FlowFile Stream v3" or
"flowfile-stream-v3" when looking at either MergeContent or UnpackContent.





On Fri, Sep 8, 2023 at 2:14 PM Russell Bateman <r...@windofkeltia.com>
wrote:

> Uh, sorry, "Version 3" refers to what exactly?
>
> On 9/8/23 12:48, David Handermann wrote:
> > I agree that this would be a useful general feature. I also agree with
> > Joe that format support should be limited to *Version 3* due to the
> > limitations of the earlier versions.
> >
> > This is definitely something that would be useful on the 1.x support
> > branch to provide a smooth upgrade path for NiFi 2.
> >
> > This general topic also came up on the dev channel on the Apache NiFi
> > Slack group:
> >
> > https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369
> >
> > One key thing to note from that discussion is supporting
> > interoperability with services outside of NiFi. That may be too much
> > of a stretch for an initial implementation, but it is something I am
> > planning to evaluate as time allows.
> >
> > For now, something focused narrowly on FlowFile Version 3 encoding
> > seems like the best approach.
> >
> > I recommend referencing this discussion in a new Jira issue and
> > outlining the general design goals.
> >
> > Regards,
> > David Handermann
> >
> >
> > On Fri, Sep 8, 2023 at 1:11 PM Adam Taft<a...@adamtaft.com>  wrote:
> >> And also ... if we can land this in a 1.x release, this would help
> >> tremendously to those who are going to need a replacement for PostHTTP
> and
> >> don't want to "go dark" when they make the transition.
> >>
> >> That is, without this processor in 1.x, when a user upgrades from 1.x to
> >> 2.x, they will either have to have a MergeContent/InvokeHTTP solution in
> >> place already to replace PostHTTP, or they will have to take a
> (hopefully
> >> short) outage when they bring their canvas back up (removing PostHTTP
> and
> >> replacing with PackageFlowFile + InvokeHTTP).
> >>
> >> With this processor in 1.x, they can make that transition while
> PostHTTP is
> >> still available on their canvas. Wishful thinking that we can make the
> >> entire journey from 1.x to 2.x as smooth as possible, but this could
> >> potentially help some.
> >>
> >>
> >> On Fri, Sep 8, 2023 at 10:55 AM Adam Taft<a...@adamtaft.com>  wrote:
> >>
> >>> +1 on this as well. It's something I've kind of griped about before
> (with
> >>> the loss of PostHTTP).
> >>>
> >>> I don't think it would be horrible (as per Joe's concern) to offer a
> N:1
> >>> "bundling" property. It would just have to be stupid simple. No
> "groups",
> >>> timeouts, correlation attributes, minimum entries, etc. It should just
> >>> basically call the ProcessSession#get(int maxResults) where
> "maxResults" is
> >>> a configurable property. Whatever number of flowfiles returned in the
> list
> >>> is what is "bundled" into FFv3 format for output.
> >>>
> >>> /Adam
> >>>
> >>>
> >>> On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord<phillord0...@gmail.com>
> >>> wrote:
> >>>
> >>>> +1 from me.
> >>>> I’ve experimented with both methods.  The simplicity of a
> PackageFlowfile
> >>>> straight up 1:1 is convenient and straightforward.
> >>>> MergeContent on the other hand can be difficult to understand and
> tweak
> >>>> appropriately to gain desired results/throughput.
> >>>> On Sep 8, 2023 at 10:14 AM -0400, Joe Witt<joe.w...@gmail.com>,
> wrote:
> >>>>> Ok. Certainly simplifies it but likely makes it applicable to larger
> >>>>> flowfiles only. The format is meant to allow appending and result in
> >>>> large
> >>>>> sets of flowfiles for io efficiency and specifically for storage as
> the
> >>>>> small files/tons of files thing can cause poor performance pretty
> >>>> quickly
> >>>>> (10s of thousands of files in a single directory).
> >>>>>
> >>>>> But maybe that simplicity is fine and we just link to the
> MergeContent
> >>>>> packaging option if users need more.
> >>>>>
> >>>>> On Fri, Sep 8, 2023 at 7:06 AM Michael Moser<moser...@gmail.com>
> >>>> wrote:
> >>>>>> I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of
> >>>> multiple
> >>>>>> files at all. Probably change the mime.type attribute. It might not
> >>>> even
> >>>>>> have any config properties at all if we only support flowfile-v3 and
> >>>> not v1
> >>>>>> or v2.
> >>>>>>
> >>>>>> -- Mike
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Sep 8, 2023 at 9:56 AM Joe Witt<joe.w...@gmail.com>  wrote:
> >>>>>>
> >>>>>>> Mike
> >>>>>>>
> >>>>>>> In user terms this makes sense to me. Id only bother with v3 or
> >>>> whatever
> >>>>>> is
> >>>>>>> latest. We want to dump the old code. And if there are seriously
> >>>> older
> >>>>>>> versions v1,v2 then nifi 1.x can be used.
> >>>>>>>
> >>>>>>> The challenge is that you end up needing some of the same
> >>>> complexity in
> >>>>>>> implementation and config of merge content i think. What did you
> >>>> have in
> >>>>>>> mind for that?
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> On Fri, Sep 8, 2023 at 6:53 AM Michael Moser<moser...@gmail.com>
> >>>> wrote:
> >>>>>>>> Devs,
> >>>>>>>>
> >>>>>>>> I can't find if this was suggested before, so here goes. With the
> >>>>>> demise
> >>>>>>>> of PostHTTP in NiFi 2.0, the recommended alternative is to
> >>>>>> MergeContent 1
> >>>>>>>> file into FlowFile-v3 format then InvokeHTTP. What does the
> >>>> community
> >>>>>>>> think about supporting a new PackageFlowFile processor that is
> >>>> simple
> >>>>>> to
> >>>>>>>> configure (compared to MergeContent!) and simply packages flowfile
> >>>>>>>> attributes + content into a FlowFile-v[1,2,3] format? This would
> >>>> also
> >>>>>>>> offer a simple way to export flowfiles from NiFi that could later
> >>>> be
> >>>>>>>> re-ingested and recovered using UnpackContent. I don't want to
> >>>> submit
> >>>>>> a
> >>>>>>> PR
> >>>>>>>> for such a processor without first asking the community whether
> >>>> this
> >>>>>>> would
> >>>>>>>> be acceptable.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> -- Mike
> >>>>>>>>
>
