I was looking at nar sizes and thought some data might be helpful. I used my
recent RC1 verification as a basis and took the file size of each file in
the assembly named "*.nar". I don't know whether the images I pasted in will
go through, but I made some graphs. The first is a histogram of nar file
size in buckets of 10MB. The second is similar to a cumulative
distribution: the x-axis is the "rank" of the nar (smallest to largest),
and the y-axis is the percentage of the combined size of all the nars that
is contributed by the nars at that rank or lower. In other words, on the
graph, the dot at 60 and ~27 means that the smallest 60 nars contribute
only ~27% of the total. Of note, the standard and framework nars are at 83
and 84.
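
For anyone who wants to reproduce numbers like these, the two calculations
described above can be sketched roughly as follows. This is a minimal
Python sketch, not necessarily how the graphs were produced: the function
names are made up for illustration, the assembly directory is whatever path
the unpacked build lives in, and 1MB is assumed to mean 1024*1024 bytes.

```python
from pathlib import Path

# Assumption: "10MB" buckets are taken as 10 * 1024 * 1024 bytes.
BUCKET_BYTES = 10 * 1024 * 1024


def nar_sizes(assembly_dir):
    """Sizes in bytes of every *.nar file under assembly_dir, smallest first."""
    return sorted(p.stat().st_size for p in Path(assembly_dir).rglob("*.nar"))


def histogram(sizes, bucket_bytes=BUCKET_BYTES):
    """Map bucket index -> nar count; bucket i covers [i*bucket, (i+1)*bucket)."""
    counts = {}
    for size in sizes:
        bucket = size // bucket_bytes
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts


def cumulative_percent(sizes):
    """Entry k-1 is the percent of total bytes held by the k smallest nars."""
    total = sum(sizes)
    if total == 0:
        return []
    running = 0
    result = []
    for size in sorted(sizes):
        running += size
        result.append(100.0 * running / total)
    return result
```

With the sizes from nar_sizes(), cumulative_percent(sizes)[59] would be the
y-value of the dot at rank 60 on the second graph.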


[image: Inline image 3]
[image: Inline image 4]

On Fri, Jan 12, 2018 at 5:04 PM, Michael Moser <moser...@gmail.com> wrote:

> And of course, as I hit <send> I thought of one more thing.
>
> We could keep all of the code in 1 git repo (1 project) but the
> nifi-assembly part of the build could be broken up to build core NiFi
> separately from the tar/zip functional grouping of other NARs.
>
> On Fri, Jan 12, 2018 at 5:01 PM, Michael Moser <moser...@gmail.com> wrote:
>
> > Long term I would also like to see #3 be the solution.  I think what
> > Joseph N described could be part of the capabilities of #3.
> >
> > I would like to add a note of caution with respect to reorganizing and
> > releasing extension bundles separately:
> >
> >    - the burden on the release manager expands because many more
> >    projects have to be released; probably not all on each release cycle
> >    but it could still be many
> >    - the chance of accidentally forgetting to release a project in a
> >    release cycle becomes non-zero
> >    - sharing code between projects gets a bit harder because you have to
> >    manage releasing projects in a specific order
> >    - it becomes harder to find all of the projects that need to change
> >    when shared code is added
> >    - the simple act of finding code becomes harder ... which project is
> >    that class in? (IDEs like IntelliJ can search in 1 project, but if
> >    they search across multiple projects, then I haven't learned how)
> >
> > I used to maintain several nars in separate projects, and recently
> > reorganized them into 1 project (following NiFi's multi-module maven
> > build) and life has become much easier!
> >
> > -- Mike
> >
> >
> >
> > On Fri, Jan 12, 2018 at 4:33 PM, Chris Herrera <chris.herrer...@gmail.com>
> > wrote:
> >
> >> I very much like the solution proposed by Bryan below. This would allow
> >> for a cleaner docker image as well, while still providing the
> >> functionality as needed. For sure, the extension registry will be great,
> >> but in the meantime this is an adequate mid step.
> >>
> >> Regards,
> >> Chris
> >>
> >> On Jan 12, 2018, 2:52 PM -0600, Bryan Bende <bbe...@gmail.com>, wrote:
> >> > Long term I'd like to see the extension registry take form and have
> >> > that be the solution (#3).
> >> >
> >> > In the more near term, we could separate all of the NARs, except for
> >> > framework and maybe standard processors & services, into a separate
> >> > git repo.
> >> >
> >> > In that new git repo we could organize things like Joe N just
> >> > described according to some kind of functional grouping. Each of these
> >> > functional bundles could produce its own tar/zip which we can make
> >> > available for download.
> >> >
> >> > That would separate the release cycles between core NiFi and the other
> >> > NARs, and also avoid having any single binary artifact that gets too
> >> > large.
> >> >
> >> >
> >> >
> >> > On Fri, Jan 12, 2018 at 3:43 PM, Joseph Niemiec <josephx...@gmail.com>
> >> > wrote:
> >> > > just a random thought.
> >> > >
> >> > > Drop In Lib packs... All the Hadoop ones in one package, for example,
> >> > > that can be added to a slim NiFi install. Another may be for Cloud, or
> >> > > Database Interactions, Integration (JMS, FTP, etc). Of course, defining
> >> > > these groups would be the tricky part... Or perhaps some type of
> >> > > installer which allows you to elect which packages to download to add
> >> > > to the slim install?
> >> > >
> >> > >
> >> > > On Fri, Jan 12, 2018 at 3:10 PM, Joe Witt <joe.w...@gmail.com> wrote:
> >> > >
> >> > > > Team,
> >> > > >
> >> > > > The NiFi convenience binary (tar.gz/zip) size has grown to 1.1GB
> >> > > > in the latest release. Apache infra expanded our allowance to
> >> > > > 1.6GB but has stated this is the last time.
> >> > > > https://issues.apache.org/jira/browse/INFRA-15816
> >> > > >
> >> > > > We need to consider:
> >> > > > 1) removing old nars, less commonly used nars, or particularly
> >> > > > massive nars from the assembly we distribute by default. Folks can
> >> > > > still use these things if they want, just not from our convenience
> >> > > > binary
> >> > > > 2) collapsing nars with highly repeating deps
> >> > > > 3) Getting the extension registry baked into the Flow Registry,
> >> > > > then moving to separate releases for extension bundles. The main
> >> > > > release then would be just the NiFi framework.
> >> > > >
> >> > > > Any other ideas?
> >> > > >
> >> > > > I'll plan to start identifying candidates for removal soon.
> >> > > >
> >> > > > Thanks
> >> > > > Joe
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Joseph
> >>
> >
> >
>
