Re: Removing JRuby?

2023-11-06 Thread Edward Armes
While I do like the ExecuteScript processors, as they are great at digging
you out of a hole, their performance isn't that great.

That being said, I would be careful about dropping Lua support, as there is a
growing list of software that supports end-user/administrator extensions
via Lua for those that don't want to have to re-compile software
themselves. On the other hand, given that Jython doesn't yet have a Python 3
implementation, it could be argued that dropping Jython support is a must,
given that the Python 2.x line is basically end of life.

Now I wonder if it's worth refactoring the ExecuteScript processors into
per-language implementations inheriting from a common base, as a few
processors do already.
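
(For context on what such a base would wrap — a minimal JSR-223 sketch; it
assumes a Groovy engine is on the classpath, and the bound variable is purely
illustrative:)

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class Jsr223Sketch {
    public static void main(String[] args) throws ScriptException {
        // JSR-223 resolves engines by name at runtime; this generic lookup
        // is the layer a multi-language ExecuteScript has to funnel through.
        final ScriptEngine engine = new ScriptEngineManager().getEngineByName("groovy");
        if (engine == null) {
            throw new IllegalStateException("Engine not found on the classpath");
        }
        engine.put("greeting", "hello");   // variable made visible to the script
        final Object result = engine.eval("greeting.toUpperCase()");
        System.out.println(result);        // prints HELLO
    }
}

Per-language processors could skip this generic lookup and use each
language's own embedding API directly.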

Edward

On Mon, 6 Nov 2023, 16:24 Matt Burgess,  wrote:

> I believe it is because in both ExecuteScript and ExecuteGroovyScript
> you can do "regular" groovy, but EGS has extra built-ins such as easy
> access to controller services, Groovy SQL stuff, etc, and we could
> keep building it out. But IMO we'd have to port the rest of the
> scripted components (ScriptedReader/Writer, etc.) over to the Groovy
> bundle and make sure there's a drop-in replacement in the Python stuff
> before we'd want to deprecate the scripted bundle.
>
> On the JRuby front, is that something you use actively? This question
> is for you and the entire community of course.
>
> Regards,
> Matt
>
> On Mon, Nov 6, 2023 at 7:12 AM Mike Thomsen 
> wrote:
> >
> > If we deprecate ExecuteScript, I think we need to have groovyx be ready
> to
> > function as a drop-in replacement if it's not there already.
> >
> > On Sun, Nov 5, 2023 at 9:21 PM Matt Burgess 
> wrote:
> >
> > > IIRC the removal of these engines was mostly due to lack of use or at
> > > least the perception thereof. If JRuby is being used by the community
> > > actively, I'm happy to revisit that discussion. Luaj's JSR-223
> > > interface left something to be desired, but JRuby just needed a system
> > > variable set or something like that.
> > >
> > > For the groovyx bundle, because it is Groovy-specific I tend to think
> > > we could make better use of that than ExecuteScript, especially if we
> > > do get rid of all the engines. We have a Groovy-specific processor, a
> > > "real" Python SDK, and no more Nashorn. Perhaps we move all the
> > > scripted components to the Groovy bundle, although I believe some
> > > folks still make use of Jython for these. Of course if we reinstate
> > > JRuby for ExecuteScript it's probably best to keep things the way they
> > > are, or create a jruby bundle. The original scripting bundle was
> > > aiming to support several engines, but if it turns out only one or two
> > > will be useful, it may not be worth shoehorning all that JSR-223 logic
> > > when engine-specific components could be simpler, more easily
> > > maintained, and allow for the idioms of the language to be better used
> > > (as is done in the groovyx bundle).
> > >
> > > Just my two cents, looking forward to everyone's thoughts!
> > >
> > > - Matt
> > >
> > > On Sun, Nov 5, 2023 at 8:31 PM Mike Thomsen 
> > > wrote:
> > > >
> > > > https://issues.apache.org/jira/browse/NIFI-11646
> > > >
> > > > I get the removal of Lua, but not the removal of JRuby. It's a clean
> > > > reimplementation of Ruby native to the JVM and AFAICT is pound for
> pound
> > > as
> > > > actively maintained as Groovy.
> > > >
> > > > Also, at this point, does it make sense to even keep the groovyx
> bundle
> > > > rather than deprecate it for 2.X?
> > >
>


Re: Use of attribute uuid and other "native" attributes

2023-07-18 Thread Edward Armes
Hmm,

I've seen this come up a few times now. I wonder: is there a need to rename
the uuid field and create a separate external id field?
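
(For reference, the framework-assigned identifier is exposed to processors as
the uuid core attribute — a minimal sketch of reading it, with the helper
class being illustrative only:)

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.flowfile.attributes.CoreAttributes;

final class FlowFileIds {
    // The framework assigns and owns "uuid"; processors can read it as a
    // core attribute but are not meant to control it.
    static String uuidOf(final FlowFile flowFile) {
        return flowFile.getAttribute(CoreAttributes.UUID.key());
    }
}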

Edward

On Tue, 18 Jul 2023, 20:53 Lucas Ottersbach, 
wrote:

> Hey Matt,
>
> you wrote that both `Session.create` and `Session.clone` set a new FlowFile
> UUID to the resulting FlowFile. This somewhat sounds like there is an
> alternative way where the UUID is not controlled by the framework itself?
>
> I've got a different use case than Russell, but was wondering whether it is
> even possible to control the FlowFile UUID as a Processor developer? I've
> got a processor pair for inter-cluster transfer of FlowFiles (where
> Site-to-Site is not applicable). As of now, the UUID on the receiving side
> differs from the original on the origin cluster, because I'm using
> `Session.create`.
> Is there a way to control the UUID of new FlowFiles?
>
>
> Best regards,
>
> Lucas
>
> Matt Burgess  schrieb am Di., 18. Juli 2023, 20:23:
>
> > In general I recommend only sending on those attributes that will be
> > used at some point downstream (unless you have an "original"
> > relationship that should maintain the original state with respect to
> > provenance). If you don't know that ahead of time you'll probably need
> > to send all/most of the attributes just in case.
> >
> > Are you using session.create() or session.clone()? They both set a new
> > "uuid" attribute on the created FlowFile, with at least the latter
> > setting some other attributes as well (see the Developer Guide [1] for
> > more details).
> >
> > Regards,
> > Matt
> >
> > [1] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
> >
> > On Tue, Jul 18, 2023 at 12:25 PM Russell Bateman 
> > wrote:
> > >
> > > I have a custom processor, /SplitHl7v4Resources/, that splits out
> > > individual FHIR resources (Patients, Observations, Encounters, etc.)
> > > from great Bundle flowfiles. So, for a given flowfile, it's split into
> > > hundreds of smaller ones.
> > >
> > > When I do this, I leave the existing NiFi attributes as they were on
> the
> > > original flowfile.
> > >
> > > As I contemplate the uuid attribute, it occurs to me that I should find
> > > out what its *significance is for provenance and other potential
> > > debugging/tracing concerns*. I never really look at it, but, if there
> > > were some kind of melt-down in a production environment, would I care
> > > that it multiplied across hundreds of flowfiles besides the original
> one?
> > >
> > > Also these two other NiFi attributes remain unchanged:
> > >
> > > filename
> > > path
> > >
> > >
> > > I do garnish each flowfile with many pointed/significant new attributes
> > > like resource.type that are my own. In my processing, I don't care
> about
> > > NiFi's original attributes, but should I?
> > >
> > > Thanks,
> > > Russ
> >
>


Re: [VOTE] Adopt NiFi 2.0 Proposed Release Goals

2022-12-14 Thread Edward Armes
-1 (non-binding)

I'm not sure if this is covered by goals 8 and 5, but I would like to suggest
that 2.0 also focus on removing places where concrete implementations
are used instead of interfaces, and on updating the way the website docs are
generated to ensure that NARs not included in the standard distribution are
still covered.

I think this would allow work on a NAR registry to happen outside of a
major release and allow for the potential of individually
versioned components, reducing the size of future releases.

Edward


On Mon, 12 Dec 2022, 20:32 Andrew Lim,  wrote:

> +1 (binding)
>
> > On Dec 12, 2022, at 12:02 PM, David Handermann <
> exceptionfact...@apache.org> wrote:
> >
> > Team,
> >
> > Following positive feedback on NiFi 2.0 Proposed Release Goals [1] on the
> > recent discussion thread [2], I am calling this vote to adopt the
> following
> > as Release Goals for NiFi 2.0:
> >
> > 1. Remove Java 8 support and require Java 11
> > 2. Remove deprecated components
> > 3. Remove deprecated component properties
> > 4. Remove components integrating with unmaintained services
> > 5. Remove compatibility classes and methods
> > 6. Remove flow.xml.gz in favor of flow.json.gz
> > 7. Remove duplicative features
> > 8. Upgrade internal Java API references
> > 9. Reorganize standard components
> > 10. Implement migration tools for upgrading flows
> >
> > A positive vote indicates agreement on these goals and the initiation of
> > the following actions:
> >
> > 1. Rename NiFi 2.0 Proposed Release Goals to NiFi 2.0 Release Goals
> > 2. Create version 1 branch in Git for subsequent support releases on the
> > version 1 series
> > 3. Update the current main branch in Git to version 2.0.0-SNAPSHOT
> >
> > The vote will be open for 72 hours and follow standard procedures for
> > release votes.
> >
> > Please review the linked goals and discussions for background.
> >
> > [ ] +1 Adopt NiFi 2.0 Release Goals
> > [ ] +0 No opinion
> > [ ] -1 Do not adopt NiFi 2.0 Release Goals for the following reasons...
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/NIFI/NiFi+2.0+Proposed+Release+Goals
> > [2] https://lists.apache.org/thread/xo77p9t3xg4k70356xrqbdg4m9sg7sf8
>
>


Re: Need some feedback on how upgrading Avro might cause problems

2022-04-07 Thread Edward Armes
I've had a quick look in JIRA and it looks like this might have happened as
a side effect of AVRO-1544.

I think it is worth upgrading, especially given that a few of the changes
relate to updating bundled dependencies, some of which seem to have CVEs
against them.

Depending on people's feelings, would it be worth creating two versions, one
using Avro 1.8 and one using Avro 1.11.0, and then removing the 1.8 version
in a later release?

On an additional note, in cases where people are writing schemas manually, I
suspect they are probably going to be validating against the later versions
of Avro using the project's tooling, and that may create issues further down
the line.
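
(For illustration — this is the fix the Avro spec mandates, not something
taken from the PR: under 1.9+ the default value of a union must match the
union's first branch, so the schema quoted below would need to become:)

{
  "name": "message",
  "type": [ "string", "null" ],
  "default": "Hello, world"
}

or alternatively keep the ["null", "string"] ordering with "default": null.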

Edward

On Thu, Apr 7, 2022 at 11:52 AM Isha Lamboo 
wrote:

> Hi Mike,
>
> The "Infer schema" functionality in NiFi currently generates schemas with
> the order that will be invalid under Avro 1.9+. I noticed because I've been
> using that to copy-paste schemas that were "almost right" so I could
> manually fix them.
>
> I guess that inferred schemas should be fine if the inferring logic is
> also upgraded by the dependency, but for any cached schemas and my own
> manually saved schemas this will be somewhat painful.
> My use case for manually saving inferred schemas is mostly database
> migration where some inferred "choice" fields are not supported for the
> target database.
>
> Hope this helps,
>
> Isha
>
> -----Original message-----
> From: Mike Thomsen 
> Sent: Thursday, 7 April 2022 12:11
> To: dev@nifi.apache.org
> Onderwerp: Need some feedback on how upgrading Avro might cause problems
>
> Thread is here for full details:
>
> https://github.com/apache/nifi/pull/5900#pullrequestreview-922490039
>
> It looks like Avro 1.8's schema parser may have been more lenient (or
> buggy) in enforcing the specification with respect to the ordering of a
> union for a nullable type. 1.9.X and higher are definitely more opinionated
> and throw exceptions on schemas that used to work on 1.8.X. For example,
> this used to be valid:
>
> {
>   "name": "message",
>   "type": [ "null", "string" ],
>   "default": "Hello, world"
> }
>
> Now Avro **correctly** throws an exception per the specification.
> Under 1.8 it did not, and as you can see here, I had to change numerous
> test schemas in order to make them work under 1.9 to 1.11.0:
>
>
> https://github.com/apache/nifi/pull/5900/files#r835954170
>
> As I said to Matt, I think we're in a "damned if you do, damned if you
> don't" position here.
>
> Thoughts?
>


Re: Data Provenance

2021-09-04 Thread Edward Armes
Hi Anupam,

To the best of my knowledge it is not possible to read the provenance
record files directly.

You can, however, read the provenance records from the REST API directly, or
use the Site-to-Site provenance reporting task to ship the
provenance records to the same NiFi instance (not recommended), another NiFi
instance, or MiNiFi for processing and reporting.
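
(As a sketch of the reporting-task route — a minimal custom ReportingTask
pulling events through the framework's EventAccess; the class name is
illustrative, and the last-event-id handling is simplified, as a real task
would persist it via the state manager:)

import java.io.IOException;
import java.util.List;

import org.apache.nifi.provenance.ProvenanceEventRecord;
import org.apache.nifi.reporting.AbstractReportingTask;
import org.apache.nifi.reporting.ReportingContext;

public class ProvenanceLoggingTask extends AbstractReportingTask {

    private long lastEventId = -1;  // simplified: kept in memory only

    @Override
    public void onTrigger(final ReportingContext context) {
        try {
            final List<ProvenanceEventRecord> events =
                    context.getEventAccess().getProvenanceEvents(lastEventId + 1, 1000);
            for (final ProvenanceEventRecord event : events) {
                getLogger().info(String.format("event %d of type %s for FlowFile %s",
                        event.getEventId(), event.getEventType(), event.getFlowFileUuid()));
                lastEventId = event.getEventId();
            }
        } catch (final IOException e) {
            getLogger().error("Failed to read provenance events", e);
        }
    }
}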

Hope that helps

Edward

On Fri, 3 Sep 2021, 19:57 BARUAH, ANUPAM, 
wrote:

> Dear Dev Team
>
> I am looking how the data is arranged in NiFi Provenance file. Can we read
> it using python or any other scripting language and store the information
> in some database.
>
> Looking more info like
>
>
> *   How the data is arranged in the data provenance file.
>
> *   From where the program should start reading the file(.prov)?
>
> *   What are the unnecessary characters and useful information
> related to events?
>
> *   How to identify each event?
>
> Best Regards
> Anupam
>
>


Re: Running nifi as non-root user

2021-08-09 Thread Edward Armes
You should also be aware that you might have issues listening on or
otherwise using protected/privileged ports (1-1023 inclusive).

Regards

Edward


On Mon, 9 Aug 2021, 08:58 Jens M. Kofoed,  wrote:

> No. But the user needs rwx rights to all the folders which are configured
> for NiFi, and read+write to all the files.
> I create a user which is not allowed to log in and change the owner of all
> the different folders. If you don’t change the folders for the NiFi database,
> provenance, content, logs etc. you should be ok to just use this command:
> chown -R nifi /opt/nifi/nifi-current
>
> If you use a username of nifi and if the running nifi is in folder
> /opt/nifi/nifi-current
>
> Kind regards
> Jens M. Kofoed
>
> > On 8 Aug 2021, at 18:46, Lovish Gulati  wrote:
> >
> > Hi,
> >
> > When running Nifi on CentOS 7 as a non-root user by using run.as=
> option,
> > does that non-root user need sudo or su capability?
> > Please advise.
> >
> > Thanks
> > Lovish
>


Re: [DISCUSS] NiFi 2.0 Release Goals

2021-07-26 Thread Edward Armes
Given the major version number shift and the splitting up of processors into
multiple NARs, I'd like to suggest that we start individually versioning
NARs/bundles.

I can see this bringing a large number of benefits, including making NiFi
more deployable with things like RPM, but also potentially making upgrades
easier for those that have to deploy managed NiFi instances.

Edward

On Mon, 26 Jul 2021, 20:42 Otto Fowler,  wrote:

>  The issue with updating the aws sdk is if it breaks any one of the
> processors.
> the Web Gateway API invoke processor for example is not using a high level
> purpose-built client and may break.
>
> If we change the aws version, we need to coordinate in such a way that they
> all
> can come along reasonably.
> i.e.: what happens if 1 or 2 break but the rest are OK?
>
>
>
> From: David Handermann 
> 
> Reply: dev@nifi.apache.org  
> Date: July 26, 2021 at 09:33:42
> To: dev@nifi.apache.org  
> Subject:  Re: [DISCUSS] NiFi 2.0 Release Goals
>
> Chris,
>
> Thanks for the reply and recommendations. It seems like some of the work to
> reorganize the module structure could be done outside of a major release,
> but it would be great to target any breaking changes for 2.0. Perhaps a
> separate feature proposal on module restructuring, with the goal of
> supporting optimized builds, would be a helpful way to move that part of
> the discussion forward.
>
> Regarding updating AWS SDK to version 2, it seems like that might be
> possible now. I haven't taken a close look at the referencing components,
> so I'm not sure about the level of effort involved. Minor NiFi version
> updates have incorporated major new versions of dependencies. For example,
> NiFi 1.14 included an upgrade from Spring Framework 4 to 5. On the one
> hand, including the AWS SDK update as part of a major release seems
> helpful, but unless there are changes that break existing component
> properties, upgrading the AWS SDK could be worked independently. Others may
> have more insight into particular usage of that library.
>
> Regards,
> David Handermann
>
> On Sun, Jul 25, 2021 at 2:12 AM Chris Sampson
>  wrote:
>
> > Might be worth considering refactoring the build as part of this work
> too,
> > e.g. only building the bits of the repo affected by a commit, etc. -
> > discussed briefly in previous threads but don't think any changes made
> yet.
> > If NARs/components are likely to be split up and refactored then such
> work
> > around the build probably makes sense to consider.
> >
> > I've a couple of PRs open that include updates to Elasticsearch versions
> > already, although I stopped at 7.10.2 (after which Elastic changed
> licence)
> > in case there were licence concerns. But more work can be done to tidy up
> > the processors, absolutely.
> >
> > AWS libraries to v2 would seem a sensible move and a refactor of those
> > processors as well.
> >
> >
> > Cheers,
> >
> > Chris Sampson
> >
> > On Sat, 24 Jul 2021, 17:47 David Handermann, <
> exceptionfact...@apache.org>
>
> > wrote:
> >
> > > Thanks for pointing out the standard NAR bundles, Mark. There are a
> > number
> > > of components in the standard NAR bundles with particular dependencies
> > that
> > > would make more sense in separate NARs. Reorganizing the standard NAR
> to
> > > components with limited dependencies and wide applicability would
> > > definitely help with future maintenance.
> > >
> > > Regards,
> > > David Handermann
> > >
> > > On Sat, Jul 24, 2021 at 10:57 AM Mark Payne 
> > wrote:
> > >
> > > > There’s also some code that exists in order to maintain backward
> > > > compatibility in the repositories. I would very much like the
> > > repositories
> > > > to contain no unnecessary code. And swap file format supports really
> > old
> > > > formats. And the old impls of the repositories themselves, like
> > > > PersistentProvRepo instead of WriteAheadProv Repo, etc. Lots of stuff
> > > there
> > > > that can be removed. And some methods in ProcessSession that are
> never
> > > used
> > > > by any processor in the codebase but exists in the public API so
> can’t
> > be
> > > > removed till 2.0.
> > > >
> > > > I think this is also a great time to clean up the “standard nar.” At
> > this
> > > > point, it’s something like 70 MB. And many of the components there
> are
> > > not
> > > > really “standard” - things like connecting to FTP & SFTP servers, XML
> > > > processing, Jolt transform, etc. could potentially be moved into
> other
> > > > nars. The nifi-standard-content-viewer-1.15.0-SNAPSHOT.war is 6.9 MB
> is
> > > not
> > > > necessary for stateless or minifi java. Lots of things probably to
> > > > reconsider within the standard nar.
> > > >
> > > > I definitely think this is a reasonable approach, to allow for a 2.0
> > that
> > > > is not a huge feature release but allows the project to be simpler
> and
> > > more
> > > > nimble.
> > > >
> > > > Thanks
> > > > -Mark
> > > >
> > > > On Jul 24, 2021, at 10:59 AM, Mike Thomsen  wrote:

Re: Removing documentation from deployment

2021-03-17 Thread Edward Armes
Just had a thought: given that the UI links through to the docs, are there
any tests or build conditions around the UI packaging that would break if
this were done?

Edward

On Tue, Mar 16, 2021 at 10:38 PM Matt Burgess  wrote:

> I haven't tried it, but I would think you could add a profile
> (disabled by default) that would override the nifi-docs dependency
> with a "provided" scope. That should exclude it from the assembly. But
> if you're building RPMs you may need to touch a couple other places.
>
> That should exclude the top-level docs like the Admin Guide and such.
> But AFAIK we generate the component docs as their NARs are loaded into
> NiFi, and the location is specified by the
> nifi.documentation.working.directory property (defaulting to
> ./work/docs/components). Perhaps you could set that to /dev/null or
> some other dead-letter location.
>
> Regards,
> Matt
>
> On Tue, Mar 16, 2021 at 11:56 AM Mike Thomsen 
> wrote:
> >
> > We have a situation where we might need to remove the documentation to
> > get around some security scans. What would be involved in removing the
> > whole documentation deployment? I can hack up the assembly descriptor
> > as much as needed.
> >
> > Thanks,
> >
> > Mike
>


Re: Username/Password authentication

2021-02-12 Thread Edward Armes
Hi Sumant,

The best way to do this would be to use an approach known as Basic
Authentication. The information you need to enable this can be found in the
Administration Guide, which can be found here:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html

Regards

Edward


On Fri, Feb 12, 2021 at 1:32 PM Sumanta Mishra
 wrote:

> Adding us...@nifi.apache.org
>
> On Fri, Feb 12, 2021 at 5:40 PM Sumanta Mishra <
> sumanta.mis...@magicedtech.com> wrote:
>
> > Hi,
> >
> > Is there any direct and simple way to set username and password within
> > Nifi? So that:
> >
> >- I can create users
> >- Share the credentials with the users
> >- Application should prompt authentication popup if they open the
> >application
> >- Login and use the application using their credentials
> >
> >
> > LDAP authentication is bit difficult in ubuntu machine. I am using Azure
> > Virtual Machine to user Nifi.
> >
> > Any suggestions?
> >
> > Regards,
> > Sumant
> >
>


Re: Enabling Cross Origin Requests

2020-10-31 Thread Edward Armes
Hi Terry,

Just want to check I understand what you're asking.

Are you asking if the NiFi REST API can be set up such that it can accept an
additional set of client certificates, issued by an issuer
outside of the trust chain of the certificate that is used by NiFi itself?

Edward

On Sat, 31 Oct 2020, 19:50 Terry Walsh,  wrote:

> Hey guys,
>
> Just a quick one to see if you have any ideas with regards to enabling
> access to NiFi’s REST API from a different domain?  I’m currently
> developing a simplified interface for the average user so that they can
> stop and start processes etc. and all was working really well until I
> needed to introduce certificates.  Applying certificate authentication to
> NiFi is relatively straight forward thanks to your documentation so all
> good there.
>
> The environment I’m working with at the moment is hosting my Angular
> application on IIS and passing the API calls to NiFi via a URL rewrite,
> which was fine until I needed to use certificates.  I should note that the
> production environment is not IIS so for simplicity I’d like to solve the
> problem with NiFi accepting requests from other domains if possible.
>
> I tried to update the Jetty server manually but it looks like NiFi blows
> this away every time it starts.  Is it possible to configure this somewhere
> so when NiFi refreshes it can accept requests from different domains?
> Another alternative (which may not be supported) was to somehow deploy my
> Angular app to NiFi’s Jetty server but as NiFi refreshes its files on
> restart, it looks like this isn’t possible either.
>
> Thanks in advance,
>
> Terry Walsh


Re: Maximum load

2020-09-21 Thread Edward Armes
Hi Phil,

A few things that might be of help here.

It's been a while since I looked at the call chain for a processor, but I
believe that when a processor is configured with a number of concurrent
tasks, the number of threads permitted around the onTrigger method per NiFi
node is set equal to that number; this doesn't create any additional
instances of the processor.

The implication of this is that if you have any instance variables in your
custom processor that are synchronized or protected by locks in some way,
then this may cause a backlog. Likewise, if you use any libraries that
follow a singleton pattern or hold resources that have instance locks (or
synchronized sections) internally, this may also cause a backlog while the
concurrent tasks race and block to get hold of that lock.
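
(A minimal sketch of that kind of contention — a hypothetical processor, with
the relationship declaration included only to keep it self-contained; with
e.g. 96 concurrent tasks, all of them queue on the one lock:)

import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;

public class ContendedProcessor extends AbstractProcessor {

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success").build();

    // Shared instance state guarded by a single lock.
    private final Object lock = new Object();

    @Override
    public Set<Relationship> getRelationships() {
        return Set.of(REL_SUCCESS);
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        final FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }
        synchronized (lock) {
            // expensive, data-intensive work: every concurrent task
            // serializes here, so the extra tasks mostly just wait
        }
        session.transfer(flowFile, REL_SUCCESS);
    }
}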

The other possibility is a setting in either the JVM or the Unix limits
subsystem that is blocking NiFi from using more than 10% of the CPU.

Edward

On Mon, Sep 21, 2020 at 1:06 PM Mike Thomsen  wrote:

> Phil,
>
> > Basically I have two custom processors that are slow (data intensive)
> that create backlogs
>
> Perhaps I missed it, but you didn't say whether the processors are
> functionally I/O-bound or CPU-bound. Also, the nature of the output is
> important. How many flowfiles are you creating, what are their average
> sizes, etc.
>
> Thanks,
>
> Mike
>
> On Sun, Sep 20, 2020 at 11:02 PM Phil H  wrote:
> >
> > Hi there,
> >
> > The issue I have isn’t my processor, it’s the fact I can’t get Nifi to
> run
> > more threads. I’ve noticed that if I bump the max thread counts in the
> > controller settings to total over 256, then nifi as a whole locks up and
> I
> > have to clear the state and restart it.
> >
> > Thanks,
> > Phil
> >
> > On Mon, 21 Sep 2020 at 11:01, Mark Payne  wrote:
> >
> > > Phil,
> > >
> > > It sounds like you’re dealing with a large amount of lock contention,
> > > either using explicit java.util.concurrent.locks.Lock or synchronized
> > > blocks. I would review the code for lock contention or run a profiler
> > > against nifi and see what that shows.
> > >
> > > Sent from my iPhone
> > >
> > > > On Sep 20, 2020, at 8:30 PM, Phil H  wrote:
> > > >
> > > > Various numbers - max of 96 so far.
> > > >
> > > >> On Sat, 19 Sep 2020 at 21:44, Pierre Villard <
> > > >> pierre.villard...@gmail.com> wrote:
> > > >>
> > > >> How many concurrent tasks did you set on the two custom processors?
> > > >>
> > > >> Le sam. 19 sept. 2020 à 02:16, Phil H  a écrit :
> > > >>
> > > >>> Basically I have two custom processors that are slow (data intensive)
> > > >>> that create backlogs. All other aspects of the flow work well - I am
> > > >>> running SSD disks so that sort of I/O is not biting me. The process
> > > >>> itself is highly ‘parallelisable’ but I can’t get NiFi to use more
> > > >>> than about 10% of the CPU load. This is the only purpose this machine
> > > >>> has - I want it to work much harder!
> > > >>>
> > > >>> On Fri, 18 Sep 2020 at 20:56, Pierre Villard <
> > > >>> pierre.villard...@gmail.com> wrote:
> > > >>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> Do you have backpressure with flow files accumulating somewhere in
> > > >>>> your flow? If not, then you probably don't need more threads.
> > > >>>> If you do find bottlenecks in your flow design, then you can look at
> > > >>>> increasing the concurrent tasks on specific processors to have
> > > >>>> multi-threading. Note that it's not always the solution.
> > > >>>> Changing the pool size does not mean NiFi will use more resources.
> > > >>>> Actually, I usually recommend changing the pool size only when you
> > > >>>> see that all the threads available are used.
> > > >>>> Overall we'd need more details about your flow and observations.
> > > >>>>
> > > >>>> Thanks,

Re: Info required for service upgrade

2020-08-14 Thread Edward Armes
Hi Ganesh,

The most up-to-date instructions for upgrading NiFi can be found here:
https://cwiki.apache.org/confluence/display/NIFI/Upgrading+NiFi

To answer your specific questions:
Yes - it is required to suspend all activity on the entire NiFi cluster
before upgrading, and to ensure that there is no data in the flows, as it
won't be kept between migrations. The steps to do this without data loss
during the upgrade are documented in the link above.

Yes - this can be done via the NiFi REST API; in fact, any operation you can
do via the web UI you can do via the REST API. You can find the documentation
for this here: https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
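
(As a sketch of that route — stopping all components in a process group over
the REST API using plain Java; the host, port, and group id are placeholders,
and a secured instance would additionally need a bearer token or client
certificate:)

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StopProcessGroup {
    public static void main(String[] args) throws Exception {
        // Schedule-state request for the root process group.
        final String json = "{\"id\":\"root\",\"state\":\"STOPPED\"}";
        final HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/nifi-api/flow/process-groups/root"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
        final HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}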

Both the dev and user mailing lists have plenty of examples of how to start
automating things in NiFi; a quick search through those lists might help
you hit the ground running.

Edward


On Fri, 14 Aug 2020, 06:24 Ganesh, B (Nokia - IN/Bangalore), <
b.gan...@nokia.com> wrote:

> Hi ,
>
> We have requirement to upgrade version of Nifi-cluster without incurring
> service downtime or without affecting data-loss.
> For that, is it necessary to suspend the current template running in Nifi
> before start of the version upgrade activity? If yes, whether there is any
> programmatic way (i.e. through REST call or some other java API based
> mechanism) to suspend and resume the complete template running in Nifi?
> If your answer is no to the former question, can you please point us to
> the relevant design documentation in Nifi internal which enables this nice
> functionality i.e. version upgrade possibility without service downtime and
> data loss in Nifi-cluster.
>
> Thanks & Regards,
> Ganesh.
>


Re: [DISCUSS] rename master branch, look through code for other related issues

2020-06-18 Thread Edward Armes
> >>>> change causes a drop in usage.
> >>> I don’t expect the community will opt to change the new terms back to
> >> ones
> >>> with negative connotations in the future. If there is discussion about
> >> it,
> >>> this thread will provide good historical context for why the decision
> was
> >>> made to change it, just as the mailing list discussions do for other
> code
> >>> changes.
> >>>
> >>>> - Of what percentage of people is this truly an issue for and what
> >>>> percentage isn't. Any change that has the potential to cause a major
> >>> split
> >>>> in the community, there must be as close as possible to a majority,
> and
> >>> not
> >>>> just from those that are vocal and active on the mailing lists.
> >>>> Discussions on other groups are turning toxic, and in some cases are
> >>>> potentially leading to the collapse of these projects where these
> >> changes
> >>>> are being implemented with what appears to be without the agreement of
> >> a
> >>>> significant chunk of the community.
> >>>>
> >>> In my perspective this should be an issue for the entire community.
> Being
> >>> able to identify an issue that directly affects another person but not
> >>> one’s self is the definition of privilege. If I can look at how the use
> >> of
> >>> these words in someone’s daily life or career impacts them negatively,
> >> when
> >>> the change would not harm me at all, I see that as a failure on my
> part.
> >> I
> >>> understand the desire to hear from the silent majority, but active
> >>> participation and discussion on the mailing list is the exact measure
> >>> described by the Apache process for participation in the community.
> Those
> >>> who speak here are the ones who will have a voice.
> >>>
> >>>> - From a personal perspective, I sit on the autism spectrum and have
> >>> grown
> >>>> up with people using words that are very offensive and have hurt me
> >>> badly.
> >>>> Instead of having these words as offensive and untouchable. Myself and
> >>>> others have instead made these words our own and made them lose the
> >>>> negative connotations they have. As such, I do find the current
> >>>> discussions deeply alarming and feels like they start to border into
> >> the
> >>>> realm of censorship.
> >>>>
> >>> I think it’s admirable that you have responded to negative
> circumstances
> >>> in that way. I also recognize that not everyone has that opportunity.
> If
> >> we
> >>> can take these actions as a community to improve the experience for
> >> others,
> >>> I am in favor of that.
> >>>
> >>>> - One final point (and potentially controversial), A good chunk of the
> >>>> wording that is proposed to be changed is being done so on the
> >>>> "modern"/"street" definition of these words and not the actual
> >>> definition.
> >>>> Language should change and evolve to introduce clarity, but right now
> >>> does
> >>>> this change improve the clarity across the engineering sector and I
> >>> believe
> >>>> it won't.
> >>>
> >>> I’ll paraphrase Emily Kager here with “developers spend an inordinate
> >>> amount of time and energy arguing about the meaning and semantics of
> >>> variable and method names, but pretend exclusionary terms are
> >> meaningless.”
> >>> [1] If we can expend that much energy deciding if a method creates vs.
> >>> builds vs. forms an imaginary concept like a
> >>> LibraryFrameworkWrapperDecorator, I refuse to concede that we can and
> in
> >>> fact should do so with the terms that actually affect our community
> >>> members’ lives.
> >>>
> >>> [1] https://twitter.com/EmilyKager/status/1271102865889734656
> >>>
> >>>
> >>>
> >>>
> >>> Andy LoPresto
> >>> alopre...@apache.org
> >>> alopresto.apa...@gmail.com
> >>> He/Him
> >>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >>>

Re: [DISCUSS] rename master branch, look through code for other related issues

2020-06-17 Thread Edward Armes
This is a difficult issue and causes no small amount of friction every
time. I'm personally against this for the following reasons:

- Some of the terms proposed are not industry standard and may potentially
cause significant issues for non-English speakers.

- For each change that is made, can we guarantee that we will not lose
clarity of meaning, and will we then have to revert the change down the line
if the change causes a drop in usage?

- For what percentage of people is this truly an issue, and for what
percentage isn't it? For any change that has the potential to cause a major
split in the community, there must be as close as possible to a majority, and
not just from those that are vocal and active on the mailing lists.
Discussions on other groups are turning toxic, and in some cases are
potentially leading to the collapse of these projects where these changes
are being implemented without what appears to be the agreement of a
significant chunk of the community.

- From a personal perspective, I sit on the autism spectrum and have grown
up with people using words that are very offensive and have hurt me badly.
Instead of treating these words as offensive and untouchable, I and others
have made these words our own and made them lose the negative connotations
they have. As such, I do find the current discussions deeply alarming; they
feel like they start to border into the realm of censorship.

- One final point (and potentially controversial): a good chunk of the
wording that is proposed to be changed is being changed based on the
"modern"/"street" definition of these words and not the actual definition.
Language should change and evolve to introduce clarity, but right now I
don't believe this change improves clarity across the engineering sector.

Edward


On Thu, 18 Jun 2020, 01:11 Andy LoPresto,  wrote:

> I am a proponent of making this change and also using allow/deny list,
> meddler-in-the-middle, etc.
>
> Here is a blog [1] with easy instructions for executing the change in git,
> although I don’t know if there is any Apache-integration specific changes
> we would also need.
>
> [1]
> https://www.hanselman.com/blog/EasilyRenameYourGitDefaultBranchFromMasterToMain.aspx
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> He/Him
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Jun 17, 2020, at 3:06 PM, Joe Witt  wrote:
> >
> > I suspect it would be fairly easy to make this change.  We do, I think,
> > have whitelist/blacklist in there somewhere but im not sure how involved.
> >
> > On Wed, Jun 17, 2020 at 3:04 PM Tony Kurc  wrote:
> >
> >> All,
> >> I've seen the discussion started on other projects [1][2], so I wanted
> to
> >> kick off a discussion to determine whether this is something nifi could
> >> look at too. Allen Wittenauer's post to yetus captures the why and some
> of
> >> the how, so rather than copy and pasting, you can take a look at what
> he's
> >> done. Thoughts?
> >>
> >> Tony
> >>
> >> 1.
> >>
> >>
> https://lists.apache.org/thread.html/rd38afa9fb6c0dcd77d1a677f1152b7398b3bda93c9106b3393149d10%40%3Cdev.yetus.apache.org%3E
> >> 2.
> >>
> >>
> https://lists.apache.org/thread.html/r0825eec0c84296bdab7cf898a987f06355443241ca02b2aaa51d3ef9%40%3Cdev.accumulo.apache.org%3E
> >>
>
>


Re: Duplicate flow files *without* their content

2019-07-31 Thread Edward Armes
Hi Lars,

In short: depending on how a FlowFile is duplicated, the content
shouldn't be duplicated as well.

In general, content is only duplicated when it has been deemed to have
changed (copy-on-write semantics). For the most part (unless a FlowFile has
a large number of attributes) a FlowFile is actually quite small, and
therefore the waste is minimal; hence why they can be held in memory and
passed through a flow.

The best way to branch/clone a FlowFile is to add another output from the
processor you want to log the output from, and the framework that surrounds
a processor will handle the rest. This does create a duplicate FlowFile but
doesn't create a copy of the content. In the provenance repository this is
marked as a CLONE event for the original FlowFile, and the new FlowFile gets
treated as its own unique FlowFile with a reference to the original
content.

This is quite a short explanation; a better and more in-depth explanation
can be found here, and I think it covers all the scenarios you're thinking
about:
https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html
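
(A minimal sketch of that pattern, roughly matching the two relationships you
describe — the class and relationship names are illustrative; the key point
is that clone() shares the original's content claim, so no content bytes are
copied:)

import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;

public class BranchFlowFile extends AbstractProcessor {

    static final Relationship REL_ORIGINAL = new Relationship.Builder()
            .name("original").build();
    static final Relationship REL_ATTRIBUTES_ONLY = new Relationship.Builder()
            .name("attributes only").build();

    @Override
    public Set<Relationship> getRelationships() {
        return Set.of(REL_ORIGINAL, REL_ATTRIBUTES_ONLY);
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        final FlowFile original = session.get();
        if (original == null) {
            return;
        }
        // clone() records a CLONE provenance event; the clone references
        // the same content claim rather than copying the bytes.
        final FlowFile clone = session.clone(original);
        session.transfer(original, REL_ORIGINAL);
        session.transfer(clone, REL_ATTRIBUTES_ONLY);
    }
}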


Edward

On Wed, Jul 31, 2019 at 11:47 AM Lars Winderling 
wrote:

> Dear NiFi community,
>
> I often face the use-case where I import flow files with content of order
> O(1gb) or O(10gb) – already compressed.
> Let's say I need to branch off of a flow where the actual flow file should
> be processed further, and on some side branch I want just to do some kind
> of logging or whatever without accessing the flow file's contents. Thus
> it's clearly wasteful to duplicate the flow file including content.
> For this case I wrote a processor defining 2 relationships: "original" and
> "attributes only", so the flow file attributes can be accessed separately
> from the content.
> I will gladly prepare a PR if anyone finds that worth incorporating into
> NiFi.
> Only remaining question for me would be: use an individual processor to
> that end, or add it to e.g. the DuplicateFlowFile processor. The former
> seems cleaner to me. Proposed names would be something like ForkProcessor
> (no better idea yet).
>
> Thanks in advance!
> Best,
> Lars
>


Re: [DISCUSS] Streaming or "lazy" mode for `CompressContent`

2019-07-30 Thread Edward Armes
Joe,

My concern is that the record reading and writing as it stands isn't as
clear as it could be, and this could make it worse. I personally did find
it a little difficult to understand how some record-processing processors
worked.

That aside, however, I think that if a "flow level"/Process Group setting
for compression were added, it would potentially work as a general solution.
What I'm thinking here is that as content leaves a processor it's checked
to see if it is already compressed; if it isn't, it is compressed on the way
to the content repo, and if it is, it's left alone. On the reverse, once
content is read from the content repo it is again intercepted and
de-compressed as it's loaded into the processor; there would potentially need
to be a flag added for a processor to indicate to the core that it
shouldn't need to de-compress the input.
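
(A rough sketch of the interception idea in plain java.util.zip terms — this
is not an existing NiFi extension point, just an illustration of the
check-then-wrap behaviour; a fuller version would also sniff on the write
path to leave already-compressed content alone, as described above:)

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PushbackInputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class TransparentContentCompression {

    // On the way into the content repo: compress the content stream.
    static OutputStream wrapWrite(final OutputStream toContentRepo) throws IOException {
        return new GZIPOutputStream(toContentRepo);
    }

    // On the way out: sniff the gzip magic bytes (0x1f 0x8b) and only
    // decompress content that was actually stored compressed.
    static InputStream wrapRead(final InputStream fromContentRepo) throws IOException {
        final PushbackInputStream in = new PushbackInputStream(fromContentRepo, 2);
        final byte[] magic = new byte[2];
        final int read = in.read(magic);
        if (read > 0) {
            in.unread(magic, 0, read);
        }
        final boolean gzipped = read == 2
                && (magic[0] & 0xff) == 0x1f && (magic[1] & 0xff) == 0x8b;
        return gzipped ? new GZIPInputStream(in) : in;
    }
}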

As for handling the compression algorithms, maybe the plugin-discovery
functionality used for repo implementations could be extended to silently
detect compression formats and algorithms?
I think it work for record and non-record data be it text or binary.

Edward


On Tue, Jul 30, 2019 at 5:42 PM Joe Witt  wrote:

> Edward,
>
> I like your point/comment regarding separation of concerns/cohesion.  I
> think we could/should consider automatically decompressing data on the fly
> for processors in general in the event we know a given set of data to be
> compressed but being accessed for plaintext purposes.  For general block
> compression types this is probably fair game and could be quite compelling
> particularly to avoid the extra read/write/content repo hits involved.
>
> That said, I think for the case of record readers/writers I'm not sure we
> can avoid having a specific solution.  Some compression types can be
> concatted together and some cannot.  Some record types would be
> tolerant/still valid and some would not.
>
> Thanks
> Joe
>
> On Tue, Jul 30, 2019 at 12:34 PM Edward Armes 
> wrote:
>
> > So while I agree with in principle and it is a good idea on paper.
> >
> >  My concern is that this starts to add a bolt-on bloat problem. The Nifi
> > processors as they stand in general do follow the Unix Philosophy (Do one
> > thing, and do it well). My concern is while it could just be a case with
> > just adding a wrapper is that it then becomes an ask to just add the
> > wrapper to other processors to add similar functionality or other. This
> does
> > start to cause a technical debt problem and also start to potentially a
> > detrimental experience to the user. Some of this I have mentioned in the
> > previous thread about the re-structuring the Nifi core.
> >
> > The reason why I suggest doing it either at the repo level or as the
> > InputStream is handed over to the processor from the core is that it adds
> > it as a global piece of functionality, which every processor that
> processes
> > data that compress well could benefit from. Now ideally it would be nice
> to
> > see it as a "per-flow" setting but I suspect that would be adding more
> > complexity, than is actually needed.
> >
> > I have seen an issue where over the time the content repo took up quite a
> > chunk of disk, for a multi-tenanted cluster that performed lots of small
> > changes on lots of FlowFiles, now while the hosts were under resourced,
> > being able to have compressed the content and trading it off for speed of
> > data through the flow might have helped that situation quite a bit.
> >
> > Edward
> >
> > On Tue, Jul 30, 2019 at 4:21 PM Joe Witt  wrote:
> >
> > > Malthe
> > >
> > > I do see value in having the Record readers/writers understand and
> handle
> > > compression directly as it will avoid the extra disk hit of decompress,
> > > read, compress cycles using existing processes and further there are
> > cases
> > > where the compression is record specific and not just holistic block
> > > encryption.
> > >
> > > I think Koji offered a great description of how to start thinking about
> > > this.
> > >
> > > Thanks
> > >
> > > On Tue, Jul 30, 2019 at 10:47 AM Malthe  wrote:
> > >
> > > > In reference to NIFI-6496 [1], I'd like to open a discussion on
> adding
> > > > compression support to flow files such that a processor such as
> > > > `CompressContent` might function in a streaming or "lazy" mode.
> > > >
> > > > Context, more details and initial feedback can be found in the ticket
> > > > referenced below as well as in a related SO entry [2].
> > > >
> > > > [1] https://issues.apache.org/jira/browse/NIFI-6496
> > > > [2]
> > > >
> > >
> >
> https://stackoverflow.com/questions/57005564/using-convertrecord-on-compressed-input
> > > >
> > >
> >
>


Re: [DISCUSS] Streaming or "lazy" mode for `CompressContent`

2019-07-30 Thread Edward Armes
So while I agree with this in principle, and it is a good idea on paper,
my concern is that this starts to add a bolt-on bloat problem. The NiFi
processors as they stand generally follow the Unix philosophy (do one
thing, and do it well). While it could just be a case of adding a wrapper,
it then becomes an ask to add the wrapper to other processors for similar
functionality. This does start to cause a technical debt problem and also,
potentially, a detrimental experience for the user. Some of this I have
mentioned in the previous thread about re-structuring the NiFi core.

The reason I suggest doing it either at the repo level or as the
InputStream is handed over to the processor from the core is that it adds
a global piece of functionality, which every processor that processes
data that compresses well could benefit from. Ideally it would be nice to
see it as a "per-flow" setting, but I suspect that would be adding more
complexity than is actually needed.

I have seen an issue where, over time, the content repo took up quite a
chunk of disk for a multi-tenanted cluster that performed lots of small
changes on lots of FlowFiles. While the hosts were under-resourced,
being able to compress the content, trading it off for speed of data
through the flow, might have helped that situation quite a bit.

Edward

On Tue, Jul 30, 2019 at 4:21 PM Joe Witt  wrote:

> Malthe
>
> I do see value in having the Record readers/writers understand and handle
> compression directly as it will avoid the extra disk hit of decompress,
> read, compress cycles using existing processes and further there are cases
> where the compression is record specific and not just holistic block
> encryption.
>
> I think Koji offered a great description of how to start thinking about
> this.
>
> Thanks
>
> On Tue, Jul 30, 2019 at 10:47 AM Malthe  wrote:
>
> > In reference to NIFI-6496 [1], I'd like to open a discussion on adding
> > compression support to flow files such that a processor such as
> > `CompressContent` might function in a streaming or "lazy" mode.
> >
> > Context, more details and initial feedback can be found in the ticket
> > referenced below as well as in a related SO entry [2].
> >
> > [1] https://issues.apache.org/jira/browse/NIFI-6496
> > [2]
> >
> https://stackoverflow.com/questions/57005564/using-convertrecord-on-compressed-input
> >
>


Re: Not able to start apche-nifi in aks

2019-07-22 Thread Edward Armes
Hi Hitesh,

From what you've said, I believe that this is actually a Kubernetes issue
and not a problem with the NiFi docker container. As such I've responded to
your Stack Overflow post, and hopefully someone with a bit more experience
with Kubernetes will also be able to help a bit more from there.

Edward


On Mon, Jul 22, 2019 at 11:51 AM Hitesh Ghuge  wrote:

> Dear Concern,
>
> I am trying to start apache-nifi on AKS(Azure kubernetes service).
> But not able to start, Please find error log below
>
>
>
>
>
>
>
>
>
>
>
>
>
> replacing target file /opt/nifi/nifi-current/conf/nifi.properties
> sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedSFiVwC’: Operation not permitted
> replacing target file /opt/nifi/nifi-current/conf/nifi.properties
> sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedK3S1JJ’: Operation not permitted
> replacing target file /opt/nifi/nifi-current/conf/nifi.properties
> sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedbcm91T’: Operation not permitted
> replacing target file /opt/nifi/nifi-current/conf/nifi.properties
> sed: preserving permissions for ‘/opt/nifi/nifi-current/conf/sedIuYSe1’: Operation not permitted
> NiFi running with PID 28.
> The specified run.as user nifi does not exist. Exiting.
> Received trapped signal, beginning shutdown...
>
>
> When I run the same docker image (1.9.2)
> on my local machine it works as expected.
> I can't debug in detail on AKS as the container gets destroyed on error.
>
> I have posted this question on Stack Overflow
> <
> https://stackoverflow.com/questions/57141136/not-able-to-start-apche-nifi-in-aks
> >
> as
> well as Hortonworks community
> <
> https://community.hortonworks.com/questions/249424/not-able-to-start-nifi-on-aksazure-kubernetes-serv.html
> >
>
>
> Revert for more input on the same
> Awaiting your response. :)
>
> --
> *Regards,*
> *Hitesh Ghuge.*
>


Re: Java 12 Compatibility

2019-07-17 Thread Edward Armes
The warning here isn't something you need to worry about, and you'll find
it's quite a common one on a lot of Java applications; it's down to
changes made in Java 9.

The short reason is that Java 9 introduced a new type of package: the
module. A module is essentially a package of packages. The important thing
to note is that, unless it is done explicitly, a module won't expose its
entire API, even if a resource is marked public; crucially, this
restriction also applies to the Java reflection system.

In Java 9 the standard Java API was reorganised into modules, and certain
things are not exposed in the module definitions. To get people onto Java
9+, it was decided that, by default, the reflection system would be allowed
to access unexposed APIs in modules for now. However, it is clear from
various bits of documentation that this will change in the future, and that
specific JVM flags will need to be used to override a module's export
definition to expose packages. The intention of these flags seems to be to
make it clear that you are invoking module internals that the module's
developer wasn't intending people to use, so when it breaks it's not the
developer's fault.
I would note what  I've said above is a very rough explanation and there is
a lot more complexity and subtlety to how this works. At some point in the
future I'm going to do some proper research into this, as I can see it in
the future giving me and others the runaround.

Like I said, it's nothing to worry about, and it's quite common: you will
see this warning on anything that uses a version of Spring that pre-dates
Java 9 and relies on reflection, when run on a Java 9 or newer runtime.
Edward

On Wed, Jul 17, 2019 at 12:39 AM Mike Thomsen 
wrote:

> I believe those warnings are given with Java 11 as well. Java 12 is not
> officially supported, but that's not to say it's incompatible with NiFi
> since the delta between it and Java 11 is not that big. I would recommend
> avoiding any experimental features bundled with Java 12 and stick to ones
> that the OpenJDK team says are stable in Java 12. For production scenarios,
> Java 8 or Java 11 would be strongly recommended over Java 12.
>
> On Tue, Jul 16, 2019 at 7:27 PM Hamza Mesbahi 
> wrote:
>
> > Hello,
> >
> > I've downloaded NiFi 1.9.2. and installed it on my Windows 10 using cmd
> > window. I currently have Java 12 installed on my computer and executable
> > from Path.
> > When I input the following command: "Start run-nifi.bat" I get the usual
> > gibberish but towards the end 10-15 lines this is what I get: WARNING: An
> > illegal reflective access operation has occurred
> > WARNING: Illegal reflective access by
> > org.apache.nifi.bootstrap.util.OSUtils
> > (file:/C:/Nifi/nifi-1.9.2/lib/bootstrap/nifi-bootstrap-1.9.2.jar) to
> method
> > java.lang.ProcessImpl.pid()
> > WARNING: Please consider reporting this to the maintainers of
> > org.apache.nifi.bootstrap.util.OSUtils
> > WARNING: Use --illegal-access=warn to enable warnings of further illegal
> > reflective access operations
> > WARNING: All illegal access operations will be denied in a future release
> > 2019-07-16 15:50:26,289 WARN [main] org.apache.nifi.bootstrap.Command
> > Failed to set permissions so that only the owner can read pid file
> > C:\Nifi\NIFI-1~1.2\bin\..\run\nifi.pid; this may allows others to have
> > access to the key needed to communicate with NiFi. Permissions should be
> > changed so that only the owner can read this file
> > 2019-07-16 15:50:26,293 WARN [main] org.apache.nifi.bootstrap.Command
> > Failed to set permissions so that only the owner can read status file
> > C:\Nifi\NIFI-1~1.2\bin\..\run\nifi.status; this may allows others to have
> > access to the key needed to communicate with NiFi. Permissions should be
> > changed so that only the owner can read this file
> > 2019-07-16 15:50:26,321 INFO [main] org.apache.nifi.bootstrap.Command
> > Launched Apache NiFi with Process ID 2112
> > I've been doing some research and have found that for a while Nifi wasn't
> > compatible with Java 11, however now it is. But I have Java 12 installed,
> > is Nifi 1.9.2 compatible with Java 12 SDK or is that the issue resulting
> in
> > that log output?
> >
> > Thanks,
> >
> > Hamza
> >
>


Re: [EXT] [discuss] Splitting NiFi framework and extension repos and releases

2019-07-12 Thread Edward Armes
I think NiFi would really benefit from this. One thing I think should
be looked into is something I noticed while trying to get to grips with the
NiFi source. At the start of the year I did a small exercise to track how a
call from the API to start and stop a processor translates into a processor
being scheduled on the underlying thread pool in the core.
While I was doing this I noticed a few things which I think might get in
the way of this. One of these was, there seems to be a lot of good
intention of louse coupling of the classes through the core/framework code
in a lot of places though there is hard codded reliance on the actual
implementation of the interface and not the interface itself. The other was
that it seemed to me that in the call chains for scheduling a processor
certain objects were being unnecessarily created and passed around when
they could just be created further down the line.

Both of these I think create some potential issue aside from the obvious
ones around code complexity, It does make the Nifi core code base have
quite a high barrier entry into people contributing and providing
enhancements in general. For example, I looked to see if I could implement
NIFI-966 (Expose Scheduling Strategy in ProcessContext)and because of the
way the ProcessContext is created and then handed around through the core
there is no good place (that I could see) to add that information.

I was planning to re-do my analysis with a few other API calls and raise a
ticket to propose to simplify the core but since this proposal been created
I thought I would mention it here and see what others thought?

Regards

Edward

https://issues.apache.org/jira/browse/NIFI-966

On Fri, Jul 12, 2019 at 5:48 PM Joe Witt  wrote:

> Ah I agree the JIRA thing would be too heavy handed.  A single JIRA with
> well defined components tied to 'repos' is good.
>
> As far as separate code repos we're talking about different releasable
> artifacts for which we as a PMC are responsible for the meaning/etc..  As a
> many time RM I definitely dislike the mono repo construct as I understand
> it to function.  I prefer repos per source release artifact where all
> source in that artifact is a function of the release. I am ok with
> different convenience binaries resulting from a single source release
> artifact though.
>
> Thanks
>
> On Fri, Jul 12, 2019 at 12:26 PM Adam Taft  wrote:
>
> > I think the concerns around user management are valid, are they not?
> > Overhead in JIRA goes up (assigning rights to users in JIRA is
> > multiplied).  Risk to new contributors is high, because each isolated
> > repository has its own life and code contribution styles.  Maybe the
> actual
> > apache infra involvement is low, but the negative effects of community
> and
> > source code bifurcation goes up.
> >
> > Tagging in mono-repos is done by prefixing the name of the component in
> the
> > tag name.  Your release sources are still generated from the component
> > folder (not from the root).
> >
> > Modularization (as being proposed) is a good thing, but can be done in a
> > single repository. It's not a requirement to split up the git project to
> > get the benefits of modularization.  That's the point I'm hoping is seen
> in
> > this.
> >
> >
> >
> > On Fri, Jul 12, 2019 at 10:08 AM Joe Witt  wrote:
> >
> > > to clarify user management for infra is not a prob.  it is an ldap
> group.
> > >
> > > repo creation is self service as well amd group access is tied to that.
> > >
> > > release artifact is the source we produce.  this is typically
> correlated
> > to
> > > a tag of the repo.  if we have all source in one repo it isnt clear to
> me
> > > how we can maintain that.
> > >
> > > in any event im not making a statement of whether to do many repos or
> > not.
> > > just correcting some potentially misleading claims.
> > >
> > > thanks
> > >
> > > On Fri, Jul 12, 2019, 12:01 PM Adam Taft  wrote:
> > >
> > > > Just as a point of discussion, I'm not entirely sure that splitting
> > into
> > > > multiple physical git repositories is actually adding any value.  I
> > think
> > > > it's worth consideration that all the (good) changes being proposed
> are
> > > > done under a single mono-repository model.
> > > >
> > > > If we split into multiple repositories, you have substantially
> > increased
> > > > the infra surface area. User account management overhead goes up.
> > Support
> > > > from the infra team goes up. JIRA issue management goes up,
> > > > misfiled/miscategorized issues become common. It becomes harder for
> > > > community members to interact and engage with the project, steeper
> > > learning
> > > > curve for new contributors. There are more "side channel"
> conversations
> > > and
> > > > less transparency into the project as a whole. Git history is much
> > harder
> > > > (or impossible) to follow across the entire project. Tracking down
> bugs
> > > and
> > > > performing git blame or git bisect becomes hard.
> > > >
> > >