Re: The rate of the dataflow is exceeding the provenance recording rate. Slowing down flow to accommodate

2017-12-25 Thread Koji Kawamura
Thanks for the updates, Ben. Glad to hear that!
Koji

On Tue, Dec 26, 2017 at 4:21 PM, 尹文才  wrote:
> Thanks Koji, I have already updated the logback configuration to produce
> more verbose logs.
> I was going to reply with the verbose NiFi logs, but since I switched to
> the WriteAheadProvenanceRepository implementation I haven't seen the
> error again.
> I will keep watching for the error and post the logs here if needed.
> Once again, thanks very much for your help.
>
> Regards,
> Ben

Re: The rate of the dataflow is exceeding the provenance recording rate. Slowing down flow to accommodate

2017-12-25 Thread 尹文才
Thanks Koji, I have already updated the logback configuration to produce
more verbose logs.
I was going to reply with the verbose NiFi logs, but since I switched to
the WriteAheadProvenanceRepository implementation I haven't seen the
error again.
I will keep watching for the error and post the logs here if needed.
Once again, thanks very much for your help.

Regards,
Ben
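
For anyone following along: the switch described here is, as far as I know, a
one-line change to nifi.provenance.repository.implementation in
NIFI_HOME/conf/nifi.properties, followed by a NiFi restart. Below is a minimal
Python sketch of that edit. The property key and class names are the ones NiFi
uses; the SAMPLE text and the use_write_ahead helper are made up for
illustration and are not part of NiFi.

```python
# Property key and class name as used in nifi.properties.
KEY = "nifi.provenance.repository.implementation"
WRITE_AHEAD = "org.apache.nifi.provenance.WriteAheadProvenanceRepository"

# Hypothetical excerpt of a nifi.properties file (illustration only).
SAMPLE = """\
# Provenance Repository Properties
nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
"""


def use_write_ahead(properties_text):
    """Return the properties text with the provenance implementation switched."""
    out = []
    found = False
    for line in properties_text.splitlines():
        if line.strip().startswith(KEY + "="):
            # Replace the existing setting, whatever it was.
            out.append(f"{KEY}={WRITE_AHEAD}")
            found = True
        else:
            out.append(line)
    if not found:
        # The key was missing entirely, so append it.
        out.append(f"{KEY}={WRITE_AHEAD}")
    return "\n".join(out) + "\n"


updated = use_write_ahead(SAMPLE)
print(updated)
```

Editing the file by hand works just as well; the sketch only shows which line
changes.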

2017-12-25 15:37 GMT+08:00 Koji Kawamura :

> Hi Ben,
>
> You can make NiFi log more verbose by editing:
> NIFI_HOME/conf/logback.xml
>
> For example, adding the following entry will reveal how NiFi repositories run:
>
> 
> 
> 
> 
>
> Thanks,
> Koji
>
> On Mon, Dec 25, 2017 at 4:30 PM, 尹文才  wrote:
> > Hi Koji, I also didn't find anything related to the unexpected shutdown
> > in my logs. Is there anything I could do to make NiFi write more verbose
> > information to the logs?
> >
> > Regards,
> > Ben
> >
> > 2017-12-25 14:56 GMT+08:00 Koji Kawamura :
> >
> >> Hi Ben,
> >>
> >> I looked at the log and I expected to see some indication for the
> >> cause of shutdown, but couldn't find any.
> >> The PersistentProvenanceRepository rate warning is just a warning, and
> >> it shouldn't be the trigger of an unexpected shutdown. I suspect other
> >> reasons such as OOM killer, but I can't do any further investigation
> >> with only these logs.
> >>
> >> Thanks,
> >> Koji
> >>
> >> On Mon, Dec 25, 2017 at 3:46 PM, 尹文才  wrote:
> >> > Hi Koji, one more thing: do you have any idea why my first issue leads
> >> > to the unexpected shutdown of NiFi? According to the warning message,
> >> > it should just slow down the flow. Thanks.
> >> >
> >> > Regards,
> >> > Ben
> >> >
> >> > 2017-12-25 14:31 GMT+08:00 尹文才 :
> >> >
> >> >> Hi Koji, thanks for your help. For the first issue, I will switch to
> >> >> the WriteAheadProvenanceRepository implementation.
> >> >>
> >> >> For the second issue, I have uploaded the relevant part of my log file
> >> >> to my Google Drive. The link is:
> >> >> https://drive.google.com/open?id=1oxAkSUyYZFy6IWZSeWqHI8e9Utnw1XAj
> >> >>
> >> >> Do you mean a custom processor could process a FlowFile twice only
> >> >> when it is trying to commit the session but is interrupted (for example
> >> >> because NiFi went down), so the FlowFile remains in the original queue?
> >> >>
> >> >> If you need to see the full log file, please let me know. Thanks.
> >> >>
> >> >> Regards,
> >> >> Ben
> >> >>
> >> >> 2017-12-25 13:51 GMT+08:00 Koji Kawamura :
> >> >>
> >> >>> Hi Ben,
> >> >>>
> >> >>> For your second issue, when a Processor's onTrigger is executed by
> >> >>> the NiFi flow engine, NiFi commits the process session by calling
> >> >>> session.commit().
> >> >>> https://github.com/apache/nifi/blob/master/nifi-api/src/main/java/org/apache/nifi/processor/AbstractProcessor.java#L28
> >> >>> Once a process session is committed, the FlowFile state (including
> >> >>> which queue it is in) is persisted to disk.
> >> >>>
> >> >>> It's possible for a Processor to process the same FlowFile more than
> >> >>> once if it has done its job but failed to commit the session.
> >> >>> For example, suppose your custom processor created a temp table from a
> >> >>> FlowFile, and then, before the process session was committed, something
> >> >>> happened and the NiFi process session was rolled back. In this case, the
> >> >>> target database is already updated (the temp table is created), but
> >> >>> the NiFi FlowFile stays in the incoming queue. If the FlowFile is
> >> >>> processed again, the processor will get an error indicating that the
> >> >>> table already exists.
> >> >>>
> >> >>> I tried to look at the logs you attached, but attachments do not seem
> >> >>> to be delivered to this ML. I don't see anything attached.
> >> >>>
> >> >>> Thanks,
> >> >>> Koji
> >> >>>
> >> >>>
> >> >>> On Mon, Dec 25, 2017 at 1:43 PM, Koji Kawamura <ijokaruma...@gmail.com>
> >> >>> wrote:
> >> >>> > Hi Ben,
> >> >>> >
> >> >>> > Just a quick recommendation for your first issue, 'The rate of the
> >> >>> > dataflow is exceeding the provenance recording rate' warning message.
> >> >>> > I'd recommend using WriteAheadProvenanceRepository instead of
> >> >>> > PersistentProvenanceRepository. WriteAheadProvenanceRepository
> >> >>> > provides better performance.
> >> >>> > Please take a look at the documentation here.
> >> >>> > https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#provenance-repository
> >> >>> >
> >> >>> > Thanks,
> >> >>> > Koji
> >> >>> >
> >> >>> > On Mon, Dec 25, 2017 at 12:56 PM, 尹文才 wrote:
> >> >>> >> Hi guys, I'm using nifi 1.4.0 to do some ETL work in my team and I have
> >> >>> >> encountered 2 problems during my testing.
> >> >>> >>
> >> >>> >> The first problem is I found the nifi bulletin board was showing the
> >> >>> >> following warning to me:
> >> >>> >>
> >> >>> >> 2017-12-25 01:31:00,460 WARN [Provenance Maintenance Thread-1]
> >> >>> >> o
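
The double-processing case Koji describes above (side effect done, session not
committed, FlowFile still in the queue) can be reproduced outside NiFi. The
sketch below is only a simulation under stated assumptions: a deque stands in
for the incoming queue, an in-memory SQLite database stands in for the target
database, and CommitFailed, run, and the two create_table_* helpers are
hypothetical names, not NiFi APIs. It contrasts a plain CREATE TABLE, which
fails on the retry, with CREATE TABLE IF NOT EXISTS, which makes the retry
harmless.

```python
import sqlite3
from collections import deque


class CommitFailed(Exception):
    """Stands in for anything that prevents the commit (crash, restart, ...)."""


def create_table_naive(db, name):
    # Not repeatable: a second attempt hits the table left by the first one.
    db.execute(f'CREATE TABLE "{name}" (id INTEGER)')


def create_table_idempotent(db, name):
    # Repeatable: a second attempt is a no-op.
    db.execute(f'CREATE TABLE IF NOT EXISTS "{name}" (id INTEGER)')


def run(queue, db, work, fail_first_commit=True):
    """Process every item; an item leaves the queue only after a commit."""
    errors = []
    pending_failure = fail_first_commit
    attempts = 0
    while queue and attempts < 10:
        attempts += 1
        item = queue[0]                   # peek: item stays queued until commit
        try:
            work(db, item)                # external side effect happens here
            if pending_failure:
                pending_failure = False
                raise CommitFailed(item)  # side effect done, commit never ran
            queue.popleft()               # "commit": item leaves the queue
        except CommitFailed:
            continue                      # rolled back: same item is retried
        except sqlite3.OperationalError as exc:
            errors.append(str(exc))
            queue.popleft()               # give up on this item
    return errors


naive_errors = run(deque(["tmp_a"]), sqlite3.connect(":memory:"), create_table_naive)
safe_errors = run(deque(["tmp_a"]), sqlite3.connect(":memory:"), create_table_idempotent)
print(naive_errors)
print(safe_errors)
```

The design point is that a processor with external side effects should expect
to see the same FlowFile again and make those side effects safe to repeat.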

Re: [DISCUSS] First Release of NiFi Registry

2017-12-25 Thread Pierre Villard
Hey guys,

Not sure this is the best place to give my feedback after running some
tests; let me know if I should open a new thread.

(I believe Joe P. already made some similar comments, but just in case...)

- in an unsecured environment, it's probably better to disable the "Add new
policy" button (NIFIREG-78)
- I've seen some logs that could be set to debug? “Access tokens are only
issued over HTTPS. Returning Conflict response.”, “Registry is not
configured to internally manage users, groups, or policies. Please contact
your system administrator.. Returning Conflict response.”
- general comment for NiFi UI: add tooltips on the icons of the upper
status bar? We've quite a few new icons coming with the Registry and I
guess it could help people not very familiar with it yet.
- is it possible to do a diff between two versions in the Registry UI?
- when adding a variable to a versioned PG, it does not show changes to
commit. Is that expected? (It does not seem so to me.)
- how do I set a previous version as the new current one? It does not seem
possible unless you stop version control and start again?
- very very minor comment, in the Registry UI, in the actions list, I'd set
"Delete" instead of "delete".

Another observation:

I have PG A containing PG B, both versioned, and I have two instances of PG
A in my NiFi UI: PG A1 and PG A2.
- I deleted PG B tracking in NiFi Registry. I now have 404 errors on the
PG Ax instances because PG B is not found in the registry. All good. Then I
disconnect PG B in PG A1. PG A1 is shown as OK / up-to-date with nothing to
commit.
- If I try to import a new instance of PG A, it’s not working because “The
Flow Registry with ID 893e20cc-0160-1000-8ab8-e0507c36aa94 reports that no
Flow exists with Bucket f66d8eb1-b893-41ad-974b-565bc33c8104, Flow
8d2df468-e8e4-4138-aaef-c7eadb71c2c4, Version 4”
- In PG A2, if I delete PG B, then it shows local change but I cannot
revert local changes: "Failed to retrieve flow with Flow Registry in order
to calculate local differences due to Error retrieving flow snapshot:
Versioned flow does not exist with identifier
08e85785-cb41-4cae-a516-6b4d3506960e"
- In the end I have to delete PG B and commit changes to get everything back
to normal. I’m wondering if disconnecting PG B shouldn’t be considered a
local change to be committed? I could be in a situation where I don’t want
to delete PG B, I just want to stop version control on it, no?

I'll run some more tests in secured environments.

Pierre




2017-12-21 18:50 GMT+01:00 Bryan Bende :

> Just wanted to give an update on this...
>
> Great progress has been made in the last two weeks in terms of getting
> ready for an RC. Still a few outstanding items, but I think we could
> have those wrapped up soon and kick out an RC some time next week.
> Depending when everything is ready we can adjust the voting period if
> needed to account for holidays and make sure there is adequate time
> for review.
>
> In the meantime, I encourage anyone who is interested to give it a
> try. Here is some info about how to get started...
>
> 1) Get the code for the registry
>
> The Apache repo is here:
>
> https://git-wip-us.apache.org/repos/asf/nifi-registry.git
>
> The github repo is here if you prefer to fork that:
>
> https://github.com/apache/nifi-registry
>
> 2) Build the registry code
>
> cd nifi-registry
> mvn clean install
>
> 3) Start the registry
>
> cd nifi-registry-assembly/target/nifi-registry-0.0.1-SNAPSHOT-bin/nifi-registry-0.0.1-SNAPSHOT/
> ./bin/nifi-registry.sh start
>
> 4) Create a bucket in the registry
>
> - Go to the registry UI at http://localhost:18080/nifi-registry
> - Click the tool icon in the top right corner
> - Click New Bucket from the bucket table
> - Enter a name and click create
>
> 5) Get the NiFi PR which adds the support for integrating with the registry
>
> https://github.com/apache/nifi/pull/2219
>
> Build that PR like normal.
>
> NOTE: You must have already built nifi-registry with "mvn clean
> install" in order to build this PR, because it depends on snapshot JARs
> being in your local Maven repo.
>
> 6) Tell NiFi about your local registry instance
>
> - Go to the controller settings for NiFi from the top-right menu
> - Select the Registry Clients tab
> - Add a new Registry Client giving it a name and the url of
> http://localhost:18080
>
> 7) Create a process group and place it under version control
>
> - Right click on the PG and select the Version menu
> - Select Start Version Control
> - Choose the registry instance and bucket you want to use
> - Enter a name, description, and comment
>
> 8) Go back to the registry and refresh the main page and you should
> see the versioned flow you just saved
>
> 9) Import a new PG from a versioned flow
>
> - Drag on a new PG like normal
> - Instead of entering a name, click the Import link
> - Now choose the flow you saved before
>
> You should have a second identical PG now.
>
> From there you can try making changes to one of them, view local
> change