[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 I am going to close this PR and create a new one based on a fresh fork of the project ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user joewitt commented on the issue: https://github.com/apache/nifi/pull/2702 looks like we're back in git pr funkystate with tons of non contrib commits in the PR... ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user alopresto commented on the issue: https://github.com/apache/nifi/pull/2702 Your local `master` is in sync with your repository (`origin`) but not the Apache GitHub repository (`upstream`). You need to do the following:
```
$ git checkout master
$ git pull upstream master
$ git checkout NIFI-4914
$ git rebase master
```
---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 I tried rebasing against master, but it had no effect. I think that is because my "master" is a fork of the 1.7.0 branch. Anyway, here is the output of the rebase commands:

    Davids-MacBook-Pro:nifi david$ git rebase master
    Current branch NIFI-4914 is up to date.
    Davids-MacBook-Pro:nifi david$ git checkout master
    Switched to branch 'master'
    Your branch is up to date with 'origin/master'.

All the POMs are still using version 1.7.0-SNAPSHOT.
---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user alopresto commented on the issue: https://github.com/apache/nifi/pull/2702 Yes, you should rebase against `master` as it appears the 1.7.0 release occurred after this PR was open, and current master is 1.8.0-SNAPSHOT. ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 From the errors I am seeing in the CI log, it appears that this PR is being built against the 1.8.0-SNAPSHOT release? Is that correct? Should I change the version in all of my POMs?

    [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.apache.nifi:nifi-utils:jar -> duplicate declaration of version 1.8.0-SNAPSHOT @ org.apache.nifi:nifi-couchbase-processors:[unknown-version], /home/travis/build/apache/nifi/nifi-nar-bundles/nifi-couchbase-bundle/nifi-couchbase-processors/pom.xml, line 65, column 21
     @
    [ERROR] The build could not read 1 project -> [Help 1]
    [ERROR]
    [ERROR] The project org.apache.nifi:nifi-pulsar-bundle:1.7.0-SNAPSHOT (/home/travis/build/apache/nifi/nifi-nar-bundles/nifi-pulsar-bundle/pom.xml) has 1 error
    [ERROR] Non-resolvable parent POM for org.apache.nifi:nifi-pulsar-bundle:1.7.0-SNAPSHOT: Could not find artifact org.apache.nifi:nifi-nar-bundles:pom:1.7.0-SNAPSHOT in sonatype-snapshots (https://oss.sonatype.org/content/repositories/snapshots/) and 'parent.relativePath' points at wrong local POM @ line 19, column 13 -> [Help 2]
---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2702 That's correct. I would recommend you build a custom bundle and deliver that because you can deliver NARs independent of the release. Then any bug fixes you find along the way can be merged into this PR in 1.8. Also, FYI, we have two other big PRs that came in that are waiting. One's a big refactor of InfluxDB support and the other is a MarkLogic PR. Each of these tends to be the equivalent in difficulty to reviewing half a dozen or more typical PRs, so it can be hard to triage. ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 Does that mean this commit won't make the 1.7 release? ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2702 Nothing left for you at the moment. I keep getting side tracked with real work requirements. It's a big commit, so we'll need a lot of review. @alopresto @markap14 @bbende @ijokarumawak @joewitt once 1.7 is out the door I can try to find some time, but I'd like others to look at this because it's a non-trivial commit that brings in a big chunk of totally new functionality. ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 Just wanted to check in to see if there is anything more I needed to do on my end, or if the testing instructions were clear enough. ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 @MikeThomsen I have updated my code to use the Apache Pulsar 2.0 client API, but cannot commit the changes yet, as the release vote is still pending, and thus the client jar file hasn't been released to the Maven central repository. As for testing the processors within a docker environment, you can use the following steps that I have verified and written down here:

Testing NiFi Processors

1. Launch a docker container running Apache Pulsar 2.0:
    docker pull apachepulsar/pulsar:2.0.0-rc1-incubating
    docker run -d -i --name pulsar -p 6650:6650 -p 8000:8000 apachepulsar/pulsar:2.0.0-rc1-incubating
2. Open a shell in the pulsar container and start the Pulsar service:
    docker exec -it pulsar /bin/bash
    root@266e559270ce:/pulsar/bin/pulsar standalone &
3. Launch a docker container running Apache NiFi 1.7.0-SNAPSHOT:
    docker run -d -i --name nifi --link pulsar -p 8080:8080 apache/nifi:1.7.0-SNAPSHOT
4. Load the following template into NiFi, and start the ConsumePulsar processor FIRST. Then start the other processors.
5. Verify that the data is being produced and consumed at the same rate.
6. Open the Pulsar dashboard at http://localhost:8000 to see that the topic was created and that the messages were generated.

[Pulsar-Test-xml.txt](https://github.com/apache/nifi/files/2051691/Pulsar-Test-xml.txt)
---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 @MikeThomsen I have updated the code to use the new Apache Pulsar 2.0 client API, but am blocked on the Apache release vote before the client jar is published to the Maven central repo. In the meantime, I am having issues seeing the new Pulsar processors on the NiFi canvas using the docker image provided in the repo. I have confirmed that the docker container has the nifi-pulsar-nar-1.7.0-SNAPSHOT.nar file in the /opt/nifi/nifi-1.7.0-SNAPSHOT/lib/ directory, and have started NiFi, but when I attempt to add a Pulsar processor to the canvas, none of them are available. I have double-checked the class names in the META-INF/services/org.apache.nifi.processor.Processor file and confirmed they are correct. I have also confirmed that the classes exist in the nifi-pulsar-nar-1.7.0-SNAPSHOT.nar file. Any ideas on what might be missing / wrong? ![screen shot 2018-05-29 at 10 27 27 am](https://user-images.githubusercontent.com/35466513/40674893-ef1f16a4-632a-11e8-8f81-e30386969091.png) ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2702 @markap14 I think you did a good chunk of the Kafka work, so would you like to take a look at this and join the review? ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2702 @david-streamlio I'd like to get back to this this week, so I have a question about your sandbox. Other than loading it up following the instructions that are provided w/ it, what needs to be done to set it up for use with these processors and services? ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702
> Also, something to think about is whether Pulsar's client will work well across different broker versions. For example, when Pulsar 2.x comes out, will the 1.x client work well against a 2.x broker? or vice versa?

Pulsar API compatibility vs binary protocol compatibility guidelines:
* Protocol compatibility will always be ensured (barring major reasons)
* API can be broken across major releases

The problem is that Kafka broke protocol compatibility at every release, which necessitated the 0.9, 0.10, 0.11, and 1.0 versions. FYI, the 2.0 release is under voting, and should be officially available in Maven central next week, but I have downloaded the code and built the jar locally and am in the process of replacing the deprecated classes with 2.0+ API changes.
---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user bbende commented on the issue: https://github.com/apache/nifi/pull/2702 @MikeThomsen I've only spent a couple of minutes looking at this, but I'm not sure it can work out as nicely as you are hoping... the controller service API here is heavily dependent on the actual pulsar client API. The only way you can get the transparent swapping between processors and CS impls is if the client API is hidden behind the CS impl. For example with HBase, we have...
- HBase processors (no dependency on hbase-client, dependency on hbase CS api)
- HBase CS API (no dependency on hbase-client)
- HBase CS 1.1.2 impl (dependency on hbase-client 1.1.2)

So because the processors and CS API do not know about hbase-client, then we can transparently provide new implementations without changing the processors. In this case we have...
- Pulsar processors (depends on Pulsar client 1.21.0-incubating)
- Pulsar CS API (depends on Pulsar client 1.21.0-incubating)
- Pulsar CS Impls (depends on Pulsar client 1.21.0-incubating)

I'm not saying the current setup is bad, just mentioning that it won't work out the way the hbase setup works. The trade-off is that in order to achieve the hbase setup you essentially need to recreate parts of their client API and depending how much you have to recreate, it may not be worth it if you are recreating the entire pulsar client API just to shield the processors.

Also, something to think about is whether Pulsar's client will work well across different broker versions. For example, when Pulsar 2.x comes out, will the 1.x client work well against a 2.x broker? or vice versa? In Kafka land their client has had issues across versions, like 0.8 client against 0.9 broker did not perform as well as 0.9 client against 0.9 broker, so for this reason you really need to use the corresponding client that goes with the broker. If Pulsar's client doesn't have this problem, maybe we don't need to worry at all about this versioning stuff.
---
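For readers following the layering argument above, here is a minimal sketch of the HBase-style arrangement in plain Java. It is an illustration only: `MessagePublisherService`, `InMemoryPublisherService`, and `PublishProcessor` are hypothetical names, not classes from this PR or from NiFi, and the in-memory implementation merely stands in for a version-specific implementation that would wrap a real client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical controller-service API. Only JDK types appear in its
// signatures, so code compiled against it never sees a vendor client class.
interface MessagePublisherService {
    void publish(String topic, byte[] payload);
}

// Stand-in implementation for illustration. A real implementation would wrap
// one specific client version and could be replaced without touching callers.
class InMemoryPublisherService implements MessagePublisherService {
    private final Map<String, List<byte[]>> topics = new HashMap<>();

    @Override
    public void publish(String topic, byte[] payload) {
        topics.computeIfAbsent(topic, t -> new ArrayList<>()).add(payload);
    }

    // Number of messages published to the given topic so far.
    int count(String topic) {
        List<byte[]> messages = topics.get(topic);
        return messages == null ? 0 : messages.size();
    }
}

// Hypothetical processor. It depends on the API interface only, so swapping
// the service implementation requires no change here.
class PublishProcessor {
    private final MessagePublisherService service;
    private final String topic;

    PublishProcessor(MessagePublisherService service, String topic) {
        this.service = service;
        this.topic = topic;
    }

    void onTrigger(String content) {
        service.publish(topic, content.getBytes(StandardCharsets.UTF_8));
    }
}
```

The cost bbende describes shows up in the interface: every client capability the processors need has to be re-expressed there in client-neutral terms, which for a rich client API can amount to recreating much of it.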
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2702 @joewitt @markap14 @bbende Can one of you skim through @david-streamlio's use of controller services and let me know if you think he should refactor the names? My gut feeling is that he can rename the processors to very generic ones and have _1_X suffixed controller services so that if/when Pulsar goes 2.X the logic just gets updated in new NARs for that. To me, it looks like a lot of the work that is likely to change is in the controller services. ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2702 Don't worry about others' build errors for now. Master might just be a little off. TBH, we can even merge the code if the build is failing for reasons other than your commits, so don't worry about that. ---
[GitHub] nifi issue #2702: Added Apache Pulsar processors
Github user david-streamlio commented on the issue: https://github.com/apache/nifi/pull/2702 Looks like there are errors in the Email processors causing the CI build to fail.

    [ERROR] Tests run: 5, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 1.996 s <<< FAILURE! - in org.apache.nifi.processors.email.TestConsumeEmail
    [ERROR] testConsumePOP3(org.apache.nifi.processors.email.TestConsumeEmail) Time elapsed: 1.014 s <<< ERROR!
    java.lang.IllegalStateException: Could not start mail server imap:127.0.0.1:3143, try to set server startup timeout > 1000 via ServerSetup.setServerStartupTimeout(timeoutInMs)
        at org.apache.nifi.processors.email.TestConsumeEmail.setUp(TestConsumeEmail.java:55)
    [ERROR] testConsumePOP3(org.apache.nifi.processors.email.TestConsumeEmail) Time elapsed: 1.017 s <<< ERROR!
    java.lang.NullPointerException
        at org.apache.nifi.processors.email.TestConsumeEmail.cleanUp(TestConsumeEmail.java:66)
    [INFO]
    [INFO] Results:
    [INFO]
    [ERROR] Errors:
    [ERROR] org.apache.nifi.processors.email.TestConsumeEmail.testConsumePOP3(org.apache.nifi.processors.email.TestConsumeEmail)
    [ERROR] Run 1: TestConsumeEmail.setUp:55 » IllegalState Could not start mail server imap:127
    [ERROR] Run 2: TestConsumeEmail.cleanUp:66 NullPointer
    [INFO]
    [INFO]
    [ERROR] Tests run: 21, Failures: 0, Errors: 1, Skipped: 0
---