[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-07-11 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
I am going to close this PR and create a new one based on a fresh fork of 
the project.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-07-11 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Looks like we're back in a funky git PR state, with tons of non-contribution 
commits in the PR...


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-07-11 Thread alopresto
Github user alopresto commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Your local `master` is in sync with your repository (`origin`) but not the 
Apache GitHub repository (`upstream`). You need to do the following:

```
$ git checkout master
$ git pull upstream master
$ git checkout NIFI-4914
$ git rebase master
```
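
As a rough sketch of the same idea, the helper below rebases a branch directly onto `upstream/master` after a fetch, so it does not depend on the local `master` being current. It assumes a remote named `upstream` already points at the Apache repository; the function name `rebase_onto_upstream` is hypothetical.

```shell
# Sketch: rebase a feature branch directly onto the upstream master.
# Assumes a remote named "upstream" exists; the function name is hypothetical.
rebase_onto_upstream() {
    branch="$1"
    # Fail early if no remote named "upstream" is configured.
    git remote get-url upstream >/dev/null 2>&1 || {
        echo "no 'upstream' remote; add one with: git remote add upstream <url>" >&2
        return 1
    }
    # Refresh upstream refs, switch to the branch, and replay it on top.
    git fetch upstream &&
    git checkout "$branch" &&
    git rebase upstream/master
}
```

For this PR that might be invoked as `rebase_onto_upstream NIFI-4914`. Because a rebase rewrites the branch history, updating the PR afterwards would typically need `git push --force-with-lease origin NIFI-4914`.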


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-07-11 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
I tried rebasing against master, but it had no effect. I think that is 
because my "master" is a fork of the 1.7.0 branch. Anyway, here is the 
output of the rebase commands:

"Davids-MacBook-Pro:nifi david$ git rebase master
Current branch NIFI-4914 is up to date.
Davids-MacBook-Pro:nifi david$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'."

All the POMs are still using version 1.7.0-SNAPSHOT.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-07-11 Thread alopresto
Github user alopresto commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Yes, you should rebase against `master`: it appears the 1.7.0 release 
occurred after this PR was opened, and current master is 1.8.0-SNAPSHOT.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-07-11 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
From the errors I am seeing in the CI log, it appears that this PR is being 
built against the 1.8.0-SNAPSHOT release? Is that correct?  Should I change the 
version in all of my POMs?

[WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.apache.nifi:nifi-utils:jar -> duplicate declaration of version 1.8.0-SNAPSHOT @ org.apache.nifi:nifi-couchbase-processors:[unknown-version], /home/travis/build/apache/nifi/nifi-nar-bundles/nifi-couchbase-bundle/nifi-couchbase-processors/pom.xml, line 65, column 21
 @ 
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]   
[ERROR]   The project org.apache.nifi:nifi-pulsar-bundle:1.7.0-SNAPSHOT (/home/travis/build/apache/nifi/nifi-nar-bundles/nifi-pulsar-bundle/pom.xml) has 1 error
[ERROR] Non-resolvable parent POM for org.apache.nifi:nifi-pulsar-bundle:1.7.0-SNAPSHOT: Could not find artifact org.apache.nifi:nifi-nar-bundles:pom:1.7.0-SNAPSHOT in sonatype-snapshots (https://oss.sonatype.org/content/repositories/snapshots/) and 'parent.relativePath' points at wrong local POM @ line 19, column 13 -> [Help 2]
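
One way to confirm which POMs still carry the old version is to search for it, as sketched below. The function name `find_stale_poms` and its arguments are hypothetical; the path and version strings in the usage note come from the log above.

```shell
# Sketch: list pom.xml files under a directory that still mention a version.
# Function name and arguments are hypothetical.
find_stale_poms() {
    dir="$1"
    old_version="$2"
    # -F matches the version string literally; -l prints only file names.
    find "$dir" -name pom.xml -exec grep -lF "$old_version" {} +
}
```

For example, `find_stale_poms nifi-nar-bundles/nifi-pulsar-bundle 1.7.0-SNAPSHOT` would list the bundle POMs that still reference 1.7.0-SNAPSHOT, the version the build above fails to resolve.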


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-06-21 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2702
  
That's correct. I would recommend you build a custom bundle and deliver 
that because you can deliver NARs independent of the release. Then any bug 
fixes you find along the way can be merged into this PR in 1.8.

Also, FYI, we have two other big PRs that came in and are waiting. One is a 
big refactor of InfluxDB support and the other is a MarkLogic PR. Each of these 
tends to be as difficult to review as half a dozen or more typical PRs, so 
they can be hard to triage.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-06-21 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Does that mean this commit won't make the 1.7 release? 


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-06-21 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Nothing left for you at the moment. I keep getting sidetracked with real 
work requirements. It's a big commit, so we'll need a lot of review.

@alopresto @markap14 @bbende @ijokarumawak @joewitt  once 1.7 is out the 
door I can try to find some time, but I'd like others to look at this because 
it's a non-trivial commit that brings in a big chunk of totally new 
functionality.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-06-21 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Just wanted to check in to see if there is anything more I needed to do on 
my end, or if the testing instructions were clear enough.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-29 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
@MikeThomsen 

I have updated my code to use the Apache Pulsar 2.0 client API, but cannot 
commit the changes yet, as the release vote is still pending, and thus the 
client jar file hasn't been released to the maven central repository.

As for testing the processors within a docker environment, you can use the 
following steps that I have verified and written down here:

Testing NiFi Processors

1. Launch a docker container running Apache Pulsar 2.0:
docker pull apachepulsar/pulsar:2.0.0-rc1-incubating
docker run -d -i --name pulsar -p 6650:6650 -p 8000:8000 apachepulsar/pulsar:2.0.0-rc1-incubating

2. Open a shell in the pulsar container and start the Pulsar service:
docker exec -it pulsar /bin/bash
root@266e559270ce:/pulsar/bin/pulsar standalone &

3. Launch a docker container running Apache NiFi 1.7.0-SNAPSHOT:
docker run -d -i --name nifi --link pulsar -p 8080:8080 apache/nifi:1.7.0-SNAPSHOT

4. Load the following template into NiFi, and start the ConsumePulsar 
processor FIRST. Then start the other processors.

5. Verify that the data is being produced and consumed at the same rate.

6. Open the Pulsar dashboard at http://localhost:8000 to see that the topic 
was created and that the messages were generated.

[Pulsar-Test-xml.txt](https://github.com/apache/nifi/files/2051691/Pulsar-Test-xml.txt)



---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-29 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
@MikeThomsen I have updated the code to use the new Apache Pulsar 2.0 
client API, but am blocked until the Apache release vote completes and the 
client jar is published to the Maven Central repo.

In the meantime, I am having issues seeing the new Pulsar processors on the 
NiFi canvas using the docker image provided in the repo. I have confirmed that 
the docker container has the nifi-pulsar-nar-1.7.0-SNAPSHOT.nar file in the 
/opt/nifi/nifi-1.7.0-SNAPSHOT/lib/ directory, and have started NiFi, but when I 
attempt to add a Pulsar processor to the canvas, none of them are available.

I have double checked the class names in the 
META-INF/services/org.apache.nifi.processor.Processor file and confirmed they 
are correct. I have also confirmed that the classes exist in the 
nifi-pulsar-nar-1.7.0-SNAPSHOT.nar file.   Any ideas on what might be missing / 
wrong?

![screen shot 2018-05-29 at 10 27 27 
am](https://user-images.githubusercontent.com/35466513/40674893-ef1f16a4-632a-11e8-8f81-e30386969091.png)



---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-29 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2702
  
@markap14 I think you did a good chunk of the Kafka work, so would you like 
to take a look at this and join the review?


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-29 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2702
  
@david-streamlio I'd like to get back to this this week, so I have a 
question about your sandbox. Other than loading it up following the 
instructions that are provided w/ it, what needs to be done to set it up for 
use with these processors and services?


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-18 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
> Also, something to think about is whether Pulsar's client will work well 
across different broker versions. For example, when Pulsar 2.x comes out, will 
the 1.x client work well against a 2.x broker? or vice versa?

Pulsar's guidelines on API compatibility vs. binary protocol compatibility:
* Protocol compatibility will always be ensured (barring major reasons)
* The API can be broken across major releases

The problem with Kafka was that it broke protocol compatibility at every 
release, which necessitated the 0.9, 0.10, 0.11, and 1.0 versions.

FYI, the 2.0 release is being voted on and should be officially available in 
Maven Central next week, but I have downloaded the code and built the jar 
locally, and am in the process of replacing the deprecated classes with the 
2.0+ API changes.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-17 Thread bbende
Github user bbende commented on the issue:

https://github.com/apache/nifi/pull/2702
  
@MikeThomsen I've only spent a couple of minutes looking at this, but I'm 
not sure it can work out as nicely as you are hoping... the controller service 
API here is heavily dependent on the actual pulsar client API. 

The only way you can get the transparent swapping between processors and CS 
impls is if the client API is hidden behind the CS impl. For example with 
HBase, we have...

- HBase processors (no dependency on hbase-client, dependency on hbase CS 
api)
- HBase CS API (no dependency on hbase-client)
- HBase CS 1.1.2 impl (dependency on hbase-client 1.1.2)

So because the processors and CS API do not know about hbase-client, then 
we can transparently provide new implementations without changing the 
processors.

In this case we have...
- Pulsar processors (depends on Pulsar client 1.21.0-incubating)
- Pulsar CS API (depends on Pulsar client 1.21.0-incubating)
- Pulsar CS Impls (depends on Pulsar client 1.21.0-incubating)

I'm not saying the current setup is bad, just mentioning that it won't work 
out the way the hbase setup works. 

The trade-off is that, to achieve the HBase setup, you essentially need to 
recreate parts of their client API. Depending on how much you have to recreate, 
it may not be worth it, for example if you end up recreating the entire Pulsar 
client API just to shield the processors.

Also, something to think about is whether Pulsar's client will work well 
across different broker versions. For example, when Pulsar 2.x comes out, will 
the 1.x client work well against a 2.x broker? or vice versa?

In Kafka land their client has had issues across versions, like 0.8 client 
against 0.9 broker did not perform as well as 0.9 client against 0.9 broker, so 
for this reason you really need to use the corresponding client that goes with 
the broker.

If Pulsar's client doesn't have this problem, maybe we don't need to worry 
at all about this versioning stuff.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-17 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2702
  
@joewitt @markap14 @bbende Can one of you skim through @david-streamlio's 
use of controller services and let me know if you think he should refactor the 
names? My gut feeling is that he can rename the processors to very generic ones 
and have _1_X suffixed controller services so that if/when Pulsar goes 2.X the 
logic just gets updated in new NARs for that. To me, it looks like a lot of the 
work that is likely to change is in the controller services.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-15 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Don't worry about others' build errors for now. Master might just be a 
little off. TBH, we can even merge the code if the build is failing for reasons 
other than your commits, so don't worry about that.


---


[GitHub] nifi issue #2702: Added Apache Pulsar processors

2018-05-14 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2702
  
Looks like there are errors in the email processors causing the CI build to 
fail.

[ERROR] Tests run: 5, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 1.996 s <<< FAILURE! - in org.apache.nifi.processors.email.TestConsumeEmail
[ERROR] testConsumePOP3(org.apache.nifi.processors.email.TestConsumeEmail)  Time elapsed: 1.014 s  <<< ERROR!
java.lang.IllegalStateException: Could not start mail server imap:127.0.0.1:3143, try to set server startup timeout > 1000 via ServerSetup.setServerStartupTimeout(timeoutInMs)
at org.apache.nifi.processors.email.TestConsumeEmail.setUp(TestConsumeEmail.java:55)

[ERROR] testConsumePOP3(org.apache.nifi.processors.email.TestConsumeEmail)  Time elapsed: 1.017 s  <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.email.TestConsumeEmail.cleanUp(TestConsumeEmail.java:66)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR] org.apache.nifi.processors.email.TestConsumeEmail.testConsumePOP3(org.apache.nifi.processors.email.TestConsumeEmail)
[ERROR]   Run 1: TestConsumeEmail.setUp:55 » IllegalState Could not start mail server imap:127
[ERROR]   Run 2: TestConsumeEmail.cleanUp:66 NullPointer
[INFO] 
[INFO] 
[ERROR] Tests run: 21, Failures: 0, Errors: 1, Skipped: 0


---