Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread xmlking
If planning to support both, please make the 0.9 version of GetKafka not depend on Zookeeper, as Kafka 0.9 consumers can communicate directly with brokers without needing explicit firewall ports opened between NiFi and Zookeeper. Sumo > On Feb 21, 2016, at 2:23 PM, Joe Witt

Re: Using Apache Nifi and Tika to extract content from pdf

2016-02-21 Thread Matt Burgess
There are some RegEx processors you can use to see if the PDF parsed text is "empty" or full of just whitespace, or you can use the scripting processor for that too. For Jython, check the unit test:
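The unit test referred to above is cut off in this listing. Separately from it, the "empty or full of just whitespace" check can be sketched in plain Python (which should also run under Jython). This is only an illustration under assumptions, not code from the thread: the helper name `is_blank` is hypothetical, and how the extracted text reaches the script is left out.

```python
import re

# Hypothetical helper, not from the thread: matches a string that is
# empty or made up entirely of whitespace characters.
BLANK = re.compile(r"\A\s*\Z")


def is_blank(text):
    """Return True when text is None, empty, or whitespace only."""
    return text is None or BLANK.match(text) is not None


if __name__ == "__main__":
    for sample in ["", " \n\t ", "Page 1 of the report"]:
        print(repr(sample), is_blank(sample))
```

The same pattern, `\A\s*\Z`, could probably be adapted for a regex-based processor, while the function form fits a scripting step.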

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Oleg Zhurakousky
That is a good point, Bryan, and would allow Josh to use 0.5.0 while still using the older Kafka bundle. On Feb 21, 2016, at 6:07 PM, Bryan Bende wrote: I know this does not address the larger problem, but in this specific case, would the 0.4.1 Kafka NAR

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Bryan Bende
I know this does not address the larger problem, but in this specific case, would the 0.4.1 Kafka NAR still work in 0.5.x? If the NAR doesn't depend on any other NARs, I would think it would still work, and it could be a workaround for those who need to stay on Kafka 0.8.2. On Sunday, February 21,

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Oleg Zhurakousky
The unfortunate part is that between 0.8 and 0.9 there are also breaking API changes on the Kafka side that would affect our code, so I'd say we probably need to start thinking more about versioning. And in fact we are in the concept of extension registry, but what I am now suggesting is that

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Joe Witt
Yeah, the intent is to support 0.8 and 0.9. Will figure something out. Thanks, Joe. On Feb 21, 2016 4:47 PM, "West, Joshua" wrote: > Hi Oleg, Hmm -- from what I can tell, this isn't a Zookeeper communication issue. Nifi is able to connect into the Kafka brokers'

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread West, Joshua
Hi Oleg, Hmm -- from what I can tell, this isn't a Zookeeper communication issue. NiFi is able to connect to the Kafka brokers' Zookeeper cluster and retrieve the list of Kafka brokers to connect to. From the logs, the problem seems to occur when attempting to consume from Kafka itself.

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread West, Joshua
Hi Juan, Yep. NiFi 0.5.0 seems to be connecting to Zookeeper without issues. It created the /nifi/components path as well. -- Josh West Bose Corporation On Sat, 2016-02-20 at 19:38 -0500, Juan Sequeiros wrote: Excuse me, my message got cut short. I point you to

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread West, Joshua
Hi Oleg, Ahh. That's a shame. Hopefully the maintainers will mention this in the release notes or documentation, as I'm sure others will run into problems when upgrading NiFi while using existing Kafka 0.8.x clusters. It looks like we'll end up sticking with Kafka 0.8.2.1 and NiFi 0.4.1 for a

Re: Nifi 0.50 and GetKafka Issues

2016-02-21 Thread Oleg Zhurakousky
Josh, also keep in mind that there are incompatible property names in Kafka between the 0.7 and 0.8 releases. One of the changes that went in was replacing “zk.connectiontimeout.ms” with “zookeeper.connection.timeout.ms”. Not sure if it’s related though, but realizing that 0.4.1 was relying on

Re: Using Apache Nifi and Tika to extract content from pdf

2016-02-21 Thread Ralf Meier
Hi, thanks for your help. Now the workflow is working, but I still have some issues. The PutFile at the end of the workflow writes the file to disk, but in my case the content of the flow file is mostly empty (only one PDF worked for me), even though the rest is processed just fine. Also when I