Re: Unable to see Nifi data lineage in Atlas

2018-07-29 Thread Koji Kawamura
Hi Mohit,

From the log message, I assume that you are using an existing
atlas-application.properties copied from somewhere (most likely from an
HDP environment), and that PLAINTEXTSASL is used in it.
PLAINTEXTSASL is not supported by ReportLineageToAtlas.

As a work-around, please set 'Create Atlas Configuration File' to true
and let the reporting task generate atlas-application.properties
instead.
The generated file uses SASL_PLAINTEXT, which is the standard Apache
Kafka name for the same protocol as PLAINTEXTSASL.
You may need to restart NiFi for the change to take effect.
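For reference, a generated atlas-application.properties typically
includes Kafka notification settings along these lines (a sketch only;
the hostnames are placeholders, and the key point is the
security.protocol value):

```properties
# Kafka notification settings (sketch; hostnames are placeholders)
atlas.kafka.bootstrap.servers=kafka-host1:6667,kafka-host2:6667
# SASL_PLAINTEXT is the standard Apache Kafka enum name; the HDP-specific
# alias PLAINTEXTSASL is not recognized by the bundled Kafka client
atlas.kafka.security.protocol=SASL_PLAINTEXT
```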

Hope this helps,
Koji

On Thu, Jul 26, 2018 at 7:12 PM, Mohit  wrote:
> Hi,
>
>
>
> While looking at the logs, I found out that ReportLineageToAtlas is not
> able to construct a KafkaProducer.
>
> It throws the following exception:
>
>
>
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:335)
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:188)
>     at org.apache.atlas.kafka.KafkaNotification.createProducer(KafkaNotification.java:286)
>     at org.apache.atlas.kafka.KafkaNotification.sendInternal(KafkaNotification.java:207)
>     at org.apache.atlas.notification.AbstractNotification.send(AbstractNotification.java:84)
>     at org.apache.atlas.hook.AtlasHook.notifyEntitiesInternal(AtlasHook.java:133)
>     at org.apache.atlas.hook.AtlasHook.notifyEntities(AtlasHook.java:118)
>     at org.apache.atlas.hook.AtlasHook.notifyEntities(AtlasHook.java:171)
>     at org.apache.nifi.atlas.NiFiAtlasHook.commitMessages(NiFiAtlasHook.java:150)
>     at org.apache.nifi.atlas.reporting.ReportLineageToAtlas.lambda$consumeNiFiProvenanceEvents$6(ReportLineageToAtlas.java:721)
>     at org.apache.nifi.reporting.util.provenance.ProvenanceEventConsumer.consumeEvents(ProvenanceEventConsumer.java:204)
>     at org.apache.nifi.atlas.reporting.ReportLineageToAtlas.consumeNiFiProvenanceEvents(ReportLineageToAtlas.java:712)
>     at org.apache.nifi.atlas.reporting.ReportLineageToAtlas.onTrigger(ReportLineageToAtlas.java:664)
>     at org.apache.nifi.controller.tasks.ReportingTaskWrapper.run(ReportingTaskWrapper.java:41)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.kafka.common.protocol.SecurityProtocol.PLAINTEXTSASL
>     at java.lang.Enum.valueOf(Enum.java:238)
>     at org.apache.kafka.common.protocol.SecurityProtocol.valueOf(SecurityProtocol.java:28)
>     at org.apache.kafka.common.protocol.SecurityProtocol.forName(SecurityProtocol.java:89)
>     at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:79)
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:277)
>     ... 20 common frames omitted
>
>
>
> Thanks,
>
> Mohit
>
>
>
> From: Mohit 
> Sent: 25 July 2018 17:46
> To: users@nifi.apache.org
> Subject: Unable to see Nifi data lineage in Atlas
>
>
>
> Hi all,
>
>
>
> I have configured the ReportLineageToAtlas reporting task to send NiFi
> flow information to Atlas. NiFi is integrated with Ranger.
>
> I am able to see all the information in Atlas except the lineage. When I
> search for hdfs_path or hive_table, I can only see the Hive-side
> information. I can't figure out anything wrong in the configuration.
>
> Is there something in the Ranger configuration that I’m missing?
>
>
>
> Regards,
>
> Mohit
>
>
>
>


Re: RPG S2S Error

2018-07-29 Thread Koji Kawamura
Hi Faisal,

By adding a ControlRate processor before sending FlowFiles via the RPG,
you can throttle the rate at which data is sent, which should reduce the
probability of the receiving side becoming full.
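For example, a ControlRate processor placed just before the RPG could be
configured along these lines (a sketch only; the property names are from
the ControlRate documentation, and the values are illustrative and would
need tuning for your flow):

```properties
Rate Control Criteria : data rate
Maximum Rate          : 1 MB
Time Duration         : 1 sec
```

This would cap the flow into the RPG at roughly 1 MB per second.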

If the current overall throughput is acceptable for your use-case, and
you don't see any data loss, then you should be able to ignore the
message.
You can filter messages by log level, configured in conf/logback.xml.
By adding the following line, you can filter out EndpointConnectionPool
warning messages.


The SocketRemoteSiteListener log message you want to filter is at ERROR
level, though; I think you need to write a custom log filter class to
filter it.
https://logback.qos.ch/manual/filters.html
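Assuming the logger names match the NiFi source packages (worth
verifying against your NiFi version), such logback entries could look
like this in conf/logback.xml:

```xml
<!-- Raise the level so EndpointConnectionPool WARN messages are dropped.
     Logger name assumed from the NiFi source package. -->
<logger name="org.apache.nifi.remote.client.socket.EndpointConnectionPool" level="ERROR"/>

<!-- level="OFF" would silence SocketRemoteSiteListener entirely; dropping
     only specific ERROR messages needs a custom logback filter instead. -->
<logger name="org.apache.nifi.remote.SocketRemoteSiteListener" level="OFF"/>
```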

Thanks,
Koji

On Fri, Jul 20, 2018 at 3:11 PM, Faisal Durrani  wrote:
> Hi Joe/Koji,
>
> I can't seem to figure out a way to reduce the back pressure or to find the
> root cause of these errors:
>
> 1.Unable to communicate with remote instance Peer [] due to
> java.io.EOFException; closing connection
> 2.indicates that port 37e64bd0-5326-3c3f-80f4-42a828dea1d5's destination is
> full; penalizing peer
>
> I have tried increasing the rate of delivery of the data by increasing the
> concurrent tasks, increasing the back pressure thresholds, and replacing the
> PutHBaseJSON processor with PutHBaseRecord (the slowest part of our data
> flow). While I have seen some improvement, I can't seem to get rid of
> the above errors. I also changed various settings in the NiFi config, like:
>
> nifi.cluster.node.protocol.threads =50
> JVM =4096
> nifi.cluster.node.max.concurrent.requests=400
> nifi.cluster.node.protocol.threads=50
> nifi.web.jetty.threads=400
>
> Would it be safe to ignore these errors, as they fill up the API logs, or do I
> need to investigate further? If we can ignore them, then is there any way to
> stop them from appearing in the log file?
>
>
>
> On Fri, Jul 13, 2018 at 10:42 AM Joe Witt  wrote:
>>
>> you can allow for larger backlogs by increasing the backpressure
>> thresholds OR you can add additional nodes OR you can expire data.
>>
>> The whole point of the backpressure and pressure release features is to
>> let you be in control of how many resources are dedicated to buffering data.
>> However, in the most basic sense, if the rate of data arrival always exceeds
>> the rate of delivery, then delivery must be made faster or data must be
>> expired at some threshold age.
>>
>> thanks
>>
>> On Thu, Jul 12, 2018, 9:34 PM Faisal Durrani  wrote:
>>>
>>> Hi Koji,
>>>
>>> I moved onto another cluster of NiFi nodes, did the same configuration
>>> for S2S there and boom.. the same error message all over the logs (nothing
>>> on the bulletin board).
>>>
>>> Could it be because of the back pressure? I also get the error
>>> (indicates that port 8c77c1b0-0164-1000--052fa54c's destination is
>>> full; penalizing peer) at the same time I see the closing connection error.
>>> I don't see a way to resolve the back pressure, as we get a continuous
>>> stream of data from Kafka which is then inserted into HBase (the slowest
>>> part of the data flow), which eventually causes the back pressure.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 6, 2018 at 4:55 PM Koji Kawamura 
>>> wrote:

 Hi Faisal,

 I think both error messages indicate the same thing: the network
 communication was closed in the middle of a Site-to-Site transaction.
 That can happen for many reasons, such as a flaky network, or manually
 stopping the port or RPG while a transaction is being processed. I don't
 think it is a configuration issue, because NiFi was able to initiate the
 S2S communication.

 Thanks,
 Koji

 On Fri, Jul 6, 2018 at 4:16 PM, Faisal Durrani 
 wrote:
 > Hi Koji,
 >
 > In the subsequent tests the above error did not come up, but now we are
 > getting errors on the RPG:
 >
 >
 > RemoteGroupPort[name=1_pk_ip,targets=http://xx.prod.xx.local:9090/nifi/]
 > failed to communicate with remote NiFi instance due to java.io.IOException:
 > Failed to confirm transaction with Peer[url=nifi://xxx-x.prod.xx.local:5001]
 > due to java.io.IOException: Connection reset by peer
 >
 > The transport protocol is RAW, and the URL used when setting up the RPG
 > is one of the nodes of the 4-node cluster.
 >
 > nifi.remote.input.socket.port = 5001
 >
 > nifi.remote.input.secure=false
 >
 > nifi.remote.input.http.transaction.ttl=60 sec
 >
 > nifi.remote.input.host=
 >
 > Please let me know if there are any configuration changes that we need
 > to make.
 >
 >
 >
 >
 > On Fri, Jul 6, 2018 at 9:48 AM Faisal Durrani 
 > wrote:
 >>
 >> Hi Koji ,
 >>
 >> Thank you for your reply. I updated the logback.xml and ran the test
 >> again. I can see an additional error in the app.log which is as
 >> below.
 >>
 >> 

Re: [EXT] Re: Hive w/ Kerberos Authentication starts failing after a week

2018-07-29 Thread Jeff
Peter,

They're in separate NARs, and are isolated by different ClassLoaders, so
their state regarding UGI will be separate. There shouldn't be a problem
there. The only way I could think of that might create a problem is if
Atlas JARs were added to the HDFS processors using the Additional
Classpath Resources property (from memory, I don't think the Hive
processors have that property), but that also uses a separate (descendant)
ClassLoader, and shouldn't create a problem either.

On Fri, Jul 27, 2018 at 1:29 PM Peter Wicks (pwicks) 
wrote:

> As an aside, while digging around in the code, I noticed that the Atlas
> Reporting Task has its own Hadoop Kerberos authentication logic
> (org.apache.nifi.atlas.security.Kerberos). I’m not using this, but it made
> me wonder if this could cause trouble if Hive (synchronized) and Atlas
> (separate, unsynchronized) were both trying to log in from a keytab at the
> same time.
>
>
>
> --Peter
>
>
>
> *From:* Shawn Weeks [mailto:swe...@weeksconsulting.us]
>
> *Sent:* Friday, July 27, 2018 10:29 AM
>
>
> *To:* users@nifi.apache.org
> *Subject:* Re: [EXT] Re: Hive w/ Kerberos Authentication starts failing
> after a week
>
>
>
> If you're using the Hortonworks distribution it's fixed in the latest HDF
> 3.x release I think.
>
>
>
> Thanks
>
> Shawn
>
>
> --
>
> *From:* Peter Wicks (pwicks) 
> *Sent:* Friday, July 27, 2018 10:58 AM
> *To:* users@nifi.apache.org
> *Subject:* RE: [EXT] Re: Hive w/ Kerberos Authentication starts failing
> after a week
>
>
>
> Thanks Shawn. Looks like this was fixed in 1.7.0. Will have to upgrade.
>
>
>
> *From:* Shawn Weeks [mailto:swe...@weeksconsulting.us]
> *Sent:* Friday, July 27, 2018 8:07 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: [EXT] Re: Hive w/ Kerberos Authentication starts failing
> after a week
>
>
>
> See NIFI-5134: there was a known bug with the Hive Connection Pool that
> made it fail once the Kerberos tickets expired and you lost your connection
> to Hive. If you don't have this patch in your version, once the Kerberos
> ticket reaches the end of its lifetime the connection pool won't work
> until you restart NiFi.
>
>
>
> Thanks
>
> Shawn
> --
>
> *From:* Peter Wicks (pwicks) 
> *Sent:* Friday, July 27, 2018 8:51:54 AM
> *To:* users@nifi.apache.org
> *Subject:* RE: [EXT] Re: Hive w/ Kerberos Authentication starts failing
> after a week
>
>
>
> I don’t believe that is how this code works. Not to say that it might not
> work, but I don’t believe that the Kerberos authentication used by NiFi
> processors relies in any way on the tickets that appear in klist.
>
>
>
> While we are only using a single account on this particular server, many
> of our servers use several Kerberos principals/keytabs. I don’t think that
> doing kinits for all of them would work either.
>
>
>
> Thanks,
>
>   Peter
>
>
>
> *From:* Sivaprasanna [mailto:sivaprasanna...@gmail.com
> ]
> *Sent:* Friday, July 27, 2018 3:12 AM
> *To:* users@nifi.apache.org
> *Subject:* [EXT] Re: Hive w/ Kerberos Authentication starts failing after
> a week
>
>
>
> Did you try executing 'klist' to see if the tickets are there and renewed?
> If they have expired, try a manual kinit and see if that fixes it.
>
>
>
> On Fri, Jul 27, 2018 at 1:51 AM Peter Wicks (pwicks) 
> wrote:
>
> We are seeing frequent failures of our Hive DBCP connections after a week
> of use when using Kerberos with a principal/keytab. We’ve tried both with
> the Credential Service and without (though from looking at the code, there
> should be no difference).
>
>
>
> It looks like the tickets are expiring and renewal is not happening?
>
>
>
> javax.security.sasl.SaslException: GSS initiate failed
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>     at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
>     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
>     at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:204)
>     at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:176)
>     at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
>     at
> 

Re: Pushing to bitbucket from nifi registry

2018-07-29 Thread Kevin Doran
Hi,



Glad to hear you are finding the NiFi Registry features useful. Regarding
your question, the “Remote Access User” and “Remote Access Password”
properties are only used when the remote URL is an HTTPS URL. When it is an
SSH URL, it is expected that password-less SSH has been configured on your
NiFi Registry instance, and the user/password properties are ignored.



If it is possible, I would suggest you try one of the following:

   1. Set up password-less SSH (e.g., if you can ensure no one else has
   access to the private ssh key through some other mechanism, such as file
   permissions)
   2. If your remote bitbucket accepts username/password authentication
   over HTTPS, change your remote to be HTTPS based rather than SSH based.
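For option 2, the corresponding providers.xml entry could look roughly
like this (a sketch; the credentials are placeholders, and the property
names are those of the Git flow persistence provider):

```xml
<flowPersistenceProvider>
    <class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
    <property name="Flow Storage Directory">./storage/nifi_registry_repo</property>
    <property name="Remote To Push">origin</property>
    <!-- Remote Access User/Password are only honored for HTTPS remotes -->
    <property name="Remote Access User">bitbucket_user</property>
    <property name="Remote Access Password">bitbucket_password</property>
</flowPersistenceProvider>
```

The repository's origin remote would also need to point at the HTTPS URL
(e.g. via git remote set-url origin followed by the HTTPS URL).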



I hope this helps! If you have any other trouble/questions let me know.



Regards,

Kevin


On Sun, Jul 29, 2018 at 11:18 AM, Krish  wrote:

> Hello,
>
> I've just upgraded from NiFi 1.2.0 to 1.7.1 and have been playing around
> with the NiFi Registry features, which are great!
>
> I've been trying to get it to push automatically to my bitbucket
> repository, but I keep getting the following error: "Failed to push commits
> to origin due to org.eclipse.jgit.api.errors.TransportException
> ssh://git@bitbucketserver:1234/nifi-registry.git Auth Fail"
>
> In my configuration I have the following settings in my providers.xml:
>
> <flowPersistenceProvider>
>     <class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
>     <property name="Flow Storage Directory">./storage/nifi_registry_repo</property>
>     <property name="Remote To Push">origin</property>
>     <property name="Remote Access User">ssh_key_user</property>
>     <property name="Remote Access Password">ssh key password</property>
> </flowPersistenceProvider>
>
>
> The nifi_registry_repo is a git repo which I have cloned down from my
> bitbucket server.
> I've set up an SSH key and loaded it into the bitbucket repo; I can
> successfully push and pull from the command line on the NiFi Registry
> server using the ssh key.
>
> When I change a flow in the NiFi registry UI it commits successfully to
> the repo but fails to push the changes up to the bitbucket server.
>
> Has anyone else done this successfully? Or is there a mistake I've made in
> my configs?
>
> Thanks
>
> K
>
>
>


Pushing to bitbucket from nifi registry

2018-07-29 Thread Krish

Hello,

I've just upgraded from NiFi 1.2.0 to 1.7.1 and have been playing around
with the NiFi Registry features, which are great!


I've been trying to get it to push automatically to my bitbucket
repository, but I keep getting the following error: "Failed to push
commits to origin due to org.eclipse.jgit.api.errors.TransportException
ssh://git@bitbucketserver:1234/nifi-registry.git Auth Fail"


In my configuration I have the following settings in my providers.xml:

<flowPersistenceProvider>
    <class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
    <property name="Flow Storage Directory">./storage/nifi_registry_repo</property>
    <property name="Remote To Push">origin</property>
    <property name="Remote Access User">ssh_key_user</property>
    <property name="Remote Access Password">ssh key password</property>
</flowPersistenceProvider>


The nifi_registry_repo is a git repo which I have cloned down from my
bitbucket server.
I've set up an SSH key and loaded it into the bitbucket repo; I can
successfully push and pull from the command line on the NiFi Registry
server using the ssh key.


When I change a flow in the NiFi registry UI it commits successfully to 
the repo but fails to push the changes up to the bitbucket server.


Has anyone else done this successfully? Or is there a mistake I've made in
my configs?


Thanks

K