saving record into different tables?

2018-08-31 Thread l vic
Hi,
I have a JSON record that contains an array "mylist":
{
  "id": 0,
  "name": "Root",
  "mylist": [
    { "id": 10, "info": "2am-3am" },
    { "id": 11, "info": "3AM-4AM" },
    { "id": 12, "info": "4am-5am" }
  ]
}
I have to save the root data into one DB table and the array into another.
Can someone recommend an approach to splitting the record between two
different database writers?
Thank you,
V
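
No concrete flow was captured in this thread, but conceptually the split can be pictured like this (a plain-Python sketch, not NiFi code; in a flow this would typically map to something like EvaluateJsonPath or JoltTransformJSON for the root fields and SplitJson on $.mylist, each branch feeding its own database writer — the root_id foreign key below is an assumption about the target schema):

```python
import json

record = json.loads("""
{
  "id": 0,
  "name": "Root",
  "mylist": [
    { "id": 10, "info": "2am-3am" },
    { "id": 11, "info": "3AM-4AM" },
    { "id": 12, "info": "4am-5am" }
  ]
}
""")

# Root row for the parent table: everything except the nested array.
root_row = {k: v for k, v in record.items() if k != "mylist"}

# Child rows for the detail table, each carrying the parent id as a
# foreign key so the two tables can be joined later.
child_rows = [dict(item, root_id=record["id"]) for item in record["mylist"]]

print(root_row)
print(child_rows)
```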


Re: SAML based identity provider

2018-08-31 Thread Andy LoPresto
Hi Vijay,

Currently there are no community-supported SAML login identity providers. You 
can use the existing LDAP [1], Kerberos [2], and OIDC [3] implementations as 
examples on which to base your implementation. Login identity providers are not 
currently exposed as a first-class extension point, but you can certainly build a custom 
one and use it locally, even without submitting it for inclusion in the core 
project. Of course, this sounds like a valuable feature for the community, and 
we encourage contribution if possible.

We are open to rearchitecting the authentication and authorization mechanisms 
in NiFi, but cannot make changes that would break backward 
compatibility in minor version releases, because we follow semantic versioning 
[4]. Changes which alter the fundamental authentication story NiFi presents 
need to go in a major release (i.e. 2.0.0). NiFi strongly adheres to stable 
releases which follow the principle of least surprise.

If you have specific questions or need help with integrating the code, please 
feel free to reach out to the community here or on GitHub. You may also be 
interested in the developer mailing list at d...@nifi.apache.org 
 for more code-related questions and discussion. 
Thanks.


[1] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-ldap-iaa-providers-bundle/nifi-ldap-iaa-providers/src/main/java/org/apache/nifi/ldap/LdapProvider.java#L65
 

[2] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kerberos-iaa-providers-bundle/nifi-kerberos-iaa-providers/src/main/java/org/apache/nifi/kerberos/KerberosProvider.java
 

[3] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-security/src/main/java/org/apache/nifi/web/security/oidc/StandardOidcIdentityProvider.java#L76
 

[4] https://semver.org/ 


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Aug 31, 2018, at 3:40 PM, Curtis Ruck  wrote:
> 
> I've been trying to figure out how to improve this area of NiFi.  They 
> support OpenID Connect (OIDC), but when you combine it with a reverse 
> proxy or their default/hardcoded PKI configuration, it's near impossible to 
> use.
> 
> Ideally the entire authn/z stack needs rearchitecting for better modularity 
> for any decent SSO integration.  The current APIs were built around having a 
> writable authn/z store like LDAP/RDBMS. They are not designed for common SSO 
> workflows where users connect to NiFi and inherit NiFi permissions based on 
> their assertion/attributes.
> 
> On Fri, Aug 31, 2018, 6:14 PM Vijay Chhipa  > wrote:
> Hello,
> 
> I am setting up NiFi in the company, but the out-of-the-box authentication 
> modules are not an option for me.
> I would like to write a SAML based login identity provider,
> Is there one out there already ?
> 
> I am on NiFi 1.7.1, with Java 8, SAML 2.0,
> 
> What do I need to get started with writing a new  login identity provider? 
> Any examples, sample, or pointers are highly appreciated
> 
> Vijay
> 





Re: SAML based identity provider

2018-08-31 Thread Curtis Ruck
I've been trying to figure out how to improve this area of NiFi.  They
support OpenID Connect (OIDC), but when you combine it with a
reverse proxy or their default/hardcoded PKI configuration, it's near
impossible to use.

Ideally the entire authn/z stack needs rearchitecting for better modularity
for any decent SSO integration.  The current APIs were built around having
a writable authn/z store like LDAP/RDBMS. They are not designed for common
SSO workflows where users connect to NiFi and inherit NiFi permissions
based on their assertion/attributes.

On Fri, Aug 31, 2018, 6:14 PM Vijay Chhipa  wrote:

> Hello,
>
> I am setting up NiFi in the company, but the out-of-the-box authentication
> modules are not an option for me.
> I would like to write a SAML based login identity provider,
> Is there one out there already ?
>
> I am on NiFi 1.7.1, with Java 8, SAML 2.0,
>
> What do I need to get started with writing a new  login identity provider?
> Any examples, sample, or pointers are highly appreciated
>
> Vijay
>
>


SAML based identity provider

2018-08-31 Thread Vijay Chhipa
Hello,

I am setting up NiFi in the company, but the out-of-the-box authentication 
modules are not an option for me.
I would like to write a SAML-based login identity provider.
Is there one out there already?

I am on NiFi 1.7.1, with Java 8 and SAML 2.0.

What do I need to get started with writing a new login identity provider? Any 
examples, samples, or pointers are highly appreciated.

Vijay





Re: Secure NiFi cluster on kubernetes.

2018-08-31 Thread Peter Wilcsinszky
On Fri, 31 Aug 2018, 16:51 Varun Tomar,  wrote:

> Hi Peter,
>
> We started using NiFi as a StatefulSet last year but moved to a Deployment.
>
> - Our CI/CD tool, Spinnaker, does not support StatefulSets.
> - We have also customized logback.xml because a log-within-log issue was
> not getting parsed properly in ELK.
> - For ports and the cluster IP I pass them as arguments, so even if the pod
> reboots we don't have any issues.
>
Why do you need to pass an IP?

- we also use external zookeeper.
>
> I didn't find any benefit in running a StatefulSet.
>
> The only issue, as I said, is that if we restart any underlying node we get
> an extra node, and the old nodes do not get deleted.
>
With a StatefulSet you wouldn't have issues with that, and you would have
stable persistent volumes as well.


>
> Regards,
> Varun
>
> --
> *From:* Peter Wilcsinszky 
> *Sent:* Friday, August 31, 2018 2:50 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: Secure NiFi cluster on kubernetes.
>
> Hi Dnyaneshwar,
>
> as Andy mentioned we are working on running NiFi in Kubernetes but I'm not
> sure when it will be available publicly. Some pointers that can help by
> then:
>  - You should use a StatefulSet to manage NiFi pods
>  - Probably Helm charts are the most efficient way to get started
>  - I recommend using the official NiFi image and wrapping the original
> nifi.sh script from the Kubernetes pod spec similarly to how we do it in the
> Docker image [1]. Caveats: setting dynamic properties like
> nifi.web.http.host from the wrapper script is a good idea, but for more
> static properties like nifi.web.http.port you may want to use the config
> files directly as configmaps and do templating using Helm. This is
> especially true for more complex configurations like the authorizers.xml or
> the login-identity-providers.xml.
>  - Authorizations in NiFi can be configured for the initial cluster setup,
> but need to be done manually when you add a new node to the cluster beyond
> the initial cluster size. Also, these extra nodes should have a vanilla
> authorizations.xml to avoid conflicts when joining to the existing ones.
> You can use the wrapper script to decide which configmap to use when
> starting the container. Once the pod has started you still have to add the
> node and authorize it manually using the UI. There is ongoing work to make
> this more dynamic: [3]
>  - We use a Kubernetes deployment to run NiFi Toolkit's tls-toolkit in
> server mode. The NiFi pods have an init container that uses tls-toolkit in
> client mode to request and receive certificates from the CA server. The
> communication is protected using a shared secret that is generated inside
> the cluster on the fly, also you can further protect access to the CA using
> NetworkPolicies.
>  - You should avoid using the embedded Zookeeper, but you can use an
> already existing helm chart as a dependency to install it [4] (caveat: the
> image used by that chart is not recommended for production use)
>
> [1]
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh
> 
> [2]
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh#L23
> 
> [3] https://issues.apache.org/jira/browse/NIFI-5542
> 
> [4] https://github.com/helm/charts/tree/master/incubator/zookeeper
> 
>
> On Thu, Aug 30, 2018 at 10:42 PM Varun Tomar 
> wrote:
>
>> Hi Dnyaneshwar,
>>
>>
>>
>> We have NiFi running on k8s for around 8-10 months. We create the NiFi
>> cluster as part of CI/CD, and then there is a stage that does the template
>> deployment. Haven't faced any major issues. Just sometimes, if a node
>> reboots, the old cluster member in NiFi does not get cleaned up.
>>
>>
>>
>> Regards,
>>
>> Varun
>>
>>
>>
>> *From: *Andy LoPresto 
>> *Reply-To: *
>> *Date: *Thursday, August 30, 2018 at 10:23 AM
>> *To: *
>> *Subject: *Re: Secure NiFi cluster on kubernetes.
>>
>>
>>
>> Hi 

Re: Secure NiFi cluster on kubernetes.

2018-08-31 Thread Varun Tomar
Hi Peter,

We started using NiFi as a StatefulSet last year but moved to a Deployment.

- Our CI/CD tool, Spinnaker, does not support StatefulSets.
- We have also customized logback.xml because a log-within-log issue was not 
getting parsed properly in ELK.
- For ports and the cluster IP I pass them as arguments, so even if the pod 
reboots we don't have any issues.
- We also use an external ZooKeeper.

I didn't find any benefit in running a StatefulSet.

The only issue, as I said, is that if we restart any underlying node we get an 
extra node, and the old nodes do not get deleted.


Regards,
Varun


From: Peter Wilcsinszky 
Sent: Friday, August 31, 2018 2:50 AM
To: users@nifi.apache.org
Subject: Re: Secure NiFi cluster on kubernetes.

Hi Dnyaneshwar,

as Andy mentioned we are working on running NiFi in Kubernetes but I'm not sure 
when it will be available publicly. Some pointers that can help by then:
 - You should use a StatefulSet to manage NiFi pods
 - Probably Helm charts are the most efficient way to get started
 - I recommend using the official NiFi image and wrapping the original nifi.sh 
script from the Kubernetes pod spec similarly to how we do it in the Docker image 
[1]. Caveats: setting dynamic properties like nifi.web.http.host from the 
wrapper script is a good idea, but for more static properties like 
nifi.web.http.port you may want to use the config files directly as configmaps 
and do templating using Helm. This is especially true for more complex 
configurations like the authorizers.xml or the login-identity-providers.xml.
 - Authorizations in NiFi can be configured for the initial cluster setup, but 
need to be done manually when you add a new node to the cluster beyond the 
initial cluster size. Also, these extra nodes should have a vanilla 
authorizations.xml to avoid conflicts when joining to the existing ones. You 
can use the wrapper script to decide which configmap to use when starting the 
container. Once the pod has started you still have to add the node and 
authorize it manually using the UI. There is ongoing work to make this more 
dynamic: [3]
 - We use a Kubernetes deployment to run NiFi Toolkit's tls-toolkit in server 
mode. The NiFi pods have an init container that uses tls-toolkit in client mode 
to request and receive certificates from the CA server. The communication is 
protected using a shared secret that is generated inside the cluster on the 
fly, also you can further protect access to the CA using NetworkPolicies.
 - You should avoid using the embedded Zookeeper, but you can use an already 
existing helm chart as a dependency to install it [4] (caveat: the image used 
by that chart is not recommended for production use)

[1] 
https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh
[2] 
https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh#L23
[3] 
https://issues.apache.org/jira/browse/NIFI-5542
[4] 
https://github.com/helm/charts/tree/master/incubator/zookeeper
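
As a rough sketch of the init-container approach described above (all names, labels, and image tags are assumptions for illustration, and the tls-toolkit flags should be verified against `tls-toolkit.sh client` usage in your NiFi Toolkit version):

```yaml
# Illustrative StatefulSet fragment -- not a complete, tested manifest.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nifi
spec:
  serviceName: nifi
  replicas: 3
  selector:
    matchLabels:
      app: nifi
  template:
    metadata:
      labels:
        app: nifi
    spec:
      initContainers:
        - name: request-cert
          image: apache/nifi-toolkit:1.7.1     # hypothetical tag
          command: ["tls-toolkit.sh", "client",
                    "-c", "nifi-ca",           # CA server Service name
                    "-t", "$(CA_TOKEN)"]       # shared secret
          env:
            - name: CA_TOKEN
              valueFrom:
                secretKeyRef:
                  name: nifi-ca-token
                  key: token
      containers:
        - name: nifi
          image: apache/nifi:1.7.1
```

The init container writes the keystore/truststore to a shared volume (omitted here) before the NiFi container starts.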

On Thu, Aug 30, 2018 at 10:42 PM Varun Tomar 
mailto:varun.to...@zaplabs.com>> wrote:
Hi Dnyaneshwar,

We have NiFi running on k8s for around 8-10 months. We create the NiFi cluster 
as part of CI/CD, and then there is a stage that does the template deployment. 
Haven't faced any major issues. Just sometimes, if a node reboots, the old 
cluster member in NiFi does not get cleaned up.

Regards,
Varun

From: Andy LoPresto mailto:alopre...@apache.org>>
Reply-To: mailto:users@nifi.apache.org>>
Date: Thursday, August 30, 2018 at 10:23 AM
To: mailto:users@nifi.apache.org>>
Subject: Re: Secure NiFi cluster on kubernetes.

Hi Dnyaneshwar,

I know other users are working on the same thing, so yes, NiFi + Kubernetes 
will allow you to stand up secure clusters. There is ongoing work targeted for 
upcoming releases to make this easier and more performant (dynamic scaling, 
certificate 

Re: Nifi RecordReader for Json?

2018-08-31 Thread l vic
What would be the right processor to use with JsonTreeReader to get the JSON
attributes as name/value pairs and write them to a file?
Thanks again,

On Thu, Aug 30, 2018 at 2:09 PM Otto Fowler  wrote:

> What he said.
>
>
> On August 30, 2018 at 13:31:54, Joe Witt (joe.w...@gmail.com) wrote:
>
> Wow!
>
> My apologies for the really bad response I gave linking to the same
> article you mentioned. I should be more careful when
> reading/responding on the phone!
>
> Thanks
> On Thu, Aug 30, 2018 at 11:01 AM Matt Burgess 
> wrote:
> >
> > V,
> >
> > Currently NiFi does not support specifying a schema in JSONSchema
> > format, you'll want to convert that to an Avro schema for use in
> > JsonTreeReader. I don't know JSONSchema that well so I'm not sure if
> > that "stats" schema is supposed to be included in the outgoing object.
> > I ran it through a utility-under-development [1] and got the following
> > Avro schema out:
> >
> >
> {"type":"record","name":"record0","fields":[{"name":"create_date","type":"long"},{"name":"id","type":"string"}]}
>
> >
> > This doesn't add the "stats" record and specifies "create_date" as a
> > long versus a BigInteger. I think you might want to use an Avro
> > logical type of "decimal" [2] depending on what the value is in the
> > actual JSON object:
> >
> > {"type":"record","name":"record0","fields":[
> > {"name":"create_date","type": {"type": "bytes","logicalType":
> > "decimal","precision": 12,"scale": 0}},
> > {"name":"id","type":"string"}
> > ]}
> >
> > If you have a stats object present, this might work:
> >
> > {"type":"record","name":"record0","fields":[
> > {"name": "stats", "type" :
> >
> {"type":"record","name":"statsRecord","fields":[{"name":"id","type":"string"},{"name":"bin_qualifier","type":"string"}]}},
>
> > {"name":"create_date","type": {"type": "bytes","logicalType":
> > "decimal","precision": 12,"scale": 0}},
> > {"name":"id","type":"string"}
> > ]}
> >
> > I didn't validate or try these so there may be typos or other
> > (hopefully minor) mistakes.
> >
> > Regards,
> > Matt
> >
> > [1] https://github.com/fge/json-schema-avro
> > [2] https://avro.apache.org/docs/1.8.2/spec.html#Decimal
> >
> > On Thu, Aug 30, 2018 at 9:54 AM l vic  wrote:
> > >
> > > I have json file for the schema that looks like the following:
> > >
> > > {
> > > "$schema": "http://json-schema.org/draft-04/schema#;,
> > > "definitions": {
> > > "stats": {
> > > "type": "object",
> > > "additionalProperties": false,
> > > "properties": {
> > > "id": {
> > > "type": "string"
> > > },
> > > "bin_qualifier": {
> > > "type": "string"
> > > }
> > > }
> > > }
> > > },
> > > "additionalProperties": false,
> > > "description": "attributes",
> > > "type": "object",
> > > "properties": {
> > > "id": {
> > > "type": "string",
> > > "required": true,
> > > },
> > > "create_date": {
> > > "type": "integer",
> > > "javaType": "java.math.BigInteger",
> > > "required": true
> > > }
> > > }
> > > }
> > >
> > >
> > > How can I add this schema for JsonTreeReader?
> > >
> > > On Thu, Aug 30, 2018 at 9:02 AM Otto Fowler 
> wrote:
> > >>
> > >> The record readers are services, that processors use.
> > >> When you use a *Record* processor, you will have to select a Reader
> and a Writer Service, or create one ( which you can do through the UI ).
> > >> https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
> > >>
> > >>
> > >> On August 30, 2018 at 08:48:08, l vic (lvic4...@gmail.com) wrote:
> > >>
> > >> So, where's JsonTreeReader? I am on nifi-1.7.1-RC1 and i don't see it
> in the list of available processors...
> > >> Thanks,
> > >> V
> > >>
> > >> On Thu, Aug 30, 2018 at 5:31 AM Sivaprasanna <
> sivaprasanna...@gmail.com> wrote:
> > >>>
> > >>> Hi. Just like CSVRecordReader, we have record reader service for
> JSON. It's called JsonTreeReader. You can use AvroSchemaRegistry and
> provide an Avro schema (usually generated through InferAvroSchema
> processor) for your JSON. Refer:
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.7.1/org.apache.nifi.json.JsonTreeReader/index.html
> > >>>
> > >>> -
> > >>> Sivaprasanna
> > >>>
> > >>> On Thu, 30 Aug 2018 at 2:21 PM, l vic  wrote:
> > 
> >  I need to save two different json messages according to json
> schemas available for each to different relational database tables.
> >  I saw this blog:
> >  https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
> >  with example using CSVRecordReader for csv->json transformation.
> >  but what would be RecordReader for schema-based transformation from
> json? Is this a valid approach, or what would be best approach to solve
> this problem?
> >  I am using: nifi-1.7.1-RC1...
> >  Thank you,
>
>
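
For reference, Matt's first suggested schema with the decimal logical type, laid out as a well-formed JSON document (a sketch only; the precision and scale are the values proposed above, not requirements of your data):

```python
import json

# The Avro schema proposed in the thread, with the "decimal" logical type
# for create_date (precision/scale are illustrative -- adjust to your data).
avro_schema = {
    "type": "record",
    "name": "record0",
    "fields": [
        {
            "name": "create_date",
            "type": {"type": "bytes", "logicalType": "decimal",
                     "precision": 12, "scale": 0},
        },
        {"name": "id", "type": "string"},
    ],
}

# JsonTreeReader takes this as schema text, e.g. registered in an
# AvroSchemaRegistry or pasted via the "Schema Text" access strategy.
schema_text = json.dumps(avro_schema)
print(schema_text)
```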


Re: Best practices for running Apache NiFi in production in a Docker container

2018-08-31 Thread Peter Wilcsinszky
Hi,

I haven't done extensive research in this area but ran through the articles
and also found another one [1]. From what I understand,
UseCGroupMemoryLimitForHeap is just the dynamic version of setting memory
limits manually via Xmx and Xms, which the NiFi start script currently does
explicitly. In an environment where sizing should be done more dynamically,
UseCGroupMemoryLimitForHeap with a proper MaxRAMFraction should be used, but
for caveats check the comments here: [1] and here: [2].
(My understanding: MaxRAMFraction=1 is considered unsafe;
MaxRAMFraction=2 leaves half the memory unused.)

[1] https://banzaicloud.com/blog/java-resource-limits/
[2]
https://stackoverflow.com/questions/49854237/is-xxmaxramfraction-1-safe-for-production-in-a-containered-environment
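
For concreteness, the flags could be wired into NiFi's conf/bootstrap.conf roughly like this (a sketch only: the java.arg numbering is illustrative and must stay unique within your file, and these options are experimental on JDK 8u131+ -- later JDKs replace them with -XX:+UseContainerSupport):

```
# Illustrative bootstrap.conf fragment. Comment out the static heap args
# if you let the JVM size itself from the cgroup limit:
#java.arg.2=-Xms512m
#java.arg.3=-Xmx512m
# cgroup-aware heap sizing (experimental VM options):
java.arg.13=-XX:+UnlockExperimentalVMOptions
java.arg.14=-XX:+UseCGroupMemoryLimitForHeap
java.arg.15=-XX:MaxRAMFraction=2   # heap may use up to 1/2 of the container limit
```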


On Thu, Aug 30, 2018 at 7:54 PM Joe Percivall  wrote:

> Hey everyone,
>
> I was recently searching for a best practice guide for running a
> production instance of Apache NiFi within a Docker container and couldn't
> find anything specific other than the normal guidance for best practices of
> a high-performance instance[1]. I did expand my search for best practices
> on running the JVM within a container and found a couple good
> articles[2][3]. The first of which explains why the JVM will take up more
> than is set via "Xmx" and the second is about 2 JVM options which were
> backported from Java 9 to JDK 8u131 specifically for configuring the JVM
> heap for running in a "VM".
>
> So with that, a couple questions:
> 1: Does anyone have any best practices or lessons learned specifically for
> running NiFi in a container?
> 2:  "UseCGroupMemoryLimitForHeap" and "MaxRAMFraction" are technically
> "Experimental VM Options", has anyone used them in practice?
>
> [1]
> https://community.hortonworks.com/articles/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html
>
> [2]
> https://developers.redhat.com/blog/2017/04/04/openjdk-and-containers/#more-433899
> [3]
> https://blog.csanchez.org/2017/05/31/running-a-jvm-in-a-container-without-getting-killed/
>
> Thanks,
> Joe
> --
> *Joe Percivall*
> linkedin.com/in/Percivall
> e: jperciv...@apache.com
>