[jira] [Commented] (NIFI-786) Add other supporting options for configuring credentials for AWS processors

2015-10-06 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945152#comment-14945152
 ] 

Joseph Witt commented on NIFI-786:
--

Interesting idea!  Curious to hear [~mpayne2] take on it.

> Add other supporting options for configuring credentials for AWS processors
> ---
>
> Key: NIFI-786
> URL: https://issues.apache.org/jira/browse/NIFI-786
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 0.3.0
>Reporter: Michael Kobit
>Priority: Minor
>
> I was looking at https://issues.apache.org/jira/browse/NIFI-770 and looked at 
> how the AWS processors credentials are currently configured. As a NFM you 
> have a few options with the properties right now:
> 1) set basic, static credentials
> 2) set a credentials properties filepath
> 3) set neither, use anonymous credentials
> I think it would be better if each AWS processor could rely on a 
> ControllerService that returns an `AWSCredentialsProvider` (instead of 
> `AWSCredentials`), which would allow any of the possible implementations to 
> be used rather than relying on static credentials. `*Provider` 
> implementations can be refreshed and can also support other, more complicated 
> implementations, but already have built-in support for the static and 
> properties-file credentials that NiFi provides today.
> My thinking is that the controller service would be something like
> public interface AwsCredentialsProviderService extends ControllerService {
>   AWSCredentialsProvider getCredentialsProvider();
> }
> and you could have `StaticAwsCredentialsProviderService`, 
> `PropertiesFileAwsCredentialsProviderService`, and 
> `AnonymousAwsCredentialsProviderService` to provide the functionality that is 
> supported right now. Additional credential providers could be added later, as 
> there are a bunch more AWS-provided versions that I think could fit in well.
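
The proposal above can be illustrated with a minimal sketch. It is not the patch for this ticket: `Credentials`, `CredentialsProvider`, and `CredentialsProviderService` are local stand-ins for the AWS SDK's `AWSCredentials` and `AWSCredentialsProvider` and for the proposed service interface (which would extend NiFi's `ControllerService`), declared here only so the example compiles on its own. All other class names are hypothetical.

```java
import java.util.Objects;

// Local stand-ins for the AWS SDK types named in the issue (AWSCredentials and
// AWSCredentialsProvider), declared here so the sketch compiles without the SDK.
interface Credentials {
    String accessKey();
    String secretKey();
}

interface CredentialsProvider {
    Credentials getCredentials();
    void refresh();
}

// Stand-in for the proposed service interface; the real one would extend
// NiFi's ControllerService.
interface CredentialsProviderService {
    CredentialsProvider getCredentialsProvider();
}

// Immutable credentials holder; null keys model anonymous access (an assumption).
final class SimpleCredentials implements Credentials {
    private final String accessKey;
    private final String secretKey;

    SimpleCredentials(String accessKey, String secretKey) {
        this.accessKey = accessKey;
        this.secretKey = secretKey;
    }

    @Override public String accessKey() { return accessKey; }
    @Override public String secretKey() { return secretKey; }
}

// Provider that always hands back the same credentials; refresh is a no-op.
final class FixedCredentialsProvider implements CredentialsProvider {
    private final Credentials credentials;

    FixedCredentialsProvider(Credentials credentials) {
        this.credentials = credentials;
    }

    @Override public Credentials getCredentials() { return credentials; }
    @Override public void refresh() { /* fixed credentials never change */ }
}

// Covers option 1 from the issue: basic, static credentials.
final class StaticCredentialsProviderService implements CredentialsProviderService {
    private final CredentialsProvider provider;

    StaticCredentialsProviderService(String accessKey, String secretKey) {
        Objects.requireNonNull(accessKey, "accessKey");
        Objects.requireNonNull(secretKey, "secretKey");
        this.provider = new FixedCredentialsProvider(new SimpleCredentials(accessKey, secretKey));
    }

    @Override public CredentialsProvider getCredentialsProvider() { return provider; }
}

// Covers option 3 from the issue: neither property set, anonymous credentials.
final class AnonymousCredentialsProviderService implements CredentialsProviderService {
    private final CredentialsProvider provider =
            new FixedCredentialsProvider(new SimpleCredentials(null, null));

    @Override public CredentialsProvider getCredentialsProvider() { return provider; }
}

final class CredentialsServiceDemo {
    // What a processor would do: ask the service, never read key properties itself.
    static String describe(CredentialsProviderService service) {
        Credentials c = service.getCredentialsProvider().getCredentials();
        return c.accessKey() == null ? "anonymous" : "key:" + c.accessKey();
    }
}
```

A processor written against the service interface does not need to change when a new provider type is added, which is the main point of the proposal.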



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (NIFI-817) Create Processors to interact with HBase

2015-10-06 Thread Bryan Bende (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Bende reassigned NIFI-817:


Assignee: Bryan Bende  (was: Mark Payne)

> Create Processors to interact with HBase
> 
>
> Key: NIFI-817
> URL: https://issues.apache.org/jira/browse/NIFI-817
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Bryan Bende
> Fix For: 0.4.0
>
> Attachments: 
> 0001-NIFI-817-Initial-implementation-of-HBase-processors.patch
>
>






[jira] [Commented] (NIFI-817) Create Processors to interact with HBase

2015-10-06 Thread Bryan Bende (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945138#comment-14945138
 ] 

Bryan Bende commented on NIFI-817:
--

All, going to pick up where Mark left off and try to make progress on this 
ticket for 0.4.0...

On the extraction side of things, here is what I gathered from reading the 
above discussion...
* The GetHBase processor in the patch saves state on a single node, but needs 
to also save state across the cluster, most likely similar to what we do in 
ListHDFS with the distributed cache
* Would like a property/properties to specify columns and column families to 
return, and possibly filters as well
* Consider using Avro as an output mechanism to provide a schema for the results
* Consider using a replication end-point to stream WALs

I looked at the replication endpoint a little bit and it does seem like an 
interesting concept. My understanding is that you deploy a jar containing the 
implementation of your endpoint to the lib directory of every region server. 
The endpoint is then responsible for sending to the other system, and some 
code also has to be run to register and turn on your endpoint. The best 
example I found was this:
https://github.com/risdenk/hbase-custom-replication-endpoint-example

We would have to figure out how this replication endpoint would send data 
to NiFi. The first thing that comes to mind is the SiteToSiteClient, but I 
haven't really thought this through. I'm wondering if we should proceed for 
now with the GetHBase processor (with some of the improvements above) and 
track this replication idea as another ticket, since it would likely have a 
much different feel than a regular processor. Thoughts?

The put side of things seems to be more straightforward. I refactored the 
processor in the current patch to pull in a configurable batch of FlowFiles on 
each call to onTrigger, group them by table, and make one call to 
table.put(List<Put>) per table. In the best case, all FlowFiles are for the 
same table and it is a single call; in the worst case, they are all for 
different tables and it is no different than processing each FlowFile one at a 
time.
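
The grouping step described above can be sketched roughly as follows. This is not the code from the patch: `PendingPut` and `TableWriter` are hypothetical stand-ins for a FlowFile with its target table and for the HBase client call, so the example runs without NiFi or HBase on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PutBatchingSketch {

    // Hypothetical stand-in for a FlowFile that knows its target table and row content.
    static final class PendingPut {
        final String table;
        final String row;

        PendingPut(String table, String row) {
            this.table = table;
            this.row = row;
        }
    }

    // Hypothetical stand-in for the HBase client: one invocation writes many rows to one table.
    interface TableWriter {
        void putAll(String table, List<String> rows);
    }

    /** Groups the batch by table, issues one write per table, and returns the number of writes. */
    static int flush(List<PendingPut> batch, TableWriter writer) {
        // LinkedHashMap keeps tables in first-seen order, which makes the behavior predictable.
        Map<String, List<String>> byTable = new LinkedHashMap<>();
        for (PendingPut p : batch) {
            byTable.computeIfAbsent(p.table, t -> new ArrayList<>()).add(p.row);
        }
        for (Map.Entry<String, List<String>> e : byTable.entrySet()) {
            writer.putAll(e.getKey(), e.getValue());
        }
        return byTable.size();
    }
}
```

The return value makes the best and worst cases visible: a batch that targets one table produces a single write, while a batch of N distinct tables produces N writes.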


> Create Processors to interact with HBase
> 
>
> Key: NIFI-817
> URL: https://issues.apache.org/jira/browse/NIFI-817
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Bryan Bende
> Fix For: 0.4.0
>
> Attachments: 
> 0001-NIFI-817-Initial-implementation-of-HBase-processors.patch
>
>






[jira] [Commented] (NIFI-786) Add other supporting options for configuring credentials for AWS processors

2015-10-06 Thread Michael Kobit (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945142#comment-14945142
 ] 

Michael Kobit commented on NIFI-786:


Any thoughts on this for the 1.0.0 version?

It seems like there could be 2 ways to do this:

1. Maintain backwards compatibility by checking whether the `Access Key` and 
`Secret Key`, or the `Credentials File`, are set, and if not, use this new 
ControllerService.
2. Get rid of those properties and only use this ControllerService.

Since 1.0.0 (I believe) would be introducing some breaking changes, would this 
be considered for it?
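
As a rough illustration of option 1, the lookup order might look like the sketch below. The class, enum, and parameter names are hypothetical, and the final fallback to anonymous credentials is an assumption taken from the current behavior described in the issue, not something decided in this thread.

```java
final class CredentialSourceSketch {

    enum Source { STATIC_KEYS, CREDENTIALS_FILE, CONTROLLER_SERVICE, ANONYMOUS }

    /**
     * Picks the credential source for a processor. The legacy properties win
     * when they are set, so existing flows keep working unchanged.
     */
    static Source resolve(String accessKey, String secretKey,
                          String credentialsFile, boolean serviceConfigured) {
        if (isSet(accessKey) && isSet(secretKey)) {
            return Source.STATIC_KEYS;
        }
        if (isSet(credentialsFile)) {
            return Source.CREDENTIALS_FILE;
        }
        if (serviceConfigured) {
            return Source.CONTROLLER_SERVICE;
        }
        // Assumption: nothing configured behaves as it does today.
        return Source.ANONYMOUS;
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }
}
```

Option 2 would reduce this to the last two branches, at the cost of breaking flows that still set the legacy properties.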

> Add other supporting options for configuring credentials for AWS processors
> ---
>
> Key: NIFI-786
> URL: https://issues.apache.org/jira/browse/NIFI-786
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 0.3.0
>Reporter: Michael Kobit
>Priority: Minor
>
> I was looking at https://issues.apache.org/jira/browse/NIFI-770 and looked at 
> how the AWS processors credentials are currently configured. As a NFM you 
> have a few options with the properties right now:
> 1) set basic, static credentials
> 2) set a credentials properties filepath
> 3) set neither, use anonymous credentials
> I think it would be better if each AWS processor could rely on a 
> ControllerService that returns an `AWSCredentialsProvider` (instead of 
> `AWSCredentials`), which would allow any of the possible implementations to 
> be used rather than relying on static credentials. `*Provider` 
> implementations can be refreshed and can also support other, more complicated 
> implementations, but already have built-in support for the static and 
> properties-file credentials that NiFi provides today.
> My thinking is that the controller service would be something like
> public interface AwsCredentialsProviderService extends ControllerService {
>   AWSCredentialsProvider getCredentialsProvider();
> }
> and you could have `StaticAwsCredentialsProviderService`, 
> `PropertiesFileAwsCredentialsProviderService`, and 
> `AnonymousAwsCredentialsProviderService` to provide the functionality that is 
> supported right now. Additional credential providers could be added later, as 
> there are a bunch more AWS-provided versions that I think could fit in well.





[jira] [Comment Edited] (NIFI-786) Add other supporting options for configuring credentials for AWS processors

2015-10-06 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945152#comment-14945152
 ] 

Joseph Witt edited comment on NIFI-786 at 10/6/15 3:10 PM:
---

Interesting idea!  Curious to hear [~markap14]'s take on it.


was (Author: joewitt):
Interesting idea!  Curious to hear [~mpayne2] take on it.

> Add other supporting options for configuring credentials for AWS processors
> ---
>
> Key: NIFI-786
> URL: https://issues.apache.org/jira/browse/NIFI-786
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 0.3.0
>Reporter: Michael Kobit
>Priority: Minor
>
> I was looking at https://issues.apache.org/jira/browse/NIFI-770 and looked at 
> how the AWS processors credentials are currently configured. As a NFM you 
> have a few options with the properties right now:
> 1) set basic, static credentials
> 2) set a credentials properties filepath
> 3) set neither, use anonymous credentials
> I think it would be better if each AWS processor could rely on a 
> ControllerService that returns an `AWSCredentialsProvider` (instead of 
> `AWSCredentials`), which would allow any of the possible implementations to 
> be used rather than relying on static credentials. `*Provider` 
> implementations can be refreshed and can also support other, more complicated 
> implementations, but already have built-in support for the static and 
> properties-file credentials that NiFi provides today.
> My thinking is that the controller service would be something like
> public interface AwsCredentialsProviderService extends ControllerService {
>   AWSCredentialsProvider getCredentialsProvider();
> }
> and you could have `StaticAwsCredentialsProviderService`, 
> `PropertiesFileAwsCredentialsProviderService`, and 
> `AnonymousAwsCredentialsProviderService` to provide the functionality that is 
> supported right now. Additional credential providers could be added later, as 
> there are a bunch more AWS-provided versions that I think could fit in well.





[jira] [Updated] (NIFI-631) Create ListFile and FetchFile processors

2015-10-06 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-631:

Attachment: (was: 
0001-NIFI-631-Initial-implementation-of-FetchFile-process.patch)

> Create ListFile and FetchFile processors
> 
>
> Key: NIFI-631
> URL: https://issues.apache.org/jira/browse/NIFI-631
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Mark Payne
>Assignee: Joe Skora
>
> This pair of Processors will provide several benefits over the existing 
> GetFile processor:
> 1. Currently, GetFile will continually pull the same files if the "Keep 
> Source File" property is set to true. There is no way to pull the file and 
> leave it in the directory without continually pulling the same file. We could 
> implement state here, but it would either be a huge amount of state to 
> remember everything pulled or it would have to always pull the oldest file 
> first so that we can maintain just the Last Modified Date of the last file 
> pulled plus all files with the same Last Modified Date that have already been 
> pulled.
> 2. If pulling from network-attached storage such as NFS, this would allow a 
> single processor to run ListFile and then distribute those FlowFiles to the 
> cluster so that the cluster can share the work of pulling the data.
> 3. There are use cases when we may want to pull a specific file (for example, 
> in conjunction with ProcessHttpRequest/ProcessHttpResponse) rather than just 
> pull all files in a directory. GetFile does not support this.
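
The compact state described in item 1 (the latest Last Modified Date seen, plus the names already listed at exactly that date) can be sketched as below. This is only an illustration of that idea, not the ListFile implementation; `ListingStateSketch` and `FileInfo` are hypothetical names, and timestamps are plain `long` values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ListingStateSketch {

    // Hypothetical stand-in for a directory entry.
    static final class FileInfo {
        final String name;
        final long lastModified;

        FileInfo(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    // The whole state: one timestamp plus the names already listed at that timestamp.
    private long latestTimestamp = -1L;
    private final Set<String> namesAtLatest = new HashSet<>();

    /** Returns entries not listed before, oldest first, and advances the state. */
    List<FileInfo> listNew(List<FileInfo> directory) {
        List<FileInfo> fresh = new ArrayList<>();
        for (FileInfo f : directory) {
            boolean newer = f.lastModified > latestTimestamp;
            boolean sameInstantUnseen =
                    f.lastModified == latestTimestamp && !namesAtLatest.contains(f.name);
            if (newer || sameInstantUnseen) {
                fresh.add(f);
            }
        }
        fresh.sort(Comparator.comparingLong((FileInfo f) -> f.lastModified));
        for (FileInfo f : fresh) {
            if (f.lastModified > latestTimestamp) {
                // A newer timestamp supersedes everything remembered so far.
                latestTimestamp = f.lastModified;
                namesAtLatest.clear();
            }
            namesAtLatest.add(f.name);
        }
        return fresh;
    }
}
```

Note the trade-off the description points out: a file that appears later with a Last Modified Date older than the stored timestamp is never listed, which is why this approach only works when files are picked up oldest first.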





[jira] [Commented] (NIFI-673) Create ListSFTP, FetchSFTP Processors

2015-10-06 Thread Mark Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945038#comment-14945038
 ] 

Mark Payne commented on NIFI-673:
-

I deleted and replaced the second patch because I realized that there was an 
issue in how it was handling the "Completion Strategy".

> Create ListSFTP, FetchSFTP Processors
> -
>
> Key: NIFI-673
> URL: https://issues.apache.org/jira/browse/NIFI-673
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 0.4.0
>
> Attachments: 
> 0001-NIFI-673-Initial-implementation-of-ListSFTP-FetchSFT.patch, 
> 0002-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch
>
>
> This will allow us to pull a listing from a single primary node and then 
> distribute the work of pulling and processing the data across the cluster.





[jira] [Updated] (NIFI-631) Create ListFile and FetchFile processors

2015-10-06 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-631:

Attachment: 0001-NIFI-631-Initial-implementation-of-FetchFile-process.patch

> Create ListFile and FetchFile processors
> 
>
> Key: NIFI-631
> URL: https://issues.apache.org/jira/browse/NIFI-631
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Mark Payne
>Assignee: Joe Skora
> Attachments: 
> 0001-NIFI-631-Initial-implementation-of-FetchFile-process.patch
>
>
> This pair of Processors will provide several benefits over the existing 
> GetFile processor:
> 1. Currently, GetFile will continually pull the same files if the "Keep 
> Source File" property is set to true. There is no way to pull the file and 
> leave it in the directory without continually pulling the same file. We could 
> implement state here, but it would either be a huge amount of state to 
> remember everything pulled or it would have to always pull the oldest file 
> first so that we can maintain just the Last Modified Date of the last file 
> pulled plus all files with the same Last Modified Date that have already been 
> pulled.
> 2. If pulling from network-attached storage such as NFS, this would allow a 
> single processor to run ListFile and then distribute those FlowFiles to the 
> cluster so that the cluster can share the work of pulling the data.
> 3. There are use cases when we may want to pull a specific file (for example, 
> in conjunction with ProcessHttpRequest/ProcessHttpResponse) rather than just 
> pull all files in a directory. GetFile does not support this.





[jira] [Updated] (NIFI-673) Create ListSFTP, FetchSFTP Processors

2015-10-06 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-673:

Attachment: 0002-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch

> Create ListSFTP, FetchSFTP Processors
> -
>
> Key: NIFI-673
> URL: https://issues.apache.org/jira/browse/NIFI-673
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 0.4.0
>
> Attachments: 
> 0001-NIFI-673-Initial-implementation-of-ListSFTP-FetchSFT.patch, 
> 0002-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch
>
>
> This will allow us to pull a listing from a single primary node and then 
> distribute the work of pulling and processing the data across the cluster.





[jira] [Updated] (NIFI-673) Create ListSFTP, FetchSFTP Processors

2015-10-06 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-673:

Attachment: (was: 
0002-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch)

> Create ListSFTP, FetchSFTP Processors
> -
>
> Key: NIFI-673
> URL: https://issues.apache.org/jira/browse/NIFI-673
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 0.4.0
>
> Attachments: 
> 0001-NIFI-673-Initial-implementation-of-ListSFTP-FetchSFT.patch, 
> 0002-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch
>
>
> This will allow us to pull a listing from a single primary node and then 
> distribute the work of pulling and processing the data across the cluster.





[jira] [Updated] (NIFI-673) Create ListSFTP, FetchSFTP Processors

2015-10-06 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-673:

Attachment: 0001-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch

> Create ListSFTP, FetchSFTP Processors
> -
>
> Key: NIFI-673
> URL: https://issues.apache.org/jira/browse/NIFI-673
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 0.4.0
>
> Attachments: 
> 0001-NIFI-673-Initial-implementation-of-ListSFTP-FetchSFT.patch, 
> 0002-NIFI-673-Added-Completion-Strategy-to-FetchSFTP.patch
>
>
> This will allow us to pull a listing from a single primary node and then 
> distribute the work of pulling and processing the data across the cluster.


