[jira] [Commented] (NIFI-6009) Add Scan Kudu Processor

2019-08-13 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906453#comment-16906453
 ] 

Sandish Kumar HN commented on NIFI-6009:


[~joewitt] Done!

> Add Scan Kudu Processor 
> 
>
> Key: NIFI-6009
> URL: https://issues.apache.org/jira/browse/NIFI-6009
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Major
>  Labels: kudu, nosql
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> A ScanKudu processor with a list of predicates to filter the Kudu table
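
For reference, a minimal sketch of a predicate-based scan with the Apache Kudu Java client, i.e. the kind of filtering such a processor would wrap; the master address, table, and column names are placeholders, not anything from the actual patch:

{code:java}
import org.apache.kudu.Schema;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;
import org.apache.kudu.client.KuduPredicate;
import org.apache.kudu.client.KuduScanner;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.RowResult;

public class ScanKuduSketch {
    public static void main(String[] args) throws KuduException {
        KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
        KuduTable table = client.openTable("users");
        Schema schema = table.getSchema();

        // Predicate pushed down to the tablet servers: age > 30
        KuduPredicate agePredicate = KuduPredicate.newComparisonPredicate(
                schema.getColumn("age"), KuduPredicate.ComparisonOp.GREATER, 30);

        KuduScanner scanner = client.newScannerBuilder(table)
                .addPredicate(agePredicate)
                .build();
        while (scanner.hasMoreRows()) {
            for (RowResult row : scanner.nextRows()) {
                System.out.println(row.rowToString());
            }
        }
        client.close();
    }
}
{code}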



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (NIFI-6552) Kudu Processor: Kudu Put Operations

2019-08-13 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-6552:
---
Status: Patch Available  (was: In Progress)

[https://github.com/apache/nifi/pull/3610]

> Kudu Processor: Kudu Put Operations 
> 
>
> Key: NIFI-6552
> URL: https://issues.apache.org/jira/browse/NIFI-6552
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Adding Kudu operations like Delete, Update, and Upsert to the Kudu Put processor
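
For context, a rough sketch of the Delete/Update/Upsert operations the Kudu Java client exposes alongside Insert; the table and column names are made up for illustration and are not taken from the pull request:

{code:java}
import org.apache.kudu.client.Delete;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.Update;
import org.apache.kudu.client.Upsert;

public class KuduOperationsSketch {
    public static void main(String[] args) throws KuduException {
        KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
        KuduTable table = client.openTable("metrics");
        KuduSession session = client.newSession();

        // UPSERT: insert the row, or update it if the key already exists
        Upsert upsert = table.newUpsert();
        upsert.getRow().addString("host", "node-1");
        upsert.getRow().addLong("value", 42L);
        session.apply(upsert);

        // UPDATE: modify an existing row identified by its key
        Update update = table.newUpdate();
        update.getRow().addString("host", "node-1");
        update.getRow().addLong("value", 43L);
        session.apply(update);

        // DELETE: remove the row with the given key
        Delete delete = table.newDelete();
        delete.getRow().addString("host", "node-1");
        session.apply(delete);

        session.close();
        client.close();
    }
}
{code}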



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (NIFI-6552) Kudu Processor: Kudu Put Operations

2019-08-13 Thread Sandish Kumar HN (JIRA)
Sandish Kumar HN created NIFI-6552:
--

 Summary: Kudu Processor: Kudu Put Operations 
 Key: NIFI-6552
 URL: https://issues.apache.org/jira/browse/NIFI-6552
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Sandish Kumar HN


Adding Kudu operations like Delete, Update, and Upsert to the Kudu Put processor



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (NIFI-6552) Kudu Processor: Kudu Put Operations

2019-08-13 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-6552:
--

Assignee: Sandish Kumar HN

> Kudu Processor: Kudu Put Operations 
> 
>
> Key: NIFI-6552
> URL: https://issues.apache.org/jira/browse/NIFI-6552
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Adding Kudu operations like Delete, Update, and Upsert to the Kudu Put processor



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (NIFI-6200) Should be able to configure client.id in PublishKafka processors

2019-06-08 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-6200:
--

Assignee: Sandish Kumar HN

> Should be able to configure client.id in PublishKafka processors
> 
>
> Key: NIFI-6200
> URL: https://issues.apache.org/jira/browse/NIFI-6200
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.9.2
>Reporter: Nimrod Avni
>Assignee: Sandish Kumar HN
>Priority: Major
>  Labels: ClientID, kafka, nifi, publish
>
> When using the PublishKafka processors (any version) there is no way to set 
> the client id ([client.id|http://client.id/]) property when publishing to a 
> topic. The PutKafka processor has this option (under the Client Name 
> property), but the Kafka version it is set to work with is old (0.8.x) 
> relative to the PublishKafka processors (0_10, 0_11, 1_0, 2_0).
> I believe there should be an option to configure the client id property from 
> the processor properties. 
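
For context, a minimal sketch of how the underlying Kafka producer accepts a client id, which is the client property this issue asks the processor to expose; the broker address and values are placeholders:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClientIdSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // The client.id property this issue asks to make configurable:
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "nifi-publish-kafka-1");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}
{code}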



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-3108) PublishKafka Should Support Custom Partitioners

2019-04-24 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-3108:
--

Assignee: Sandish Kumar HN  (was: Koji Kawamura)

> PublishKafka Should Support Custom Partitioners
> ---
>
> Key: NIFI-3108
> URL: https://issues.apache.org/jira/browse/NIFI-3108
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Bryan Bende
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> Currently PublishKafka/PublishKafka_0_10 have a property for choosing the 
> partitioner which equates to setting the 'partitioner.class' property on the 
> Kafka client, but the property on the processor only allows selecting from 
> "Default Partitioner" and "Round-Robin" partitioner which are provided by the 
> Kafka client.
> If someone wants to implement their own partitioner, there currently isn't an 
> easy way to add it to NiFi's classpath, and there is no way to specify it in 
> the processor.
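
For illustration, a minimal custom partitioner against the Kafka client's Partitioner interface; the hashing scheme below is an arbitrary example, not a proposed default:

{code:java}
import java.util.Arrays;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

public class CustomPartitionerSketch implements Partitioner {
    @Override
    public void configure(Map<String, ?> configs) {
        // read any custom settings here
    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        // arbitrary example: stable hash of the serialized key
        return keyBytes == null ? 0 : Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    @Override
    public void close() {
    }
}
{code}

A producer picks such a class up via the partitioner.class property, which is the Kafka client setting the processor's partitioner property maps to; the open question in this issue is getting the class onto NiFi's classpath and naming it in the processor.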



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-3108) PublishKafka Should Support Custom Partitioners

2019-04-24 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825475#comment-16825475
 ] 

Sandish Kumar HN edited comment on NIFI-3108 at 4/24/19 8:05 PM:
-

Sounds like a good design, and less confusion around selecting the Kafka 
partition.  [~bende] 


was (Author: sanysand...@gmail.com):
sounds like a good design [~bende]

> PublishKafka Should Support Custom Partitioners
> ---
>
> Key: NIFI-3108
> URL: https://issues.apache.org/jira/browse/NIFI-3108
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Bryan Bende
>Assignee: Koji Kawamura
>Priority: Minor
>
> Currently PublishKafka/PublishKafka_0_10 have a property for choosing the 
> partitioner which equates to setting the 'partitioner.class' property on the 
> Kafka client, but the property on the processor only allows selecting from 
> "Default Partitioner" and "Round-Robin" partitioner which are provided by the 
> Kafka client.
> If someone wants to implement their own partitioner, there currently isn't an 
> easy way to add it to NiFi's classpath, and there is no way to specify it in 
> the processor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-3108) PublishKafka Should Support Custom Partitioners

2019-04-24 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825475#comment-16825475
 ] 

Sandish Kumar HN edited comment on NIFI-3108 at 4/24/19 8:05 PM:
-

sounds like a good design [~bende]


was (Author: sanysand...@gmail.com):
that makes sense [~bende]

> PublishKafka Should Support Custom Partitioners
> ---
>
> Key: NIFI-3108
> URL: https://issues.apache.org/jira/browse/NIFI-3108
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Bryan Bende
>Assignee: Koji Kawamura
>Priority: Minor
>
> Currently PublishKafka/PublishKafka_0_10 have a property for choosing the 
> partitioner which equates to setting the 'partitioner.class' property on the 
> Kafka client, but the property on the processor only allows selecting from 
> "Default Partitioner" and "Round-Robin" partitioner which are provided by the 
> Kafka client.
> If someone wants to implement their own partitioner, there currently isn't an 
> easy way to add it to NiFi's classpath, and there is no way to specify it in 
> the processor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-3108) PublishKafka Should Support Custom Partitioners

2019-04-24 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825475#comment-16825475
 ] 

Sandish Kumar HN commented on NIFI-3108:


that makes sense [~bende]

> PublishKafka Should Support Custom Partitioners
> ---
>
> Key: NIFI-3108
> URL: https://issues.apache.org/jira/browse/NIFI-3108
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Bryan Bende
>Assignee: Koji Kawamura
>Priority: Minor
>
> Currently PublishKafka/PublishKafka_0_10 have a property for choosing the 
> partitioner which equates to setting the 'partitioner.class' property on the 
> Kafka client, but the property on the processor only allows selecting from 
> "Default Partitioner" and "Round-Robin" partitioner which are provided by the 
> Kafka client.
> If someone wants to implement their own partitioner, there currently isn't an 
> easy way to add it to NiFi's classpath, and there is no way to specify it in 
> the processor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-3108) PublishKafka Should Support Custom Partitioners

2019-04-24 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825342#comment-16825342
 ] 

Sandish Kumar HN commented on NIFI-3108:


[~ijokarumawak], I'm working on adding an optional partition field in NIFI-4133, 
which is related to this story. If you're not working on this yet, I can take 
this. 

> PublishKafka Should Support Custom Partitioners
> ---
>
> Key: NIFI-3108
> URL: https://issues.apache.org/jira/browse/NIFI-3108
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Bryan Bende
>Assignee: Koji Kawamura
>Priority: Minor
>
> Currently PublishKafka/PublishKafka_0_10 have a property for choosing the 
> partitioner which equates to setting the 'partitioner.class' property on the 
> Kafka client, but the property on the processor only allows selecting from 
> "Default Partitioner" and "Round-Robin" partitioner which are provided by the 
> Kafka client.
> If someone wants to implement their own partitioner, there currently isn't an 
> easy way to add it to NiFi's classpath, and there is no way to specify it in 
> the processor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4133) PublishKafkaRecord_0_10 should allow publishing all messages from a flow file to the same partition

2019-04-24 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825238#comment-16825238
 ] 

Sandish Kumar HN commented on NIFI-4133:


[~bende] Thanks for the information. I will work on it. 

> PublishKafkaRecord_0_10 should allow publishing all messages from a flow file 
> to the same partition
> ---
>
> Key: NIFI-4133
> URL: https://issues.apache.org/jira/browse/NIFI-4133
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Bryan Bende
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> In some use cases it is required to publish all of the messages from a given 
> flow file to the same partition so that they can later be consumed in the 
> same order. 
> Currently the processor provides an option to choose between the default 
> partitioner and a round-robin partitioner, and also allows specifying the 
> name of a field in each record to use as a message key.
> The default partitioner has the following behavior:
> 1)  If a partition is specified in the record, use it
>  2) If no partition is specified but a key is present choose a partition 
> based on a hash of the key
>  3) If no partition or key is present choose a partition in a round-robin 
> fashion
> Currently we never pass in a partition to the Kafka record that is created, 
> so we always fall into #2 or #3, and the message key is really meant to be 
> unique per-event so we shouldn't be relying on every message using the same 
> message key.
> We should add an option to the processor like "Partition per FlowFile" which 
> can be used with the default partitioner, and the NiFi side will pass in the 
> same partition for each message created from the same flow file.
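
A sketch of the "Partition per FlowFile" idea under the assumption that one partition is derived from the flow file's UUID and passed explicitly on every record, so rule #1 of the default partitioner applies; the helper and its names are hypothetical:

{code:java}
import java.util.List;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.PartitionInfo;

public class PartitionPerFlowFileSketch {
    static void publish(Producer<byte[], byte[]> producer, String topic,
                        String flowFileUuid, List<byte[]> messages) {
        List<PartitionInfo> partitions = producer.partitionsFor(topic);
        // one partition for the whole flow file, chosen from its UUID
        int partition = Math.floorMod(flowFileUuid.hashCode(), partitions.size());
        for (byte[] message : messages) {
            // explicit partition, so the partitioner never falls back to key hash or round-robin
            producer.send(new ProducerRecord<>(topic, partition, null, message));
        }
    }
}
{code}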



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4133) PublishKafkaRecord_0_10 should allow publishing all messages from a flow file to the same partition

2019-04-23 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824746#comment-16824746
 ] 

Sandish Kumar HN commented on NIFI-4133:


[~bende] Please correct me if I'm wrong here.
Should we have a partition option for users to enter a Kafka topic partition 
number, so all the messages would go to the same partition? And add EL support 
so that, based on a FlowFile attribute (set kafka.partition = 1), the partition 
number would be decided and used in the PublishKafka processor's partition 
option? Do we need to add this feature for all versions of Kafka? 

> PublishKafkaRecord_0_10 should allow publishing all messages from a flow file 
> to the same partition
> ---
>
> Key: NIFI-4133
> URL: https://issues.apache.org/jira/browse/NIFI-4133
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Bryan Bende
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> In some use cases it is required to publish all of the messages from a given 
> flow file to the same partition so that they can later be consumed in the 
> same order. 
> Currently the processor provides an option to choose between the default 
> partitioner and a round-robin partitioner, and also allows specifying the 
> name of a field in each record to use as a message key.
> The default partitioner has the following behavior:
> 1)  If a partition is specified in the record, use it
>  2) If no partition is specified but a key is present choose a partition 
> based on a hash of the key
>  3) If no partition or key is present choose a partition in a round-robin 
> fashion
> Currently we never pass in a partition to the Kafka record that is created, 
> so we always fall into #2 or #3, and the message key is really meant to be 
> unique per-event so we shouldn't be relying on every message using the same 
> message key.
> We should add an option to the processor like "Partition per FlowFile" which 
> can be used with the default partitioner, and the NiFi side will pass in the 
> same partition for each message created from the same flow file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4066) Error messages are logged incompletely

2019-04-23 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-4066:
--

Assignee: (was: Sandish Kumar HN)

> Error messages are logged incompletely 
> --
>
> Key: NIFI-4066
> URL: https://issues.apache.org/jira/browse/NIFI-4066
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.2.0
>Reporter: Gardella Juan Pablo
>Priority: Minor
>
> I saw that a lot of components are not logging error messages properly, for 
> example:
> {noformat}
> 2017-06-13 11:42:45,949 ERROR [Timer-Driven Process Thread-10] 
> o.a.n.p.kafka.pubsub.PublishKafka_0_10 
> PublishKafka_0_10[id=4ecbe897-e2b0-35fa-f973-2f187172cf39] 
> PublishKafka_0_10[id=4ecbe897
> -e2b0-35fa-f973-2f187172cf39] failed to process session due to 
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer: {}
> {noformat}
> Note the '{}' element at the end. To sort it out, instead of passing the 
> throwable in the argument array, it should be 
> {{e.getMessage()}}
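
To make the fix concrete, a plain SLF4J-style sketch of the broken call and the corrected one:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingSketch {
    private static final Logger logger = LoggerFactory.getLogger(LoggingSketch.class);

    static void handle(Exception e) {
        // Broken: the throwable fills the last argument slot, so '{}' is never substituted
        logger.error("failed to process session due to {}", e);

        // Fixed: substitute the message text and still pass the throwable for the stack trace
        logger.error("failed to process session due to {}", e.getMessage(), e);
    }
}
{code}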



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-1642) PutKafka should validate topic expression and calculated value

2019-04-21 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-1642:
---
Status: Patch Available  (was: In Progress)

[https://github.com/apache/nifi/pull/3450]

> PutKafka should validate topic expression and calculated value
> --
>
> Key: NIFI-1642
> URL: https://issues.apache.org/jira/browse/NIFI-1642
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Christopher McDermott
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> PutKafka does not validate the expression supplied for the topic property, 
> as most other processors do.  It should also try to validate the evaluated 
> value of the topic to see if it is a legal topic name.  Note I'm not 
> suggesting that the topic needs to exist, just that the name is compliant 
> with what Kafka will accept.   This would be most helpful because if certain 
> (probably not all) illegal names are used, the Kafka client throws bizarre 
> and most unhelpful exceptions.
> -
> Chris,
> Assuming the client can validate #2 i am with you.  Please do feel
> free to fire up a JIRA for this.
> Thanks
> Joe
> On Wed, Mar 16, 2016 at 1:24 PM, McDermott, Chris Kevin (MSDU -
> STaTS/StorefrontRemote)  wrote:
> It turns out the root cause of the problem was an invalid topic name.  
> Strange error for that!
> I think there are a couple of improvements that could be made to PutKafka.
> 1. Check the validity of the expression in the topic property.
> 2. Check the validity of the topic name before attempting to write to the 
> topic.
> Chris
> On 3/16/16, 11:41 AM, "McDermott, Chris Kevin (MSDU - 
> STaTS/StorefrontRemote)"  wrote:
> Joe,
> I’ll check out the disk space.  We are running 0.9. If disk space is not the 
> issue we’ll give 0.8 a try.
> Thanks very much for your quick reply.
> Cheers,
> Chris
> On 3/16/16, 11:04 AM, "Joe Witt"  wrote:
> Chris,
> I have seen that when the disk space Kafka relies on is full.  We've
> seen a number of interesting exceptions recently in testing various
> configurations, but I recommend checking that.
> Also, what version of Kafka broker are you using?  With Apache NiFi
> 0.5.x we moved to the Kafka client 0.9.  In doing that we messed up
> support for 0.8.  So...with the upcoming release we will move back to
> the 0.8 client and thus it works great with Kafka 0.8 and 0.9 brokers,
> albeit without the new SSL and Kerberos support they added in their
> 0.9 work.  We have a JIRA item to go after that for our next
> feature-bearing release.
> Thanks
> Joe
> On Wed, Mar 16, 2016 at 11:01 AM, McDermott, Chris Kevin (MSDU -
> STaTS/StorefrontRemote)  wrote:
> I say strange because the timeout (63 ms) is so very short.  The communication 
> timeout I’ve set is 30 sec.  Has anyone else seen this?
> 2016-03-16 14:41:38,227 ERROR [Timer-Driven Process Thread-8] 
> o.apache.nifi.processors.kafka.PutKafka 
> PutKafka[id=852c8d42-a2fa-3478-b06b-84ceb66f8b0b] Failed to send 
> StandardFlowFileRecord[uuid=a0074162-0066-49e7-918b-cea1cfc5a955,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1458079089737-67, container=default, 
> section=67], offset=377796, length=743],offset=0,name=2349680613178720,size=743] 
> to Kafka; routing to 'failure'; last failure reason reported was 
> org.apache.kafka.common.errors.TimeoutException: Failed to update metadata 
> after 63 ms.;: org.apache.kafka.common.errors.TimeoutException: Failed to 
> update metadata after 63 ms.
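
For reference, a sketch of the evaluated-value check suggested above, based on the topic-name rules Kafka enforces (characters a-z, A-Z, 0-9, '.', '_', '-', at most 249 characters, and not "." or ".."); the class and method names are hypothetical:

{code:java}
import java.util.regex.Pattern;

public class TopicNameValidatorSketch {
    // Kafka's legal topic characters and maximum length
    private static final Pattern LEGAL_TOPIC = Pattern.compile("[a-zA-Z0-9._-]{1,249}");

    static boolean isLegalTopicName(String topic) {
        return topic != null
                && !topic.equals(".")
                && !topic.equals("..")
                && LEGAL_TOPIC.matcher(topic).matches();
    }
}
{code}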



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4133) PublishKafkaRecord_0_10 should allow publishing all messages from a flow file to the same partition

2019-04-19 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-4133:
--

Assignee: Sandish Kumar HN

> PublishKafkaRecord_0_10 should allow publishing all messages from a flow file 
> to the same partition
> ---
>
> Key: NIFI-4133
> URL: https://issues.apache.org/jira/browse/NIFI-4133
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Bryan Bende
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> In some use cases it is required to publish all of the messages from a given 
> flow file to the same partition so that they can later be consumed in the 
> same order. 
> Currently the processor provides an option to choose between the default 
> partitioner and a round-robin partitioner, and also allows specifying the 
> name of a field in each record to use as a message key.
> The default partitioner has the following behavior:
> 1)  If a partition is specified in the record, use it
>  2) If no partition is specified but a key is present choose a partition 
> based on a hash of the key
>  3) If no partition or key is present choose a partition in a round-robin 
> fashion
> Currently we never pass in a partition to the Kafka record that is created, 
> so we always fall into #2 or #3, and the message key is really meant to be 
> unique per-event so we shouldn't be relying on every message using the same 
> message key.
> We should add an option to the processor like "Partition per FlowFile" which 
> can be used with the default partitioner, and the NiFi side will pass in the 
> same partition for each message created from the same flow file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4066) Error messages are logged incompletely

2019-04-19 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-4066:
--

Assignee: Sandish Kumar HN

> Error messages are logged incompletely 
> --
>
> Key: NIFI-4066
> URL: https://issues.apache.org/jira/browse/NIFI-4066
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.2.0
>Reporter: Gardella Juan Pablo
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> I saw that a lot of components are not logging error messages properly, for 
> example:
> {noformat}
> 2017-06-13 11:42:45,949 ERROR [Timer-Driven Process Thread-10] 
> o.a.n.p.kafka.pubsub.PublishKafka_0_10 
> PublishKafka_0_10[id=4ecbe897-e2b0-35fa-f973-2f187172cf39] 
> PublishKafka_0_10[id=4ecbe897
> -e2b0-35fa-f973-2f187172cf39] failed to process session due to 
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer: {}
> {noformat}
> Note the '{}' element at the end. To sort it out, instead of passing the 
> throwable in the argument array, it should be 
> {{e.getMessage()}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-1642) PutKafka should validate topic expression and calculated value

2019-04-19 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-1642:
--

Assignee: Sandish Kumar HN

> PutKafka should validate topic expression and calculated value
> --
>
> Key: NIFI-1642
> URL: https://issues.apache.org/jira/browse/NIFI-1642
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Christopher McDermott
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> PutKafka does not validate the expression supplied for the topic property, 
> as most other processors do.  It should also try to validate the evaluated 
> value of the topic to see if it is a legal topic name.  Note I'm not 
> suggesting that the topic needs to exist, just that the name is compliant 
> with what Kafka will accept.   This would be most helpful because if certain 
> (probably not all) illegal names are used, the Kafka client throws bizarre 
> and most unhelpful exceptions.
> -
> Chris,
> Assuming the client can validate #2 i am with you.  Please do feel
> free to fire up a JIRA for this.
> Thanks
> Joe
> On Wed, Mar 16, 2016 at 1:24 PM, McDermott, Chris Kevin (MSDU -
> STaTS/StorefrontRemote)  wrote:
> It turns out the root cause of the problem was an invalid topic name.  
> Strange error for that!
> I think there are a couple of improvements that could be made to PutKafka.
> 1. Check the validity of the expression in the topic property.
> 2. Check the validity of the topic name before attempting to write to the 
> topic.
> Chris
> On 3/16/16, 11:41 AM, "McDermott, Chris Kevin (MSDU - 
> STaTS/StorefrontRemote)"  wrote:
> Joe,
> I’ll check out the disk space.  We are running 0.9. If disk space is not the 
> issue we’ll give 0.8 a try.
> Thanks very much for your quick reply.
> Cheers,
> Chris
> On 3/16/16, 11:04 AM, "Joe Witt"  wrote:
> Chris,
> I have seen that when the disk space Kafka relies on is full.  We've
> seen a number of interesting exceptions recently in testing various
> configurations, but I recommend checking that.
> Also, what version of Kafka broker are you using?  With Apache NiFi
> 0.5.x we moved to the Kafka client 0.9.  In doing that we messed up
> support for 0.8.  So...with the upcoming release we will move back to
> the 0.8 client and thus it works great with Kafka 0.8 and 0.9 brokers,
> albeit without the new SSL and Kerberos support they added in their
> 0.9 work.  We have a JIRA item to go after that for our next
> feature-bearing release.
> Thanks
> Joe
> On Wed, Mar 16, 2016 at 11:01 AM, McDermott, Chris Kevin (MSDU -
> STaTS/StorefrontRemote)  wrote:
> I say strange because the timeout (63 ms) is so very short.  The communication 
> timeout I’ve set is 30 sec.  Has anyone else seen this?
> 2016-03-16 14:41:38,227 ERROR [Timer-Driven Process Thread-8] 
> o.apache.nifi.processors.kafka.PutKafka 
> PutKafka[id=852c8d42-a2fa-3478-b06b-84ceb66f8b0b] Failed to send 
> StandardFlowFileRecord[uuid=a0074162-0066-49e7-918b-cea1cfc5a955,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1458079089737-67, container=default, 
> section=67], offset=377796, length=743],offset=0,name=2349680613178720,size=743] 
> to Kafka; routing to 'failure'; last failure reason reported was 
> org.apache.kafka.common.errors.TimeoutException: Failed to update metadata 
> after 63 ms.;: org.apache.kafka.common.errors.TimeoutException: Failed to 
> update metadata after 63 ms.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-3645) Add aws:kms, sse-c, sse-c-key, and sse-kms-key-id capability to server side encryption in the PutS3Object processor.

2019-04-04 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-3645:
--

Assignee: (was: Sandish Kumar HN)

> Add aws:kms, sse-c, sse-c-key, and sse-kms-key-id capability to server side 
> encryption in the PutS3Object processor.
> 
>
> Key: NIFI-3645
> URL: https://issues.apache.org/jira/browse/NIFI-3645
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Matthew Clarke
>Priority: Major
>
> Server Side Encryption (sse) currently exists in the PutS3Object processor 
> but I don’t see sse-c, sse-c-key, or sse-kms-key-id options. 
> --sse (string) Specifies server-side encryption of the object in S3. Valid 
> values are AES256 and aws:kms. If the parameter is specified but no value is 
> provided, AES256 is used. 
> --sse-c (string) Specifies server-side encryption using customer-provided 
> keys for the object in S3. AES256 is the only valid value. If the 
> parameter is specified but no value is provided, AES256 is used. If you 
> provide this value, --sse-c-key must be specified as well. 
> --sse-c-key (string) The customer-provided encryption key to use to 
> server-side encrypt the object in S3. If you provide this value, --sse-c must 
> be specified as well. The key provided should not be base64 encoded. 
> --sse-kms-key-id (string) The AWS KMS key ID that should be used to 
> server-side encrypt the object in S3. Note that you should only provide this 
> parameter if the KMS key ID is different from the default S3 master KMS key. 
> I would like to see full support for the various server-side encryption 
> capabilities added to our S3 processors.
> This is related partially to another Apache Jira: NIFI-1769
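
For reference, the AWS SDK for Java (v1) calls these CLI flags correspond to; the bucket, keys, file, and key material below are placeholders:

{code:java}
import java.io.File;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.SSEAwsKeyManagementParams;
import com.amazonaws.services.s3.model.SSECustomerKey;

public class S3SseSketch {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        File file = new File("data.bin");

        // --sse aws:kms with a specific key id (placeholder ARN)
        s3.putObject(new PutObjectRequest("my-bucket", "kms-object", file)
                .withSSEAwsKeyManagementParams(
                        new SSEAwsKeyManagementParams("arn:aws:kms:us-east-1:111122223333:key/placeholder")));

        // --sse-c / --sse-c-key with a customer-provided AES-256 key (placeholder base64 value)
        SSECustomerKey customerKey = new SSECustomerKey("base64-encoded-256-bit-key");
        s3.putObject(new PutObjectRequest("my-bucket", "sse-c-object", file)
                .withSSECustomerKey(customerKey));
    }
}
{code}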



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-3645) Add aws:kms, sse-c, sse-c-key, and sse-kms-key-id capability to server side encryption in the PutS3Object processor.

2019-04-02 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-3645:
--

Assignee: Sandish Kumar HN

> Add aws:kms, sse-c, sse-c-key, and sse-kms-key-id capability to server side 
> encryption in the PutS3Object processor.
> 
>
> Key: NIFI-3645
> URL: https://issues.apache.org/jira/browse/NIFI-3645
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Matthew Clarke
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Server Side Encryption (sse) currently exists in the PutS3Object processor 
> but I don’t see sse-c, sse-c-key, or sse-kms-key-id options. 
> --sse (string) Specifies server-side encryption of the object in S3. Valid 
> values are AES256 and aws:kms. If the parameter is specified but no value is 
> provided, AES256 is used. 
> --sse-c (string) Specifies server-side encryption using customer-provided 
> keys for the object in S3. AES256 is the only valid value. If the 
> parameter is specified but no value is provided, AES256 is used. If you 
> provide this value, --sse-c-key must be specified as well. 
> --sse-c-key (string) The customer-provided encryption key to use to 
> server-side encrypt the object in S3. If you provide this value, --sse-c must 
> be specified as well. The key provided should not be base64 encoded. 
> --sse-kms-key-id (string) The AWS KMS key ID that should be used to 
> server-side encrypt the object in S3. Note that you should only provide this 
> parameter if the KMS key ID is different from the default S3 master KMS key. 
> I would like to see full support for the various server-side encryption 
> capabilities added to our S3 processors.
> This is related partially to another Apache Jira: NIFI-1769



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-6009) Add Scan Kudu Processor

2019-03-25 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-6009:
---
Status: Patch Available  (was: In Progress)

> Add Scan Kudu Processor 
> 
>
> Key: NIFI-6009
> URL: https://issues.apache.org/jira/browse/NIFI-6009
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Major
>  Labels: kudu, nosql
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A ScanKudu processor with a list of predicates to filter the Kudu table



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-16 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794367#comment-16794367
 ] 

Sandish Kumar HN commented on NIFI-5267:


[~joewitt] Thanks for the quick response. 
 I think that's the reason; is it possible to change it to 
[sanysand...@gmail.com|mailto:sanysand...@gmail.com], which is what I'm using 
for all my accounts (i.e. JIRA and GitHub)? I don't have a GitHub account on 
[s...@phdata.io.|mailto:s...@phdata.io.] 

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 1.10.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  
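
For illustration, where the value would come from on the consumer side (ConsumerRecord has carried a timestamp since Kafka 0.10); the attribute-map helper here is hypothetical, mirroring how kafka.offset is taken from the first record of the bundle:

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;

public class KafkaTimestampSketch {
    static Map<String, String> firstRecordAttributes(Consumer<byte[], byte[]> consumer) {
        Map<String, String> attributes = new HashMap<>();
        ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
        for (ConsumerRecord<byte[], byte[]> record : records) {
            // like kafka.offset today, only the first record of the bundle is used
            attributes.put("kafka.offset", String.valueOf(record.offset()));
            // CreateTime or LogAppendTime, in epoch milliseconds
            attributes.put("kafka.timestamp", String.valueOf(record.timestamp()));
            break;
        }
        return attributes;
    }
}
{code}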



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-16 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794341#comment-16794341
 ] 

Sandish Kumar HN edited comment on NIFI-5267 at 3/16/19 9:05 PM:
-

[~bende] Thanks for merging this. I don't see my name and commits here: 
[https://github.com/apache/nifi/graphs/contributors]. Did I miss anything?


was (Author: sanysand...@gmail.com):
[~bende] I don't see my name and commits here: 
[https://github.com/apache/nifi/graphs/contributors]. Did I miss anything?

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 1.10.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-16 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794341#comment-16794341
 ] 

Sandish Kumar HN commented on NIFI-5267:


[~bende] I don't see my name and commits here: 
[https://github.com/apache/nifi/graphs/contributors]. Did I miss anything?

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 1.10.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4358) Capability to activate compression on Cassandra connection

2019-03-15 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794147#comment-16794147
 ] 

Sandish Kumar HN commented on NIFI-4358:


[~joewitt] yes, this is my first contribution; looking forward to contributing more. 

> Capability to activate compression on Cassandra connection
> --
>
> Key: NIFI-4358
> URL: https://issues.apache.org/jira/browse/NIFI-4358
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.3.0
>Reporter: Jean-Louis
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: cassandra
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It's interesting to activate compression in some use cases. Can we add a 
> processor property to set compression? If we consider the Cassandra Java 
> Driver, this can be done using 
> withCompression(ProtocolOptions.Compression.LZ4).
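
For reference, the driver call mentioned above in context (DataStax Java Driver 3.x; the contact point is a placeholder):

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ProtocolOptions;
import com.datastax.driver.core.Session;

public class CassandraCompressionSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                // enable LZ4 compression on the native protocol connection
                .withCompression(ProtocolOptions.Compression.LZ4)
                .build();
        Session session = cluster.connect();
        session.close();
        cluster.close();
    }
}
{code}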



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-6089) Expansion of Parquet support

2019-03-14 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793008#comment-16793008
 ] 

Sandish Kumar HN edited comment on NIFI-6089 at 3/14/19 7:49 PM:
-

[~bende] currently there is a Parquet writer but no Parquet reader. 


was (Author: sanysand...@gmail.com):
[~bende] currently there is a Parquet writer but no Parquet reader, and an Avro 
reader but no Avro writer. 

> Expansion of Parquet support
> 
>
> Key: NIFI-6089
> URL: https://issues.apache.org/jira/browse/NIFI-6089
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.9.0
>Reporter: Robert Bruno
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> Now that there is a ConvertAvroToParquet processor that has no HDFS 
> requirements (awesome, by the way), any chance of either a 
> ConvertParquetToAvro or, even better, a Parquet record reader and writer?  We 
> have use cases to go from Parquet back to JSON, so either of the two solutions 
> above would let us achieve this without having to use Fetch and Put Parquet.
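
For reference, a sketch of one route a Parquet record reader could take, reading Parquet back as Avro GenericRecords with parquet-avro; the file path is a placeholder:

{code:java}
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;

public class ParquetToAvroSketch {
    public static void main(String[] args) throws Exception {
        try (ParquetReader<GenericRecord> reader =
                     AvroParquetReader.<GenericRecord>builder(new Path("data.parquet")).build()) {
            GenericRecord record;
            while ((record = reader.read()) != null) {
                // each record could then be handed to a JSON (or any other) writer
                System.out.println(record);
            }
        }
    }
}
{code}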



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-6089) Expansion of Parquet support

2019-03-14 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793008#comment-16793008
 ] 

Sandish Kumar HN commented on NIFI-6089:


[~bende] currently there is a Parquet writer but no Parquet reader, and an Avro 
reader but no Avro writer. 

> Expansion of Parquet support
> 
>
> Key: NIFI-6089
> URL: https://issues.apache.org/jira/browse/NIFI-6089
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.9.0
>Reporter: Robert Bruno
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> Now that there is a ConvertAvroToParquet processor that has no HDFS 
> requirements (awesome, by the way), any chance of either a 
> ConvertParquetToAvro or, even better, a Parquet record reader and writer?  We 
> have use cases to go from Parquet back to JSON, so either of the two solutions 
> above would let us achieve this without having to use Fetch and Put Parquet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4985) Allow users to define a specific offset when starting ConsumeKafka

2019-03-11 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789849#comment-16789849
 ] 

Sandish Kumar HN commented on NIFI-4985:


This is what I'm thinking:
as Kafka only allows consuming by offset from a single partition/topic at a 
time, we can create a starting_offset property and add a validator to check 
that the topic list property contains only a single topic name, i.e. only 
allow this feature for a single-topic Kafka processor. What do you think?  
[~pvillard] 

> Allow users to define a specific offset when starting ConsumeKafka
> --
>
> Key: NIFI-4985
> URL: https://issues.apache.org/jira/browse/NIFI-4985
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Pierre Villard
>Assignee: Sandish Kumar HN
>Priority: Major
>
> It'd be useful to add support for dynamic properties in the ConsumeKafka set of 
> processors so that users can define the offset to use when starting the 
> processor. The properties could be something like:
> {noformat}
> kafka...offset{noformat}
> If, for a configured topic, such a property is not defined for a given 
> partition, the consumer would use the auto offset property.
> If a custom offset is defined for a topic/partition, it'd be used when 
> initializing the consumer by calling:
> {noformat}
> seek(TopicPartition, long){noformat}
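
For illustration, the seek call in context; the topic, partition, and offset are placeholders:

{code:java}
import java.util.Collections;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekSketch {
    static void seekToCustomOffset(KafkaConsumer<byte[], byte[]> consumer) {
        TopicPartition tp = new TopicPartition("my-topic", 0);
        consumer.assign(Collections.singletonList(tp));
        // start consuming from the configured offset instead of the auto offset
        consumer.seek(tp, 1234L);
    }
}
{code}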



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-10 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-5267:
---
Status: Patch Available  (was: In Progress)

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.6.0, 1.5.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 2.0.0
>
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4358) Capability to activate compression on Cassandra connection

2019-03-10 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-4358:
---
Status: Patch Available  (was: In Progress)

> Capability to activate compression on Cassandra connection
> --
>
> Key: NIFI-4358
> URL: https://issues.apache.org/jira/browse/NIFI-4358
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.3.0
>Reporter: Jean-Louis
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: cassandra
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It's interesting to activate compression in some use cases. Can we add a 
> processor property to set compression? If we consider the Cassandra Java 
> Driver, this can be done using 
> withCompression(ProtocolOptions.Compression.LZ4).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-4985) Allow users to define a specific offset when starting ConsumeKafka

2019-03-09 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16788826#comment-16788826
 ] 

Sandish Kumar HN edited comment on NIFI-4985 at 3/10/19 1:39 AM:
-

[~pvillard] as the NiFi Kafka processors support multiple topics (comma-separated 
topics), the proposed JIRA story can only be achieved for a single topic, since 
the Kafka seek method is based on a specific partition.  Do you think it's still 
valuable if we implement it for a single topic? 


was (Author: sanysand...@gmail.com):
[~pvillard] as the NiFi Kafka processors support multiple topics (comma-separated 
topics), the proposed JIRA story can only be achieved for a single topic. Do you 
think it's still valuable if we implement it for a single topic? 

> Allow users to define a specific offset when starting ConsumeKafka
> --
>
> Key: NIFI-4985
> URL: https://issues.apache.org/jira/browse/NIFI-4985
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Pierre Villard
>Assignee: Sandish Kumar HN
>Priority: Major
>
> It'd be useful to add support for dynamic properties in the ConsumeKafka set of 
> processors so that users can define the offset to use when starting the 
> processor. The properties could be something like:
> {noformat}
> kafka...offset{noformat}
> If, for a configured topic, such a property is not defined for a given 
> partition, the consumer would use the auto offset property.
> If a custom offset is defined for a topic/partition, it'd be used when 
> initializing the consumer by calling:
> {noformat}
> seek(TopicPartition, long){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4985) Allow users to define a specific offset when starting ConsumeKafka

2019-03-09 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16788826#comment-16788826
 ] 

Sandish Kumar HN commented on NIFI-4985:


[~pvillard] as the NiFi Kafka processors support multiple topics (comma-separated 
topics), the proposed JIRA story can only be achieved for a single topic. Do you 
think it's still valuable if we implement it for a single topic? 

> Allow users to define a specific offset when starting ConsumeKafka
> --
>
> Key: NIFI-4985
> URL: https://issues.apache.org/jira/browse/NIFI-4985
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Pierre Villard
>Assignee: Sandish Kumar HN
>Priority: Major
>
> It'd be useful to add support for dynamic properties in the ConsumeKafka set of 
> processors so that users can define the offset to use when starting the 
> processor. The properties could be something like:
> {noformat}
> kafka...offset{noformat}
> If, for a configured topic, such a property is not defined for a given 
> partition, the consumer would use the auto offset property.
> If a custom offset is defined for a topic/partition, it'd be used when 
> initializing the consumer by calling:
> {noformat}
> seek(TopicPartition, long){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-08 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16788154#comment-16788154
 ] 

Sandish Kumar HN commented on NIFI-5267:


[~jasperknulst] when you get free time, can you please take a look at the 
pull request: [https://github.com/apache/nifi/pull/3359]

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 2.0.0
>
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-08 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16788057#comment-16788057
 ] 

Sandish Kumar HN commented on NIFI-5267:


[~jasperknulst] Yeah, which one would be taken into consideration at replay 
time? We are trying to keep both the initial kafka.offset and kafka.timestamp 
in the consumer realm. 

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 2.0.0
>
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-6101) Upgrade nifi-solr-nar

2019-03-07 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-6101:
--

Assignee: Sandish Kumar HN

> Upgrade nifi-solr-nar 
> --
>
> Key: NIFI-6101
> URL: https://issues.apache.org/jira/browse/NIFI-6101
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Security
>Reporter: Nathan Gough
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Upgrade the org.apache.solr:solr-solrj dependency in 
> nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/pom.xml from v6.2.0 to 
> v7.7.1
> [https://mvnrepository.com/artifact/org.apache.solr/solr-solrj/7.7.1]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-6089) Expansion of Parquet support

2019-03-07 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-6089:
--

Assignee: Sandish Kumar HN

> Expansion of Parquet support
> 
>
> Key: NIFI-6089
> URL: https://issues.apache.org/jira/browse/NIFI-6089
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.9.0
>Reporter: Robert Bruno
>Assignee: Sandish Kumar HN
>Priority: Minor
>
> Now that there is a ConvertAvroToParquet processor that has no HDFS 
> requirements (awesome, by the way), any chance of either a 
> ConvertParquetToAvro or, even better, a Parquet record reader and writer?  We 
> have use cases to go from Parquet back to JSON, so either of the two solutions 
> above would let us achieve this without having to use Fetch and Put Parquet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-03-07 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787009#comment-16787009
 ] 

Sandish Kumar HN commented on NIFI-5267:


[~jasperknulst] should we give an option to select between replay by timestamp 
and replay by offset? 

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 2.0.0
>
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11 and 1_0) can 
> yield 1 flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (using demarcator). 
> Currently the resulting flowfile already gets an attribute 'kafka.offset' 
> which indicates the starting offset (lowest) of any Kafka record within that 
> bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute there (also 
> only related to the first record of that bundle) to be able to relate all the 
> records in the flowfile to the kafka timestamp and be able to replay some 
> kafka records based on this timestamp (feature in Kafka > 0.9 where replay by 
> offset and by timestamp is now a possibility)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4493) PutCassandraQL Option to Disable Prepared Statements

2019-03-04 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-4493:
--

Assignee: Sandish Kumar HN

> PutCassandraQL Option to Disable Prepared Statements
> 
>
> Key: NIFI-4493
> URL: https://issues.apache.org/jira/browse/NIFI-4493
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.4.0
>Reporter: Ben Thorner
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: cassandra, putcassandraql
>
> Cassandra complains when using this processor to perform large numbers of 
> changing queries. In our scenario, we are using batch statements to insert 
> incoming data.
> INFO  [ScheduledTasks:1] 2017-10-17 16:13:35,213 QueryProcessor.java:134 - 
> 3849 prepared statements discarded in the last minute because cache limit 
> reached (66453504 bytes)
> In this scenario, I don't think it's feasible to use prepared statements, as 
> the number of ? parameters is impractical. Could we instead have an option to 
> disable prepared statements?
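
To make the trade-off concrete, a sketch with the DataStax Java Driver 3.x; the table and values are hypothetical. Each distinct prepared query text occupies the server-side prepared-statement cache, while a SimpleStatement does not:

{code:java}
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class StatementChoiceSketch {
    static void insert(Session session, String id, int val) {
        // Prepared: cached server-side; many distinct statement texts thrash the cache
        PreparedStatement ps = session.prepare("INSERT INTO t (id, val) VALUES (?, ?)");
        session.execute(ps.bind(id, val));

        // Simple: no server-side prepared cache entry, values still bound as parameters
        session.execute(new SimpleStatement("INSERT INTO t (id, val) VALUES (?, ?)", id, val));
    }
}
{code}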



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4985) Allow users to define a specific offset when starting ConsumeKafka

2019-03-04 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-4985:
--

Assignee: Sandish Kumar HN

> Allow users to define a specific offset when starting ConsumeKafka
> --
>
> Key: NIFI-4985
> URL: https://issues.apache.org/jira/browse/NIFI-4985
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Pierre Villard
>Assignee: Sandish Kumar HN
>Priority: Major
>
> It'd be useful to add support for dynamic properties in the ConsumeKafka set of 
> processors so that users can define the offset to use when starting the 
> processor. The properties could be something like:
> {noformat}
> kafka...offset{noformat}
> If, for a configured topic, such a property is not defined for a given 
> partition, the consumer would use the auto offset property.
> If a custom offset is defined for a topic/partition, it'd be used when 
> initializing the consumer by calling:
> {noformat}
> seek(TopicPartition, long){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4358) Capability to activate compression on Cassandra connection

2019-03-04 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-4358:
--

Assignee: Sandish Kumar HN

> Capability to activate compression on Cassandra connection
> --
>
> Key: NIFI-4358
> URL: https://issues.apache.org/jira/browse/NIFI-4358
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.3.0
>Reporter: Jean-Louis
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: cassandra
>
> Activating compression on the Cassandra connection is useful in some use 
> cases. Can we add a processor property to enable compression? With the 
> Cassandra Java Driver, this can be done via 
> withCompression(ProtocolOptions.Compression.LZ4).
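For reference, a minimal sketch of enabling LZ4 compression when building the
driver's Cluster (standard DataStax Java Driver 3.x usage; the contact point
is a placeholder, and LZ4 requires the lz4 jar on the classpath):

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ProtocolOptions;
import com.datastax.driver.core.Session;

public class CompressedConnection {

    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // placeholder host
                .withCompression(ProtocolOptions.Compression.LZ4)
                .build();
        try (Session session = cluster.connect()) {
            session.execute("SELECT release_version FROM system.local");
        } finally {
            cluster.close();
        }
    }
}
{code}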



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-5267) Add Kafka record timestamp to flowfile attributes

2019-02-13 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-5267:
--

Assignee: Sandish Kumar HN

> Add Kafka record timestamp to flowfile attributes
> -
>
> Key: NIFI-5267
> URL: https://issues.apache.org/jira/browse/NIFI-5267
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0
>Reporter: Jasper Knulst
>Assignee: Sandish Kumar HN
>Priority: Minor
>  Labels: newbie
> Fix For: 2.0.0
>
>
> The ConsumeKafkaRecord and ConsumeKafka processors (0_10, 0_11, and 1_0) can 
> yield one flowfile holding many Kafka records. For ConsumeKafka this is 
> optional (via a demarcator). 
> Currently, the resulting flowfile already gets a 'kafka.offset' attribute 
> indicating the starting (lowest) offset of the Kafka records in that bundle. 
> It would be valuable to also have a 'kafka.timestamp' attribute (likewise 
> taken from the first record of the bundle), so that all records in the 
> flowfile can be related to a Kafka timestamp and replayed based on it 
> (Kafka > 0.9 supports replay by both offset and timestamp).
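To illustrate the replay use case the ticket mentions, a sketch of turning a
saved timestamp back into an offset with the consumer's offsetsForTimes API
(available since Kafka 0.10.1); the partition and timestamp are placeholders:

{code:java}
import java.util.Collections;
import java.util.Map;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class ReplayFromTimestamp {

    // Rewinds one partition to the first offset at or after the given timestamp.
    static void rewindTo(KafkaConsumer<?, ?> consumer, TopicPartition tp, long timestampMs) {
        Map<TopicPartition, OffsetAndTimestamp> found =
                consumer.offsetsForTimes(Collections.singletonMap(tp, timestampMs));
        OffsetAndTimestamp oat = found.get(tp);
        if (oat != null) {
            consumer.seek(tp, oat.offset()); // replay from here
        }
    }
}
{code}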



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-6009) Add Scan Kudu Processor

2019-02-07 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-6009:
---
Labels: kudu nosql  (was: )

> Add Scan Kudu Processor 
> 
>
> Key: NIFI-6009
> URL: https://issues.apache.org/jira/browse/NIFI-6009
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Major
>  Labels: kudu, nosql
>
> A ScanKudu processor with a list of predicates to filter the Kudu table.
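A sketch of the kind of predicate-filtered scan such a processor would issue
with the Kudu Java client; the master address, table, and column names are
placeholders:

{code:java}
import org.apache.kudu.Schema;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduPredicate;
import org.apache.kudu.client.KuduScanner;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.RowResult;

public class ScanKuduSketch {

    public static void main(String[] args) throws Exception {
        try (KuduClient client =
                new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            KuduTable table = client.openTable("users"); // placeholder table
            Schema schema = table.getSchema();

            KuduScanner scanner = client.newScannerBuilder(table)
                    // One predicate per user-supplied filter; predicates are ANDed.
                    .addPredicate(KuduPredicate.newComparisonPredicate(
                            schema.getColumn("age"),
                            KuduPredicate.ComparisonOp.GREATER_EQUAL, 21))
                    .build();

            while (scanner.hasMoreRows()) {
                for (RowResult row : scanner.nextRows()) {
                    System.out.println(row.rowToString());
                }
            }
            scanner.close();
        }
    }
}
{code}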



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-6009) Add Scan Kudu Processor

2019-02-07 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-6009:
--

Assignee: Sandish Kumar HN

> Add Scan Kudu Processor 
> 
>
> Key: NIFI-6009
> URL: https://issues.apache.org/jira/browse/NIFI-6009
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Major
>
> A ScanKudu processor with a list of predicates to filter the Kudu table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-6009) Add Scan Kudu Processor

2019-02-07 Thread Sandish Kumar HN (JIRA)
Sandish Kumar HN created NIFI-6009:
--

 Summary: Add Scan Kudu Processor 
 Key: NIFI-6009
 URL: https://issues.apache.org/jira/browse/NIFI-6009
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Sandish Kumar HN


A ScanKudu processor with a list of predicates to filter the Kudu table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4753) Add Get Kudu Processor

2019-02-07 Thread Sandish Kumar HN (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763227#comment-16763227
 ] 

Sandish Kumar HN commented on NIFI-4753:


[~srisaikumar-inspur] any update on this work?. I'm interested in taking this 
story. 

> Add Get Kudu Processor
> --
>
> Key: NIFI-4753
> URL: https://issues.apache.org/jira/browse/NIFI-4753
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sri Sai Kumar Ravipati
>Assignee: Sri Sai Kumar Ravipati
>Priority: Major
>
> Now that the Put Kudu processor is in master, it would be nice to have a Get 
> Kudu processor as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5956) Allow Disabling Block Cache With HBase Scan Processor

2019-02-07 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated NIFI-5956:
---
Status: Patch Available  (was: In Progress)

> Allow Disabling Block Cache With HBase Scan Processor
> -
>
> Key: NIFI-5956
> URL: https://issues.apache.org/jira/browse/NIFI-5956
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: BELUGA BEHR
>Assignee: Sandish Kumar HN
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {quote}
> Scan instances can be set to use the block cache in the RegionServer via the 
> setCacheBlocks method. For input Scans to MapReduce jobs, this should be 
> false. 
> https://hbase.apache.org/book.html#perf.hbase.client.blockcache
> {quote}
> Please add a configurable option to the HBase Scan processor that lets users 
> toggle whether the Scan should populate the HBase block cache.
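A minimal sketch of the toggle at the HBase client level; setCacheBlocks is
real HBase client API, while the boolean standing in for the proposed
processor property and the table name are hypothetical:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class ScanWithoutBlockCache {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        boolean cacheBlocks = false; // stand-in for the proposed property

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("my_table"))) {
            Scan scan = new Scan();
            // When false, scanned blocks are not pulled into the RegionServer
            // block cache, so a full scan does not evict hot data.
            scan.setCacheBlocks(cacheBlocks);
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    System.out.println(result);
                }
            }
        }
    }
}
{code}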



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-5956) Allow Disabling Block Cache With HBase Scan Processor

2019-01-23 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned NIFI-5956:
--

Assignee: Sandish Kumar HN

> Allow Disabling Block Cache With HBase Scan Processor
> -
>
> Key: NIFI-5956
> URL: https://issues.apache.org/jira/browse/NIFI-5956
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: BELUGA BEHR
>Assignee: Sandish Kumar HN
>Priority: Major
>
> {quote}
> Scan instances can be set to use the block cache in the RegionServer via the 
> setCacheBlocks method. For input Scans to MapReduce jobs, this should be 
> false. 
> https://hbase.apache.org/book.html#perf.hbase.client.blockcache
> {quote}
> Please add a configurable option to the HBase Scan processor that lets users 
> toggle whether the Scan should populate the HBase block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-3973) Create a new Kudu Processor to ingest data

2017-08-10 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121386#comment-16121386
 ] 

Sandish Kumar HN commented on NIFI-3973:


[~cammach] are you working on this? If not, I'm interested in working on it. 

> Create a new Kudu Processor to ingest data
> --
>
> Key: NIFI-3973
> URL: https://issues.apache.org/jira/browse/NIFI-3973
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Cam Mach
>Assignee: Cam Quoc Mach
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)