[jira] [Commented] (KAFKA-10448) Preserve Source Partition in Kafka Streams from context

2020-09-02 Thread satya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189790#comment-17189790
 ] 

satya commented on KAFKA-10448:
---

Thank you. I submitted KIP-669. Please let me know if you have any 
suggestions/comments.

> Preserve Source Partition in Kafka Streams from context
> ---
>
> Key: KAFKA-10448
> URL: https://issues.apache.org/jira/browse/KAFKA-10448
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: satya
>Priority: Minor
>  Labels: needs-kip
>
> Currently, Kafka Streams sink nodes use the default partitioner or allow a 
> custom partitioner, which must derive the partition from the key/value. I am 
> looking for an enhancement of the sink node so that the source partition is 
> preserved instead of being derived again from the key/value. One of our use 
> cases has producers with custom partitioners that we don't have access to, as 
> it is a third-party application. Simply preserving the partition through 
> context.partition() would be helpful.
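
For illustration, here is a minimal sketch of the workaround available today, 
assuming a String-keyed, String-valued stream: capture the source partition 
with the Processor API and carry it in the value, then route on it with a 
custom StreamPartitioner. The class names (PartitionedValue, 
PartitionTaggingTransformer, SourcePartitionPreservingPartitioner) are 
hypothetical.

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.Transformer;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.processor.StreamPartitioner;

    // Hypothetical value wrapper carrying the captured source partition.
    class PartitionedValue {
        final int sourcePartition;
        final String payload;

        PartitionedValue(final int sourcePartition, final String payload) {
            this.sourcePartition = sourcePartition;
            this.payload = payload;
        }
    }

    // Tags each record with the partition it was read from.
    class PartitionTaggingTransformer
            implements Transformer<String, String, KeyValue<String, PartitionedValue>> {
        private ProcessorContext context;

        @Override
        public void init(final ProcessorContext context) {
            this.context = context;
        }

        @Override
        public KeyValue<String, PartitionedValue> transform(final String key, final String value) {
            // context.partition() is the partition of the record currently being processed.
            return KeyValue.pair(key, new PartitionedValue(context.partition(), value));
        }

        @Override
        public void close() {}
    }

    // Routes each record to its captured source partition instead of re-hashing the key.
    class SourcePartitionPreservingPartitioner
            implements StreamPartitioner<String, PartitionedValue> {
        @Override
        public Integer partition(final String topic, final String key,
                                 final PartitionedValue value, final int numPartitions) {
            // Modulo guards against an output topic with fewer partitions than the source.
            return value.sourcePartition % numPartitions;
        }
    }

Wiring this up would look roughly like 
stream.transform(PartitionTaggingTransformer::new).to(topic, 
Produced.with(keySerde, valueSerde).withStreamPartitioner(new 
SourcePartitionPreservingPartitioner())). The enhancement requested here would 
make the tagging step unnecessary, since the sink node could read 
context.partition() directly.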



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10448) Preserve Source Partition in Kafka Streams from context

2020-09-01 Thread satya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188447#comment-17188447
 ] 

satya commented on KAFKA-10448:
---

[~mjsax] I would like to submit a KIP as you suggested and need some guidance, 
please. I have requested access. I am a newbie here, but I would like to 
contribute in any way possible.

> Preserve Source Partition in Kafka Streams from context
> ---
>
> Key: KAFKA-10448
> URL: https://issues.apache.org/jira/browse/KAFKA-10448
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: satya
>Priority: Minor
>  Labels: needs-kip
>
> Currently, Kafka Streams sink nodes use the default partitioner or allow a 
> custom partitioner, which must derive the partition from the key/value. I am 
> looking for an enhancement of the sink node so that the source partition is 
> preserved instead of being derived again from the key/value. One of our use 
> cases has producers with custom partitioners that we don't have access to, as 
> it is a third-party application. Simply preserving the partition through 
> context.partition() would be helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10448) Preserve Source Partition in Kafka Streams from context

2020-08-30 Thread satya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

satya updated KAFKA-10448:
--
Description: Currently, Kafka Streams sink nodes use the default partitioner 
or allow a custom partitioner, which must derive the partition from the 
key/value. I am looking for an enhancement of the sink node so that the source 
partition is preserved instead of being derived again from the key/value. One 
of our use cases has producers with custom partitioners that we don't have 
access to, as it is a third-party application. Simply preserving the partition 
through context.partition() would be helpful.  (was: the same description, 
with "customer partitioners" in place of "custom partitioners")

> Preserve Source Partition in Kafka Streams from context
> ---
>
> Key: KAFKA-10448
> URL: https://issues.apache.org/jira/browse/KAFKA-10448
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: satya
>Priority: Critical
>
> Currently, Kafka Streams sink nodes use the default partitioner or allow a 
> custom partitioner, which must derive the partition from the key/value. I am 
> looking for an enhancement of the sink node so that the source partition is 
> preserved instead of being derived again from the key/value. One of our use 
> cases has producers with custom partitioners that we don't have access to, as 
> it is a third-party application. Simply preserving the partition through 
> context.partition() would be helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10448) Preserve Source Partition in Kafka Streams from context

2020-08-30 Thread satya (Jira)
satya created KAFKA-10448:
-

 Summary: Preserve Source Partition in Kafka Streams from context
 Key: KAFKA-10448
 URL: https://issues.apache.org/jira/browse/KAFKA-10448
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.5.0
Reporter: satya


Currently, Kafka Streams sink nodes use the default partitioner or allow a 
custom partitioner, which must derive the partition from the key/value. I am 
looking for an enhancement of the sink node so that the source partition is 
preserved instead of being derived again from the key/value. One of our use 
cases has producers with customer partitioners that we don't have access to, 
as it is a third-party application. Simply preserving the partition through 
context.partition() would be helpful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-7644) Worker rebalance enhancements

2018-11-16 Thread satya (JIRA)
satya created KAFKA-7644:


 Summary: Worker rebalance enhancements
 Key: KAFKA-7644
 URL: https://issues.apache.org/jira/browse/KAFKA-7644
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: satya


Currently, the Kafka Connect distributed worker triggers a rebalance any time 
a new connector/task is added, irrespective of whether the added connector is 
a source connector or a sink connector.

My understanding has been that the worker rebalance should be identical to a 
consumer group rebalance. That said, shouldn't source connectors be immune to 
the rebalance?

Are we not supposed to use source connectors with distributed workers?

It does appear to me that there is some caveat in the way the worker rebalance 
works, and it needs enhancement to avoid triggering unwanted rebalances that 
cause restarts of all tasks.

Kafka connectors should have a way to avoid restarting and to keep their 
existing partition assignment when the rebalance trigger is related to a 
different connector.
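
To make the trigger concrete, here is a minimal sketch that registers a new 
connector through the worker REST API; as described above, any such addition 
currently causes all workers in the group to rebalance and restart their 
tasks. The worker address and connector settings are hypothetical, using the 
stock FileStreamSinkConnector:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class AddConnector {
        public static void main(String[] args) throws Exception {
            // Hypothetical connector config; POST /connectors registers it with the
            // distributed worker group, which then rebalances all connectors/tasks.
            String json = "{\"name\": \"example-sink\", \"config\": {"
                    + "\"connector.class\": \"org.apache.kafka.connect.file.FileStreamSinkConnector\","
                    + "\"topics\": \"orders\","
                    + "\"file\": \"/tmp/orders.txt\"}}";
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://localhost:8083/connectors"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(json))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }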




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7642) Kafka Connect - graceful shutdown of distributed worker

2018-11-16 Thread satya (JIRA)
satya created KAFKA-7642:


 Summary: Kafka Connect - graceful shutdown of distributed worker
 Key: KAFKA-7642
 URL: https://issues.apache.org/jira/browse/KAFKA-7642
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: satya


Currently, I don't find any ability to gracefully shut down a distributed 
worker other than killing the process.

Could you add a shutdown option to the workers?
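
For reference, the nearest thing available today appears to be pausing 
connectors through the worker's REST API before stopping the process; a 
minimal sketch, with a hypothetical worker address and connector name:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PauseConnector {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // PUT /connectors/{name}/pause stops the connector's tasks from
            // processing until the connector is resumed.
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://localhost:8083/connectors/my-connector/pause"))
                    .PUT(HttpRequest.BodyPublishers.noBody())
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("pause -> " + response.statusCode());
        }
    }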




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7643) Connectors do not unload even when tasks fail in Kafka Connect

2018-11-16 Thread satya (JIRA)
satya created KAFKA-7643:


 Summary: Connectors do not unload even when tasks fail in Kafka 
Connect
 Key: KAFKA-7643
 URL: https://issues.apache.org/jira/browse/KAFKA-7643
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: satya


If there are any issues in the tasks associated with a connector, even when 
the tasks fail, I have seen many times that the connector itself is not 
released.

The only way out of this is to submit an explicit request to delete the 
connector.

There should be a way to shut down connectors gracefully if the tasks 
encounter exceptions that are not retriable.

In addition, Kafka Connect also does not have a graceful exit option. There 
will be situations like server maintenance, outages, etc., where it would be 
prudent to gracefully shut down the connectors rather than performing a DELETE 
through the REST endpoint.
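
The explicit delete mentioned above goes through the worker's REST API; a 
minimal sketch, with a hypothetical worker address and connector name:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class DeleteConnector {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // DELETE /connectors/{name} removes the connector and shuts down its tasks.
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://localhost:8083/connectors/my-connector"))
                    .DELETE()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("delete -> " + response.statusCode());
        }
    }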




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3726) Enable cold storage option

2017-09-20 Thread Satya (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173020#comment-16173020
 ] 

Satya commented on KAFKA-3726:
--

An ideal approach would be to use the Kafka Connect HDFS source/sink 
connectors to take a backup of the Kafka segment files, which can be replayed 
when required.
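
For concreteness, a hedged sketch of what such a backup connector's 
configuration might look like, using Confluent's HDFS sink connector (the 
connector class and property names follow that connector's documentation; the 
name, topic, URL, and flush size are hypothetical). Note that a sink connector 
copies topic records rather than raw segment files, so this approximates a 
segment backup rather than reproducing it literally:

    {
      "name": "kafka-segment-backup",
      "config": {
        "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
        "tasks.max": "1",
        "topics": "orders",
        "hdfs.url": "hdfs://namenode:8020",
        "flush.size": "10000"
      }
    }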

> Enable cold storage option
> --
>
> Key: KAFKA-3726
> URL: https://issues.apache.org/jira/browse/KAFKA-3726
> Project: Kafka
>  Issue Type: Wish
>Reporter: Radoslaw Gruchalski
> Attachments: kafka-cold-storage.txt
>
>
> This JIRA builds on the cold storage article I have published on Medium. A 
> copy of the article is attached here.
> The need for cold storage or an "indefinite" log seems to be discussed quite 
> often on the user mailing list.
> The cold storage idea would give the operator the opportunity to keep the 
> raw Kafka segment files in third-party storage and allow retrieving the data 
> back for re-consumption.
> The two possible options for enabling such functionality are, from the 
> article:
> First approach: if Kafka provided a notification mechanism and could trigger 
> a program when a segment file is to be discarded, it would become feasible 
> to provide a standard method of moving data to cold storage in reaction to 
> those events. Once the program finishes backing the segments up, it could 
> tell Kafka "it is now safe to delete these segments".
> The second option is to provide an additional value for the 
> log.cleanup.policy setting, call it cold-storage. With this value, Kafka 
> would move the segment files that would otherwise be deleted to another 
> destination on the server. They can be picked up from there and moved to 
> cold storage.
> Both have their limitations. The former is simply a mechanism exposed to 
> allow the operator to build the tooling necessary to enable this. Events 
> could be published in a manner similar to the Mesos Event Bus 
> (https://mesosphere.github.io/marathon/docs/event-bus.html), or Kafka itself 
> could provide a control topic on which such info would be published. The 
> outcome is that the operator can subscribe to the event bus and get notified 
> about, at least, two events:
> - log segment is complete and can be backed up
> - partition leader changed
> These two, together with an option to keep the log segment safe from 
> compaction for a certain amount of time, would be sufficient to reliably 
> implement cold storage.
> The latter option, the {{log.cleanup.policy}} setting, would be a more 
> complete feature, but it is also much more difficult to implement. All 
> brokers would have to keep a backup of the data in cold storage, 
> significantly increasing the size requirements; also, de-duplication of the 
> replicated data would be left completely to the operator.
> In any case, the thing to stay away from is having Kafka deal with the 
> physical aspect of moving the data to and back from cold storage. This is 
> not Kafka's task. The intent is to provide a method for reliable cold 
> storage.
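
To make the second proposal concrete, here is a hedged sketch of how it might 
appear in a broker configuration; cold-storage is the value proposed in this 
ticket (it does not exist in Kafka), and the staging-directory property is 
purely hypothetical:

    # Proposed in this ticket; not an existing Kafka cleanup policy value
    log.cleanup.policy=cold-storage
    # Hypothetical property: where discarded segments would be staged for pickup
    log.cold.storage.dir=/var/kafka/cold-staging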



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)