Re: [EXTERNAL] [DISCUSS] KIP-308: Add a Kafka Source Connector to Kafka Connect

2018-05-28 Thread McCaig, Rhys
Sorry for the bad link to the KIP, here it is: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

> On May 28, 2018, at 10:19 PM, McCaig, Rhys  wrote:
> 
> Hi All,
> 
> I added a KIP to include a Kafka Source Connector with Kafka Connect.
> Here is the KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect
> 
> Looking forward to your feedback and suggestions.
> 
> Cheers,
> Rhys
> 
> 



[DISCUSS] KIP-308: Add a Kafka Source Connector to Kafka Connect

2018-05-28 Thread McCaig, Rhys
Hi All,

I added a KIP to include a Kafka Source Connector with Kafka Connect.
Here is the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect

Looking forward to your feedback and suggestions.

Cheers,
Rhys




[jira] [Created] (KAFKA-6963) KIP-308: Add a Kafka Source Connector to Kafka Connect

2018-05-28 Thread Rhys Anthony McCaig (JIRA)
Rhys Anthony McCaig created KAFKA-6963:
--

 Summary: KIP-308: Add a Kafka Source Connector to Kafka Connect
 Key: KAFKA-6963
 URL: https://issues.apache.org/jira/browse/KAFKA-6963
 Project: Kafka
  Issue Type: Improvement
Reporter: Rhys Anthony McCaig


This proposal introduces a new Kafka Connect Source Connector.

See the KIP at 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-308%3A+Add+a+Kafka+Source+Connector+to+Kafka+Connect]

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Avro to JSON

2018-05-28 Thread Gwen Shapira
Here's how you do it with KSQL:
https://gist.github.com/miguno/40caf67d2f122d8da528d878c99a4e29
You can then use Kafka Connect with the JDBC sink connector to upload the result to
SQL Server.
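
If you'd rather do the conversion in plain Java instead of KSQL, the Avro library
itself can re-encode a record as JSON. A rough sketch, assuming you already have the
deserialized GenericRecord and its Schema (class and method names below are just
illustrative):

import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class AvroToJson {
    // Re-serializes an already-deserialized Avro record as a JSON string.
    public static String toJson(Schema schema, GenericRecord record) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, out);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();
        return out.toString("UTF-8");
    }
}

The resulting JSON string can then be written to SQL Server with the JDBC sink
connector mentioned above, or with any ordinary database client.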

On Mon, May 28, 2018 at 2:22 AM,  wrote:

> Hi Team,
>
> Looking for an optimal solution to convert Avro format to JSON.
> The final objective is to save the JSON in a SQL Server DB.
>
> If any of you have worked on this, please suggest the best option.
>
> Regards
> Madhu
>
>


-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter  | blog



[jira] [Created] (KAFKA-6962) DescribeConfigsRequest Schema documentation is wrong/missing detail

2018-05-28 Thread Sean Policarpio (JIRA)
Sean Policarpio created KAFKA-6962:
--

 Summary: DescribeConfigsRequest Schema documentation is 
wrong/missing detail
 Key: KAFKA-6962
 URL: https://issues.apache.org/jira/browse/KAFKA-6962
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 1.1.0
Reporter: Sean Policarpio


The protocol documentation for the following DescribeConfigsRequest Resource fields 
is {{null}}:
 * resource_type
 * resource_name
 * config_names

Additionally, after using the API, I've also noted that {{resource_name}} 
should probably be listed as a nullable String since it's optional.

The PR attached would output something like the following:

*Requests:*

{{DescribeConfigs Request (Version: 0) => [resources] }}
{{  resources => resource_type resource_name [config_names] }}
{{    resource_type => INT8}}
{{    resource_name => NULLABLE_STRING}}
{{    config_names => STRING}}

 
||Field||Description||
|resources|An array of config resources to be returned.|
|resource_type|The resource type, which is one of 0 (UNKNOWN), 1 (ANY), 2 (TOPIC), 3 (GROUP), 4 (BROKER)|
|resource_name|The resource name to query. If set to null, will retrieve all resources of type resource_type|
|config_names|An array of config names to retrieve. If set to null, then all configs are returned|

  

{{DescribeConfigs Request (Version: 1) => [resources] include_synonyms }}
{{  resources => resource_type resource_name [config_names] }}
{{    resource_type => INT8}}
{{    resource_name => NULLABLE_STRING}}
{{    config_names => STRING}}
{{  include_synonyms => BOOLEAN}}
||Field||Description||
|resources|An array of config resources to be returned.|
|resource_type|The resource type, which is one of 0 (UNKNOWN), 1 (ANY), 2 (TOPIC), 3 (GROUP), 4 (BROKER)|
|resource_name|The resource name to query. If set to null, will retrieve all resources of type resource_type|
|config_names|An array of config names to retrieve. If set to null, then all configs are returned|
|include_synonyms|null|
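
For readers mapping these protocol fields to the client API, the same describe
operation is issued through the Java AdminClient roughly as follows (bootstrap
address and topic name are placeholders; asking for all configs corresponds to the
null config_names case in the tables above):

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeTopicConfigs {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // resource_type = TOPIC (2), resource_name = "my-topic", all config_names
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            Config config = admin.describeConfigs(Collections.singleton(topic))
                                 .all().get().get(topic);
            config.entries().forEach(e -> System.out.println(e.name() + " = " + e.value()));
        }
    }
}
{code}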



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6698) ConsumerBounceTest#testClose sometimes fails

2018-05-28 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved KAFKA-6698.
---
Resolution: Cannot Reproduce

> ConsumerBounceTest#testClose sometimes fails
> 
>
> Key: KAFKA-6698
> URL: https://issues.apache.org/jira/browse/KAFKA-6698
> Project: Kafka
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
>
> Saw the following in 
> https://builds.apache.org/job/kafka-1.1-jdk7/94/testReport/junit/kafka.api/ConsumerBounceTest/testClose/
>  :
> {code}
> org.apache.kafka.common.errors.TimeoutException: The consumer group command 
> timed out while waiting for group to initialize: 
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[DISCUSS] Why don't we support a web UI

2018-05-28 Thread ????????????
Hi team,
I'm curious why we don't offer a web UI for administrators. A web UI would be very
useful for displaying cluster status and current topic status.

[jira] [Resolved] (KAFKA-6427) Inconsistent exception type from KafkaConsumer.position

2018-05-28 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-6427.
--
   Resolution: Fixed
Fix Version/s: 2.0.0

Fixed in https://github.com/apache/kafka/pull/5005
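
For context on the issue quoted below, a minimal sketch of the call pattern that
triggered the inconsistency (broker address, topic and partition numbers are
placeholders):

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PositionOnUnassignedPartition {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
            // Partition 1 is NOT assigned: this call used to throw IllegalArgumentException,
            // while every other API threw IllegalStateException; with the fix above it is
            // consistent with the rest of the API.
            consumer.position(new TopicPartition("my-topic", 1));
        }
    }
}
{code}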

> Inconsistent exception type from KafkaConsumer.position
> ---
>
> Key: KAFKA-6427
> URL: https://issues.apache.org/jira/browse/KAFKA-6427
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Jay Kahrman
>Priority: Trivial
> Fix For: 2.0.0
>
>
> If KafkaConsumer.position is called with a partition that the consumer isn't 
> assigned, it throws an IllegalArgumentException. All other APIs throw an 
> IllegalStateException when the consumer tries to act on a partition that is 
> not assigned to the consumer. 
> Looking at the implementation, if it weren't for the subscription check and the 
> IllegalArgumentException thrown at the beginning of KafkaConsumer.position, 
> the very next line would throw an IllegalStateException anyway.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : kafka-trunk-jdk10 #144

2018-05-28 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-05-28 Thread Matthias J. Sax
Luis,

this week is feature freeze for the upcoming 2.0 release and most people
focus on getting their PR merged. Thus, this and the next week (until
code freeze) KIPs for 2.1 are not a high priority for most people.

Please bear with us. Thanks for your understanding.


-Matthias
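
(For anyone skimming the cache-sizing discussion quoted below, the entry-size
arithmetic works out roughly as follows; the 128 MB buffer is just an illustrative
figure, and this is not Kafka's actual cleaner code:)

public class CompactionCacheMath {
    public static void main(String[] args) {
        long dedupeBufferBytes = 128L * 1024 * 1024;  // illustrative, in the spirit of log.cleaner.dedupe.buffer.size
        int currentEntryBytes  = 16 + 8;              // MD5 key digest + 8-byte offset           = 24 bytes
        int proposedEntryBytes = 16 + 8 + 8;          // plus the long-typed compaction "version" = 32 bytes (+33%)
        System.out.println("keys per cleaning pass today:    " + dedupeBufferBytes / currentEntryBytes);   // ~5.59 million
        System.out.println("keys per cleaning pass proposed: " + dedupeBufferBytes / proposedEntryBytes);  // ~4.19 million (~25% fewer)
    }
}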

On 5/28/18 5:21 AM, Luís Cabral wrote:
>  Hi Guozhang,
> 
> It doesn't look like there will be much feedback here.
> Is it alright if I just update the spec back to a standardized behaviour and 
> move this along?
> 
> Cheers, Luis
> On Thursday, May 24, 2018, 11:20:01 AM GMT+2, Luis Cabral 
>  wrote:  
>  
>  Hi Jun / Ismael, 
> 
> Any chance to get your opinion on this?
> Thanks in advance!
> 
> Regards,
> Luís
> 
>> On 22 May 2018, at 17:30, Guozhang Wang  wrote:
>>
>> Hello Luís,
>>
>> While reviewing your PR I realized my previous calculation on the memory
>> usage was incorrect: in fact, in the current implementation, each entry in
>> the memory-bounded cache is 16 (default MD5 hash digest length) + 8 (long
>> type) = 24 bytes, and if we add the long-typed version value it is 32
>> bytes. I.e. each entry will be increased by 33%, not doubling.
>>
>> After redoing the math I'm a bit leaning towards just adding this entry for
>> all cases rather than treating timestamp differently with others (sorry for
>> being back and forth, but I just want to make sure we've got a good balance
>> between efficiency and semantics consistency). I've also chatted with Jun
>> and Ismael about this (cc'ed), and maybe you guys can chime in here as well.
>>
>>
>> Guozhang
>>
>>
>> On Tue, May 22, 2018 at 6:45 AM, Luís Cabral 
>> wrote:
>>
>>> Hi Matthias / Guozhang,
>>>
>>> Were the questions clarified?
>>> Please feel free to add more feedback, otherwise it would be nice to move
>>> this topic onwards 
>>>
>>> Kind Regards,
>>> Luís Cabral
>>>
>>> From: Guozhang Wang
>>> Sent: 09 May 2018 20:00
>>> To: dev@kafka.apache.org
>>> Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction
>>>
>>> I have thought about consistency in strategy v.s. the practical concern of
>>> storage overhead and its impact on compaction effectiveness.
>>>
>>> The difference between the timestamp and the header key-value pairs is that for
>>> the latter, as I mentioned before, "it is arguably out of Kafka's control,
>>> and indeed users may (mistakenly) generate many records with the same key
>>> and the same header value." So giving up tie breakers may result in very
>>> poor compaction effectiveness when it happens, while for timestamps the
>>> likelihood of this is considered very small.
>>>
>>>
>>> Guozhang
>>>
>>>
>>> On Sun, May 6, 2018 at 8:55 PM, Matthias J. Sax 
>>> wrote:
>>>
 Thanks.

 To reverse the question: if this argument holds, why does it not apply
 to the case when the header key is used as compaction attribute?

 I am not against keeping both records in case timestamps are equal, but
 shouldn't we apply the same strategy for all cases and not use the offset
 as tie-breaker at all?


 -Matthias

> On 5/6/18 8:47 PM, Guozhang Wang wrote:
> Hello Matthias,
>
> The related discussion was in the PR:
> https://github.com/apache/kafka/pull/4822#discussion_r184588037
>
> The concern is that, to use the offset as tie breaker we need to double the
> size of each entry in the bounded compaction cache, and hence largely
> reduce the effectiveness of the compaction itself. Since with
 milliseconds
> timestamp the scenario of ties with the same key is expected to be
 small, I
> think it would be a reasonable tradeoff to make.
>
>
> Guozhang
>
> On Sun, May 6, 2018 at 9:37 AM, Matthias J. Sax >>>
> wrote:
>
>> Hi,
>>
>> I just updated myself on this KIP. One question (maybe it was
>>> discussed
>> and I missed it). What is the motivation to not use the offset as tie
>> breaker for the "timestamp" case? Isn't this inconsistent behavior?
>>
>>
>> -Matthias
>>
>>
>>> On 5/2/18 2:07 PM, Guozhang Wang wrote:
>>> Hello Luís,
>>>
>>> Sorry for the late reply.
>>>
>>> My understanding is that such duplicates will only happen if the
>> non-offset
>>> version value, either the timestamp or some long-typed header key,
>>> are
>> the
>>> same (i.e. we cannot break ties).
>>>
>>> 1. For timestamp, which is in milli-seconds, I think in practice the
>>> likelihood of records with the same key and the same milli-sec
 timestamp
>>> is very small. And hence the duplicate amount should be very small.
>>>
>>> 2. For long-typed header key, it is arguably out of Kafka's control,
 and
>>> indeed users may (mistakenly) generate many records with the same key
 and
>>> the same header value.
>>>
>>>
>>> So I'd like to propose a counter-offer: for 1), we still use only 8
 bytes
>>> and allows for potential duplicates due 

Build failed in Jenkins: kafka-trunk-jdk10 #143

2018-05-28 Thread Apache Jenkins Server
See 


Changes:

[jason] MINOR: Update consumer javadoc for invalid operations on unassigned

--
[...truncated 1.53 MB...]
kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAbortExpiredTransactionsInOngoingStateAndBumpEpoch STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAbortExpiredTransactionsInOngoingStateAndBumpEpoch PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldGenerateNewProducerIdIfEpochsExhausted STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldGenerateNewProducerIdIfEpochsExhausted PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithNotCoordinatorOnInitPidWhenNotCoordinator STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithNotCoordinatorOnInitPidWhenNotCoordinator PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldInitPidWithEpochZeroForNewTransactionalId STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldInitPidWithEpochZeroForNewTransactionalId PASSED

kafka.coordinator.transaction.ProducerIdManagerTest > testExceedProducerIdLimit 
STARTED

kafka.coordinator.transaction.ProducerIdManagerTest > testExceedProducerIdLimit 
PASSED

kafka.coordinator.transaction.ProducerIdManagerTest > testGetProducerId STARTED

kafka.coordinator.transaction.ProducerIdManagerTest > testGetProducerId PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldSaveForLaterWhenLeaderUnknownButNotAvailable STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldSaveForLaterWhenLeaderUnknownButNotAvailable PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldGenerateEmptyMapWhenNoRequestsOutstanding STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldGenerateEmptyMapWhenNoRequestsOutstanding PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldCreateMetricsOnStarting STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldCreateMetricsOnStarting PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldAbortAppendToLogOnEndTxnWhenNotCoordinatorError STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldAbortAppendToLogOnEndTxnWhenNotCoordinatorError PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldRetryAppendToLogOnEndTxnWhenCoordinatorNotAvailableError STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldRetryAppendToLogOnEndTxnWhenCoordinatorNotAvailableError PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldCompleteAppendToLogOnEndTxnWhenSendMarkersSucceed STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldCompleteAppendToLogOnEndTxnWhenSendMarkersSucceed 

Permission to create KIP

2018-05-28 Thread Arseniy Tashoyan
Hi,

Please give me permission to create KIP.
id: tashoyan

Arseniy Tashoyan


[jira] [Resolved] (KAFKA-4368) Unclean shutdown breaks Kafka cluster

2018-05-28 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4368.
--
Resolution: Auto Closed

Closing inactive issue.
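
For anyone re-testing the scenario quoted below, the producer side of the
reproduction corresponds to settings roughly like these (bootstrap addresses and
topic name are placeholders):

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AcksAllRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "brokerA:9092,brokerB:9092,brokerC:9092");
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas, as in the report
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Topic created with replication_factor = 2 and broker-side min.insync.replicas = 1,
            // then one broker is rebooted uncleanly while messages are being sent.
            producer.send(new ProducerRecord<>("logstash", "key", "value"));
            producer.flush();
        }
    }
}
{code}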

> Unclean shutdown breaks Kafka cluster
> -
>
> Key: KAFKA-4368
> URL: https://issues.apache.org/jira/browse/KAFKA-4368
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Anukool Rattana
>Priority: Critical
>
> My team has observed that if a broker process dies uncleanly, it will block 
> producers from sending messages to the Kafka topic.
> Here is how to reproduce the problem:
> 1) Create a Kafka 0.10 cluster with three brokers (A, B and C). 
> 2) Create a topic with replication_factor = 2. 
> 3) Set the producer to send messages with "acks=all", meaning all in-sync replicas 
> must acknowledge a message before the next one can proceed. 
> 4) Force IEM (IBM Endpoint Manager) to send a patch to broker A and force the 
> server to reboot after the patches are installed.
> Note: min.insync.replicas = 1
> Result: producers are not able to send messages to the Kafka topic after the broker 
> reboots and rejoins the cluster, with the following error message: 
> [2016-09-28 09:32:41,823] WARN Error while fetching metadata with correlation 
> id 0 : {logstash=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> We suspected that a replication_factor of 2 is not sufficient for our 
> Kafka environment, but we really need an explanation of what happens when a broker 
> faces an unclean shutdown. 
> The same issue occurred when setting up a cluster with 2 brokers and 
> replication_factor = 1.
> The workaround I used to recover the service was to clean up both the Kafka topic log 
> files and the ZooKeeper data (rmr /brokers/topics/XXX and rmr /consumers/XXX).
> Note:
> Topic list after A comes back from the reboot:
> Topic:logstash  PartitionCount:3  ReplicationFactor:2  Configs:
> Topic: logstash  Partition: 0  Leader: 1  Replicas: 1,3  Isr: 1,3
> Topic: logstash  Partition: 1  Leader: 2  Replicas: 2,1  Isr: 2,1
> Topic: logstash  Partition: 2  Leader: 3  Replicas: 3,2  Isr: 2,3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6769) Upgrade jetty library version

2018-05-28 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-6769.
--
   Resolution: Fixed
Fix Version/s: 2.0.0

Jetty version upgraded to 9.4.10.v20180503 in trunk.

> Upgrade jetty library version
> -
>
> Key: KAFKA-6769
> URL: https://issues.apache.org/jira/browse/KAFKA-6769
> Project: Kafka
>  Issue Type: Task
>  Components: core, security
>Affects Versions: 1.1.0
>Reporter: Di Shang
>Priority: Critical
> Fix For: 2.0.0
>
>
> jetty 9.2 has reached end of life as of Jan 2018
> [http://www.eclipse.org/jetty/documentation/current/what-jetty-version.html#d0e203]
> Current version used in Kafka 1.1.0: 9.2.24.v20180105
> For security reasons, please upgrade to a later version. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-5346) Kafka Producer Failure After Idle Time

2018-05-28 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5346.
--
Resolution: Not A Problem

Closing as per above comment. Please reopen if you think the issue still exists

> Kafka Producer Failure After Idle Time
> --
>
> Key: KAFKA-5346
> URL: https://issues.apache.org/jira/browse/KAFKA-5346
> Project: Kafka
>  Issue Type: Bug
> Environment: 0.9.0.1 , windows
>Reporter: Manikandan P
>Priority: Major
>  Labels: windows
>
> We are using Kafka (2.11-0.9.0.1) on Windows and the .NET Kafka SDK 
> (kafka-net) to connect to the Kafka server.
> When we produce data to the Kafka server after 15 minutes of .NET client idle 
> time, we get the exception below in the Kafka SDK logs.
> TcpClient Socket [http://10.X.X.100:9092//10.X.X.211:50290] Socket.Poll(S): 
> Data was not available, may be connection was closed. 
> TcpClient Socket [http://10.X.X.100:9092//10.X.X.211:50290] has been closed 
> successfully.
> It seems that the Kafka server accepts the socket request but does not respond 
> to it, due to which we are not able to produce messages to Kafka even 
> though the Kafka server is online.
> We also tried to increase the threads and decrease the idle time in 
> server.properties on the Kafka server as below, and we still get the above logs.
> num.network.threads=6
> num.io.threads=16
> connections.max.idle.ms =12
> Please help us to resolve the above issue, as it is breaking our functional flow 
> and we have a go-live next week.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-05-28 Thread Vahid S Hashemian
Hi Viktor,

Thanks for sharing your opinion.
So you're in favor of disallowing the empty ("") group id altogether (even 
for fetching).
Given that ideally no one should be using the empty group id (at least in 
a production setting) I think the impact would be minimal in either case.

But as you said, let's hear what others think and I'd be happy to modify 
the KIP if needed.

Regards.
--Vahid
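
As a concrete reference for the discussion below: the "fetch-only" mode that would
keep working is roughly a consumer on the default/empty group id that assigns
partitions itself, seeks (or relies on auto.offset.reset), and never commits. A
minimal sketch (bootstrap address, topic and partition are placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StandaloneConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // No group.id is set, so the default ("") applies; auto-commit must stay off.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp));
            consumer.seekToBeginning(Collections.singletonList(tp)); // or rely on auto.offset.reset
            ConsumerRecords<String, String> records = consumer.poll(1000L);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.offset() + ": " + record.value());
            }
            // No commitSync()/commitAsync(): committing under the default group id is
            // what the KIP proposes to reject with an error.
        }
    }
}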




From:   Viktor Somogyi 
To: dev 
Date:   05/28/2018 05:18 AM
Subject:Re: [DISCUSS] KIP-289: Improve the default group id 
behavior in KafkaConsumer



Hi Vahid,

(with the argument that using the default group id for offset commit
should not be the user's intention in practice).

Yeah, in my opinion too this use case doesn't seem very practical. Also, I
think breaking the offset commit is no smaller a change from this perspective than
breaking fetch and offset fetch. If we suppose that someone uses the
default group id and we break only the offset commit, then that might be harder
to detect than breaking the whole thing altogether (if we think about an
upgrade situation).
So, since we think it is not a practical use case, I think it would be
better to break it altogether, but of course that's just my 2 cents :). Let's gather
others' input as well.

Cheers,
Viktor

On Fri, May 25, 2018 at 5:43 PM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi Viktor,
>
> Thanks for reviewing the KIP.
>
> Yes, to minimize the backward compatibility impact, there would be no harm
> in letting a stand-alone consumer consume messages under a "" group id (as
> long as there is no offset commit).
> It would have to knowingly seek to an offset or rely on the
> auto.offset.reset config for the starting offset.
> This way the existing functionality would be preserved for the most part
> (with the argument that using the default group id for offset commit
> should not be the user's intention in practice).
>
> Does it seem reasonable?
>
> Thanks.
> --Vahid
>
>
>
>
> From:   Viktor Somogyi 
> To: dev 
> Date:   05/25/2018 04:56 AM
> Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> behavior in KafkaConsumer
>
>
>
> Hi Vahid,
>
> When reading your KIP I couldn't fully understand why you decided to fail
> with "offset_commit" in case #2. Can't we fail with an empty group
> id even in "fetch" or "fetch_offset"? What was the reason for deciding to
> fail at "offset_commit"? Was it because of upgrade compatibility reasons?
>
> Thanks,
> Viktor
>
> On Thu, May 24, 2018 at 12:06 AM, Ted Yu  wrote:
>
> > Looks good to me.
> >  Original message From: Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> Date: 5/23/18  11:19 AM  (GMT-08:00) To:
> > dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-289: Improve the 
default
> > group id behavior in KafkaConsumer
> > Hi Ted,
> >
> > Thanks for reviewing the KIP. I updated the KIP and introduced an 
error
> > code for the scenario described.
> >
> > --Vahid
> >
> >
> >
> >
> > From:   Ted Yu 
> > To: dev@kafka.apache.org
> > Date:   04/27/2018 04:31 PM
> > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > behavior in KafkaConsumer
> >
> >
> >
> > bq. If they attempt an offset commit they will receive an error.
> >
> > Can you outline what specific error would be encountered ?
> >
> > Thanks
> >
> > On Fri, Apr 27, 2018 at 2:17 PM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> wrote:
> >
> > > Hi all,
> > >
> > > I have drafted a proposal for improving the behavior of 
KafkaConsumer
> > when
> > > using the default group id:
> > >
> >
> 
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer
> > > The proposal based on the issue and suggestion reported in 
KAFKA-6774.
> > >
> > > Your feedback is welcome!
> > >
> > > Thanks.
> > > --Vahid
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>






Re: [VOTE] KIP-264: Add a consumer metric to record raw fetch size

2018-05-28 Thread Viktor Somogyi
Hi Vahid,

+1 (non-binding).
The KIP looks good, will take a look at your PR tomorrow.

Cheers,
Viktor
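
For context, the consumer's existing fetch-size-avg / fetch-size-max metrics reflect
the fetched (compressed) bytes, and a new metric like the proposed one would surface
through the same metrics() map. A rough sketch of reading such metrics (the exact
name of the new metric is defined in the KIP, not here):

import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class FetchSizeMetrics {
    // Prints every consumer metric whose name mentions "fetch-size".
    public static void print(KafkaConsumer<?, ?> consumer) {
        for (Map.Entry<MetricName, ? extends Metric> entry : consumer.metrics().entrySet()) {
            MetricName name = entry.getKey();
            if (name.name().contains("fetch-size")) {
                System.out.println(name.group() + "/" + name.name() + " = " + entry.getValue().metricValue());
            }
        }
    }
}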

On Mon, May 28, 2018 at 6:03 AM, Ted Yu  wrote:

> lgtm
>
> On Fri, May 25, 2018 at 10:51 AM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > In the absence of additional feedback on this KIP I'd like to start a
> > vote.
> >
> > To summarize, the KIP simply proposes to add a consumer metric to track
> > the size of raw (uncompressed) fetched messages.
> > The KIP can be found here:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-264%3A+Add+a+consumer+metric+to+record+raw+fetch+size
> >
> > Thanks.
> > --Vahid
> >
> >
> >
>


Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-05-28 Thread Luís Cabral
 Hi Guozhang,

It doesn't look like there will be much feedback here.
Is it alright if I just update the spec back to a standardized behaviour and 
move this along?

Cheers, Luis
On Thursday, May 24, 2018, 11:20:01 AM GMT+2, Luis Cabral 
 wrote:  
 
 Hi Jun / Ismael, 

Any chance to get your opinion on this?
Thanks in advance!

Regards,
Luís

> On 22 May 2018, at 17:30, Guozhang Wang  wrote:
> 
> Hello Luís,
> 
> While reviewing your PR I realized my previous calculation on the memory
> usage was incorrect: in fact, in the current implementation, each entry in
> the memory-bounded cache is 16 (default MD5 hash digest length) + 8 (long
> type) = 24 bytes, and if we add the long-typed version value it is 32
> bytes. I.e. each entry will be increased by 33%, not doubling.
> 
> After redoing the math I'm a bit leaning towards just adding this entry for
> all cases rather than treating timestamp differently with others (sorry for
> being back and forth, but I just want to make sure we've got a good balance
> between efficiency and semantics consistency). I've also chatted with Jun
> and Ismael about this (cc'ed), and maybe you guys can chime in here as well.
> 
> 
> Guozhang
> 
> 
> On Tue, May 22, 2018 at 6:45 AM, Luís Cabral 
> wrote:
> 
>> Hi Matthias / Guozhang,
>> 
>> Were the questions clarified?
>> Please feel free to add more feedback, otherwise it would be nice to move
>> this topic onwards 
>> 
>> Kind Regards,
>> Luís Cabral
>> 
>> From: Guozhang Wang
>> Sent: 09 May 2018 20:00
>> To: dev@kafka.apache.org
>> Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction
>> 
>> I have thought about consistency in strategy v.s. the practical concern of
>> storage overhead and its impact on compaction effectiveness.
>> 
>> The difference between the timestamp and the header key-value pairs is that for
>> the latter, as I mentioned before, "it is arguably out of Kafka's control,
>> and indeed users may (mistakenly) generate many records with the same key
>> and the same header value." So giving up tie breakers may result in very
>> poor compaction effectiveness when it happens, while for timestamps the
>> likelihood of this is considered very small.
>> 
>> 
>> Guozhang
>> 
>> 
>> On Sun, May 6, 2018 at 8:55 PM, Matthias J. Sax 
>> wrote:
>> 
>>> Thanks.
>>> 
>>> To reverse the question: if this argument holds, why does it not apply
>>> to the case when the header key is used as compaction attribute?
>>> 
>>> I am not against keeping both records in case timestamps are equal, but
>>> shouldn't we apply the same strategy for all cases and not use the offset
>>> as tie-breaker at all?
>>> 
>>> 
>>> -Matthias
>>> 
 On 5/6/18 8:47 PM, Guozhang Wang wrote:
 Hello Matthias,
 
 The related discussion was in the PR:
 https://github.com/apache/kafka/pull/4822#discussion_r184588037
 
 The concern is that, to use the offset as tie breaker we need to double the
 size of each entry in the bounded compaction cache, and hence largely
 reduce the effectiveness of the compaction itself. Since with
>>> milliseconds
 timestamp the scenario of ties with the same key is expected to be
>>> small, I
 think it would be a reasonable tradeoff to make.
 
 
 Guozhang
 
 On Sun, May 6, 2018 at 9:37 AM, Matthias J. Sax >> 
 wrote:
 
> Hi,
> 
> I just updated myself on this KIP. One question (maybe it was
>> discussed
> and I missed it). What is the motivation to not use the offset as tie
> breaker for the "timestamp" case? Isn't this inconsistent behavior?
> 
> 
> -Matthias
> 
> 
>> On 5/2/18 2:07 PM, Guozhang Wang wrote:
>> Hello Luís,
>> 
>> Sorry for the late reply.
>> 
>> My understanding is that such duplicates will only happen if the
> non-offset
>> version value, either the timestamp or some long-typed header key,
>> are
> the
>> same (i.e. we cannot break ties).
>> 
>> 1. For timestamp, which is in milli-seconds, I think in practice the
>> likelihood of records with the same key and the same milli-sec
>>> timestamp
>> is very small. And hence the duplicate amount should be very small.
>> 
>> 2. For long-typed header key, it is arguably out of Kafka's control,
>>> and
>> indeed users may (mistakenly) generate many records with the same key
>>> and
>> the same header value.
>> 
>> 
>> So I'd like to propose a counter-offer: for 1), we still use only 8
>>> bytes
>> and allows for potential duplicates due to ties; for 2) we use 16
>> bytes
> to
>> always break ties. The motivation for distinguishing 1) and 2), is
>> that
> my
>> expectation for 1) would be much common, and hence worth special
>>> handling
>> it to be more effective in cleaning. WDYT?
>> 
>> 
>> Guozhang
>> 
>> 

Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-05-28 Thread Viktor Somogyi
Hi Vahid,

(with the argument that using the default group id for offset commit
should not be the user's intention in practice).

Yeah, in my opinion too this use case doesn't seem very practical. Also, I
think breaking the offset commit is no smaller a change from this perspective than
breaking fetch and offset fetch. If we suppose that someone uses the
default group id and we break only the offset commit, then that might be harder
to detect than breaking the whole thing altogether (if we think about an
upgrade situation).
So, since we think it is not a practical use case, I think it would be
better to break it altogether, but of course that's just my 2 cents :). Let's gather
others' input as well.

Cheers,
Viktor

On Fri, May 25, 2018 at 5:43 PM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi Viktor,
>
> Thanks for reviewing the KIP.
>
> Yes, to minimize the backward compatibility impact, there would be no harm
> in letting a stand-alone consumer consume messages under a "" group id (as
> long as there is no offset commit).
> It would have to knowingly seek to an offset or rely on the
> auto.offset.reset config for the starting offset.
> This way the existing functionality would be preserved for the most part
> (with the argument that using the default group id for offset commit
> should not be the user's intention in practice).
>
> Does it seem reasonable?
>
> Thanks.
> --Vahid
>
>
>
>
> From:   Viktor Somogyi 
> To: dev 
> Date:   05/25/2018 04:56 AM
> Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> behavior in KafkaConsumer
>
>
>
> Hi Vahid,
>
> When reading your KIP I couldn't fully understand why you decided to fail
> with "offset_commit" in case #2. Can't we fail with an empty group
> id even in "fetch" or "fetch_offset"? What was the reason for deciding to
> fail at "offset_commit"? Was it because of upgrade compatibility reasons?
>
> Thanks,
> Viktor
>
> On Thu, May 24, 2018 at 12:06 AM, Ted Yu  wrote:
>
> > Looks good to me.
> >  Original message From: Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> Date: 5/23/18  11:19 AM  (GMT-08:00) To:
> > dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-289: Improve the default
> > group id behavior in KafkaConsumer
> > Hi Ted,
> >
> > Thanks for reviewing the KIP. I updated the KIP and introduced an error
> > code for the scenario described.
> >
> > --Vahid
> >
> >
> >
> >
> > From:   Ted Yu 
> > To: dev@kafka.apache.org
> > Date:   04/27/2018 04:31 PM
> > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > behavior in KafkaConsumer
> >
> >
> >
> > bq. If they attempt an offset commit they will receive an error.
> >
> > Can you outline what specific error would be encountered ?
> >
> > Thanks
> >
> > On Fri, Apr 27, 2018 at 2:17 PM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> wrote:
> >
> > > Hi all,
> > >
> > > I have drafted a proposal for improving the behavior of KafkaConsumer
> > when
> > > using the default group id:
> > >
> >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer
> > > The proposal based on the issue and suggestion reported in KAFKA-6774.
> > >
> > > Your feedback is welcome!
> > >
> > > Thanks.
> > > --Vahid
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>


[jira] [Created] (KAFKA-6961) UnknownTopicOrPartitionException & NotLeaderForPartitionException upon replication of topics.

2018-05-28 Thread kaushik srinivas (JIRA)
kaushik srinivas created KAFKA-6961:
---

 Summary: UnknownTopicOrPartitionException & 
NotLeaderForPartitionException upon replication of topics.
 Key: KAFKA-6961
 URL: https://issues.apache.org/jira/browse/KAFKA-6961
 Project: Kafka
  Issue Type: Bug
 Environment: kubernetes cluster kafka.
Reporter: kaushik srinivas
 Attachments: k8s_NotLeaderForPartition.txt, k8s_replication_errors.txt

Running kafka & zookeeper in kubernetes cluster.

No of brokers : 3

No of partitions per topic : 3

We created a topic with 3 partitions, and it looks like all the partitions are up.

Below is a snapshot to confirm this:

Topic:applestore  PartitionCount:3  ReplicationFactor:3  Configs:
 Topic: applestore  Partition: 0  Leader: 1001  Replicas: 1001,1003,1002  Isr: 1001,1003,1002
 Topic: applestore  Partition: 1  Leader: 1002  Replicas: 1002,1001,1003  Isr: 1002,1001,1003
 Topic: applestore  Partition: 2  Leader: 1003  Replicas: 1003,1002,1001  Isr: 1003,1002,1001
 
But as soon as the topics are created, we see the stack traces below in the brokers:
 
error 1: 
[2018-05-28 08:00:31,875] ERROR [ReplicaFetcher replicaId=1001, leaderId=1003, 
fetcherId=7] Error for partition applestore-2 to broker 
1003:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This 
server does not host this topic-partition. (kafka.server.ReplicaFetcherThread)
 
error 2 :
[2018-05-28 00:43:20,993] ERROR [ReplicaFetcher replicaId=1003, leaderId=1001, 
fetcherId=0] Error for partition apple-6 to broker 
1001:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server 
is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
 
When we try producing records to each specific partition, it works fine, and the log 
size across the replicated brokers appears to be equal, which means replication is 
happening fine.
Attaching the two stack trace files.
 
Why are these stack traces appearing? Can we ignore them if they are just spurious 
messages?
 
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Avro to JSON

2018-05-28 Thread Madhu.Kartha
Hi Team,

Looking for an optimal solution to convert Avro format to JSON.
The final objective is to save the JSON in a SQL Server DB.

If any of you have worked on this, please suggest the best option.

Regards
Madhu



[jira] [Created] (KAFKA-6960) Remove the methods from the internal Scala AdminClient that are provided by the new AdminClient

2018-05-28 Thread Attila Sasvari (JIRA)
Attila Sasvari created KAFKA-6960:
-

 Summary: Remove the methods from the internal Scala AdminClient 
that are provided by the new AdminClient
 Key: KAFKA-6960
 URL: https://issues.apache.org/jira/browse/KAFKA-6960
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Attila Sasvari


This is a follow-up task of KAFKA-6884. 

We should remove all the methods from the internal Scala AdminClient that are 
provided by the new AdminClient. To "safe delete" them (i.e. 
{{deleteConsumerGroups, describeConsumerGroup, listGroups, listAllGroups,  
listAllGroupsFlattened}}), related tests need to be reviewed and adjusted (for 
example: the tests in core_tests and streams_test). 
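
For reference, the Java AdminClient counterparts of those Scala methods look roughly
like this (bootstrap address and group name are placeholders):

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class GroupAdminExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // listGroups / listAllGroups -> listConsumerGroups()
            admin.listConsumerGroups().all().get()
                 .forEach(g -> System.out.println(g.groupId()));
            // describeConsumerGroup -> describeConsumerGroups(...)
            System.out.println(admin.describeConsumerGroups(Collections.singleton("my-group"))
                                    .all().get());
            // deleteConsumerGroups -> deleteConsumerGroups(...)
            admin.deleteConsumerGroups(Collections.singleton("my-group")).all().get();
        }
    }
}
{code}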



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: TLS settings in librdkafka

2018-05-28 Thread Magnus Edenhill
Hi Valentin,

after discussing this offline with Rajini we concluded that your solution
for the JVM will work fine and there is no further correlation required
between librdkafka and Java clients on this issue.

Thanks,
Magnus
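
(For readers following along: the JVM-side approach Valentín describes below amounts
to setting standard JDK TLS system properties before the TLS stack initializes, i.e.
the equivalent of -Djdk.tls.namedGroups=... on the command line. The curve list here
is purely illustrative.)

public class TlsNamedGroupsExample {
    public static void main(String[] args) {
        // Must be set before any TLS context is created (jdk.tls.namedGroups needs JDK 8u121+).
        System.setProperty("jdk.tls.namedGroups", "secp256r1,secp384r1");
        // ... start SSL-enabled Kafka clients after this point ...
    }
}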

2018-05-23 17:32 GMT+02:00 Valentín Gutierrez :

> Hi,
>
> At the Wikimedia Foundation (WMF) we are currently conducting a
> security review regarding TLS on our kafka stack[1]. As part of this
> review we identified some gaps regarding TLS customization in
> librdkafka and we submitted a PR to the project[2]. The ultimate goal
> of the PR is allowing us to avoid even offering insecure algorithms
> during the TLS handshake. I.e: avoid the usage of certificate
> signature algorithms involving SHA1.
>
> Magnus Edenhill mentioned in PR[3] some ongoing work/discussion about
> keeping librdkafka capabilities in line with the corresponding Java clients.
>
> We didn't need (yet) a similar PR on the Kafka broker side because we are
> able to influence TLS behaviour through System and Security Properties
> on the JVM. I.e: the counterpart of ssl.curves.list proposed in [1] is
> -Djdk.tls.namedGroups (option added in j8u121).
>
> I'd like your input on our current approach: whether handling TLS
> parameters through JVM settings is the way to go, or whether it would be better
> to implement the same TLS settings in the Java client and/or the Kafka
> broker itself.
>
> Thanks,
> Valentín Gutiérrez
>
> [1] https://phabricator.wikimedia.org/T182993
> [2] https://github.com/edenhill/librdkafka/pull/1809
> [3] https://github.com/edenhill/librdkafka/pull/1809#
> pullrequestreview-120982957
>


[jira] [Created] (KAFKA-6959) Any impact we should foresee if we upgrade the Linux version or move to VMs

2018-05-28 Thread Gene Yi (JIRA)
Gene Yi created KAFKA-6959:
--

 Summary: Any impact we should foresee if we upgrade the Linux version or 
move to VMs
 Key: KAFKA-6959
 URL: https://issues.apache.org/jira/browse/KAFKA-6959
 Project: Kafka
  Issue Type: Task
  Components: admin
Affects Versions: 0.11.0.2
 Environment: Prod
Reporter: Gene Yi


As we know, due to the recent Linux Meltdown and Spectre issues, all Linux servers 
need to deploy the patch, and the OS version needs to be at least 6.9.

We want to know the impact on Kafka: are there any side effects if we directly 
upgrade the OS to 7.0, and are there any limitations if we deploy Kafka on VMs 
instead of physical servers?

Currently, the Kafka version we use is 0.11.0.2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)