Build failed in Jenkins: Kafka » kafka-trunk-jdk11 #33

2020-08-24 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10054: KIP-613, add TRACE-level e2e latency metrics (#9094)


--
[...truncated 6.47 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldDeleteAndReturnPlainValue STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldDeleteAndReturnPlainValue PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutAllWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutAllWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldReturnIsPersistent PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutIfAbsentWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutIfAbsentWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldForwardClose 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldForwardClose 
PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldForwardFlush 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldForwardFlush 
PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldForwardInit 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldForwardInit 
PASSED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldStoreAndReturnStateStores STARTED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldStoreAndReturnStateStores PASSED

org.apache.kafka.streams.MockProcessorContextTest > 

Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2020-08-24 Thread Ying Zheng
We did some basic feature tests at Uber. The test cases and results are
shared in this google doc:
https://docs.google.com/spreadsheets/d/1XhNJqjzwXvMCcAOhEH0sSXU6RTvyoSf93DHF-YMfGLk/edit?usp=sharing

The performance test results were already shared in the KIP last month.

On Mon, Aug 24, 2020 at 11:10 AM Harsha Ch  wrote:

> "Understand commitments towards driving design & implementation of the KIP
> further and how it aligns with participant interests in contributing to the
> efforts (ex: in the context of Uber’s Q3/Q4 roadmap)."
> What is that about?
>
> On Mon, Aug 24, 2020 at 11:05 AM Kowshik Prakasam 
> wrote:
>
> > Hi Harsha,
> >
> > The following google doc contains a proposal for a temporary agenda for
> > the KIP-405 sync meeting tomorrow:
> > https://docs.google.com/document/d/1pqo8X5LU8TpwfC_iqSuVPezhfCfhGkbGN2TqiPA3LBU/edit
> > Please could you add it to the Google calendar invite?
> >
> > Thank you.
> >
> > Cheers,
> > Kowshik
> >
> > On Thu, Aug 20, 2020 at 10:58 AM Harsha Ch  wrote:
> >
> >> Hi All,
> >>
> >> Scheduled a meeting for Tuesday 9am - 10am. I can record and upload it
> >> for the community to be able to follow the discussion.
> >>
> >> Jun, please add the required folks on the Confluent side.
> >>
> >> Thanks,
> >>
> >> Harsha
> >>
> >> On Thu, Aug 20, 2020 at 12:33 AM, Alexandre Dupriez <
> >> alexandre.dupr...@gmail.com > wrote:
> >>
> >> > Hi Jun,
> >> >
> >> > Many thanks for your initiative.
> >> >
> >> > If you like, I am happy to attend at the time you suggested.
> >> >
> >> > Many thanks,
> >> > Alexandre
> >> >
> >> > Le mer. 19 août 2020 à 22:00, Harsha Ch < harsha...@gmail.com > a
> >> > écrit :
> >> >
> >> >> Hi Jun,
> >> >> Thanks. This will help a lot. Tuesday will work for us.
> >> >> -Harsha
> >> >>
> >> >> On Wed, Aug 19, 2020 at 1:24 PM Jun Rao < j...@confluent.io > wrote:
> >> >>
> >> >>> Hi, Satish, Ying, Harsha,
> >> >>>
> >> >>> Do you think it would be useful to have a regular virtual meeting to
> >> >>> discuss this KIP? The goal of the meeting will be sharing
> >> >>> design/development progress and discussing any open issues to
> >> >>> accelerate this KIP. If so, will every Tuesday (from next week)
> >> >>> 9am-10am PT work for you? I can help set up a Zoom meeting, invite
> >> >>> everyone who might be interested, have it recorded and shared, etc.
> >> >>>
> >> >>> Thanks,
> >> >>>
> >> >>> Jun
> >> >>>
> >> >>> On Tue, Aug 18, 2020 at 11:01 AM Satish Duggana <
> >> >>> satish.dugg...@gmail.com > wrote:
> >> >>>
> >> >>>> Hi Kowshik,
> >> >>>>
> >> >>>> Thanks for looking into the KIP and sending your comments.
> >> >>>>
> >> >>>> 5001. Under the section "Follower fetch protocol in detail", the
> >> >>>> next-local-offset is the offset up to which the segments are copied
> >> >>>> to remote storage. Instead, would last-tiered-offset be a better
> >> >>>> name than next-local-offset? last-tiered-offset seems to naturally
> >> >>>> align well with the definition provided in the KIP.
> >> >>>>
> >> >>>> Both next-local-offset and local-log-start-offset were introduced
> >> >>>> to talk about offsets related to the local log. We are fine with
> >> >>>> last-tiered-offset too, as you suggested.
> >> >>>>
> >> >>>> 5002. After leadership is established for a partition, the leader
> >> >>>> would begin uploading a segment to remote storage. If successful,
> >> >>>> the leader would write the updated RemoteLogSegmentMetadata to the
> >> >>>> metadata topic (via RLMM.putRemoteLogSegmentData). However, for
> >> >>>> defensive reasons, it seems useful that before the first time a
> >> >>>> segment is uploaded by the leader for a partition, the leader
> >> >>>> should ensure it catches up to all the metadata events written so
> >> >>>> far in the metadata topic for that partition (ex: by the previous
> >> >>>> leader). To achieve this, the leader could start a lease (using an
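[Editor's note] The naming question in comment 5001 can be illustrated with a small toy model. Everything below is invented for illustration (class and field names are not Kafka internals): it only captures the relationship that the proposed last-tiered-offset is the highest offset already copied to remote storage, so the KIP's next-local-offset would be that value plus one.

```java
// Toy model of the offsets discussed in comment 5001. All names and values
// here are illustrative assumptions, not Kafka's actual bookkeeping.
public class TieredLogOffsets {
    final long localLogStartOffset; // first offset still held on local disk
    final long lastTieredOffset;    // highest offset copied to remote storage
    final long logEndOffset;        // next offset to be written by the leader

    TieredLogOffsets(long localLogStartOffset, long lastTieredOffset, long logEndOffset) {
        this.localLogStartOffset = localLogStartOffset;
        this.lastTieredOffset = lastTieredOffset;
        this.logEndOffset = logEndOffset;
    }

    // The name 5001 suggests replacing: the first offset NOT yet tiered.
    long nextLocalOffset() {
        return lastTieredOffset + 1;
    }

    public static void main(String[] args) {
        TieredLogOffsets partition = new TieredLogOffsets(100L, 499L, 1200L);
        System.out.println(partition.nextLocalOffset()); // 500
    }
}
```

Under this reading, the two names describe the same boundary from opposite sides, which is why the KIP authors say either works.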

Jenkins build is back to normal : Kafka » kafka-trunk-jdk8 #32

2020-08-24 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-10429) Group Coordinator is unavailable leads to missing events

2020-08-24 Thread Navinder Brar (Jira)
Navinder Brar created KAFKA-10429:
-

 Summary: Group Coordinator is unavailable leads to missing events
 Key: KAFKA-10429
 URL: https://issues.apache.org/jira/browse/KAFKA-10429
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.1.1
Reporter: Navinder Brar


We are regularly getting this exception in our logs:

[2020-08-25 03:24:59,214] INFO [Consumer 
clientId=appId-StreamThread-1-consumer, groupId=dashavatara] Group coordinator 
ip:9092 (id: 1452096777 rack: null) is *unavailable* or invalid, will attempt 
rediscovery (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)

 

And after some time it becomes discoverable again:

[2020-08-25 03:25:02,218] INFO [Consumer 
clientId=appId-c3d1d186-e487-4993-ae3d-5fed75887e6b-StreamThread-1-consumer, 
groupId=appId] Discovered group coordinator ip:9092 (id: 1452096777 rack: null) 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)

 

Now, the doubt I have is why this unavailability doesn't trigger a rebalance in 
the cluster. We have a few hours of retention on the source Kafka topics, and 
sometimes this unavailability persists for more than a few hours. Since it 
neither triggers a rebalance nor stops processing on the other nodes (which are 
still connected to the group coordinator), we never find out that an issue has 
occurred, and by then we have lost events from our source topics.

 

There are some resolutions mentioned on Stack Overflow, but those configs are 
already set in our Kafka cluster:

default.replication.factor=3

offsets.topic.replication.factor=3

 

It would be great to understand why this issue is happening, why it doesn't 
trigger a rebalance, and whether there is any known solution for it.
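[Editor's note] Until the root cause is understood, one pragmatic workaround is to alert on the coordinator-loss log line quoted above (the asterisks in the report are the reporter's emphasis; the actual client log says "is unavailable or invalid"). The sketch below only shows the log-matching logic; the class name and regex are assumptions for illustration, not a Kafka API.

```java
import java.util.regex.Pattern;

public class CoordinatorLossMonitor {
    // Matches the AbstractCoordinator log line quoted in the report above.
    static final Pattern COORDINATOR_LOST = Pattern.compile(
        "Group coordinator .+ is unavailable or invalid, will attempt rediscovery");

    public static boolean indicatesCoordinatorLoss(String logLine) {
        return COORDINATOR_LOST.matcher(logLine).find();
    }

    public static void main(String[] args) {
        String line = "[2020-08-25 03:24:59,214] INFO [Consumer "
            + "clientId=appId-StreamThread-1-consumer, groupId=dashavatara] "
            + "Group coordinator ip:9092 (id: 1452096777 rack: null) is unavailable "
            + "or invalid, will attempt rediscovery "
            + "(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)";
        System.out.println(indicatesCoordinatorLoss(line)); // true
    }
}
```

Feeding consumer logs through a matcher like this at least surfaces the silent stall that the missing rebalance otherwise hides.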



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: Kafka » kafka-trunk-jdk15 #33

2020-08-24 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10054: KIP-613, add TRACE-level e2e latency metrics (#9094)


--
[...truncated 3.24 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecord STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecord PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecord STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecord PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > shouldAdvanceTime 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > shouldAdvanceTime 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairsAndCustomTimestamps
 STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairsAndCustomTimestamps
 PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairs 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicNameAndTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicNameAndTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 

Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

2020-08-24 Thread Sophie Blee-Goldman
Thanks for the KIP! +1 (non-binding)

Sophie

On Mon, Aug 24, 2020 at 5:06 PM John Roesler  wrote:

> Thanks Leah,
> I’m +1 (binding)
>
> -John
>
> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> > Hi everyone,
> >
> > I'd like to kick-off the vote for KIP-659: Improve
> > TimeWindowedDeserializer
> > and TimeWindowedSerde to handle window size.
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> >
> > Thanks,
> > Leah
> >
>


Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

2020-08-24 Thread John Roesler
Thanks Leah,
I’m +1 (binding)

-John

On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> Hi everyone,
> 
> I'd like to kick-off the vote for KIP-659: Improve 
> TimeWindowedDeserializer
> and TimeWindowedSerde to handle window size.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> 
> Thanks,
> Leah
>


[jira] [Created] (KAFKA-10428) Mirror Maker connect applies base64 encoding to string headers

2020-08-24 Thread Jennifer Thompson (Jira)
Jennifer Thompson created KAFKA-10428:
-

 Summary: Mirror Maker connect applies base64 encoding to string 
headers
 Key: KAFKA-10428
 URL: https://issues.apache.org/jira/browse/KAFKA-10428
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Affects Versions: 2.6.0, 2.5.0, 2.4.0
Reporter: Jennifer Thompson


MirrorSourceTask takes the header value as bytes from the ConsumerRecord, which 
does not have a header schema, and adds it to the SourceRecord headers using 
"addBytes". This uses Schema.BYTES as the schema for the header, and somehow, 
base64 encoding gets applied when the record gets committed.

This means that my original header value "with_headers" (created with a python 
producer, and stored as a 12 character byte array) becomes the string value 
"d2l0aF9oZWFkZXJz", a 16 character byte array, which is the base64 encoded 
version of the original. If I try to preempt this using "d2l0aF9oZWFkZXJz" to 
start with, and base64 encoding the headers everywhere, it just gets double 
encoded to "ZDJsMGFGOW9aV0ZrWlhKeg==" after passing through the 
MirrorSourceTask.
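[Editor's note] The reported values are internally consistent, which can be checked with a few lines of standalone Java. This only reproduces the encoding arithmetic of the bug, not MirrorMaker itself; the class name is invented for the demo.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class HeaderEncodingDemo {
    public static void main(String[] args) {
        // Original 12-character header value from the report.
        byte[] original = "with_headers".getBytes(StandardCharsets.UTF_8);
        String once = Base64.getEncoder().encodeToString(original);
        // Encoding the already-encoded string models the "double encoded" case.
        String twice = Base64.getEncoder()
                .encodeToString(once.getBytes(StandardCharsets.UTF_8));
        System.out.println(once);  // d2l0aF9oZWFkZXJz
        System.out.println(twice); // ZDJsMGFGOW9aV0ZrWlhKeg==
    }
}
```

Both strings match the values in the report exactly, confirming that a single stray base64 pass per hop through MirrorSourceTask explains the observed corruption.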

I think the base64 encoding may be coming from Values#append 
(https://github.com/apache/kafka/blob/trunk/connect/api/src/main/java/org/apache/kafka/connect/data/Values.java#L674),
 but I'm not sure how. That is invoked by 
SimpleConnectorHeader#fromConnectHeader via Values#convertToString.

SimpleHeaderConverter#toConnectHeader produces the correct schema in this case, 
and solves the problem for me, but it seems to guess at the schema, so I'm not 
sure if it is the right solution. Since schemas seem to be required for 
SourceRecord headers, but not available from ConsumerRecord headers, I'm not 
sure what other options we have. I will open a PR with this solution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-640 Add log compression analysis tool

2020-08-24 Thread Alex Wang
Hi, how will this work with encrypted data in logs if/when KIP-317 gets merged? 
Encrypted data will be hard to compress, so the analyzer tool might need to 
acquire the decryption key from somewhere to measure the compression stats.

On 2020/08/17 20:23:51, "Christopher Beard (BLOOMBERG/ 919 3RD A)" 
 wrote: 
> Hi everyone,
> 
> I would like to start a discussion on KIP-640:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-640%3A+Add+log+compression+analysis+tool
> 
> This KIP outlines a new CLI tool which helps compare how the various 
> compression types supported by Kafka reduce the size of a log (and therefore 
> more broadly, of a topic).
> 
> I've put together a PR that might help serve as a starting point for comments 
> and suggestions.
> [WIP] PR: https://github.com/apache/kafka/pull/9193
> 
> Thanks,
> Chris Beard
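[Editor's note] The core measurement such a tool performs can be sketched in a few lines. This uses only the JDK's gzip codec as a stand-in; the proposed tool would iterate over Kafka's supported compression types (gzip, snappy, lz4, zstd) and real log segments rather than a synthetic payload, so treat this as an illustration of the idea, not the tool's implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressionRatioDemo {
    // Returns the gzip-compressed size of the payload in bytes.
    static int gzipSize(byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        // Repetitive payload standing in for a log segment's record batches.
        byte[] payload = "sample kafka record value ".repeat(1000)
                .getBytes(StandardCharsets.UTF_8);
        int compressed = gzipSize(payload);
        System.out.printf("original=%d bytes, gzip=%d bytes, ratio=%.1fx%n",
                payload.length, compressed, (double) payload.length / compressed);
    }
}
```

Comparing these numbers across codecs for a sample of a topic's data is exactly the kind of output the KIP's CLI would report.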


[VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

2020-08-24 Thread Leah Thomas
Hi everyone,

I'd like to kick-off the vote for KIP-659: Improve TimeWindowedDeserializer
and TimeWindowedSerde to handle window size.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size

Thanks,
Leah
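[Editor's note] The motivation behind KIP-659 can be shown with a toy model. The class and field names below are invented, not the Streams API: the point is simply that a serialized time-windowed key carries the window's start timestamp but not its end, so a deserializer must be told the window size to reconstruct the full window.

```java
// Toy illustration of why a windowed deserializer needs the window size.
// These are NOT the real Kafka Streams classes.
public class ToyWindowedKey {
    final String key;
    final long windowStartMs; // this is all the serialized bytes carry

    ToyWindowedKey(String key, long windowStartMs) {
        this.key = key;
        this.windowStartMs = windowStartMs;
    }

    // Without a window size supplied at deserialization time, the window end
    // cannot be recovered: that is the gap KIP-659 addresses.
    long windowEndMs(long windowSizeMs) {
        return windowStartMs + windowSizeMs;
    }

    public static void main(String[] args) {
        ToyWindowedKey k = new ToyWindowedKey("user-1", 60_000L);
        System.out.println(k.windowEndMs(30_000L)); // 90000
    }
}
```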


Build failed in Jenkins: Kafka » kafka-trunk-jdk15 #32

2020-08-24 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10312; Fix error code returned in Metadata response when leader 
is not available (#9112)


--
[...truncated 3.24 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp PASSED


Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #31

2020-08-24 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10312; Fix error code returned in Metadata response when leader 
is not available (#9112)


--
[...truncated 3.21 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = true] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = true] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = true] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = true] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldCloseProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldCloseProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFeedStoreFromGlobalKTable[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFeedStoreFromGlobalKTable[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowNoSuchElementExceptionForUnusedOutputTopicWithDynamicRouting[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowNoSuchElementExceptionForUnusedOutputTopicWithDynamicRouting[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 

Re: [DISCUSS] KIP-662: Throw Exception when Source Topics of a Streams App are Deleted

2020-08-24 Thread Guozhang Wang
Hello Bruno,

Thanks for the KIP; it sounds good to me as well. Just a minor comment: we
should include which package the new "MissingSourceTopicException" class
belongs to.



Guozhang


On Fri, Aug 21, 2020 at 11:53 AM John Roesler  wrote:

> Thanks for the KIP, Bruno!
>
> Your proposal sounds good to me.
>
> -John
>
> On Fri, 2020-08-21 at 11:18 -0700, Sophie Blee-Goldman
> wrote:
> > Thanks for the KIP! I'm totally in favor of this approach and, to be
> > honest, have always wondered why we just silently shut down instead of
> > throwing an exception. This has definitely been a source of confusion
> > for users in my personal experience.
> >
> > I was originally hesitant to extend StreamsException since I always
> > thought that anything extending from KafkaException was supposed to
> > "indicate Streams internal errors" -- a phrase I'm quoting from Streams
> > logs directly -- but I now see that we're actually somewhat
> > inconsistent here. Perhaps "Streams internal errors" does not in fact
> > mean internal to Streams itself, but just any error that occurs during
> > stream processing?
> >
> > Anyways, I'm looking forward to cleaning up the exception hierarchy so
> > we get a clear division of user vs "internal" errors, but within the
> > current framework this SGTM.
> >
> > On Fri, Aug 21, 2020 at 8:06 AM Bruno Cadonna 
> wrote:
> >
> > > Hi,
> > >
> > > I would like to propose the following KIP:
> > >
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-662%3A+Throw+Exception+when+Source+Topics+of+a+Streams+App+are+Deleted
> > >
> > > Best,
> > > Bruno
> > >
>
>

-- 
-- Guozhang
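
The hierarchy discussed in this thread can be illustrated with local stand-in
classes. The real package and exact shape of MissingSourceTopicException were
still open questions in the thread, so everything below is an illustrative
assumption, not Kafka's actual code:

```java
// Local stand-ins for illustration only; these are NOT the real Kafka classes.
// The sketch assumes MissingSourceTopicException extends StreamsException,
// which extends KafkaException, as discussed in this thread.
class KafkaException extends RuntimeException {
    KafkaException(String message) { super(message); }
}

class StreamsException extends KafkaException {
    StreamsException(String message) { super(message); }
}

class MissingSourceTopicException extends StreamsException {
    MissingSourceTopicException(String message) { super(message); }
}

public class Kip662Sketch {
    public static void main(String[] args) {
        try {
            // What Streams would throw instead of silently shutting down.
            throw new MissingSourceTopicException("source topic 'input' was deleted");
        } catch (StreamsException e) {
            // Existing handlers for StreamsException keep working, while
            // callers can also match the more specific type.
            System.out.println(e instanceof MissingSourceTopicException); // prints true
        }
    }
}
```

The point of subclassing is exactly what the catch clause shows: existing code
that handles StreamsException is unaffected, while new code can react to the
specific failure.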


Re: KIP idea: Separate publish request from the subscribe request

2020-08-24 Thread Guozhang Wang
Hello Ming,

Thanks for bringing this to the community. Just to clarify your proposal:
are you suggesting a separate port for fetch requests, with all other
requests, including produce but also e.g. metadata, commit/fetch-offsets,
and other inter-broker requests, on another? If so, the consumer metadata
request would have to be handled differently from the producer metadata
request so that they can return different port values, and on the server
side we'd need two socket servers as well. Maybe we can get a more concrete
design on how this would work end-to-end for others to review.

Also, I'm wondering if you have tried increasing num.network.threads to see
whether that helps with the large-fanout multiplexing issue, rather than
separating the requests onto different ports?


Guozhang

On Thu, Aug 20, 2020 at 7:03 PM Ming Liu  wrote:

> Hi Kafka community,
>    I'd like to surface a KIP idea, which is to separate publish requests
> from subscribe requests using different ports.
>
>    The context: we have some workloads with over 5000 subscribers, where
> publish latency can be as high as 3000 ms. After investigation, we found
> the reason is mainly that there are too many connections on the socket
> server, and the multiplexing slows down publish latency.
>
>    The proposal is somewhat similar to KIP-291: Separating controller
> connections and requests from the data plane.
>
>    I'd like to check with the experts here whether this is a viable idea
> to continue pursuing.
>
> Thanks!
> Ming
>


-- 
-- Guozhang
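
For reference, KIP-291 (which Ming's proposal resembles) separates the
control plane with a dedicated listener. The first three settings below are
real broker configuration; the commented-out fetch-plane lines are a purely
hypothetical analogue of what this thread proposes (no such setting exists
today):

```properties
# Real KIP-291-style control-plane separation (existing broker settings):
listeners=PLAINTEXT://broker1:9092,CONTROLLER://broker1:9093
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
control.plane.listener.name=CONTROLLER

# Hypothetical analogue for consumer fetch traffic (NOT an existing setting),
# sketched only to show what a concrete design might have to define:
#listeners=PLAINTEXT://broker1:9092,CONTROLLER://broker1:9093,FETCH://broker1:9094
#fetch.plane.listener.name=FETCH
```

As Guozhang notes, such a design would also need the metadata response to
return different ports to producers and consumers, which is where most of the
complexity would live.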


Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2020-08-24 Thread Harsha Ch
"Understand commitments towards driving design & implementation of the KIP
further and how it aligns with participant interests in contributing to the
efforts (ex: in the context of Uber’s Q3/Q4 roadmap)."
What is that about?

On Mon, Aug 24, 2020 at 11:05 AM Kowshik Prakasam 
wrote:

> Hi Harsha,
>
> The following google doc contains a proposed temporary agenda for the
> KIP-405 sync meeting tomorrow:
> https://docs.google.com/document/d/1pqo8X5LU8TpwfC_iqSuVPezhfCfhGkbGN2TqiPA3LBU/edit
> Could you please add it to the Google calendar invite?
>
> Thank you.
>
>
> Cheers,
> Kowshik
>
> On Thu, Aug 20, 2020 at 10:58 AM Harsha Ch  wrote:
>
>> Hi All,
>>
>> Scheduled a meeting for Tuesday 9am - 10am. I can record and upload for
>> community to be able to follow the discussion.
>>
>> Jun, please add the required folks on confluent side.
>>
>> Thanks,
>>
>> Harsha
>>
>> On Thu, Aug 20, 2020 at 12:33 AM, Alexandre Dupriez <
>> alexandre.dupr...@gmail.com > wrote:
>>
>> > Hi Jun,
>> >
>> > Many thanks for your initiative.
>> >
>> > If you like, I am happy to attend at the time you suggested.
>> >
>> > Many thanks,
>> > Alexandre
>> >
>> > On Wed, Aug 19, 2020 at 22:00, Harsha Ch < harsha...@gmail.com > wrote:
>> >
>> >> Hi Jun,
>> >> Thanks. This will help a lot. Tuesday will work for us.
>> >> -Harsha
>> >>
>> >> On Wed, Aug 19, 2020 at 1:24 PM Jun Rao < j...@confluent.io > wrote:
>> >>
>> >>> Hi, Satish, Ying, Harsha,
>> >>>
>> >>> Do you think it would be useful to have a regular virtual meeting to
>> >>> discuss this KIP? The goal of the meeting will be sharing
>> >>> design/development progress and discussing any open issues to
>> >>> accelerate this KIP. If so, will every Tuesday (from next week)
>> >>> 9am-10am PT work for you? I can help set up a Zoom meeting, invite
>> >>> everyone who might be interested, have it recorded and shared, etc.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Jun
>> >>>
>> >>> On Tue, Aug 18, 2020 at 11:01 AM Satish Duggana <
>> >>> satish.dugg...@gmail.com > wrote:
>> >>>
>> >>>> Hi Kowshik,
>> >>>>
>> >>>> Thanks for looking into the KIP and sending your comments.
>> >>>>
>> >>>> 5001. Under the section "Follower fetch protocol in detail", the
>> >>>> next-local-offset is the offset upto which the segments are copied
>> >>>> to remote storage. Instead, would last-tiered-offset be a better
>> >>>> name than next-local-offset? last-tiered-offset seems to naturally
>> >>>> align well with the definition provided in the KIP.
>> >>>>
>> >>>> Both next-local-offset and local-log-start-offset were introduced
>> >>>> to talk about offsets related to the local log. We are fine with
>> >>>> last-tiered-offset too, as you suggested.
>> >>>>
>> >>>> 5002. After leadership is established for a partition, the leader
>> >>>> would begin uploading a segment to remote storage. If successful,
>> >>>> the leader would write the updated RemoteLogSegmentMetadata to the
>> >>>> metadata topic (via RLMM.putRemoteLogSegmentData). However, for
>> >>>> defensive reasons, it seems useful that before the first time the
>> >>>> segment is uploaded by the leader for a partition, the leader
>> >>>> should ensure to catch up to all the metadata events written so
>> >>>> far in the metadata topic for that partition (ex: by the previous
>> >>>> leader). To achieve this, the leader could start a lease (using an
>> >>>> establish_leader metadata event) before commencing tiering, and
>> >>>> wait until the event is read back. For example, this seems useful
>> >>>> to avoid cases where zombie leaders can be active for the same
>> >>>> partition. This can also prove useful to help avoid making
>> >>>> decisions on which segments to be uploaded for a partition, until
>> >>>> the current leader has caught up to a complete view

Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2020-08-24 Thread Kowshik Prakasam
Hi Harsha,

The following google doc contains a proposed temporary agenda for the
KIP-405 sync meeting tomorrow:
https://docs.google.com/document/d/1pqo8X5LU8TpwfC_iqSuVPezhfCfhGkbGN2TqiPA3LBU/edit
Could you please add it to the Google calendar invite?

Thank you.


Cheers,
Kowshik

On Thu, Aug 20, 2020 at 10:58 AM Harsha Ch  wrote:

> Hi All,
>
> Scheduled a meeting for Tuesday 9am - 10am. I can record and upload for
> community to be able to follow the discussion.
>
> Jun, please add the required folks on confluent side.
>
> Thanks,
>
> Harsha
>
> On Thu, Aug 20, 2020 at 12:33 AM, Alexandre Dupriez <
> alexandre.dupr...@gmail.com > wrote:
>
> > Hi Jun,
> >
> > Many thanks for your initiative.
> >
> > If you like, I am happy to attend at the time you suggested.
> >
> > Many thanks,
> > Alexandre
> >
> > On Wed, Aug 19, 2020 at 22:00, Harsha Ch < harsha...@gmail.com > wrote:
> >
> >> Hi Jun,
> >> Thanks. This will help a lot. Tuesday will work for us.
> >> -Harsha
> >>
> >> On Wed, Aug 19, 2020 at 1:24 PM Jun Rao < j...@confluent.io > wrote:
> >>
> >>> Hi, Satish, Ying, Harsha,
> >>>
> >>> Do you think it would be useful to have a regular virtual meeting to
> >>> discuss this KIP? The goal of the meeting will be sharing
> >>> design/development progress and discussing any open issues to
> >>> accelerate this KIP. If so, will every Tuesday (from next week)
> >>> 9am-10am PT work for you? I can help set up a Zoom meeting, invite
> >>> everyone who might be interested, have it recorded and shared, etc.
> >>>
> >>> Thanks,
> >>>
> >>> Jun
> >>>
> >>> On Tue, Aug 18, 2020 at 11:01 AM Satish Duggana <
> >>> satish.dugg...@gmail.com > wrote:
> >>>
> >>>> Hi Kowshik,
> >>>>
> >>>> Thanks for looking into the KIP and sending your comments.
> >>>>
> >>>> 5001. Under the section "Follower fetch protocol in detail", the
> >>>> next-local-offset is the offset upto which the segments are copied
> >>>> to remote storage. Instead, would last-tiered-offset be a better
> >>>> name than next-local-offset? last-tiered-offset seems to naturally
> >>>> align well with the definition provided in the KIP.
> >>>>
> >>>> Both next-local-offset and local-log-start-offset were introduced
> >>>> to talk about offsets related to the local log. We are fine with
> >>>> last-tiered-offset too, as you suggested.
> >>>>
> >>>> 5002. After leadership is established for a partition, the leader
> >>>> would begin uploading a segment to remote storage. If successful,
> >>>> the leader would write the updated RemoteLogSegmentMetadata to the
> >>>> metadata topic (via RLMM.putRemoteLogSegmentData). However, for
> >>>> defensive reasons, it seems useful that before the first time the
> >>>> segment is uploaded by the leader for a partition, the leader
> >>>> should ensure to catch up to all the metadata events written so
> >>>> far in the metadata topic for that partition (ex: by the previous
> >>>> leader). To achieve this, the leader could start a lease (using an
> >>>> establish_leader metadata event) before commencing tiering, and
> >>>> wait until the event is read back. For example, this seems useful
> >>>> to avoid cases where zombie leaders can be active for the same
> >>>> partition. This can also prove useful to help avoid making
> >>>> decisions on which segments to be uploaded for a partition, until
> >>>> the current leader has caught up to a complete view of all
> >>>> segments uploaded for the partition so far (otherwise this may
> >>>> cause the same segment being uploaded twice -- once by the
> >>>> previous leader and then by the new leader).
> >>>>
> >>>> We allow copying segments to remote storage which may have common
> >>>> offsets. Please go through the KIP to understand the 

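Point 5002 in the discussion above describes a catch-up protocol: a new
leader writes an establish_leader marker to the metadata topic and waits
until it reads its own marker back, which guarantees it has also consumed
every earlier metadata event for the partition. A minimal sketch of that
pattern, with an in-memory queue standing in for the metadata topic (all
names here are illustrative, not KIP-405 APIs):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of "append a marker, then consume until you see it": once the new
// leader observes its own establish_leader event, it must also have consumed
// every event written before it, so it has a complete view of prior uploads.
public class LeaderCatchUpSketch {
    private final BlockingQueue<String> metadataLog = new LinkedBlockingQueue<>();
    private int eventsSeen = 0;

    // Append an event to the stand-in metadata log.
    void append(String event) {
        metadataLog.add(event);
    }

    // Consume events until we observe our own marker; returns how many
    // events were consumed, including the marker itself.
    int catchUpTo(String marker) throws InterruptedException {
        while (true) {
            String event = metadataLog.take();
            eventsSeen++;
            if (event.equals(marker)) {
                return eventsSeen; // caught up: safe to decide what to tier
            }
        }
    }
}
```

This is similar in spirit to the read-to-end barriers used elsewhere in Kafka
tooling, and is what would fence out decisions based on a stale view left by a
previous or zombie leader.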
[jira] [Created] (KAFKA-10427) Implement FetchSnapshot RPC

2020-08-24 Thread Jose Armando Garcia Sancio (Jira)
Jose Armando Garcia Sancio created KAFKA-10427:
--

 Summary: Implement FetchSnapshot RPC
 Key: KAFKA-10427
 URL: https://issues.apache.org/jira/browse/KAFKA-10427
 Project: Kafka
  Issue Type: Sub-task
  Components: core, replication
Reporter: Jose Armando Garcia Sancio
Assignee: Jose Armando Garcia Sancio






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (KAFKA-10133) Cannot compress messages in destination cluster with MM2

2020-08-24 Thread Ning Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang reopened KAFKA-10133:


There is no bug in the code, but some effort is needed on the docs to clarify
where some frequently used configs should be set.

> Cannot compress messages in destination cluster with MM2
> 
>
> Key: KAFKA-10133
> URL: https://issues.apache.org/jira/browse/KAFKA-10133
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 2.4.0, 2.5.0, 2.4.1
> Environment: kafka 2.5.0 deployed via the strimzi operator 0.18
>Reporter: Steve Jacobs
>Assignee: Ning Zhang
>Priority: Minor
>
> When configuring MirrorMaker 2 using Kafka Connect, it is not possible to 
> configure things such that messages are compressed in the destination 
> cluster. Dump Log shows that batching is occurring, but no compression. If 
> this is possible, then this is a documentation bug, because I can find no 
> documentation on how to do this.
> baseOffset: 4208 lastOffset: 4492 count: 285 baseSequence: -1 lastSequence: 
> -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 2 isTransactional: 
> false isControl: false position: 239371 CreateTime: 1591745894859 size: 16362 
> magic: 2 compresscodec: NONE crc: 1811507259 isvalid: true
>  
>  





Re: virtual KIP meeting for KIP-405

2020-08-24 Thread Jun Rao
Hi, Ben,

Thanks for your interest. Added you.

Jun

On Mon, Aug 24, 2020 at 3:27 AM Ben Stopford  wrote:

> Please add me too.
>
> On Fri, 21 Aug 2020 at 16:08, Jun Rao  wrote:
>
> > Hi, Bill,
> >
> > Thanks for your interest. Added you.
> >
> > Jun
> >
> > On Fri, Aug 21, 2020 at 7:42 AM Bill Bejeck  wrote:
> >
> > > Hi Jun,
> > >
> > > I'd like to attend as well.
> > >
> > > Thanks,
> > > Bill
> > >
> > > On Fri, Aug 21, 2020 at 10:27 AM Jun Rao  wrote:
> > >
> > > > Hi, Adam,
> > > >
> > > > Thanks for your interest. Invited you.
> > > >
> > > > Jun
> > > >
> > > > On Thu, Aug 20, 2020 at 5:03 PM Adam Bellemare <
> > adam.bellem...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > I am interested in attending, mostly just to listen and observe.
> > > > >
> > > > > Thanks !
> > > > >
> > > > > > On Aug 20, 2020, at 6:20 PM, Jun Rao  wrote:
> > > > > >
> > > > > > Hi, everyone,
> > > > > >
> > > > > > We plan to have weekly virtual meetings for KIP-405 to discuss
> > > progress
> > > > > and
> > > > > > outstanding issues, starting from this coming Tuesday at 9am PT.
> If
> > > you
> > > > > are
> > > > > > interested in attending, please let Harsha or me know.
> > > > > >
> > > > > > The recording of the meeting will be posted in
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> > > > > > .
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > >
> > > >
> > >
> >
>
>
> --
>
> Ben Stopford
>


[jira] [Resolved] (KAFKA-10414) Upgrade api-util dependency - CVE-2018-1337

2020-08-24 Thread Daniel Urban (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Urban resolved KAFKA-10414.
--
Resolution: Not A Problem

api-util is only a test dependency, not an issue.

> Upgrade api-util dependency - CVE-2018-1337
> ---
>
> Key: KAFKA-10414
> URL: https://issues.apache.org/jira/browse/KAFKA-10414
> Project: Kafka
>  Issue Type: Bug
>Reporter: Daniel Urban
>Assignee: Daniel Urban
>Priority: Major
>
> There is a dependency on org.apache.directory.api:api-util:1.0.0, which is 
> involved in CVE-2018-1337. The issue is fixed in api-util 1.0.2 and later.
> This is a transitive dependency through the apacheds libs.
> -Can be fixed by upgrading to at least version 2.0.0.AM25-
> Since api-all is also a dependency, and there is a class collision between 
> api-all and newer versions of api-util, it is better to just upgrade 
> api-util to 1.0.2





Re: virtual KIP meeting for KIP-405

2020-08-24 Thread Ben Stopford
Please add me too.

On Fri, 21 Aug 2020 at 16:08, Jun Rao  wrote:

> Hi, Bill,
>
> Thanks for your interest. Added you.
>
> Jun
>
> On Fri, Aug 21, 2020 at 7:42 AM Bill Bejeck  wrote:
>
> > Hi Jun,
> >
> > I'd like to attend as well.
> >
> > Thanks,
> > Bill
> >
> > On Fri, Aug 21, 2020 at 10:27 AM Jun Rao  wrote:
> >
> > > Hi, Adam,
> > >
> > > Thanks for your interest. Invited you.
> > >
> > > Jun
> > >
> > > On Thu, Aug 20, 2020 at 5:03 PM Adam Bellemare <
> adam.bellem...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hello
> > > >
> > > > I am interested in attending, mostly just to listen and observe.
> > > >
> > > > Thanks !
> > > >
> > > > > On Aug 20, 2020, at 6:20 PM, Jun Rao  wrote:
> > > > >
> > > > > Hi, everyone,
> > > > >
> > > > > We plan to have weekly virtual meetings for KIP-405 to discuss
> > progress
> > > > and
> > > > > outstanding issues, starting from this coming Tuesday at 9am PT. If
> > you
> > > > are
> > > > > interested in attending, please let Harsha or me know.
> > > > >
> > > > > The recording of the meeting will be posted in
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> > > > > .
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > >
> > >
> >
>


-- 

Ben Stopford


Another nag about PR#9122 addressing StorageException on reassignment with offline log dir

2020-08-24 Thread Noa Resare
As per the instructions in How to Contribute, I’m now nagging again about 
getting a review for PR#9122, which addresses KAFKA-10314. The PR was 
opened 20 days ago.

This issue is easy to reproduce and the actual fix is tiny. This is an issue 
that I would assume anyone using Apache Kafka with local JBOD disk 
configurations at scale will be at the very least inconvenienced by, so please 
have a look.

Kind regards
noa

[jira] [Created] (KAFKA-10426) Deadlock in KafkaConfigBackingStore

2020-08-24 Thread Goltseva Taisiia (Jira)
Goltseva Taisiia created KAFKA-10426:


 Summary: Deadlock in KafkaConfigBackingStore
 Key: KAFKA-10426
 URL: https://issues.apache.org/jira/browse/KAFKA-10426
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.6.0, 2.4.1
Reporter: Goltseva Taisiia


Hi, guys!

We faced the following deadlock:

 
{code:java}
KafkaBasedLog Work Thread - _streaming_service_config
priority:5 - threadId:0x7f18ec22c000 - nativeId:0x950 - nativeId 
(decimal):2384 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at 
com.company.streaming.platform.kafka.DistributedHerder$ConfigUpdateListener.onSessionKeyUpdate(DistributedHerder.java:1586)
- waiting to lock <0xe6136808> (a 
com.company.streaming.platform.kafka.DistributedHerder)
at 
org.apache.kafka.connect.storage.KafkaConfigBackingStore$ConsumeCallback.onCompletion(KafkaConfigBackingStore.java:707)
- locked <0xd8c3be40> (a java.lang.Object)
at 
org.apache.kafka.connect.storage.KafkaConfigBackingStore$ConsumeCallback.onCompletion(KafkaConfigBackingStore.java:481)
at org.apache.kafka.connect.util.KafkaBasedLog.poll(KafkaBasedLog.java:264)
at org.apache.kafka.connect.util.KafkaBasedLog.access$500(KafkaBasedLog.java:71)
at 
org.apache.kafka.connect.util.KafkaBasedLog$WorkThread.run(KafkaBasedLog.java:337)



CustomDistributedHerder-connect-1
priority:5 - threadId:0x7f1a01e30800 - nativeId:0x93a - nativeId 
(decimal):2362 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.kafka.connect.storage.KafkaConfigBackingStore.snapshot(KafkaConfigBackingStore.java:285)
- waiting to lock <0xd8c3be40> (a java.lang.Object)
at 
com.company.streaming.platform.kafka.DistributedHerder.updateConfigsWithIncrementalCooperative(DistributedHerder.java:514)
- locked <0xe6136808> (a 
com.company.streaming.platform.kafka.DistributedHerder)
at 
com.company.streaming.platform.kafka.DistributedHerder.tick(DistributedHerder.java:402)
at 
com.company.streaming.platform.kafka.DistributedHerder.run(DistributedHerder.java:286)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){code}

The DistributedHerder entered the synchronized method
updateConfigsWithIncrementalCooperative() and called
configBackingStore.snapshot(), which takes a lock on an internal object in
the KafkaConfigBackingStore class.

 

Meanwhile, KafkaConfigBackingStore's ConsumeCallback, while inside a
synchronized block on that same internal object, received a SESSION_KEY
record and called updateListener.onSessionKeyUpdate(), which takes a lock on
the DistributedHerder.

 

As I can see the problem is here:

[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java#L737]

 

As I understand it, this call should be performed outside the synchronized block:
{code:java}
if (started)
    updateListener.onSessionKeyUpdate(KafkaConfigBackingStore.this.sessionKey);{code}
 

I'm going to make a PR.
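
A minimal sketch of the lock-ordering fix described above (capture state while
holding the store's internal lock, then invoke the cross-component listener
only after releasing it). ConfigStore and Listener are simplified stand-ins,
not the actual Connect classes:

```java
// Sketch of the fix: update state under the store's internal lock, but call
// the cross-object listener only after the lock is released. ConfigStore and
// Listener are hypothetical stand-ins, not the real Connect classes.
public class ListenerOutsideLock {

    interface Listener {
        void onSessionKeyUpdate(String key);
    }

    static class ConfigStore {
        private final Object lock = new Object();
        private String sessionKey;
        private boolean started = true;

        // Called from the log consume callback in the real code.
        void onSessionKeyRecord(String newKey, Listener listener) {
            String keyToPublish = null;
            synchronized (lock) {          // store-internal lock only
                sessionKey = newKey;
                if (started) {
                    keyToPublish = sessionKey;
                }
            }                              // lock released before the callback
            if (keyToPublish != null) {
                // Safe to take the herder's lock here: this thread no longer
                // holds the store lock, so the reverse acquisition order in
                // snapshot() can no longer deadlock against it.
                listener.onSessionKeyUpdate(keyToPublish);
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder seen = new StringBuilder();
        new ConfigStore().onSessionKeyRecord("key-1", seen::append);
        System.out.println(seen);
    }
}
```

With the callback moved outside the synchronized block, the herder thread can
hold its own lock and call snapshot() without either thread waiting on a lock
the other already holds.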

 





Re: [DISCUSSION] Upgrade system tests to python 3

2020-08-24 Thread Nikolay Izhikov
Hello.

PR [1] is ready.
Please, review.

But, I need help with the two following questions:

1. We need a new release of ducktape that includes the python3 fixes [2], [3].
I created an issue in the ducktape repo [4].
Can someone help me with the release?

2. I know that some companies run system tests for trunk on a regular basis.
Can someone show me the results of these runs,
so I can compare failures in my PR against trunk?

Results [5] of running all tests for my PR are available in the ticket [6]

```
SESSION REPORT (ALL TESTS)
ducktape version: 0.8.0
session_id:   2020-08-23--002
run time: 1010 minutes 46.483 seconds
tests run:684
passed:   505
failed:   9
ignored:  170
```

[1] https://github.com/apache/kafka/pull/9196
[2] 
https://github.com/confluentinc/ducktape/commit/23bd5ab53802e3a1e1da1ddf3630934f33b02305
[3] 
https://github.com/confluentinc/ducktape/commit/bfe53712f83b025832d29a43cde3de3d7803106f
[4] https://github.com/confluentinc/ducktape/issues/245
[5] https://issues.apache.org/jira/secure/attachment/13010366/report.txt
[6] https://issues.apache.org/jira/browse/KAFKA-10402

> 14 авг. 2020 г., в 21:26, Ismael Juma  написал(а):
> 
> +1
> 
> On Fri, Aug 14, 2020 at 7:42 AM John Roesler  wrote:
> 
>> Thanks Nikolay,
>> 
>> No objection. This would be very nice to have.
>> 
>> Thanks,
>> John
>> 
>> On Fri, Aug 14, 2020, at 09:18, Nikolay Izhikov wrote:
>>> Hello.
>>> 
 If anyone's interested in porting it to Python 3 it would be a good
>> change.
>>> 
>>> I’ve created a ticket [1] to upgrade system tests to python3.
>>> Does someone have any additional inputs or objections for this change?
>>> 
>>> [1] https://issues.apache.org/jira/browse/KAFKA-10402
>>> 
>>> 
 1 июля 2020 г., в 00:26, Gokul Ramanan Subramanian <
>> gokul24...@gmail.com> написал(а):
 
 Thanks Colin.
 
 While at the subject of system tests, there are a few times I see tests
 timed out (even on a large machine such as m5.4xlarge EC2 with Linux).
>> Are
 there any knobs that system tests provide to control timeouts /
>> throughputs
 across all tests?
 Thanks.
 
 On Tue, Jun 30, 2020 at 6:32 PM Colin McCabe 
>> wrote:
 
> Ducktape runs on Python 2.  You can't use it with Python 3, as you are
> trying to do here.
> 
> If anyone's interested in porting it to Python 3 it would be a good
>> change.
> 
> Otherwise, using docker as suggested here seems to be the best way to
>> go.
> 
> best,
> Colin
> 
> On Mon, Jun 29, 2020, at 02:14, Gokul Ramanan Subramanian wrote:
>> Hi.
>> 
>> Has anyone had luck running Kafka system tests on a Mac. I have a
>> MacOS
>> Mojave 10.14.6. I got Python 3.6.9 using pyenv. However, the command
>> *ducktape tests/kafkatest/tests* yields the following error, making
>> it
> look
>> like some Python incompatibility issue.
>> 
>> $ ducktape tests/kafkatest/tests
>> Traceback (most recent call last):
>> File "/Users/gokusubr/.pyenv/versions/3.6.9/bin/ducktape", line 11,
>> in
>> 
>>   load_entry_point('ducktape', 'console_scripts', 'ducktape')()
>> File
>> 
> 
>> "/Users/gokusubr/.pyenv/versions/3.6.9/lib/python3.6/site-packages/pkg_resources/__init__.py",
>> line 487, in load_entry_point
>>   return get_distribution(dist).load_entry_point(group, name)
>> File
>> 
> 
>> "/Users/gokusubr/.pyenv/versions/3.6.9/lib/python3.6/site-packages/pkg_resources/__init__.py",
>> line 2728, in load_entry_point
>>   return ep.load()
>> File
>> 
> 
>> "/Users/gokusubr/.pyenv/versions/3.6.9/lib/python3.6/site-packages/pkg_resources/__init__.py",
>> line 2346, in load
>>   return self.resolve()
>> File
>> 
> 
>> "/Users/gokusubr/.pyenv/versions/3.6.9/lib/python3.6/site-packages/pkg_resources/__init__.py",
>> line 2352, in resolve
>>   module = __import__(self.module_name, fromlist=['__name__'],
>> level=0)
>> File
>> 
> 
>> "/Users/gokusubr/.pyenv/versions/3.6.9/lib/python3.6/site-packages/ducktape-0.7.6-py3.6.egg/ducktape/command_line/main.py",
>> line 127
>>   print "parameters are not valid json: " + str(e.message)
>> ^
>> SyntaxError: invalid syntax
>> 
>> I followed the instructions in tests/README.md to setup a cluster of
>> 9
>> worker machines. That worked well. When I ran *python setup.py
>> develop*
> to
>> install the necessary dependencies (including ducktape), I got
>> similar
>> errors to above, but the overall command completed successfully.
>> 
>> Any help appreciated.
>> 
>> Thanks.
>> 
> 
>>> 
>>> 
>>