Build failed in Jenkins: kafka-2.1-jdk8 #151

2019-03-19 Thread Apache Jenkins Server
See 


Changes:

[matthias] MINOR: update docs JSON serde links (#6465)

[matthias] MINOR: updated names for deprecated streams constants (#6466)

--
[...truncated 917.83 KB...]

kafka.server.KafkaConfigTest > testListenerAndAdvertisedListenerNames STARTED

kafka.server.KafkaConfigTest > testListenerAndAdvertisedListenerNames PASSED

kafka.server.KafkaConfigTest > testNonroutableAdvertisedListeners STARTED

kafka.server.KafkaConfigTest > testNonroutableAdvertisedListeners PASSED

kafka.server.KafkaConfigTest > 
testInterBrokerListenerNameAndSecurityProtocolSet STARTED

kafka.server.KafkaConfigTest > 
testInterBrokerListenerNameAndSecurityProtocolSet PASSED

kafka.server.KafkaConfigTest > testFromPropsInvalid STARTED

kafka.server.KafkaConfigTest > testFromPropsInvalid PASSED

kafka.server.KafkaConfigTest > testInvalidCompressionType STARTED

kafka.server.KafkaConfigTest > testInvalidCompressionType PASSED

kafka.server.KafkaConfigTest > testAdvertiseHostNameDefault STARTED

kafka.server.KafkaConfigTest > testAdvertiseHostNameDefault PASSED

kafka.server.KafkaConfigTest > testLogRetentionTimeMinutesProvided STARTED

kafka.server.KafkaConfigTest > testLogRetentionTimeMinutesProvided PASSED

kafka.server.KafkaConfigTest > testValidCompressionType STARTED

kafka.server.KafkaConfigTest > testValidCompressionType PASSED

kafka.server.KafkaConfigTest > testUncleanElectionInvalid STARTED

kafka.server.KafkaConfigTest > testUncleanElectionInvalid PASSED

kafka.server.KafkaConfigTest > testListenerNamesWithAdvertisedListenerUnset 
STARTED

kafka.server.KafkaConfigTest > testListenerNamesWithAdvertisedListenerUnset 
PASSED

kafka.server.KafkaConfigTest > testLogRetentionTimeBothMinutesAndMsProvided 
STARTED

kafka.server.KafkaConfigTest > testLogRetentionTimeBothMinutesAndMsProvided 
PASSED

kafka.server.KafkaConfigTest > testLogRollTimeMsProvided STARTED

kafka.server.KafkaConfigTest > testLogRollTimeMsProvided PASSED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault STARTED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault PASSED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol PASSED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled STARTED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled PASSED

kafka.server.KafkaConfigTest > testInterBrokerVersionMessageFormatCompatibility 
STARTED

kafka.server.KafkaConfigTest > testInterBrokerVersionMessageFormatCompatibility 
PASSED

kafka.server.KafkaConfigTest > testAdvertisePortDefault STARTED

kafka.server.KafkaConfigTest > testAdvertisePortDefault PASSED

kafka.server.KafkaConfigTest > testVersionConfiguration STARTED

kafka.server.KafkaConfigTest > testVersionConfiguration PASSED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol PASSED

kafka.server.ListOffsetsRequestTest > testListOffsetsErrorCodes STARTED

kafka.server.ListOffsetsRequestTest > testListOffsetsErrorCodes PASSED

kafka.server.ListOffsetsRequestTest > testCurrentEpochValidation STARTED

kafka.server.ListOffsetsRequestTest > testCurrentEpochValidation PASSED

kafka.server.CreateTopicsRequestTest > testValidCreateTopicsRequests STARTED

kafka.server.CreateTopicsRequestTest > testValidCreateTopicsRequests PASSED

kafka.server.CreateTopicsRequestTest > testErrorCreateTopicsRequests STARTED

kafka.server.CreateTopicsRequestTest > testErrorCreateTopicsRequests PASSED

kafka.server.CreateTopicsRequestTest > testInvalidCreateTopicsRequests STARTED

kafka.server.CreateTopicsRequestTest > testInvalidCreateTopicsRequests PASSED

kafka.server.CreateTopicsRequestTest > testNotController STARTED

kafka.server.CreateTopicsRequestTest > testNotController PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.FetchRequestDownConversionConfigTest > 
testV1FetchWithDownConversionDisabled STARTED

kafka.server.FetchRequestDownConversionConfigTest > 
testV1FetchWithDownConversionDisabled PASSED

kafka.server.FetchRequestDownConversionConfigTest > testV1FetchFromReplica 
STARTED

kafka.server.FetchRequestDownConversionConfigTest > testV1FetchFromReplica 
PASSED

kafka.server.FetchRequestDownConversionConfigTest > 
testLatestFetchWithDownConversionDisabled STARTED

kafka.server.FetchRequestDownConversionConfigTest > 
testLatestFetchWithDownConversionDisabled PASSED

kafka.server.FetchRequestDownConversionConfigTest > 
testV1FetchWithTopicLevelOverrides 

Re: [DISCUSS] KIP-398: Support reading trust store from classpath

2019-03-19 Thread Jun Rao
Hi, Noa,

Thanks for the KIP and sorry for the late response.

The KIP itself looks reasonable. My only concern is that it's very specific
to SSL. We have other configurations that also depend on files (e.g. the keytab
in SASL/GSSAPI). It would be inconsistent if we only supported the
CLASSPATH: syntax in SSL. So, I was thinking that we could either (1) try
to support CLASSPATH: generally wherever a file is being used, or (2) just
rely on KIP-383, which seems more general-purpose.
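
For illustration, here is a minimal sketch (plain JDK APIs only; not the KIP's
proposed implementation) of how a trust store location carrying a classpath:
or file: prefix could be resolved:

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

public class TrustStoreLocations {
    // Resolve a location that may carry a "classpath:" or "file:" prefix.
    public static InputStream open(String location) throws IOException {
        if (location.startsWith("classpath:")) {
            String resource = location.substring("classpath:".length());
            InputStream in = Thread.currentThread().getContextClassLoader()
                .getResourceAsStream(resource);
            if (in == null)
                throw new FileNotFoundException("classpath resource not found: " + resource);
            return in;
        }
        // Plain filesystem path; an explicit "file:" prefix is stripped first.
        String path = location.startsWith("file:")
                ? location.substring("file:".length()) : location;
        return new FileInputStream(path);
    }
}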

Jun



On Tue, Dec 18, 2018 at 2:03 AM Noa Resare  wrote:

> I believe that I have addressed the concerns raised in this discussion. It
> seems reasonable to start a vote in about two days. Please raise any
> concerns you may have and I will be happy to attempt to address them.
>
> /noa
>
> > On 10 Dec 2018, at 10:53, Noa Resare  wrote:
> >
> > Thank you for your comments, see replies inline.
> >
> >> On 9 Dec 2018, at 01:33, Harsha  wrote:
> >>
> >> Hi Noa,
> >>  Based on KIP's motivation section
> >> "If we had the ability to load a trust store from the classpath as well
> as from a file, the trust store could be shipped in a jar that could be
> declared as a dependency and piggyback on the distribution infrastructure
> already in place."
> >>
> >> It looks like you are making the assumption that distributing a jar is
> better than distributing a file. I am not sure why one is better than the other. There
> are other use-cases where one can make a call to a local "daemon" over a Unix
> socket to fetch a certificate as well.
> >
> > It was not my intention to convey that loading the trust store from the
> classpath is inherently better in all cases. The proposed change simply
> brings more choice. That said, I do believe that maven central and the
> transitive dependency features of maven, gradle and ivy makes for a
> smoother user experience in many cases. Validating broker certificates
> against an organisation-wide private CA cert has benefits in that it means
> that the person setting up the kafka cluster(s) doesn’t need to bother with
> purchasing or obtaining publicly trusted certs for every broker while still
> providing strong cryptographic validation that a client is connecting to a
> legitimate endpoint. If there was a way to distribute a private trust store
> that is as easy as declaring an additional maven style dependency, I
> imagine that this would be a more appealing proposition than it is today. I
> would assume that many organisations opt to disable strict host checking in
> certificates to sidestep the CA cert distribution problem. I think it would
> be valuable to make it slightly easier to do the right thing.
> >
> >>
> >> Just supporting a "classpath" option might work for a few users but
> it's not generic enough to support a wide variety of other infrastructures.
> My suggestion if the KIP motivation is to make the certificate/truststore
> available with different mechanisms, Lets make a interface that allow users
> to roll their own based on their infra and support File as the default
> mechanism so that we can support existing users.
> >
> > I agree that this is a fairly small change that doesn’t aim to support
> all possible mechanisms that one might conceive of. I believe that KIP-383:
> Pluggable interface for SSL Factory <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-383:++Pluggable+interface+for+SSL+Factory>
> might be a good vehicle to provide this level of flexibility, but even if
> that proposal is accepted I still think that there is room for this KIP to
> provide an easy to use solution to this problem.
> >
> > cheers
> > noa
> >
> >> -Harsha
> >>
> >> On Sat, Dec 8, 2018, at 7:03 AM, Noa Resare wrote:
> >>>
> >>>
>  On 6 Dec 2018, at 20:16, Rajini Sivaram 
> wrote:
> 
>  Hi Noa,
> 
>  Thanks for the KIP. A few comments/questions:
> 
>  1. If we support filenames starting with `classpath:` by requiring
>  `file:`prefix,
>  then we are presumably not supporting files starting `file:`. Not
>  necessarily an issue, but we do need to document any restrictions.
> >>>
> >>> I think that it would be trivial to support ‘file:’ as a prefix in a
> >>> filesystem path
> >>> by just asking the users who really want that to add it twice:
> >>>
> >>> The config value "file:file:my_weird_file_name" would map to the
> >>> filesystem path "file:my_weird_file_name”
> >>>
> >>>
>  2. On the broker-side, trust stores are dynamically updatable. And we
> use
>  file modification time to decide whether trust store needs to be
> reloaded.
>  This is less of an issue once we implement
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+IncrementalAlterConfigs+API
> ,
>  but at the moment, we are relying on actual files on the file system
> for
>  which we can compare modification times.
> >>>
>  3. On the client-side, trust stores are not currently updatable. And
> we
>  don't have an API to make them updatable. By using 

Build failed in Jenkins: kafka-2.1-jdk8 #150

2019-03-19 Thread Apache Jenkins Server
See 


Changes:

[jason] MINOR: Improve logging around index files (#6385)

--
[...truncated 463.67 KB...]

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendPrepareAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAcceptInitPidAndReturnNextPidWhenTransactionalIdIsNull STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAcceptInitPidAndReturnNextPidWhenTransactionalIdIsNull PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAbortExpiredTransactionsInOngoingStateAndBumpEpoch STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAbortExpiredTransactionsInOngoingStateAndBumpEpoch PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldGenerateNewProducerIdIfEpochsExhausted STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldGenerateNewProducerIdIfEpochsExhausted PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithNotCoordinatorOnInitPidWhenNotCoordinator STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithNotCoordinatorOnInitPidWhenNotCoordinator PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldInitPidWithEpochZeroForNewTransactionalId STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldInitPidWithEpochZeroForNewTransactionalId PASSED

kafka.network.SocketServerTest > testGracefulClose STARTED

kafka.network.SocketServerTest > testGracefulClose PASSED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone STARTED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone PASSED

kafka.network.SocketServerTest > controlThrowable STARTED

kafka.network.SocketServerTest > controlThrowable PASSED

kafka.network.SocketServerTest > testRequestMetricsAfterStop STARTED

kafka.network.SocketServerTest > testRequestMetricsAfterStop PASSED

kafka.network.SocketServerTest > testConnectionIdReuse STARTED

kafka.network.SocketServerTest > testConnectionIdReuse PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
STARTED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
PASSED

kafka.network.SocketServerTest > testProcessorMetricsTags STARTED

kafka.network.SocketServerTest > testProcessorMetricsTags PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp STARTED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testConnectionId STARTED

kafka.network.SocketServerTest > testConnectionId PASSED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics 

Jenkins build is back to normal : kafka-trunk-jdk8 #3478

2019-03-19 Thread Apache Jenkins Server
See 




Re: [VOTE] 2.2.0 RC2

2019-03-19 Thread Satish Duggana
+1 (non-binding)

- Ran testAll/releaseTarGzAll successfully with no failures.
- Ran through quickstart of core/streams on builds generated from 2.2.0-rc2
tag
- Ran a few internal apps targeting topics on a 3-node cluster.

Thanks for running the release Matthias!

On Wed, Mar 20, 2019 at 12:43 AM Manikumar 
wrote:

> +1 (non-binding)
>
> - Verified the artifacts, build from src, ran tests
> - Verified the quickstart, ran producer/consumer performance tests.
>
> Thanks for running the release!
>
> Thanks,
> Manikumar
>
> On Wed, Mar 20, 2019 at 12:19 AM David Arthur 
> wrote:
>
> > +1
> >
> > Validated signatures, and ran through quick-start.
> >
> > Thanks!
> >
> > On Mon, Mar 18, 2019 at 4:00 AM Jakub Scholz  wrote:
> >
> > > +1 (non-binding). I used the staged binaries and ran some of my tests
> > > against them. All seems to look good to me.
> > >
> > > On Sat, Mar 9, 2019 at 11:56 PM Matthias J. Sax  >
> > > wrote:
> > >
> > > > Hello Kafka users, developers and client-developers,
> > > >
> > > > This is the third candidate for release of Apache Kafka 2.2.0.
> > > >
> > > >  - Added SSL support for custom principal name
> > > >  - Allow SASL connections to periodically re-authenticate
> > > >  - Command line tool bin/kafka-topics.sh adds AdminClient support
> > > >  - Improved consumer group management
> > > >- default group.id is `null` instead of empty string
> > > >  - API improvement
> > > >- Producer: introduce close(Duration)
> > > >- AdminClient: introduce close(Duration)
> > > >- Kafka Streams: new flatTransform() operator in Streams DSL
> > > >- KafkaStreams (and other classes) now implement AutoCloseable to
> > > > support try-with-resources
> > > >- New Serdes and default method implementations
> > > >  - Kafka Streams exposed internal client.id via ThreadMetadata
> > > >  - Metric improvements:  All `-min`, `-avg` and `-max` metrics will
> now
> > > > output `NaN` as default value
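
A small usage sketch of the close(Duration) and AutoCloseable additions listed
above (the topology and client configs are assumed to exist; they are not taken
from the release notes):

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.Topology;

public class CloseExamples {
    public static void run(Topology topology, Properties streamsProps, Properties producerProps) {
        // KafkaStreams implements AutoCloseable, so try-with-resources closes it for us.
        try (KafkaStreams streams = new KafkaStreams(topology, streamsProps)) {
            streams.start();
        }
        // The producer (and AdminClient) now accept a Duration-based close timeout.
        KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
        producer.close(Duration.ofSeconds(10));
    }
}
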
> > > > Release notes for the 2.2.0 release:
> > > > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/RELEASE_NOTES.html
> > > >
> > > > *** Please download, test, and vote by Thursday, March 14, 9am PST.
> > > >
> > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > https://kafka.apache.org/KEYS
> > > >
> > > > * Release artifacts to be voted upon (source and binary):
> > > > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/
> > > >
> > > > * Maven artifacts to be voted upon:
> > > >
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > >
> > > > * Javadoc:
> > > > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/javadoc/
> > > >
> > > > * Tag to be voted upon (off 2.2 branch) is the 2.2.0 tag:
> > > > https://github.com/apache/kafka/releases/tag/2.2.0-rc2
> > > >
> > > > * Documentation:
> > > > https://kafka.apache.org/22/documentation.html
> > > >
> > > > * Protocol:
> > > > https://kafka.apache.org/22/protocol.html
> > > >
> > > > * Jenkins builds for the 2.2 branch:
> > > > Unit/integration tests:
> https://builds.apache.org/job/kafka-2.2-jdk8/
> > > > System tests:
> > > https://jenkins.confluent.io/job/system-test-kafka/job/2.2/
> > > >
> > > > /**
> > > >
> > > > Thanks,
> > > >
> > > > -Matthias
> > > >
> > > >
> > >
> >
>


Build failed in Jenkins: kafka-1.1-jdk7 #253

2019-03-19 Thread Apache Jenkins Server
See 


Changes:

[jason] MINOR: Improve logging around index files (#6385)

--
[...truncated 1.93 MB...]
org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled 
STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfNotAtLestOnceOrExactlyOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfNotAtLestOnceOrExactlyOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingConsumerIsolationLevelIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingConsumerIsolationLevelIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfRestoreConsumerConfig STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfRestoreConsumerConfig PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfProducerConfig STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfProducerConfig PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeBackwardsCompatibleWithDeprecatedConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeBackwardsCompatibleWithDeprecatedConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeSupportNonPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeSupportNonPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConifgsOnRestoreConsumer STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConifgsOnRestoreConsumer PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptExactlyOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptExactlyOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetInternalLeaveGroupOnCloseConfigToFalseInConsumer STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetInternalLeaveGroupOnCloseConfigToFalseInConsumer PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfConsumerConfig STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfConsumerConfig PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldForwardCustomConfigsWithNoPrefixToAllClients STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldForwardCustomConfigsWithNoPrefixToAllClients PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfRestoreConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfRestoreConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSpecifyCorrectValueSerdeClassOnError STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSpecifyCorrectValueSerdeClassOnError PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSpecifyCorrectValueSerdeClassOnErrorUsingDeprecatedConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 

Re: [DISCUSS] KIP-392: Allow consumers to fetch from the closest replica

2019-03-19 Thread Jason Gustafson
Hi Ryanne,

Thanks for the comment. If I understand your question correctly, I think
the answer is no. I would expect typical selection logic to consider
replica availability first before any other factor. In some cases, however,
a user may put a higher priority on saving cross-dc traffic costs. If a
preferred replica is unavailable, they may prefer to wait some time for it
to be restored before routing traffic elsewhere. Does that make sense?

Best,
Jason

On Tue, Mar 19, 2019 at 3:43 PM Ryanne Dolan  wrote:

> Jason, awesome KIP.
>
> I'm wondering how this change would affect availability of the cluster when
> a rack is unreachable. Is there a scenario where availability is improved
> or impaired due to the proposed changes?
>
> Ryanne
>
> On Tue, Mar 19, 2019 at 4:32 PM Jason Gustafson 
> wrote:
>
> > Hi Jun,
> >
> > Yes, that makes sense to me. I have added a ClientMetadata class which
> > encapsulates various metadata including the rackId and the client address
> > information.
> >
> > Thanks,
> > Jason
> >
> > On Tue, Mar 19, 2019 at 2:17 PM Jun Rao  wrote:
> >
> > > Hi, Jason,
> > >
> > > Thanks for the updated KIP. Just one more comment below.
> > >
> > > 100. The ReplicaSelector class has the following method. I am wondering
> > if
> > > we should additionally pass in the client connection info to the
> method.
> > > For example, if rackId is not set, the plugin could potentially select
> > the
> > > replica based on the IP address of the client.
> > >
> > > Node select(String rackId, PartitionInfo partitionInfo)
> > >
> > > Jun
> > >
> > >
> > > On Mon, Mar 11, 2019 at 4:24 PM Jason Gustafson 
> > > wrote:
> > >
> > > > Hey Everyone,
> > > >
> > > > Apologies for the long delay. I am picking this work back up.
> > > >
> > > > After giving this some further thought, I decided it makes the most
> > sense
> > > > to move replica selection logic into the broker. It is much more
> > > difficult
> > > > to coordinate selection logic in a multi-tenant environment if
> > operators
> > > > have to coordinate plugins across all client applications (not to
> > mention
> > > > other languages). Take a look at the updates and let me know what you
> > > > think:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica
> > > > .
> > > >
> > > > Thanks!
> > > > Jason
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Jan 11, 2019 at 10:49 AM Jun Rao  wrote:
> > > >
> > > > > Hi, Jason,
> > > > >
> > > > > Thanks for the updated KIP. Looks good overall. Just a few minor
> > > > comments.
> > > > >
> > > > > 20. For case 2, if the consumer receives an OFFSET_NOT_AVAILABLE, I
> > am
> > > > > wondering if the consumer should refresh the metadata before
> > retrying.
> > > > This
> > > > > can allow the consumer to switch to an in-sync replica sooner.
> > > > >
> > > > > 21. Under "protocol changes", there is a sentence "This allows the
> > > > broker "
> > > > > that seems broken.
> > > > >
> > > > > 4. About reducing the ISR propagation delay from the broker to the
> > > > > controller. Jiangjie made that change in KAFKA-2722. Jiangjie,
> could
> > > you
> > > > > comment on whether it's reasonable to reduce the propagation delay
> > now?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Wed, Jan 2, 2019 at 11:06 AM Jason Gustafson <
> ja...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Hey Jun,
> > > > > >
> > > > > > Sorry for the late reply. I have been giving your comments some
> > > > thought.
> > > > > > Replies below:
> > > > > >
> > > > > > 1. The section on handling FETCH_OFFSET_TOO_LARGE error says "Use
> > the
> > > > > > > OffsetForLeaderEpoch API to verify the current position with
> the
> > > > > leader".
> > > > > > > The OffsetForLeaderEpoch request returns log end offset if the
> > > > request
> > > > > > > leader epoch is the latest. So, we won't know the true high
> > > watermark
> > > > > > from
> > > > > > > that request. It seems that the consumer still needs to send
> > > > ListOffset
> > > > > > > request to the leader to obtain high watermark?
> > > > > >
> > > > > >
> > > > > > That's a good point. I think we missed this in KIP-320. I've
> added
> > a
> > > > > > replica_id to the OffsetsForLeaderEpoch API to match the Fetch
> and
> > > > > > ListOffsets API so that the broker can avoid exposing offsets
> > beyond
> > > > the
> > > > > > high watermark. This also means that the OffsetsForLeaderEpoch
> API
> > > > needs
> > > > > > the same handling we added to the ListOffsets API to avoid
> > > > non-monotonic
> > > > > or
> > > > > > incorrect responses. Similarly, I've proposed using the
> > > > > > OFFSET_NOT_AVAILABLE error code in cases where the end offset of
> an
> > > > epoch
> > > > > > would exceed the high watermark. When querying the latest epoch,
> > the
> > > > > leader
> > > > > > will return OFFSET_NOT_AVAILABLE until the high watermark has
> > reached
> > > > an
> > 

Re: [DISCUSS] KIP-392: Allow consumers to fetch from the closest replica

2019-03-19 Thread Ryanne Dolan
Jason, awesome KIP.

I'm wondering how this change would affect availability of the cluster when
a rack is unreachable. Is there a scenario where availability is improved
or impaired due to the proposed changes?

Ryanne

On Tue, Mar 19, 2019 at 4:32 PM Jason Gustafson  wrote:

> Hi Jun,
>
> Yes, that makes sense to me. I have added a ClientMetadata class which
> encapsulates various metadata including the rackId and the client address
> information.
>
> Thanks,
> Jason
>
> On Tue, Mar 19, 2019 at 2:17 PM Jun Rao  wrote:
>
> > Hi, Jason,
> >
> > Thanks for the updated KIP. Just one more comment below.
> >
> > 100. The ReplicaSelector class has the following method. I am wondering
> if
> > we should additionally pass in the client connection info to the method.
> > For example, if rackId is not set, the plugin could potentially select
> the
> > replica based on the IP address of the client.
> >
> > Node select(String rackId, PartitionInfo partitionInfo)
> >
> > Jun
> >
> >
> > On Mon, Mar 11, 2019 at 4:24 PM Jason Gustafson 
> > wrote:
> >
> > > Hey Everyone,
> > >
> > > Apologies for the long delay. I am picking this work back up.
> > >
> > > After giving this some further thought, I decided it makes the most
> sense
> > > to move replica selection logic into the broker. It is much more
> > difficult
> > > to coordinate selection logic in a multi-tenant environment if
> operators
> > > have to coordinate plugins across all client applications (not to
> mention
> > > other languages). Take a look at the updates and let me know what you
> > > think:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica
> > > .
> > >
> > > Thanks!
> > > Jason
> > >
> > >
> > >
> > >
> > > On Fri, Jan 11, 2019 at 10:49 AM Jun Rao  wrote:
> > >
> > > > Hi, Jason,
> > > >
> > > > Thanks for the updated KIP. Looks good overall. Just a few minor
> > > comments.
> > > >
> > > > 20. For case 2, if the consumer receives an OFFSET_NOT_AVAILABLE, I
> am
> > > > wondering if the consumer should refresh the metadata before
> retrying.
> > > This
> > > > can allow the consumer to switch to an in-sync replica sooner.
> > > >
> > > > 21. Under "protocol changes", there is a sentence "This allows the
> > > broker "
> > > > that seems broken.
> > > >
> > > > 4. About reducing the ISR propagation delay from the broker to the
> > > > controller. Jiangjie made that change in KAFKA-2722. Jiangjie, could
> > you
> > > > comment on whether it's reasonable to reduce the propagation delay
> now?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Wed, Jan 2, 2019 at 11:06 AM Jason Gustafson 
> > > > wrote:
> > > >
> > > > > Hey Jun,
> > > > >
> > > > > Sorry for the late reply. I have been giving your comments some
> > > thought.
> > > > > Replies below:
> > > > >
> > > > > 1. The section on handling FETCH_OFFSET_TOO_LARGE error says "Use
> the
> > > > > > OffsetForLeaderEpoch API to verify the current position with the
> > > > leader".
> > > > > > The OffsetForLeaderEpoch request returns log end offset if the
> > > request
> > > > > > leader epoch is the latest. So, we won't know the true high
> > watermark
> > > > > from
> > > > > > that request. It seems that the consumer still needs to send
> > > ListOffset
> > > > > > request to the leader to obtain high watermark?
> > > > >
> > > > >
> > > > > That's a good point. I think we missed this in KIP-320. I've added
> a
> > > > > replica_id to the OffsetsForLeaderEpoch API to match the Fetch and
> > > > > ListOffsets API so that the broker can avoid exposing offsets
> beyond
> > > the
> > > > > high watermark. This also means that the OffsetsForLeaderEpoch API
> > > needs
> > > > > the same handling we added to the ListOffsets API to avoid
> > > non-monotonic
> > > > or
> > > > > incorrect responses. Similarly, I've proposed using the
> > > > > OFFSET_NOT_AVAILABLE error code in cases where the end offset of an
> > > epoch
> > > > > would exceed the high watermark. When querying the latest epoch,
> the
> > > > leader
> > > > > will return OFFSET_NOT_AVAILABLE until the high watermark has
> reached
> > > an
> > > > > offset in the leader's current epoch.
> > > > >
> > > > > By the way, I've modified the KIP to drop the OFFSET_TOO_LARGE and
> > > > > OFFSET_TOO_SMALL error codes that I initially proposed. I realized
> > that
> > > > we
> > > > > could continue to use the current OFFSET_OUT_OF_RANGE error and
> rely
> > on
> > > > the
> > > > > returned start offset to distinguish the two cases.
> > > > >
> > > > > 2. If a non in-sync replica receives a fetch request from a
> consumer,
> > > > > > should it return a new type of error like ReplicaNotInSync?
> > > > >
> > > > >
> > > > > I gave this quite a bit of thought. It is impossible to avoid
> > fetching
> > > > from
> > > > > out-of-sync replicas in general due to propagation of the ISR
> state.
> > > The
> > > > > high watermark that is returned in fetch responses 

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-19 Thread George Li
 Hi Viktor,

Thanks for the review. 

If there is a reassignment in progress while the cluster is upgraded with this 
KIP (upgrade the binary and then do a cluster rolling restart of the brokers), 
the reassignment JSON in ZooKeeper /admin/reassign_partitions will only have 
the {topic, partition, replicas(new)} info for batches of reassignments kicked 
off before the upgrade, without the "original_replicas" info per 
topic/partition. So when the user tries to cancel/rollback the reassignments, 
it's going to fail and the cancellation will be skipped (the code in this KIP 
checks whether "original_replicas" is present in /admin/reassign_partitions).
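
To illustrate the difference (the field name follows this discussion; the exact
JSON layout here is only a sketch, not a finalized format), the znode content
would look roughly like:

  before the upgrade (no rollback info):
    {"version":1,"partitions":[{"topic":"foo","partition":0,"replicas":[4,5,6]}]}

  with this KIP (original_replicas retained, so cancellation/rollback is possible):
    {"version":1,"partitions":[{"topic":"foo","partition":0,"replicas":[4,5,6],"original_replicas":[1,2,3]}]}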

The user either has to wait for the current reassignments to finish or do quite 
a bit of manual work to cancel them (delete the ZK node, bounce the controller, 
re-submit reassignments with the original replicas to roll back, provided the 
original replicas were recorded before the last batch of reassignments was submitted).

I think this scenario of reassignments being kicked off by end-users, not by the 
team(s) that manage the Kafka infrastructure, might be rare (maybe in some very 
small companies?), since only one batch of reassignments can be running at a 
given time in /admin/reassign_partitions. The end-users need some 
co-ordination for submitting reassignments.

Thanks,
George


On Tuesday, March 19, 2019, 3:34:20 AM PDT, Viktor Somogyi-Vass 
 wrote:  
 
 Hey George,
Thanks for the answers. I'll try to block out time this week to review your PR.
I have one more point to clarify: I've seen some customers who are managing 
Kafka as an internal company-wide service, and they may or may not know how 
certain topics are used within the company. That might mean that some clients 
can start a reassignment at random times. Let's suppose that such a reassignment 
starts just when the Kafka operations team starts upgrading the cluster to a 
version that contains this KIP. The question is: do you think that we need to 
handle upgrade scenarios where there is an in-progress reassignment?
Thanks,
Viktor

On Tue, Mar 19, 2019 at 6:16 AM George Li  wrote:

 Hi Viktor,
FYI, I have added a new ducktape test:  
tests/kafkatest/tests/core/reassign_cancel_test.py to 
https://github.com/apache/kafka/pull/6296  
After review, do you have any more questions?  Thanks

Hi Jun, 
Could you help review this when you have time?  Thanks

Can we start a vote on this KIP in one or two weeks? 

Thanks,
George


On Tuesday, March 5, 2019, 10:58:45 PM PST, George Li 
 wrote:  
 
  Hi Viktor,


>  2.: One follow-up question: if the reassignment cancellation gets 
>interrupted and a failover happens after step #2 but before step #3, how will 
>the new controller continue? At this stage Zookeeper would contain OAR + RAR, 
>however the brokers will have the updated LeaderAndIsr about OAR, so they 
>won't know about RAR. I would suppose the new controller would start from the 
>beginning as it only knows what's in Zookeeper. Is that true?

The OAR (Original Assigned Replicas) for the rollback is stored in 
/admin/reassign_partitions for each topic/partition reassignment.  During the 
controller failover, the new controller will read /admin/reassign_partitions 
for both new_replicas AND original_replicas into 
controllerContext.partitionsBeingReassigned

and then perform cancellation/rollback of the pending reassignments.  The 
originalReplicas field is added below:
case class ReassignedPartitionsContext(var newReplicas: Seq[Int] = Seq.empty,
                                       var originalReplicas: Seq[Int] = Seq.empty,
                                       val reassignIsrChangeHandler: PartitionReassignmentIsrChangeHandler) {


> 2.1: Another interesting question is: what are those replicas doing 
> which are online but not part of the leader and ISR? Are they still 
> replicating? Are they safe to lie around for the time being?

I think they are just dangling without being replicated.  Upon a LeaderAndIsr 
request, makeFollowers() will call 
replicaFetcherManager.removeFetcherForPartitions(), and they will not be added 
back to a ReplicaFetcher since they are not in the current AR (Assigned Replicas). 
Step 4 (StopReplica) will delete them.


I am adding a new ducktape test, 
tests/kafkatest/tests/core/reassign_cancel_test.py, for cancelling pending 
reassignments, but it's not easy to simulate controller failover during the 
cancellation since the cancellation/rollback completes so quickly. See below. 
I do have a unit/integration test to simulate controller failover: 
shouldTriggerReassignmentOnControllerStartup()


Thanks,
George



Warning: You must run Verify periodically, until the reassignment completes, to 
ensure the throttle is removed. You can also alter the throttle by rerunning 
the Execute command passing a new value.
The inter-broker throttle limit was set to 1024 B/s
Successfully started reassignment of partitions.
[INFO  - 2019-03-05 04:20:48,321 - kafka - execute_reassign_cancel - 

Re: [DISCUSS] KIP-392: Allow consumers to fetch from the closest replica

2019-03-19 Thread Jason Gustafson
Hi Jun,

Yes, that makes sense to me. I have added a ClientMetadata class which
encapsulates various metadata including the rackId and the client address
information.
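
A rough sketch of what such a holder class might contain, based only on the
description above (the actual definition in the KIP may differ):

import java.net.InetAddress;

public class ClientMetadata {
    private final String rackId;
    private final String clientId;
    private final InetAddress clientAddress;

    public ClientMetadata(String rackId, String clientId, InetAddress clientAddress) {
        this.rackId = rackId;
        this.clientId = clientId;
        this.clientAddress = clientAddress;
    }

    public String rackId() { return rackId; }
    public String clientId() { return clientId; }
    public InetAddress clientAddress() { return clientAddress; }
}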

Thanks,
Jason

On Tue, Mar 19, 2019 at 2:17 PM Jun Rao  wrote:

> Hi, Jason,
>
> Thanks for the updated KIP. Just one more comment below.
>
> 100. The ReplicaSelector class has the following method. I am wondering if
> we should additionally pass in the client connection info to the method.
> For example, if rackId is not set, the plugin could potentially select the
> replica based on the IP address of the client.
>
> Node select(String rackId, PartitionInfo partitionInfo)
>
> Jun
>
>
> On Mon, Mar 11, 2019 at 4:24 PM Jason Gustafson 
> wrote:
>
> > Hey Everyone,
> >
> > Apologies for the long delay. I am picking this work back up.
> >
> > After giving this some further thought, I decided it makes the most sense
> > to move replica selection logic into the broker. It is much more
> difficult
> > to coordinate selection logic in a multi-tenant environment if operators
> > have to coordinate plugins across all client applications (not to mention
> > other languages). Take a look at the updates and let me know what you
> > think:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica
> > .
> >
> > Thanks!
> > Jason
> >
> >
> >
> >
> > On Fri, Jan 11, 2019 at 10:49 AM Jun Rao  wrote:
> >
> > > Hi, Jason,
> > >
> > > Thanks for the updated KIP. Looks good overall. Just a few minor
> > comments.
> > >
> > > 20. For case 2, if the consumer receives an OFFSET_NOT_AVAILABLE, I am
> > > wondering if the consumer should refresh the metadata before retrying.
> > This
> > > can allow the consumer to switch to an in-sync replica sooner.
> > >
> > > 21. Under "protocol changes", there is a sentence "This allows the
> > broker "
> > > that seems broken.
> > >
> > > 4. About reducing the ISR propagation delay from the broker to the
> > > controller. Jiangjie made that change in KAFKA-2722. Jiangjie, could
> you
> > > comment on whether it's reasonable to reduce the propagation delay now?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Jan 2, 2019 at 11:06 AM Jason Gustafson 
> > > wrote:
> > >
> > > > Hey Jun,
> > > >
> > > > Sorry for the late reply. I have been giving your comments some
> > thought.
> > > > Replies below:
> > > >
> > > > 1. The section on handling FETCH_OFFSET_TOO_LARGE error says "Use the
> > > > > OffsetForLeaderEpoch API to verify the current position with the
> > > leader".
> > > > > The OffsetForLeaderEpoch request returns log end offset if the
> > request
> > > > > leader epoch is the latest. So, we won't know the true high
> watermark
> > > > from
> > > > > that request. It seems that the consumer still needs to send
> > ListOffset
> > > > > request to the leader to obtain high watermark?
> > > >
> > > >
> > > > That's a good point. I think we missed this in KIP-320. I've added a
> > > > replica_id to the OffsetsForLeaderEpoch API to match the Fetch and
> > > > ListOffsets API so that the broker can avoid exposing offsets beyond
> > the
> > > > high watermark. This also means that the OffsetsForLeaderEpoch API
> > needs
> > > > the same handling we added to the ListOffsets API to avoid
> > non-monotonic
> > > or
> > > > incorrect responses. Similarly, I've proposed using the
> > > > OFFSET_NOT_AVAILABLE error code in cases where the end offset of an
> > epoch
> > > > would exceed the high watermark. When querying the latest epoch, the
> > > leader
> > > > will return OFFSET_NOT_AVAILABLE until the high watermark has reached
> > an
> > > > offset in the leader's current epoch.
> > > >
> > > > By the way, I've modified the KIP to drop the OFFSET_TOO_LARGE and
> > > > OFFSET_TOO_SMALL error codes that I initially proposed. I realized
> that
> > > we
> > > > could continue to use the current OFFSET_OUT_OF_RANGE error and rely
> on
> > > the
> > > > returned start offset to distinguish the two cases.
> > > >
> > > > 2. If a non in-sync replica receives a fetch request from a consumer,
> > > > > should it return a new type of error like ReplicaNotInSync?
> > > >
> > > >
> > > > I gave this quite a bit of thought. It is impossible to avoid
> fetching
> > > from
> > > > out-of-sync replicas in general due to propagation of the ISR state.
> > The
> > > > high watermark that is returned in fetch responses could be used as a
> > > more
> > > > timely substitute, but we still can't assume that followers will
> always
> > > > know when they are in-sync. From a high level, this means that the
> > > consumer
> > > > anyway has to take out of range errors with a grain of salt if they
> > come
> > > > from followers. This is only a problem when switching between
> replicas
> > or
> > > > if resuming from a committed offset. If a consumer is following the
> > same
> > > > out-of-sync replica, then its position will stay in range and, other
> > than
> > > > some extra latency, no 

[jira] [Created] (KAFKA-8129) Shade Kafka client dependencies

2019-03-19 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-8129:
--

 Summary: Shade Kafka client dependencies
 Key: KAFKA-8129
 URL: https://issues.apache.org/jira/browse/KAFKA-8129
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Colin P. McCabe
Assignee: Colin P. McCabe


The Kafka client should shade its library dependencies.  This will ensure that 
its dependencies don't collide with those employed by users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-392: Allow consumers to fetch from the closest replica

2019-03-19 Thread Jun Rao
Hi, Jason,

Thanks for the updated KIP. Just one more comment below.

100. The ReplicaSelector class has the following method. I am wondering if
we should additionally pass in the client connection info to the method.
For example, if rackId is not set, the plugin could potentially select the
replica based on the IP address of the client.

Node select(String rackId, PartitionInfo partitionInfo)
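
For illustration, a minimal sketch of a rack-matching implementation of the
method above (ReplicaSelector itself is the interface proposed by the KIP;
PartitionInfo and Node are the existing org.apache.kafka.common classes):

import org.apache.kafka.common.Node;
import org.apache.kafka.common.PartitionInfo;

public class RackAwareReplicaSelector {
    // Prefer an in-rack replica when the client supplied a rackId;
    // otherwise fall back to the leader.
    public Node select(String rackId, PartitionInfo partitionInfo) {
        if (rackId != null) {
            for (Node replica : partitionInfo.replicas()) {
                if (rackId.equals(replica.rack()))
                    return replica;
            }
        }
        return partitionInfo.leader();
    }
}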

Jun


On Mon, Mar 11, 2019 at 4:24 PM Jason Gustafson  wrote:

> Hey Everyone,
>
> Apologies for the long delay. I am picking this work back up.
>
> After giving this some further thought, I decided it makes the most sense
> to move replica selection logic into the broker. It is much more difficult
> to coordinate selection logic in a multi-tenant environment if operators
> have to coordinate plugins across all client applications (not to mention
> other languages). Take a look at the updates and let me know what you
> think:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica
> .
>
> Thanks!
> Jason
>
>
>
>
> On Fri, Jan 11, 2019 at 10:49 AM Jun Rao  wrote:
>
> > Hi, Jason,
> >
> > Thanks for the updated KIP. Looks good overall. Just a few minor
> comments.
> >
> > 20. For case 2, if the consumer receives an OFFSET_NOT_AVAILABLE, I am
> > wondering if the consumer should refresh the metadata before retrying.
> This
> > can allow the consumer to switch to an in-sync replica sooner.
> >
> > 21. Under "protocol changes", there is a sentence "This allows the
> broker "
> > that seems broken.
> >
> > 4. About reducing the ISR propagation delay from the broker to the
> > controller. Jiangjie made that change in KAFKA-2722. Jiangjie, could you
> > comment on whether it's reasonable to reduce the propagation delay now?
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Jan 2, 2019 at 11:06 AM Jason Gustafson 
> > wrote:
> >
> > > Hey Jun,
> > >
> > > Sorry for the late reply. I have been giving your comments some
> thought.
> > > Replies below:
> > >
> > > 1. The section on handling FETCH_OFFSET_TOO_LARGE error says "Use the
> > > > OffsetForLeaderEpoch API to verify the current position with the
> > leader".
> > > > The OffsetForLeaderEpoch request returns log end offset if the
> request
> > > > leader epoch is the latest. So, we won't know the true high watermark
> > > from
> > > > that request. It seems that the consumer still needs to send
> ListOffset
> > > > request to the leader to obtain high watermark?
> > >
> > >
> > > That's a good point. I think we missed this in KIP-320. I've added a
> > > replica_id to the OffsetsForLeaderEpoch API to match the Fetch and
> > > ListOffsets API so that the broker can avoid exposing offsets beyond
> the
> > > high watermark. This also means that the OffsetsForLeaderEpoch API
> needs
> > > the same handling we added to the ListOffsets API to avoid
> non-monotonic
> > or
> > > incorrect responses. Similarly, I've proposed using the
> > > OFFSET_NOT_AVAILABLE error code in cases where the end offset of an
> epoch
> > > would exceed the high watermark. When querying the latest epoch, the
> > leader
> > > will return OFFSET_NOT_AVAILABLE until the high watermark has reached
> an
> > > offset in the leader's current epoch.
> > >
> > > By the way, I've modified the KIP to drop the OFFSET_TOO_LARGE and
> > > OFFSET_TOO_SMALL error codes that I initially proposed. I realized that
> > we
> > > could continue to use the current OFFSET_OUT_OF_RANGE error and rely on
> > the
> > > returned start offset to distinguish the two cases.
> > >
> > > 2. If a non in-sync replica receives a fetch request from a consumer,
> > > > should it return a new type of error like ReplicaNotInSync?
> > >
> > >
> > > I gave this quite a bit of thought. It is impossible to avoid fetching
> > from
> > > out-of-sync replicas in general due to propagation of the ISR state.
> The
> > > high watermark that is returned in fetch responses could be used as a
> > more
> > > timely substitute, but we still can't assume that followers will always
> > > know when they are in-sync. From a high level, this means that the
> > consumer
> > > anyway has to take out of range errors with a grain of salt if they
> come
> > > from followers. This is only a problem when switching between replicas
> or
> > > if resuming from a committed offset. If a consumer is following the
> same
> > > out-of-sync replica, then its position will stay in range and, other
> than
> > > some extra latency, no harm will be done.
> > >
> > > Furthermore, it may not be a good idea for consumers to chase the ISR
> too
> > > eagerly since this makes the performance profile harder to predict. The
> > > leader itself may have some temporarily increased request load which is
> > > causing followers to fall behind. If consumers then switched to the
> > leader
> > > after they observed that the follower was out-of-sync, it may make the
> > > situation worse. Typically, If a follower has fallen out-of-sync, we
> 

Build failed in Jenkins: kafka-trunk-jdk8 #3477

2019-03-19 Thread Apache Jenkins Server
See 


Changes:

[bbejeck] KAFKA-8062: Do not remove StateListener when shutting down stream 
thread

--
[...truncated 4.67 MB...]

org.apache.kafka.connect.json.JsonConverterTest > stringHeaderToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.connect.json.JsonConverterTest > longToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndNullValueToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndNullValueToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.connect.json.JsonConverterTest > arrayToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > timeToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > structToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
testConnectSchemaMetadataTranslation STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
testConnectSchemaMetadataTranslation PASSED

org.apache.kafka.connect.json.JsonConverterTest > shortToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > 

Re: [VOTE] 2.2.0 RC2

2019-03-19 Thread Manikumar
+1 (non-binding)

- Verified the artifacts, build from src, ran tests
- Verified the quickstart, ran producer/consumer performance tests.

Thanks for running the release!

Thanks,
Manikumar

On Wed, Mar 20, 2019 at 12:19 AM David Arthur 
wrote:

> +1
>
> Validated signatures, and ran through quick-start.
>
> Thanks!
>
> On Mon, Mar 18, 2019 at 4:00 AM Jakub Scholz  wrote:
>
> > +1 (non-binding). I used the staged binaries and ran some of my tests
> > against them. All seems to look good to me.
> >
> > On Sat, Mar 9, 2019 at 11:56 PM Matthias J. Sax 
> > wrote:
> >
> > > Hello Kafka users, developers and client-developers,
> > >
> > > This is the third candidate for release of Apache Kafka 2.2.0.
> > >
> > >  - Added SSL support for custom principal name
> > >  - Allow SASL connections to periodically re-authenticate
> > >  - Command line tool bin/kafka-topics.sh adds AdminClient support
> > >  - Improved consumer group management
> > >- default group.id is `null` instead of empty string
> > >  - API improvement
> > >- Producer: introduce close(Duration)
> > >- AdminClient: introduce close(Duration)
> > >- Kafka Streams: new flatTransform() operator in Streams DSL
> > >- KafkaStreams (and other classes) now implement AutoCloseable to
> > > support try-with-resources
> > >- New Serdes and default method implementations
> > >  - Kafka Streams exposed internal client.id via ThreadMetadata
> > >  - Metric improvements:  All `-min`, `-avg` and `-max` metrics will now
> > > output `NaN` as default value
> > > Release notes for the 2.2.0 release:
> > > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/RELEASE_NOTES.html
> > >
> > > *** Please download, test, and vote by Thursday, March 14, 9am PST.
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > https://kafka.apache.org/KEYS
> > >
> > > * Release artifacts to be voted upon (source and binary):
> > > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/
> > >
> > > * Maven artifacts to be voted upon:
> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >
> > > * Javadoc:
> > > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/javadoc/
> > >
> > > * Tag to be voted upon (off 2.2 branch) is the 2.2.0 tag:
> > > https://github.com/apache/kafka/releases/tag/2.2.0-rc2
> > >
> > > * Documentation:
> > > https://kafka.apache.org/22/documentation.html
> > >
> > > * Protocol:
> > > https://kafka.apache.org/22/protocol.html
> > >
> > > * Jenkins builds for the 2.2 branch:
> > > Unit/integration tests: https://builds.apache.org/job/kafka-2.2-jdk8/
> > > System tests:
> > https://jenkins.confluent.io/job/system-test-kafka/job/2.2/
> > >
> > > /**
> > >
> > > Thanks,
> > >
> > > -Matthias
> > >
> > >
> >
>


Re: [VOTE] 2.2.0 RC2

2019-03-19 Thread David Arthur
+1

Validated signatures, and ran through quick-start.

Thanks!

On Mon, Mar 18, 2019 at 4:00 AM Jakub Scholz  wrote:

> +1 (non-binding). I used the staged binaries and ran some of my tests
> against them. All seems to look good to me.
>
> On Sat, Mar 9, 2019 at 11:56 PM Matthias J. Sax 
> wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the third candidate for release of Apache Kafka 2.2.0.
> >
> >  - Added SSL support for custom principal name
> >  - Allow SASL connections to periodically re-authenticate
> >  - Command line tool bin/kafka-topics.sh adds AdminClient support
> >  - Improved consumer group management
> >- default group.id is `null` instead of empty string
> >  - API improvement
> >- Producer: introduce close(Duration)
> >- AdminClient: introduce close(Duration)
> >- Kafka Streams: new flatTransform() operator in Streams DSL
> >- KafkaStreams (and other classes) now implement AutoCloseable to
> > support try-with-resources
> >- New Serdes and default method implementations
> >  - Kafka Streams exposed internal client.id via ThreadMetadata
> >  - Metric improvements:  All `-min`, `-avg` and `-max` metrics will now
> > output `NaN` as default value
> > Release notes for the 2.2.0 release:
> > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/RELEASE_NOTES.html
> >
> > *** Please download, test, and vote by Thursday, March 14, 9am PST.
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/javadoc/
> >
> > * Tag to be voted upon (off 2.2 branch) is the 2.2.0 tag:
> > https://github.com/apache/kafka/releases/tag/2.2.0-rc2
> >
> > * Documentation:
> > https://kafka.apache.org/22/documentation.html
> >
> > * Protocol:
> > https://kafka.apache.org/22/protocol.html
> >
> > * Jenkins builds for the 2.2 branch:
> > Unit/integration tests: https://builds.apache.org/job/kafka-2.2-jdk8/
> > System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/2.2/
> >
> > /**
> >
> > Thanks,
> >
> > -Matthias
> >
> >
>


Re: KIP-213- [DISCUSS] - Three follow-up discussion points - topic partitioning, serializers, hashers

2019-03-19 Thread Adam Bellemare
Thanks John & Matthias. I have created a report with Confluent (
https://github.com/confluentinc/schema-registry/issues/1061).

I will continue on with current work and we can resume the discussion, as
Matthias correctly indicates, in the PR. Matthias, thank you for the link
to Kafka-. This is something that my team has also come across, and I
may be interested in pursuing a KIP on that once this one is completed.

Thank you both again for your insight.

On Tue, Mar 19, 2019 at 2:19 PM John Roesler  wrote:

> Chiming in...
>
> 1) Agreed. There is a technical reason 1:1 joins have to be co-partitioned,
> which does not apply to the many:1 join you've designed.
>
> 2) Looking at the Serializer interface, it unfortunately doesn't indicate
> whether the topic (or the value) is nullable. There are several places in
> Streams where we need to serialize a value for purposes other than sending
> it to a topic (KTableSuppressProcessor comes to mind), and using `null` for
> the topic is the convention we have. I think we should just use `null` for
> this case as well. Since we're doing this already, maybe we should document
> in the Serializer interface which parameters are nullable.
>
> It sounds like you're using the Confluent serde, and need it to support
> this usage. I'd recommend you just send a PR to that project independently.
>
> On Mon, Mar 18, 2019 at 7:13 PM Matthias J. Sax 
> wrote:
>
> > Just my 2 cents. Not sure if others see it differently:
> >
> > 1) it seems that we can lift the restriction on having the same number
> > of input topic partitions, and thus we should exploit this IMHO; don't
> > see why we should enforce an artificial restriction
> >
> >
> > 2) for the value serde it's a little bit more tricky; in general, Apache
> > Kafka should not be concerned with third party tools. It seems that
> > https://issues.apache.org/jira/browse/KAFKA- might provide a
> > solution though -- but it's unclear if KIP-213 and  would be shipped
> > with the same release...
> >
> > > To me, this is a shortcoming of the Confluent Avro Serde
> > >> that will likely need to be fixed on that side.
> >
> > I agree (good to know...)
> >
> >
> > 3) I am not an expert on hashing, but 128-bit murmur3 sounds reasonable
> > to me
> >
> >
> >
> > Btw: I think we can have this discussion on the PR -- no need to concern
> > the mailing list (it's a lot of people that are subscribed).
> >
> >
> >
> > -Matthias
> >
> > On 3/17/19 5:20 PM, Adam Bellemare wrote:
> > > Hey folks
> > >
> > > I have been implementing the KIP as outlined in
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable
> > ,
> > > and I have run into a few points to consider that we did not include in
> > the
> > > original.
> > >
> > > *1) Do all input topics need to have the same partitions or not?*
> > Currently
> > > I have it designed such that it must, to be consistent with other
> joins.
> > > However, consider the following:
> > > TableA - 5 partitions
> > > TableB - 2 partitions
> > > Pre-subscribe Repartition Topic = 2 partitions, 2 RHS processors
> > > Post-Subscribe Repartition Topic = 5 partitions, 5 LHS processors
> > >
> > > Would this not be possible? Is there a value in flexibility to this? I
> > have
> > > not looked deeper into the restrictions of this approach, so if there
> is
> > > something I should know I would appreciate a heads up.
> > >
> > > *2) Is it appropriate to use the KTable valueSerde during the
> computation
> > > of the hash?* To compute the hash I need to obtain an array of bytes,
> > which
> > > is immediately possible by  using the valueSerde. However, the
> Confluent
> > > Kafka Schema Registry serializer fails when it is being used in this
> way:
> > > In the hash generating code, I set topic to null because the data is
> not
> > > dependent on any topic value. I simply want the serialized bytes to
> input
> > > into the hash function.
> > > *byte[] preHashValue = serializer.serialize(topic = null, data)*
> > >
> > > Any KTable that is Consumed.with(Confluent-Key-Serde,
> > > Confluent-Value-Serde) will automatically try to register the schema to
> > > topic+"-key" and topic+"-value". If I pass in null, it tries to
> register
> > to
> > > "-key" and "-value" each time the serializer is called, regardless of
> the
> > > class. In other words, it registers the schemas to a null topic and
> fails
> > > any subsequent serializations that aren't of the exact same schema.
> Note
> > > that this would be the case across ALL applications using the confluent
> > > schema registry. To me, this is a shortcoming of the Confluent Avro
> Serde
> > > that will likely need to be fixed on that side. However, it does bring
> up
> > > the question - is this an appropriate way to use a serializer?
> > Alternately,
> > > if I should NOT use the KTable value-serde to generate the byte array,
> > does
> > > anyone have a better idea?
> > >
> > > *3) How big of a hash value 

Re: KIP-213- [DISCUSS] - Three follow-up discussion points - topic partitioning, serializers, hashers

2019-03-19 Thread John Roesler
Chiming in...

1) Agreed. There is a technical reason 1:1 joins have to be co-partitioned,
which does not apply to the many:1 join you've designed.

2) Looking at the Serializer interface, it unfortunately doesn't indicate
whether the topic (or the value) is nullable. There are several places in
Streams where we need to serialize a value for purposes other than sending
it to a topic (KTableSuppressProcessor comes to mind), and using `null` for
the topic is the convention we have. I think we should just use `null` for
this case as well. Since we're doing this already, maybe we should document
in the Serializer interface which parameters are nullable.
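
For illustration, a minimal sketch of that null-topic convention using only the
public Serializer API (the serializer and value below are arbitrary examples,
not Streams internals):

    import org.apache.kafka.common.serialization.Serializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class NullTopicSerialization {
        public static void main(String[] args) {
            Serializer<String> serializer = new StringSerializer();
            // The bytes are used internally (e.g. for hashing) and never sent
            // to a topic, so the topic argument is null.
            byte[] bytes = serializer.serialize(null, "some-value");
            System.out.println(bytes.length);
            serializer.close();
        }
    }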

It sounds like you're using the Confluent serde, and need it to support
this usage. I'd recommend you just send a PR to that project independently.

On Mon, Mar 18, 2019 at 7:13 PM Matthias J. Sax 
wrote:

> Just my 2 cents. Not sure if others see it differently:
>
> 1) it seems that we can lift the restriction on having the same number
> of input topic partitions, and thus we should exploit this IMHO; don't
> see why we should enforce an artificial restriction
>
>
> 2) for the value serde it's a little bit more tricky; in general, Apache
> Kafka should not be concerned with third party tools. It seems that
> https://issues.apache.org/jira/browse/KAFKA- might provide a
> solution though -- but it's unclear if KIP-213 and  would be shipped
> with the same release...
>
> > To me, this is a shortcoming of the Confluent Avro Serde
> >> that will likely need to be fixed on that side.
>
> I agree (good to know...)
>
>
> 3) I am not an expert on hashing, but 128-bit murmur3 sounds reasonable
> to me
>
>
>
> Btw: I think we can have this discussion on the PR -- no need to concern
> the mailing list (it's a lot of people that are subscribed).
>
>
>
> -Matthias
>
> On 3/17/19 5:20 PM, Adam Bellemare wrote:
> > Hey folks
> >
> > I have been implementing the KIP as outlined in
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable
> ,
> > and I have run into a few points to consider that we did not include in
> the
> > original.
> >
> > *1) Do all input topics need to have the same partitions or not?*
> Currently
> > I have it designed such that it must, to be consistent with other joins.
> > However, consider the following:
> > TableA - 5 partitions
> > TableB - 2 partitions
> > Pre-subscribe Repartition Topic = 2 partitions, 2 RHS processors
> > Post-Subscribe Repartition Topic = 5 partitions, 5 LHS processors
> >
> > Would this not be possible? Is there a value in flexibility to this? I
> have
> > not looked deeper into the restrictions of this approach, so if there is
> > something I should know I would appreciate a heads up.
> >
> > *2) Is it appropriate to use the KTable valueSerde during the computation
> > of the hash?* To compute the hash I need to obtain an array of bytes,
> which
> > is immediately possible by  using the valueSerde. However, the Confluent
> > Kafka Schema Registry serializer fails when it is being used in this way:
> > In the hash generating code, I set topic to null because the data is not
> > dependent on any topic value. I simply want the serialized bytes to input
> > into the hash function.
> > *byte[] preHashValue = serializer.serialize(topic = null, data)*
> >
> > Any KTable that is Consumed.with(Confluent-Key-Serde,
> > Confluent-Value-Serde) will automatically try to register the schema to
> > topic+"-key" and topic+"-value". If I pass in null, it tries to register
> to
> > "-key" and "-value" each time the serializer is called, regardless of the
> > class. In other words, it registers the schemas to a null topic and fails
> > any subsequent serializations that aren't of the exact same schema. Note
> > that this would be the case across ALL applications using the confluent
> > schema registry. To me, this is a shortcoming of the Confluent Avro Serde
> > that will likely need to be fixed on that side. However, it does bring up
> > the question - is this an appropriate way to use a serializer?
> Alternately,
> > if I should NOT use the KTable value-serde to generate the byte array,
> does
> > anyone have a better idea?
> >
> > *3) How big of a hash value do we need? Does the Foreign Key even matter
> > for resolving?*
> > I am currently looking at fast, non-cryptologically-secure hash options.
> We
> > use murmur2 already in Kafka, but it is only 32 bits. I have been looking
> > at murmur3hash as implemented in the Apache Hive project (
> >
> https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/Murmur3.java
> )
> > - it supports 128-bit hashes and is allegedly more performant than MD5.
> > With a 128-bit hash, the birthday problem indicates that we would have a
> > 50% chance of a collision with 2^64 = 1.8446744e+19 entries. I believe
> that
> > this is sufficiently small, especially for our narrow time window, to
> > expect a collision for a 
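
For illustration, a minimal sketch of the serialize-then-hash step discussed
above (Guava's murmur3_128 stands in here for whichever 128-bit murmur3
implementation ends up being used; this is not KIP-213 code):

    import com.google.common.hash.Hashing;
    import org.apache.kafka.common.serialization.Serializer;

    public class ForeignKeyValueHash {
        // Serialize with a null topic (per the convention above), then take
        // a 128-bit murmur3 hash of the resulting bytes.
        public static <V> byte[] hash(Serializer<V> valueSerializer, V value) {
            byte[] preHashValue = valueSerializer.serialize(null, value);
            // With 128 bits, the birthday bound puts a ~50% collision chance
            // at roughly 2^64 (about 1.8e19) distinct values.
            return Hashing.murmur3_128().hashBytes(preHashValue).asBytes();
        }
    }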

Re: [DISCUSS] KIP-438: Expose task, connector IDs in Connect API

2019-03-19 Thread Ryanne Dolan
Thanks Viktor. Yes to both.

ConnectorTaskId (which includes both connector name and task ID) is
accessible from both WorkerSource/SinkTaskContext, so this change is
trivial.

Ryanne

On Tue, Mar 19, 2019 at 9:26 AM Viktor Somogyi-Vass 
wrote:

> I'm generally +1 on this, although I have a couple of basic questions.
> Am I getting that right that you basically want to expose the task id from
> ConnectorTaskId? And if so, then I guess you'll provide the implementation
> too?
>
> Thanks,
> Viktor
>
> On Tue, Mar 5, 2019 at 6:49 PM Ryanne Dolan  wrote:
>
> > Hey y'all, please consider the following small KIP:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-438%3A+Expose+task%2C+connector+IDs+in+Connect+API
> >
> > Thanks!
> > Ryanne
> >
>


Not able to run

2019-03-19 Thread KARISHMA MALIK
After downloading the tar file, I am running the zookeeper command and I am
getting an error message like:
Kafka-run-class.sh line 243 C:\Program: No such file or directory

Version is Kafka_2.11-0.10.0.0


[jira] [Resolved] (KAFKA-8094) Iterating over cache with get(key) is inefficient

2019-03-19 Thread Bill Bejeck (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck resolved KAFKA-8094.

Resolution: Fixed

> Iterating over cache with get(key) is inefficient 
> --
>
> Key: KAFKA-8094
> URL: https://issues.apache.org/jira/browse/KAFKA-8094
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Assignee: Sophie Blee-Goldman
>Priority: Major
>  Labels: streams
>
> Currently, range queries in the caching layer are implemented by creating an 
> iterator over the subset of keys in the range, and calling get() on the 
> underlying TreeMap for each key. While this protects against 
> ConcurrentModificationException, we can improve performance by replacing the 
> TreeMap with a concurrent data structure such as ConcurrentSkipListMap and 
> then just iterating over a subMap.
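
A rough sketch of the data-structure change described above (illustration only,
not the actual Streams cache code; the key and value types are placeholders):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class CacheRangeSketch {
    private final ConcurrentSkipListMap<String, byte[]> cache = new ConcurrentSkipListMap<>();

    // Iterate a subMap view directly instead of calling get() per key; the
    // view is weakly consistent, so no ConcurrentModificationException.
    public Iterable<Map.Entry<String, byte[]>> range(String from, String to) {
        return cache.subMap(from, true, to, true).entrySet();
    }
}
{code}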



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-437: Custom replacement for MaskField SMT

2019-03-19 Thread Valeria Vasylieva
Hi Adam,

Thank you for your response!
Dear community members, do you have any other thoughts on this KIP? Would
be great if you share them!

Regards,

Valeria

Sat, 16 Mar 2019 at 18:16, Adam Bellemare :

> Hi Valeria
>
> I am thinking that a full map function via configuration is very unlikely
> to be feasible. At that point, it would be best for the user to create
> their own custom transformation.
>
> I think that since your function is indeed just an extension of masking
> that it is reasonable as presented. I don't have any other concerns with
> the proposal, but it would be good to hear from others.
>
> Thanks
>
>
> On Fri, Mar 15, 2019 at 11:38 AM Valeria Vasylieva <
> valeria.vasyli...@gmail.com> wrote:
>
> > Hi Adam,
> >
> > Thank you for your interest. Here is the list of currently supported
> > transformations in Connect:
> > https://kafka.apache.org/documentation/#connect_transforms.
> > As I can see, there is no "map" transformation in this list, and none of
> > the other SMTs support the functionality described in the KIP.
> > I cannot find a way to achieve the same result using existing
> > transformations.
> > The request described in the issue was just to add this custom masking
> > functionality to the MaskField SMT, but if there is a need we can evolve
> > this issue and create a separate "map" transformation;
> > it may be more useful but will require more effort, so it is better to do
> > it as a separate issue.
> >
> > Kind Regards,
> > Valeria
> >
> Fri, 15 Mar 2019 at 17:35, Adam Bellemare :
> >
> > > Hi Valeria
> > >
> > > Thanks for the KIP. I admit my knowledge on Kafka Connect transforms
> is a
> > > bit rusty, however - Is there any other way to currently achieve this
> > same
> > > functionality outlined in your KIP using existing transforms?
> > >
> > >
> > > Thanks
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Mar 14, 2019 at 12:05 PM Valeria Vasylieva <
> > > valeria.vasyli...@gmail.com> wrote:
> > >
> > > > Dear community members,
> > > >
> > > > I would be very grateful if you leave any feedback on this KIP. It
> will
> > > > help me to understand if the change is useful or not and to decide on
> > further
> > > > actions.
> > > >
> > > > Thank you in advance,
> > > > Valeria
> > > >
> > > > Mon, 11 Mar 2019 at 13:20, Valeria Vasylieva <
> > > > valeria.vasyli...@gmail.com
> > > > >:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to start a discussion about adding new functionality
> to
> > > > > MaskField SMT. The existing implementation allows masking out any
> > field
> > > > > value with the null equivalent of the field type.
> > > > >
> > > > > I suggest adding the possibility to provide a literal replacement for
> > the
> > > > > field. This way you can mask out any PII info (IP, SSN etc.) with
> any
> > > > > custom replacement.
> > > > >
> > > > > It is a short KIP which does not require major changes, but could
> > help
> > > to
> > > > > make this transformation more useful for the client.
> > > > >
> > > > > The KIP is here:
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-437%3A+Custom+replacement+for+MaskField+SMT
> > > > >
> > > > > I would be glad to receive any feedback on this KIP.
> > > > >
> > > > > Kind Regards,
> > > > > Valeria
> > > > >
> > > >
> > >
> >
>
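
For illustration, a sketch of how the proposed literal replacement might be
configured on a connector (the `replacement` property name and the field names
are assumptions taken from this discussion, not a released API; the remaining
properties are MaskField's existing configuration):

    transforms=maskPII
    transforms.maskPII.type=org.apache.kafka.connect.transforms.MaskField$Value
    transforms.maskPII.fields=ssn,ip_address
    transforms.maskPII.replacement=****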


[jira] [Created] (KAFKA-8128) Dynamic delegation token change possibility for consumer/producer

2019-03-19 Thread Gabor Somogyi (JIRA)
Gabor Somogyi created KAFKA-8128:


 Summary: Dynamic delegation token change possibility for 
consumer/producer
 Key: KAFKA-8128
 URL: https://issues.apache.org/jira/browse/KAFKA-8128
 Project: Kafka
  Issue Type: Improvement
Reporter: Gabor Somogyi


The re-authentication feature on the broker side is under implementation, which will 
enforce consumer/producer instances to re-authenticate from time to time. It would 
be good to be able to set the latest delegation token dynamically rather than 
re-creating consumer/producer instances.
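
For context, a client currently picks up a delegation token through a static
JAAS configuration along these lines (values are placeholders); the request
above is to be able to swap in a fresh token without recreating the client:

{code}
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
    username="<tokenId>" \
    password="<tokenHmac>" \
    tokenauth="true";
{code}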



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-438: Expose task, connector IDs in Connect API

2019-03-19 Thread Viktor Somogyi-Vass
I'm generally +1 on this, although I have a couple of basic questions.
Am I getting that right that you basically want to expose the task id from
ConnectorTaskId? And if so, then I guess you'll provide the implementation
too?

Thanks,
Viktor

On Tue, Mar 5, 2019 at 6:49 PM Ryanne Dolan  wrote:

> Hey y'all, please consider the following small KIP:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-438%3A+Expose+task%2C+connector+IDs+in+Connect+API
>
> Thanks!
> Ryanne
>


[jira] [Resolved] (KAFKA-8062) StateListener is not notified when StreamThread dies

2019-03-19 Thread Bill Bejeck (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck resolved KAFKA-8062.

Resolution: Fixed

> StateListener is not notified when StreamThread dies
> 
>
> Key: KAFKA-8062
> URL: https://issues.apache.org/jira/browse/KAFKA-8062
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.1
> Environment: Kafka 2.1.1, kafka-streams-scala version 2.1.1
>Reporter: Andrey Volkov
>Assignee: Guozhang Wang
>Priority: Minor
>
> I want my application to react when streams die. Trying to use 
> KafkaStreams.setStateListener. Also checking KafkaStreams.state() from time 
> to time.
> The test scenario: Kafka is available, but there are no topics that my 
> Topology is supposed to use.
> I expect streams to crash and the state listener to be notified about that, 
> with the new state ERROR. KafkaStreams.state() should also return ERROR.
> In reality the streams crash, but the KafkaStreams.state() method always 
> returns REBALANCING and the last time the StateListener was called, the new 
> state was also REBALANCING. 
>  
> I believe the reason for this is in the methods:
> org.apache.kafka.streams.KafkaStreams.StreamStateListener.onChange() which 
> does not react on the state StreamsThread.State.PENDING_SHUTDOWN
> and
> org.apache.kafka.streams.processor.internals.StreamThread.RebalanceListener.onPartitionsAssigned,
>  which calls shutdown() setting the state to PENDING_SHUTDOWN and then
> streamThread.setStateListener(null) effectively removing the state listener, 
> so that the DEAD state of the thread never reaches KafkaStreams object.
> Here is an extract from the logs:
> {{14:57:03.272 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> ERROR o.a.k.s.p.i.StreamsPartitionAssignor - stream-thread 
> [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1-consumer] 
> test-input-topic is unknown yet during rebalance, please make sure they have 
> been pre-created before starting the Streams application.}}
> {{14:57:03.283 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer 
> clientId=Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1-consumer, 
> groupId=Test] Successfully joined group with generation 1}}
> {{14:57:03.284 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
> clientId=Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1-consumer, 
> groupId=Test] Setting newly assigned partitions []}}
> {{14:57:03.285 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.s.p.i.StreamThread - stream-thread 
> [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] Informed to shut 
> down}}
> {{14:57:03.285 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.s.p.i.StreamThread - stream-thread 
> [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] State transition 
> from PARTITIONS_REVOKED to PENDING_SHUTDOWN}}
> {{14:57:03.316 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.s.p.i.StreamThread - stream-thread 
> [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] Shutting down}}
> {{14:57:03.317 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.c.c.KafkaConsumer - [Consumer 
> clientId=Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1-restore-consumer,
>  groupId=] Unsubscribed all topics or patterns and assigned partitions}}
> {{14:57:03.317 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.c.p.KafkaProducer - [Producer 
> clientId=Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1-producer] 
> Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.}}
> {{14:57:03.332 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.s.p.i.StreamThread - stream-thread 
> [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] State transition 
> from PENDING_SHUTDOWN to DEAD}}
> {{14:57:03.332 [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] 
> INFO o.a.k.s.p.i.StreamThread - stream-thread 
> [Test-749d56c7-505d-46a4-8dbd-2c756b353adc-StreamThread-1] Shutdown complete}}
> After this, calls to KafkaStreams.state() still return REBALANCING.
> There is a workaround with requesting KafkaStreams.localThreadsMetadata() and 
> checking each thread's state manually, but that seems very wrong.
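
A sketch of that workaround (illustration only, not the reporter's code):

{code:java}
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.processor.ThreadMetadata;

public class DeadThreadCheck {
    // Poll the thread metadata because the state listener is never told about
    // the DEAD thread and KafkaStreams.state() keeps reporting REBALANCING.
    public static boolean anyThreadDead(KafkaStreams streams) {
        for (ThreadMetadata thread : streams.localThreadsMetadata()) {
            if ("DEAD".equals(thread.threadState())) {
                return true;
            }
        }
        return false;
    }
}
{code}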



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-19 Thread Viktor Somogyi-Vass
Hey George,

Thanks for the answers. I'll try to block out time this week to review your
PR.

I have one more point to clarify:
I've seen some customers who manage Kafka as an internal company-wide
service and may or may not know how certain topics are used
within the company. That might mean that some clients can start a
reassignment at random times.
Let's suppose that such a reassignment starts just when the Kafka operations
team starts upgrading the cluster to a version that contains this KIP. The
question is: do you think we need to handle upgrade scenarios where there is
an in-progress reassignment?

Thanks,
Viktor


On Tue, Mar 19, 2019 at 6:16 AM George Li  wrote:

> Hi Viktor,
>
> FYI, I have added a new ducktape test:  
> tests/kafkatest/tests/core/reassign_cancel_test.py
> to https://github.com/apache/kafka/pull/6296
>
> After review, do you have any more questions?  Thanks
>
>
> Hi Jun,
>
> Could you help review this when you have time?  Thanks
>
>
> Can we start a vote on this KIP in one or two weeks?
>
>
> Thanks,
> George
>
>
>
> On Tuesday, March 5, 2019, 10:58:45 PM PST, George Li <
> sql_consult...@yahoo.com> wrote:
>
>
> Hi Viktor,
>
>
>
> >  2.: One follow-up question: if the reassignment cancellation gets
> interrupted and a failover happens after step #2 but before step #3, how
> will the new controller continue? At this stage Zookeeper would contain
> OAR + RAR, however the brokers will have the updated LeaderAndIsr about
> OAR, so they won't know about RAR. I would suppose the new controller would
> start from the beginning as it only knows what's in Zookeeper. Is that true?
>
>
> The OAR (Original Assigned Replicas) for the rollback is stored in the
> /admin/reassign_partitions znode for each topic/partition reassignment.  During
> the controller failover,  the new controller will read
> /admin/reassign_partitions for both new_replicas AND original_replicas into
> controllerContext.partitionsBeingReassigned
>
> then perform pending reassignments cancellation/rollback.  The 
> originalReplicas
>  is added below:
>
> case class ReassignedPartitionsContext(var newReplicas: Seq[Int] = Seq.empty,
>                                        var originalReplicas: Seq[Int] = Seq.empty,
>                                        val reassignIsrChangeHandler: PartitionReassignmentIsrChangeHandler) {
>
>
>
>
> > 2.1: Another interesting question that are what are those replicas are
> doing which are online but not part of the leader and ISR? Are they still
> replicating? Are they safe to lie around for the time being?
>
>
> I think they are just dangling without being replicated.  Upon
> LeaderAndIsr request, makeFollowers() will call
> replicaFetcherManager.removeFetcherForPartitions(),
> and they will not be added back to the ReplicaFetcher since they are not in
> the current AR (Assigned Replicas). Step 4 (StopReplica) will delete them.
>
>
>
> I am adding a new ducktape test:  
> tests/kafkatest/tests/core/reassign_cancel_test.py
> for cancelling pending reassignments, but it's not easy to simulate
> controller failover during the cancellation since the cancellation/rollback
> completes so quickly. See below.  I do have a unit/integration test to
> simulate controller failover:
> shouldTriggerReassignmentOnControllerStartup()
>
>
>
> Thanks,
> George
>
>
>
>
> Warning: You must run Verify periodically, until the reassignment
> completes, to ensure the throttle is removed. You can also alter the
> throttle by rerunning the Execute command passing a new value.
> The inter-broker throttle limit was set to 1024 B/s
> Successfully started reassignment of partitions.
>
> [INFO  - 2019-03-05 04:20:48,321 - kafka - execute_reassign_cancel -
> lineno:514]: Executing cancel / rollback pending partition reassignment...
> [DEBUG - 2019-03-05 04:20:48,321 - kafka - execute_reassign_cancel -
> lineno:515]: /opt/kafka-dev/bin/kafka-reassign-partitions.sh --zookeeper
> worker1:2181 --cancel
> [DEBUG - 2019-03-05 04:20:48,321 - remoteaccount - _log - lineno:160]:
> vagrant@worker2: Running ssh command:
> /opt/kafka-dev/bin/kafka-reassign-partitions.sh --zookeeper worker1:2181
> --cancel
> [DEBUG - 2019-03-05 04:20:57,452 - kafka - execute_reassign_cancel -
> lineno:520]: Verify cancel / rollback pending partition reassignment:
> Rolling back the current pending reassignments Map(test_topic-7 ->
> Map(replicas -> Buffer(3, 2, 5), original_replicas -> Buffer(3, 2, 4)),
> test_topic-18 -> Map(replicas -> Buffer(5, 1, 2), original_replicas ->
> Buffer(4, 1, 2)), test_topic-3 -> Map(replicas -> Buffer(5, 2, 3),
> original_replicas -> Buffer(4, 2, 3)), test_topic-15 -> Map(replicas ->
> Buffer(1, 3, 5), original_replicas -> Buffer(1, 3, 4)), test_topic-11 ->
> Map(replicas -> Buffer(2, 3, 5), original_replicas -> Buffer(2, 3, 4)))
> Successfully submitted cancellation of reassignments.
> The cancelled pending reassignments throttle was removed.
> Please run --verify to have the previous reassignments (not just the
> 
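
For context, with the original_replicas field described above, the
/admin/reassign_partitions entry for one of the partitions in that ducktape run
would look roughly like this (a sketch of the KIP-236 proposal, not a released
format):

    {
      "version": 1,
      "partitions": [
        {"topic": "test_topic", "partition": 7,
         "replicas": [3, 2, 5],
         "original_replicas": [3, 2, 4]}
      ]
    }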

Build failed in Jenkins: kafka-trunk-jdk8 #3476

2019-03-19 Thread Apache Jenkins Server
See 


Changes:

[jason] MINOR: Improve logging around index files (#6385)

--
[...truncated 2.33 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldDeleteAndReturnPlainValue STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldDeleteAndReturnPlainValue PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutAllWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest > 
shouldPutAllWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.KeyValueStoreFacadeTest 

[jira] [Created] (KAFKA-8127) It may need to import scala.io

2019-03-19 Thread JieFang.He (JIRA)
JieFang.He created KAFKA-8127:
-

 Summary: It may need to import scala.io
 Key: KAFKA-8127
 URL: https://issues.apache.org/jira/browse/KAFKA-8127
 Project: Kafka
  Issue Type: Bug
Reporter: JieFang.He


I get an error when compiling Kafka, which disappears when importing scala.io.

 
{code:java}
D:\gerrit\Kafka\core\src\main\scala\kafka\tools\StateChangeLogMerger.scala:140: 
object Source is not a member of package io
val lineIterators = files.map(io.Source.fromFile(_).getLines)
^
6 warnings found
one error found
:core:compileScala FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':core:compileScala'.
> Compilation failed
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)