[ 
https://issues.apache.org/jira/browse/KAFKA-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372223#comment-16372223
 ] 

ASF GitHub Bot commented on KAFKA-6577:
---------------------------------------

rhauch opened a new pull request #4610: KAFKA-6577: Fix Connect system tests 
and add debug messages
URL: https://github.com/apache/kafka/pull/4610
 
 
   **NOTE: This should be backported to the `1.1` branch, and is currently a 
blocker for 1.1.**
   
   The `connect_test.py::ConnectStandaloneFileTest.test_file_source_and_sink` 
system test is failing with the SASL configuration without a sufficient 
explanation. During the test, the Connect worker fails to start, but the 
Connect log contains no useful information. There are actually several things 
compounding to cause the failure and make it difficult to understand the 
problem.
   
   First, the 
`tests/kafkatest/tests/connect/templates/connect_standalone.properties` is only 
adding in the broker's security configuration with the `producer.` and 
`consumer.` prefixes, but is not also adding them without a prefix. The worker uses 
the AdminClient to connect to the broker to get the Kafka cluster ID and to 
manage the three internal topics, and the AdminClient is configured via 
top-level properties. Because the SASL test requires the clients all connect 
using SASL, the lack of broker security configs means the AdminClient was 
attempting and failing to connect to the broker. This is corrected by adding 
the broker's security configuration to the Connect worker configuration file at 
the top level. (This was already being done in the 
`connect_distributed.properties` file.)
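
   As a rough illustration of the first point (not the actual template change in this pull request), the sketch below emits each security setting three times: without a prefix for the worker and its AdminClient, and with the `producer.` and `consumer.` prefixes. The helper name `render_security_props` and the example values are assumptions made for illustration only.

```python
# Hypothetical helper for illustration; the real system tests render a
# properties template rather than calling a function like this.
PREFIXES = ("", "producer.", "consumer.")


def render_security_props(security):
    """Return properties-file text with every setting un-prefixed
    (read by the worker and its AdminClient) and repeated with the
    producer. and consumer. prefixes."""
    lines = []
    for prefix in PREFIXES:
        for key, value in sorted(security.items()):
            lines.append(f"{prefix}{key}={value}")
    return "\n".join(lines)


# Example values only; the test's real settings come from the broker config.
example = {"security.protocol": "SASL_PLAINTEXT", "sasl.mechanism": "GSSAPI"}
print(render_security_props(example))
```

   Without the un-prefixed lines, only the producer and consumer would receive the SASL settings.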
   
   Second, the default `request.timeout.ms` for the AdminClient (and the other 
clients) is 120 seconds, so the AdminClient was retrying for 120 seconds before 
it would give up and throw an error. However, the test was only waiting for 60 
seconds before determining that the service failed to start. This can be 
corrected by setting `request.timeout.ms=10000` in the Connect distributed and 
standalone worker configurations.
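
   The timing mismatch can be sketched as a simple comparison. This is a simplification that assumes the client reports its error roughly when the request timeout elapses and ignores retry and backoff details; the function name is hypothetical.

```python
def failure_visible_to_test(request_timeout_ms, startup_wait_s):
    """Return True if the client gives up (and can report an error)
    before the test stops waiting for the service to start."""
    return request_timeout_ms / 1000.0 < startup_wait_s


# Default timeout described above: the test gives up first.
print(failure_visible_to_test(120000, 60))  # False
# With request.timeout.ms=10000 the error surfaces within the wait.
print(failure_visible_to_test(10000, 60))   # True
```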
   
   Third, the Connect workers were recently changed to look up the Kafka cluster 
ID before starting the herder. This is unlike the older uses of the 
AdminClient to find and manage the internal topics, where failure to connect 
was not necessarily logged correctly but nevertheless still skipped over, 
relying upon broker auto-topic creation to create the internal topics. (This 
may be why the test did not fail prior to the recent change to always require a 
successful AdminClient connection.) Although the worker never got this far in 
its startup process, the fact that we missed such an error since the prior 
releases means that failure to connect with the AdminClient was not being 
properly reported.
   
   The `ConnectStandaloneFileTest.test_file_source_and_sink` system tests were 
run locally prior to this fix, and they failed as with the nightlies. Once 
these fixes were made, the locally run system tests passed.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Connect standalone SASL file source and sink test fails without explanation
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-6577
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6577
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect, system tests
>    Affects Versions: 1.1.0
>            Reporter: Randall Hauch
>            Assignee: Randall Hauch
>            Priority: Blocker
>             Fix For: 1.1.0
>
>
> The 
> {{tests/kafkatest/tests/connect/connect_test.py::ConnectStandaloneFileTest.test_file_source_and_sink}}
>  test is failing with the SASL configuration without a sufficient 
> explanation. During the test, the Connect worker fails to start, but the 
> Connect log contains no useful information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
