John Roesler created KAFKA-12767:
------------------------------------

             Summary: Properly set Streams system test runtime classpath
                 Key: KAFKA-12767
                 URL: https://issues.apache.org/jira/browse/KAFKA-12767
             Project: Kafka
          Issue Type: Task
          Components: streams, system tests
            Reporter: John Roesler


Some of the streams system tests started to fail recently when we stopped 
exporting our transitive dependencies in the test jar.

[~lct45] was kind enough to submit [https://github.com/apache/kafka/pull/10631] 
to get the system tests running again, but that PR is only a stop-gap.

The real solution is to properly package the transitive dependencies and make 
them available to the system test runtime.

Here is the reason: PR#10631 gets past the issue by removing runtime usages on 
Hamcrest, but Hamcrest is still present in the compiletime classpath. Tomorrow, 
a new PR could add a reference to Hamcrest, and all the unit and integration 
tests would pass, but we would again see the mysterious system test failures. 
Only after another round of costly investigation would we realize the root 
cause, and we might again decide to just patch the test to remove the reference.

It would be far better in the long run to fix the underlying condition: the 
difference between the compiletime and runtime classpaths for the system tests.

 

To help get you started, I'll share some of the groundwork for this task, which 
I put together while trying to understand the nature of the problem.

The first step is to actually copy the transitive dependencies. We had to stop 
placing these dependencies in the `dependant-libs` build directory because that 
results in us actually shipping those dependencies with our releases. Copying a 
similar mechanism from the `:core` project, we can add a new build directory 
(arbitrarily: `dependant-testlibs`), and again copy the runtime dependencies 
there. Here is a commit in my fork that does just that: 
[https://github.com/vvcephei/kafka/commit/8d4552dee05f2a963b8072b86aae756415ea2482]

The next step is to place those jars on the classpath of the system test code. 
The mechanism for that is `kafka-run-class.sh`: 
[https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh]

A specific example of this is the special logic for upgrade tests:
 # If we are running upgrade tests, then we set the artifact directories to the 
relevant version. Otherwise, we use the current build artifacts. 
[https://github.com/apache/kafka/blob/fc405d792de12a50956195827eaf57bbf64444c9/bin/kafka-run-class.sh#L77-L85]
 # The, here's where we specifically pull in Hamcrest from those build artifact 
directories: 
[https://github.com/apache/kafka/blob/fc405d792de12a50956195827eaf57bbf64444c9/bin/kafka-run-class.sh#L128-L136]

It seems to me that since Hamcrest is actually a more general dependency of the 
tests, we might as well pack it up in `dependant-testlibs` and then pull it 
into the classpath from there any time we're running tests. It looks like we 
ought to set `streams_dependant_clients_lib_dir` to `dependant-testlibs` any 
time `INCLUDE_TEST_JARS` is `true`. But if we do have 
`UPGRADE_KAFKA_STREAMS_TEST_VERSION` set, then it should override the lib dir, 
since those artifacts to copy over the transitive dependencies for those older 
versions already.

 

Although the proposed fix itself is pretty small, I think the testing will take 
a few days. You might want to just put some `echo` statements in 
kafka-run-class.sh to see what jars are included on the classpath while you run 
different tests, both locally using Docker, and remotely using Jenkins.

I marked this ticket as `newbie++` because you don't need deep knowledge of the 
codebase to tackle this ticket, just a high pain tolerance for digging though 
gradle/docker/bash script debugging to make sure the right bits make it into 
the system tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to