yaalsn opened a new issue #14401:
URL: https://github.com/apache/pulsar/issues/14401


   **Is your enhancement request related to a problem? Please describe.**
   When we open a new PR, the CI workflows take about one to two hours to run. So 
I wrote a script to measure how much time each workflow spends. In the result 
below, each key is a CI workflow name and each value is its time in seconds.
   
   ```json
   {
       "CI - Unit - Brokers - Broker": 13255,
       "CI - Unit": 5160,
       "CI - Unit - Brokers - Broker Group 1": 3939,
       "CI - Unit - Brokers - Other": 3622,
       "CI - Integration - Pulsar-IO Sinks and Sources": 3620,
       "CI - Integration - Cli": 3620,
       "CI - CPP build on Windows": 3619,
       "CI - Integration - Process": 3617,
       "CI - Integration - Thread": 3617,
       "CI - Unit - Brokers - Broker Group 2": 3090,
       "CI - Integration - Function State": 3024,
       "CI - Integration - Sql": 2816,
       "CI - Python - Build 3.9 client": 2762,
       "CI - Integration - Messaging": 2651,
       "CI - Integration - Backwards Compatibility": 2577,
       "CI - Integration - Pulsar-IO Oracle Source": 2563,
       "CI - Unit - Brokers - Publish - Throttle": 2531,
       "CI - Shade - Test": 2500,
       "CI - Unit - Brokers - Transaction": 2492,
       "CI - Unit - Flaky": 2441,
       "CI - Integration - Function & IO": 2432,
       "CI - Unit - Brokers - Flaky": 2308,
       "CI - Unit - Broker Auth SASL": 2274,
       "CI - Integration - Standalone": 2235,
       "CI - Integration - Schema": 2169,
       "CI - Integration - Transaction": 2074,
       "CI - Integration - Tiered FileSystem": 2047,
       "CI - Docker Build": 2023,
       "CI - Integration - Tiered JCloud": 1995,
       "CI - Unit - Brokers - Broker Group": 1868,
       "CI - Unit - Brokers - Others": 1637,
       "CI - Build - MacOS": 1513,
       "CI - Unit - Brokers - Long - Time": 1243,
       "CI - CPP, Python Tests": 1117,
       "CI - Unit - Adaptors": 1084,
       "CI - Unit - Brokers - Client Impl": 993,
       "CI - Misc": 969,
       "CI - Unit - Brokers - Client Api": 949,
       "CI - Unit - Proxy": 799,
       "CI - Unit - Brokers - Default": 699,
       "CI - CPP build on CentOS 7": 608,
       "CI - Unit - Broker - JDK8": 455,
       "Pulsar Bot": 213,
       "Auto Labeling": 201,
       "CI - Go Functions style check": 157,
       "CI - Go Functions Tests": 157,
       "CI - Maven Dependency Cache Update": 155,
       "CI - Pulsar - Build - 2.6": 141,
       "CI - Cancel duplicate workflows": 46,
       "CI - Misc - OWASP Dependency Check": 32,
       "CI - Deployment - Helm": 31
   }
   ```
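
The original script is not included in this issue. A minimal sketch of how such a per-workflow summary could be produced is shown below. It assumes each run is available as a record with a `name`, a `started_at` and a `finished_at` ISO timestamp (these field names and the sample records are hypothetical), and that times for runs of the same workflow are summed.

```python
from collections import defaultdict
from datetime import datetime


def total_seconds_by_workflow(runs):
    """Sum wall-clock seconds per workflow name, largest total first."""
    totals = defaultdict(int)
    for run in runs:
        started = datetime.fromisoformat(run["started_at"])
        finished = datetime.fromisoformat(run["finished_at"])
        totals[run["name"]] += int((finished - started).total_seconds())
    # Sort descending by seconds so the most expensive workflows come first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical sample records, not real Pulsar CI data.
sample_runs = [
    {"name": "CI - Misc", "started_at": "2022-02-21T10:00:00",
     "finished_at": "2022-02-21T10:05:00"},
    {"name": "CI - Unit", "started_at": "2022-02-21T10:00:00",
     "finished_at": "2022-02-21T10:30:00"},
    {"name": "CI - Unit", "started_at": "2022-02-21T11:00:00",
     "finished_at": "2022-02-21T11:10:00"},
]

print(total_seconds_by_workflow(sample_runs))
```

The run records themselves would have to come from wherever the CI history is exported; that step is left out here on purpose.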
   
   ---
   
   As we can see, `CI - Integration xxx`, `CI - Unit xxx` and `CPP build 
on Windows` cost the most. Let's step into these workflows.
   
   - `CI - Integration xxx`
   
   Like all the other `CI - Integration xxx` workflows, `CI - Integration - 
Pulsar-IO Sinks and Sources` can be divided into three parts:
   
   <img width="1019" alt="image" 
src="https://user-images.githubusercontent.com/10069311/154969970-bac4ae3e-3f90-4e5c-8d9b-e9003f0d61ec.png">
   
   - `CI - Unit xxx`
   
   <img width="1027" alt="image" 
src="https://user-images.githubusercontent.com/10069311/154971162-9dc0879f-2c4f-4454-a5cb-a93b4e0cfc58.png">
   
   <img width="1023" alt="image" 
src="https://user-images.githubusercontent.com/10069311/154971403-3507fa10-d189-45ed-ae28-8802ae0deea4.png">
   
   - `CPP build on Windows` 
   
   <img width="1015" alt="image" 
src="https://user-images.githubusercontent.com/10069311/154973610-c6a54f6e-51b2-4849-93d4-96ca5d6a549b.png">
   
   
   ---
   
   - All these `Integration` and `Unit` jobs can be abstracted into `Build` 
and `Test`. `Build` contains `mvn install without test` and `docker build`, and 
`Test` is `mvn test`. Many workflow jobs execute the `Build` step, and it costs 
the most time.
   - In `CPP build on Windows`, installing dependencies with `vcpkg install` 
costs the most time.
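
The observation that the Integration, Unit and Windows C++ workflows dominate can be checked by grouping the timing table by name prefix. The sketch below uses only a few entries copied from the table in this issue. The `family` grouping rule is a hypothetical choice made for illustration.

```python
from collections import defaultdict

# A few entries copied from the table in this issue; the full table
# would be used in practice.
durations = {
    "CI - Unit - Brokers - Broker": 13255,
    "CI - Unit": 5160,
    "CI - Integration - Cli": 3620,
    "CI - Integration - Sql": 2816,
    "CI - CPP build on Windows": 3619,
    "CI - Misc": 969,
}


def family(name):
    """Map a workflow name to a coarse family (hypothetical grouping rule)."""
    for prefix in ("CI - Integration", "CI - Unit", "CI - CPP"):
        if name.startswith(prefix):
            return prefix
    return "Other"


def seconds_by_family(table):
    """Sum the seconds of all workflows that fall into the same family."""
    totals = defaultdict(int)
    for name, seconds in table.items():
        totals[family(name)] += seconds
    return dict(totals)


print(seconds_by_family(durations))
```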
   
   **Describe the solution you'd like**
   
   - Share the maven install result files between all `Integration` and `Unit` 
jobs. To do so, we need a workflow that caches the result files first and 
then triggers the jobs. However, `Github Workflow` currently cannot re-run only 
the failed steps, so when a job fails we need to re-run the whole 
workflow. We also need some tricks to make the workflow idempotent.
   - Cache the vcpkg dependencies.
   - Use docker base images for the docker build.
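
One way to picture the idempotency requirement above: derive a cache key from the build inputs, and run the build only when nothing is stored under that key yet, so a full workflow re-run reuses the earlier result. The following is a minimal local sketch of that idea, not a GitHub Actions configuration. The function names, the `pom.xml` pattern and the `maven-build` prefix are assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def content_key(root, pattern="pom.xml", prefix="maven-build"):
    """Derive a deterministic key from every file matching `pattern` under `root`."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob(pattern) if p.is_file()):
        # Hash both the relative path and the content, in a stable order.
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"


def restore_or_build(cache_dir, key, build):
    """Return (artifact_dir, built). `build(dest)` runs only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    dest = cache_dir / key
    if dest.is_dir():
        return dest, False  # cache hit: a re-run does no build work
    tmp = Path(tempfile.mkdtemp(dir=cache_dir))
    try:
        build(tmp)
        tmp.rename(dest)  # publish only after the build has succeeded
    except BaseException:
        shutil.rmtree(tmp, ignore_errors=True)
        raise
    return dest, True
```

Hashing only `pom.xml` files would fit a dependency cache. For sharing compiled build output between jobs, the key would likely need to cover the sources as well, for example by using the commit SHA.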
   
   **Describe alternatives you've considered**
   Remove useless or duplicate jobs.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
