Hi Stamatis,

Thank you for initiating this event. I know you’ve put a lot of effort into maintaining the CI infrastructure, and I really appreciate that. I might not be able to attend this event, but I’d like to share some thoughts, especially since I’ve noticed some pain points in the CI while preparing for the 4.1 release. Here are some points for discussion:

1) Issue with requiring login to view CI job details

Many first-time contributors don’t know how to log in to the CI system to check error messages [1]. Could we make the CI interface anonymously accessible to ensure a better development experience for new users?

2) Limited CI concurrency and slow rescheduling after cancellation

Many users frequently modify and submit code. When a new commit is pushed, the CI is canceled and then rescheduled, but the rescheduling process seems to take a long time. As a result, many users (especially new contributors) often close and reopen their PRs to retrigger the CI. Some even create a new PR altogether. In short, the rescheduling delay is too long, leaving users waiting for extended periods without seeing CI progress, which significantly impacts the development experience.

Additionally, when many PRs are submitted concurrently, the scheduling and execution time of the CI seem to increase dramatically. If a user makes a code change and wants to check the CI results, they might have to wait half a day or even a full day. This long waiting period is frustrating for developers, as they may have other work commitments and can’t afford to wait indefinitely. In some cases, the PR might even be forgotten.

3) Could we consider using GitHub Actions resources for CI?

I understand that the CI concurrency limit might be due to limited resources, and the provider (Cloudera) needs to impose restrictions. So, could we explore using GitHub Actions as an additional resource?
I’ve noticed that Apache Spark uses GitHub Actions [2], and they’re even considering it for releases [3]. With GitHub Actions, Spark’s CI workflow seems to run much faster. Could we evaluate using GitHub Actions as a supplementary resource to alleviate the current CI resource constraints?

Just to clarify: I haven’t deeply researched GitHub Actions yet, but since many Apache projects are adopting it, I think it’s worth considering as a potential part of our future CI infrastructure.

[1] https://github.com/apache/hive/pull/5547#issuecomment-2480098937
[2] https://github.com/apache/spark/pull/32092
[3] https://issues.apache.org/jira/browse/SPARK-52176

Thanks,
Butao Zhang

---- Replied Message ----
| From | Stamatis Zampetakis<zabe...@gmail.com> |
| Date | 7/9/2025 18:28 |
| To | dev<dev@hive.apache.org> |
| Subject | [EVENT] Apache Hive CI Introduction & QA |

Hi everyone,

The Hive CI and precommit infrastructure is a very important part of our daily life as Hive contributors and has a great impact on productivity and the overall contributor experience. I think it would be very useful for everyone contributing to Hive to get a better understanding of how the CI works and what lies underneath.

For this purpose, I would like to propose a virtual event on July 23, 2025 at 17:00 CEST [1] in an attempt to facilitate contributions and troubleshooting around this area. I know that the time is not convenient for everyone globally (and it is impossible to find one slot that works for all), but we could possibly shift the date if that would help get greater attendance.

The format that I had in mind is a small introductory presentation followed by a casual and informal Q&A. I created a Google doc [2] to gather questions that people may have in this area. Feel free to append your questions there so that we have a tentative agenda of what to cover during the event.
I can lead the presentation/discussion during the event, and I would be more than happy to co-present with anyone else willing to help. There are people who probably understand this area better than I do, so it would be great to have them on board.

How do people feel about the idea? Is there interest in attending such an event?

Best,
Stamatis

[1] https://www.timeanddate.com/worldclock/fixedtime.html?msg=Apache+Hive+CI+Introduction+%26+QA&iso=20250723T17&p1=195&ah=1&am=30
[2] https://docs.google.com/document/d/1P5H5N2QUSIwM83Yz00lzItcAQgMWQ5qpjOhoSAs9Tbw/edit?usp=sharing