Hey Eugene,

Having a cluster for performance testing is a great idea and it is
something that has popped up in various contexts.

The most common way to obtain such clusters is via sponsors (companies
or individuals) donating resources to the project. For example, the
Hive CI is now running mostly on resources donated by Cloudera.

There seems to be a process for requesting resources from the Apache
Infra team [1], but I am not aware of other ASF projects following this
path for performance testing. Most likely the easiest and fastest way
to move this forward is through a sponsor. Where the resources come
from will also determine the design, implementation, and
maintenance.

Best,
Stamatis

[1] https://infra.apache.org/vm-for-project.html

On Tue, May 21, 2024 at 11:25 AM Eugene Ryan <ryan.eug...@gmail.com> wrote:
>
> Hi,
>
> I'd like to get folks' opinions on having a public cluster for performance
> testing Hive code and getting an early read on whether a commit / build has
> caused a performance degradation over existing code.
>
> There are already well-known workloads available, for example TPC-DS
> (https://github.com/hortonworks/hive-testbench), that can be run, so I'm not
> talking about the performance test code itself (although running it should
> be as easy as possible on top of a dedicated cluster).
>
> The benefits to the community would be:
>    - A dedicated environment, not necessarily leaving it to the vendors to
>    integrate open-source later into their stacks and only find out some time
>    later about performance problems
>    - Something that can be left set up & running - no setup and tear-down
>    process needed every time a performance run is required
>    - An automated process for performance testing - no manual setup or
>    intervention
>
> Concerns:
>    - Budget
>    - Who administers the cluster, i.e. who sets it up and fixes it when it is down
>
> I'd like to get some opinions on what the process for getting this to
> happen would be, bearing in mind that certain things may well be obstacles 
> (budget) that have to be solved upfront before anything else happens:
>    - Budget approval
>    - Approval / Sign off - how & who?
>    - Architecture / pipeline design
>    - Implementation
>
> Thanks, all opinions welcome.
> Eugene
>
