Awesome. Thank you (all)! We've had so many conversations about it and it
is great to have it running continuously.
Kenn
On Fri, Sep 16, 2022 at 9:29 AM Sachin Agarwal via dev
wrote:
> This is wonderful - thank you so much to you and the whole Talend team for
> making Beam better!
>
> On Fri, Sep 16, 2022 at 9:11 AM Alexey Romanenko
> wrote:
>
>> Hi everybody,
>>
>> As some of you may know, at Talend we’ve been working for a while to add
>> the TPC-DS benchmark suite to Beam. We believe that having TPC-DS as part
>> of the Beam testing workflow and release routine will help the community
>> quickly detect performance regressions or improvements, identify
>> missing or incorrect Beam SQL features, and execute Beam SQL on different
>> runtime environments with different runners.
>>
>> What is TPC-DS? From the TPC-DS specification document [1]:
>>
>> *“TPC-DS is a decision support benchmark that models several generally
>> applicable aspects of a decision support system, including queries and data
>> maintenance. The benchmark provides a representative evaluation of
>> performance as a general purpose decision support system.”*
>>
>> The TPC-DS benchmark suite for Beam is implemented as a separate testing tool
>> for the Java SDK (like the well-known Nexmark benchmark suite) [2]. It supports a
>> limited number of TPC-DS SQL queries for now (mostly because of limited SQL
>> syntax support in Beam), accepts CSV and Parquet as input data formats, and runs
>> on Jenkins with the three most popular Beam runners (Spark [3], Flink [4],
>> Dataflow [5]). The job metrics are stored in InfluxDB and can be accessed
>> through Grafana dashboards [6][7][8].
>>
>> More details can be found in the Beam documentation [9].
>>
>> Of course, there are still plenty of things to do, like adding new runners
>> and support for other SDKs, data formats, etc., so your contributions are
>> very welcome in any form. Still, we already have a first working, automated
>> version that the community can use.
>>
>> Also, I’d like to thank everybody who worked on this improvement!
>>
>> —
>> Alexey
>>
>>
>> [1]
>> https://www.tpc.org/tpc_documents_current_versions/current_specifications5.asp
>> [2] https://github.com/apache/beam/tree/master/sdks/java/testing/tpcds
>> [3] https://ci-beam.apache.org/job/beam_PostCommit_Java_Tpcds_Spark/
>> [4] https://ci-beam.apache.org/job/beam_PostCommit_Java_Tpcds_Flink/
>> [5] https://ci-beam.apache.org/job/beam_PostCommit_Java_Tpcds_Dataflow/
>> [6]
>> http://metrics.beam.apache.org/d/tkqc0AdGk2/tpc-ds-spark-classic-new-sql?orgId=1
>> [7] http://metrics.beam.apache.org/d/8INnSY9Mv/tpc-ds-flink-sql?orgId=1
>> [8]
>> http://metrics.beam.apache.org/d/tkqc0AdGk2/tpc-ds-spark-classic-new-sql?orgId=1
>> [9] https://beam.apache.org/documentation/sdks/java/testing/tpcds/