hi folks,

I'd like to propose a timeline for getting a first iteration of a
benchmark database developed and live, with scripts that enable one or
more initial agents to start adding new data on a daily / per-commit
basis. I have at least 3 physical machines where I could immediately
set up cron jobs to start adding new data, and I could attempt to
backfill data as far back as possible.

Personally, I would like to see this done by the end of February if
not sooner. If we don't have the volunteers to push the work to
completion by then, please let me know and I will rearrange my
priorities to make sure it happens. Does that timeline sound
reasonable?

Please let me know if this plan sounds reasonable:

* Set up a hosted PostgreSQL instance and configure backups
* Propose and adopt a database schema for storing benchmark results
* For C++, write a script (or Dockerfile) that executes all the
google-benchmarks and outputs the results as JSON, plus an adapter
script (Python) that ingests them into the database (rough sketch
below)
* For Python, a similar script that invokes ASV and then inserts the
ASV results into the benchmark database

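To make the C++ adapter concrete, here is a rough sketch of what such
a script could look like -- the binary name, table, and column names
are placeholders rather than a settled design, and it assumes psycopg2
and google-benchmark's JSON reporter:

    import json
    import subprocess

    import psycopg2

    # Run a google-benchmark binary and capture its JSON report
    # (the binary name here is made up)
    raw = subprocess.run(
        ["./arrow-cpp-benchmark", "--benchmark_format=json"],
        check=True, capture_output=True,
    ).stdout
    report = json.loads(raw)

    # Insert one row per benchmark into a hypothetical results table
    with psycopg2.connect("dbname=benchmarks") as conn:
        cur = conn.cursor()
        for bench in report["benchmarks"]:
            cur.execute(
                "INSERT INTO benchmark_results "
                "(benchmark_name, value, time_unit) VALUES (%s, %s, %s)",
                (bench["name"], bench["real_time"], bench["time_unit"]),
            )

The ASV adapter would follow the same pattern, reading ASV's JSON
results instead.
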
This seems to be a prerequisite for having a front end to visualize
the results, but the dashboard / front end can hopefully be
implemented in such a way that it is not too tightly coupled to the
details of the benchmark database.

(Do we have any other benchmarks in the project whose results would
need to be inserted initially?)

Related work to trigger benchmarks on agents when new commits land in
master can happen concurrently -- one task need not block the other.

Thanks
Wes

On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> Sorry, copy-paste failure: https://issues.apache.org/jira/browse/ARROW-4313
>
> On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > I don't think there is one but I just created
> > https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
> >
> > On Mon, Jan 21, 2019 at 10:35 AM Tanya Schlusser <ta...@tickel.net> wrote:
> > >
> > > Areg,
> > >
> > > If you'd like help, I volunteer! No experience benchmarking, but tons
> > > of experience databasing -- I can mock up the backend (database + http)
> > > as a starting point for discussion if this is the way people want to go.
> > >
> > > Is there a Jira ticket for this that I can jump into?
> > >
> > > On Sun, Jan 20, 2019 at 3:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > > hi Areg,
> > > >
> > > > This sounds great -- we've discussed building a more full-featured
> > > > benchmark automation system in the past but nothing has been developed
> > > > yet.
> > > >
> > > > Your proposal about the details sounds OK; the single most important
> > > > thing to me is that we build and maintain a very general-purpose
> > > > database schema for the historical benchmark database.
> > > >
> > > > The benchmark database should keep track of:
> > > >
> > > > * Timestamp of benchmark run
> > > > * Git commit hash of codebase
> > > > * Machine unique name (sort of the "user id")
> > > > * CPU identification for machine, and clock frequency (in case of
> > > > overclocking)
> > > > * CPU cache sizes (L1/L2/L3)
> > > > * Whether or not CPU throttling is enabled (if it can be easily 
> > > > determined)
> > > > * RAM size
> > > > * GPU identification (if any)
> > > > * Benchmark unique name
> > > > * Programming language(s) associated with benchmark (e.g. a benchmark
> > > > may involve both C++ and Python)
> > > > * Benchmark time, plus mean and standard deviation if available, else 
> > > > NULL
> > > >
> > > > (maybe some other things)
> > > >
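> > > > To make that concrete, here is a first-pass sketch of such a schema
> > > > (Python / psycopg2; all names are placeholders, and a real design
> > > > would probably normalize machines and benchmarks into separate
> > > > tables):
> > > >
> > > >     import psycopg2
> > > >
> > > >     # Placeholder DDL covering the fields listed above
> > > >     DDL = """
> > > >     CREATE TABLE IF NOT EXISTS benchmark_results (
> > > >         id               BIGSERIAL PRIMARY KEY,
> > > >         run_timestamp    TIMESTAMPTZ NOT NULL,
> > > >         git_commit       TEXT NOT NULL,
> > > >         machine_name     TEXT NOT NULL,
> > > >         cpu_model        TEXT,
> > > >         cpu_frequency_hz BIGINT,
> > > >         l1_cache_bytes   INTEGER,
> > > >         l2_cache_bytes   INTEGER,
> > > >         l3_cache_bytes   INTEGER,
> > > >         cpu_throttling   BOOLEAN,
> > > >         ram_bytes        BIGINT,
> > > >         gpu_model        TEXT,
> > > >         benchmark_name   TEXT NOT NULL,
> > > >         languages        TEXT[],
> > > >         value            DOUBLE PRECISION,
> > > >         time_unit        TEXT,
> > > >         mean             DOUBLE PRECISION,
> > > >         stddev           DOUBLE PRECISION
> > > >     );
> > > >     """
> > > >
> > > >     with psycopg2.connect("dbname=benchmarks") as conn:
> > > >         conn.cursor().execute(DDL)
> > > >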
> > > > I would rather not be locked into the internal database schema of a
> > > > particular benchmarking tool, so that people in the community can just
> > > > run SQL queries against the database and use the data however they
> > > > like. We'll just have to be careful that people don't DROP TABLE or
> > > > DELETE (but we should have daily backups so we can recover from such
> > > > cases).
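> > > >
> > > > For instance, pulling one benchmark's history on one machine might
> > > > look like this (using the placeholder names from the sketch above;
> > > > the benchmark and machine names are made up):
> > > >
> > > >     import psycopg2
> > > >
> > > >     with psycopg2.connect("dbname=benchmarks") as conn:
> > > >         cur = conn.cursor()
> > > >         # Time series for a single benchmark on a single machine
> > > >         cur.execute(
> > > >             "SELECT run_timestamp, value FROM benchmark_results "
> > > >             "WHERE benchmark_name = %s AND machine_name = %s "
> > > >             "ORDER BY run_timestamp",
> > > >             ("builder-benchmark", "wes-desktop"),
> > > >         )
> > > >         history = cur.fetchall()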
> > > >
> > > > So while we may make use of TeamCity to schedule the runs in the cloud
> > > > and on physical hardware, we should also provide a path for other
> > > > people in the community to add data to the benchmark database from
> > > > their hardware on an ad hoc basis. For example, I have several
> > > > machines in my home spanning all major operating systems (Windows /
> > > > macOS / Linux, and soon ARM64 hardware as well) and I'd like to set up
> > > > scheduled tasks / cron jobs to report in to the database at least
> > > > daily.
> > > >
> > > > Ideally the benchmark database would just be a PostgreSQL server with
> > > > a schema we write down and keep backed up, etc. Hosted PostgreSQL is
> > > > inexpensive ($200+ per year depending on the size of the instance;
> > > > this probably doesn't need to be a crazy big machine).
> > > >
> > > > I suspect there will be a manageable amount of development involved in
> > > > gluing each of the benchmarking frameworks together with the benchmark
> > > > database. This glue can also handle querying the operating system for
> > > > the system information listed above.
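> > > >
> > > > As a sketch of that querying piece (psutil is a third-party library;
> > > > cache sizes, throttling state, and GPU details would need more
> > > > platform-specific probing than shown here):
> > > >
> > > >     import platform
> > > >
> > > >     import psutil  # pip install psutil
> > > >
> > > >     def system_info():
> > > >         # Collect a subset of the machine details listed earlier
> > > >         freq = psutil.cpu_freq()  # may be None on some platforms
> > > >         return {
> > > >             "machine_name": platform.node(),
> > > >             "cpu_model": platform.processor(),
> > > >             "cpu_frequency_mhz": freq.current if freq else None,
> > > >             "ram_bytes": psutil.virtual_memory().total,
> > > >             "os": platform.platform(),
> > > >         }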
> > > >
> > > > Thanks
> > > > Wes
> > > >
> > > > On Fri, Jan 18, 2019 at 12:14 AM Melik-Adamyan, Areg
> > > > <areg.melik-adam...@intel.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > I want to restart/attach to the discussion about creating an Arrow
> > > > > benchmarking dashboard. I propose running the performance benchmarks
> > > > > per commit to track changes.
> > > > > The proposal includes building infrastructure for per-commit tracking
> > > > > comprising the following parts:
> > > > > - Hosted JetBrains TeamCity for OSS https://teamcity.jetbrains.com/
> > > > > as a build system
> > > > > - Agents running in the cloud, both VM/container (DigitalOcean, or
> > > > > others) and bare-metal (Packet.net/AWS), and on-premise (Nvidia
> > > > > boxes?)
> > > > > - JFrog Artifactory storage and management for OSS projects
> > > > > https://jfrog.com/open-source/#artifactory2
> > > > > - Codespeed as a frontend https://github.com/tobami/codespeed
> > > > >
> > > > > I am volunteering to build such a system (if needed, more Intel folks
> > > > > will be involved) so we can start tracking performance on various
> > > > > platforms and understand how changes affect it.
> > > > >
> > > > > Please let me know your thoughts!
> > > > >
> > > > > Thanks,
> > > > > -Areg.
> > > > >
> > > >
