[ 
https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115774#comment-16115774
 ] 

Ishan Chattopadhyaya edited comment on SOLR-10317 at 8/6/17 11:49 AM:
----------------------------------------------------------------------

https://github.com/viveknarang/lucene-solr/blob/SolrNightlyBenchmarks/dev-tools/solrnightlybenchmarks/src/main/java/org/apache/solr/tests/nightlybenchmarks/TestPlans.java#L32-L34
{code}
public enum BenchmarkTestType {
    PROD_TEST, DEV_TEST
}
{code}

There should be no concept of "prod" or "dev". It should be a benchmark that 
relies on configuration rather than assumed defaults like "prod" or "dev".

Also, I don't think we should be using terms like "tests" for individual 
benchmarks. There are no verifications or assertions in these benchmarks; 
they only collect timing data.

The way the entire code is laid out, it is extremely hard to add new 
benchmarks. It might require an entirely new GSoC project next year to make 
this useful for the community. Hard-coded test scenarios, really?
https://github.com/viveknarang/lucene-solr/blob/SolrNightlyBenchmarks/dev-tools/solrnightlybenchmarks/src/main/java/org/apache/solr/tests/nightlybenchmarks/MetricCollector.java#L31-L219
 
We need to make this configurable at the earliest.

In my opinion, the benchmarks in this suite should be configured like this:
{code}
{
  "index-benchmarks": [
    {
      "name": "CLOUD_INDEXING",
      "description": "some shit",
      "replication-type": "cloud",
      "dataset-file": "filename containing data"
      "setups": [
        {
          "collection": "cloud_2x2",
          "replicationFactor": 2,
          "shards": 2,
          "min-threads": 1,
          "max-threads": 16
        },
        {
          "collection": "cloud_1x1",
          "replicationFactor": 1,
          "shards": 1,
          "min-threads": 1,
          "max-threads": 16
        },
        {
          "collection": "cloud_1x2",
          "replicationFactor": 2,
          "shards": 1,
          "min-threads": 1,
          "max-threads": 16
        }
      ]
    },
    {
      "name": "CLOUD_PARTIAL_UPDATE",
      "description": "some shit",
      "replication-type": "cloud",
      "dataset-file": "filename containing full documents",
      "updates-file": "filename containing updates",
      "setups": [
        {
          "collection": "partial_2x2",
          "replicationFactor": 2,
          "shards": 2,
          "min-threads": 1,
          "max-threads": 16
        },
        {
          "collection": "partial_1x1",
          "replicationFactor": 1,
          "shards": 1,
          "min-threads": 1,
          "max-threads": 16
        },
        {
          "collection": "partial_1x2",
          "replicationFactor": 2,
          "shards": 1,
          "min-threads": 1,
          "max-threads": 16
        }
      ]
    }, 
    ... more such benchmarks ...
  ],
  "query-benchmarks": [
    {
      "name": "TERM_NUMERIC_QUERY_CLOUD_2T",
      "description": "some shit describing the benchmark",
      "replication-type": "cloud or standalone""collection/core": "<name of the 
collection/core, that must've already been created during the indexing phase>",
      "query-file": "name of file containing all the queries for this 
benchmark",
      "client-type": "CUSC or CSC or HSC etc.",
      "min-threads": 1"max-threads": 8
    }, 
    ... more such benchmarks ...
  ]
}
{code}
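
To make this concrete, here is a minimal sketch of how the suite could load 
such a file. It assumes Jackson for the JSON parsing and a file named 
"benchmarks.json"; both are assumptions for illustration, not part of the 
current code:
{code}
import java.io.File;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class BenchmarkConfigLoader {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    JsonNode root = mapper.readTree(new File("benchmarks.json"));

    // every entry under "index-benchmarks" is one benchmark definition,
    // and every entry under its "setups" is one collection layout to run
    for (JsonNode benchmark : root.get("index-benchmarks")) {
      String name = benchmark.get("name").asText();
      for (JsonNode setup : benchmark.get("setups")) {
        System.out.printf("%s: collection=%s shards=%d rf=%d threads=%d..%d%n",
            name,
            setup.get("collection").asText(),
            setup.get("shards").asInt(),
            setup.get("replicationFactor").asInt(),
            setup.get("min-threads").asInt(),
            setup.get("max-threads").asInt());
      }
    }
  }
}
{code}
Adding a new benchmark then becomes a matter of adding a JSON entry, with no 
code change.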

Based on this, the suite will do the right thing. Various things to consider 
here:
# Partial update benchmarks should (the indexing sketch after this list covers this case too, with only the update pass timed):
{code}
for every replicationFactor, shards, thread combination:
  Create a new collection with the given name, replicationFactor and shards
  Index the full dataset without timing.
  Start timer
  Update all documents
  Stop timer, record difference in time
  Delete this collection.
{code}
# Full document indexing benchmarks should (see the SolrJ indexing sketch after this list):
{code}
for every replicationFactor, shards, thread combination:
  Create a new collection with the given name, replicationFactor and shards
  Start timer
  Index the documents
  Stop timer, record difference in time
  if (numThreads != maxThreads):
     Delete this collection
  else:
     // don't delete, since this collection needs to stay for query benchmarks
{code}
# For every query benchmark (see the query sketch after this list):
{code}
for every collection, thread combination:
  Stop and start all Solr nodes (so that caches are cleared)
  Wait until all replicas for the given collection are "active"
  Issue around 100-200 queries to warm up the searchers.
  Start timer
  Query the collection using the given numThreads
  Stop timer, record difference in time
{code}
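
As referenced in the list above, here is a minimal SolrJ sketch of one timed 
full document indexing run. The zkHost, the configset name "conf1", and the 
pre-loaded document list are assumptions for illustration; the partial update 
benchmark would follow the same shape, indexing the dataset untimed first and 
then timing an atomic-update pass instead:
{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.common.SolrInputDocument;

public class IndexingBenchmark {

  /** Times one indexing run for the given collection layout and thread count. */
  static long runIndexingBenchmark(CloudSolrClient client, String collection,
      int shards, int replicationFactor, int numThreads, int maxThreads,
      List<SolrInputDocument> docs) throws Exception {
    // "conf1" is an assumed, already-uploaded configset
    CollectionAdminRequest.createCollection(collection, "conf1", shards, replicationFactor)
        .process(client);

    long start = System.nanoTime();
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    for (SolrInputDocument doc : docs) {
      pool.submit(() -> {
        try {
          client.add(collection, doc); // CloudSolrClient is thread-safe
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    client.commit(collection);
    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

    if (numThreads != maxThreads) {
      // only the max-threads run keeps its collection, for the query benchmarks
      CollectionAdminRequest.deleteCollection(collection).process(client);
    }
    return elapsedMs;
  }
}
{code}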
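
And a corresponding sketch for the query loop. Restarting the nodes and 
waiting for "active" replicas is left to the harness, the warmup count of 200 
comes from the pseudocode above, and loading the query file into a list is 
assumed to have happened already:
{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;

public class QueryBenchmark {

  /** Times one query run against an already-populated collection. */
  static long runQueryBenchmark(SolrClient client, String collection,
      List<String> queries, int numThreads) throws Exception {
    // warm up the searchers with the first ~200 queries (untimed)
    for (String q : queries.subList(0, Math.min(200, queries.size()))) {
      client.query(collection, new SolrQuery(q));
    }

    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    long start = System.nanoTime();
    for (String q : queries) {
      pool.submit(() -> {
        try {
          client.query(collection, new SolrQuery(q));
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }
}
{code}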

If the graphs need any information that isn't covered here, please 
comment/discuss.

What do you think of the above proposal (in general or in specific parts) to 
make the suite easier to configure/manage/extend?



> Solr Nightly Benchmarks
> -----------------------
>
>                 Key: SOLR-10317
>                 URL: https://issues.apache.org/jira/browse/SOLR-10317
>             Project: Solr
>          Issue Type: Task
>            Reporter: Ishan Chattopadhyaya
>              Labels: gsoc2017, mentor
>         Attachments: changes-lucene-20160907.json, 
> changes-solr-20160907.json, managed-schema, 
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks.docx, 
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks-FINAL-PROPOSAL.pdf, 
> Screenshot from 2017-07-30 20-30-05.png, SOLR-10317.patch, SOLR-10317.patch, 
> solrconfig.xml
>
>
> Solr needs nightly benchmarks reporting. Similar Lucene benchmarks can be 
> found here, https://home.apache.org/~mikemccand/lucenebench/.
> Preferably, we need:
> # A suite of benchmarks that build Solr from a commit point, start Solr 
> nodes, both in SolrCloud and standalone mode, and record timing information 
> of various operations like indexing, querying, faceting, grouping, 
> replication etc.
> # It should be possible to run them either as an independent suite or as a 
> Jenkins job, and we should be able to report timings as graphs (Jenkins has 
> some charting plugins).
> # The code should eventually be integrated in the Solr codebase, so that it 
> never goes out of date.
> There is some prior work / discussion:
> # https://github.com/shalinmangar/solr-perf-tools (Shalin)
> # https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md 
> (Ishan/Vivek)
> # SOLR-2646 & SOLR-9863 (Mark Miller)
> # https://home.apache.org/~mikemccand/lucenebench/ (Mike McCandless)
> # https://github.com/lucidworks/solr-scale-tk (Tim Potter)
> There is support for building, starting, indexing/querying and stopping Solr 
> in some of these frameworks above. However, the benchmarks run are very 
> limited. Any of these can be a starting point, or a new framework could be 
> used instead. The motivation is to be able to cover every functionality of Solr 
> with a corresponding benchmark that is run every night.
> Proposing this as a GSoC 2017 project. I'm willing to mentor, and I'm sure 
> [~shalinmangar] and [~markrmil...@gmail.com] would help here.


