Given the set of preferences quoted below, I would expect the difference between the largest freedisk (currently test-43) and the smallest freedisk (currently test-45) to be smaller than what is shown here. Below is the output from the diagnostics endpoint of the autoscaling API. According to this output, the spread between freedisk values is currently as large as 220GiB. I am concerned because I cannot tell whether this variation is expected or whether it is due to a configuration error on my part. Also, if possible, it would be great to only have to keep track of a single pool of disk space rather than six separate ones.
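
For reference, the output was fetched roughly like this (a minimal sketch; test-41 is just an example, any of the nodes should do):

  # read-only diagnostics from the autoscaling API
  curl "http://test-41:8983/solr/admin/autoscaling/diagnostics"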
What policy/preferences options would you suggest exploring specifically for evening out freedisk across the Solr nodes? (I have put a rough sketch of one change I am considering at the very end of this mail.)

{
  "responseHeader":{
    "status":0,
    "QTime":284},
  "diagnostics":{
    "sortedNodes":[{
        "node":"test-43:8983_solr",
        "cores":137,
        "freedisk":447.0913887023926,
        "sysLoadAvg":117.0},
      {
        "node":"test-42:8983_solr",
        "cores":137,
        "freedisk":369.33697509765625,
        "sysLoadAvg":93.0},
      {
        "node":"test-46:8983_solr",
        "cores":137,
        "freedisk":361.7615737915039,
        "sysLoadAvg":93.0},
      {
        "node":"test-41:8983_solr",
        "cores":137,
        "freedisk":347.91234970092773,
        "sysLoadAvg":86.0},
      {
        "node":"test-44:8983_solr",
        "cores":137,
        "freedisk":341.1301383972168,
        "sysLoadAvg":160.0},
      {
        "node":"test-45:8983_solr",
        "cores":137,
        "freedisk":227.17399215698242,
        "sysLoadAvg":118.0}],
    "violations":[]},
  "WARNING":"This response format is experimental. It is likely to change in the future."}

On Mon, Aug 27, 2018 at 5:17 PM Kudrettin Güleryüz <kudret...@gmail.com> wrote:

> Hi,
>
> We have six Solr nodes with ~1TiB of disk space each, mounted as ext4. The
> indexers sometimes update the collections and create new ones if updating
> wouldn't be faster than indexing from scratch. (Up to around 5 million
> documents are indexed for each collection.) On average there are around
> 130 collections on this SolrCloud. Collection sizes vary from 1GiB to
> 150GiB.
>
> Preferences set:
>
> "cluster-preferences":[{
>     "maximize":"freedisk",
>     "precision":10},
>   {
>     "minimize":"cores",
>     "precision":1},
>   {
>     "minimize":"sysLoadAvg",
>     "precision":3}],
>
> * Is it possible to run out of disk space on one of the nodes while the
>   others still have plenty? I observe some getting close to ~80%
>   utilization while others stay at ~60%.
> * Would this difference be due to collection index size differences, or
>   due to an error on my side in coming up with a useful policy/preferences?
>
> Thank you
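
For concreteness, this is the kind of change I have been considering, though I am not sure it is the right lever. The precision of 5 and the 250GiB free-disk floor are placeholder numbers, not recommendations, and I may have the policy rule form slightly wrong:

  # tighten the freedisk precision so smaller gaps still influence placement
  curl -X POST -H 'Content-type:application/json' \
       'http://test-41:8983/solr/admin/autoscaling' -d '{
    "set-cluster-preferences": [
      {"maximize": "freedisk", "precision": 5},
      {"minimize": "cores", "precision": 1},
      {"minimize": "sysLoadAvg", "precision": 3}]}'

  # and/or stop placing new replicas on nodes below a free-disk floor
  curl -X POST -H 'Content-type:application/json' \
       'http://test-41:8983/solr/admin/autoscaling' -d '{
    "set-cluster-policy": [
      {"replica": 0, "freedisk": "<250"}]}'

Would either of these help, or is the spread mostly down to the collection size differences?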