I can confirm that this is a bug in the Ceph 14.2.4 dashboard, as I am also
seeing this when I check the free space under "Pools".
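
For tools that compute capacity as USED + MAX AVAIL, here is a minimal sketch of the adjusted arithmetic, assuming the Nautilus STORED value and MAX AVAIL are both available in bytes. The function name pool_monitor_from_stored and the USED_PCT line are hypothetical and not part of Ceph or OpenNebula; the other output keys mirror the rbd_df_monitor function quoted below.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Ceph or OpenNebula): derive pool figures
# from STORED and MAX AVAIL, both given in bytes, instead of USED + MAX AVAIL.
pool_monitor_from_stored() {
    local stored=${1:-0} max_avail=${2:-0}
    local mib=$((1024 ** 2))
    # Notional capacity: user data already stored plus what can still be stored.
    local total=$((stored + max_avail))
    local pct=0
    # Guard against division by zero for a pool with no capacity at all.
    if [ "$total" -gt 0 ]; then
        pct=$((stored * 100 / total))
    fi
    echo "USED_MB=$((stored / mib))"
    echo "TOTAL_MB=$((total / mib))"
    echo "FREE_MB=$((max_avail / mib))"
    echo "USED_PCT=$pct"
}
```

With the figures from this thread (15 TiB stored, 7.2 TiB max avail), the total comes to about 22.2 TiB rather than the 53.2 TiB obtained from USED + MAX AVAIL.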

On 8 October 2019 07:54:58 CEST, "Yordan Yordanov (Innologica)"
<yordan.yorda...@innologica.com> wrote:
>Hi Igor,
>
>Thank you for responding. In this case this looks like a breaking
>change. I know of two applications that are now incorrectly displaying
>the pool usage and capacity. It looks like they both rely on the USED
>field being divided by the number of replicas. One of those applications
>is actually the Ceph Dashboard. The other is OpenNebula
>https://docs.opennebula.org/5.6/deployment/open_cloud_storage_setup/ceph_ds.html.
>See the screenshot from Ceph Dashboard - https://imgur.com/vFFxsti. It
>is stating that we have used 88% of the available space, because it
>wrongly assumes that the pool capacity is 47.7TB + 6.7TB = 54.4TB,
>while it should be more like (47.7TB/3) + 6.7TB = 22.6TB. It's
>absolutely the same story with our OpenNebula instance -
>https://imgur.com/MOLbo4g. I'm not sure exactly which update broke
>this, but it was definitely working correctly before.
>I looked at OpenNebula's code for ceph datastore monitoring and found
>that it's parsing the XML output of ceph df --format xml, so it looks
>like this changed too.
>
>From file: /var/lib/one/remotes/tm/ceph/monitor:
># ------------ Compute datastore usage -------------
>
>MONITOR_SCRIPT=$(cat <<EOF
>$CEPH df --format xml
>EOF
>)
>
>MONITOR_DATA=$(ssh_monitor_and_log $HOST "$MONITOR_SCRIPT" 2>&1)
>MONITOR_STATUS=$?
>
>if [ "$MONITOR_STATUS" = "0" ]; then
>    XPATH="${DRIVER_PATH}/../../datastore/xpath.rb --stdin"
>    echo -e "$(rbd_df_monitor ${MONITOR_DATA} ${POOL_NAME})"
>else
>    echo "$MONITOR_DATA"
>    exit $MONITOR_STATUS
>fi
>
>
>
>From file: /var/lib/one/remotes/datastore/ceph/ceph_utils.sh
>
>#--------------------------------------------------------------------------------
># Parse the output of rbd df in xml format and generates a monitor string for a
># Ceph pool. You **MUST** define XPATH util before using this function
>#   @param $1 the xml output of the command
>#   @param $2 the pool name
>#--------------------------------------------------------------------------------
>rbd_df_monitor() {
>
>    local monitor_data i j xpath_elements pool_name bytes_used free
>
>    monitor_data=$1
>    pool_name=$2
>
>    while IFS= read -r -d '' element; do
>        xpath_elements[i++]="$element"
>    done < <(echo $monitor_data | $XPATH \
>        "/stats/pools/pool[name = \"${pool_name}\"]/stats/bytes_used" \
>          "/stats/pools/pool[name = \"${pool_name}\"]/stats/max_avail")
>
>    bytes_used="${xpath_elements[j++]:-0}"
>    free="${xpath_elements[j++]:-0}"
>
>    cat << EOF | tr -d '[:blank:][:space:]'
>        USED_MB=$(($bytes_used / 1024**2))\n
>        TOTAL_MB=$((($bytes_used + $free) / 1024**2))\n
>        FREE_MB=$(($free / 1024**2))\n
>EOF
>}
>
>I believe Ceph Dashboard is doing the same, because the results are the
>same.
>
>Best Regards,
>
>
>On 7 Oct 2019, at 19:03, Igor Fedotov <ifedo...@suse.de> wrote:
>
>
>Hi Yordan,
>
>this is the Mimic documentation, and these snippets aren't valid for
>Nautilus any more. They are still present in the Nautilus pages, though.
>
>Going to create a corresponding ticket to fix that.
>
>Relevant Nautilus changes for 'ceph df [detail]' command can be found
>in Nautilus release notes:
>https://docs.ceph.com/docs/nautilus/releases/nautilus/
>
>In short, the USED field now accounts for all the overhead data, including
>replicas etc. It is the STORED field that now represents the pure data a
>user has put into a pool.
>
>
>Thanks,
>
>Igor
>
>On 10/2/2019 8:33 AM, Yordan Yordanov (Innologica) wrote:
>The documentation states:
>https://docs.ceph.com/docs/mimic/rados/operations/monitoring/
>
>The POOLS section of the output provides a list of pools and the
>notional usage of each pool. The output from this section DOES NOT
>reflect replicas, clones or snapshots. For example, if you store an
>object with 1MB of data, the notional usage will be 1MB, but the actual
>usage may be 2MB or more depending on the number of replicas, clones
>and snapshots.
>
>However, in our case we are clearly seeing the USED field multiplying
>the total object sizes by the number of replicas.
>
>[root@blackmirror ~]# ceph df
>RAW STORAGE:
>    CLASS     SIZE       AVAIL      USED       RAW USED     %RAW USED
>    hdd       80 TiB     34 TiB     46 TiB       46 TiB         58.10
>    TOTAL     80 TiB     34 TiB     46 TiB       46 TiB         58.10
>
>POOLS:
>POOL      ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>one        2      15 TiB       4.05M      46 TiB     68.32       7.2 TiB
>bench      5     250 MiB          67     250 MiB         0        22 TiB
>
>[root@blackmirror ~]# rbd du -p one
>NAME           PROVISIONED USED
>...
><TOTAL>             20 TiB  15 TiB
>
>This is causing several apps (including the Ceph Dashboard) to display
>inaccurate percentages, because they calculate the total pool capacity
>as USED + MAX AVAIL, which in this case yields 53.2TB, which is way
>off. 7.2TB is about 13% of that, so we receive alarms, and this has
>been bugging us for quite some time now.
>
>
>
>_______________________________________________
>ceph-users mailing list -- ceph-users@ceph.io
>To unsubscribe send an email to ceph-users-le...@ceph.io