Because we have seen many complains about sequential vs index scans, I
wrote a script which computes the value for your OS/hardware
combination.

Under BSD/OS on one SCSI disk, I get a random_page_cost around 60.  Our
current postgresql.conf default is 4.

What do other people get for this value?

Keep in mind if we increase this value, we will get a more sequential
scans vs. index scans.

One flaw in this test is that it randomly reads blocks from different
files rather than randomly reading from the same file.  Do people have a
suggestion on how to correct this?  Does it matter?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#!/bin/bash

trap "rm -f /tmp/$$" 0 1 2 3 15

BLCKSZ=8192

if [ "$RANDOM" = "$RANDOM" ]
then    echo "Your shell does not support \$RANDOM.  Try using bash." 1>&2
        exit 1
fi

# XXX We assume 0 <= random <= 32767

echo "Collecting sizing information ..."

TEMPLATE1=`du -s "$PGDATA/base/1" | awk '{print $1}'`
FULL=`du -s "$PGDATA/base" | awk '{print $1}'`
if [ "$FULL" -lt `expr "$TEMPLATE1" \* 4` ]
then    echo "Your installation should have at least four times the data stored in 
template1 to yield meaningful results" 1>&2
        exit 1 
fi

find "$PGDATA/base" -type f -exec ls -ld {} \; |
awk '$5 % '"$BLCKSZ"' == 0 {print $5 / '"$BLCKSZ"', $9}' |
grep -v '^0 ' > /tmp/$$

TOTAL=`awk 'BEGIN       {sum=0}
                        {sum += $1}
            END         {print sum}' /tmp/$$`

echo "Running random access timing test ..."

START=`date '+%s'`
PAGES=1000

while [ "$PAGES" -ne 0 ]
do
        BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`
        
        OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) * '"$TOTAL"'}'`
        
        RESULT=`awk '   BEGIN   {offset = 0}
                offset + $1 > '"$OFFSET"' \
                        {print $2, '"$OFFSET"' - offset ; exit}
                        {offset += $1}' /tmp/$$`
        FILE=`echo "$RESULT" | awk '{print $1}'`
        OFFSET=`echo "$RESULT" | awk '{print $2}'`
        
        dd bs="$BLCKSZ" seek="$OFFSET" count=1 if="$FILE" of="/dev/null" >/dev/null 
2>&1
        PAGES=`expr "$PAGES" - 1`
done

STOP=`date '+%s'`
RANDTIME=`expr "$STOP" - "$START"`

echo "Running sequential access timing test ..."

START=`date '+%s'`
# We run the random test 10 times more because it is quicker and
# we need it to run for a while to get accurate results.
PAGES=10000

while [ "$PAGES" -ne 0 ]
do
        BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`
        
        OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) * '"$TOTAL"'}'`
        
        RESULT=`awk '   BEGIN   {offset = 0}
                offset + $1 > '"$OFFSET"' \
                        {print $2, $1; exit}
                        {offset += $1}' /tmp/$$`
        FILE=`echo "$RESULT" | awk '{print $1}'`
        FILEPAGES=`echo "$RESULT" | awk '{print $2}'`

        if [ "$FILEPAGES" -gt "$PAGES" ]
        then    FILEPAGES="$PAGES"
        fi
        
        dd bs="$BLCKSZ" count="$FILEPAGES" if="$FILE" of="/dev/null" >/dev/null 2>&1
        PAGES=`expr "$PAGES" - "$FILEPAGES"`
done

STOP=`date '+%s'`
SEQTIME=`expr "$STOP" - "$START"`

echo
awk 'BEGIN      {printf "random_page_cost = %f\n", ('"$RANDTIME"' / '"$SEQTIME"') * 
10}'

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

Reply via email to