On 31/07/2012 18:53, Alexander Burger wrote:

It causes the list of files in the +Backup object to be extended
each time a +File is created:

(de addFile (Bk P)
...
       (put!>
          o
          'backups
          (append (; 'o backups) Bk) )

Thus, the single +Backup object gets larger and larger.

As a general rule, you can always use '+Joint' and '+Ref +Link'
interchangeably. You want to use a list of '+Joint's only if the list
doesn't get too long (less than, say, 100).

This is important information! Thanks! (and it absolutely excludes the use of +Joint in my case!)


So the most important change is to remove the line

    (rel files (+List +Joint) backups (+File))

from the +Backup class, and use

    (rel backups (+List +Ref +Link) NIL (+Backup))

in the +File class. This will increase the speed dramatically.


I did fear it would be the +Joint, but didn't really know how to replace it...
I'll have to test how it behaves...
What I intended was to have an immediate list of the +Files for any +Backup, and to be able to tell when a +File no longer has any +Backup, for cleanup. I guess I'll have to query the database for all the +Files linking to a given +Backup here (I haven't yet done more than skim this part of the tutorial...)
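
A minimal sketch of such a query, assuming the '(+List +Ref +Link)' relation suggested above and the standard 'collect' function from the database library. The class definitions are cut down to the bare minimum, and the temporary database file and the globals '*B1' and '*B2' are made up for illustration:

```picolisp
# Cut-down model: only the relations needed for the query
(class +Backup +Entity)
(rel name (+Key +String))

(class +File +Entity)
(rel pth (+Ref +String))
(rel backups (+List +Ref +Link) NIL (+Backup))

# Throw-away database in the process's temporary directory
(pool (tmp "demo.db"))

(setq
   *B1 (new T '(+Backup) 'name "first")
   *B2 (new T '(+Backup) 'name "second") )
(new T '(+File) 'pth "/etc/hosts" 'backups (list *B1 *B2))
(new T '(+File) 'pth "/etc/passwd" 'backups (list *B1))
(commit)

# All +File objects whose 'backups' list contains *B1
(println
   (mapcar '((F) (get F 'pth))
      (collect 'backups '+File *B1) ) )
```

The index on 'backups' does the work here, so the +Backup object itself no longer has to carry the list.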


After that, there are a few more places that should be optimized. You'll
notice the differences only when you create a lot more objects.

In the specification of the database block sizes


(dbs
    (0)
    (0 +Chunk)
    (0 +File)
    (0 (+File pth inode))
    (4 +Backup)
    (0 (+Backup name startDT hostName)) )

the '0's mean that the block size is 64. This is a bit too small for
'+File' and, more importantly, for the index trees. I would use '2' for
the '+File' and '+Backup' objects, and '4' for the indexes. This gives:

    (2 +File)
    (4 (+File pth inode))
    (2 +Backup)
    (4 (+Backup name startDT hostName)) )
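
For reference, a small sketch of what these numbers translate to, assuming the usual convention that a scale factor N gives blocks of 64 shifted left by N bytes (which agrees with '0' meaning 64 above). The helper 'blockSize' is hypothetical and not part of the library:

```picolisp
# Block size in bytes for a 'dbs' scale factor N: 64 shifted left by N
# ('>>' with a negative count shifts to the left)
(de blockSize (N)
   (>> (- N) 64) )

# Print the sizes for the scale factors discussed here
(for N (0 2 4)
   (println N (blockSize N)) )
```

Under that assumption, '2' gives 256-byte blocks for the objects and '4' gives 1024-byte blocks for the indexes.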


Indeed. I did play a bit with the 'dbs', but didn't see any change in the benchmarks, so I didn't bother with it any further as it was obviously not the main speed problem.


Then, the usage of 'request' is not as intended

       (request '(+File)
           'pth P
           'size 0
           'accessTime 1
           'modifyTime 2
           'changeTime 3
           'inode 4
           'links 1
           'uid 1001
           'gid 1002
           'accessRights 766 )

'request' searches with the given keys for an object, before it decides
whether to use an existing object or to create a new one.

So, typically, if 'pth' is a characteristic key, it would be called as

    (let Obj (request '(+File) 'pth P)
       (put> Obj ..)
       ...

However, I suspect that 'request' is not needed here at all, as you
create new +File objects. So 'new' is the way:

    (de addFile (Bk P)
       (let Obj
          (new (db: +File) '(+File)
             'pth P
             'size 0
             'accessTime 1
             'modifyTime 2
             'changeTime 3
             'inode 4
             'links 1
             'uid 1001
             'gid 1002
             'accessRights 766 )
          ## Not necessary (put> Obj 'backups (append (; Obj backups) Bk))
          (put> Obj 'backups Bk)
          (at (0 . 10000) (commit))
          Obj ) )


I'll have to study this carefully, because I do need the 'request' concept here: when doing a subsequent backup, I don't want to re-create all the +Files that are already in the database. The idea was to give 'request' all the data for the +File to add, and let it give me either a new one or an existing one. Or maybe search the 'pth index, get all the results, compare them with what I already have, then create or reuse. It will depend on whether a specialized function can do better than the generic 'request' in this case. For a backup engine, we want a new +File entry if any of the data returned by a stat call changes. (I still need to write the 'native' wrapper around glibc's stat... It'll be fun!)

Note two other changes I made:

    - Because 'backups' is a +List relation, the explicit 'append' is not
      necessary. Just putting 'Bk' is enough, the list will be created
      automatically.

I was wondering, and decided to use the safe option...


    - Calling 'new!', 'put!>' etc., i.e. the functions which call
      (dbSync) and then (commit) each time they are called, is very
      expensive. For a large-volume input it is better to go into
      single-user mode of the DB, avoid (dbSync), and call 'commit' less
      often. In the example above, it is called only every 10000th time.
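
A minimal sketch of how the 'at' counter behaves, using a plain counter in place of '(commit)' so that it runs without a database. The global '*Commits' and the batch size of 10 are chosen only for illustration:

```picolisp
# Count how often the body of 'at' fires during 25 iterations
(setq *Commits 0)
(do 25
   # Fires on every 10th pass, then resets its internal counter
   (at (0 . 10) (inc '*Commits)) )
(println *Commits)
```

The body fires on the 10th and 20th pass only, so the last 5 iterations are not covered. This is why a final '(commit)' after the loop is still needed.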

The same applies to the backup function


(de Backup (rootPath)
    (let obj1
       (request '(+Backup)
           'name (stamp)
           'startDT (stamp)
           'basePath rootPath
           'hostName (host "localhost") )
       (put! *DB 'currentBackup obj1)
       # now, walk the path
       (let Dir rootPath
          (recur (Dir)
             (for F (dir Dir)
                (let Path (pack Dir "/" F)
                   (addFile obj1 Path)
                   # note: change this test: it considers a link to a dir as a dir!
                   (if (=T (car (info Path)))
                      (recurse Path) ) ) ) ) )
       (length (get obj1 'files)) ) )

Avoiding 'request' and 'put!' gives basically

    (de backup (RootPath)
       (let Obj1
          (new (db: +Backup) '(+Backup)
             'name (stamp)
             'startDT (stamp)
             'basePath RootPath
             'hostName (host "localhost") )
          (put *DB 'currentBackup Obj1)
          (commit)
          # now, walk the path
          (let Dir RootPath
             (recur (Dir)
                (for F (dir Dir)
                   (let Path (pack Dir "/" F)
                      (addFile Obj1 Path)
                      # note: change this test: it considers a link to a dir as a dir!
                      (if (=T (car (info Path)))
                         (recurse Path) ) ) ) ) )
          (commit)
          (count (tree 'backups '+File)) ) )

: (bench (backup "/home"))
0.460 sec
-> 7125

Cheers,
- Alex


Thanks for your time!

Regards,
--
Laurent ARTAUD