Hi Laurent,

> As I like to do something useful when I learn a new language, I
> decided to do a backup system.

OK.

> As you can see, given the progression, it will soon take more time
> to store the meta-data than processing the data itself.
> Does anybody see if I made a mistake in my code?

Yes, the main problem is this +Joint:

> (class +File +Entity)
> ...
> (rel backups (+List +Joint) files (+Backup))  #
> 
> (class +Backup +Entity)
> ...
> (rel files (+List +Joint) backups (+File))    #

It causes the list of files in the +Backup object to be extended
each time a +File is created:

> (de addFile (Bk P)
> ...
>       (put!>
>          o
>          'backups
>          (append (; 'o backups) Bk) )

Thus, the single +Backup object gets larger and larger.

As a general rule, you can always use '+Joint' and '+Ref +Link'
interchangeably. You want to use a list of '+Joint's only if the list
doesn't get too long (fewer than, say, 100 entries).

So the most important change is to remove the line

   (rel files (+List +Joint) backups (+File))

from the +Backup class, and use

   (rel backups (+List +Ref +Link) NIL (+Backup))

in the +File class. This will increase the speed dramatically.


After that, there are a few more places that should be optimized. You'll
notice the differences only when you create a lot more objects.

In the specification of the database block sizes


> (dbs
>    (0)
>    (0 +Chunk)
>    (0 +File)
>    (0 (+File pth inode))
>    (4 +Backup)
>    (0 (+Backup name startDT hostName)) )

the '0's mean that the block size is 64 bytes. This is a bit too small
for '+File' and, more importantly, for the index trees. I would use '2'
for the '+File' and '+Backup' objects, and '4' for the indexes. This gives:

   (2 +File)
   (4 (+File pth inode))
   (2 +Backup)
   (4 (+Backup name startDT hostName)) )
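To make the numbers concrete: the value in each 'dbs' entry is a scale
factor, and the block size is 64 shifted left by it. A small sketch
that prints the sizes for the factors used above ('blockSize' is just
a made-up helper name for illustration, not a library function):

```picolisp
# Block size in bytes for a 'dbs' scale factor N: 64 shifted left by N.
# (A negative count makes '>>' shift to the left.)
(de blockSize (N)            # hypothetical helper, not part of PicoLisp
   (>> (- N) 64) )

(for N (0 2 4)
   (prinl N " -> " (blockSize N) " bytes") )
```

This should print 64, 256 and 1024 bytes for the factors 0, 2 and 4.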


Then, the usage of 'request' is not as intended:

>       (request '(+File)
>           'pth P
>           'size 0
>           'accessTime 1
>           'modifyTime 2
>           'changeTime 3
>           'inode 4
>           'links 1
>           'uid 1001
>           'gid 1002
>           'accessRights 766 )

'request' searches with the given keys for an object, before it decides
whether to use an existing object or to create a new one.

So, typically, if 'pth' is a characteristic key, it would be called as

   (let Obj (request '(+File) 'pth P)
      (put> Obj ..)
      ...

However, I suspect that 'request' is not needed here at all, as you
create new +File objects. So 'new' is the way:

   (de addFile (Bk P)
      (let Obj
         (new (db: +File) '(+File)
            'pth P
            'size 0
            'accessTime 1
            'modifyTime 2
            'changeTime 3
            'inode 4
            'links 1
            'uid 1001
            'gid 1002
            'accessRights 766 )
         ## Not necessary (put> Obj 'backups (append (; Obj backups) Bk))
         (put> Obj 'backups Bk)
         (at (0 . 10000) (commit))
         Obj ) )

Note two other changes I made:

   - Because 'backups' is a +List relation, the explicit 'append' is not
     necessary. Just putting 'Bk' is enough, the list will be created
     automatically.

   - Calling 'new!', 'put!>' etc., i.e. the functions which call
     (dbSync) and then (commit) each time they are called, is very
     expensive. For a large-volume input it is better to go into
     single-user mode of the DB, avoid (dbSync), and call 'commit' less
     often. In the example above, it is called only every 10000th time.
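To see how the 'at' counter behaves without touching a database, here
is a minimal sketch that replaces the (commit) with a plain counter
('countFlushes' is a made-up name). It uses a fresh (cons 0 10) cell
on each call so the result is repeatable:

```picolisp
# Count how often the body of 'at' runs during 25 iterations
# when it is set to fire on every 10th call.
(de countFlushes ()          # hypothetical name, for illustration only
   (let (N 0  Cnt (cons 0 10))
      (do 25
         (at Cnt (inc 'N)) )  # the body runs on the 10th and 20th pass
      N ) )

(println (countFlushes))
```

In 'addFile' the cell (0 . 10000) is a literal in the function body.
As 'at' increments it destructively, the count carries over from one
call of 'addFile' to the next, which is what makes the commit happen
only on every 10000th file.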

The same applies to the backup function:


> (de Backup (rootPath)
>    (let obj1
>       (request '(+Backup)
>           'name (stamp)
>           'startDT (stamp)
>           'basePath rootPath
>           'hostName (host "localhost") )
>       (put! *DB 'currentBackup obj1)
>       # now, walk the path
>       (let Dir rootPath
>          (recur (Dir)
>             (for F (dir Dir)
>                (let Path (pack Dir "/" F)
>                   (addFile obj1 Path)
>                   # note: change this test: it considers a link to a dir as a dir!
>                   (if (=T (car (info Path)))
>                      (recurse Path) ) ) ) ) )
>       (length (get obj1 'files)) ) )

Avoiding 'request' and 'put!' gives basically

   (de backup (RootPath)
      (let Obj1
         (new (db: +Backup) '(+Backup)
            'name (stamp)
            'startDT (stamp)
            'basePath RootPath
            'hostName (host "localhost") )
         (put *DB 'currentBackup Obj1)
         (commit)
         # now, walk the path
         (let Dir RootPath
            (recur (Dir)
               (for F (dir Dir)
                  (let Path (pack Dir "/" F)
                     (addFile Obj1 Path)
                     # note: change this test: it considers a link to a dir as a dir!
                     (if (=T (car (info Path)))
                        (recurse Path) ) ) ) ) )
         (commit)
         (count (tree 'backups '+File)) ) )

: (bench (backup "/home"))
0.460 sec
-> 7125

Cheers,
- Alex