Thank you. That sped it up. Inserting 1M records now takes 69 seconds:

(pool "foo.db")
(class +Invoice +Entity)
(rel id (+Key +Number))
(zero N)
(bench (do 1000000 (new (db: +Invoice) '(+Invoice) 'id (inc 'N)) ))
(commit)
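
For comparison, here is a sketch of the same insert using the periodic
commit from the quoted reply below, applied to this +Invoice example. The
chunk size of 10000 is the value suggested there. I have not timed this
variant, so treat it as an illustration rather than a measured result.

```picolisp
(pool "foo.db")
(class +Invoice +Entity)
(rel id (+Key +Number))

(zero N)
(dbSync)                                  # take the DB lock once, up front
(bench
   (do 1000000
      (new (db: +Invoice) '(+Invoice) 'id (inc 'N))
      (at (0 . 10000) (commit)) ) )       # flush every 10000 new objects
(commit 'upd)                             # final commit, notifying other processes
```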

I can work with that. Now I am testing out queries.

? (bench (iter (tree 'id '+Invoice) '((This) (inc 'Z (: id)))))
11.822 sec

? (bench (scan (tree 'id '+Invoice) '((Val Key) (inc 'Z Val))))
4.430 sec

It makes sense that scan would be faster, because it reads the values
directly from the index instead of fetching each object. Is that likely
the fastest way to sum up a number relation?
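
A minimal sketch of the sum via scan, assuming the function passed to
'scan' receives the index key first and the stored value second (so for a
+Key index the first argument is the number itself), and that Z is reset
before each run. With ids 1 through 1000000 the result can be checked
against the closed form N*(N+1)/2.

```picolisp
(zero Z)                                   # reset the accumulator first
(scan (tree 'id '+Invoice)
   '((K Obj) (inc 'Z K)) )                 # K: index key (the id), Obj: the object
(= Z (/ (* 1000000 1000001) 2))            # ids 1..1000000 sum to 500000500000
```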

Thank you
Joe





On Wed, May 30, 2012 at 2:10 AM, Alexander Burger <a...@software-lab.de> wrote:

> On Wed, May 30, 2012 at 12:28:50PM +0700, Henrik Sarvell wrote:
> > Use new and chunk it up:
> >
> >    (dbSync)
> >    (for A As
> >       (at (0 . 1000) (commit 'upd) (prune) (dbSync))
> >       (new (db: +Article) '(+Article) key1 value1 key2 value2 ... ))
> >    (commit 'upd)
> >
> > With new! you are locking and writing every row, so it should only be
> > used in cases where you know you are only inserting one (or maybe very
> > few).
> >
> > Above we create them in memory and write 1000 of them at a time.
> >
> > If you have 12 million you should probably use an even higher number
> > than 1000.
>
> Yes, I usually use 10000. Larger values seem not to bring any further
> improvements, and use too much memory.
>
>
> You can slightly simplify and speed up the above, if you do not need to
> synchronize with other processes during that import (i.e. if other
> processes can wait until the import is done). Then you can omit the
> calls to 'commit' with 'upd' (signaling "done" to other processes) and
> the (dbSync) calls in the loop.
>
> And, at the very end, you could call (prune T) to reset to normal
> behavior, though not doing this will not have any bad effect.
>
> With that, we would have
>
>   (dbSync)
>   (for ...
>       (new (db: +Article) '(+Article) 'key1 <value1> 'key2 <value2> ... )
>       (at (0 . 10000) (commit) (prune)) )
>   (commit 'upd)
>   (prune T)
>
>
> Practically, I would also omit the 'prune' calls, and only insert them
> if I find that the import process uses too much memory (monitor with
> 'top' or 'ps'). This speeds up small imports (which includes an import
> of 12 million objects).
>
> This would simplify the import to
>
>   (dbSync)
>   (for ...
>       (new (db: +Article) '(+Article) 'key1 <value1> 'key2 <value2> ... )
>       (at (0 . 10000) (commit)) )
>   (commit 'upd)
>
> Cheers,
> - Alex