On Jan 27, 2011, at 5:11 AM, Eduardo wrote:

> Dear all,
> I am writing an application to scan a directory tree and store its
> metadata in a database.
> For each directory I create a separate process in which the scanning
> and metadata insertion are performed.
> Now I have the following problems:
> 1) I am forced to start a session in each process and bind it to an
> engine created with
>       "engine = create_engine(dbfile, poolclass=NullPool)"
> With any other poolclass (including the default) I get an error that
> the number of connections is exceeded for non-superusers. Is it common
> practice to handle this with NullPool, or is there another way around
> it? How does the NullPool option affect the performance of the DB?

There is no difference between NullPool and QueuePool in the number of 
connections used, except that QueuePool can be configured to put a hard 
limit on how many are in use, and that if you open lots of connections, 
QueuePool will leave 5 of them hanging around by default whereas NullPool 
will not. So it sounds like you are opening too many connections in your 
child processes. Set pool_size=1 and max_overflow=0 on the engines in 
those child procs and that will ensure just one connection per subprocess.
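
Here's a minimal sketch of that setup, assuming each child process builds 
its own engine after forking; scan_and_store() is a hypothetical stand-in 
for your scanning logic:

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    def child_proc(dbfile, directory):
        # build the engine inside the child so no pooled connections
        # are shared across the fork boundary
        engine = create_engine(dbfile, pool_size=1, max_overflow=0)
        Session = sessionmaker(bind=engine)
        session = Session()
        try:
            scan_and_store(session, directory)  # hypothetical: your scan + insert logic
            session.commit()
        finally:
            session.close()
            engine.dispose()  # release the single pooled connection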


> 2)
> I have a loop in which various operations (adding, deleting, updating
> instances) are performed:
> for elem in mydict:
>    .
>    .
>    session.add(someinst)
>    .
>    .
>    session.delete(inst2)
>    .
>    session.refresh(inst3)
> 
> I am concerned about performance. Should I commit changes after each
> operation (add, delete, refresh), after each iteration, or after the
> loop has run its course?

You should commit after the full operation is complete. If you'd like 
results from the operation to become visible as it runs through a large 
number of records, you can use a scheme like committing every 1000 records 
or so. Consider turning off "expire_on_commit" if you reference the same 
objects across multiple batches; otherwise each commit marks everything in 
the session as expired, forcing a reload from the database the next time 
each object is accessed.
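
A rough sketch of that batching scheme, reusing the names from your loop 
(make_instance() is a hypothetical stand-in for however you build your 
objects):

    from sqlalchemy.orm import sessionmaker

    # expire_on_commit=False keeps objects usable across batch commits
    # without triggering a reload on next access
    Session = sessionmaker(bind=engine, expire_on_commit=False)
    session = Session()

    for i, elem in enumerate(mydict):
        session.add(make_instance(elem))  # your add/delete/refresh work
        if (i + 1) % 1000 == 0:
            session.commit()  # commit a batch of 1000 records
    session.commit()  # commit whatever remains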

> Is there any advantage if I create a new session for each operation
> and then close it? Is that how the data gets into the DB faster?

Closing the session means the next one has to load its data all over 
again, so per-operation sessions generally make things slower, not faster. 
You can only tell what work is being done, and which of it may be 
unnecessary, by watching your SQL logs as the operation proceeds.
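
To watch the SQL as it runs, either pass echo=True to create_engine() or 
turn up the standard logging level for the "sqlalchemy.engine" logger:

    import logging

    logging.basicConfig()
    logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)

    # equivalently, when creating the engine:
    # engine = create_engine(dbfile, echo=True)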
