Re: [sqlalchemy] "Proper" way to do processing across entire db?

Victor Ng Thu, 21 Feb 2013 14:44:56 -0800

Um sure. 

That still doesn't answer my question.


I am interested in persisting changes to my db while I am iterating with
yield_per.

 

On Thursday, February 21, 2013 1:03:49 PM UTC-8, A.M. wrote:
>
> On Thu, 21 Feb 2013 12:52:42 -0800 (PST), Victor Ng
> <vicn...@gmail.com> wrote:
> > I do a lot of processing on large amounts of data.
> > 
> > The common pattern we follow is: 
> > 
> > 1. Iterate through a large data set 
> > 2. Do some sort of processing (e.g. NLP steps like tokenization,
> > capitalization, regex parsing, ...)
> > 3. Insert the new result in another table. 
> > 
> > Right now we are doing something like this: 
> > 
> > for x in session.query(Foo).yield_per(10000): 
> >   bar = Bar() 
> >   bar.hello = x.world.lower() 
> >   session.add(bar) 
> >   session.flush() 
> > session.commit() 
>
> Do you really need to flush after making each new Bar? That implies a 
> database round-trip and state sync with SQLAlchemy. 
>
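The point about flushing after every new Bar can be sketched as below. This is only a minimal sketch under stated assumptions, not a tested recommendation: `add_in_batches`, `make_row`, and `batch_size` are hypothetical names, and `session` is assumed to be a SQLAlchemy Session (only its `add`, `flush`, and `commit` methods are used). It flushes once per batch instead of once per object and keeps the single commit at the end, as in the quoted loop.

```python
def add_in_batches(session, rows, make_row, batch_size=1000):
    """Add make_row(x) for every x in rows, flushing once per batch.

    Returns the number of objects added. `session` is assumed to behave
    like a SQLAlchemy Session; `make_row` is a hypothetical callable that
    builds the new object (e.g. a Bar) from a source row (e.g. a Foo).
    """
    pending = 0
    total = 0
    for x in rows:
        session.add(make_row(x))
        pending += 1
        total += 1
        if pending >= batch_size:
            # One flush per batch rather than one per object.
            session.flush()
            pending = 0
    # A single commit at the end, as in the original loop; a SQLAlchemy
    # Session flushes anything still pending as part of commit().
    session.commit()
    return total


# Hypothetical usage with the names from the quoted loop:
#
#   def make_bar(foo):
#       bar = Bar()
#       bar.hello = foo.world.lower()
#       return bar
#
#   add_in_batches(session, session.query(Foo).yield_per(10000), make_bar)
```

Whether a larger or smaller `batch_size` helps is something the profile should decide; the value above is a placeholder.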
> In any case, you should gather a profile to see where/how time is getting 
> spent. SQLAlchemy is a complex framework, so whatever performance 
> assumptions are implied in the code may be wrong. 
>
> Cheers, 
> M 
>
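
The profiling suggestion in the quoted reply could be followed with the standard library alone. A minimal sketch, assuming the processing loop is wrapped in a function: `profile_call` and its `top` parameter are hypothetical names, and only `cProfile`, `pstats`, and `io` from the standard library are used.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func(*args, **kwargs) under cProfile.

    Returns (result, report), where report is a text summary of the
    `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()


# Hypothetical usage: wrap the whole loop in a function and profile it.
#
#   def process_all():
#       for x in session.query(Foo).yield_per(10000):
#           ...
#       session.commit()
#
#   _, report = profile_call(process_all)
#   print(report)
```

The report shows how the time divides between the processing step, SQLAlchemy's own bookkeeping, and the database driver, which is the information needed before changing the flush pattern.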

