Thanks for the feedback Michael. Lots of good information in there. I will read up on buffer() and memoryview() and also on custom SQLAlchemy types.
Thanks again, Andre On 13-02-12 04:33 PM, Michael Bayer wrote: > On Feb 12, 2013, at 3:22 PM, Andre Charbonneau > <andre.charbonn...@nrc-cnrc.gc.ca> wrote: > >> Greetings everyone, >> I have a piece of code in a web app where I need to store a large binary >> file (uploaded file stored on disk by Apache server), into an object's >> LargeBinary attribute. >> >> That's pretty easy to do with a syntax like: >> >> myobject.uploaded_file = xyz.file.read() >> >> >> The problem is that I don't want to load the entire file into memory >> when I set the LargeBinary attribute. >> >> >> >> If my understanding is correct, the above call will first cause the >> entire content of the uploaded file to be loaded into memory and then >> that is assigned to the myobject.uploaded_file LargeBinary attribute. >> Correct? >> >> (Then when sqlalchemy eventually issues the INSERT statement to store >> the data in the database... But then I don't really know how the data >> transfer is done...) >> >> >> I have tried to find another way of passing the data to the LargeBinary >> object that would not have to load the entire file into memory at once, >> but stream it in chunks during the INSERT statement, but I was not able >> to find anything. :-( > In the old days these streaming binary interfaces were common, but as memory > has become plentiful you don't see them used anymore. Even systems like > Oracle, you see client libraries reliant upon having to set the allowed size > of memory to be bigger than the largest value you need to store. > > psycopg2 does work with "buffer()" and "memoryview()" objects as the input to > a "bytea" column, and you could send these in as arguments where SQLAlchemy > should pass them through (or if not, its easy to make a custom type that > passes it through). Though these objects don't appear to work around having > to load the data into memory, they just make memory usage more efficient by > removing the need for it to be copied internally. I'm not intimately > familiar with them enough to know if they support some way to "stream" from a > file handle or not. > > There's also a facility I've not previously heard of in Postgresql and > psycopg2 called the "large object" system, which appears to be an entirely > separate table "pg_largeobject" that stores them. Dropping into psycopg2, > you can store and retrieve these objects using the object interface: > > http://initd.org/psycopg/docs/connection.html#connection.lobject > > as far as how to get that data into your table, it seems like you'd need to > link to the OID of your large object, rather than using bytea: > http://www.postgresql.org/docs/current/static/lo-funcs.html . So you'd need > to forego the usage of bytea. Again SQLAlchemy types could be created which > transparently perform these tasks against the OID. > > I'd ask on the psycopg2 list as to which feature they recommend, and I'm > betting they will likely say that memory is very cheap and plentiful these > days and you should just assume the data will be fit into memory. > > -- André Charbonneau Research Computing Support Analyst Shared Services Canada | National Research Council Canada Services partagés Canada | Conseil national de recherches Canada 100 Sussex Drive | 100, promenade Sussex Ottawa, Ontario K1A 0R6 Canada andre.charbonn...@ssc-spc.gc.ca Telephone | Téléphone: 613-993-3129 -- André Charbonneau Research Computing Support Analyst Shared Services Canada | National Research Council Canada Services partagés Canada | Conseil national de recherches Canada 100 Sussex Drive | 100, promenade Sussex Ottawa, Ontario K1A 0R6 Canada andre.charbonn...@ssc-spc.gc.ca Telephone | Téléphone: 613-993-3129 -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.