Re: [sqlalchemy] storing a large file into a LargeBinary question

2013-02-13 Thread Andre Charbonneau
Thanks for the feedback, Michael.  Lots of good information in there.
I will read up on buffer() and memoryview() and also on custom
SQLAlchemy types.

Thanks again,
  Andre

On 13-02-12 04:33 PM, Michael Bayer wrote:
> On Feb 12, 2013, at 3:22 PM, Andre Charbonneau
> andre.charbonn...@nrc-cnrc.gc.ca wrote:
>
>> Greetings everyone,
>> I have a piece of code in a web app where I need to store a large
>> binary file (an uploaded file stored on disk by the Apache server)
>> into an object's LargeBinary attribute.
>>
>> That's pretty easy to do with syntax like:
>>
>>    myobject.uploaded_file = xyz.file.read()
>>
>> The problem is that I don't want to load the entire file into memory
>> when I set the LargeBinary attribute.
>>
>> If my understanding is correct, the above call will first cause the
>> entire content of the uploaded file to be loaded into memory, and
>> then that data is assigned to the myobject.uploaded_file LargeBinary
>> attribute.  Correct?
>>
>> (Then SQLAlchemy eventually issues the INSERT statement to store the
>> data in the database... but I don't really know how the data
>> transfer is done...)
>>
>> I have tried to find another way of passing the data to the
>> LargeBinary attribute that would not load the entire file into
>> memory at once but instead stream it in chunks during the INSERT
>> statement, and I was not able to find anything. :-(
>
> In the old days these streaming binary interfaces were common, but as
> memory has become plentiful you don't see them used much anymore.
> Even with systems like Oracle, you see client libraries that rely on
> setting the allowed memory size to be bigger than the largest value
> you need to store.
>
> psycopg2 does work with buffer() and memoryview() objects as the
> input to a bytea column, and you could send these in as arguments,
> which SQLAlchemy should pass through (or if not, it's easy to make a
> custom type that passes them through).  These objects don't appear to
> work around having to load the data into memory, though; they just
> make memory usage more efficient by removing the need for it to be
> copied internally.  I'm not familiar enough with them to know whether
> they support some way to stream from a file handle or not.
>
> There's also a facility in PostgreSQL and psycopg2 that I hadn't
> previously heard of, called the large object system, which stores the
> data in an entirely separate table, pg_largeobject.  Dropping into
> psycopg2, you can store and retrieve these objects using the lobject
> interface:
>
> http://initd.org/psycopg/docs/connection.html#connection.lobject
>
> As far as getting that data into your table, it seems you'd need to
> link to the OID of your large object rather than using bytea:
> http://www.postgresql.org/docs/current/static/lo-funcs.html .  So
> you'd need to forgo the use of bytea.  Again, SQLAlchemy types could
> be created which transparently perform these tasks against the OID.
>
> I'd ask on the psycopg2 list which feature they recommend, and I'm
> betting they will likely say that memory is very cheap and plentiful
> these days and you should just assume the data will fit into memory.




-- 
André Charbonneau
Research Computing Support Analyst
Shared Services Canada | National Research Council Canada
Services partagés Canada | Conseil national de recherches Canada
100 Sussex Drive | 100, promenade Sussex
Ottawa, Ontario  K1A 0R6
Canada
andre.charbonn...@ssc-spc.gc.ca
Telephone | Téléphone:  613-993-3129





[sqlalchemy] storing a large file into a LargeBinary question

2013-02-12 Thread Andre Charbonneau
Greetings everyone,
I have a piece of code in a web app where I need to store a large binary
file (an uploaded file stored on disk by the Apache server) into an
object's LargeBinary attribute.

That's pretty easy to do with syntax like:

   myobject.uploaded_file = xyz.file.read()


The problem is that I don't want to load the entire file into memory
when I set the LargeBinary attribute.

If my understanding is correct, the above call will first cause the
entire content of the uploaded file to be loaded into memory, and then
that data is assigned to the myobject.uploaded_file LargeBinary
attribute.  Correct?

(Then SQLAlchemy eventually issues the INSERT statement to store the
data in the database... but I don't really know how the data transfer
is done...)

I have tried to find another way of passing the data to the LargeBinary
attribute that would not load the entire file into memory at once but
instead stream it in chunks during the INSERT statement, and I was not
able to find anything. :-(

Has anyone managed to implement something like this before, or does
anyone know where I can read more about possible ways of doing this
with SQLAlchemy?


Thanks a lot,
  Andre



-- 
André Charbonneau
Research Computing Support Analyst
Shared Services Canada | National Research Council Canada
Services partagés Canada | Conseil national de recherches Canada
100 Sussex Drive | 100, promenade Sussex 
Ottawa, Ontario  K1A 0R6
Canada
andre.charbonn...@ssc-spc.gc.ca
Telephone | Téléphone:  613-993-3129





Re: [sqlalchemy] storing a large file into a LargeBinary question

2013-02-12 Thread Andre Charbonneau
One piece of extra information I forgot to mention in my original email
is that the myobject.uploaded_file LargeBinary column attribute maps to
a PostgreSQL bytea column.
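
For reference, the mapping looks roughly like this (a trimmed-down
sketch; the class and table names here are made up, not the real ones):

    from sqlalchemy import Column, Integer, LargeBinary
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class MyObject(Base):
        __tablename__ = 'myobjects'
        id = Column(Integer, primary_key=True)
        uploaded_file = Column(LargeBinary)  # rendered as bytea on PostgreSQL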

Best regards,
Andre

On 13-02-12 03:22 PM, Andre Charbonneau wrote:
> Greetings everyone,
> I have a piece of code in a web app where I need to store a large
> binary file (an uploaded file stored on disk by the Apache server)
> into an object's LargeBinary attribute.
>
> That's pretty easy to do with syntax like:
>
>    myobject.uploaded_file = xyz.file.read()
>
> The problem is that I don't want to load the entire file into memory
> when I set the LargeBinary attribute.
>
> If my understanding is correct, the above call will first cause the
> entire content of the uploaded file to be loaded into memory, and then
> that data is assigned to the myobject.uploaded_file LargeBinary
> attribute.  Correct?
>
> (Then SQLAlchemy eventually issues the INSERT statement to store the
> data in the database... but I don't really know how the data transfer
> is done...)
>
> I have tried to find another way of passing the data to the
> LargeBinary attribute that would not load the entire file into memory
> at once but instead stream it in chunks during the INSERT statement,
> and I was not able to find anything. :-(
>
> Has anyone managed to implement something like this before, or does
> anyone know where I can read more about possible ways of doing this
> with SQLAlchemy?
>
> Thanks a lot,
>   Andre






-- 
André Charbonneau
Research Computing Support Analyst
Shared Services Canada | National Research Council Canada
Services partagés Canada | Conseil national de recherches Canada
100 Sussex Drive | 100, promenade Sussex
Ottawa, Ontario  K1A 0R6
Canada
andre.charbonn...@ssc-spc.gc.ca
Telephone | Téléphone:  613-993-3129





Re: [sqlalchemy] storing a large file into a LargeBinary question

2013-02-12 Thread Michael Bayer

On Feb 12, 2013, at 3:22 PM, Andre Charbonneau
andre.charbonn...@nrc-cnrc.gc.ca wrote:

> Greetings everyone,
> I have a piece of code in a web app where I need to store a large
> binary file (an uploaded file stored on disk by the Apache server)
> into an object's LargeBinary attribute.
>
> That's pretty easy to do with syntax like:
>
>    myobject.uploaded_file = xyz.file.read()
>
> The problem is that I don't want to load the entire file into memory
> when I set the LargeBinary attribute.
>
> If my understanding is correct, the above call will first cause the
> entire content of the uploaded file to be loaded into memory, and then
> that data is assigned to the myobject.uploaded_file LargeBinary
> attribute.  Correct?
>
> (Then SQLAlchemy eventually issues the INSERT statement to store the
> data in the database... but I don't really know how the data transfer
> is done...)
>
> I have tried to find another way of passing the data to the
> LargeBinary attribute that would not load the entire file into memory
> at once but instead stream it in chunks during the INSERT statement,
> and I was not able to find anything. :-(

In the old days these streaming binary interfaces were common, but as
memory has become plentiful you don't see them used much anymore.  Even
with systems like Oracle, you see client libraries that rely on setting
the allowed memory size to be bigger than the largest value you need to
store.

psycopg2 does work with buffer() and memoryview() objects as the input
to a bytea column, and you could send these in as arguments, which
SQLAlchemy should pass through (or if not, it's easy to make a custom
type that passes them through).  These objects don't appear to work
around having to load the data into memory, though; they just make
memory usage more efficient by removing the need for it to be copied
internally.  I'm not familiar enough with them to know whether they
support some way to stream from a file handle or not.
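
For what it's worth, here's a minimal sketch of that idea using mmap
(untested; the file path and session are placeholders, and whether the
stock LargeBinary type passes the object through unchanged may depend
on the SQLAlchemy version in use):

    import mmap

    # Map the uploaded file into the process address space instead of
    # reading it into a Python string; wrapping the map in buffer()
    # avoids the extra in-memory copy, though the data still has to be
    # materialized when the INSERT parameters are sent.
    with open('/tmp/uploaded.bin', 'rb') as f:
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        myobject.uploaded_file = buffer(mapped)  # memoryview(mapped) on Python 3
        session.add(myobject)
        session.commit()   # psycopg2 reads the buffer during execute
        mapped.close()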

There's also a facility in PostgreSQL and psycopg2 that I hadn't
previously heard of, called the large object system, which stores the
data in an entirely separate table, pg_largeobject.  Dropping into
psycopg2, you can store and retrieve these objects using the lobject
interface:

http://initd.org/psycopg/docs/connection.html#connection.lobject
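
A minimal sketch of streaming an upload into a large object in chunks,
dropping down to the raw DBAPI connection (untested; the file path and
chunk size are made up, and large objects must be used inside a
transaction):

    # Get the raw psycopg2 connection underneath the Session.
    raw_conn = session.connection().connection

    lob = raw_conn.lobject(0, 'wb')      # oid=0 -> allocate a new OID
    with open('/tmp/uploaded.bin', 'rb') as f:
        while True:
            chunk = f.read(64 * 1024)
            if not chunk:
                break
            lob.write(chunk)             # only one chunk in memory at a time
    oid = lob.oid                        # key for retrieving the data later
    lob.close()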

As far as getting that data into your table, it seems you'd need to
link to the OID of your large object rather than using bytea:
http://www.postgresql.org/docs/current/static/lo-funcs.html .  So you'd
need to forgo the use of bytea.  Again, SQLAlchemy types could be
created which transparently perform these tasks against the OID.
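
A rough illustration of that layout (a sketch only; the names are made
up, and a real version would also want to lo_unlink() the large object
when its row is deleted, e.g. via a trigger):

    from sqlalchemy import Column, Integer
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class MyObject(Base):
        __tablename__ = 'myobjects'
        id = Column(Integer, primary_key=True)
        # OID pointing at the data in pg_largeobject, instead of a
        # bytea column holding the bytes inline.
        uploaded_file_oid = Column(Integer)

    myobject.uploaded_file_oid = oid     # oid from the lobject sketch above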

I'd ask on the psycopg2 list which feature they recommend, and I'm
betting they will likely say that memory is very cheap and plentiful
these days and you should just assume the data will fit into memory.

