Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Nikolay Moskvichev


This doesn't answer your question, but I thought I'd throw my opinion in 
anyway.


My personal view is that in general, binary files have no place in 
databases. Filesystems are for files, databases are for data. My design 
choice is to store the files in a fileystem and use the database to hold 
metadata as well as a pointer to the file.




Yes, current design exactly the same.

If you *must* put files into the database, then you can do so, and PG 
will handle that many files of those sizes with ease. For all intents 
and purposes, PG can store an unlimited number of files. You're far more 
likely to run into walls in the form of limitations in your disk I/O 
system then limitations in what PG will handle.



Thanks

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Nikolay Moskvichev

Harvey, Allan AC пишет:

I find it fine. Bit different usage though.
I store about 200 50MB items.


Thanks

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Nikolay Moskvichev


You don't like filesystems ?

You know file system which supports SQL, referential integrity, and 
managed transactions ?


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Nikolay Moskvichev

Naz Gassiep пишет:
This doesn't answer your question, but I thought I'd throw my opinion in 
anyway.


My personal view is that in general, binary files have no place in 
databases. Filesystems are for files, databases are for data. My design 
choice is to store the files in a fileystem and use the database to hold 
metadata as well as a pointer to the file.




By the way, not exactly the same case, but just some pros and cons:

http://en.wikibooks.org/wiki/Programming:WebObjects/Web_Applications/Development/Database_vs_Filesystem


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Nikolay Moskvichev

Postgres User пишет:

I recently heard from Josh Berkus that at least one major CMS
application uses Postgres to store entire HTML pages (including image
files) in order to support full versioning.

As a general rule, I prefer not to store BLOBS in a DB- I'd rather
leave the BLOB in the file system and let the db save only a pointer


Yes, current solution uses the similar approach


to it.  However, if you need to store BLOBs in a database, Postgres is
one of the better platforms for it.


Thanks

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Peter Wilson
Nikolay Moskvichev wrote:
> Hi All!
> 
> Question is : How suitable PG for storing about 2 000 000 binary files
> 0,5-2,0 Mb size each ? It is not planned the big number of clients or a
> plenty of updatings. Like photoalbum on local host.

In our CMS we store all page data in the database - either text for HTML pages
or the images and other truly binary data.

I agree with most of the comments on why *not* to store binary application data
in the file system, but I'd add the following :

For scalability, we use a very high performance database server (battery backed
RAID controller, lots of disks, lots of RAM, processors etc). Virtually the only
thing it runs is Postgres. This connects to a number of 'front-end' web servers.
 These are lower performance, cheaper boxes than the database. Generally 1U,
single disk, not RAID etc. The web-servers run Apache and the processor
intensive stuff like server-side JavaScript.

A load balancer shares web traffic across the front-end servers and can detect
if any of those fail and switch them out of servicing requests.

The front-end servers connect to the database server over gigabit Ethernet to
cut latency to a minimum.

We've found that the single high-spec database server is more than capable of
servicing quite a lot of front-end web servers.

Now in that environment, if you were to choose to store things in the file
system, not only have you got the issue of synchronising file system with
database, but of synchronising all the file systems.

You could get round some of those issues by keeping all the files on one system
and using an NFS share or some complex FTP system or something, but actually
it's a lot easier to hold the data in the database.

So - when you're developing your application it's worth thinking about what
happens as it starts to get busier. What's your route to scaling? Many web
applications are written to work on a single machine with no thought to what
happens when that reaches the limit, other than get a bigger server.

All the best
Peter Wilson
--
http://www.whitebeam.org
http://www.yellowhawk.co.uk
---

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Storing blobs in PG DB

2007-04-06 Thread Dave Page

Thomas Kellerer wrote:

Merlin Moncure wrote on 05.04.2007 23:24:

I think most reasons why not to store binaries in the
database boil down to performance.


Having implemented an application where the files were stored in the 
filesystem instead of the database I have to say, with my experience I 
would store the files in the DB the next time. Once the number of files 
in a directory exceeds a certain limit, this directory is very hard to 
handle.


Things like "dir", or "ls" or listing the contents through a FTP 
connection become extremely slow (using HP/UX as well as Windows).


This is very true - I've ended up with data stores containing directory 
hierarchys to handle this issue:


1/
  1/
  ...
  16/
2/
  1/
  ...
  16/
3/
...
16/

And so on. The more files, the more directories. The files are then 
stored in the lower level directories using an appropriate algorithm to 
distribute them fairly equally.


And you have to backup only _one_ source (the database), not two. Moving 
the data around from system a to system b (e.g. staging (windows) -> 
production (HP/UX)) is a lot easier when you can simply backup and 
restore the database (in our case it was an Oracle database, but this 
would be the same for PG)


Well this is the big problem - on a busy system your database backup can 
easily become out of sync with your filesystem backup, coupled with 
which, you have no automatic transactional control over anything your 
store in the file system.


Consequently, the more recent systems I've built have stored the blobs 
in PostgreSQL.


Regards, Dave

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread William Garrison

lol.

yeah, I meant binary blobs. :-)

Thomas Kellerer wrote:

William Garrison wrote on 06.04.2007 00:22:
I have actually never stored data in the database. 


Hmm, funny statement somehow ;)


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread Thomas Kellerer

William Garrison wrote on 06.04.2007 00:22:
I have actually never stored data in the database. 


Hmm, funny statement somehow ;)


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread William Garrison
I have actually never stored data in the database.  But in a recent 
project I've realized it might have been smart.  We store a terabytes of 
data on the file system, and many times I would love to have an ACID 
compliant file system.  For example, if I delete an entry, I need to 
delete it from disk and from the database.  How can I be sure that was 
transactional?  Or if I append data to the file, and then update the 
database.  What then?  I wind-up writing "tricky" code that does stuff 
like renames a file, updates the DB, and renames it back if there is an 
error in an attempt to fake-out atomicity and transactions.


Of course, I may have come-up with even more issues if the company put 
this data into a SQL server.  Who knows.


Where exactly does PostgreSQL put large blobs?  Does it ensure ACID 
compliance if I add a 2GB blob in a column?


Merlin Moncure wrote:

On 4/5/07, Listmail <[EMAIL PROTECTED]> wrote:


> My personal view is that in general, binary files have no place in
> databases. Filesystems are for files, databases are for data. My design
> choice is to store the files in a fileystem and use the database to 
hold

> metadata as well as a pointer to the file.
>
> If you *must* put files into the database, then you can do so, and PG
> will handle that many files of those sizes with ease. For all intents
> and purposes, PG can store an unlimited number of files. You're far 
more

> likely to run into walls in the form of limitations in your disk I/O
> system then limitations in what PG will handle.

And you can't backup with rsync...


no, but you can do incrementals with PITR, which is just as good (if
not better) than rsync because you are backing up your database
'indexer' and binaries in one swoop...so the backup argument doesn't
fly, imo.

imo, sql is a richer language for storing and extracting any type of
data, binaries included, than hierarchal filesystem style
organization.  I think most reasons why not to store binaries in the
database boil down to performance.

merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread Thomas Kellerer

Merlin Moncure wrote on 05.04.2007 23:24:

I think most reasons why not to store binaries in the
database boil down to performance.


Having implemented an application where the files were stored in the filesystem 
instead of the database I have to say, with my experience I would store the 
files in the DB the next time. Once the number of files in a directory exceeds a 
certain limit, this directory is very hard to handle.


Things like "dir", or "ls" or listing the contents through a FTP connection 
become extremely slow (using HP/UX as well as Windows).


And you have to backup only _one_ source (the database), not two. Moving the 
data around from system a to system b (e.g. staging (windows) -> production 
(HP/UX)) is a lot easier when you can simply backup and restore the database (in 
our case it was an Oracle database, but this would be the same for PG)


Thomas


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread Merlin Moncure

On 4/5/07, Listmail <[EMAIL PROTECTED]> wrote:


> My personal view is that in general, binary files have no place in
> databases. Filesystems are for files, databases are for data. My design
> choice is to store the files in a fileystem and use the database to hold
> metadata as well as a pointer to the file.
>
> If you *must* put files into the database, then you can do so, and PG
> will handle that many files of those sizes with ease. For all intents
> and purposes, PG can store an unlimited number of files. You're far more
> likely to run into walls in the form of limitations in your disk I/O
> system then limitations in what PG will handle.

And you can't backup with rsync...


no, but you can do incrementals with PITR, which is just as good (if
not better) than rsync because you are backing up your database
'indexer' and binaries in one swoop...so the backup argument doesn't
fly, imo.

imo, sql is a richer language for storing and extracting any type of
data, binaries included, than hierarchal filesystem style
organization.  I think most reasons why not to store binaries in the
database boil down to performance.

merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread Listmail


My personal view is that in general, binary files have no place in  
databases. Filesystems are for files, databases are for data. My design  
choice is to store the files in a fileystem and use the database to hold  
metadata as well as a pointer to the file.


If you *must* put files into the database, then you can do so, and PG  
will handle that many files of those sizes with ease. For all intents  
and purposes, PG can store an unlimited number of files. You're far more  
likely to run into walls in the form of limitations in your disk I/O  
system then limitations in what PG will handle.


And you can't backup with rsync...

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread Naz Gassiep
This doesn't answer your question, but I thought I'd throw my opinion in 
anyway.


My personal view is that in general, binary files have no place in 
databases. Filesystems are for files, databases are for data. My design 
choice is to store the files in a fileystem and use the database to hold 
metadata as well as a pointer to the file.


If you *must* put files into the database, then you can do so, and PG 
will handle that many files of those sizes with ease. For all intents 
and purposes, PG can store an unlimited number of files. You're far more 
likely to run into walls in the form of limitations in your disk I/O 
system then limitations in what PG will handle.


- Naz.

Nikolay Moskvichev wrote:

Hi All!

Question is : How suitable PG for storing about 2 000 000 binary files 
0,5-2,0 Mb size each ? It is not planned the big number of clients or 
a plenty of updatings. Like photoalbum on local host.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Storing blobs in PG DB

2007-04-04 Thread Postgres User

I recently heard from Josh Berkus that at least one major CMS
application uses Postgres to store entire HTML pages (including image
files) in order to support full versioning.

As a general rule, I prefer not to store BLOBS in a DB- I'd rather
leave the BLOB in the file system and let the db save only a pointer
to it.  However, if you need to store BLOBs in a database, Postgres is
one of the better platforms for it.

On 4/3/07, Nikolay Moskvichev <[EMAIL PROTECTED]> wrote:

Hi All!

Question is : How suitable PG for storing about 2 000 000 binary files
0,5-2,0 Mb size each ? It is not planned the big number of clients or a
plenty of updatings. Like photoalbum on local host.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Storing blobs in PG DB

2007-04-04 Thread Harvey, Allan AC
I find it fine. Bit different usage though.
I store about 200 50MB items.

Allan

> Hi All!
>
> Question is : How suitable PG for storing about 2 000 000 
binary files  
> 0,5-2,0 Mb size each ? It is not planned the big number of 
clients or a  
> plenty of updatings. Like photoalbum on local host.


The material contained in this email may be confidential, privileged or 
copyrighted. If you are not the intended recipient, use, disclosure or copying 
of this information is prohibited. If you have received this document in error, 
please advise the sender and delete the document. Neither OneSteel nor the 
sender accept responsibility for any viruses contained in this email or any 
attachments.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Storing blobs in PG DB

2007-04-04 Thread Listmail


You don't like filesystems ?

On Wed, 04 Apr 2007 07:44:57 +0200, Nikolay Moskvichev <[EMAIL PROTECTED]>  
wrote:



Hi All!

Question is : How suitable PG for storing about 2 000 000 binary files  
0,5-2,0 Mb size each ? It is not planned the big number of clients or a  
plenty of updatings. Like photoalbum on local host.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


[GENERAL] Storing blobs in PG DB

2007-04-04 Thread Nikolay Moskvichev

Hi All!

Question is : How suitable PG for storing about 2 000 000 binary files 
0,5-2,0 Mb size each ? It is not planned the big number of clients or a 
plenty of updatings. Like photoalbum on local host.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq