On 16 Dec 2021, at 7:12, Estevan Rech wrote:

> I have an application that takes pictures of items and uses them as
> evidence in a report.
>
> These images are saved and then used to generate a report.
>
Once you start talking reports and evidence I think audit trail and 
verification a.k.a. “chain of evidence”.
With the images in the database they can be part of your information ecosystem, 
get a datestamp along with checksum and all this included in the audit trail.
All this can be done with the images in the file system and the database holds 
the pointers only… it just may not be as easy to convince the oppositions 
lawyers that it is tamper proof, except the best you can really get is tamper 
evident. 😉

> Each item averages 10 photos and I have about 2 million photos currently,
> with an average growth of 1 million photos over the next year.
>
> I think about using it within the database because of the ease of selecting
> records with the content of the images (when necessary).
>
That will be the same however you go. The selection will be on the context data 
you enter with the images. There has been no mention of searching within the 
image itself, and this is not something PG can do “out of the box” anyway.

> I think my biggest concern is to continue to use this strategy for the next
> 2 years and encounter a PostgreSQL limitation or some failure and have to
> migrate database again or have significant loss of images. And on the
> contrary too, if I use it on disk and in 2 years I find some failure or
> file limit in the operating system (currently it's windows server 2016)...
>
The total image data will end up somewhere less than 4Tb if your estimates 
continue to hold. That’s not going to stress PG. It’s not such a big deal on a 
modern home computer either, but I can’t make any informed comment on Windows 
anything. Realistically, once the data is in the database the whole can be 
moved to a more capable machine and/or OS if problems start to appear. PG will 
work wherever you deploy it.

> Related to this is the backup service that seems to me that in a disaster,
> I can restore the database relatively quickly if it's in the database. On
> the disk, I believe that the restoration is much slower and I don't think
> it's reliable that all the images are copied/restored.
>
If you have a reliable tested backup system for the database and don’t want to 
add to the workload then get the images into the database. If there are reasons 
why you end up with the images outside the database then, of course, the backup 
process has to be able to keep everything together and guarantee a full 
restore. Doable as well, but it does add some complexity.

If I am reading this correctly the use case is one of receiving data in the 
form of photos and field reports, entering the report items associated with the 
photos into the information system, generating a report (with embedded photos), 
and keeping records so disputes can be handled. This is not a high volume 
access to the images so there is no real need for optimum filesystem speed to 
serve the images… keeping them in the database as bytea is perfectly workable 
and will work for data volumes well beyond the current estimates.

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a 
well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920


Reply via email to