Mike,

All the code I used for the demo is on the developers conference CD along with 
the PowerPoint presentation.

Here is how I do it.

1. You have to use whatever method works for you to get your document into a 
file.  We use inexpensive HP scanners with document feeders on them.  We set 
them to default to storing the pages as 200 dpi JPG files named temp.jpg.  
The HP software then automatically numbers multiple pages: temp.jpg, 
temp2.jpg, temp3.jpg...

2. I have a program in R:base that will read the files and load them into the 
database.

3. Storing the files in R:base.  I use 3 tables:
docs, docs_doc, and docs_loc.
The docs table holds an autonumbered id called docid (int), scan_date (date), 
file_type (text 3), doc_loc (text 8), last_access (date), and other 
information linking that record to the appropriate record in the database.
The docs_doc table has 2 columns called docid and doc_image of type varbit.  
This is where the actual image lives.
The docs_loc table is used to keep track of the locations of the databases 
that store the documents after you move them from your primary database.
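
If a picture helps, here is the shape of those 3 tables sketched in Python 
with SQLite standing in for R:base.  This is only an illustration of the 
design, not my real schema: varbit becomes BLOB, the types are approximate, 
and the client_id column is just an example of the linking information.

    import sqlite3

    con = sqlite3.connect("docstore.db")        # stand-in for the main database
    con.executescript("""
    CREATE TABLE IF NOT EXISTS docs (
        docid       INTEGER PRIMARY KEY,        -- autonumbered id
        scan_date   DATE,
        file_type   TEXT,                       -- 'jpg', 'pdf', 'doc', ...
        doc_loc     TEXT,                       -- which database holds the blob
        last_access DATE,
        client_id   INTEGER                     -- example link to a client record
    );
    CREATE TABLE IF NOT EXISTS docs_doc (
        docid     INTEGER,
        doc_image BLOB                          -- the actual image (varbit in R:base)
    );
    CREATE TABLE IF NOT EXISTS docs_loc (
        doc_loc  TEXT,                          -- database name, matches docs.doc_loc
        location TEXT                           -- where that database lives
    );
    """)
    con.commit()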

The problem with storing images in your database is that your #4 file gets too 
large, so I create other databases and move records from the docs_doc table 
into them.  I keep track of which database each document is in in the docs 
table, and I keep track of where each database is in the docs_loc table.

OK, so you scan your document to files c:\temp.jpg, c:\temp2.jpg, c:\temp3.jpg.
You select the client (or whatever) record in your database this document 
relates to.  You run a program that reads the files, loads the docs table 
(getting a docid and linking to your client record), loads each image into 
docs_doc, and then deletes your temp JPG files after loading them.
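
My real load program is an R:base command file (it's on the conference CD), 
but the logic looks about like this Python sketch against the SQLite stand-in 
above.  The client id and the 'MAIN' marker are made up for the example:

    import glob, os, sqlite3
    from datetime import date

    con = sqlite3.connect("docstore.db")
    client_id = 1234                         # the client record you selected
    today = date.today().isoformat()

    for path in sorted(glob.glob(r"c:\temp*.jpg")):
        # one docs record per page; 'MAIN' means the blob is still local
        cur = con.execute(
            "INSERT INTO docs (scan_date, file_type, doc_loc, last_access, client_id)"
            " VALUES (?, 'jpg', 'MAIN', ?, ?)", (today, today, client_id))
        docid = cur.lastrowid                # the autonumbered id
        with open(path, "rb") as f:          # the image itself goes in docs_doc
            con.execute("INSERT INTO docs_doc (docid, doc_image) VALUES (?, ?)",
                        (docid, f.read()))
        os.remove(path)                      # delete the temp jpg after loading

    con.commit()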

Now you have 3 records in the docs table and 3 records in the docs_doc table.

To look at your document, you pull up your client and do a CHOOSE command on 
the docs table for that client.  You get the docid, file_type, and doc_loc 
from the docs table.  You look in the docs_doc table to see if your blob is 
there.  If it is, you dump it out to some file, say c:\temp$$$.jpg.  Next you 
either ZIP out of R:base or use LAUNCH, and use whatever graphic viewer you 
want to look at the thing.  (You keep track of the file extension so you can 
programmatically decide which viewer to use.  You can store any file you want 
in R:base.  I store jpg, pdf, and doc files, just to name a few.)  If your 
document is not in the docs_doc table, you use the doc_loc field, look that 
up in the docs_loc table, connect to that database, dump the blob to a file, 
load the file into your main database, and then follow the procedure above.
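
Sketched the same way, the lookup goes like this.  The docid would really 
come from the CHOOSE menu, and os.startfile (Windows-only) stands in for the 
ZIP/LAUNCH step:

    import os, sqlite3

    con = sqlite3.connect("docstore.db")
    docid = 42                               # really picked from the CHOOSE list

    file_type, doc_loc = con.execute(
        "SELECT file_type, doc_loc FROM docs WHERE docid = ?", (docid,)).fetchone()

    row = con.execute("SELECT doc_image FROM docs_doc WHERE docid = ?",
                      (docid,)).fetchone()
    if row is None:                          # blob not local: go get it
        (location,) = con.execute(
            "SELECT location FROM docs_loc WHERE doc_loc = ?", (doc_loc,)).fetchone()
        sec = sqlite3.connect(location)      # connect to the secondary database
        row = sec.execute("SELECT doc_image FROM docs_doc WHERE docid = ?",
                          (docid,)).fetchone()
        con.execute("INSERT INTO docs_doc (docid, doc_image) VALUES (?, ?)",
                    (docid, row[0]))         # load it back into the main database
        con.commit()

    out = r"c:\temp$$$." + file_type         # the extension decides the viewer
    with open(out, "wb") as f:
        f.write(row[0])
    os.startfile(out)                        # hand it to the viewer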

I load the record back into the main database because of the way we do 
business.  Usually when someone needs an old file, they may use it 10 times in 
a week and then not use it for a couple of years.

When the #4 file gets over 400 megs, I start moving records out of the 
docs_doc table and load them into secondary databases.  I update the docs 
table with the name of the DB that the record is loaded into.  The docs_loc 
table has 1 record for that database with its location.
Once a document is moved to a secondary database, it is never deleted from it.
I also burn the secondary databases onto CDs to keep off site.  You can move 
those secondary databases around or even take them off line, and you just need 
to modify the 1 record in the docs_loc table.  I don't let the secondary 
databases get over 600 megs so they will fit on CDs.  If I ever need to, I 
could get a CD jukebox and access them from there.  I have not needed that 
because I have a server with a big drive array, and I figure I won't need to 
worry about storage for a couple of years.
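
And the move-out step, sketched the same way.  The secondary database name 
and the 1000-record batch are made up, and moving the least recently accessed 
blobs first is just one reasonable choice:

    import os, sqlite3

    MAIN, SEC = "docstore.db", "docs016"     # SEC = next secondary db (example)

    if os.path.getsize(MAIN) > 400 * 1024 * 1024:    # the over-400-megs test
        con = sqlite3.connect(MAIN)
        sec = sqlite3.connect(SEC + ".db")
        sec.execute("CREATE TABLE IF NOT EXISTS docs_doc "
                    "(docid INTEGER, doc_image BLOB)")

        # move a batch of blobs, least recently accessed first
        rows = con.execute(
            "SELECT d.docid, d.doc_image FROM docs_doc d JOIN docs USING (docid) "
            "ORDER BY docs.last_access LIMIT 1000").fetchall()
        for docid, image in rows:
            sec.execute("INSERT INTO docs_doc VALUES (?, ?)", (docid, image))
            con.execute("DELETE FROM docs_doc WHERE docid = ?", (docid,))
            con.execute("UPDATE docs SET doc_loc = ? WHERE docid = ?", (SEC, docid))

        # 1 docs_loc record says where that database lives
        con.execute("INSERT INTO docs_loc VALUES (?, ?)",
                    (SEC, os.path.abspath(SEC + ".db")))
        sec.commit()
        con.commit()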

This whole procedure is very simple.  I control everything with 3 short 
command files.  One file reads in the JPG files and loads them into the 
database.  The second file pulls the files out of the database and starts the 
viewer, and the third file is used to move files from the production database 
to the secondary databases.  I think I have about 15 secondary databases 
storing around 75,000 documents.

This method works great.  I don't think I would use it if I were going to scan 
100,000 pages a month, but it would still work if you did.  If you were going 
to scan large numbers of documents, you would want a better storage format to 
make the images smaller, but R:base would still work to store everything.

Troy Sosamon
Denver, CO


>===== Original Message From [EMAIL PROTECTED] =====
>Hi all, and esp Troy,
>
>Do you have an outline on how you do the "TROY METHOD" of document storage?  It
>seemed to work so well at the developers conference.....I just didn't want to
>reinvent that wheel!
>
>Mike Sinclair

