Re: High reliability files, writing, reading and maintaining
On Wed, 08 Feb 2006 00:29:16 +0100, rumours say that Xavier Morel
<[EMAIL PROTECTED]> might have written:

>You can also nest Raid arrays, the most common nestings are Raid 01
>(creating Raid1 arrays of Raid0 arrays), Raid 10 (creating Raid0 arrays
>of Raid1 arrays), Raid 50 (Raid0 array of Raid5 arrays), and the "Raids
>for Paranoids", Raid 15 and Raid 51 arrays (creating a Raid5 array of
>Raid1 arrays, or a Raid1 array of Raid5 arrays; both basically mean
>that you're wasting most of your storage space for redundancy
>information, but that the probability of losing any data is extremely
>low).

Nah, too much talk. Better provide images:
http://www.epidauros.be/raid.jpg

--
TZOTZIOY, I speak England very best.
"Dear Paul, please stop spamming us."
The Corinthians
Re: High reliability files, writing, reading and maintaining
John Pote wrote:
> I would wish to secure this data gathering against crashes of the OS,
> hardware failures and power outages.

My first thought when reading this is "SQLite" (with the Python wrappers
PySqlite or APSW). See http://www.sqlite.org where it claims "Transactions
are atomic, consistent, isolated, and durable (ACID) even after system
crashes and power failures", or some of the sections in
http://www.sqlite.org/lockingv3.html which provide more technical
background.

If intending to rely on this for a mission-critical system, one would be
well advised to research independent analyses of those claims.

-Peter
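To make the suggestion concrete, here is a minimal sketch of storing one
sample through the sqlite3 module (bundled with recent Pythons; PySqlite
exposes essentially the same API, and APSW is similar). The database path,
table and column names are only illustrative, not anything the original
poster specified.

import sqlite3

conn = sqlite3.connect("sensors.db")
conn.execute("""CREATE TABLE IF NOT EXISTS samples (
                    sensor_id TEXT,
                    stamp     TEXT,    -- timestamp sent with each sample
                    payload   BLOB)""")
conn.commit()

def store_sample(sensor_id, stamp, payload):
    # 'with conn' wraps the INSERT in a transaction: committed on success,
    # rolled back if anything raises, so a crash never leaves a partial row.
    with conn:
        conn.execute("INSERT INTO samples VALUES (?, ?, ?)",
                     (sensor_id, stamp, payload))

store_sample("s01", "2006-02-08T00:30:00", b"50-100 bytes of sensor data")

With SQLite's default synchronous=FULL setting, the commit does not return
until the data has been pushed to stable storage, which is what backs the
durability claim quoted above.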
Re: High reliability files, writing, reading and maintaining
John Pote wrote:
> Hello, help/advice appreciated.
>
> Background:
> I am writing some web scripts in python to receive small amounts of data
> from remote sensors and store the data in a file. 50 to 100 bytes every 5
> or 10 minutes. A new file for each day is anticipated. Of considerable
> importance is the long term availability of this data and its gathering
> and storage without gaps.
>
> As the remote sensors have little on board storage it is important that a
> web server is available to receive the data. To that end two separately
> located servers will be running at all times and updating each other as
> new data arrives.
>
> I also assume each server will maintain two copies of the current data
> file, only one of which would be open at any one time, and some means of
> indicating if a file write has failed and which file contains valid data.
> The latter is not so important as the data itself will indicate both its
> completeness (missing samples) and its newness because of a time stamp
> with each sample.
>
> I would wish to secure this data gathering against crashes of the OS,
> hardware failures and power outages.
>
> So my request:
> 1. Are there any python modules 'out there' that might help in securely
> writing such files?
> 2. Can anyone suggest a book or two on this kind of file management?
> (These kinds of problems must have been solved in the financial world
> many times).
>
> Many thanks,
>
> John Pote

Others have made recommendations that I agree with: use a REAL database
that supports transactions. Other items you must consider:

1) Don't spend a lot of time engineering your software and then purchase
the cheapest server you can find. Most fault tolerance has to do with
dealing with hardware failures. Eliminate as many single-point-of-failure
devices as possible. If your application requires 99.999% uptime, consider
clustering.

2) Using RAID arrays, multiple controllers, ECC memory, etc. is not cheap,
but fault tolerance requires such investments.

3) Don't forget that power and Internet access are normally the final
single points of failure. It doesn't matter about all the rest if the power
is off for an extended period of time. You will need to host your server(s)
at a hosting facility that has rock-solid Internet pipes and
generator-backed power. It won't do any good to have a kick-ass server and
software that can handle all types of failures if someone knocking over a
power pole outside your office can take you offline.

Hope this info helps.

-Larry Bates
Re: High reliability files, writing, reading and maintaining
"John Pote" <[EMAIL PROTECTED]> writes: > 1. Are there any python modules 'out there' that might help in securely > writing such files. > 2. Can anyone suggest a book or two on this kind of file management. (These > kind of problems must have been solved in the financial world many times). It's a complicated subject and is intimately mixed up with details of the OS and filesystem you're using. The relevant books are books about database implementation. One idea for your situation is use an actual database (e.g. MySQL or PostgreSQL) to store the data, so someone else (the database implementer) will have already dealt with the issues of making sure data is flushed properly. Use one of the Python DbAPI modules to communicate with the database. -- http://mail.python.org/mailman/listinfo/python-list
Re: High reliability files, writing, reading and maintaining
Terry Reedy wrote:
> "John Pote" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> I would wish to secure this data gathering against crashes of the OS,
>
> I have read about people running *nix servers a year or more without
> stopping.

He'd probably want to check the various block-journaling filesystems to
boot (such as Reiser4 or ZFS). Even though they don't reach DB-level data
integrity, they've reached an interesting and certainly useful level of
recovery.

> To transparently write to duplicate disks, look up RAID (but not level 0,
> which I believe does no duplication).

Indeed, Raid0 stores data across several physical drives (striping), Raid1
fully duplicates the data over several physical drives (mirroring), and
Raid5 uses parity checks (which puts it between Raid0 and Raid1) and
requires at least 3 physical drives (Raid0 and Raid1 require 2 or more).

You can also nest Raid arrays, the most common nestings are Raid 01
(creating Raid1 arrays of Raid0 arrays), Raid 10 (creating Raid0 arrays of
Raid1 arrays), Raid 50 (Raid0 array of Raid5 arrays), and the "Raids for
Paranoids", Raid 15 and Raid 51 arrays (creating a Raid5 array of Raid1
arrays, or a Raid1 array of Raid5 arrays; both basically mean that you're
wasting most of your storage space for redundancy information, but that
the probability of losing any data is extremely low).
Re: High reliability files, writing, reading and maintaining
John Pote wrote:
> Hello, help/advice appreciated.
> I am writing some web scripts in python to receive small amounts of data
> from remote sensors and store the data in a file. 50 to 100 bytes every 5
> or 10 minutes. A new file for each day is anticipated. Of considerable
> importance is the long term availability of this data and its gathering
> and storage without gaps.

This looks to me like the kind of thing a database is designed to handle.
File systems under many operating systems have a nasty habit of re-ordering
writes for I/O efficiency, and don't necessarily have the behavior you need
for your application.

The "ACID" criteria for database design ask that operations on the DB be:
    Atomic
    Consistent
    Isolated
    Durable

"Atomic" means that the database always appears as if a transaction has
either happened or not; it is not possible for any transaction to see the
DB with another transaction in a semi-completed state. "Consistent" says
that if you have invariants that are true about the data in the database,
and each transaction preserves the invariants, the database will always
satisfy the invariants. "Isolated" essentially says that no transaction
(such as reading the DB) will be able to tell it is running in parallel
with other transactions (such as reads). "Durable" says that, once a
transaction has been committed, even pulling the plug and restarting the
DBMS should give a database containing exactly those transactions which got
committed, and no pieces of any others. Databases often provide
pre-packaged ways to do backups while the DB is running.

These are the core considerations of database design, so I'd suggest you
consider using a DB for your application. I do note that some of the most
modern operating systems are trying to provide "log-structured file
systems," which may help with the durability of file writes. I understand
there is even an attempt to provide transactional interactions with the
file system, but I'm not sure how far down the line that goes.

--
-Scott David Daniels
[EMAIL PROTECTED]
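If plain daily files are kept anyway, one common technique for an
all-or-nothing file update (a generic sketch with invented names, not
something described in this thread) is to write a complete new copy, fsync
it, and then atomically rename it over the old one, so a crash leaves
either the old file or the new one, never a torn mixture:

import os

def rewrite_day_file(path, records):
    # Write the whole day's records to a temporary file first.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        for rec in records:
            f.write(rec + b"\n")
        f.flush()
        os.fsync(f.fileno())      # force the data out of the OS buffers
    # Atomic on POSIX filesystems: readers see either the old or the new file.
    os.replace(tmp, path)
    # For full durability the containing directory should be fsync'd too;
    # that step is omitted here for brevity.

Rewriting the whole file every few minutes is affordable in this case
because a day of samples is only a few tens of kilobytes.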
Re: High reliability files, writing, reading and maintaining
"John Pote" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > I would wish to secure this data gathering against crashes of the OS, I have read about people running *nix servers a year or more without stopping. > hardware failures To transparently write to duplicate disks, lookup RAID (but not level 0 which I believe does no duplication). >and power outages. The UPSes (uninterruptable power supplies) sold in stores will run a computer about half an hour on the battery. This is long enough to either gracefully shut down the computer or startup a generator. -- http://mail.python.org/mailman/listinfo/python-list
High reliability files, writing, reading and maintaining
Hello, help/advice appreciated.

Background:
I am writing some web scripts in python to receive small amounts of data
from remote sensors and store the data in a file. 50 to 100 bytes every 5
or 10 minutes. A new file for each day is anticipated. Of considerable
importance is the long term availability of this data and its gathering and
storage without gaps.

As the remote sensors have little on board storage it is important that a
web server is available to receive the data. To that end two separately
located servers will be running at all times and updating each other as new
data arrives.

I also assume each server will maintain two copies of the current data
file, only one of which would be open at any one time, and some means of
indicating if a file write has failed and which file contains valid data.
The latter is not so important as the data itself will indicate both its
completeness (missing samples) and its newness because of a time stamp with
each sample.

I would wish to secure this data gathering against crashes of the OS,
hardware failures and power outages.

So my request:
1. Are there any python modules 'out there' that might help in securely
writing such files?
2. Can anyone suggest a book or two on this kind of file management?
(These kinds of problems must have been solved in the financial world many
times).

Many thanks,

John Pote
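For reference, the per-sample append described above (one timestamped line
per sample in a per-day file, flushed to disk as soon as it arrives) might
be sketched roughly as follows; the file naming and record layout are
guesses, and at worst the single sample being written when power fails
could be lost:

import os
import time

def append_sample(payload):
    fname = time.strftime("%Y-%m-%d") + ".log"   # a new file for each day
    line = "%s\t%s\n" % (time.strftime("%Y-%m-%dT%H:%M:%S"), payload)
    with open(fname, "a") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())    # push the sample past the OS buffers to disk

append_sample("50-100 bytes from a remote sensor")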