Thanks for the comments. Quick notes: 1. Yes, ultimate goal is to output various fields into Excel, but I need to enter the data into a DB as well to avoid having any strange problems. That may be handled by a completely different process. I should have been more exact in saying that my current goal is to get this data into a database.
2. I'm relatively new to Python and definitely new to trying to parse Ragged-right type CSV files that are pairs of data. I'm trying to get my syntax correct when wording my questions/phrases - please bear with me. :-) Your help (all posters) is quite appreciated. 3. regarding read/write each row - I started thinking about working my way through the dictionary. This file could have up to 1900 pairs of data per row. That's an awful lot of columns to put into a database table, especially if quite a few will be NULL. I starting thinking of stepping through each pair by each row. The rows are identified by Letter type and an account number (always the same field #). I would want to use those, and insert something like: LetterType, Account #, Field#, Field Data, (current date) I wasn't quite sure how to phrase that and apologize for being unclear. I guess if I'm doing row-by-row, I can hold that for each row without too much trouble. 4. No control over file format. I can define a lookup table that will match Field# to a real name, perhaps code that into my scripts, but I'm stuck with that. As to other posters' responses - it's ANSI-coded, CSV, No Quotes (no Commas in values). Coming from the mainframe, I don't have to worry about embedded special characters within the file as they can't be entered into these fields to start with. File always starts with letter_type, 1 - no idea why they didn't do that as 0, letter_type for consistency but it isn't my file. (Heck, I would have just specified to output all fields all the time and avoid any need to do special parsing :-) Anyway, I appreciate the help. The code looks like it will work for me and I'll be doing some basic tests to get up to speed on splitting/parsing the data and putting it into a form that will be usable by all. I may have to write certain rows out to a new file if they don't match what I'm looking for. I don't anticipate any issues there - just read the row, write it to a new file if it doesn't match my criteria. Thanks to all. Off to search for dictionaries and databases now. :-) -Pete John Machin <[EMAIL PROTECTED]> wrote: > On Fri, 29 Apr 2005 18:54:54 GMT, Peter A. Schott > <[EMAIL PROTECTED]> wrote: > > >That looks promising. > > > The field numbers are pre-defined at the mainframe level. > > Of course. Have you managed to acquire a copy of the documentation, or > do you have to reverse-engineer it? > > >This may help me get to my ultimate goal which is to pump these into a DB on > >a > >row-by-row basis ( :-P ) > > That's your *ultimate* goal? Are you running a retro-computing museum > or something? Don't you want to *USE* the data? > > > I'll have to do some playing around with this. I > >knew that it looked like a dictionary, but wasn't sure how best to handle > >this. > > > >One follow-up question: I'll end up getting multiple records for each > >"type". > > What does that mean?? If it means that more than one customer will get > the "please settle your account" letter, and more than one customer > will get the "please buy a spangled fritzolator, only $9.99" letter, > you are stating the obvious -- otherwise, please explain. > > >Would I be referencing these by row[#][field#]? > > Not too sure what you mean by that -- whether you can get away with a > (read a row, write a row) way of handling the data depends on its > structure (like what are the relationships if any between different > rows) and what you want to do with it -- both murky concepts at the > moment. > > > > >Minor revision to the format is that starts like: > >###,1,1,val_1,.... > > How often do these "minor revisions" happen? How flexible do you have > to be? And the extra "1" means what? Is it ever any other number? > > > > > > >I think right now the plan is to parse through the file and insert the pairs > >directly into a DB table. Something like RowID, LetterType, date, Field#, > >Value. > > Again, I'd recommend you lose the "Field#" in favour of a better > representation, ASAP. > > > I can get RowID and LetterType overall, date is a constant, the rest > >would involve reading each pair and inserting both values into the table. > >Time > >to hit the books a little more to get up to speed on all of this. > > What you need is (a) a clear appreciation of what you are trying to do > with the data at a high level (b) then develop an understanding of > what is the underlying data model (c) then and only then worry about > technical details. > > Good luck, > John > -- http://mail.python.org/mailman/listinfo/python-list