On Tuesday 06 January 2009, Jack Chastain wrote:
> Not knowing exactly what your intent is, or what you have, my first thought
> was for you to use the OpenOffice Spreadsheet. I use this often to read CSV
> files as it formats them easily into columns - and will save them as CSV
> again rather simply. An input file may look something like this:
> "Name","Address","City","State","Zip"
> "Jack Chastain","123 fourth street","Newburgh","NY",12345
> "Another Name","456 Seventh st","Wallkill","NY",98765
>
> When this file is opened by the Spreadsheet, it provides an import system
> that allow you to specify the Character set (I just set it to US ASCII, The
> default is Unicode 8) and the Seperator options. It appears to already know
> that the file is separated by commas, but there may be conditions where
> this is different. There is a preview that shows what the import would look
> like, along with ability to eliminate some columns if you wish, as well as
> instruction as to which row to begin the import from. Handy and easy.

I think you're on the right track with OpenOffice.org Calc (or Excel), but
I want to clarify one part. I think the best option would be to save the
Word document in ASCII text format, and then import that text file into
either OpenOffice.org Calc or Excel, choosing the character set and
separator options so that the cells are filled in as desired. The reason
to do it this way is that Calc and Excel can export to CSV because their
data is already in cells, whereas CSV export isn't something you'd expect
from a word processor like Word or OOo Writer.

The other similar-but-different method would be to first save the Word
document as text, but then use text manipulation tools like sed, awk, cut,
Bash, etc. to do the job. If it were me, I'd only go this route if the
saved text is too quirky to import cleanly into cells in Calc or Excel,
which can happen in some cases.

A good example is the vertical list of data that you mentioned. When
imported into Calc or Excel it ends up in a single column spanning multiple
rows, and when that is saved as CSV it looks the same as the input instead
of being comma separated as expected. Where names/addresses/etc. are in a
vertical list separated by one or more blank lines, I'd probably use a Bash
script or a Python program to turn the data into CSV format rather than try
to import it with Excel.

  -- Chris

-- 
Chris Knadle
[email protected]
_______________________________________________
Mid-Hudson Valley Linux Users Group                  http://mhvlug.org
http://mhvlug.org/cgi-bin/mailman/listinfo/mhvlug

Upcoming Meetings (6pm - 8pm)                         MHVLS Auditorium
  Jan 7 - Ruby on Rails
  Feb 4 - TBD
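
For the vertical-list case described above, here is a minimal sketch of the Python route. It assumes the saved text file has one field per line and that records are separated by one or more blank lines. The function names and the sample data (adapted from Jack's example) are made up for illustration, and nothing here checks that every record has the same number of fields.

```python
import csv
import io


def records_from_vertical_list(text):
    """Group consecutive non-blank lines into records.

    One or more blank lines end the current record (assumed layout).
    """
    records = []
    current = []
    for line in text.splitlines():
        field = line.strip()
        if field:
            current.append(field)
        elif current:
            # A blank line after some fields closes the record.
            records.append(current)
            current = []
    if current:
        # The last record may not be followed by a blank line.
        records.append(current)
    return records


def vertical_list_to_csv(text):
    """Return the records as CSV text, one record per row, all fields quoted."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(records_from_vertical_list(text))
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample input: one field per line, blank lines between records.
    sample = (
        "Jack Chastain\n123 fourth street\nNewburgh\nNY\n12345\n"
        "\n\n"
        "Another Name\n456 Seventh st\nWallkill\nNY\n98765\n"
    )
    print(vertical_list_to_csv(sample), end="")
```

The csv module handles the quoting, so a field that itself contains a comma or a double quote still comes out as valid CSV. If the records turn out to have varying numbers of lines, that would need handling before the result imports cleanly into Calc or Excel.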
