When I switched to Mac OS X (from Windows) earlier this year, I saved
all of my personal email in an Excel spreadsheet. (It was a ridiculous
process: I had to convert it from Outlook Express to Outlook to Access
to Excel.) In this huge spreadsheet, each line is a single email
message, with columns for "To", "From", "Subject", "Body", etc.
I now want to get the email into a MySQL database using mysqlimport,
which can import comma- or tab-separated files, and Excel can generate
both. My problem arises because the body of each email has a newline
after each paragraph (\r in the files Excel writes on Mac OS X), which
is precisely the character used to separate records (messages).
Here's an example of what an Excel-generated, tab-separated file looks
like (a header and two records):
To \t From \t Subject \t Body \r
[EMAIL PROTECTED] \t [EMAIL PROTECTED] \t Hello! \t Hi Richard, \r
How are you doing? \r
I just wanted to let you know... \r
... \r
[EMAIL PROTECTED] \t [EMAIL PROTECTED] \t Thanks \t Mr. Miller, \r
Thanks for switching to Mac OS X earlier this year. \r
Sincerely, \r
Steve \r
.... \r
Because the Body column has line breaks in it, when I import the file
into MySQL, the body contains only "Hi Richard," or "Mr. Miller," and
the rest of the message is lost.
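To illustrate what I think is happening, here is a small Python sketch
that mimics a naive importer. It is not what mysqlimport actually runs,
and the sample data is abbreviated from the example above with
placeholder addresses. It splits on \r for records and \t for columns,
so the body is cut off at its first line break.

```python
# Abbreviated sample in the same shape as the Excel export above
# (the addresses are placeholders, not real data).
sample = (
    "To\tFrom\tSubject\tBody\r"
    "a@example.com\tb@example.com\tHello!\tHi Richard,\r"
    "How are you doing?\r"
    "I just wanted to let you know...\r"
)

def naive_rows(text):
    """Split on \\r for records and \\t for columns, ignoring body text."""
    return [line.split("\t") for line in text.split("\r") if line]

rows = naive_rows(sample)
print(rows[1][3])   # body of the first message as a naive importer sees it
print(len(rows))    # later body lines show up as extra one-column "records"
```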
If I knew what the newline character was inside Excel, I would replace
it with some dummy character that I could expand later. Otherwise, it
looks like once I export to a tab-separated file, it's too late to
distinguish between newlines in the message body and newlines that end
each record.
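One workaround I can imagine after the export, sketched in Python under
some loud assumptions: every record has exactly four columns, and no
body line contains a tab. Under those assumptions a physical line with
three tabs starts a new record, and any other line continues the
previous body. The function names and the choice of a literal
backslash-n as the dummy character are mine, not anything Excel or
MySQL prescribes.

```python
EXPECTED_TABS = 3  # assumption: four columns (To, From, Subject, Body)

def rejoin_records(text, placeholder="\\n"):
    """Merge body continuation lines back into their record.

    A physical line with EXPECTED_TABS tabs is treated as the start of a
    record; any other line is treated as more body text for the previous
    record. This heuristic breaks if a body line itself contains tabs.
    """
    records = []
    # Drop the trailing \r so it does not produce an empty final line.
    for line in text.rstrip("\r").split("\r"):
        # Double existing backslashes so they cannot be mistaken for
        # the start of the placeholder sequence later.
        line = line.replace("\\", "\\\\")
        if line.count("\t") == EXPECTED_TABS:
            records.append(line)
        elif records:
            records[-1] += placeholder + line
    return records

def convert(text):
    """Return the export with one record per \\n-terminated line."""
    return "".join(record + "\n" for record in rejoin_records(text))

# Placeholder data in the same shape as the example above.
sample = (
    "To\tFrom\tSubject\tBody\r"
    "a@example.com\tb@example.com\tHello!\tHi Richard,\r"
    "\r"
    "How are you doing?\r"
    "c@example.com\td@example.com\tThanks\tMr. Miller,\r"
    "Sincerely,\r"
)
print(convert(sample))
```

To run this on the real file I would read it with open(path,
newline="") so that Python leaves the \r characters alone. The output
should match mysqlimport's defaults (tab-separated fields,
newline-terminated lines), with --ignore-lines=1 to skip the header. As
far as I understand, the default escape handling in LOAD DATA turns a
backslash followed by n back into a real newline, but I have not
verified that on this data.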
Any suggestions?
Richard Miller
