That worked perfectly, Dave! I added a column in Excel that read "___TERMINATOR___" at the end of each row. Then I exported to a tab-separated file. Then in a text editor, I replaced every instance of "\t___TERMINATOR___\r" with "\n". Then I could import into MySQL.

However, for unrelated reasons, most of my email messages in the database have 2-3 copies. (I saved my email in multiple files over time and didn't want to risk losing anything when I consolidated them.) Does any one know SQL commands to delete all duplicates? (Everything but the ID is identical in these duplicate rows.) Or, I could go back and do it in the text file. Is there a way to remove duplicate rows in a text file?

Thanks!

Richard


On Nov 13, 2003, at 10:53 AM, District Webmaster wrote:


If you still have the excel file, add another column at the end
with the same string in every cell -- the string, however,
should not befound anywhere else in your document (I'll use "x_uug_x").

Export to a csv, then search and replace all "\r" instaces with "\n"

Then search and replace all "x_uug_x" instances with "\r"

Then import into MySQL

Of course, I may be wrong . . .

Dave

[EMAIL PROTECTED] 11/12/03 9:04 PM >>>
When I switched to Mac OS X (from Windows) earlier this year, I saved
all of my personal email in an Excel spreadsheet.  (It was a ridiculous
process: I had to convert it from Outlook Express to Outlook to Access
to Excel.)  In this huge spreadsheet, each line is a single email
message, with columns for "To", "From", "Subject", "Body", etc.

I now want to get the email into a MySQL database using mysqlimport.
Mysqlimport can import comma- or tab-separated files, which Excel can
generate.  My problem arises because the body of each email has
newlines after each paragraph (\r on Mac OS X), which is precisely the
character that is used to separate records (messages).

Here's an example of what an Excel-generated, tab-separated file looks
like: (It has the header and two records.)

To \t From \t Subject \t Body \r
[EMAIL PROTECTED] \t [EMAIL PROTECTED] \t Hello! \t Hi Richard,
\r
How are you doing?] \r
I just wanted to let you know... \r
... \r
[EMAIL PROTECTED] \t [EMAIL PROTECTED] \t Thanks \t Mr. Miller, \r
Thanks for switching to Mac OS X earlier this year. \r
Sincerely, \r
Steve \r
.... \r


Because the message column has line breaks in it, when I import it into
MySQL, the body only reads "Hi Richard," or "Mr. Miller," and excludes
the rest of the message.

If I knew what the newline character was in Excel, I would replace it
with some dummy character that I could expand later.  If not, it looks
like once I export to a tab-separated file, it's too late to
distinguish between newlines in the message body and newlines that end
each record.

Any suggestions?

Richard Miller


____________________ BYU Unix Users Group http://uug.byu.edu/ ___________________________________________________________________ List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list


____________________ BYU Unix Users Group http://uug.byu.edu/ ___________________________________________________________________ List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list




____________________
BYU Unix Users Group http://uug.byu.edu/ ___________________________________________________________________
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list

Reply via email to