On Mar 3, 12:21 pm, John Machin <sjmac...@lexicon.net> wrote:
> On Mar 3, 8:49 pm, Hussein B <hubaghd...@gmail.com> wrote:
>
> > On Mar 3, 11:05 am, Hussein B <hubaghd...@gmail.com> wrote:
>
> > > On Mar 2, 5:40 pm, John Machin <sjmac...@lexicon.net> wrote:
>
> > > > On Mar 3, 1:50 am, Hussein B <hubaghd...@gmail.com> wrote:
>
> > > > > On Mar 2, 4:31 pm, John Machin <sjmac...@lexicon.net> wrote:
> > > > > > On Mar 2, 7:30 pm, Hussein B <hubaghd...@gmail.com> wrote:
>
> > > > > > > On Mar 1, 4:51 pm, Philip Semanchuk <phi...@semanchuk.com> wrote:
>
> > > > > > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:
>
> > > > > > > > > Hey,
> > > > > > > > > I'm retrieving records from MySQL database that contains
> > > > > > > > > non english characters.
>
> > > > > > Can you reveal which language???
>
> > > > > Arabic
>
> > > > > > > > > Then I create a String that contains HTML markup and
> > > > > > > > > column values from the previous result set.
> > > > > > > > > +++++
> > > > > > > > > markup = u'''<table>.....'''
> > > > > > > > > for row in rows:
> > > > > > > > >     markup = markup + '<tr><td>' + row['id']
> > > > > > > > > markup = markup + '</table>'
> > > > > > > > > +++++
> > > > > > > > > Then I'm sending the email according to this tip:
> > > > > > > > > http://code.activestate.com/recipes/473810/
> > > > > > > > > Well, the email contains ????? characters for each non
> > > > > > > > > english ones.
> > > > > > > > > Any ideas?
>
> > > > > > > > There's so many places where this could go wrong and you
> > > > > > > > haven't narrowed down the problem.
>
> > > > > > > > Are the characters stored in the database correctly?
>
> > > > > > > Yes they are.
>
> > > > > > How do you KNOW that they are stored correctly? What makes you
> > > > > > so sure?
>
> > > > > Because MySQL Query Browser displays them correctly, in addition
> > > > > I use BIRT as the reporting system and it shows them correctly.
>
> > > > > > > > Are they stored consistently (i.e. all using the same
> > > > > > > > encoding, not some using utf-8 and others using iso-8859-1)?
>
> > > > > > > Yes.
>
> > > > > > So what is the encoding used to store them?
>
> > > > > Tables are created with UTF-8 encoding option
>
> > > > > > > > What are you getting out of the database? Is it being
> > > > > > > > converted to Unicode correctly, or at all?
>
> > > > > > > I don't know, how to make sure of this point?
>
> > > > > > You could show us some of the output from the database query.
> > > > > > As well as
> > > > > >     print the_output
> > > > > > you should
> > > > > >     print repr(the_output)
> > > > > > and show us both, and also tell us what you *expect* to see.
>
> > > > > The result of print repr(row['name']) is '??? ??????'
> > > > > The '?' characters are supposed to be Arabic characters.
>
> > > > Are you expecting 3 Arabic characters, a space, and then 6 Arabic
> > > > characters?
>
> > > > We now have some interesting evidence: row['name'] is NOT a unicode
> > > > object -- otherwise the print would show u'??? ??????'; it's a str
> > > > object.
>
> > > > So: A utf8-encoded string is being decoded to unicode, and then
> > > > re-encoded to some other encoding, using the "replace" (with "?")
> > > > error-handling method. That shouldn't be hard to spot! It's about
> > > > time you showed us the code you are using to extract the data from
> > > > the database, including the print statements you have put in.
>
> > > This is how I retrieve the data:
>
> > > db = MySQLdb.connect(host = "127.0.0.1", port = 3306, user = "username",
> > >                      passwd = "passwd", db = "reporting")
> > > cr = db.cursor(MySQLdb.cursors.DictCursor)
> > > cr.execute(sql)
> > > rows = cr.fetchall()
>
> > > Thanks all for your nice help.
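
The decode-then-re-encode mechanism John describes can be reproduced without a database. The following is a minimal sketch, not a diagnosis of this particular setup: it is written for Python 3 (the thread itself uses Python 2), and latin-1 is only an assumed stand-in for the lossy encoding, since the thread never establishes which encoding the connection actually used. The sample text is the value Hussein reports further down the thread.

```python
# Sample text: the Arabic string reported later in this thread.
name = "\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631"

# What a UTF-8 table would hold for that text.
stored = name.encode("utf-8")

# Step 1: the stored bytes are decoded to Unicode correctly.
decoded = stored.decode("utf-8")

# Step 2 (the lossy step): re-encode to a charset that has no Arabic
# letters, using the "replace" error handler.
# latin-1 is an ASSUMPTION chosen for illustration only.
lossy = decoded.encode("latin-1", errors="replace")

print(repr(lossy))  # each Arabic letter has been replaced by "?"
```

Only the spaces survive the second step, which matches the shape of the '??? ??????' output quoted above: question marks grouped the way the original words were.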
>
> > Hey,
> > I added use_unicode and charset keyword params to the connect() method
>
> Hey, that was a brilliant idea -- I was just about to ask you to try
> use_unicode=True, charset="utf8" ... what were the actual values that
> you used?

I didn't supply values for them the first time.

> Let's suppose that you used charset="XXXX" ... as far as I can tell,
> not being a mysqldb user myself, this means that your data tables and/
> or your default connection don't use XXXX as an encoding. If so, this
> might be an issue you might like to take up with whoever created the
> database that you are using.
>
> > and I got the following:
> > u'\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631'
> > So characters are getting converted successfully.
>
> I guess so -- U+06nn sure are Arabic characters :-)
>
> However as suggested above, "converted from what?" might be worth
> pursuing if you like to understand what is going on instead of just
> applying magic recipes ;-)
>
> > Well, using the previous recipe for sending the mail:
> > http://code.activestate.com/recipes/473810/
> > I got the following error:
>
> > Traceback (most recent call last):
> >   File "HtmlMail.py", line 52, in <module>
> >     s.sendmail(sender, receiver , msg.as_string())
> > [big snip]
>
> > _handle_text
> >     self._fp.write(payload)
> > UnicodeEncodeError: 'ascii' codec can't encode characters in position
> > 115-118: ordinal not in range(128)
>
> > Again, any ideas guys? :)
>
> That recipe appears to have been written by an ascii bigot for ascii
> bigots :-(
>
> Try reading the docs for email.charset (that's the charset module in
> the email package).

Everything is working now, I did the following:

t = MIMEText(markup.encode('utf-8'), 'html', 'utf-8')

> Cheers,
> John

Thank you all guys and especially you John, I owe you a HUGE bottle of
beer :D

--
http://mail.python.org/mailman/listinfo/python-list
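
For readers on Python 3, the fix above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the poster's actual script: the markup string and the Subject header are hypothetical placeholders, and in Python 3 the text is passed to MIMEText as a str with the charset named, instead of being encoded by hand first.

```python
from email.mime.text import MIMEText

# Hypothetical markup standing in for the table built from the query rows.
markup = (
    "<table><tr><td>"
    "\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631"
    "</td></tr></table>"
)

# Naming the charset lets the email package choose a transfer encoding
# that can carry non-ASCII text instead of assuming us-ascii.
msg = MIMEText(markup, "html", "utf-8")
msg["Subject"] = "Report"  # placeholder header, not from the thread

# Serialising no longer raises UnicodeEncodeError.
wire = msg.as_string()

# Round trip: undo the transfer encoding, then decode with the charset.
body = msg.get_payload(decode=True).decode(msg.get_content_charset())
```

The serialised message in `wire` is what would be handed to `sendmail()`, and `body` should come back identical to `markup`.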