Nir Soffer has posted comments on this change.

Change subject: logging: Don't crash on non-ASCII in SimpleLogAdapter
......................................................................


Patch Set 1: Code-Review-1

(5 comments)

https://gerrit.ovirt.org/#/c/48542/1/tests/utilsTests.py
File tests/utilsTests.py:

Line 1084:         # various codings.
Line 1085:         invalid = chr(254) + chr(0)
Line 1086:         u_invalid = unichr(253) + unichr(254)
Line 1087:         extra = dict(e1=u'Hello', e2='Здра́вствуйте', e3=u'שלום', 
e4='你好',
Line 1088:                      e5=invalid, e6=u_invalid)
This test is valid, if we don't control the extra parameters - but is this a 
real need? where is the code adding unicode parameters to logs? Can we encode 
them instead of doing extra work in the logging? Logging is too expensive 
without these changes.
Line 1089:         log = logUtils.SimpleLogAdapter(self._log, extra)
Line 1090:         log.debug("Dobrý večer")
Line 1091:         log.debug(u"Dobrý večer")
Line 1092:         log.debug(invalid)


Line 1087:         extra = dict(e1=u'Hello', e2='Здра́вствуйте', e3=u'שלום', 
e4='你好',
Line 1088:                      e5=invalid, e6=u_invalid)
Line 1089:         log = logUtils.SimpleLogAdapter(self._log, extra)
Line 1090:         log.debug("Dobrý večer")
Line 1091:         log.debug(u"Dobrý večer")
This should be forbidden by the logging framework. There is no reason to use 
unicode format string, we have complete control over the format string in the 
source.

If we have to support unicode parameters to logging functions, we should encode 
them to utf-8 in the logging framework.
Line 1092:         log.debug(invalid)


https://gerrit.ovirt.org/#/c/48542/1/vdsm/logUtils.py
File vdsm/logUtils.py:

Line 93:         result = u''
Line 94:         for key, value in self.extra.iteritems():
Line 95:             if isinstance(value, str):
Line 96:                 value = value.decode('utf-8', 'replace')
Line 97:             result += u'%s=`%s`' % (key, value)
This decoding is not free.
Line 98:         if isinstance(msg, str):
Line 99:             msg = msg.decode('utf-8', 'replace')
Line 100:         result += u'::%s' % msg
Line 101:         return (result, kwargs)


Line 95:             if isinstance(value, str):
Line 96:                 value = value.decode('utf-8', 'replace')
Line 97:             result += u'%s=`%s`' % (key, value)
Line 98:         if isinstance(msg, str):
Line 99:             msg = msg.decode('utf-8', 'replace')
So is this.
Line 100:         result += u'::%s' % msg
Line 101:         return (result, kwargs)
Line 102: 
Line 103: 


Line 96:                 value = value.decode('utf-8', 'replace')
Line 97:             result += u'%s=`%s`' % (key, value)
Line 98:         if isinstance(msg, str):
Line 99:             msg = msg.decode('utf-8', 'replace')
Line 100:         result += u'::%s' % msg
Why return unicode string? There is no reason to do this, and encode it back to 
utf-8 later when emitting logs.

If we must handle unicode input, the right way is to encode the unicode values 
to utf-8 here, and return plain string.
Line 101:         return (result, kwargs)
Line 102: 
Line 103: 
Line 104: class TracebackRepeatFilter(logging.Filter):


-- 
To view, visit https://gerrit.ovirt.org/48542
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1930817ade1799b218913348ffa099aab2211ef7
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal <mzama...@redhat.com>
Gerrit-Reviewer: Francesco Romani <from...@redhat.com>
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Nir Soffer <nsof...@redhat.com>
Gerrit-Reviewer: Vinzenz Feenstra <vfeen...@redhat.com>
Gerrit-Reviewer: Yaniv Bronhaim <ybron...@redhat.com>
Gerrit-Reviewer: gerrit-hooks <automat...@ovirt.org>
Gerrit-HasComments: Yes
_______________________________________________
vdsm-patches mailing list
vdsm-patches@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches

Reply via email to