-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
...

> On Fri Dec 11 03:35:01 +0000 2009 james.westby wrote:
>> 639 packages failed
>>
>> 94 repeated reasons:
>>
>> 61 packages failed with reason

...

>>   "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/util.py", line 358, in find_extra_authors
>>     match = extra_author_re.match(change.decode("utf-8"))
>>   File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>>     return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-14: unsupported Unicode code range
>
> First of a set which is probably non-utf8 data in changelogs. There may be
> hacks we can do for this. Don't discount the possibility that it is faulty
> encoding handling though.
>

^- 'unsupported Unicode code range' sounds funny, but it may just be
that they have latin-1 chars in what should otherwise be a UTF-8 doc.

Is changelog *defined* as UTF-8? Or is it just '8-bit, put whatever
feels good to you' in there?

>> 43 packages failed with reason

...
>>   "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 159, in import_dir
>>     import_archive(tree, dir_file, file_ids_from=file_ids_from)
>>   File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 234, in import_archive
>>     trans_id = tt.trans_id_tree_path(relative_path)
>>   File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 241, in trans_id_tree_path
>>     path = self.canonical_path(path)
>>   File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 1282, in canonical_path
>>     abs = self._tree.abspath(path)
>>   File "/usr/lib/python2.5/site-packages/bzrlib/workingtree.py", line 394, in abspath
>>     return pathjoin(self.basedir, filename)
>>   File "/usr/lib/python2.5/posixpath.py", line 65, in join
>>     path += '/' + b
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 25: ordinal not in range(128)
>
> This is usually that there is a filename from a different encoding in the
> package. We may not be able to get around this. I imagine some are utf-8
> though, so it may be a bug that it is trying to decode in ascii.
>

My guess is that you are handing us 8-bit paths, and inside bzrlib all
*paths* are supposed to be Unicode. And if you hand us an 8-bit string,
and we up-cast it to Unicode, then we fail because the upcast is
generally done via ascii.

So I would at least take a first look at the 'import_archive' code, and
make sure it is trying to work in Unicode paths, rather than 8-bit
strings.

>> 36 packages failed with reason
>> 'launchpadlib.errors.HTTPError:<module>:main:get_versions:lp_call:__call__:_request':
>

...

>>   File "/usr/lib/python2.5/site-packages/launchpadlib/_browser.py", line 211, in _request
>>     raise HTTPError(response, content)
>> launchpadlib.errors.HTTPError: HTTP Error 503: Service Unavailable
>
> Launchpad doesn't like me. These 36 happened in the few hours I
> was working on this task.
>

Could this be related to the overloading of whatever machine that also
happened? Meaning running this stuff is hammering on a machine hard
enough that it times out occasionally? (Swapping, etc?)

...

>
>> 30 packages failed with reason
>> 'UnicodeDecodeError:<module>:main:import_package:import_package:_do_import_package:import_upstream:decode':
>>
>> /srv/package-import.canonical.com/new/scripts/python-debian/debian_bundle/changelog.py:274: UserWarning: Unexpected line while looking for next heading of EOF: vim:ai:et:sts=2:sw=2:tw=78:
>>   warnings.warn(message)
>> Traceback (most recent call last):
>>   File "./import_package.py", line 884, in <module>
>>     sys.exit(main(args[0]))
>>   File "./import_package.py", line 849, in main
>>     import_package(temp_dir, package, version, distro, release, pocket,
>>     package_url, possible_transports=possible_transports)
>>   File "./import_package.py", line 532, in import_package
>>     use_time_from_changelog=True)
>>   File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1555, in import_package
>>     timestamp=timestamp, author=author)
>>   File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1434, in _do_import_package
>>     timestamp=timestamp, author=author)
>>   File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1155, in import_upstream
>>     revprops['authors'] = author.decode("utf-8")
>>   File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>>     return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 10-12: invalid data
>
> Changelog data again perhaps. An author field that is non-ascii and not utf-8.

There is always the:

    def decode_as_best_you_can(s):
        try:
            return s.decode('utf-8')
        except UnicodeDecodeError:
            return s.decode('latin-1')

>

...
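For illustration only, here is a rough sketch of that fallback in Python 3 terms, assuming the author field arrives as raw bytes. The pass-through for values that are already text and the sample author names are assumptions added for the example; none of this is taken from builddeb itself.

```python
def decode_as_best_you_can(raw):
    """Decode changelog bytes as UTF-8, falling back to Latin-1."""
    if isinstance(raw, str):
        # Assumption: some callers may already hold text, so pass it through.
        return raw
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a code point, so this cannot fail,
        # but it may yield the wrong characters for other legacy encodings.
        return raw.decode("latin-1")


# Hypothetical author fields, one per encoding.
utf8_author = "André".encode("utf-8")
latin1_author = "André".encode("latin-1")

print(decode_as_best_you_can(utf8_author))
print(decode_as_best_you_can(latin1_author))
```

The Latin-1 branch is a guess rather than a detection: it never raises, so anything that is neither UTF-8 nor Latin-1 will import with garbled characters instead of failing.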
>> "KnitPackRepository('lp-45193168:///~ubuntu-branches/ubuntu/karmic/awstats/karmic/.bzr/repository')\nis not compatible with\nCHKInventoryRepository('lp-45193168:///~ubuntu-branches/ubuntu/jaunty/awstats/jaunty-updates/.bzr/repository')\ndifferent serializers")
>
> This is because there are some packages that were imported in an older format.
> We should upgrade them. It's failing as there are no smarts to pick a
> compatible format when we work on those packages.
>

Is it possible to get a query of old ones, and just run a bulk-update of
them?

...

>>   line 1264, in import_debian
>>     revprops['authors'] = "\n".join(authors).decode("utf-8")
>>   File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>>     return codecs.utf_8_decode(input, errors, True)
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
>
> Changelog data again? The author field seems especially sensitive.

...

>>   "/usr/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line 583, in _execute_pack_operations
>>     packer.pack()
>>   File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/pack_repo.py", line 749, in pack
>>     return self._create_pack_from_packs()
>>   File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line 471, in _create_pack_from_packs
>>     self._pack_collection.allocate(self.new_pack)
>>   File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/pack_repo.py", line 1715, in allocate
>>     'Pack %r already exists in %s' % (a_new_pack.name, self))
>> bzrlib.errors.BzrError: Pack 'ac9506e4e5ddccb2730a2920256091bc' already exists in <bzrlib.repofmt.groupcompress_repo.GCRepositoryPackCollection object at 0x263e490>
>>
>> qmmp
>
> bzr bug?

Happens if you commit exactly the same data 2 times, or if you try to
autopack a single file. We've fixed a few of them, but having
reproducible data here would help.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktDdIMACgkQJdeBCYSNAAOw+gCggy6/FufD0H0jS9W7+R+dhX/Q
7moAnRdOPYPxThRdqdxkfobndm2CKrg+
=D23h
-----END PGP SIGNATURE-----