-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...
> On Fri Dec 11 03:35:01 +0000 2009 james.westby wrote:
>> 639 packages failed
>>
>> 94 repeated reasons:
>>
>> 61 packages failed with reason

...

>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/util.py", 
>> line
>> 358, in find_extra_authors
>>       match = extra_author_re.match(change.decode("utf-8"))
>>     File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>>       return codecs.utf_8_decode(input, errors, True)
>>   UnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-14:
>> unsupported Unicode code range
> 
> First of a set which is probably non-utf8 data in changelogs. There may be
> hacks we can do for this. Don't discount the possibility that it is faulty
> encoding handling though.
> 

^- 'unsupported Unicode code range' sounds funny, but it may just be
that they have latin-1 chars in what should otherwise be a UTF-8 doc. Is
changelog *defined* as UTF-8? Or is it just '8-bit, put whatever feels
good to you' in there?


>> 43 packages failed with reason
...

>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 159, in import_dir
>>       import_archive(tree, dir_file, file_ids_from=file_ids_from)
>>     File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 234, in import_archive
>>       trans_id = tt.trans_id_tree_path(relative_path)
>>     File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 241, in
>> trans_id_tree_path
>>       path = self.canonical_path(path)
>>     File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 1282, 
>> in
>> canonical_path
>>       abs = self._tree.abspath(path)
>>     File "/usr/lib/python2.5/site-packages/bzrlib/workingtree.py", line 394, 
>> in
>> abspath
>>       return pathjoin(self.basedir, filename)
>>     File "/usr/lib/python2.5/posixpath.py", line 65, in join
>>       path += '/' + b
>>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 25:
>> ordinal not in range(128)
> 
> This is usually that there is a filename from a different encoding in the
> package. We may not be able to get around this. I imagine some are utf-8
> though, so it may be a bug that it is trying to decode in ascii.
> 

My guess is that you are handing us 8-bit paths, and inside bzrlib all
*paths* are supposed to be Unicode. And if you hand us an 8-bit string,
and we up-cast it to Unicode, then we fail because the upcast is
generally done via ascii.

So I would at least take a first look at the 'import_archive' code, and
make sure it is trying to work in Unicode paths, rather than 8-bit strings.
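
Roughly what I mean, as a sketch only. The helper name and the list of
encodings are made up for illustration; nothing like this exists in
builddeb or bzrlib as far as this thread shows. The idea is to decode each
8-bit path explicitly before it gets near the tree, so a bad filename fails
at a point where you know which package and path caused it:

```python
def to_unicode_path(raw_path, encodings=("utf-8",)):
    """Return raw_path as a Unicode string.

    Hypothetical helper: tries each encoding in turn and raises
    ValueError if none of them can decode the bytes.
    """
    if not isinstance(raw_path, bytes):
        # Already Unicode, nothing to do.
        return raw_path
    for encoding in encodings:
        try:
            return raw_path.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("cannot decode path %r with %r" % (raw_path, encodings))


# A UTF-8 encoded filename decodes cleanly.
path = to_unicode_path(b"debian/caf\xc3\xa9.txt")
```

Whether to add more encodings to the list, or to skip the file and report
it, is a policy decision for the importer.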


>> 36 packages failed with reason
>> 'launchpadlib.errors.HTTPError:<module>:main:get_versions:lp_call:__call__:_request':
> 
...

>>     File "/usr/lib/python2.5/site-packages/launchpadlib/_browser.py", line 
>> 211,
>> in _request
>>       raise HTTPError(response, content)
>>   launchpadlib.errors.HTTPError: HTTP Error 503: Service Unavailable
> 
> Launchpad doesn't like me. These 36 happened in the few hours I
> was working on this task.
> 

Could this be related to the overloading of whatever machine that also
happened? Meaning running this stuff is hammering on a machine hard
enough that it times out occasionally? (Swapping, etc?)



...

> 
>> 30 packages failed with reason
>> 'UnicodeDecodeError:<module>:main:import_package:import_package:_do_import_package:import_upstream:decode':
>>  
>> /srv/package-import.canonical.com/new/scripts/python-debian/debian_bundle/changelog.py:274: UserWarning: Unexpected line while looking for next heading of EOF:
>>   vim:ai:et:sts=2:sw=2:tw=78:
>>     warnings.warn(message)
>>   Traceback (most recent call last):
>>     File "./import_package.py", line 884, in <module>
>>       sys.exit(main(args[0]))
>>     File "./import_package.py", line 849, in main
>>       import_package(temp_dir, package, version, distro, release, pocket,
>> package_url, possible_transports=possible_transports)
>>     File "./import_package.py", line 532, in import_package
>>       use_time_from_changelog=True)
>>     File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 1555, in import_package
>>       timestamp=timestamp, author=author)
>>     File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 1434, in _do_import_package
>>       timestamp=timestamp, author=author)
>>     File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 1155, in import_upstream
>>       revprops['authors'] = author.decode("utf-8")
>>     File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>>       return codecs.utf_8_decode(input, errors, True)
>>   UnicodeDecodeError: 'utf8' codec can't decode bytes in position 10-12:
>> invalid data
> 
> Changelog data again perhaps.

An author field that is non-ascii and not utf-8. There is always the:

def decode_as_best_you_can(s):
  try:
    return s.decode('utf-8')
  except UnicodeDecodeError:
    return s.decode('latin-1')
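
One caveat with that fallback: latin-1 maps every byte to a code point, so
the second decode never raises. Anything that is neither UTF-8 nor latin-1
comes out as wrong characters rather than as an error. A quick sketch of
the behaviour with explicit byte strings (the sample author names are made
up):

```python
def decode_as_best_you_can(s):
    """Decode bytes as UTF-8, falling back to latin-1."""
    try:
        return s.decode('utf-8')
    except UnicodeDecodeError:
        # latin-1 accepts any byte sequence, so this cannot fail.
        return s.decode('latin-1')


utf8_author = b"Ren\xc3\xa9 Example"    # valid UTF-8
latin1_author = b"Ren\xe9 Example"      # latin-1, not valid UTF-8

# Both spellings end up as the same Unicode string.
assert decode_as_best_you_can(utf8_author) == u"Ren\xe9 Example"
assert decode_as_best_you_can(latin1_author) == u"Ren\xe9 Example"
```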

> 
...

>> "KnitPackRepository('lp-45193168:///~ubuntu-branches/ubuntu/karmic/awstats/karmic/.bzr/repository')\nis not compatible
>> with\nCHKInventoryRepository('lp-45193168:///~ubuntu-branches/ubuntu/jaunty/awstats/jaunty-updates/.bzr/repository')\ndifferent serializers")
> 
> This is because there are some packages that were imported in an older format.
> We should upgrade them. It's failing as there are no smarts to pick a
> compatible format when we work on those packages.
> 

Is it possible to get a query of old ones, and just run a bulk-update of
them?


...

>> line 1264, in import_debian
>>       revprops['authors'] = "\n".join(authors).decode("utf-8")
>>     File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>>       return codecs.utf_8_decode(input, errors, True)
>>   UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in 
>> position
>> 3: ordinal not in range(128)
> 
> Changelog data again?

The author field seems especially sensitive.
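
Note that this one is a UnicodeEncodeError coming out of a decode() call.
On Python 2 that usually means the string was already unicode, so decode()
first does an implicit ascii encode, and that is what trips on u'\xe9'. So
my guess is that at least one entry in 'authors' is already unicode by the
time it gets joined. A rough sketch of a join that copes with a mix of
both (join_authors is a made-up name, not something in builddeb, and it
assumes the byte strings really are UTF-8):

```python
def join_authors(authors, encoding="utf-8"):
    """Join author names into one Unicode string, one per line.

    Hypothetical helper: byte strings are decoded with `encoding`,
    entries that are already Unicode are passed through untouched.
    """
    decoded = []
    for author in authors:
        if isinstance(author, bytes):
            author = author.decode(encoding)
        decoded.append(author)
    return u"\n".join(decoded)


# One byte-string entry and one entry that is already Unicode.
joined = join_authors([b"Ren\xc3\xa9 Example", u"Zo\xe9 Example"])
```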


...

>> "/usr/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line
>> 583, in _execute_pack_operations
>>       packer.pack()
>>     File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/pack_repo.py", line
>> 749, in pack
>>       return self._create_pack_from_packs()
>>     File
>> "/usr/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line
>> 471, in _create_pack_from_packs
>>       self._pack_collection.allocate(self.new_pack)
>>     File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/pack_repo.py", line
>> 1715, in allocate
>>       'Pack %r already exists in %s' % (a_new_pack.name, self))
>>   bzrlib.errors.BzrError: Pack 'ac9506e4e5ddccb2730a2920256091bc' already
>> exists in <bzrlib.repofmt.groupcompress_repo.GCRepositoryPackCollection 
>> object
>> at 0x263e490>
>>
>>  qmmp
> 
> bzr bug?

Happens if you commit exactly the same data 2 times, or if you try to
autopack a single file. We've fixed a few of them, but having
reproducible data here would help.


John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktDdIMACgkQJdeBCYSNAAOw+gCggy6/FufD0H0jS9W7+R+dhX/Q
7moAnRdOPYPxThRdqdxkfobndm2CKrg+
=D23h
-----END PGP SIGNATURE-----

-- 
ubuntu-distributed-devel mailing list
ubuntu-distributed-devel@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-distributed-devel
