- **status**: in-progress --> review
- **Comment**:
Fixed on db/8350. This shows how a name in a different encoding is now
handled:
```
>>> 'data/\xCA\xEE\xEF\xE8\xFF scene.txt'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/var/local/env-allura/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xca in position 5: invalid continuation byte
>>> h.really_unicode('data/\xCA\xEE\xEF\xE8\xFF scene.txt')
u'data/\u041a\u043e\u043f\u0438\u044f scene.txt'
>>> print h.really_unicode('data/\xCA\xEE\xEF\xE8\xFF scene.txt')
data/Копия scene.txt
```
Unfortunately that only gets directory browsing working. Trying to view or
diff the file raises
`ManifestLookupError: data/Копия scene.txt@a18ff7d3ef0d: not found in manifest`.
We convert the filename to unicode for mongo and web purposes, but the hg repo
stores the name in its original encoding, so when we request the file using
the utf8 version of the filename it is not found. I don't know how to deal
with that.
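
One possible direction, sketched here and not taken from the db/8350 branch: decode each raw name for display, but keep the original bytes so they can be handed back to the repo. Everything in this sketch is hypothetical: `decode_with_fallback`, `build_name_index`, and the candidate encoding list are illustrations, not Allura or Mercurial APIs. `cp1251` is included only because it reproduces the decoded name in the session above.

```python
# Hypothetical helpers, not part of Allura or Mercurial.

# Assumed candidate order; the real set of encodings to try is unknown.
FALLBACK_ENCODINGS = ('utf-8', 'cp1251', 'latin-1')


def decode_with_fallback(raw, encodings=FALLBACK_ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes raw."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError('no candidate encoding could decode %r' % (raw,))


def build_name_index(raw_names):
    """Map each display (unicode) name back to the repo's original bytes."""
    index = {}
    for raw in raw_names:
        text, _ = decode_with_fallback(raw)
        index[text] = raw
    return index


raw_name = b'data/\xCA\xEE\xEF\xE8\xFF scene.txt'
index = build_name_index([raw_name, b'README'])

# The web/mongo side only ever sees the unicode name ...
display_name = u'data/\u041a\u043e\u043f\u0438\u044f scene.txt'
# ... and the index recovers the exact bytes to use for the repo lookup.
original_bytes = index[display_name]
```

Re-encoding `display_name` as utf8 would produce different bytes from `original_bytes`, which matches the failed lookup described above. Whether such an index can be stored or rebuilt cheaply is an open question.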
---
**[tickets:#8350] non-unicode filenames in hg**
**Status:** review
**Milestone:** unreleased
**Created:** Tue Feb 11, 2020 10:54 PM UTC by Dave Brondsema
**Last Updated:** Tue Feb 11, 2020 10:54 PM UTC
**Owner:** Dave Brondsema
With a non-unicode filename, this error is thrown:
```
  File "/src/forgehg/forgehg/model/hg.py", line 324, in refresh_commit_info
    fake_tree = self._tree_from_changectx(obj)
  File "/src/timermiddleware/timermiddleware/__init__.py", line 120, in wrapper
    return self.run_and_log(func, inst, *args, **kwargs)
  File "/src/timermiddleware/timermiddleware/__init__.py", line 152, in run_and_log
    retval = func(*args, **kwargs)
  File "/src/forgehg/forgehg/model/hg.py", line 453, in _tree_from_changectx
    root.set_blob(filepath, oid)
  File "/src/allura/Allura/allura/model/repository.py", line 1847, in set_blob
    path = six.ensure_text(path)
  File "/var/local/env-allura/lib/python2.7/site-packages/six.py", line 904, in ensure_text
    return s.decode(encoding, errors)
  File "/var/local/env-allura/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xca in position 5: invalid continuation byte
```
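
The failure can probably be reproduced outside Allura with a strict utf8 decode of the same bytes. This is a minimal sketch: it calls `bytes.decode` directly and does not go through `six.ensure_text`, on the assumption that the latter was called with its default utf8/strict settings, as the traceback suggests. The variable names are illustrative.

```python
# Same raw filename bytes as in the ticket comment.
raw_path = b'data/\xCA\xEE\xEF\xE8\xFF scene.txt'

error = None
try:
    # Strict decoding: any invalid utf8 sequence raises UnicodeDecodeError.
    raw_path.decode('utf-8')
except UnicodeDecodeError as exc:
    error = exc

if error is not None:
    # Offset and value of the first byte the codec rejected.
    failed_position = error.start
    failed_byte = bytearray(raw_path)[failed_position]
    print('byte 0x%02x at position %d' % (failed_byte, failed_position))
```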