#18938: Authorities should reject non-ASCII content in ExtraInfo descriptors
----------------------------+------------------------------------
 Reporter:  teor            |          Owner:
     Type:  defect          |         Status:  new
 Priority:  Medium          |      Milestone:  Tor: 0.2.9.x-final
Component:  Core Tor/Tor    |        Version:
 Severity:  Normal          |     Resolution:
 Keywords:  needs-proposal  |  Actual Points:
Parent ID:  #18656          |         Points:  1
 Reviewer:                  |        Sponsor:
----------------------------+------------------------------------
Changes (by teor):

 * keywords:  needs-proposal-maybe => needs-proposal

Comment:

 Replying to [comment:18 cypherpunks]:
 > Another cypherpunks here.
 >
 > Replying to [comment:17 teor]:
 > > Currently, the Tor ContactInfo and Platform consist of arbitrary
 > > binary data, terminated by an ASCII linefeed byte.
 >
 > That's not what the spec says. It's supposed to be ASCII. dir-spec 1.2:
 >     NL = The ascii LF character (hex value 0x0a).
 >     Document ::= (Item | NL)+
 >     Item ::= KeywordLine Object*
 >     KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
 >     Keyword = KeywordChar+
 >     KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
 >     ArgumentChar ::= any printing ASCII character except NL.
 >     WS = (SP | TAB)+
 >
 > Contact info:
 >     "contact" info NL
 >
 >         [At most once]
 >         Describes a way to contact the relay's administrator, preferably
 >         including an email address and a PGP key fingerprint.

 Yes, we have a spec issue. The implementation of platform and contact
 allows arbitrary data. The format has never been enforced.

 That said, there are multiple ways to let users spell their names
 correctly while still mandating ASCII in descriptors:
 * users can provide the ASCII alias for their EAI email address
   https://en.wikipedia.org/wiki/Email_address#Internationalization
 * users can provide an ASCII website address (in IDN form if necessary)
   that links to their name and email as they would like it displayed
   https://en.wikipedia.org/wiki/Internationalized_domain_name
 * users can choose a transliteration or nickname they feel most
   comfortable with, and use that for their name and email address

 There are excellent technologies available for accurately displaying names
 in a variety of character sets. But Tor descriptors and the associated
 infrastructure can't reliably do that at the moment.

 Regardless of what decision we make about the format of these fields,
 there will be ambiguity. There will be display inconsistency. There will
 be users who can't email non-ASCII addresses.
 And this means that the Contact field can't achieve its purpose: helping
 people contact relay operators.

 I'm also concerned that making Unicode a requirement in some other Tor
 applications could have security impacts, and would take some work to
 implement correctly.

 I'm happy to discuss other alternatives. Perhaps we need a proposal. And
 perhaps we need more time to work this out.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18938#comment:19>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs