willmurnane commented on issue #88:
URL: https://github.com/apache/accumulo-access/issues/88#issuecomment-3583623244

   Ah, I misremembered Java semantics. I thought
`new String(invalidUtf8Bytes, StandardCharsets.UTF_8)` threw, but it doesn't;
it just replaces malformed input with U+FFFD (the replacement character).
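
   For reference, a minimal sketch of the two decoding behaviors. The class and method names here (`Utf8Decoding`, `decodeLenient`, `decodeStrict`, `isValidUtf8`) are made up for illustration and are not part of accumulo-access. The `String` constructor substitutes U+FFFD, while a `CharsetDecoder` set to `CodingErrorAction.REPORT` throws.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8Decoding {

  // Lenient: malformed sequences are replaced with U+FFFD, no exception.
  static String decodeLenient(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Strict: malformed input raises CharacterCodingException.
  static String decodeStrict(byte[] bytes) throws CharacterCodingException {
    return StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
        .decode(ByteBuffer.wrap(bytes))
        .toString();
  }

  // True only if the bytes decode cleanly as UTF-8.
  static boolean isValidUtf8(byte[] bytes) {
    try {
      decodeStrict(bytes);
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    byte[] invalid = {'a', (byte) 0xFF, 'b'}; // 0xFF never appears in valid UTF-8
    System.out.println(decodeLenient(invalid).equals("a\uFFFDb")); // prints "true"
    System.out.println(isValidUtf8(invalid)); // prints "false"
  }
}
```

   If the spec did require valid UTF-8, a check along the lines of `isValidUtf8` is one way an implementation could enforce it.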
   
   I think accumulo-access would be a better standard if valid UTF-8 were 
required. Since access expressions are human-facing (at least at some level; 
for example, they're not encoded as CBOR or JSON or Protobuf or whatever), a 
requirement that they be human-readable in a popular encoding makes sense to 
me. I understand you'd probably like to keep existing Accumulo code compatible 
with this spec, though, so I think documenting the actual behavior you're 
trying to be compatible with is a good solution. 
   
   ```abnf
   access-token            = 1*( ALPHA / DIGIT / "_" / "-" / "." / ":" / slash )
   access-token            =/ DQUOTE 1*(utf8-subset / escaped) DQUOTE
   
   utf8-subset             = %x00-21 / %x23-5B / %x5D-FF ; any byte, except 0x22 '"' or 0x5C '\'
   escaped                 = "\" DQUOTE / "\\" ; '\"' or '\\'
   slash                   = "/"
   ```
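
   To make the byte-level reading of that grammar concrete, here is a rough sketch of a checker for a single `access-token`. It is only an illustration under my reading of the ABNF above, not the actual accumulo-access parser; `AccessTokenCheck`, `isUnquotedByte`, and `isAccessToken` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical checker for one access-token, operating on raw bytes.
final class AccessTokenCheck {

  // Unquoted form: ALPHA / DIGIT / "_" / "-" / "." / ":" / "/"
  static boolean isUnquotedByte(int b) {
    return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') || (b >= '0' && b <= '9')
        || b == '_' || b == '-' || b == '.' || b == ':' || b == '/';
  }

  static boolean isAccessToken(byte[] token) {
    if (token.length == 0) {
      return false;
    }
    if (token[0] != '"') {
      // Unquoted: every byte must come from the restricted ASCII set.
      for (byte raw : token) {
        if (!isUnquotedByte(raw & 0xFF)) {
          return false;
        }
      }
      return true;
    }
    // Quoted: DQUOTE, at least one element, DQUOTE.
    if (token.length < 3 || token[token.length - 1] != '"') {
      return false;
    }
    int end = token.length - 1; // index of the closing quote
    int i = 1;
    while (i < end) {
      int b = token[i] & 0xFF;
      if (b == '"') {
        return false; // unescaped quote inside the token
      }
      if (b == '\\') {
        // A backslash must be followed by '"' or '\' before the closing quote.
        if (i + 1 >= end) {
          return false;
        }
        int next = token[i + 1] & 0xFF;
        if (next != '"' && next != '\\') {
          return false;
        }
        i += 2;
      } else {
        i++; // any other byte is allowed, including bytes that are not valid UTF-8
      }
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] quotedInvalidUtf8 = {'"', 'a', (byte) 0xFF, '"'};
    System.out.println(isAccessToken("a.b:c/d".getBytes(StandardCharsets.US_ASCII)));
    System.out.println(isAccessToken(quotedInvalidUtf8));
    System.out.println(isAccessToken("\"a\\\"".getBytes(StandardCharsets.US_ASCII)));
  }
}
```

   Note that the quoted branch accepts the 0xFF byte, which is exactly the "any byte" behavior the grammar documents.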


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
