willmurnane commented on issue #88: URL: https://github.com/apache/accumulo-access/issues/88#issuecomment-3583623244
Ah, I misremembered Java semantics: I thought `new String(invalidUtf8Bytes, StandardCharsets.UTF_8)` threw, but it doesn't; it just replaces each invalid sequence with U+FFFD (the replacement character).

I think accumulo-access would be a better standard if valid UTF-8 were required. Since access expressions are human-facing (at least at some level; for example, they're not encoded as CBOR or JSON or Protobuf or whatever), a requirement that they be human-readable in a popular encoding makes sense to me. I understand you'd probably like to keep existing Accumulo code compatible with this spec, though, so I think documenting the actual behavior you're trying to be compatible with is a good solution.

```abnf
access-token  = 1*( ALPHA / DIGIT / "_" / "-" / "." / ":" / slash )
access-token =/ DQUOTE 1*(utf8-subset / escaped) DQUOTE
utf8-subset   = %x00-21 / %x23-5B / %x5D-FF ; any byte, except 0x22 '"' or 0x5C '\'
escaped       = "\" DQUOTE / "\\" ; '\"' or '\\'
slash         = "/"
```
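For reference, here is a minimal sketch of the Java behavior described above, next to a strict alternative. The class and method names (`Utf8DecodeDemo`, `decodeLenient`, `decodeStrict`, `isValidUtf8`) are made up for illustration and are not part of accumulo-access; only the standard `java.nio.charset` calls are real.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of accumulo-access.
public class Utf8DecodeDemo {

    // Lenient: the String constructor never throws on malformed input;
    // it substitutes U+FFFD for each malformed sequence.
    public static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict: a CharsetDecoder configured with REPORT throws
    // CharacterCodingException on malformed input instead of replacing it.
    public static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Convenience wrapper: true only if the bytes are well-formed UTF-8.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            decodeStrict(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] invalid = {0x61, (byte) 0xFF, 0x62};
        System.out.println(decodeLenient(invalid).equals("a\uFFFDb")); // true
        System.out.println(isValidUtf8(invalid));                      // false
    }
}
```

If the spec did require valid UTF-8, a check along the lines of `isValidUtf8` is one way an implementation could reject bad input up front; the grammar above, by contrast, accepts arbitrary bytes inside quotes.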
