scovich opened a new issue, #9204:
URL: https://github.com/apache/arrow-rs/issues/9204

   **Describe the bug**
   
   The JSON tape decoder currently accepts invalid JSON such as:
   ```json
   {, "a": 1, ,,, , "b":2 }
   [, 1, ,,, , 2]
   ```
   
   **To Reproduce**
   
   Parse either of the inputs above with the JSON tape decoder. Both are accepted without an error.
   
   **Expected behavior**
   
   Leading commas and runs of consecutive commas should produce a parse error.
   
   From the [JSON spec](https://www.json.org/json-en.html):
   > An object is an unordered set of name/value pairs. An object begins with 
`{` left brace and ends with `}` right brace. Each name is followed by `:` 
colon and the name/value pairs are separated by `,` comma.
   > <img width="420" height="202" alt="Image" src="https://github.com/user-attachments/assets/59acf90c-b396-49df-a451-be340edaed0d" />
   > An array is an ordered collection of values. An array begins with `[` left 
bracket and ends with `]` right bracket. Values are separated by `,` comma.
   > <img width="416" height="115" alt="Image" src="https://github.com/user-attachments/assets/b4a0fc40-5be2-4243-a9f1-feb5fbb7c8ea" />
   
   **Additional context**
   
   The tape decoder lacks a `DecoderState::Comma` enum variant. Instead,
`DecoderState::Object` and `DecoderState::List` both rely on the following
(overly permissive) byte scan, which skips commas as if they were whitespace:
   ```rust
   iter.advance_until(|b| !json_whitespace(b) && b != b',');
   ```
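
To illustrate the difference, here is a minimal, self-contained sketch. It is not the arrow-rs implementation: `skip_permissive`, `check_flat_array`, and the `State` enum are hypothetical names, and the checker only handles a flat array of unsigned integers. The first function mirrors the predicate quoted above; the second tracks whether a comma is currently allowed, which is roughly the role a dedicated comma state would play.

```rust
/// JSON whitespace: space, tab, line feed, carriage return.
fn is_json_whitespace(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r')
}

/// Permissive skip modeled on the quoted predicate: commas are consumed
/// together with whitespace, so any run of them goes unnoticed.
fn skip_permissive(input: &[u8], mut pos: usize) -> usize {
    while pos < input.len() && (is_json_whitespace(input[pos]) || input[pos] == b',') {
        pos += 1;
    }
    pos
}

/// Skips whitespace only; commas are left for the caller to validate.
fn skip_ws(input: &[u8], mut pos: usize) -> usize {
    while pos < input.len() && is_json_whitespace(input[pos]) {
        pos += 1;
    }
    pos
}

/// Hypothetical states for the strict checker (not arrow-rs types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    /// Just after `[`: a value or `]` may follow.
    Start,
    /// Just after a value: `,` or `]` may follow.
    AfterValue,
    /// Just after `,`: only a value may follow.
    AfterComma,
}

/// Checks a flat array of unsigned integers and returns the element count.
/// Simplification: number syntax is not validated beyond "ASCII digits".
fn check_flat_array(input: &str) -> Result<usize, String> {
    let bytes = input.as_bytes();
    let mut pos = skip_ws(bytes, 0);
    if bytes.get(pos) != Some(&b'[') {
        return Err("expected `[`".to_string());
    }
    pos += 1;
    let mut state = State::Start;
    let mut count = 0;
    loop {
        pos = skip_ws(bytes, pos);
        match (bytes.get(pos).copied(), state) {
            (None, _) => return Err("unexpected end of input".to_string()),
            // `]` is legal after `[` or after a value, never after a comma.
            (Some(b']'), State::Start | State::AfterValue) => {
                let end = skip_ws(bytes, pos + 1);
                return if end == bytes.len() {
                    Ok(count)
                } else {
                    Err(format!("trailing data at byte {end}"))
                };
            }
            // A comma is legal only directly after a value.
            (Some(b','), State::AfterValue) => {
                state = State::AfterComma;
                pos += 1;
            }
            // A value is legal after `[` or after exactly one comma.
            (Some(b'0'..=b'9'), State::Start | State::AfterComma) => {
                while pos < bytes.len() && bytes[pos].is_ascii_digit() {
                    pos += 1;
                }
                count += 1;
                state = State::AfterValue;
            }
            (Some(b), _) => {
                return Err(format!("unexpected `{}` at byte {pos}", b as char));
            }
        }
    }
}
```

With this sketch, `check_flat_array("[1, 2]")` succeeds, while `check_flat_array("[, 1, ,,, , 2]")` fails at the leading comma. By contrast, `skip_permissive` started just after the `[` of that input jumps straight to the `1`, which is why the malformed text is accepted today.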
