I'm looking for a way to incrementally decode a JSON file. I know this has come up before, and in general the problem is not soluble (because in theory the JSON file could be a single huge object). In my particular situation, though, I have a 9GB file containing a single top-level array with many elements. So what I could (in theory) do is parse one element at a time, yielding each as I go.
The problem is that the stdlib JSON library reads the whole file, which defeats my purpose. What I'd like is for it to read one complete element, then just far enough ahead to establish that the parse was done, and return the object it found. (It should probably also return the "next token", as it can't reliably push it back; I'd check that it was a comma before proceeding with the next list element.)

I couldn't see a way to get the stdlib json library to read "just as much as needed" in this way. Did I miss a trick? Or alternatively, is there a JSON decoder library on PyPI that supports this sort of usage? I'd rather not have to implement my own JSON parser if I can avoid it.

Thanks,
Paul
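For concreteness, the element-at-a-time behaviour described above might look roughly like the sketch below. It is only an illustration under assumptions, not a confirmed answer: it leans on json.JSONDecoder.raw_decode from the stdlib, applied to a text buffer that is topped up in chunks, and it assumes the file is opened in text mode. The names iter_array_elements and chunk_size are made up for this example. One known weakness is that malformed input can make it buffer the rest of the file before the error is reported.

```python
import json

_WS = " \t\n\r"


def iter_array_elements(fp, chunk_size=65536):
    """Yield the elements of a top-level JSON array from text file fp, one at a time.

    Hypothetical helper: only the current element plus one chunk of lookahead
    is normally held in memory.
    """
    decoder = json.JSONDecoder()
    buf, pos = "", 0

    def more():
        # Drop text already consumed, append the next chunk; False at end of file.
        nonlocal buf, pos
        chunk = fp.read(chunk_size)
        if not chunk:
            return False
        buf = buf[pos:] + chunk
        pos = 0
        return True

    def peek():
        # Skip whitespace and return the next character, or "" at end of file.
        nonlocal pos
        while True:
            while pos < len(buf) and buf[pos] in _WS:
                pos += 1
            if pos < len(buf):
                return buf[pos]
            if not more():
                return ""

    if peek() != "[":
        raise ValueError("expected a top-level JSON array")
    pos += 1
    if peek() == "]":
        return  # empty array

    while True:
        peek()  # move to the first character of the next element
        while True:
            try:
                obj, end = decoder.raw_decode(buf, pos)
            except json.JSONDecodeError:
                # Most likely the element is cut off by the chunk boundary.
                if not more():
                    raise
                continue
            # A bare number may be truncated ("12" of "123", "1" of "1.5",
            # "1" of "1e+5"), so insist on a little lookahead before trusting it.
            if len(buf) - end <= 2 and more():
                continue
            break
        pos = end
        yield obj

        delim = peek()  # the "next token": must be a comma or the closing bracket
        if delim == "]":
            return
        if delim != ",":
            raise ValueError(f"expected ',' or ']' but found {delim!r}")
        pos += 1
```

Usage would be along the lines of opening the file with open(path, encoding="utf-8") and looping with "for item in iter_array_elements(fp)", where path is whatever the 9GB file is called. An element that spans several chunks is re-decoded each time the buffer grows, so a larger chunk_size is likely a better fit when individual elements are big.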