03-Aug-2014 21:40, Andrei Alexandrescu wrote:
> On 8/3/14, 10:19 AM, Sean Kelly wrote:
>> I don't want to pay for anything I don't use. No allocations should
>> occur within the parser and it should simply slice up the input.
>
> What to do about arrays and objects, which would naturally allocate
> arrays and associative arrays respectively? What about strings with
> backslash-encoded characters?

SAX-style would imply that arrays and objects are "parsed" by calling 6
user-defined callbacks from inside the parser, so the parser itself never
has to build an array or an associative array:
startArray, endArray, startObject, endObject, id and value.

Simplified pseudocode for the JSON parser's inner loop is then:

    if (cur == '[')
        startArray();
    else if (cur == '{')
        startObject();
    else if (cur == '}')
        endObject();
    else if (cur == ']')
        endArray();
    else {
        if (expectObjectKey)
            id(parseAsIdentifier());
        else
            value(parseAsValue());
    }

This is as barebones as it can get and is very fast in practice,
especially in the context of searching/extracting/matching specific
sub-trees of JSON documents.
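
To make the loop above concrete, here is a rough sketch of how it might look
in C++17, with std::string_view standing in for slices of the input. The
names parseJson, isDelimiter and Trace are made up for illustration; only
the six callback names come from the list above. It assumes a nesting limit
of 64 levels (a bitmask instead of a heap-allocated stack), does not check
that brackets match or that commas and colons are well placed, and hands
string values over as raw slices with quotes and backslash escapes intact,
so decoding is left to whoever actually needs the decoded text.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Characters that end a bare scalar (number, true, false, null).
inline bool isDelimiter(char c) {
    return c == ' ' || c == '\n' || c == '\t' || c == '\r' || c == ',' ||
           c == ':' || c == '[' || c == ']' || c == '{' || c == '}' || c == '"';
}

// Hypothetical SAX-style driver: H must provide the six callbacks.
// The parser only slices `in`; it never allocates.
template <class H>
bool parseJson(std::string_view in, H& h) {
    std::uint64_t inObject = 0;  // bit d set => nesting level d is an object
    int depth = 0;               // assumption: at most 64 nesting levels
    bool expectKey = false;
    std::size_t i = 0;
    while (i < in.size()) {
        const char cur = in[i];
        if (cur == ' ' || cur == '\n' || cur == '\t' || cur == '\r' || cur == ':') {
            ++i;  // whitespace and ':' carry no event in this sketch
        } else if (cur == ',') {
            // The next token is a key only if the enclosing container is an object.
            expectKey = depth > 0 && ((inObject >> (depth - 1)) & 1) != 0;
            ++i;
        } else if (cur == '[' || cur == '{') {
            if (depth == 64) return false;
            const bool obj = (cur == '{');
            const std::uint64_t bit = std::uint64_t{1} << depth;
            inObject = obj ? (inObject | bit) : (inObject & ~bit);
            ++depth;
            expectKey = obj;
            if (obj) h.startObject(); else h.startArray();
            ++i;
        } else if (cur == ']' || cur == '}') {
            if (depth == 0) return false;
            --depth;
            expectKey = false;
            if (cur == '}') h.endObject(); else h.endArray();
            ++i;
        } else {
            const std::size_t start = i;
            if (cur == '"') {
                ++i;
                // Skip to the closing quote; a backslash hides the next char.
                while (i < in.size() && in[i] != '"') i += (in[i] == '\\') ? 2 : 1;
                if (i >= in.size()) return false;  // unterminated string
                ++i;
            } else {
                while (i < in.size() && !isDelimiter(in[i])) ++i;
            }
            const std::string_view tok = in.substr(start, i - start);
            if (expectKey) {
                if (cur != '"') return false;        // keys must be strings
                h.id(tok.substr(1, tok.size() - 2)); // key without its quotes
                expectKey = false;
            } else {
                h.value(tok);                        // raw slice, undecoded
            }
        }
    }
    return depth == 0;
}

// Example handler that records the event stream as text.
// (This handler allocates; the parser does not.)
struct Trace {
    std::string out;
    void startArray()  { out += "[ "; }
    void endArray()    { out += "] "; }
    void startObject() { out += "{ "; }
    void endObject()   { out += "} "; }
    void id(std::string_view k)    { out += "id:";  out += k; out += ' '; }
    void value(std::string_view v) { out += "val:"; out += v; out += ' '; }
};
```

A handler that only wants one sub-tree can track depth in startObject/endObject
and ignore every other event, which is where this style tends to pay off.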
--
Dmitry Olshansky