I just want to say that I echo Reinier's sentiment: I don't really trust any JSON-safe-ifying regex unless it comes with a formal proof (which doesn't seem inconceivable -- anyone know of one?). How many times have you heard about a security hole (usually cross-site scripting) of the form "our HTML or Javascript validator was missing a case that allowed an XSS attack"? If GWT were to provide a parseSafe() method, I'd want to be absolutely certain that it was safe -- I do *not* want GWT to be responsible for someone else's security hole.
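To make the concern concrete, here is a minimal sketch of the two-step check from the regexp Ray quoted from the JSON RFC (further down this thread), pulled out into a helper so it can be tried without the eval(). The name passesRfcFilter is hypothetical, purely for illustration; the two regexps are copied from the quoted snippet. It only shows which inputs the character filter lets through, not that any of them is actually exploitable.

```javascript
// Hypothetical helper: the character filter from the quoted RFC snippet,
// separated from the eval() so it can be tested on its own.
function passesRfcFilter(text) {
  // Step 1: delete every double-quoted string literal.
  var stripped = text.replace(/"(\\.|[^"\\])*"/g, '');
  // Step 2: fail if any remaining character is outside the allowed set.
  return !/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(stripped);
}

// Well-formed JSON passes, as intended.
console.log(passesRfcFilter('{"a": [1, 2.5e3, true, null]}')); // true

// Reinier's example is not JSON, yet every character is in the allowed set.
console.log(passesRfcFilter('return flute.run')); // true

// Parentheses and single quotes are outside the allowed set.
console.log(passesRfcFilter('alert(1)')); // false
console.log(passesRfcFilter("{'a': 1}")); // false
```

So the filter is a character whitelist, not a grammar check: it accepts some text that no JSON parser would, and whether that text is harmless is left to the eval() that follows.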
On Thu, Aug 28, 2008 at 10:53 PM, Reinier Zwitserloot <[EMAIL PROTECTED]> wrote:
>
> #1: About that regexp - I don't trust it. It first finds all
> javascript strings and deletes them (this is the second regexp), and
> as far as I can tell, it's perfect, if a bit too strict in that it
> won't accept single-quote delimited strings, though those aren't
> technically JSON. It's the first regexp I don't trust: It scans the
> entire string that remains after removing all strings for any
> character that is not a legal JSON token. In particular, any
> combination of dot, and the letters 'falsetrunE' (all characters in
> 'false', 'true', and 'null', and the Exponent separator for floating
> points) are still legal. This means the following:
>
> return flute.run
>
> would not be caught by the regexp as dangerous - it contains no
> strings, and the entire thing consists only of legal JSON characters,
> by pure coincidence. And, it's not dangerous. At least, not if every
> javascript parser out there has no extra features and follows the spec
> exactly, which, well, is only true in wonderland.
>
> 'return' is the only official keyword on the javascript keyword list
> that you can make other than the intended 'null', 'false', and 'true',
> you can't use '=', you can't use 'delete', and you can't use '()', so
> it looks like you can't change any variables, or run any methods, and
> return by itself is probably innocuous, but, still. My inner sense of
> security's alarm bell is ringing. Some obscure bug in a javascript
> parser is now a lot easier to exploit.
>
> Then again, doing the parsing by hand is going to be a lot slower.
>
> #2: The slash thing to path into lists and maps: Scott, you make a
> good case. I'll remove it. .get(a).get(b) is not that much more
> trouble anyway.
>
> #3: I'm not going to implement ISO8601 parsing in the JSON library.
> asObject() would never return it, so it would deadstrip, but, ISO8601
> support is not something that should be hardcoded into a JSON library.
> If it is needed, it should be its own thing. Javascript has a built in
> date parser but it's very strict, so doesn't seem worth the trouble
> (see http://www.quirksmode.org/js/introdate.html for what Date.parse()
> accepts).
>
> #4: That regexp again: If the regexp is deemed acceptable for safe
> parsing, let's just remove the distinction between 'safe' and 'unsafe'
> parsing and double check everything. That regexp contains no backtrack
> traps - it'll always finish in O(n) order (worst case scenario, every
> character is hit twice, that's it), even in crappy regexp engines.
> Therefore, it should always run faster than the eval() instruction.
>
> #5: Protocol buffers seems like a bit of a reach for this. Having a
> GWT-specific output for PB is nice, but using the PB .proto format to
> run a 'compiler' that produces a JavaScriptObject that can wrap around
> JSON output (vs. pb output) isn't a perfect match with what PB is
> trying to do, I think.
>
> #6: Overlaying a JSO is really nice - Bruce's recent article (I wonder
> if this thread inspired it?) is quite convincing:
>
> http://googlewebtoolkit.blogspot.com/2008/08/getting-to-really-know-gwt-part-2.html
>
> The JSON API as proposed in the OP can be implemented as a JSO, which
> should do good things for efficiency. Such a library is still useful
> for generic JSON introspection.
>
> #7: A DSL that compiles to more specific JSO objects (e.g. having a
> public List<Customer> getCustomers() method, with each Customer also
> being a JSO class having such methods as 'String getFirstName()' for
> example) would be very nice. I don't think the generic JSON library
> needs to be designed with this in mind, so, why don't I start with
> just the generic library built on JSO?
> I'll include a verify method
> that checks if the JSON is at least valid, without modifying anything,
> so that Ray can at least get that much for his JSO overlays.
>
> On Aug 28, 11:28 pm, "Ray Cromwell" <[EMAIL PROTECTED]> wrote:
> > This is the regexp that the JSON RFC suggests:
> >
> > var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
> >         text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
> >     eval('(' + text + ')');
> >
> > It will take me a while to digest how it actually works :), but
> > assuming that it does work, it would provide a safe and efficient way
> > to validate JSON before evals, at least until browsers provide native
> > JSON parsers.
> >
> > I must say that as a GWT developer, a generic JSON library doesn't
> > appeal to me very much except in rare cases where you're trying to
> > deal with arbitrary JSON from unknown sources, say, if you were
> > writing a JSON-as-GWT tree visualization.
> >
> > But for most service interaction, you have to know the schema/format
> > of the service you are dealing with upfront, so if I'm building a
> > mashup, I prefer to just build concrete overlay types for each
> > service. The compiler is helped by this as well.
> >
> > I'm not against DSLs, one of the things I'm working on is a
> > Microformat equivalent to my GWT Exporter which allows you to take a
> > POJO and "map" it to an HTML Microformat structure for
> > serialization/deserialization, and this is done by both convention
> > and configuration (i.e. with annotations). There are proposals for
> > things like JSONPath, an XPath equivalent, but again, this totally
> > obscures object type relationships in the JSON object.
> >
> > I'd much prefer a DSL that somehow taught the compiler what it was
> > navigating. If the .proto format is too obscure, you could go in the
> > reverse direction.
> > Code up a POJO or interface, and use a generator to
> > turn the interface/POJO into a .proto file, and a linker to invoke PB
> > and package up a bunch of server-side deployables for whichever
> > container/language environment you like. This would keep everything in
> > Java.
> >
> > Barring that, as Scott mentioned, you could use something like my
> > gwtquery approach to combine JSON queries into inlined code.
> >
> > I'm doing a lot of work right now integrating GWT JSOs with Google
> > AppEngine JSON services, and having a fast, efficient, easy to use,
> > and secure JSON library is pretty important. I'm not against a generic
> > parser, but something that could lighten the load for those using JSOs
> > to integrate with non-Java server environments would be a big win. As
> > cool as GWT RPC is, it's not a universal hammer.
> >
> > -Ray
> >
> > On Thu, Aug 28, 2008 at 1:39 PM, Scott Blum <[EMAIL PROTECTED]> wrote:
> > > Reinier, I have to agree that a single library that could be used
> > > client and server side would be really useful, particularly if you
> > > use some GWT.isScript() love to make the client side really
> > > efficient. And it sounds better than the current JSON library (not
> > > that that's saying a lot).
> > > A few random thoughts:
> > > - You'd have to be pretty careful about how the API is constructed
> > >   to get all the dead stripping. Example: if .toObject() is allowed
> > >   to return Date, you've basically pulled in all the date parsing
> > >   code.
> > > - I have been told that there are some regexps available which allow
> > >   you to accept or reject input as legal JSON. You might look into
> > >   this as a really fast way of doing safe parse. Along those lines,
> > >   we regret that our JSON library's "parse" is the unsafe version. I
> > >   would strongly consider having an 'unsafeParse' (and possibly not
> > >   having a method named 'parse' at all).
> > > - The "foo/bar/1" stuff kinda makes my stomach turn, honestly,
> > >   contrast with Ray's gquery compile-time evaluation. I know
> > >   j.get("foo").get("bar").get(1) is more verby, but at least in
> > >   theory that can be optimized/inlined. Although, who knows, the
> > >   compiler does do static string eval these days, so maybe if you're
> > >   extraordinarily clever about the implementation, you could get
> > >   static eval to work in your favor.

--~--~---------~--~----~------------~-------~--~----~
http://groups.google.com/group/Google-Web-Toolkit-Contributors
-~----------~----~----~----~------~----~------~--~---