Re: [CODE4LIB] Q: XML2JSON converter

Houghton,Andrew Sat, 06 Mar 2010 16:00:18 -0800

> From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
> Bill Dueber
> Sent: Saturday, March 06, 2010 05:11 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Q: XML2JSON converter
> 
> Anyway, hopefully, it won't be a huge surprise that I don't disagree
> with any of the quote above in general; I would assert, though, that
> application/json and application/marc+json should both return JSON
> (in the same way that text/xml, application/xml, and 
> application/marc+xml can all be expected to return XML). 
> Newline-delimited json is starting to crop up in a few places 
> (e.g. couchdb) and should probably have its own mime type
> and associated extension. So I would say something like:
> 
> application/json -- return json (obviously)
> application/marc+json  -- return json
> application/marc+ndj  -- return newline-delimited json


This sounds like consensus on how to deal with newline-delimited JSON in a 
standards based manner.

I'm not familiar with CouchDB, but I am using MongoDB which is similar.  I'll 
have to dig into how they deal with this newline-delimited JSON.  Can you 
provide any references to get me started?

> In all cases, we should agree on a standard record serialization,
> though, and the pure-json returns should include something that 
> indicates what the heck it is (hopefully a URI that can act as a 
> distinct "namespace"-type identifier, including a version in it).

I agree that our MARC-JSON serialization needs some "namespace" identifier in 
it and it occurred to me that the way it is handling indicators, e.g., ind1 and 
ind2 properties, might be better handled as an array to accommodate IFLA's 
MARC-XML-ish where they can have from 1-9 indicator values.

BTW, our MARC-JSON content is specified in Unicode not MARC-8, per the JSON 
standard, which means you need to use \uXXXX notation to specify characters in 
strings, not sure I made that clear in earlier posts.  A downside to the 
current ECMA 262 specification is that it doesn't support \U00XXXXXX, as Python 
does, for the extended characters.  Hopefully that will get rectified in a 
future ECMA 262 specification.

> The question for me, I think, is whether within this community,  anyone
> who provides one of these types (application/marc+json and
> application/marc+ndj) should automatically be expected to provide both.
> I don't have an answer for that.

I think this issue gets into familiar territory when dealing with RDF formats.  
Let's see, there is N3, NT, XML, Turtle, etc.  Do you need to provide all of 
them?  No, but it's nice of the server to at least provide NT or Turtle and 
XML.  Ultimately it's up to the server.  But the only difference between use 
cases #2 and #3 is whether the output is wrapped in an array, so it's probably 
easy for the server to produce both.

Depending on how much time I get next week I'll talk with the developer network 
folks to see what I need to do to put a specification under their 
infrastructure.  Looks like from my schedule it's going to be another week of 
hell :(


Andy.

Re: [CODE4LIB] Q: XML2JSON converter

Reply via email to