[ 
https://issues.apache.org/jira/browse/JENA-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123422#comment-13123422
 ] 

Andy Seaborne commented on JENA-127:
------------------------------------


The RIOT output framework is less well defined; "undefined" would be 
accurate.  For now, the best approach is to write into Jena core as a 
Jena writer and go via a static function like writeRDFJSON(graph); it 
will get adapted when/if RIOT gains an output framework.

Input efficiency is more important because it more directly affects 
users' experience with large data.


Rob - the tokenizer is doing minimal lookahead.  Bytes->chars conversion 
is done in large chunks (by the Java library - I tried to short circuit 
it with a non-codepoint-checking version but it was not faster).
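
As an illustration of chunked bytes->chars conversion by the Java 
library, here is a minimal sketch.  It is not the RIOT code: the class 
name ChunkedDecode, the method countChars and the 128K buffer size are 
assumptions made for the example.  It only uses the standard 
InputStreamReader to decode UTF-8 into a large char[] in one call per 
chunk, rather than decoding character by character.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of Jena/RIOT.
class ChunkedDecode {
    // Assumed chunk size; the size RIOT actually uses is not stated here.
    static final int CHUNK = 128 * 1024;

    // Decode the stream as UTF-8 in large chunks and count the chars produced.
    static long countChars(InputStream in) {
        try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[CHUNK];
            long total = 0;
            int n;
            // Each read() asks the library to fill as much of the chunk as it can.
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(countChars(new ByteArrayInputStream(bytes)));
    }
}
```

A tokenizer would scan each filled chunk instead of just counting it; 
the point is that decoding cost is paid per chunk, not per character.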

The tokens are quite simple - only one character of lookahead is needed 
except in the case of prefix names [*].  There are no general regexps 
going on so it should be fast, and I've profiled it heavily for N-Triples 
and others - I don't see any obvious hot spots or inefficiencies.

[*] The end of a local name is (CHARS|'.')* CHARS so there is a little 
dance to handle ".".  This does not affect RDF/JSON, and it needs at 
most one character of pushback, not general backtracking.
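
The "little dance" can be sketched as follows.  This is a simplified 
illustration, not the RIOT tokenizer: the class LocalNameScan, its 
methods and the isNameChar test (letters, digits, '_' and '-') are 
assumptions made for the example.  The scanner greedily reads 
(CHARS|'.')*, and if the text it collected ends in ".", it drops that 
dot and pushes back exactly one character.  Input with several 
trailing dots is not handled by this sketch.

```java
// Hypothetical example class, not part of Jena/RIOT.
class LocalNameScan {
    private final String input;
    private int pos = 0;

    LocalNameScan(String input) { this.input = input; }

    // One character of lookahead; -1 at end of input.
    int peek() { return pos < input.length() ? input.charAt(pos) : -1; }

    // Consume one character; -1 at end of input.
    int read() { return pos < input.length() ? input.charAt(pos++) : -1; }

    // Push back the single most recently read character.
    void pushBackOne() { pos--; }

    // Simplified stand-in for the real CHARS production (an assumption).
    static boolean isNameChar(int ch) {
        return ch != -1 && (Character.isLetterOrDigit(ch) || ch == '_' || ch == '-');
    }

    // Read a local name matching (CHARS|'.')* CHARS.
    String readLocalName() {
        StringBuilder sb = new StringBuilder();
        while (true) {
            int ch = peek();
            if (!isNameChar(ch) && ch != '.') break;
            sb.append((char) read());
        }
        // A trailing '.' is not part of the name: give it back.
        if (sb.length() > 0 && sb.charAt(sb.length() - 1) == '.') {
            sb.setLength(sb.length() - 1);
            pushBackOne();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LocalNameScan s = new LocalNameScan("a.b. ");
        System.out.println(s.readLocalName()); // local name without the final dot
        System.out.println((char) s.peek());   // the dot is still available as the next token
    }
}
```

An interior dot, as in "a.b", stays in the name; only a dot at the very 
end is returned to the input, so no general backtracking is needed.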

                
> Add RDF/JSON Parsing Support to RIOT
> ------------------------------------
>
>                 Key: JENA-127
>                 URL: https://issues.apache.org/jira/browse/JENA-127
>             Project: Jena
>          Issue Type: New Feature
>          Components: ARQ, Jena, RIOT
>         Environment: All
>            Reporter: Rob Vesse
>            Assignee: Paolo Castagna
>            Priority: Minor
>              Labels: patch, rdf/json, riot
>         Attachments: ARQ-RDF-JSON-tests_r1179639.patch, 
> ARQ_JENA-127_r1179358.patch, JenaReaderRdfJson.java, LangRDFJSON.java, 
> RdfJsonRiotPatch-ApacheSVN.patch, RdfJsonRiotPatch.patch, 
> RdfJsonRiotPatch.patch, TestLangRdfJson.java, TestLangRdfJson.java
>
>
> The attached patch provides an RDF/JSON (Talis Specification) parser for RIOT; 
> the patch is against ARQ trunk from the Jena SourceForge SVN repository.
> It plugs in as an implementation of LangRIOT (named LangRDFJSON) and uses the 
> existing TokenizerJSON from the atlas package to do the tokenisation.  There 
> is also a JenaReaderRdfJson added as part of this patch which does what the 
> name suggests.
> I have also included in this patch a set of unit tests which verify the 
> parser's behaviour with a variety of valid and invalid inputs.
> There are still some things to be addressed:
> - The patch includes registration of the Jena reader when 
> SysRiot.writeIntoJena() is called but does not unregister itself when 
> resetJenaReaders() is called, should this be done?
> - Add a RDF/JSON writer - a separate patch will be submitted at a later date 
> (likely next week) for this
> Otherwise the patch is fairly comprehensive and I hope can be reviewed and 
> included in future releases
> EDIT - I have now redone the patch against Apache SVN as well and attached 
> that as a separate file since there are some differences in the structure of 
> the two repos and some minor code changes that mean the SourceForge SVN patch 
> cannot be applied directly against Apache SVN


        
