Re: Line Numbers
If you want to keep track of the provenance of data, you can use named graphs. If your graphs are small (fine-grained) enough, this may give you the precision needed to trace any triple back to its source.

On 18/01/2017 7:49 am, "Grahame Grieve" wrote:
> Yes, I could do that. I'll add it to my list.
>
> Grahame
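As a rough illustration of the named-graph idea, here is a minimal sketch in plain Java. It does not use Jena: `SourcedQuad` and `NamedGraphProvenance` are hypothetical names invented for this example, and the graph names are made-up `urn:example:` identifiers. The only point is that when each source is loaded into its own graph, the graph name doubles as the provenance record for every triple in it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a quad (a triple plus the name of its graph); not a Jena class.
record SourcedQuad(String graph, String subject, String predicate, String object) {}

final class NamedGraphProvenance {
    private final List<SourcedQuad> quads = new ArrayList<>();

    // Store a triple in the graph named after the place it came from.
    void add(String sourceGraph, String subject, String predicate, String object) {
        quads.add(new SourcedQuad(sourceGraph, subject, predicate, object));
    }

    // Names of all graphs that contain the given triple, in insertion order.
    List<String> sourcesOf(String subject, String predicate, String object) {
        List<String> graphs = new ArrayList<>();
        for (SourcedQuad quad : quads) {
            boolean sameTriple = quad.subject().equals(subject)
                    && quad.predicate().equals(predicate)
                    && quad.object().equals(object);
            if (sameTriple && !graphs.contains(quad.graph())) {
                graphs.add(quad.graph());
            }
        }
        return graphs;
    }
}
```

The finer-grained the graphs (one per file, per record, or even per statement), the more precisely `sourcesOf` can point back at the origin, at the cost of managing more graphs.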
Re: Line Numbers
> Add a stage to parsing which built a Triple to location map
> (ParserProfile receives the line and column number).
>
> The look up in that map to find the source of the triple. Imperfect in
> the general case but for working on a single file it might do what you
> are looking for.

Yes, I could do that. I'll add it to my list.

Grahame
Re: Line Numbers
You could even add a stage to parsing which builds a Triple-to-location map (ParserProfile receives the line and column number). Then look up in that map to find the source of the triple. This is imperfect in the general case, but for working on a single file it might do what you are looking for.

Triples are safe to use as hash keys.

Andy

On 17/01/17 20:23, A. Soroka wrote:
> There are several answers. There is no reason to suppose that any given
> triple actually derives from a file at all. It might have been created
> programmatically, or by inference, or from SPARQL, amongst many possible
> other means. You are suggesting the carriage of a really large amount of
> metadata all throughout Jena's internals. The performance implications
> would be big, and entirely negative.
>
> Andy has given you a really good road to go down if what you want is
> more detailed parsing metadata for, as you say, "reporting issues with
> the content". You can take off that metadata and record it elsewhere or
> record it in RDF in various ways.
>
> Perhaps you can tell us a little more about your use case and we can
> help you find a more targeted technique for it.
>
> ---
> A. Soroka
> The University of Virginia Library
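The Triple-to-location map suggested in this thread could look roughly like the sketch below. It is plain Java with a toy line-based parser, not Jena code: `SimpleTriple`, `SourceLocation` and `TripleLocationIndex` are hypothetical names, and the one-statement-per-line syntax is an assumption made to keep the example short. In a real setup the line and column would come from whatever the parser reports at the point where it creates the triple.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not Jena classes.
record SimpleTriple(String subject, String predicate, String object) {}

record SourceLocation(long line, long column) {}

final class TripleLocationIndex {
    // Records have value-based equals/hashCode, so they work as hash keys.
    private final Map<SimpleTriple, SourceLocation> locations = new HashMap<>();

    // Called by the parsing stage each time a triple is emitted.
    void put(SimpleTriple triple, long line, long column) {
        // A repeated triple keeps the first location seen for it.
        locations.putIfAbsent(triple, new SourceLocation(line, column));
    }

    // Returns null when the triple did not come from the parsed input.
    SourceLocation lookup(SimpleTriple triple) {
        return locations.get(triple);
    }

    // Toy parser: one whitespace-separated "s p o ." statement per line.
    static TripleLocationIndex parse(List<String> lines) {
        TripleLocationIndex index = new TripleLocationIndex();
        for (int i = 0; i < lines.size(); i++) {
            String raw = lines.get(i);
            String text = raw.strip();
            if (text.isEmpty() || text.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] parts = text.split("\\s+", 3);
            if (parts.length < 3) {
                continue; // not a complete statement in this toy syntax
            }
            // Drop the terminating " ." from the object term.
            String object = parts[2].replaceFirst("\\s*\\.$", "");
            // Column of the first non-whitespace character, 1-based.
            long column = raw.length() - raw.stripLeading().length() + 1;
            index.put(new SimpleTriple(parts[0], parts[1], object), i + 1, column);
        }
        return index;
    }
}
```

A triple that appears twice in the input, or that was never parsed from a file at all, shows why the approach is imperfect in the general case: the map here keeps only the first location, and `lookup` returns `null` for anything created by other means.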
Re: Line Numbers
There are several answers. There is no reason to suppose that any given triple actually derives from a file at all. It might have been created programmatically, or by inference, or from SPARQL, amongst many possible other means. You are suggesting the carriage of a really large amount of metadata all throughout Jena's internals. The performance implications would be big, and entirely negative.

Andy has given you a really good road to go down if what you want is more detailed parsing metadata for, as you say, "reporting issues with the content". You can take off that metadata and record it elsewhere or record it in RDF in various ways.

Perhaps you can tell us a little more about your use case and we can help you find a more targeted technique for it.

---
A. Soroka
The University of Virginia Library

On Jan 17, 2017, at 3:14 PM, Grahame Grieve <grah...@healthintersections.com.au> wrote:
> Hi,
>
> Yes, replacing a library is not simple, but I thought I'd still make the
> offer. Other advantages... no, it's just a JSON parser.
>
>> You did seem to be asking for a way to get from a triple in a graph to
>> the line where it was read, and that is not possible. There is no such
>> association.
>
> Why not? The library could provide a way, and retain the association.
>
> Grahame
Re: Line Numbers
Hi,

Yes, replacing a library is not simple, but I thought I'd still make the offer. Other advantages... no, it's just a JSON parser.

> You did seem to be asking for a way to get from a triple in a graph to
> the line where it was read, and that is not possible. There is no such
> association.

Why not? The library could provide a way, and retain the association.

Grahame

On Wed, Jan 18, 2017 at 6:58 AM, A. Soroka <aj...@virginia.edu> wrote:
> Replacing the JSON library in use is a considerably bigger proposition
> than working with the one we now use in a different way. Are there other
> advantages to using your custom code? We want to stick to well-supported
> dependencies unless there is a convincing argument otherwise.
>
> As for Turtle, I believe you can take a look at LangTurtleBase to see
> what might be done. Keep in mind that there's not necessarily a precise
> way to understand what line produces an error: it might occur in the
> interaction between tokens on more than one line.
>
> ---
> A. Soroka
> The University of Virginia Library
Re: Line Numbers
Replacing the JSON library in use is a considerably bigger proposition than working with the one we now use in a different way. Are there other advantages to using your custom code? We want to stick to well-supported dependencies unless there is a convincing argument otherwise.

As for Turtle, I believe you can take a look at LangTurtleBase to see what might be done. Keep in mind that there's not necessarily a precise way to understand what line produces an error: it might occur in the interaction between tokens on more than one line.

---
A. Soroka
The University of Virginia Library

On Jan 17, 2017, at 2:42 PM, Grahame Grieve <grah...@healthintersections.com.au> wrote:
> Well, I care about Turtle and JSON-LD. I can contribute a JSON library
> that preserves line numbers when the JSON is parsed, since the
> mainstream ones don't.
>
> Grahame
Re: Line Numbers
RDF does not have the concept of an order to triples, and indeed triples can be added to and deleted from the set of triples from different places.

What you can do is add stages to the parsing process to produce messages as parsing happens.

Andy

On 17/01/17 19:42, Grahame Grieve wrote:
> Well, I care about Turtle and JSON-LD. I can contribute a JSON library
> that preserves line numbers when the JSON is parsed, since the
> mainstream ones don't.
>
> Grahame
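One way to picture "a stage that produces messages as parsing happens" is a pass-through step that sits between the parser and whatever consumes the statements. The sketch below is plain Java and only an illustration: `StatementSink` and `ReportingStage` are hypothetical names (Jena's own streaming interface is `StreamRDF`, which has a different shape), and the predicate check is an arbitrary example rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface for this sketch; it is not Jena's StreamRDF.
interface StatementSink {
    void statement(long line, String subject, String predicate, String object);
}

// A pass-through stage: checks each statement, records a message, then forwards it.
final class ReportingStage implements StatementSink {
    private final StatementSink next;
    private final List<String> messages = new ArrayList<>();

    ReportingStage(StatementSink next) {
        this.next = next;
    }

    @Override
    public void statement(long line, String subject, String predicate, String object) {
        // Example check only: flag predicates that are not written as <IRI>.
        if (!(predicate.startsWith("<") && predicate.endsWith(">"))) {
            messages.add("line " + line + ": predicate is not an IRI: " + predicate);
        }
        // The statement continues down the pipeline unchanged.
        next.statement(line, subject, predicate, object);
    }

    // Messages collected so far, in the order the statements arrived.
    List<String> messages() {
        return messages;
    }
}
```

Because the message is produced while the line number is still known, nothing has to be attached to the triple itself once it reaches the graph.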
Re: Line Numbers
You did seem to be asking for a way to get from a triple in a graph to the line where it was read, and that is not possible. There is no such association. Andy is pointing out that only during parsing can such information be managed (and I pointed out that even that is not the case all the time). If that is not what you are asking for, perhaps you can clarify.

---
A. Soroka
The University of Virginia Library

On Jan 17, 2017, at 2:52 PM, Grahame Grieve <grah...@healthintersections.com.au> wrote:
> I'm not sure why that means it's not possible or of interest to trace
> the triples (or their parts) to source files.
>
> Grahame
Re: Line Numbers
I'm not sure why that means it's not possible or of interest to trace the triples (or their parts) to source files.

Grahame

On Wed, Jan 18, 2017 at 6:47 AM, Andy Seaborne <a...@apache.org> wrote:
> RDF does not have the concept of an order to triples, and indeed triples
> can be added to and deleted from the set of triples from different
> places.
>
> What you can do is add stages to the parsing process to produce
> messages as parsing happens.
>
> Andy
Re: Line Numbers
Well, I care about Turtle and JSON-LD. I can contribute a JSON library that preserves line numbers when the JSON is parsed, since the mainstream ones don't.

Grahame

On Wed, Jan 18, 2017 at 5:38 AM, A. Soroka <aj...@virginia.edu> wrote:
> That will depend a bit on the language. For example, JSON parsing
> doesn't occur directly in Jena; Jena uses a library that parses from
> JSON to Java objects and then works with those objects:
>
> org.apache.jena.riot.lang.JsonLDReader.read(InputStream, String,
> ContentType, StreamRDF, Context)
>
> In some other cases, it seems like it should be possible. Do you have a
> specific language in mind?
>
> ---
> A. Soroka
> The University of Virginia Library
Re: Line Numbers
That will depend a bit on the language. For example, JSON parsing doesn't occur directly in Jena; Jena uses a library that parses from JSON to Java objects and then works with those objects:

org.apache.jena.riot.lang.JsonLDReader.read(InputStream, String, ContentType, StreamRDF, Context)

In some other cases, it seems like it should be possible. Do you have a specific language in mind?

---
A. Soroka
The University of Virginia Library

On Jan 16, 2017, at 6:48 AM, Grahame Grieve wrote:
> Can the Jena parser maintain a link between the triples and the line
> number from which they are sourced in the original file? This is really
> useful for reporting issues with the content.
>
> Grahame
Line Numbers
Can the Jena parser maintain a link between the triples and the line number from which they are sourced in the original file? This is really useful for reporting issues with the content.

Grahame

--
http://www.healthintersections.com.au / grah...@healthintersections.com.au / +61 411 867 065