Re: Line Numbers

2017-01-17 Thread Conal Tuohy
If you want to keep track of the provenance of data, you can use named
graphs. If your graphs are small (fine-grained) enough, this may give
you the necessary precision to trace any triple back to its source.
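
A rough sketch of the named-graph idea in plain Java, for illustration only.
SketchQuad and ProvenanceStore are hypothetical stand-ins, not Jena classes,
and the "source" description string is an assumption about what you might
want to record for each graph. Each small chunk of input goes into its own
named graph, and a side table says where that graph came from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical quad: a triple plus the name of the graph that holds it.
final class SketchQuad {
    final String graph;
    final String subject;
    final String predicate;
    final String object;

    SketchQuad(String graph, String subject, String predicate, String object) {
        this.graph = graph;
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical store: quads plus a side table from graph name to source.
final class ProvenanceStore {
    private final List<SketchQuad> quads = new ArrayList<>();
    private final Map<String, String> graphSource = new HashMap<>();

    // Note where the content of one named graph came from.
    void describeGraph(String graphName, String source) {
        graphSource.put(graphName, source);
    }

    // Add a triple to the given named graph.
    void add(String graphName, String s, String p, String o) {
        quads.add(new SketchQuad(graphName, s, p, o));
    }

    // Recorded sources of every graph that contains the given triple;
    // graphs with no recorded source are reported as "unknown".
    List<String> sourcesOf(String s, String p, String o) {
        List<String> sources = new ArrayList<>();
        for (SketchQuad q : quads) {
            if (q.subject.equals(s) && q.predicate.equals(p) && q.object.equals(o)) {
                sources.add(graphSource.getOrDefault(q.graph, "unknown"));
            }
        }
        return sources;
    }
}
```

The smaller the graphs, the more precise the answer: one graph per file
gives file-level provenance, one graph per record gets close to line level.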

On 18/01/2017 7:49 am, "Grahame Grieve" wrote:

> > Add a stage to parsing that builds a Triple-to-location map
> > (ParserProfile receives the line and column number).
> >
> > Then look up in that map to find the source of the triple.  Imperfect in
> > the general case, but for working on a single file it might do what you
> > are looking for.
>
> yes I could do that. I'll add it to my list.
>
> Grahame


Re: Line Numbers

2017-01-17 Thread Grahame Grieve
>
> Add a stage to parsing that builds a Triple-to-location map (ParserProfile
> receives the line and column number).
>
> Then look up in that map to find the source of the triple.  Imperfect in
> the general case, but for working on a single file it might do what you are
> looking for.
>

yes I could do that. I'll add it to my list.

Grahame


Re: Line Numbers

2017-01-17 Thread Andy Seaborne

You could even ...

Add a stage to parsing that builds a Triple-to-location map
(ParserProfile receives the line and column number).

Then look up in that map to find the source of the triple.  Imperfect in
the general case, but for working on a single file it might do what you
are looking for.


Triples are safe to use as hash keys.
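
A minimal sketch of such a map in plain Java, for illustration only.
SketchTriple, SourceLocation and TripleLocationIndex are hypothetical names,
not Jena classes; in real code the key would be the parser's own triple
object, and record() would be called from wherever the parser reports the
line and column. The sketch assumes the first location seen for a triple is
the one to keep, which is one reason the approach is imperfect in general:
the same triple can occur on several lines.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for an RDF triple (not Jena's Triple class).
final class SketchTriple {
    final String subject;
    final String predicate;
    final String object;

    SketchTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    // Value equality, so that equal triples land on the same map entry.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SketchTriple)) return false;
        SketchTriple t = (SketchTriple) other;
        return subject.equals(t.subject)
                && predicate.equals(t.predicate)
                && object.equals(t.object);
    }

    @Override
    public int hashCode() {
        return Objects.hash(subject, predicate, object);
    }
}

// Hypothetical line/column pair as reported by the parser.
final class SourceLocation {
    final long line;
    final long column;

    SourceLocation(long line, long column) {
        this.line = line;
        this.column = column;
    }
}

// Side table filled during parsing and consulted afterwards.
final class TripleLocationIndex {
    private final Map<SketchTriple, SourceLocation> locations = new HashMap<>();

    // Remember where a triple was first seen; later repeats are ignored.
    void record(SketchTriple triple, long line, long column) {
        locations.putIfAbsent(triple, new SourceLocation(line, column));
    }

    // Empty if the triple did not come from the parsed file.
    Optional<SourceLocation> find(SketchTriple triple) {
        return Optional.ofNullable(locations.get(triple));
    }
}
```

Because the index lives outside the graph, nothing in the store itself has
to carry the extra metadata.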

Andy

On 17/01/17 20:23, A. Soroka wrote:

> There are several answers.
>
> There is no reason to suppose that any given triple actually derives from a
> file at all. It might have been created programmatically, or by inference, or
> from SPARQL, amongst many possible other means.
>
> You are suggesting the carriage of a really large amount of metadata all
> throughout Jena's internals. The performance implications would be big, and
> entirely negative.
>
> Andy has given you a really good road to go down if what you want is more
> detailed parsing metadata for, as you say, "reporting issues with the content".
> You can take off that metadata and record it elsewhere or record it in RDF in
> various ways. Perhaps you can tell us a little more about your use case and we
> can help you find a more targeted technique for it.
>
> ---
> A. Soroka
> The University of Virginia Library



Re: Line Numbers

2017-01-17 Thread A. Soroka
There are several answers.

There is no reason to suppose that any given triple actually derives from a 
file at all. It might have been created programmatically, or by inference, or 
from SPARQL, amongst many possible other means.

You are suggesting the carriage of a really large amount of metadata all 
throughout Jena's internals. The performance implications would be big, and 
entirely negative.

Andy has given you a really good road to go down if what you want is more 
detailed parsing metadata for, as you say, "reporting issues with the content". 
You can take off that metadata and record it elsewhere or record it in RDF in 
various ways. Perhaps you can tell us a little more about your use case and we 
can help you find a more targeted technique for it.

---
A. Soroka
The University of Virginia Library

> On Jan 17, 2017, at 3:14 PM, Grahame Grieve 
> <grah...@healthintersections.com.au> wrote:
> 
> hi
> 
> Yes, replacing a library is not simple, but I thought I'd still make the
> offer. Other advantages... no, it's just a JSON parser.
> 
>> You did seem to be asking for a way to get from a triple in a graph to
>> the line where it was read, and that is not possible. There is no such
>> association.
> 
> Why not? The library could provide a way, and retain the association.
> 
> Grahame



Re: Line Numbers

2017-01-17 Thread Grahame Grieve
hi

Yes, replacing a library is not simple, but I thought I'd still make the
offer. Other advantages... no, it's just a JSON parser.

> You did seem to be asking for a way to get from a triple in a graph to
> the line where it was read, and that is not possible. There is no such
> association.

Why not? The library could provide a way, and retain the association.

Grahame


On Wed, Jan 18, 2017 at 6:58 AM, A. Soroka <aj...@virginia.edu> wrote:

> Replacing the JSON library in use is a considerably bigger proposition
> than working with the one we now use in a different way. Are there other
> advantages to using your custom code? We want to stick to well-supported
> dependencies unless there is a convincing argument otherwise.
>
> As for Turtle, I believe you can take a look at LangTurtleBase to see what
> might be done. Keep in mind that there's not necessarily a precise way to
> understand what line produces an error-- it might occur in the interaction
> between tokens on more than one line.
>
> ---
> A. Soroka
> The University of Virginia Library


-- 
-
http://www.healthintersections.com.au / grah...@healthintersections.com.au
/ +61 411 867 065


Re: Line Numbers

2017-01-17 Thread A. Soroka
Replacing the JSON library in use is a considerably bigger proposition than 
working with the one we now use in a different way. Are there other advantages 
to using your custom code? We want to stick to well-supported dependencies 
unless there is a convincing argument otherwise.

As for Turtle, I believe you can take a look at LangTurtleBase to see what 
might be done. Keep in mind that there's not necessarily a precise way to 
understand what line produces an error-- it might occur in the interaction 
between tokens on more than one line.

---
A. Soroka
The University of Virginia Library

> On Jan 17, 2017, at 2:42 PM, Grahame Grieve 
> <grah...@healthintersections.com.au> wrote:
> 
> Well, I care about Turtle and JSON-LD.  I can contribute a JSON library
> that preserves line numbers when the JSON is parsed, since the mainstream
> ones don't.
> 
> Grahame



Re: Line Numbers

2017-01-17 Thread Andy Seaborne
RDF does not have the concept of an order to triples, and indeed triples
can be added to and deleted from the set of triples from different places.


What you can do is to add stages to the parsing process to produce 
messages as parsing happens.
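
A rough sketch of what such a stage could look like, in plain Java for
illustration only. StatementSink and ReportingStage are hypothetical; this
is not Jena's StreamRDF interface, and the assumption that the parser hands
the line and column to the stage along with each statement is mine. The
empty-literal check is just a placeholder for whatever content rule you
want to report on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback a parser would invoke for each statement it reads.
interface StatementSink {
    void statement(String s, String p, String o, long line, long col);
}

// A stage that inspects each statement as it passes, records a message
// with the source position, and then forwards the statement downstream.
final class ReportingStage implements StatementSink {
    private final StatementSink next;
    private final List<String> messages = new ArrayList<>();

    ReportingStage(StatementSink next) {
        this.next = next;
    }

    @Override
    public void statement(String s, String p, String o, long line, long col) {
        // Placeholder check (an assumption): flag empty string literals.
        if (o.equals("\"\"")) {
            messages.add("[line " + line + ", col " + col + "] empty literal for " + p);
        }
        next.statement(s, p, o, line, col);
    }

    // Messages collected so far, in parse order.
    List<String> messages() {
        return messages;
    }
}
```

The messages are produced while the position is still known, so nothing
has to be stored with the triples afterwards.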


Andy

On 17/01/17 19:42, Grahame Grieve wrote:

> Well, I care about Turtle and JSON-LD.  I can contribute a JSON library
> that preserves line numbers when the JSON is parsed, since the mainstream
> ones don't.
>
> Grahame



Re: Line Numbers

2017-01-17 Thread A. Soroka
You did seem to be asking for a way to get from a triple in a graph to the line 
where it was read, and that is not possible. There is no such association. Andy 
is pointing out that only during parsing can such information be managed (and I 
pointed out that even that is not the case all the time). If that is not what 
you are asking for, perhaps you can clarify.

---
A. Soroka
The University of Virginia Library

> On Jan 17, 2017, at 2:52 PM, Grahame Grieve 
> <grah...@healthintersections.com.au> wrote:
> 
> I'm not sure why that means it's not possible or of interest to trace the
> triples (or their parts) to source files.
> 
> Grahame



Re: Line Numbers

2017-01-17 Thread Grahame Grieve
I'm not sure why that means it's not possible or of interest to trace the
triples (or their parts) to source files.

Grahame


On Wed, Jan 18, 2017 at 6:47 AM, Andy Seaborne <a...@apache.org> wrote:

> RDF does not have the concept of an order to triples, and indeed triples
> can be added to and deleted from the set of triples from different places.
>
> What you can do is to add stages to the parsing process to produce
> messages as parsing happens.
>
> Andy


-- 
-
http://www.healthintersections.com.au / grah...@healthintersections.com.au
/ +61 411 867 065


Re: Line Numbers

2017-01-17 Thread Grahame Grieve
Well, I care about Turtle and JSON-LD.  I can contribute a JSON library
that preserves line numbers when the JSON is parsed, since the mainstream
ones don't.

Grahame


On Wed, Jan 18, 2017 at 5:38 AM, A. Soroka <aj...@virginia.edu> wrote:

> That will depend a bit on the language. For example, JSON parsing doesn't
> occur directly in Jena; Jena uses a library that parses from JSON to Java
> objects and then works with those objects:
>
> org.apache.jena.riot.lang.JsonLDReader.read(InputStream, String,
> ContentType, StreamRDF, Context)
>
> In some other cases, it seems like it should be possible. Do you have a
> specific language in mind?
>
> ---
> A. Soroka
> The University of Virginia Library


-- 
-
http://www.healthintersections.com.au / grah...@healthintersections.com.au
/ +61 411 867 065


Re: Line Numbers

2017-01-17 Thread A. Soroka
That will depend a bit on the language. For example, JSON parsing doesn't occur
directly in Jena; Jena uses a library that parses from JSON to Java objects and
then works with those objects:

org.apache.jena.riot.lang.JsonLDReader.read(InputStream, String, ContentType, 
StreamRDF, Context)

In some other cases, it seems like it should be possible. Do you have a 
specific language in mind?

---
A. Soroka
The University of Virginia Library

> On Jan 16, 2017, at 6:48 AM, Grahame Grieve wrote:
> 
> Can the Jena parser maintain a link between the triples and the line number
> from which they are sourced in the original file? This is really useful for
> reporting issues with the content.
> 
> Grahame



Line Numbers

2017-01-16 Thread Grahame Grieve
Can the Jena parser maintain a link between the triples and the line number
from which they are sourced in the original file? This is really useful for
reporting issues with the content.

Grahame


-- 
-
http://www.healthintersections.com.au / grah...@healthintersections.com.au
/ +61 411 867 065