On 03/02/2025 14:50, Adrian Gschwend wrote:
On 26.01.2025 19:13, Andy Seaborne wrote:

Hi Andy,

Aside: do you use SHACL?

I do, but for other use cases. In this particular use case, I was exporting a large government LOD database, and we don't have influence over everything some pipelines write into it. I simply wanted to run it on another database.

Recovery means reworking parsers. Which RDF syntaxes are you interested in? N-Triples can make use of the one-statement-per-line layout. Turtle can't, but recovery might risk skip-to-DOT. RDF/XML is based on an XML parser. JSON-LD is a 3rd party engine.

Ah, good point, I did not think that through. I personally have N-Triples in this scenario, but I completely see your point: in another, similar use case the source was a ton of JSON-LD files that were sometimes simply broken.

That brings me to another issue I had with this JSON-LD dump: It was around 60k files so shell expansion of *.json failed on all shells I tried due to the size and I could not find a glob option in riot. And scripting it didn't work either due to JVM startup time.
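One possible workaround for both problems (glob expansion and per-file JVM startup) is to list the directory from a script and pass the files to the command in batches, so the JVM starts once per batch rather than once per file. The following is only a sketch under assumptions: the helper names `batched_paths` and `convert_all` are hypothetical, the batch size is arbitrary, and the `riot` arguments in the usage comment should be checked against the installed Jena version.

```python
import os
import subprocess
from typing import Iterator, List


def batched_paths(directory: str, suffix: str, batch_size: int) -> Iterator[List[str]]:
    """Yield sorted paths of files ending in `suffix`, `batch_size` at a time.

    os.scandir lists the directory inside Python, so no shell glob
    expansion (and no argument-length limit on `*.json`) is involved.
    """
    paths = sorted(
        entry.path
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith(suffix)
    )
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def convert_all(directory: str, out_path: str, base_cmd: List[str],
                batch_size: int = 500) -> int:
    """Run `base_cmd` once per batch of .json files, appending stdout to `out_path`.

    Returns the number of batches for which the command exited non-zero.
    """
    failures = 0
    with open(out_path, "ab") as out:
        for batch in batched_paths(directory, ".json", batch_size):
            result = subprocess.run(base_cmd + batch, stdout=out)
            if result.returncode != 0:
                failures += 1
    return failures


# Hypothetical usage (command line is an assumption, not verified here):
# convert_all("dump", "all.nt", ["riot", "--syntax=jsonld", "--output=nt"])
```

With 60k files and a batch size of 500 this would mean about 120 JVM starts instead of 60k. A batch containing one broken file may still fail as a whole, so failed batches would need to be retried with a smaller batch size to isolate the bad files.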

Someone is looking at a native build of Jena, which would be great for command-line use.

https://github.com/apache/jena/discussions/2934


Warnings - do you mean IRI warnings? - or some syntax level warnings?

IRI warnings or invalid datatypes, usually some xsd that is wrong. 31st of February was a common one in this dataset for reasons.
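For N-Triples input, literals like this can be found before loading with a line-by-line scan. The sketch below is an illustration only, not how Jena checks datatypes: the function names are hypothetical, it only looks at `xsd:date`, and it deliberately simplifies the lexical form (four-digit non-negative years, optional timezone), so a valid date with a negative or longer year would be reported as invalid.

```python
import re
from datetime import date
from typing import Iterable, Iterator, Tuple

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# Matches "lexical"^^<...#date> for literals without escaped characters.
DATE_LITERAL = re.compile(r'"([^"\\]*)"\^\^<' + re.escape(XSD_DATE) + r">")

# Simplified xsd:date lexical form: YYYY-MM-DD with an optional timezone.
SIMPLE_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})(Z|[+-]\d{2}:\d{2})?")


def is_valid_xsd_date(lexical: str) -> bool:
    """Return True if `lexical` is a real calendar date in the simplified form."""
    match = SIMPLE_DATE.fullmatch(lexical)
    if match is None:
        return False
    year, month, day = (int(match.group(i)) for i in (1, 2, 3))
    try:
        date(year, month, day)  # raises ValueError for e.g. 31 February
    except ValueError:
        return False
    return True


def invalid_dates(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line number, lexical form) for each invalid xsd:date literal."""
    for number, line in enumerate(lines, start=1):
        for lexical in DATE_LITERAL.findall(line):
            if not is_valid_xsd_date(lexical):
                yield number, lexical
```

Because N-Triples has one statement per line, the reported line numbers point directly at the offending triples, which can then be fixed or dropped before the data is loaded elsewhere.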

There's a new IRI subsystem in the pipeline - and its errors/warnings are more controllable -
https://lists.apache.org/list.html?[email protected]#:~:text=Fuseki%20development%20features.-, %3D%3D%3D%3D%20IRI3986,-Issue%3A%20https

Was that a direct link to a post on the list? I only get an index.

regards

Adrian
