Lewis John McGibbney created NUTCH-2938:
-------------------------------------------

             Summary: Use Any23's RepositoryWriter to write structured data to Rdf4j repository
                 Key: NUTCH-2938
                 URL: https://issues.apache.org/jira/browse/NUTCH-2938
             Project: Nutch
          Issue Type: Improvement
          Components: any23, plugin
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 1.19


I have been running a patch which leverages [Any23's RepositoryWriter|https://any23.apache.org/apidocs/org/apache/any23/writer/RepositoryWriter.html] (implemented as one of a number of TripleHandlers via [CompositeTripleHandler|https://any23.apache.org/apidocs/org/apache/any23/writer/CompositeTripleHandler.html]) to write Any23 extractions to [GraphDB|https://www.ontotext.com/products/graphdb/]. This enables us to build a content graph from data across the enterprise.

This feature is turned off by default, so it will not change existing Any23 behaviour.

I have concerns about the performance of this patch because it currently creates a new repository connection for each URL. This is not great, so I will definitely improve on it.

PR coming up.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)