[ https://issues.apache.org/jira/browse/NUTCH-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel closed NUTCH-2938.
----------------------------------

> Use Any23's RepositoryWriter to write structured data to Rdf4j repository
> -------------------------------------------------------------------------
>
>                 Key: NUTCH-2938
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2938
>             Project: Nutch
>          Issue Type: Improvement
>          Components: any23, plugin
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>
> I have been running a patch which leverages
> [Any23's RepositoryWriter|https://any23.apache.org/apidocs/org/apache/any23/writer/RepositoryWriter.html]
> (implemented as one of a number of TripleHandlers via
> [CompositeTripleHandler|https://any23.apache.org/apidocs/org/apache/any23/writer/CompositeTripleHandler.html])
> to write Any23 extractions to
> [GraphDB|https://www.ontotext.com/products/graphdb/]. This enables us to
> build a content graph from data across the enterprise.
>
> This feature is turned off by default, so it will not change existing Any23
> behaviour. I have concerns about the performance of this patch because, right
> now, we need to create a new repository connection for each URL. This is not
> great, so I will definitely improve on it.
>
> PR coming up.
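
For illustration only, the connection-per-URL concern raised in the description can be sketched in plain Java. This is not the patch and does not use the Any23 or RDF4J APIs: `TripleSink`, `CountingRepository`, `writePerUrl`, and `writeShared` are hypothetical names introduced here, and the in-memory repository is an assumption that only counts how many connections get opened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionReuseSketch {

    /** Hypothetical stand-in for a repository connection; NOT the RDF4J API. */
    interface TripleSink extends AutoCloseable {
        void write(String subject, String predicate, String object);

        @Override
        void close();
    }

    /** Hypothetical in-memory repository that counts opened connections. */
    static final class CountingRepository {
        final AtomicInteger opened = new AtomicInteger();
        final List<String> triples = new ArrayList<>();

        TripleSink connect() {
            opened.incrementAndGet();
            return new TripleSink() {
                @Override
                public void write(String s, String p, String o) {
                    triples.add(s + " " + p + " " + o);
                }

                @Override
                public void close() {
                    // Nothing to release in this in-memory sketch.
                }
            };
        }
    }

    /** Pattern described in the issue: one new connection for each URL. */
    static void writePerUrl(CountingRepository repo, List<String> urls) {
        for (String url : urls) {
            try (TripleSink sink = repo.connect()) {
                sink.write(url, "rdf:type", "ex:Page");
            }
        }
    }

    /** Possible alternative: open one connection and reuse it for all URLs. */
    static void writeShared(CountingRepository repo, List<String> urls) {
        try (TripleSink sink = repo.connect()) {
            for (String url : urls) {
                sink.write(url, "rdf:type", "ex:Page");
            }
        }
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
                "https://example.org/a",
                "https://example.org/b",
                "https://example.org/c");

        CountingRepository perUrl = new CountingRepository();
        writePerUrl(perUrl, urls);

        CountingRepository shared = new CountingRepository();
        writeShared(shared, urls);

        System.out.println("per-URL connections: " + perUrl.opened.get()); // 3
        System.out.println("shared connections:  " + shared.opened.get()); // 1
    }
}
```

Both variants store the same triples; only the number of opened connections differs. Whether a shared connection is safe in the actual plugin depends on how the real writer and connection are managed across threads and tasks, which this sketch does not model.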