I'm loading the complete Wikidata RDF dump in Turtle format, split into pieces and loaded with 10 active readers. About half the time the load fails with one or more of these errors. The errors always occur near the beginning of the load: in the first group of 10 files to be loaded, and near the beginning of those files (generally within the first couple of hundred lines of a file well over 1 GB in size). No errors occur for any files beyond the first ten.
I could provide the files, but they total about 340 GB.

It looks as if there is a bug when loading RDF language-tagged strings: a race condition in which two threads try to add the same language tag to DB.DBA.RDF_LANGUAGE. This would explain why the problem occurs only at the beginning of the load, when the language tags are first being added to DB.DBA.RDF_LANGUAGE, and not later. It would also explain why the errors differ between runs. (The only other explanation would be hardware errors, but that doesn't seem plausible.)

It seems to me that a quick patch for this problem would be to change the insert into a soft insert, but I don't know where to make this change in the code.

peter


On 12/11/18 7:11 PM, Hugh Williams wrote:
> Hi Peter,
>
> The triple values do indeed appear to be valid, but the problem could be
> somewhere else in the dataset file and not necessarily on the reported line or
> the line before it.
>
> Is it a public dataset you are loading, and if so can you provide a copy for
> local testing?
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software
>
>> On 11 Dec 2018, at 17:45, Peter F. Patel-Schneider <pfpschnei...@gmail.com> wrote:
>>
>> I'm loading a bunch of Turtle files and I'm getting the error
>>
>> 2300 TURTLE RDF loader, line 1012: SR197: Non unique primary key on
>> DB.DBA.RDF_LANGUAGE
>>
>> The line in question looks fine:
>>
>> "Wikimedia template"@ki,
>>
>> The line before it may indicate the issue
>>
>> "Wikimedia template"@kg,
>>
>> Nonetheless this should be valid RDF so there appears to be a bug in Virtuoso
>> here.
>>
>> Is there any workaround?
>>
>> This is in Virtuoso 07.20.3230.
>>
>> peter
>>
>> _______________________________________________
>> Virtuoso-users mailing list
>> Virtuoso-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
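
For illustration only, here is a minimal Python sketch of the kind of check-then-insert race suspected above, and of why a soft insert would sidestep it. Everything in it is an assumption made for the example: the `lang_tags` table, the helper names, and the use of SQLite's `INSERT OR IGNORE` as a stand-in for a soft insert. It does not reproduce Virtuoso's loader code or the actual DB.DBA.RDF_LANGUAGE schema, and the interleaving of the two readers is replayed by hand in a single thread so that the outcome is deterministic.

```python
import sqlite3


def new_table():
    # Toy stand-in for a tag table with a unique key; not Virtuoso's schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lang_tags (tag TEXT PRIMARY KEY)")
    return conn


def tag_known(conn, tag):
    row = conn.execute("SELECT 1 FROM lang_tags WHERE tag = ?", (tag,)).fetchone()
    return row is not None


def interleaved_load(conn, tag, insert_sql):
    """Replay the suspected interleaving for two hypothetical readers, A and B."""
    # Both readers check before either inserts, so both see the tag as missing.
    seen = {"A": tag_known(conn, tag), "B": tag_known(conn, tag)}
    failures = []
    for reader in ("A", "B"):
        if not seen[reader]:
            try:
                conn.execute(insert_sql, (tag,))
            except sqlite3.IntegrityError as exc:
                # The second plain insert violates the primary key.
                failures.append((reader, str(exc)))
    return failures


PLAIN_INSERT = "INSERT INTO lang_tags (tag) VALUES (?)"
# SQLite's closest analogue to a soft insert: skip rows whose key already exists.
SOFT_INSERT = "INSERT OR IGNORE INTO lang_tags (tag) VALUES (?)"

plain_failures = interleaved_load(new_table(), "ki", PLAIN_INSERT)

soft_conn = new_table()
soft_failures = interleaved_load(soft_conn, "ki", SOFT_INSERT)
row_count = soft_conn.execute("SELECT COUNT(*) FROM lang_tags").fetchone()[0]

print("plain insert failures:", plain_failures)
print("soft insert failures:", soft_failures, "rows:", row_count)
```

With the plain insert, the second reader hits a primary-key violation, which matches the shape of the SR197 error; with the ignore-on-conflict insert, both readers proceed and the tag is stored once. Whether the loader actually follows this pattern is exactly the open question in this thread.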