Hi Peter,

I generated the datasets with your Python script and loaded them into a local
Virtuoso Open Source instance several times, but did not see the error on any
run:

SQL> select * from load_list;
ll_file                ll_graph                ll_state  ll_started                    ll_done                       ll_host  ll_work_time  ll_error
VARCHAR NOT NULL       VARCHAR                 INTEGER   TIMESTAMP                     TIMESTAMP                     INTEGER  INTEGER       VARCHAR
____________________________________________________________________________________________________________________________________________________

./wikidata/test00.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.54 983316000  0        NULL          NULL
./wikidata/test01.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 105660000  0        NULL          NULL
./wikidata/test02.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 233562000  0        NULL          NULL
./wikidata/test03.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 371457000  0        NULL          NULL
./wikidata/test04.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 483846000  0        NULL          NULL
./wikidata/test05.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 621974000  0        NULL          NULL
./wikidata/test06.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 742255000  0        NULL          NULL
./wikidata/test07.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 860062000  0        NULL          NULL
./wikidata/test08.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.55 993561000  0        NULL          NULL
./wikidata/test09.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 826749000  2018.12.19 0:47.56 140431000  0        NULL          NULL
./wikidata/test10.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.54 985386000  0        NULL          NULL
./wikidata/test11.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 109072000  0        NULL          NULL
./wikidata/test12.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 230846000  0        NULL          NULL
./wikidata/test13.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 375427000  0        NULL          NULL
./wikidata/test14.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 486963000  0        NULL          NULL
./wikidata/test15.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 624303000  0        NULL          NULL
./wikidata/test16.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 745760000  0        NULL          NULL
./wikidata/test17.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 862932000  0        NULL          NULL
./wikidata/test18.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.55 995704000  0        NULL          NULL
./wikidata/test19.ttl  http://test.nuance.com  2         2018.12.19 0:47.54 827790000  2018.12.19 0:47.56 144745000  0        NULL          NULL

20 Rows. -- 3 msec.
SQL> sparql select count(*) from <http://test.nuance.com> where {?s ?p ?o};
callret-0
INTEGER
_______________________________________________________________________________

135402

1 Rows. -- 85 msec.
SQL> status('');
REPORT
VARCHAR
_______________________________________________________________________________

OpenLink Virtuoso  Server
Version 07.20.3230-pthreads for Darwin as of Nov 10 2018 
Started on: 2018-12-19 00:36 GMT+0

Best Regards
Hugh Williams
Professional Services
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers




> On 18 Dec 2018, at 17:24, Peter F. Patel-Schneider <pfpschnei...@gmail.com> 
> wrote:
> 
> I created some synthetic data that tickles the bug reliably on my machine with
> a standard virtuoso.ini (just adding the directory for the files to the
> allowed list).  I'm attaching the generator program for the files and a
> loading script.
> 
> peter
> 
> 
> On 12/18/18 9:46 AM, Peter F. Patel-Schneider wrote:
>> I did a bit of digging and it sure looks as if there is a race condition in
>> rdf_rl_lang_id in ttlpv.sql.   This code appears to check to see if the
>> language tag is already in DB.DBA.RDF_LANGUAGE and adds it if not.  But
>> another thread could do the same insert between the check and the insert, as
>> far as I can tell.
>> 
>> It looks to me as if the right solution is to do a soft insert and a
>> subsequent query instead of a hard insert.
>> 
>> However, I don't understand how locking works in SQL so there may be
>> something that prevents another thread from interfering.
>> 
>> peter
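
The soft-insert-then-query idea proposed in the message above can be sketched
as follows. This is only an illustration in Python against an in-memory SQLite
database, not the Virtuoso code in ttlpv.sql: SQLite's INSERT OR IGNORE stands
in for Virtuoso's INSERT SOFT, and the table rdf_language and the function
lang_id are hypothetical names chosen for the example.

```python
import sqlite3


def lang_id(conn, tag):
    """Return the id of a language tag, adding the row first if it is absent."""
    # Soft insert: does nothing if the tag is already present, so it cannot
    # raise a duplicate-key error when another loader got there first.
    conn.execute("INSERT OR IGNORE INTO rdf_language (tag) VALUES (?)", (tag,))
    # Subsequent query: read back the row, whichever insert created it.
    row = conn.execute(
        "SELECT id FROM rdf_language WHERE tag = ?", (tag,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for DB.DBA.RDF_LANGUAGE, with a unique tag column.
conn.execute(
    "CREATE TABLE rdf_language (id INTEGER PRIMARY KEY, tag TEXT UNIQUE NOT NULL)"
)

first = lang_id(conn, "kg")
second = lang_id(conn, "kg")  # same tag again: no error, same id
other = lang_id(conn, "ki")   # a different tag gets its own row
row_count = conn.execute("SELECT count(*) FROM rdf_language").fetchone()[0]
```

Because the insert can no longer fail on an existing key, the order in which
concurrent loaders reach it stops mattering; the query afterwards returns the
same id to all of them.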
>> 
>> 
>> On 12/18/18 8:55 AM, Peter F. Patel-Schneider wrote:
>>> I'm loading the Turtle Wikidata RDF complete dump, split into pieces and
>>> loaded with 10 active readers.   About half the time the load fails with one
>>> or more of these errors.  The errors are always near the beginning of the
>>> load---in the first group of 10 files to be loaded and near the beginning of
>>> the files (generally in the first couple of hundred lines in a file of size
>>> well over 1 GB).  No errors occur for any files beyond the first ten.
>>> 
>>> I could provide the files, but they total to about 340GB.
>>> 
>>> It sure looks as if there is some sort of bug when loading RDF
>>> language-tagged strings, where a race condition means that two threads are
>>> trying to load the same language tag into DB.DBA.RDF_LANGUAGE.  This would
>>> explain why the problem occurs only at the beginning of the load, when the
>>> language tags are being added to DB.DBA.RDF_LANGUAGE, and not later.  It
>>> would also explain why the errors are different between different runs.
>>> (The only other explanation would be hardware errors, but this doesn't seem
>>> to be viable.)
>>> 
>>> It seems to me that a quick patch for this problem would be to change the
>>> insert into a soft insert, but I don't know where to make this change in
>>> the code.
>>> 
>>> peter
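
The suspected race in the message above can be illustrated with a deterministic
interleaving. This is a sketch in Python against an in-memory SQLite database,
not Virtuoso itself: the table rdf_language, the helpers is_missing and
hard_insert, and the loader names "A" and "B" are hypothetical, and the two
"threads" are simulated by running both checks before either insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for DB.DBA.RDF_LANGUAGE, keyed on the language tag.
conn.execute("CREATE TABLE rdf_language (tag TEXT PRIMARY KEY)")


def is_missing(tag):
    """The 'check' half: is the tag absent from the table?"""
    row = conn.execute(
        "SELECT 1 FROM rdf_language WHERE tag = ?", (tag,)
    ).fetchone()
    return row is None


def hard_insert(tag):
    """The 'insert' half: a plain insert that fails on a duplicate key."""
    conn.execute("INSERT INTO rdf_language (tag) VALUES (?)", (tag,))


# Both loaders run the check before either one inserts.
a_sees_missing = is_missing("ki")
b_sees_missing = is_missing("ki")

errors = []
for loader, sees_missing in (("A", a_sees_missing), ("B", b_sees_missing)):
    if sees_missing:
        try:
            hard_insert("ki")
        except sqlite3.IntegrityError as exc:
            # The second loader hits the primary-key constraint.
            errors.append((loader, str(exc)))

# Once the tag exists, later checks find it and never reach the insert.
later_check_missing = is_missing("ki")
```

Under this interleaving the second loader fails with a key violation, while
any check made after the row exists skips the insert entirely, which is
consistent with errors showing up only at the start of a load.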
>>> 
>>> 
>>> 
>>> 
>>> On 12/11/18 7:11 PM, Hugh Williams wrote:
>>>> Hi Peter,
>>>> 
>>>> The triple values do indeed appear to be valid, but the problem could be
>>>> somewhere else in the dataset file and not necessarily on the reported
>>>> line or the line before it.
>>>> 
>>>> Is it a public dataset you are loading, and if so, can you provide a copy
>>>> for local testing?
>>>> 
>>>> Best Regards
>>>> Hugh Williams
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 11 Dec 2018, at 17:45, Peter F. Patel-Schneider
>>>>> <pfpschnei...@gmail.com> wrote:
>>>>> 
>>>>> I'm loading a bunch of Turtle files and I'm getting the error
>>>>> 
>>>>> 2300 TURTLE RDF loader, line 1012: SR197: Non unique primary key on
>>>>> DB.DBA.RDF_LANGUAGE
>>>>> 
>>>>> The line in question looks fine:
>>>>> 
>>>>>   "Wikimedia template"@ki,
>>>>> 
>>>>> The line before it may indicate the issue
>>>>> 
>>>>>    "Wikimedia template"@kg,
>>>>> 
>>>>> Nonetheless this should be valid RDF so there appears to be a bug in
>>>>> Virtuoso here.
>>>>> 
>>>>> Is there any workaround?
>>>>> 
>>>>> 
>>>>> This is in Virtuoso 07.20.3230.
>>>>> 
>>>>> peter
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Virtuoso-users mailing list
>>>>> Virtuoso-users@lists.sourceforge.net
>>>>> <mailto:Virtuoso-users@lists.sourceforge.net>
>>>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>>> 
> <generate.py><test.sh>
