Hi Paul,

We haven't had problems with data corruption, but the fix for the load memory leak is indeed a patch on top of 7.2.1. If you download 7.2.1 and expand it, you can apply the patch as follows:
- Save the attached patch file to the system.
- Apply it with the Unix/Linux patch utility like this:

    cd virtuoso-source-directory
    patch -p1 < path-to-patch-file
    ./configure --with-readline --prefix=path-to-install
    make
    make install

Our build is a little more complicated, as we package it as an rpm. As for the memory leak, we are very certain it is gone: we have been monitoring memory since then and have seen no leaks.

Thanks,
Dan Davis, Systems/Applications Architect (Contractor),
Office of Computer and Communications Systems,
National Library of Medicine, NIH

From: Hugh Williams [mailto:hwilli...@openlinksw.com]
Sent: Saturday, September 26, 2015 9:17 PM
To: Paul Houle <ontolo...@gmail.com>
Cc: virtuoso-users <virtuoso-users@lists.sourceforge.net>
Subject: Re: [Virtuoso-users] Automating RDF data imports in Virtuoso

Hi Paul,

If you have a test case for recreating this data corruption issue, I would suggest trying it against the git develop/7 branch, with all the latest fixes, to see if it still persists. Or, if the steps for recreation can be provided, we could test locally.

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/
Weblog -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter -- http://twitter.com/OpenLink
Google+ -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

On 26 Sep 2015, at 23:39, Paul Houle <ontolo...@gmail.com> wrote:

I like the cloud solution of creating a new Virtuoso system, doing the load, having plenty of time to test it, then replacing the production instance with the new instance and retiring the old production instance.
The main advantage here is that there is no way a screw-up in the load procedure can trash the production system. Even if Virtuoso were entirely reliable, as the data sources grow, the rate of exceptional events (say you fill the disk) goes up. The temporary-server approach eliminates a lot of headaches, and it is good cloud economics: if you run a server at AMZN for 1 hour a day to update, the cost of your system only goes up by about 4%.

I was having good luck with this approach until Virtuoso 7.2.0 came along; since then I've had problems similar in severity to what the N.I.H. was reporting. It really looked like massive corruption of the data structures, and 7.2.1 did not help. I don't know if these issues are fixed in the current TRUNK, but if they are it would be nice to get an official release.

On Fri, Sep 25, 2015 at 1:31 PM, Haag, Jason <jhaa...@gmail.com> wrote:

Hi Users,

I'm trying to determine the best option for importing RDF data into Virtuoso in my situation. Here's my situation: I currently have several RDF datasets available on my server. Each dataset has an RDF dump available as RDF/XML, JSON-LD, and Turtle. These dumps are generated automatically, without Virtuoso, from an HTML page marked up using RDFa.

What is the best option for automating the import of this data into the Virtuoso DB on a regular basis? The datasets may grow, so it should not just import the data once, but import on a regular basis, perhaps daily or weekly.

Based on what I've read in the documentation, this crawler option seems like the most appropriate one for my situation:
http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSetCrawlerJobsGuideDirectories

Can anyone verify whether this would be the best approach? Does anyone know if the crawler supports RDFa/HTML, or should it point to a specific directory containing only the RDF dump files?

Thanks in advance!
J Haag

------------------------------------------------------------------------------
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users

--
Paul Houle
Applying Schemas for Natural Language Processing, Distributed Systems, Classification and Text Mining and Data Lakes
(607) 539 6254    paul.houle on Skype    ontolo...@gmail.com
:BaseKB -- Query Freebase Data With SPARQL
http://basekb.com/gold/
Legal Entity Identifier Lookup
https://legalentityidentifier.info/lei/lookup/
Join our Data Lakes group on LinkedIn
https://www.linkedin.com/grp/home?gid=8267275
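For Jason's automation question above, the common alternative to the crawler, when the dump files already sit in a directory on the server, is Virtuoso's bulk loader: ld_dir() queues the files and rdf_loader_run() loads the queue, driven from cron via isql. The sketch below writes such a script; the port, credentials, dump directory, and graph IRI are all assumptions to adapt to your installation.

```shell
# Sketch of a cron-able import script (hypothetical port, credentials,
# paths, and graph IRI). It queues every Turtle dump in a directory with
# ld_dir() and then loads the queue with rdf_loader_run().
cat > load_rdf.sh <<'EOF'
#!/bin/sh
set -e
DUMP_DIR="/data/rdf-dumps"           # assumed location of the dumps
GRAPH="http://example.org/dataset"   # assumed target graph IRI

isql 1111 dba dba <<SQL
ld_dir('$DUMP_DIR', '*.ttl', '$GRAPH');
rdf_loader_run();
checkpoint;
SQL
EOF
chmod +x load_rdf.sh
sh -n load_rdf.sh   # syntax-check only; running it needs a live Virtuoso
```

Note that ld_dir() will only read directories listed under DirsAllowed in virtuoso.ini, and the script can then be scheduled daily or weekly with cron, matching the cadence Jason asks about.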
Attachment: develop-7-342a683c8a430dff1dbe4fe976c5994b2a259181.patch
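Dan's steps near the top of the thread apply this attachment with patch -p1, which strips the first path component from the file names inside the diff before applying it. A toy run (stand-in file names, not the real Virtuoso tree) shows the mechanics:

```shell
# Toy demonstration of `patch -p1` (all file names here are stand-ins).
# -p1 strips the leading "old/"/"new/" component from the paths in the
# diff, so the hunk applies to a/file.txt relative to the target directory.
mkdir -p src/a
printf 'hello\n' > src/a/file.txt

cat > fix.patch <<'EOF'
--- old/a/file.txt
+++ new/a/file.txt
@@ -1 +1 @@
-hello
+hello, patched
EOF

# -d src plays the role of the `cd virtuoso-source-directory` step
patch -p1 -d src < fix.patch
cat src/a/file.txt
```

After the run the file reads "hello, patched", confirming the hunk landed where -p1 pointed it.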