Re: [topbraid-users] External SPARQL Endpoints

Fan Li Fri, 21 Aug 2020 04:59:54 -0700

Thanks for the advice, Irene. Using EDG to provide data governance to 
existing operational databases/triple stores is relevant to us as well.


You mentioned that for "very large (like 10 to the 12th) amounts of operational 
data, ... going from the external sources to EDG would be limited to the 
reference data discovery". Are you implying there are more options if we have 
fewer than 1 billion triples?

On Thursday, August 20, 2020 at 1:31:05 PM UTC-4 Irene Polikoff wrote:

> OK, let us know how it goes.
>
> Btw, no one in TopQuadrant recalls giving you this advice. 
>
> If you have very large (like 10 to the 12th) amounts of operational data 
> in RDF, this would make sense. However, in this case, you would not try to 
> make this data available in EDG. If this data uses controlled 
> vocabularies/reference data managed by EDG, then the typical workflow would 
> be:
>
> 1. Curate reference data in EDG, i.e., EDG is the definitive source of 
> reference data
> 2. Deliver it from EDG to other environments where it is used
> 3. When reference data gets updated in EDG, the external systems get updated 
> as well
>
> Thus, the flow would be from EDG to the external systems. This is a 
> standard scenario for the “master reference data” solution. 
>
> The scenario of going from the external sources to EDG would be limited to 
> reference data discovery, i.e., the initial setup step when you are 
> first establishing your reference datasets. If you already have various 
> sources that use the controlled values, you would want to start by 
> importing them into EDG where you would put them under management.
>
> Regards,
>
> Irene
>
> On Aug 20, 2020, at 12:11 PM, Matt Goldberg <mgbe...@gmail.com> wrote:
>
> Great, I'll have to give this a try.
>
> And yes, someone at TopQuadrant told us that very large, dynamic datasets 
> would be better kept in another triple store.
>
> On Thursday, August 20, 2020 at 11:40:41 AM UTC-4 Irene Polikoff wrote:
>
>> Hi Matt,
>>
>> Please see below
>>
>> On Aug 20, 2020, at 10:14 AM, Matt Goldberg <mgbe...@gmail.com> wrote:
>>
>> Right, I know the SERVICE keyword does the trick, and that may be 
>> sufficient.
>>
>> There are a couple of paths I'm trying to explore:
>>
>>    - We've been told that large, dynamic datasets would be better kept 
>>    in another triple store, and we're looking at AllegroGraph as a 
>>    possibility for that. 
>>
>> Has someone at TopQuadrant given you this advice?
>>
>>
>>    - It would be nice to have a data graph in AG appear as a graph in 
>>    EDG 
>>
>> This is not possible. Data is either in EDG or it is not in EDG.
>>
>>
>>    - as the vocabularies that would be used in the AG data graphs will 
>>    be managed by EDG and we'd like EDG web services and SHACL validators 
>>    to be able to easily access the AG data.
>>
>> There are options for selectively copying some data which would then be 
>> available for services and validation. Copied data can be periodically 
>> refreshed.
>>
>> One option to consider is described here 
>> https://www.topquadrant.com/technology/shacl/wikidata/
>>
>> Note that the screenshots are from 6.2. Hopefully, it is still easy 
>> enough to follow the instructions in 6.4. We will update the screenshots 
>> shortly.
>>
>> Further, the example describes a connection to Wikidata. There are some 
>> conveniences built into EDG for linking with Wikidata. You can, however, 
>> do the same with any other SPARQL endpoint. You will not get the 
>> auto-suggestions for the resource links or auto-copying of shapes. You will 
>> need to establish these yourself. Once that is done, the fetching, copying 
>> and access to the property values of the linked remote resources work 
>> exactly the same.
>>
>>
>>    - It would be convenient if there was a way to create a graph that 
>>    could import several virtual graphs/connect to several external SPARQL 
>>    endpoints in order to federate queries to multiple graphs simultaneously, 
>>    in order to hide the fact that data may be coming from different sources. 
>>    This would prevent our users from having to know what SPARQL endpoints 
>>    exist and would give them just one to access all the data they would need 
>>    in one place.
>>    
>> If you have multiple SPARQL Endpoints, create a link property for each 
>> endpoint and provide links to the corresponding remote resources from all 
>> the endpoints. Also create separate Node Shapes representing data of 
>> interest from each endpoint.
>>
>> You would then, for example, be able to fetch “height” from one endpoint 
>> and “weight” from another.
>>
>> On Thursday, August 20, 2020 at 9:46:12 AM UTC-4 Irene Polikoff wrote:
>>
>>> What exactly are you trying to accomplish?
>>>
>>> You can use the SERVICE keyword in SPARQL queries without having a 
>>> connection file.
>>>
>>> On Aug 20, 2020, at 9:32 AM, Matt Goldberg <mgbe...@gmail.com> wrote:
>>>
>>> I've been experimenting with importing SPARQL endpoints via Import > 
>>> Create Connection File For SPARQL Endpoint. This works great for smaller 
>>> datasets, and having the endpoint wrapped as a virtual graph is a great 
>>> feature I'd like to take advantage of. However, since it tries to cache all 
>>> triples at that SPARQL endpoint, it is not practical for large datasets 
>>> (e.g. DBPedia). Is there a way to configure a virtual graph for a SPARQL 
>>> endpoint that does not try to cache the contents of the remote store?
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "TopBraid Suite Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to topbraid-user...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/topbraid-users/9749e6cc-0534-4934-8f39-d7ca8fe675a3n%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/topbraid-users/9749e6cc-0534-4934-8f39-d7ca8fe675a3n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>>
>>>
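
For reference, the SERVICE approach mentioned in the thread can be sketched 
roughly as follows. This is only a minimal illustration under stated 
assumptions: the endpoint URLs and property IRIs are hypothetical placeholders, 
build_federated_query is a helper name made up for this example, and the join 
only works if both endpoints use the same IRIs for the shared resources. It is 
plain SPARQL 1.1 federation, not the EDG link property and Node Shape setup 
Irene describes, and it only builds the query text without sending it anywhere.

```python
from typing import Mapping, Tuple


def build_federated_query(
    sources: Mapping[str, Tuple[str, str]],
    subject_var: str = "item",
) -> str:
    """Build a SPARQL 1.1 SELECT that reads one property per remote endpoint.

    `sources` maps an output variable name to (endpoint URL, property IRI).
    Every name and URL passed in is supplied by the caller; nothing here is
    specific to EDG or AllegroGraph.
    """
    if not sources:
        raise ValueError("at least one source is required")

    # One SERVICE block per endpoint, all joined on the same subject variable.
    service_blocks = [
        f"  SERVICE <{endpoint}> {{ ?{subject_var} <{prop}> ?{var} }}"
        for var, (endpoint, prop) in sources.items()
    ]
    projection = " ".join(f"?{name}" for name in (subject_var, *sources))
    return f"SELECT {projection} WHERE {{\n" + "\n".join(service_blocks) + "\n}"


# Hypothetical endpoints and properties, echoing the height/weight example.
query = build_federated_query({
    "height": ("https://endpoint-a.example/sparql", "http://example.org/height"),
    "weight": ("https://endpoint-b.example/sparql", "http://example.org/weight"),
})
print(query)
```

Users would then run the generated query against a single endpoint that 
supports SERVICE, so they do not need to know where each property lives.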

-- 
You received this message because you are subscribed to the Google Groups 
"TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to topbraid-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/topbraid-users/c5fe6799-c105-486c-a9d5-4d809d3f20d9n%40googlegroups.com.
