Re: [SMW-devel] The future of the SMW query stores

2013-10-18 Thread Joel Natividad
I'm with David in Camp 3, but there are also some things I'd like to bring
up for discussion while the group is considering a new store:

- Job queue, smw_refresh to store properties.  If you have millions of
articles, with each article template-laden with semantic properties, there
are scalability and maintenance issues, especially if you use bots to pump
in a lot of data.  Is this a MediaWiki constraint, or can we store properties
in near-real time?  FYI, the CKAN project has a datastorer plugin
(https://github.com/okfn/ckanext-datastorer, using celeryproject.org) that
parses structured data asynchronously so it can be queried via an API.

- Historical semantic data.  It would be nice if we could query historical
data as pages are updated over time, i.e. include a date range in queries
to show how properties change over time.

- Additional metadata/provenance info.  Apart from when, who or what
made the assertion?  This could well become a semantic system catalog/data
dictionary of sorts that can be used to compute semantic statistics as well
as to optimize queries.

- Wikidata integration.  If I'm not mistaken, there was talk of WikiData
and SMW ultimately joining together in the indeterminate future.
Perhaps this could be the start of that process, as some of these issues
may have already been considered by the WikiData team.
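
To make the historical-data and provenance points above a bit more concrete, here is a minimal sketch in Python. It assumes an append-only log of assertions rather than overwriting a property value on each page save. The names Assertion and HistoryStore, and the sample page, property and users, are all hypothetical; none of this is existing SMW or WikiData code.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Assertion:
    """One property value as asserted by one page revision (hypothetical schema)."""
    page: str
    prop: str
    value: str
    asserted_on: date  # when the revision was saved
    asserted_by: str   # who or what (user, bot) made the assertion


class HistoryStore:
    """Append-only log: a page update adds an assertion, it never overwrites one."""

    def __init__(self) -> None:
        self._log: List[Assertion] = []

    def record(self, assertion: Assertion) -> None:
        self._log.append(assertion)

    def history(self, page: str, prop: str,
                start: Optional[date] = None,
                end: Optional[date] = None) -> List[Assertion]:
        """Assertions for page/prop with start <= asserted_on <= end, oldest first."""
        hits = [a for a in self._log
                if a.page == page and a.prop == prop
                and (start is None or a.asserted_on >= start)
                and (end is None or a.asserted_on <= end)]
        return sorted(hits, key=lambda a: a.asserted_on)


# Made-up sample data, for illustration only.
store = HistoryStore()
store.record(Assertion("Project X", "Status", "draft", date(2012, 3, 1), "ImportBot"))
store.record(Assertion("Project X", "Status", "review", date(2013, 2, 1), "Alice"))
store.record(Assertion("Project X", "Status", "released", date(2013, 9, 1), "ImportBot"))

# Query with a date range to see how the property changed during 2013.
for a in store.history("Project X", "Status",
                       start=date(2013, 1, 1), end=date(2013, 12, 31)):
    print(a.asserted_on, a.value, a.asserted_by)
```

Keeping who/when on every assertion is also what would let such a log double as the "data dictionary" mentioned above, e.g. counting assertions per property or per bot.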

Thanks,
Joel


===
Think Different! (http://en.wikipedia.org/wiki/Think_different#Text)
Imagine Different! (http://www.youtube.com/watch?v=H5tOgRD4EqY)


On Wed, Oct 16, 2013 at 10:13 AM, david mason 
vid_semediawiki-de...@zooid.org wrote:


 With regard to ES and data recovery/transactions, if SMW continues to be
 able to generate this data at any time it doesn't seem to be much of an
 issue. ES is also horizontally scalable as one of its main features, and
 supports geo features and advanced search, although graph traversal is
 manual and commits are near-real-time.

 I am mainly proposing this to keep things simple for the operators.  Asking
 them to set up, for one SMW instance, MW, MySQL, SMW, ES for MW search at
 least, and one or two additional stores seems like a lot.

 I would guess that there are three kinds of SMW users; 1. those happy
 using it as a flexible self-contained front end built on MW for forms and
 pages, 2. those who would like to use it for Semantic Web / LOD type
 purposes (formal ontology design, enforcement, inference, and shared data
 between sites using web standards), and 3. those who would at least like a
 solid option/path to 2.

 For the many members of the community who would benefit from a real focus
 on an RDF store and schema support, I would clearly support something like
 Richard's stack, but it might add a lot of complexity to hosting and
 development. Probably many SMW users now are using inexpensive hosting
 plans which wouldn't support this broader stack, and as I understand it the
 current SMW PHP API is not cleanly designed, so it may basically be a
 reinvention (which could be a good thing but would be disruptive).

 For myself I work in a mix of applications and am solidly in camp 3 as
 a way forward, fwiw.

 And I can't help but wonder how WikiData fits into the mix. (=

 David





 On 16 October 2013 09:48, Richard Banks richard.bank...@gmail.com wrote:

 Hi,

 Just to add to the conversation, I would also recommend ElasticSearch as
 a great solution for the search side of things. There are also cases of
 people using it as the sole data store. However, I believe caution should
 be taken against such an approach since ES currently doesn't provide much
 in the way of data recovery or transactions.

 For this reason, ES is typically deployed in combination with a data
 storage technology that does support these factors, such as Mongo. ES
 allows you to define what's known as rivers, and these pull data out of a
 configured data source and into the index, thus providing the benefits of
 its powerful search (which is literally insane).

 In terms of making use of the rich inherent graph structure of the data
 at the higher level, a GraphDB would make sense as suggested by Joel. One
 GraphDB that might be worth a look is Titan, which has been developed by
 the Tinkerpop guys I believe. It's a distributed graph database which also
 (interestingly) supports ElasticSearch. It also abstracts over many data
 stores/formats (including RDF) out-of-the-box. Supporting ES is a clever move IMO
 because one of the challenges in graph search is jumping into the graph in
 the first place, and it looks like they use the ES index to do this.

 So, you could almost just use Titan for search, get all the benefits of
 graph traversals etc., and have it manage your ES index too.
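
The "jump into the graph via the index, then traverse" idea can be sketched roughly as follows. This is a toy in-memory illustration in Python, not Titan's or ElasticSearch's API: the inverted index stands in for ES, the adjacency dict stands in for the graph store, and every name and value below is a hypothetical placeholder.

```python
from collections import deque
from typing import Dict, List, Set

# Toy graph: node id -> neighbouring node ids (stand-in for a graph store).
edges: Dict[str, List[str]] = {
    "n1": ["n2"],
    "n2": ["n3"],
    "n3": ["n4"],
    "n4": [],
}

# Text attached to each node (what a search index would cover).
labels: Dict[str, str] = {
    "n1": "semantic wiki store",
    "n2": "graph database",
    "n3": "search index",
    "n4": "query engine",
}


def build_index(node_labels: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a word -> node ids inverted index (stand-in for a full-text index)."""
    index: Dict[str, Set[str]] = {}
    for node, text in node_labels.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(node)
    return index


def search_then_traverse(index: Dict[str, Set[str]],
                         graph: Dict[str, List[str]],
                         term: str, max_hops: int) -> Set[str]:
    """Use the index to find entry nodes, then breadth-first walk up to max_hops."""
    start = index.get(term.lower(), set())
    seen: Set[str] = set(start)
    queue = deque((node, 0) for node in sorted(start))
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # hop budget used up on this path
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return seen


index = build_index(labels)
print(sorted(search_then_traverse(index, edges, "wiki", max_hops=2)))
```

The point of the split is that the text lookup answers "where do I start?" cheaply, and the traversal only ever touches nodes reachable from those entry points.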

 Regards,
 Richard



Re: [SMW-devel] The future of the SMW query stores

2013-10-18 Thread Jeroen De Dauw
Hey,

Wikidata integration.  If I'm not mistaken, there was talk of WikiData and
 SMW ultimately joining together in the indeterminate future.  Perhaps,
 this could be the start of that process as some of these issues may have
 already been considered by the WikiData team.


The actual topic of my email, which got completely ignored so far in favor
of discussing MongoDB vs $AlternativeStore, is very much about
interoperability with the Wikidata software.

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil. ~=[,,_,,]:3
--
--
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register 
http://pubads.g.doubleclick.net/gampad/clk?id=60135031&iu=/4140/ostg.clktrk
_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel


Re: [SMW-devel] The future of the SMW query stores

2013-10-18 Thread Joel Natividad
A useful side-discussion still :)

And I guess WikiData does solve the historical data, scalability and
structured data storage issues.

Perhaps the SMW community should just jump on it, helping make WikiData
integration a reality? :)

Best,
Joel

===
Think Different! (http://en.wikipedia.org/wiki/Think_different#Text)
Imagine Different! (http://www.youtube.com/watch?v=H5tOgRD4EqY)


On Fri, Oct 18, 2013 at 2:33 PM, Jeroen De Dauw jeroended...@gmail.com wrote:

 Hey,

 Wikidata integration.  If I'm not mistaken, there was talk of WikiData and
 SMW ultimately joining together in the indeterminate future.  Perhaps,
 this could be the start of that process as some of these issues may have
 already been considered by the WikiData team.


 The actual topic of my email, which got completely ignored so far in favor
 of discussing MongoDB vs $AlternativeStore, is very much about
 interoperability with the Wikidata software.

 Cheers

 --
 Jeroen De Dauw
 http://www.bn2vs.com
 Don't panic. Don't be evil. ~=[,,_,,]:3
 --



Re: [SMW-devel] The future of the SMW query stores

2013-10-18 Thread Ryan Lane
On Fri, Oct 18, 2013 at 11:33 AM, Jeroen De Dauw jeroended...@gmail.com wrote:

 Hey,

 Wikidata integration.  If I'm not mistaken, there was talk of WikiData and
 SMW ultimately joining together in the indeterminate future.  Perhaps,
 this could be the start of that process as some of these issues may have
 already been considered by the WikiData team.


 The actual topic of my email, which got completely ignored so far in favor
 of discussing MongoDB vs $AlternativeStore, is very much about
 interoperability with the Wikidata software.


Wikidata is going to use mongodb?

- Ryan


Re: [SMW-devel] The future of the SMW query stores

2013-10-18 Thread Joel Natividad
No Ryan.

Read The First Mail from Jeroen in the thread :)

Especially after the "prompted by work by MWJames" paragraph.

Best,
Joel

===
Think Different! (http://en.wikipedia.org/wiki/Think_different#Text)
Imagine Different! (http://www.youtube.com/watch?v=H5tOgRD4EqY)


On Fri, Oct 18, 2013 at 4:41 PM, Ryan Lane rlan...@gmail.com wrote:

 On Fri, Oct 18, 2013 at 11:33 AM, Jeroen De Dauw 
 jeroended...@gmail.com wrote:

 Hey,

 Wikidata integration.  If I'm not mistaken, there was talk of WikiData
 and SMW ultimately joining together in the indeterminate future.
  Perhaps, this could be the start of that process as some of these issues
 may have already been considered by the WikiData team.


 The actual topic of my email, which got completely ignored so far in favor
 of discussing MongoDB vs $AlternativeStore, is very much about
 interoperability with the Wikidata software.


 Wikidata is going to use mongodb?

 - Ryan
