Hi Paul,

Re RDF*/SPARQL*: could you send a link? Someone has really made an effort to find the least googleable terminology here ;-)

Re relying on standards: I think this argument is missing the point. If you look at what developers in Wikidata are concerned with, it is 90%+ interface and internal data workflow. This would be exactly the same no matter which data standard you used. All the challenges of providing a usable UI and a stable API would remain, since a data encoding standard does not help with any of this. If you have followed some of the recent discussions on the DBpedia mailing list about the UIs they have there, you can see that Wikidata is already in a very good position by comparison when it comes to exposing data to humans (thanks to Magnus, of course ;-). RDF is great, but there are many problems that it does not even try to solve (rightly so). These problems seem to be dominant in the Wikidata world right now.

This said, we are in a great position to adopt new standards as they come along. I agree with you on the obvious relationship between Wikidata statements and the property graph model. We are well aware of this. Graph databases are being considered for providing query solutions to Wikidata, and we are considering setting up a SPARQL endpoint for our existing RDF as well. Overall, I don't see a reason why we should not embrace all of these technologies as they suit our purpose, even if they were not yet available when Wikidata was first conceived.
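To make the comparison concrete for anyone who wants to experiment once such an endpoint exists: our RDF export already gives each statement a resource of its own, so that qualifiers and references have something to attach to. Roughly like this (a simplified sketch; the prefixes and the statement/reference IDs here are illustrative, not the exact vocabulary of the dumps):

  @prefix wd: <http://www.wikidata.org/entity/> .
  @prefix p: <http://www.wikidata.org/prop/> .
  @prefix ps: <http://www.wikidata.org/prop/statement/> .
  @prefix prov: <http://www.w3.org/ns/prov#> .

  # Douglas Adams (Q42) was educated at (P69) St John's College (Q691283)
  wd:Q42 p:P69 wd:Q42-P69-S1 .          # item -> statement node
  wd:Q42-P69-S1 ps:P69 wd:Q691283 ;     # statement node -> value
      prov:wasDerivedFrom wd:Ref1 .     # references attach to the statement node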

Re "It is also exciting that vendors are getting on board with this and we are going to seeing some stuff that is crazy scalable (way past 10^12 facts on commodity hardware) very soon." [which vendors?] [citation needed] ;-) We would be very interested in learning about such technologies. After the recent end of Titan, the discussion of query answering backends is still ongoing.

Cheers,

Markus


On 18.02.2015 21:25, Paul Houle wrote:
What bugs me about it is that Wikidata has gone down the same road as
Freebase and Neo4J in the sense of developing something ad-hoc that is
not well understood.

I understand the motivations that led there, because there are
requirements to meet that standards don't necessarily satisfy, plus
Wikidata really is doing ambitious things in the sense of capturing
provenance information.

Perhaps it has come a little too late to help with Wikidata, but it seems
to me that RDF* and SPARQL* have a lot to offer for "data wikis", in that
you can view data as plain ordinary RDF and query it with SPARQL, but you
can also attach provenance and other metadata in a sane way, with sweet
syntax for writing it in Turtle or querying it in other ways.
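
For concreteness, this is roughly what the proposed Turtle syntax looks
like (a sketch only; the Wikidata prefixes and the reference IRI are my
own illustration, not part of any spec):

  @prefix wd: <http://www.wikidata.org/entity/> .
  @prefix wdt: <http://www.wikidata.org/prop/direct/> .
  @prefix prov: <http://www.w3.org/ns/prov#> .

  # the plain triple is still there for ordinary RDF tools ...
  wd:Q42 wdt:P69 wd:Q691283 .

  # ... and the embedded-triple syntax hangs metadata directly off it
  << wd:Q42 wdt:P69 wd:Q691283 >> prov:wasDerivedFrom <http://example.org/ref/1> .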

Another way of thinking about it is that RDF* formalizes the
property graph model, which has always been ad hoc in products like
Neo4J.  I can say that knowing the algebra you are implementing
helps a lot in getting the tools to work right.  So you not only have
SPARQL queries as a possibility but also languages like Gremlin and
Cypher, and this is all pretty exciting.  It is also exciting that
vendors are getting on board with this and we are going to be seeing
some stuff that is crazy scalable (way past 10^12 facts on commodity
hardware) very soon.
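
A sketch of the matching SPARQL* query, following the proposed syntax
(same caveats as above; the provenance property is just an example):

  SELECT ?college ?source WHERE {
    wd:Q42 wdt:P69 ?college .
    << wd:Q42 wdt:P69 ?college >> prov:wasDerivedFrom ?source .
  }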




On Tue, Feb 17, 2015 at 12:20 PM, Jeroen De Dauw <jeroended...@gmail.com> wrote:

    Hey,

    As Lydia mentioned, we obviously do not actively discourage outside
    contributions, and will gladly listen to suggestions on how we can
    do better. More than that, we are actively taking steps to make it
    easier for developers not already part of the community to start
    contributing.

    For instance, we created a website about our software itself [0],
    which lists the MediaWiki extensions and the different libraries [1]
    we created. For most of our libraries, you can just clone the code
    and run composer install, and then you're all set: you can make
    changes, run the tests, and submit them back, as sketched below.
    This workflow is perhaps different from the one you are used to as
    a MediaWiki developer, though quite a bit simpler. Furthermore,
    we've been quite progressive in adopting practices and tools from
    the wider PHP community.
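
    A sketch of what such a session looks like (the repository name
    here is made up; [1] has the real list):

        git clone https://github.com/wmde/ExampleLibrary.git  # hypothetical repo name
        cd ExampleLibrary
        composer install       # installs dependencies, including the test tools
        vendor/bin/phpunit     # run the tests before and after your change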

    I definitely do not disagree with you that some things could, and
    should, be improved. Like you I'd like to see the Wikibase git
    repository and naming of the extensions be aligned more, since it
    indeed is confusing. Increased API stability, especially the
    JavaScript one, is something else on my wish-list, amongst a lot of
    other things. There are always reasons why things are the way they
    are now and why they have not improved yet. So I suggest looking at
    specific pain points and seeing how things can be improved there.
    This will get us much further than looking at the general state,
    concluding people do not want third party contributions, and then
    protesting against that.

    [0] http://wikiba.se/
    [1] http://wikiba.se/components/

    Cheers

    --
    Jeroen De Dauw - http://www.bn2vs.com
    Software craftsmanship advocate
    Evil software architect at Wikimedia Germany
    ~=[,,_,,]:3

    _______________________________________________
    Wikidata-l mailing list
    Wikidata-l@lists.wikimedia.org
    https://lists.wikimedia.org/mailman/listinfo/wikidata-l




--
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype    ontolo...@gmail.com
http://legalentityidentifier.info/lei/lookup

