[Wikidata-tech] Report from the Architecture Summit

Daniel Kinzler Fri, 31 Jan 2014 13:29:30 -0800

(reposting, accidentally posted this to the internal list at first)

Hey. Here's a brief summary of what I talked to folks in SF about, what the
result was, and who we should contact to move forward.


* At the architecture summit, there seemed to be wide agreement that we need to
improve modularity in core. The TitleValue proposal was viewed as going too far
to the "dark side of javafication", but it was generally seen to be moving in
the right direction. I will update the change soon to address some comments.

* Furthermore, we (the core developers) should seek out service interfaces that
can and should be factored out of existing classes, starting with pathological
cases like EditPage, Title, or User. Several people agreed to look into that
(and at the same time watch out to avoid "javafication"); Nik Everett
volunteered to lead the discussion.

* Gabriel Wicke has interesting plans for factoring out storage services (both
low level blob storage as well as higher level revision storage) into separate
HTTP/REST services.

* Yurik is working on a library/extension for JSON based configuration storage
for extensions. Needs review/feedback, I'm looking into that.

* I asked Aaron to provide a JobSpecification interface, so jobs can be
scheduled without having to instantiate the class that will be used to execute
the job. This makes it easier to post jobs from one wiki to another. Aaron has
already implemented this now, yay!

* Yurik wants us to rework the Wikibase API to be compatible with the core API's
"query" infrastructure. This would allow us to use item lists generated by one
module as the input for another module. See
https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API

* After talking to Chad, I'm now pretty sure we should go for ElasticSearch for
implementing queries right away. It just seems a lot simpler than using MySQL
for the baseline implementation. This however makes ElasticSearch a dependency
of WikibaseQuery, making it harder for third parties to set up queries (though
setting up Elastic seems pretty simple).

* Brion would like to be in the loop on the PubsubHubbub project. For the
operations side, and the question whether WMF would want to run their own hub,
he pointed me to Ori and Mark Bergsma.

* I didn't make progress wrt the JSON dumps. Need to get hold of Ariel, he
wasn't around. We need to find out what makes the dumps so slow. Aaron Schulz
agreed to help with that. One problematic aspect of the current implementation
is that it tries to retrieve all entity IDs with a single DB query. We might
need to chunk that.

* For the future use of composer, we should be in touch with Markus Glaser and
Hexmode (Mark Hershberger), as well as with Hashar.

* Hashar is quite interested in switching to composer and perhaps also Travis.
He was happy to hear that Travis is Berlin based and sympathetic. The WMF might
even be ready to invest a bit into making Travis work with our workflow. Hashar
may come and visit us, poke him about it!

* For access to the new Logstash service, we should talk to Ken Snider.

* For shell access we should talk to Quim.

* I discussed allowing queries on page_props by property value with Tim as well
as Roan. Tim suggested adding a pp_sortkey column to page_props (a float, but
nullable), and index by pp_propname+pp_sortkey. That should cover most use cases
nicely, without big schema changes.
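
To make the page_props suggestion in the last item a bit more concrete, here is
a rough sketch in Python, using an in-memory SQLite database as a stand-in for
MySQL. Only the nullable float pp_sortkey column and the pp_propname+pp_sortkey
index come from the discussion; the column types, the index name, the sample
rows and the pages_by_property helper are assumptions made up for illustration.

```python
import sqlite3


def create_page_props(conn):
    # Simplified page_props table; pp_sortkey is the proposed nullable float.
    conn.execute(
        """
        CREATE TABLE page_props (
            pp_page     INTEGER NOT NULL,
            pp_propname TEXT    NOT NULL,
            pp_value    BLOB    NOT NULL,
            pp_sortkey  REAL    NULL
        )
        """
    )
    # Proposed composite index (the index name is hypothetical).
    conn.execute(
        "CREATE INDEX pp_propname_sortkey "
        "ON page_props (pp_propname, pp_sortkey)"
    )


def pages_by_property(conn, propname, low, high):
    # Range query on one property; rows with a NULL sortkey never match.
    rows = conn.execute(
        "SELECT pp_page FROM page_props "
        "WHERE pp_propname = ? AND pp_sortkey BETWEEN ? AND ? "
        "ORDER BY pp_sortkey",
        (propname, low, high),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_page_props(conn)
conn.executemany(
    "INSERT INTO page_props VALUES (?, ?, ?, ?)",
    [
        (1, "population", "3500000", 3500000.0),
        (2, "population", "80000", 80000.0),
        (3, "population", "unknown", None),  # no numeric sortkey available
        (4, "elevation", "34", 34.0),
    ],
)
result = pages_by_property(conn, "population", 50000, 10000000)
```

Because the index leads with pp_propname, a lookup by property name plus a
range or ordering on pp_sortkey can be served from the index alone, which is
presumably why this covers most use cases without larger schema changes.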

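Regarding the JSON dumps item above, chunking the entity ID query could look
roughly like the following Python sketch. It pages through IDs in fixed-size
batches, resuming after the last ID seen, rather than fetching everything in a
single query. The entities table, its entity_id column, the chunk size and the
iter_entity_ids helper are all hypothetical, SQLite stands in for the real
database, and IDs are assumed to be positive integers.

```python
import sqlite3


def iter_entity_ids(conn, chunk_size=1000):
    # Yield all entity IDs in ascending order, one bounded query per chunk.
    last_id = 0  # assumes IDs are positive integers
    while True:
        rows = conn.execute(
            "SELECT entity_id FROM entities "
            "WHERE entity_id > ? ORDER BY entity_id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        for (entity_id,) in rows:
            yield entity_id
        # Resume after the highest ID of this chunk.
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (entity_id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO entities VALUES (?)", [(i,) for i in range(1, 26)]
)
ids = list(iter_entity_ids(conn, chunk_size=10))
```

Resuming from the last seen ID (rather than using OFFSET) keeps each query
cheap even deep into the table, and no single result set has to hold all IDs.
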

So, lots to follow up on!

Cheers
Daniel

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

_______________________________________________
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
