jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/377372 )
Change subject: Add script for loading category data
......................................................................

Add script for loading category data

Use: loadCategoryDump.sh WIKI

Bug: T157676
Change-Id: I138ed5243fc58acf7961f4e6a5b81190b13b6f78
---
A dist/src/script/loadCategoryDump.sh
A docs/Categories.md
2 files changed, 42 insertions(+), 0 deletions(-)

Approvals:
  Smalyshev: Looks good to me, approved
  jenkins-bot: Verified

diff --git a/dist/src/script/loadCategoryDump.sh b/dist/src/script/loadCategoryDump.sh
new file mode 100755
index 0000000..25cb738
--- /dev/null
+++ b/dist/src/script/loadCategoryDump.sh
@@ -0,0 +1,29 @@
+#!/bin/bash
+if [ -r /etc/wdqs/vars.sh ]; then
+	. /etc/wdqs/vars.sh
+fi
+
+SOURCE=${SOURCE:-"https://dumps.wikimedia.org/other/categoriesrdf"}
+DATA_DIR=${DATA_DIR:-"/srv/wdqs"}
+HOST=http://localhost:9999
+CONTEXT=bigdata
+NAMESPACE=categories
+WIKI=$1
+
+if [ -z "$WIKI" ]; then
+	echo "Use: $0 WIKI-NAME"
+	exit 1
+fi
+
+TS=$(curl -s -XGET $SOURCE/lastdump/$WIKI-categories.last)
+if [ -z "$TS" ]; then
+	echo "Could not load timestamp"
+	exit 1
+fi
+FILENAME=$WIKI-$TS-categories.ttl.gz
+curl -s -XGET $SOURCE/$TS/$FILENAME -o $DATA_DIR/$FILENAME
+if [ ! -s $DATA_DIR/$FILENAME ]; then
+	echo "Could not download $FILENAME"
+	exit 1
+fi
+curl -XPOST --data-binary update="LOAD <file://$DATA_DIR/$FILENAME>" $HOST/$CONTEXT/namespace/$NAMESPACE/sparql
\ No newline at end of file
diff --git a/docs/Categories.md b/docs/Categories.md
new file mode 100644
index 0000000..e5233bc
--- /dev/null
+++ b/docs/Categories.md
@@ -0,0 +1,13 @@
+# Categories
+
+This document describes how to set up and maintain Mediawiki categories graph on top of Wikidata Query Service.
+
+# Setup
+
+In order to create categories namespace, run `createNamespace.sh categories`.
+
+# Data loading
+
+To load the data, the dumps should be in https://dumps.wikimedia.org/other/categoriesrdf or another place with analogous structure pointed to by SOURCE variable.
+
+Run script loadCategoryDump.sh for each wiki, e.g.: `loadCategoryDump.sh testwiki`
\ No newline at end of file

--
To view, visit https://gerrit.wikimedia.org/r/377372

Gerrit-MessageType: merged
Gerrit-Change-Id: I138ed5243fc58acf7961f4e6a5b81190b13b6f78
Gerrit-PatchSet: 1
Gerrit-Project: wikidata/query/rdf
Gerrit-Branch: master
Gerrit-Owner: Smalyshev <smalys...@wikimedia.org>
Gerrit-Reviewer: Gehel <guillaume.leder...@wikimedia.org>
Gerrit-Reviewer: Smalyshev <smalys...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>
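
Note on the "analogous structure" that docs/Categories.md expects from SOURCE: it can be read off the curl calls in loadCategoryDump.sh, which fetch `$SOURCE/lastdump/$WIKI-categories.last` for the timestamp and then `$SOURCE/$TS/$WIKI-$TS-categories.ttl.gz` for the dump itself. The sketch below only restates that layout and is not part of the merged change. The helper names `dump_filename`, `last_dump_url` and `dump_url` are hypothetical, and the timestamp `20170913` is a made-up placeholder, since the real value is whatever the `.last` file contains.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the URL layout used by loadCategoryDump.sh.
# They build strings only and perform no network access.

# Same default as in the script; override SOURCE to point at a mirror.
SOURCE=${SOURCE:-"https://dumps.wikimedia.org/other/categoriesrdf"}

# File name of the dump for a given wiki and timestamp.
dump_filename() {
	local wiki=$1 ts=$2
	echo "$wiki-$ts-categories.ttl.gz"
}

# URL of the marker file that holds the timestamp of the latest dump.
last_dump_url() {
	local wiki=$1
	echo "$SOURCE/lastdump/$wiki-categories.last"
}

# URL of the dump file itself for a given wiki and timestamp.
dump_url() {
	local wiki=$1 ts=$2
	echo "$SOURCE/$ts/$(dump_filename "$wiki" "$ts")"
}

# Example with a placeholder timestamp (not a real dump).
last_dump_url testwiki
dump_url testwiki 20170913
```

A mirror used as SOURCE would therefore need a `lastdump/` directory with one `WIKI-categories.last` file per wiki, plus one directory per timestamp holding the matching `.ttl.gz` files.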