Manybubbles created this task.
Manybubbles added a subscriber: Manybubbles.
Manybubbles added projects: Search-Team, Wikidata-Query-Service.
Manybubbles moved this task to Backlog on the Search-Team workboard.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Blazegraph allows a lot of customization of how values and URIs are stored in the index. By default, URIs are stored in a dictionary, and that dictionary is keyed in the index by a long. Since every entry in the index carries a one-byte overhead, that comes to 9 bytes per URI (1 byte of overhead plus the 8-byte long). Most of our URIs can be squashed much smaller than that.

This patch introduces a variable-length strategy that compresses entities to anywhere from 4 to 7 bytes (or more for much longer ids). It improves load performance by 16% and reduces on-disk size by about 15%. I suspect we can do much better if we put our minds to it.

https://gerrit.wikimedia.org/r/#/c/203837

TASK DETAIL
https://phabricator.wikimedia.org/T95906

WORKBOARD
https://phabricator.wikimedia.org/project/board/1174/

To: Manybubbles
Cc: Aklapper, Manybubbles, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, GWicke, daniel, JanZerebecki
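To make the size argument in the task description concrete, here is a rough Python sketch comparing a fixed 9-byte dictionary key with a variable-length inline encoding of an entity id. It is not the encoding used in the Gerrit patch and does not use Blazegraph's API: the `HEADER` byte value, the kind byte, and the LEB128-style varint are all assumptions chosen for illustration, so the byte counts it produces differ from the 4 to 7 bytes quoted in the task.

```python
import struct

# Hypothetical one-byte header standing in for the per-entry overhead
# mentioned in the task; the real Blazegraph flag byte is not modelled here.
HEADER = 0x01


def encode_fixed(term_id: int) -> bytes:
    """Dictionary-style key: header byte + 8-byte big-endian long = 9 bytes."""
    return bytes([HEADER]) + struct.pack(">q", term_id)


def encode_varint(n: int) -> bytes:
    """Unsigned LEB128-style varint: 7 payload bits per byte, low bits first."""
    if n < 0:
        raise ValueError("varint requires a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Inverse of encode_varint; reads until a byte without the continuation bit."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


def encode_entity(entity_id: str) -> bytes:
    """Assumed inline form for ids like 'Q42': header + kind byte + varint number."""
    kind, number = entity_id[0], int(entity_id[1:])
    return bytes([HEADER, ord(kind)]) + encode_varint(number)


def decode_entity(data: bytes) -> str:
    """Inverse of encode_entity (the header byte is skipped)."""
    return chr(data[1]) + str(decode_varint(data[2:]))


if __name__ == "__main__":
    for entity in ("Q42", "P31", "Q1000000"):
        encoded = encode_entity(entity)
        print(entity, len(encoded), "bytes inline vs", len(encode_fixed(0)), "bytes fixed")
```

The point of the sketch is only that small numeric ids need far fewer than 8 bytes of payload, so an inline variable-length form can undercut the fixed 9-byte key while remaining reversible without a dictionary lookup.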