Michael added a comment.
In T295560#7504717 <https://phabricator.wikimedia.org/T295560#7504717>, @Manuel wrote:

>> The trouble with compression is that we can //never// change it again.
>
> I see. Maybe some very simple dictionary approach could work? The syntax is highly redundant so that might go a long way. It would also leave the resulting string still interpretable. Not the perfect solution anyways.

The syntax isn't that redundant. Eg. if you consider the first example linked above:

#defaultView:Map
SELECT ?s ?sLabel ?coor ?operator ?operatorLabel ?image ?layer
WHERE {
  ?s wdt:P31/wdt:P279* wd:Q28564 ;
     wdt:P17 wd:Q145 ;
     wdt:P625 ?coor ;
     wdt:P137 ?operator ;
  OPTIONAL {?s wdt:P18 ?image .}
  VALUES ?operator {
    wd:Q4923796 wd:Q4966533 wd:Q5016926 wd:Q5038400 wd:Q5043224 wd:Q5064127
    wd:Q5166758 wd:Q5256629 wd:Q16837157 wd:Q5623821 wd:Q6083890 wd:Q16997658
    wd:Q6901162 wd:Q6984500 wd:Q16998902 wd:Q7161994 wd:Q7236943 wd:Q7321391
    wd:Q5123523 wd:Q7825688 wd:Q7909538 wd:Q8038115
  }
  BIND(
    IF(?operator = wd:Q4923796, "Blaenau Gwent",
    IF(?operator = wd:Q4966533, "Bridgend",
    IF(?operator = wd:Q5016926, "Caerphilly",
    IF(?operator = wd:Q5038400, "Cardiff",
    IF(?operator = wd:Q5043224, "Carmarthenshire",
    IF(?operator = wd:Q5064127, "Ceredigion",
    IF(?operator = wd:Q5166758, "Conwy",
    IF(?operator = wd:Q5256629, "Denbighshire",
    IF(?operator = wd:Q16837157, "Flintshire",
    IF(?operator = wd:Q5623821, "Gwynedd",
    IF(?operator = wd:Q6083890, "Isle of Anglesey",
    IF(?operator = wd:Q16997658, "Merthyr Tydfil",
    IF(?operator = wd:Q6901162, "Monmouthshire",
    IF(?operator = wd:Q6984500, "Neath Port Talbot",
    IF(?operator = wd:Q16998902, "Newport",
    IF(?operator = wd:Q7161994, "Pembrokeshire",
    IF(?operator = wd:Q7236943, "Powys",
    IF(?operator = wd:Q7321391, "Rhondda Cynon Taf",
    IF(?operator = wd:Q5123523, "Swansea",
    IF(?operator = wd:Q7825688, "Torfaen",
    IF(?operator = wd:Q7909538, "Vale of Glamorgan",
    IF(?operator = wd:Q8038115, "Wrexham", "")))))))))))))))))))))) AS ?layer).
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY ?operatorLabel ?sLabel

We can //not// replace the variables (`?operator`), the ids (`Q7825688`, `P31`) and the string literals (`"Blaenau Gwent"`) with a code-side dictionary. That leaves only the literal SPARQL keywords (`OPTIONAL`) and some common constructs (`SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }`). That's not nothing, but it is much less than it might seem.

Also, it is still more effort than it might seem, because we have to be very careful in our work: we might not be able to fix some mistakes after launch. And it is very easy to make mistakes here, e.g. by replacing a keyword with a key that also appears in the actual content. Writing your own compression algorithm isn't as bad as writing your own encryption algorithm, but similar arguments still apply.

So, //if// we were to go the route of compression, then I would recommend using a very stable implementation of that LZ algorithm from the '70s, so that we can hope that it will still be around for a while. (See also: Lindy effect <https://en.wikipedia.org/wiki/Lindy_effect>)
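To make the "key that also appears in the actual content" pitfall concrete, here is a minimal Python sketch. The dictionary, its keys (`O1`, `S1`, `W1`) and the sample query are made up for illustration and are not a proposal from this task. For comparison it also round-trips a query through `zlib` from the Python standard library; that is DEFLATE, which combines LZ77 with Huffman coding, and it is used here only as an assumed stand-in for "a very stable LZ implementation".

```python
import zlib

# Hypothetical code-side dictionary: SPARQL keyword -> short key.
# The keys are invented for this illustration only.
NAIVE_DICT = {"OPTIONAL": "O1", "SELECT": "S1", "WHERE": "W1"}


def naive_encode(query: str) -> str:
    """Replace each keyword with its short key (no escaping at all)."""
    for word, key in NAIVE_DICT.items():
        query = query.replace(word, key)
    return query


def naive_decode(encoded: str) -> str:
    """Reverse the substitution; this cannot tell keys from content."""
    for word, key in NAIVE_DICT.items():
        encoded = encoded.replace(key, word)
    return encoded


def lz_roundtrip(query: str) -> tuple[int, int]:
    """Compress with zlib (DEFLATE) and return (raw size, compressed size)."""
    raw = query.encode("utf-8")
    packed = zlib.compress(raw, 9)
    # A mature implementation round-trips losslessly for any content.
    assert zlib.decompress(packed) == raw
    return len(raw), len(packed)


# The user happened to name a variable ?O1, which collides with a key.
QUERY = "SELECT ?O1 WHERE { OPTIONAL { ?s wdt:P18 ?O1 } }"

# A repetitive fragment in the style of the nested IFs quoted above
# (the ids and labels are generated, not real Wikidata items).
REPETITIVE = " ".join(
    f'IF(?operator = wd:Q{n}, "label {n}",' for n in range(1000, 1022)
)

if __name__ == "__main__":
    restored = naive_decode(naive_encode(QUERY))
    print("original:", QUERY)
    print("restored:", restored)
    print("dictionary round-trip ok:", restored == QUERY)
    print("zlib (raw, packed) bytes:", lz_roundtrip(REPETITIVE))
```

The dictionary round trip silently turns the variable `?O1` into `?OPTIONAL`, so a stored link would decode to a different query. Avoiding that needs an escaping scheme, and any mistake in that scheme is frozen in once links are in circulation.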