I see, this all makes sense Andy, but I think I would like an option for turning the replication / versioning off, or at least to direct that data into another location.
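Until something like that exists, the closest workaround I can see is to compact periodically so that superseded snapshots are reclaimed. A minimal sketch with the Java API (the "DB" location is just a placeholder; the command-line equivalent is tdb2.tdbcompact):

import org.apache.jena.query.Dataset;
import org.apache.jena.tdb2.DatabaseMgr;
import org.apache.jena.tdb2.TDB2Factory;

public class CompactTDB2 {
    public static void main(String[] args) {
        // Placeholder location: point this at the real TDB2 database directory.
        Dataset dataset = TDB2Factory.connectDataset("DB");

        // compact() writes the latest snapshot into a new storage generation
        // inside the database directory and switches over to it; it blocks
        // other writers while it runs, reads can continue (see Andy's note
        // in the quoted thread below).
        DatabaseMgr.compact(dataset.asDatasetGraph());
    }
}

As far as I understand it, the old generation directory is left behind on disk, so the space only actually comes back once that directory is removed.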
I believe this is only the case for TDB/TDB2-backed datasets?

On Tue, Nov 23, 2021 at 10:45 AM Andy Seaborne <[email protected]> wrote:
>
> On 23/11/2021 09:40, Rob Vesse wrote:
> > Marco
> >
> > So there are a couple of things going on.
> >
> > Firstly the Node Table, the mapping of RDF Terms to the internal Node
> > IDs used in the indexes, can only ever grow. TDB2 doesn't do reference
> > counting, so it doesn't ever remove entries from the table as it doesn't
> > know when a Node ID is no longer needed. Also, for RDF Terms that aren't
> > directly interned (e.g. some numerics, booleans, dates etc), so primarily
> > URIs, Blank Nodes and larger/arbitrarily typed literals, the Node ID
> > actually encodes the offset into the Node Table to make Node ID to RDF
> > Term decoding fast, so you can't just arbitrarily rewrite the Node Table.
> > And even if rewriting the Node Table were supported, it would require
> > rewriting all the indexes since those use the Node IDs.
> >
> > TL;DR the Node Table only grows because the cost of compacting it
> > outweighs the benefits. This is also why you may have seen advice in the
> > past that, if your database has a lot of DELETE operations made against
> > it, then periodically dumping all the data and reloading it into a new
> > database is recommended, since that generates a fresh Node Table with
> > only the RDF Terms currently in use.
> >
> > Secondly the indexes are themselves versioned storage, so when you
> > modify the database a new state is created (potentially pointing to
> > some/all of the existing data) but the old data is still there as well.
> > This is done for two reasons:
> >
> > 1) It allows writes to overlap with ongoing reads to improve
> > concurrency. Essentially each read/write transaction operates on a
> > snapshot of the data; a write creates a new snapshot, but an ongoing
> > read can continue to read the old snapshot it was working against.
> > 2) It provides for strong fault tolerance, since a crash/exit during a
> > write doesn't affect old data.
>
> 3) Arbitrarily large transactions.
>
> > Note that you can perform a compact operation on a TDB2 database which
> > essentially discards all but the latest snapshot and should reclaim the
> > index data that is no longer needed. This is a blocking exclusive write
> > operation so doesn't allow for concurrent reads as a normal write would.
>
> Nowadays, reads continue during compaction; it's only writes that get
> held up (I'd like to add delta-technology to fix that).
>
> There is a short period of pointer swapping with some disk sync at the
> end to switch the database in use; it is milliseconds.
>
>      Andy
>
> > Cheers,
> >
> > Rob
> >
> > PS. I'm sure Andy will chime in if I've misrepresented/misstated
> > anything above
> >
> > On 22/11/2021, 21:15, "Marco Neumann" <[email protected]> wrote:
> >
> >     Yes, I just had a look at one of my own datasets with 180mt and a
> >     footprint of 28G. The overhead is not too bad at 10-20% vs raw nt
> >     files.
> >
> >     I was surprised that the CLEAR ALL directive doesn't remove/release
> >     disk memory. Does TDB2 require a commit to release disk space?
> >
> >     Impressed to see that load times went up to 250k/s with 4.2, more
> >     than twice the speed I have seen with 3.15. Not sure if this is OS
> >     (Ubuntu 20.04.3 LTS) related.
> >
> >     Maybe we should make a recommendation to the wikidata team to
> >     provide us with a production environment type machine to run some
> >     load and query tests.
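For the dump-and-reload route Rob mentions above, a rough sketch of doing it through the Java API follows; the directory and file names are placeholders. For a database of any real size, dumping with tdb2.tdbdump and reloading with tdb2.tdbloader (or tdb2.xloader) is presumably the more practical route.

import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.jena.query.Dataset;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.system.Txn;
import org.apache.jena.tdb2.TDB2Factory;

public class DumpAndReload {
    public static void main(String[] args) throws Exception {
        // Placeholder locations: the old, grown database and a fresh one.
        Dataset oldDb = TDB2Factory.connectDataset("DB-old");
        Dataset newDb = TDB2Factory.connectDataset("DB-new");

        // Dump everything as N-Quads inside a read transaction.
        try (OutputStream out = new FileOutputStream("dump.nq")) {
            Txn.executeRead(oldDb, () ->
                RDFDataMgr.write(out, oldDb.asDatasetGraph(), Lang.NQUADS));
        }

        // Reload into the fresh database; this rebuilds the node table with
        // only the RDF terms that are actually still in use.
        Txn.executeWrite(newDb, () -> RDFDataMgr.read(newDb, "dump.nq"));
    }
}

For anything large, a single write transaction like this will be much slower than the bulk loaders, but it shows the shape of the operation.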
> >
> >     On Mon, Nov 22, 2021 at 8:43 PM Andy Seaborne <[email protected]> wrote:
> >
> > > On 21/11/2021 21:03, Marco Neumann wrote:
> > > > What's the disk footprint these days for 1b on tdb2?
> > >
> > > Quite a lot. For 1B BSBM, ~125G (which is a bit heavy on significantly
> > > sized literals - the nodes themselves are 50G). Obviously for current
> > > WD scale usage a sprinkling of compression would be good!
> > >
> > > One thing xloader gives us is that it makes it possible to load on a
> > > spinning disk. (It also has lower peak intermediate file space and is
> > > faster because it does not fall into the slow loading mode for the node
> > > table that tdbloader2 sometimes did.)
> > >
> > >      Andy
> > >
> > > > On Sun, Nov 21, 2021 at 8:00 PM Andy Seaborne <[email protected]> wrote:
> > > >
> > > >> On 20/11/2021 14:21, Andy Seaborne wrote:
> > > >>> Wikidata are looking for a replacement for BlazeGraph.
> > > >>>
> > > >>> About WDQS, current scale and current challenges:
> > > >>> https://youtu.be/wn2BrQomvFU?t=9148
> > > >>>
> > > >>> And in the process of appointing a graph consultant (5 month contract):
> > > >>> https://boards.greenhouse.io/wikimedia/jobs/3546920
> > > >>>
> > > >>> and Apache Jena came up:
> > > >>> https://phabricator.wikimedia.org/T206560#7517212
> > > >>>
> > > >>> Realistically?
> > > >>>
> > > >>> Full wikidata is 16B triples. Very hard to load - xloader may help,
> > > >>> though the goal for that was to make loading the truthy subset (5B)
> > > >>> easier. 5B -> 16B is not a trivial step.
> > > >>
> > > >> And it's growing at about 1B per quarter.
> > > >>
> > > >> https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/ScalingStrategy
> > > >>
> > > >>> Even if wikidata loads, it would be impractically slow as TDB is today.
> > > >>> (Yes, that's fixable; not practical in their timescales.)
> > > >>>
> > > >>> The current discussions feel more like they are looking for a "product"
> > > >>> - a triplestore that they can use - rather than a collaboration.
> > > >>>
> > > >>>      Andy
> >
> >     --
> >     ---
> >     Marco Neumann
> >     KONA


--
---
Marco Neumann
KONA
