Since I don't think anyone answered your specific original question:

TDB and TDB2 both use dictionary encoding (in fact, most RDF stores use some
variation on this).  Basically, they map each unique RDF term (whether URI,
string, blank node, etc.) to a consistent internal identifier and use this to
refer to the term, so a string that repeats across many triples is only
stored in full once.  Most internal data structures are therefore implemented
in terms of these internal identifiers (which are typically very compact;
TDB/TDB2 use 64-bit identifiers), and the system only translates between the
internal identifier and the full RDF term when explicitly needed, e.g. when
presenting results.
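
As a rough illustration of the idea, here is a toy in-memory sketch in Java.
It is not how TDB/TDB2 actually implement this (the real thing is persistent
and considerably more involved); the class name TermDictionary and its
methods are made up for the example, and terms are represented as plain
strings for simplicity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy dictionary encoder: hypothetical names, in-memory only,
// NOT the actual TDB/TDB2 implementation.
public class TermDictionary {
    // term -> id, used when loading data
    private final Map<String, Long> termToId = new HashMap<>();
    // id -> term, used when presenting results (the id is the list index)
    private final List<String> idToTerm = new ArrayList<>();

    // Return the existing id for a term, or assign the next free one.
    public long encode(String term) {
        Long existing = termToId.get(term);
        if (existing != null) {
            return existing;
        }
        long id = idToTerm.size();
        termToId.put(term, id);
        idToTerm.add(term);
        return id;
    }

    // Translate an id back to the full term.
    public String decode(long id) {
        return idToTerm.get((int) id);
    }

    // Number of distinct terms seen so far.
    public int size() {
        return idToTerm.size();
    }

    public static void main(String[] args) {
        TermDictionary dict = new TermDictionary();
        String[][] data = {
            {"<http://example.org/alice>", "<http://example.org/city>", "\"Helsinki\""},
            {"<http://example.org/bob>",   "<http://example.org/city>", "\"Helsinki\""},
        };

        // Triples are held as three 64-bit ids each, never as strings.
        List<long[]> triples = new ArrayList<>();
        for (String[] t : data) {
            triples.add(new long[] {
                dict.encode(t[0]), dict.encode(t[1]), dict.encode(t[2])
            });
        }

        System.out.println("distinct terms stored: " + dict.size());
        // Ids are only turned back into terms when output is needed.
        for (long[] t : triples) {
            System.out.println(dict.decode(t[0]) + " " + dict.decode(t[1])
                    + " " + dict.decode(t[2]) + " .");
        }
    }
}
```

The point of the sketch is that the repeated predicate and the repeated
string literal each get a single dictionary entry, and the triples
themselves only carry fixed-size identifiers.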

Rob

On 15/02/2019, 06:03, "Ekaterina Danilova" <katja.danilov...@gmail.com> wrote:

    I would like to ask how TDB2 and Fuseki manage big amounts of string data
    (especially repeating data) and what the best practices are. Do they
    optimize it somehow?



