For the _mapping, I am thinking about two more types, "iri" and "literal", for which I intend to write ES type mappers, so that ES can receive XSD data types and language codes and map them to fields / analyzers. IRIs are just opaque strings, but they can be shortened if a prefix is configured, and they can be used as an _id or for referencing an _id.
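A minimal sketch of that idea, in Python for brevity. Everything named here is an assumption made up for illustration: the prefix table, the XSD-to-ES and language-to-analyzer tables, and the helpers `shorten_iri` and `literal_field`. The "string" field type is the one ES offers at the time of this thread; it is not a finished type mapper.

```python
# Hypothetical prefix table; in practice it would come from the index configuration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed, non-exhaustive mapping from XSD datatypes to ES field types.
XSD_TO_ES = {
    XSD + "string": "string",
    XSD + "integer": "long",
    XSD + "double": "double",
    XSD + "boolean": "boolean",
    XSD + "dateTime": "date",
}

# Assumed mapping from language codes to ES language analyzers.
LANG_TO_ANALYZER = {"en": "english", "de": "german", "it": "italian"}


def shorten_iri(iri, prefixes=PREFIXES):
    """Return 'prefix:local' if a configured namespace matches, else the IRI unchanged."""
    best = None
    for prefix, namespace in prefixes.items():
        # Prefer the longest matching namespace when several match.
        if iri.startswith(namespace) and (best is None or len(namespace) > len(best[1])):
            best = (prefix, namespace)
    if best is None:
        return iri
    return best[0] + ":" + iri[len(best[1]):]


def literal_field(datatype=None, language=None):
    """Build a field mapping fragment for an RDF literal."""
    if language:
        # Language-tagged literals are strings; pick an analyzer if one is known.
        field = {"type": "string"}
        analyzer = LANG_TO_ANALYZER.get(language.lower())
        if analyzer:
            field["analyzer"] = analyzer
        return field
    # Typed literals fall back to "string" when the datatype is unknown.
    return {"type": XSD_TO_ES.get(datatype, "string")}
```

With this, `shorten_iri("http://purl.org/dc/elements/1.1/title")` gives a field name like `dc:title`, and an IRI without a configured prefix stays an opaque string.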
Instead of _mapping, I prefer the idea of handling @contexts like template documents. I'm not sure about the best way to manage JSON-LD. There are two approaches. The first is to save the JSON-LD (what you call the original document) beside other versions. This requires more space, and I'm not sure about the purpose of the original JSON-LD. The other approach is to drop the original JSON-LD after parsing it to triples and to store the triples in an ES JSON doc, which is a surrogate close to JSON-LD but accommodates all the JSON dialect characteristics of the ES document DSL.

I'm not into Scala, so I cannot promise much, but I'd be happy to have a look at all the related code!

Jörg

On Sat, Sep 27, 2014 at 4:50 PM, Alfredo Serafini <ser...@gmail.com> wrote:

> Hi Jörg
> Indeed! :-)
>
> What I like about _mappings is that they are managed as documents too, and
> they can be:
>
> 1. automatically inferred from data (risky, but useful)
> 2. provided by static files, in some cases
> 3. managed for _index/_types
>
> All those things could be done with something like a _context (which will
> include at first a single @context). The first point should probably be
> avoided altogether for json-ld :-), but it should be possible.
>
> But we may need more @context items for a single "resource" schema
> (referring to _index/_type), and in perspective it's even possible to
> re-use a @context for different _index/_type pairs.
> Furthermore: when exposing results in jsonld one might want to reference
> an external @context and merge it before providing results, and in my
> opinion the more "risky" part is when we input the original json-ld, if we
> want to flatten it and extract the @context which will permit us to
> reconstruct the original document later.
> Given the fact that it could be possible to map every kind of json result
> from ES, documents imported as jsonld might have to maintain at least the
> original fields.
>
> I'd like to put some code on GitHub, and if you want we could join the
> effort on that?
> I'm working mostly in Scala at the moment. What do you think?
>
> On Friday, 26 September 2014 20:32:52 UTC+2, Jörg Prante wrote:
>>
>> Absolutely. My thought is about managing one (or more) context ES JSON
>> document(s) where all the @context definitions of an index live. A format
>> plugin can then process search results and convert ES JSON to expanded
>> JSON-LD and from there to other RDF serializations.
>>
>> Jörg
>>
>> On Fri, Sep 26, 2014 at 6:23 PM, Alfredo Serafini <ser...@gmail.com>
>> wrote:
>>
>>> Hi
>>>
>>> using json-ld is indeed rather simple, as it is JSON, and then it's even
>>> possible to index it as is.
>>> I'm currently using ES for storing RDF documents in json-ld on a
>>> specific index: in that case one can simply use the uri as an _id, recover
>>> the full original format by _source, and use basic search capabilities on
>>> the index, if escaping / nesting is not a big deal.
>>>
>>> However, in order to use resources with some more flexibility, I think
>>> the best would be to index them as "flat" as possible, then use an ad-hoc
>>> @context on the ES json to obtain the original json-ld again.
>>> This would be my ideal usage at the moment: it seems complex at first, but
>>> it's not. I'm currently experimenting with saving a @context for a _type,
>>> obtaining, let's say, a sort of _context, similar to a _mapping, to
>>> reconstruct the original semantics.
>>> If someone likes the idea, I'd like to share thoughts on that :-)
>>>
>>> On Friday, 26 September 2014 14:08:07 UTC+2, Jörg Prante wrote:
>>>>
>>>> Lukáš,
>>>>
>>>> of course you are right, RDF/XML looks complex and requires parsing.
>>>> The underlying principle of all RDF is a graph (or a series of triples in
>>>> the form of subject/predicate/object, where the triple series is a
>>>> serialization of the graph). So the challenge is first parsing the RDF
>>>> input, second constructing the model, and third serializing the model
>>>> to an ES-friendly input (here: JSON-LD, sort of). RDF ensures that there is
>>>> a single model for all serializations.
>>>>
>>>> This technical perspective does not necessarily solve all challenges
>>>> that are inherent to the chosen data model. For example, nested resources
>>>> in RDF. It might be feasible to flatten nested resources by their
>>>> identifiers and generate one JSON document after the other. Or it could be
>>>> feasible to keep nested resources intact and wrap them into nested
>>>> structures in a single ES JSON object.
>>>>
>>>> In my data model, I can map RDF subject IDs to ES doc IDs. Other data
>>>> models may prefer other approaches to select ES doc IDs.
>>>>
>>>> Jörg
>>>>
>>>> On Fri, Sep 26, 2014 at 10:11 AM, Lukáš Vlček <lukas...@gmail.com>
>>>> wrote:
>>>>
>>>>> Jörg,
>>>>>
>>>>> my concern is that RDF/XML allows expressing one thing in several ways.
>>>>> For example, if you take the FOAF specification, there are several ways
>>>>> to express that one Person knows another Person. One way is using
>>>>> reference IDs, the other is nesting a Person inside another Person. See
>>>>> [1] for examples. My understanding is that although both ways express
>>>>> exactly the same information, they lead to different XML representations
>>>>> and thus to different JSON-LD. You can push such data into ES, but I
>>>>> wonder if you can then have any consistent way of querying such data.
>>>>>
>>>>> Maybe there is some way to preprocess the XML document and
>>>>> convert all nested Persons to references (would require arbitrary ID
>>>>> construction?). Or something similar.
>>>>> Though I am not sure this would be a generally applicable approach
>>>>> to any RDF data.
>>>>>
>>>>> [1] http://www.xml.com/pub/a/2004/02/04/foaf.html
>>>>>
>>>>> Regards,
>>>>> Lukas
>>>>>
>>>>> On Fri, Sep 26, 2014 at 9:28 AM, joerg...@gmail.com <
>>>>> joerg...@gmail.com> wrote:
>>>>>
>>>>>> JSON-LD is perfect for ES indexing, as long as you use the "compact"
>>>>>> form of representation.
>>>>>>
>>>>>> http://www.w3.org/TR/json-ld-api/#compaction-algorithms
>>>>>>
>>>>>> Example:
>>>>>>
>>>>>> https://github.com/lanthaler/JsonLD/blob/master/Test/Fixtures/sample-compacted.jsonld
>>>>>>
>>>>>> This means you should use short field names and shorten IRIs to a
>>>>>> prefix form. This gives a convenient mapping to ES field names (e.g.
>>>>>> "dc:title" or "dc:creator"). The '@' fields can also be indexed, and
>>>>>> they do not control anything special in ES (some @id may be mapped to
>>>>>> ES _id, but for nested structures this does not match).
>>>>>>
>>>>>> I use my own RDF API and transform RDF graphs (so not only JSON-LD
>>>>>> but also other formats like N-Triples and RDF/XML) into XContent using
>>>>>> this method:
>>>>>>
>>>>>> https://github.com/xbib/xbib/blob/master/content/src/main/java/org/xbib/rdf/content/DefaultResourceContentBuilder.java
>>>>>>
>>>>>> I plan to extend this content building by interpreting rdf:type and
>>>>>> rdf:list etc. to generate correct ES JSON objects and arrays. There is
>>>>>> also an amount of work left to do for the plethora of XSD types in RDF
>>>>>> literals or for language tags.
>>>>>>
>>>>>> This will be subsumed into an RDF input/output plugin for an ES-based
>>>>>> Linked Data Platform
>>>>>>
>>>>>> http://www.w3.org/TR/ldp/
>>>>>>
>>>>>> but there is no ETA yet.
>>>>>>
>>>>>> Jörg
>>>>>>
>>>>>> On Fri, Sep 26, 2014 at 5:08 AM, Lukáš Vlček <lukas...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I think you will have to preprocess the documents on your side first
>>>>>>> and then push them into ES individually (you can push in batches).
>>>>>>>
>>>>>>> As a side note, I would say json-ld is a quite low-level serialization
>>>>>>> of RDF data, IMO not optimal for ES indexing. It may be better to
>>>>>>> find some RDF-OOM tool, have your RDF documents mapped to Java POJOs,
>>>>>>> and serialize the POJOs into JSON instead (you can use the Jackson
>>>>>>> library for that, for example). This will give you better control over
>>>>>>> the whole RDF -> JSON conversion process.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Lukas
>>>>>>>
>>>>>>> On Thu, Sep 25, 2014 at 7:21 PM, abo <a...@datavolution.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I'm new to Elasticsearch, so forgive me if this is a basic question
>>>>>>>> or if it's in some documentation that I haven't read...
>>>>>>>>
>>>>>>>> I am trying to load a json-ld file into ES. The json-ld file was
>>>>>>>> generated from an RDF file, using Jena. The structure starts with:
>>>>>>>>
>>>>>>>> {
>>>>>>>> "@graph" :
>>>>>>>>
>>>>>>>> followed by the individual "documents", each with:
>>>>>>>>
>>>>>>>> {
>>>>>>>> "@id" :
>>>>>>>>
>>>>>>>> and a variable number of parameters in each.
>>>>>>>>
>>>>>>>> My question is how do I load this into ES and ensure that documents
>>>>>>>> are individually referenced (as opposed to the entire json-ld file)?
>>>>>>>>
>>>>>>>> Do I need to doctor this json-ld file further in order to load it?
>>>>>>>>
>>>>>>>> Thanks for your help.
>>>>>>>>
>>>>>>>> -- abo
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "elasticsearch" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to elasticsearc...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/ec26bbe7-5bb1-4c50-96c4-8f586e1e0807%40googlegroups.com.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
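For the original question quoted above (loading a json-ld file whose top level is a `@graph` so that each `@id` node becomes its own ES document), a minimal sketch in Python could look like the following. It assumes the file has a single top-level `@graph` array, uses each node's `@id` as the ES `_id`, and produces a body in the bulk API's action-line/source-line format. The function name `graph_to_bulk` and the index and type names are hypothetical, and nested resources are left as they are.

```python
import json


def graph_to_bulk(jsonld, index, doc_type):
    """Turn each node of a top-level @graph into one bulk 'index' action."""
    lines = []
    for node in jsonld.get("@graph", []):
        node_id = node.get("@id")
        if node_id is None:
            # Nodes without an @id are skipped in this sketch.
            continue
        action = {"index": {"_index": index, "_type": doc_type, "_id": node_id}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(node))    # source line: the node itself
    # The bulk format expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = {
        "@graph": [
            {"@id": "http://example.org/a", "dc:title": "First"},
            {"@id": "http://example.org/b", "dc:title": "Second"},
        ]
    }
    print(graph_to_bulk(sample, "rdf", "resource"), end="")
```

The resulting string can be sent as the body of a bulk request. Whether to keep the `@id` field inside the document as well, or to flatten nested nodes first, depends on the data model, as discussed in the thread.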