Repository: incubator-commonsrdf
Updated Branches:
  refs/heads/master 567b775be -> c6c4cbfcf


about Blank Nodes and literals


Project: http://git-wip-us.apache.org/repos/asf/incubator-commonsrdf/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-commonsrdf/commit/ba779e35
Tree: http://git-wip-us.apache.org/repos/asf/incubator-commonsrdf/tree/ba779e35
Diff: http://git-wip-us.apache.org/repos/asf/incubator-commonsrdf/diff/ba779e35

Branch: refs/heads/master
Commit: ba779e356ed985edf549453b943a601a354aad88
Parents: 567b775
Author: Stian Soiland-Reyes <st...@apache.org>
Authored: Mon Nov 21 13:43:27 2016 +0000
Committer: Stian Soiland-Reyes <st...@apache.org>
Committed: Mon Nov 21 13:43:27 2016 +0000

----------------------------------------------------------------------
 src/site/markdown/introduction.md | 264 ++++++++++++++++++++++++++++++++-
 1 file changed, 260 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-commonsrdf/blob/ba779e35/src/site/markdown/introduction.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/introduction.md b/src/site/markdown/introduction.md
index d494562..9d9530b 100644
--- a/src/site/markdown/introduction.md
+++ b/src/site/markdown/introduction.md
@@ -377,7 +377,263 @@ for (Triple triple : graph.iterate(alice, knows, null)) {
 
 ## Literal values
 
-We talked briefly about literals above as a way to represent values in RDF.
-What is a value? In a way you could a value is when we no longer want to
-stay in graph land and just want to use primitive types like `long`,
-`int` or `String`.  
+We talked briefly about literals above as a way to represent _values_ in RDF.
+What is a literal value? You could think of a value as what you use when you no longer
+want to stay in the graph-land of related resources, and just want primitive
+types like `float`, `int` or `String` to represent values like
+a player rating, the number of matches played, or the full name of a person
+(including spaces and punctuation, which don't work well in an identifier).
+
+In Commons RDF such values are represented as instances of `Literal`,
+which we can create using `rdf.createLiteral(..)`. Strings are easy:
+
+```java
+Literal aliceName = rdf.createLiteral("Alice W. Land");
+```
+
+We can then add a triple that relates the resource `<Alice>`
+to this value. Let's use a new predicate `<name>`:
+
+```java
+IRI name = rdf.createIRI("name");
+graph.add(alice, name, aliceName);
+```
+
+When you look up literal properties in a graph,
+keep in mind that in RDF a property is not necessarily _functional_; that is,
+it would be perfectly valid RDF-wise for a person to have multiple names.
+Alice might also have a `<name> "Alice Land"`.  Instead of using
+`graph.iterate()` and `break` in a for-loop, it might be easier to use the
+Java 8 `Stream` returned from `.stream()` together with `.findAny()`,
+which returns an `Optional` that is empty in case there is no `<name>`:
+
+```java
+System.out.println(graph.stream(alice, name, null).findAny());
+```
+
+> `Optional[<Alice> <name> "Alice W. Land" .]`
+
+**Note:** Using `.findFirst()` will not necessarily return the "first"
+recorded triple, as triples in a graph are not necessarily
+kept in order.
+
+You can use `optional.isPresent()` and `optional.get()` to check if a
+`Triple` matched the graph stream pattern:
+
+```java
+Optional<? extends Triple> nameTriple = graph.stream(alice, name, null).findAny();
+if (nameTriple.isPresent()) {
+    System.out.println(nameTriple.get());
+}
+```
+
+If you feel adventurous, you can try the
+[Java 8 functional programming](http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/Lambda-QuickStart/index.html)
+style to work with `Stream` and `Optional` and get the literal value unquoted:
+
+```java
+graph.stream(alice, name, null)
+        .findAny().map(Triple::getObject)
+        .filter(obj -> obj instanceof Literal)
+        .map(literalName -> ((Literal)literalName).getLexicalForm())
+        .ifPresent(System.out::println);
+```
+
+> `Alice W. Land`
+
+Notice how we used a `.filter` here to skip any non-`Literal` names
+(which would not have the `.getLexicalForm()` method).
+
+
+
+### Typed literals
+
+Non-String value types are represented in RDF as _typed literals_,
+which are similar to (but not the same as) Java native types. A
+typed literal is a combination of a _string representation_
+(e.g. "13.37") and a data type IRI, e.g. `<http://www.w3.org/2001/XMLSchema#float>`.
+RDF reuses the XSD datatypes.
+
+A collection of the standardized datatype `IRI`s
+is provided in Simple's [Types](apidocs/org/apache/commons/rdf/simple/Types.html)
+class, which we can use with `createLiteral` by adding the corresponding `import`:
+
+```java
+import org.apache.commons.rdf.simple.Types;
+// ...
+IRI playerRating = rdf.createIRI("playerRating");
+Literal aliceRating = rdf.createLiteral("13.37", Types.XSD_FLOAT);
+graph.add(alice, playerRating, aliceRating);
+```
+
+Note that Commons RDF does not currently provide converters
+between native Java data types and the RDF string representations.
+
+### Language-specific literals
+
+We live in a globalized world, with many spoken and written languages.
+While we can often agree about a concept like `<Football>`, different
+languages might call it different things. The distinction in RDF
+between identified resources and literal values means we can represent
+several names or labels for the same thing.
+
+Rather than introducing language-specific predicates like
+`<name_in_english>` and `<name_in_norwegian>`,
+it is usually better in RDF to use _language-tagged literals_:
+
+```java
+Literal footballInEnglish = rdf.createLiteral("football", "en");
+Literal footballInNorwegian = rdf.createLiteral("fotball", "no");
+
+graph.add(football, name, footballInEnglish);
+graph.add(football, name, footballInNorwegian);
+```
+
+Language tags like `"en"` and `"no"` are
+defined by [BCP47](https://tools.ietf.org/html/bcp47). You can't just make
+up your own, but must use one that matches the language. It is possible to use
+regional variants of a language as well, e.g.
+
+```java
+Literal footballInAmericanEnglish = rdf.createLiteral("soccer", "en-US");
+graph.add(football, name, footballInAmericanEnglish);
+```
+
+Note that Commons RDF does not currently provide constants for
+the standardized languages or methods to look up localized languages.
+
+## Blank nodes - when you don't know the identity
+
+Sometimes you don't know the identity of a resource. This can be the case
+where you know the _existence_ of a resource, similar to "someone" or "some"
+in English.  For instance,
+
+```turtle
+<Charlie> <knows> _:someone .
+_:someone <plays> <Football> .
+```
+
+We don't know who this `_:someone` is; it could be `<Bob>` (who we know
+plays football), or it could be someone else, even `<Alice>`
+(we don't know that she doesn't play football).
+
+In RDF we represent `_:someone` as a _blank node_ - it's a resource without
+a global identity.  Different RDF files can all talk about `_:blanknode`, but they
+would all be different resources.  Crucially, a blank node can be used
+in multiple triples within the same graph, so that we can relate
+a subject to a blank node resource, and then describe that resource (usually incompletely).
+
+Let's add the blank node statements to our graph:
+
+```java
+BlankNode someone = rdf.createBlankNode();
+graph.add(charlie, knows, someone);
+graph.add(someone, plays, football);
+BlankNode someoneElse = rdf.createBlankNode();
+graph.add(charlie, knows, someoneElse);
+```
+
+Every call to `rdf.createBlankNode()` creates a new, unrelated blank node
+with an internal identifier. Let's have a look:
+
+```java
+for (Triple heKnows : graph.iterate(charlie, knows, null)) {
+    if (! (heKnows.getObject() instanceof BlankNodeOrIRI)) {
+        continue;
+    }
+    BlankNodeOrIRI who = (BlankNodeOrIRI)heKnows.getObject();
+    System.out.println("Charlie knows "+ who);
+    for (Triple whoPlays : graph.iterate(who, plays, null)) {
+        System.out.println("  who plays " + whoPlays.getObject());
+    }
+}      
+```
+
+> `Charlie knows _:ae4115fb-86bf-3330-bc3b-713810e5a1ea` <br>
+> `  who plays <Football>` <br>
+> `Charlie knows _:884d5c05-93a9-3709-b655-4152c2e51258`
+
+As we see above, given a `BlankNode` instance it is perfectly
+valid to ask the same `Graph` about further triples
+relating to the `BlankNode`.
+
+### Blank node labels
+
+In Commons RDF it is also possible to create a blank node from a
+_name_ - which can be useful if you don't want to keep (or look up)
+the `BlankNode` instance to later add statements about the same node.
+
+Let's first delete the old BlankNode statements:
+
+```java
+graph.remove(null,null,someone);
+graph.remove(someone,null,null);
+```
+
+And now we'll try an alternate approach:
+
+```java
+// no Java variable for the new BlankNode instance
+graph.add(charlie, knows, rdf.createBlankNode("someone"));        
+// at any point later (with the same RDF instance)
+graph.add(rdf.createBlankNode("someone"), plays, football);
+```
+
+
+Running the `"Charlie knows"` query again should still work, but now
+return a different identifier.
+
+
+> `Charlie knows _:5e2a75b2-33b4-3bb8-b2dc-019d42c2215a` <br>
+> `  who plays <Football>` <br>
+> `Charlie knows _:884d5c05-93a9-3709-b655-4152c2e51258`
+
+
+You may notice that with `SimpleRDF` the string `"someone"` does not
+survive into the string representation of the `BlankNode` label as `_:someone`;
+other `RDF` implementations may support that.
+
+Note that it needs to be the same `RDF` instance to recreate
+the same _"someone"_ `BlankNode`.  This is a Commons RDF-specific behaviour to improve
+cross-graph compatibility. Other RDF frameworks may save the blank node using
+the provided name as a blank node label, which in some cases
+could cause collisions (but perhaps gives more readable output).
+
+
+### Open world assumption
+
+How to interpret a blank node depends on the assumptions you build into your
+RDF application - it could be thought of as a logical "there exists a resource that.."
+or a more pragmatic "I don't know/care about the resource's IRI". Blank nodes can be
+useful if your RDF model describes intermediate resources like
+"a person's membership of an organization" or "a participant's result in a race",
+for which it is often not worth maintaining identifiers.
+
+It is common on the semantic web to use the
+[open world assumption](http://wiki.opensemanticframework.org/index.php/Overview_of_the_Open_World_Assumption) -
+if something is not stated as a _triple_ in your graph, then you don't know whether
+it is true or false,
+for instance whether `<Alice> <plays> <Football>` holds.
+
+Note that the open world assumption applies both to `IRI`s and `BlankNode`s,
+that is, you can't necessarily assume that the
+resources `<Alice>` and `<Charlie>` describe
+two different people just because they have
+two different identifiers - in fact it is very common that different systems use
+different identifiers to describe the same (or pretty much the same) thing in the
+real world.
+
+It is however common for applications to
+"close the world", saying "given this information I have
+gathered as RDF, I'll assume these resources are all separate things in the world;
+do I then know if `<Alice> <plays> <Football>` is false?".
+
+Using logical _inference rules_ and _ontologies_
+is one method to get stronger assumptions and conclusions.
+Note that building good rules or ontologies requires a fair
+bit more knowledge than what can be conveyed in this short tutorial.
+
+It is out of scope for Commons RDF to support the many ways to deal with
+logical assumptions and conclusions; however, you may be interested in using the
+[Jena implementation](implementations.html#Apache_Jena)
+combined with Jena's [ontology API](https://jena.apache.org/documentation/ontology/).
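
The typed literals section in this diff notes that Commons RDF does not convert between native Java types and RDF string representations. A minimal sketch of doing that conversion by hand, using only the JDK, might look like the following. The class `LexicalFormSketch` and its method `toJavaValue` are hypothetical names for illustration and are not part of Commons RDF. Only a few XSD datatypes are covered, and XSD-specific lexical forms such as `INF` for floats or `1`/`0` for booleans are not handled.

```java
import java.math.BigInteger;

/** Hypothetical helper, not part of Commons RDF: maps a lexical form to a Java value. */
public class LexicalFormSketch {
    public static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Picks a Java type based on the datatype IRI string of a typed literal. */
    public static Object toJavaValue(String lexicalForm, String datatypeIri) {
        if ((XSD + "float").equals(datatypeIri)) {
            return Float.parseFloat(lexicalForm);
        }
        if ((XSD + "integer").equals(datatypeIri)) {
            // xsd:integer is unbounded, so use BigInteger rather than int or long
            return new BigInteger(lexicalForm);
        }
        if ((XSD + "boolean").equals(datatypeIri)) {
            return Boolean.parseBoolean(lexicalForm);
        }
        // Unknown datatype (or xsd:string): keep the string representation as-is
        return lexicalForm;
    }

    public static void main(String[] args) {
        Object rating = toJavaValue("13.37", XSD + "float");
        System.out.println(rating.getClass().getSimpleName() + " " + rating);
    }
}
```

With a Commons RDF `Literal`, the two arguments would come from `literal.getLexicalForm()` and `literal.getDatatype().getIRIString()`.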

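
The language-specific literals section in this diff notes that language tags follow BCP47 and that Commons RDF offers no constants or lookup methods for them. One possible way to at least check the syntax of a tag before passing it to `createLiteral` is the JDK's `java.util.Locale.Builder`, sketched below. The class `LanguageTagSketch` and its method `normalize` are hypothetical names for illustration. This only checks that a tag is well-formed; it does not verify that the tag is registered or that it matches the language of the text.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

/** Hypothetical helper, not part of Commons RDF: syntax check for BCP47 tags. */
public class LanguageTagSketch {

    /** Returns the tag in canonical BCP47 casing, or null if it is not well-formed. */
    public static String normalize(String tag) {
        try {
            return new Locale.Builder().setLanguageTag(tag).build().toLanguageTag();
        } catch (IllformedLocaleException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        for (String tag : new String[] { "en", "EN-us", "not a tag" }) {
            System.out.println(tag + " -> " + normalize(tag));
        }
    }
}
```

A non-null result could then be used as the second argument, as in `rdf.createLiteral("soccer", normalize("en-US"))`.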