Re: Relationship between similar columns from multiple databases
Jena is a library for storing, processing, and querying RDF, not a device (magical or otherwise) for deriving triples from other data sources. Using fuzzy matching to resolve items in a database to named things-in-the-world is not a job for an RDF library. There are tools that do that; I happen to work at a place that has a commercial offering for this, and there are many others. On Wed, Sep 7, 2016 at 2:00 PM, ☼ R Nair (रविशंकर नायर) wrote: > Agreed; the question is whether RDF can be created out of the data from > multiple data sources and used for semantic correlation. That would turn > the world round. In my organization, there is at least a PB of data lying > in disparate sources, untapped because it is legacy and no one knows the > relationships until they are explored manually. If Jena is not the tool for this, any suggestions on how to > manage it? Thanks > > Best, Ravion > > On Sep 7, 2016 1:55 PM, "A. Soroka" wrote: > > Jena is an RDF framework-- it's not really designed to integrate SQL > databases. Are you sure you are using the right product? Does your use case > involve a good deal of RDF processing? > > --- > A. Soroka > The University of Virginia Library > >> On Sep 7, 2016, at 1:43 PM, ☼ R Nair (रविशंकर नायर) < > ravishankar.n...@gmail.com> wrote: >> >> All, >> >> I am new to Jena. I would like to query two databases, MySQL and Oracle. >> Assume that there are similar columns in both. For example, MySQL contains a >> table EMP with an ENAME column; Oracle contains, say, a DEPT table with an >> EMPLOYEENAME column. What are the steps if I want Jena to find out that the ENAME of >> MySQL is the same as the EMPLOYEENAME column of Oracle (and so they can be joined)? Is >> this possible, at least to get an output saying both columns are similar? >> If so, how? Thanks, and I appreciate your help. >> >> Best, Ravion
Re: Web Development tools
On Sat, Sep 5, 2015 at 1:37 PM, kumar rohit wrote: > My question is mostly about Jena. Developing an ontology in Protege and > importing that file into Jena with the read() method works as a stand-alone > program. How can we use Jena if we have to develop a web application using > Jena and a Protege ontology? > If JSP is used, then how can we use Jena code inside JSP? Never put complex code in a JSP. Write plain old Java code that presents your JSP pages with a nice simple API, and call that Java code from the pages. > > On Sat, Sep 5, 2015 at 10:41 AM, buehmann < > buehm...@informatik.uni-leipzig.de> wrote: > >> @Kumar: >> Your question is impossible to answer as asked. What is the purpose of your >> question? >> >> All the things you mentioned can be used for different things; which to >> choose and how to combine them is entirely open and depends on the particular >> use case. How can somebody else tell you what your planned application >> needs? >> >> By the way, this is a JENA mailing list, meant for discussing particular >> questions around Apache JENA. Your question is better placed on a general >> discussion platform like StackOverflow or its semantic-web counterpart. >> >> Lorenz >> >> >> On 04.09.2015 22:01, kumar rohit wrote: >> >>> I could not understand it. What is it, and what is its purpose? How does it replace >>> tools >>> like JSP and Java? >>> >>> On Fri, Sep 4, 2015 at 4:19 PM, Adrian Walker >>> wrote: >>> >>> Kumar, You may be interested in the advanced semantics in the Executable English system -- www.reengineeringllc.com/internet_business_logic_in_a_nutshell.pdf -- Adrian Executable Open English / Internet Business Logic Online at www.reengineeringllc.com Nothing to download, shared use is free, and there are no advertisements On Fri, Sep 4, 2015 at 12:44 AM, kumar rohit wrote: BEA Workshop IDE, JSP, Jena TDB, Spring and Sparql... Are these tools > sufficient to build a Semantic Web application? > Is there any need for another tool or technology? Please guide me. > > >>
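A minimal sketch of the separation recommended above, using the Jena 2.x (com.hp.hpl.jena) packages of this era. The OntologyService class and its method names are illustrative, not part of Jena: a servlet would construct this object once at startup, call it, and hand the plain result list to a JSP to render.

import java.util.ArrayList;
import java.util.List;

import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

/** Plain-Java facade over Jena; the JSP/servlet layer calls this, never Jena directly. */
public class OntologyService {
    private final OntModel model;

    public OntologyService(String ontologyUrl) {
        // Load the ontology exported from Protege once, at startup.
        model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
        model.read(ontologyUrl);
    }

    /** One simple call the web tier can make, e.g. from a servlet that sets a request attribute. */
    public List<String> listClassUris() {
        List<String> uris = new ArrayList<String>();
        ExtendedIterator<OntClass> it = model.listClasses();
        while (it.hasNext()) {
            OntClass c = it.next();
            if (!c.isAnon()) {
                uris.add(c.getURI());
            }
        }
        return uris;
    }
}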
Re: Language codes
A squirrel ran by and I clicked 'send' too fast. Well, I'm exaggerating. "Perfectly awful" should be 'mildly inconvenient'. In my space, it's typical to assume that language code comparisons know the equivalence between en and eng. So, if one expects to process a range of data including languages only distinguished in -3 space, one just works with -3 codes. However, there's lots of RDF out there with -1 codes (e.g. @en). So, I can't just throw the switch, as it were, to -3 codes and expect to match against it. I need to be careful to generate triples that use -1 codes except for those languages where -3 codes are required to distinguish. Am I making sense? On Wed, Jul 2, 2014 at 6:00 PM, Benson Margulies wrote: > On Wed, Jul 2, 2014 at 5:55 PM, Andy Seaborne wrote: >> On 02/07/14 22:27, Benson Margulies wrote: >>> >>> On Wed, Jul 2, 2014 at 5:11 PM, Andy Seaborne wrote: >>>> >>>> On 02/07/14 21:45, Benson Margulies wrote: >>>>> >>>>> >>>>> Andy, >>>>> >>>>> The upshot of all of this is that ISO-639-3 codes should work. >>>>> However, that leaves a mystery for me. If I store a triple with @en, >>>>> and someone queries with @eng, are they supposed to match? In >>>>> practical terms, do they match in TDB or any other common triple >>>>> stores? >>>> >>>> >>>> >>>> No. >>>> >>>> ""@en and ""@eng are different RDF terms. As is ""@en-uk. >>>> >>>> All the stores I know of treat language tags as (normalized) strings. >>> >>> >>> That's perfectly clear and perfectly awful, at least for people who >>> care about Persian, Dari, and that ilk. Thanks. >> >> >> Why? All ISO-639 systems are supported - but there are no equivalence tables >> between the different systems built in. Or within the systems (B and T >> codes). > > Well, I'm exaggerating. "Perfectly awful" should be 'mildly inconvenient'. > > In my space, it's typical to assume that language code comparisons > know the equivalence between en and eng. So, if one expects to process > a range of data including languages only distinguished in -3 space, one > just works with -3 codes. There's lots of RDF out there with @en. So, I can't just throw the > switch, as it were, to -3 codes and expect to match against it. I need > to be careful to use -1 codes except for those languages where -3 > codes are required to distinguish. > > Am I making sense? > > >> >> (This is all outside the RDF specs - they just inherit from W3C >> Internationalization and BCP 47). >> >> >> Experiment with: >> http://www.sparql.org/data-validator.html >> >> Andy >> >> >>> >>>> >>>> SPARQL uses LANGMATCHES, which is the algorithm from RFC 4647 "Matching >>>> of >>>> Language Tags". >>>> >>>> If you want semantic (ha!) equality, then canonicalizing on input is >>>> best. >>>> Then worry about en-uk. >>>> >>>> Andy >>>> >>>> >>>>> >>>>> >>>>> >>>>> On Wed, Jul 2, 2014 at 12:34 PM, Andy Seaborne wrote: >>>>>> >>>>>> >>>>>> On 02/07/14 12:01, Benson Margulies wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> I always see two-letter ISO-639-1 language codes. This isn't enough; >>>>>>> not all languages have them. >>>>>>> >>>>>>> Does the spec specifically call for these, or does it also allow for >>>>>>> -3? >>>>>>> >>>>>>> --benson >>>>>>> >>>>>> >>>>>> RDF 1.1 Concepts: >>>>>> >>>>>> http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal >>>>>> >>>>>> so it's BCP 47 / RFC 5646 >>>>>> >>>>>> The grammars do not include the RFC grammar (because a big language tag >>>>>> grammar would dwarf the rest).
>>>>>> >>>>>> http://www.w3.org/TR/turtle/#grammar-production-LANGTAG >>>>>> >>>>>> [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* >>>>>> >>>>>> So it's neutral, and the grammars provide a more general match to language >>>>>> codes. >>>>>> >>>>>> Jena has a language tag parser: LangTag. >>>>>> >>>>>> Andy >>>>>> >>>> >>
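To make the term-level behavior above concrete, a small sketch against the Jena 2.x (com.hp.hpl.jena) API of the period; the lexical form "cat" is arbitrary:

import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class LangTagDemo {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        // Same lexical form, ISO 639-1 versus ISO 639-3 language code.
        Literal en = model.createLiteral("cat", "en");
        Literal eng = model.createLiteral("cat", "eng");
        // Distinct RDF terms: a triple stored with @en is not found by a lookup with @eng.
        System.out.println(en.equals(eng)); // false
    }
}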
Re: Language codes
On Wed, Jul 2, 2014 at 5:55 PM, Andy Seaborne wrote: > On 02/07/14 22:27, Benson Margulies wrote: >> >> On Wed, Jul 2, 2014 at 5:11 PM, Andy Seaborne wrote: >>> >>> On 02/07/14 21:45, Benson Margulies wrote: >>>> >>>> >>>> Andy, >>>> >>>> The upshot of all of this is that ISO-639-3 codes should work. >>>> However, that leaves a mystery for me. If I store a triple with @en, >>>> and someone queries with @eng, are they supposed to match? In >>>> practical terms, do they match in TDB or any other common triple >>>> stores? >>> >>> >>> >>> No. >>> >>> ""@en and ""@eng are different RDF terms. As is ""@en-uk. >>> >>> All the stores I know of treat language tags as (normalized) strings. >> >> >> That's perfectly clear and perfectly awful, at least for people who >> care about Persian, Dari, and that ilk. Thanks. > > > Why? All ISO-639 systems are supported - but there are no equivalence tables > between the different systems built in. Or within the systems (B and T > codes). Well, I'm exaggerating. "Perfectly awful" should be 'mildly inconvenient'. In my space, it's typical to assume that language code comparisons know the equivalence between en and eng. So, if one expects to process a range of data including languages only distinguished in -3 space, one just works with -3 codes. There's lots of RDF out there with @en. So, I can't just throw the switch, as it were, to -3 codes and expect to match against it. I need to be careful to use -1 codes except for those languages where -3 codes are required to distinguish. Am I making sense? > > (This is all outside the RDF specs - they just inherit from W3C > Internationalization and BCP 47). > > > Experiment with: > http://www.sparql.org/data-validator.html > > Andy > > >> >>> >>> SPARQL uses LANGMATCHES, which is the algorithm from RFC 4647 "Matching >>> of >>> Language Tags". >>> >>> If you want semantic (ha!) equality, then canonicalizing on input is >>> best. >>> Then worry about en-uk. >>> >>> Andy >>> >>> >>>> >>>> >>>> >>>> On Wed, Jul 2, 2014 at 12:34 PM, Andy Seaborne wrote: >>>>> >>>>> >>>>> On 02/07/14 12:01, Benson Margulies wrote: >>>>>> >>>>>> >>>>>> >>>>>> I always see two-letter ISO-639-1 language codes. This isn't enough; >>>>>> not all languages have them. >>>>>> >>>>>> Does the spec specifically call for these, or does it also allow for >>>>>> -3? >>>>>> >>>>>> --benson >>>>>> >>>>> >>>>> RDF 1.1 Concepts: >>>>> >>>>> http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal >>>>> >>>>> so it's BCP 47 / RFC 5646 >>>>> >>>>> The grammars do not include the RFC grammar (because a big language tag >>>>> grammar would dwarf the rest). >>>>> >>>>> http://www.w3.org/TR/turtle/#grammar-production-LANGTAG >>>>> >>>>> [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* >>>>> >>>>> So it's neutral, and the grammars provide a more general match to language >>>>> codes. >>>>> >>>>> Jena has a language tag parser: LangTag. >>>>> >>>>> Andy >>>>> >>> >
Re: Language codes
On Wed, Jul 2, 2014 at 5:11 PM, Andy Seaborne wrote: > On 02/07/14 21:45, Benson Margulies wrote: >> >> Andy, >> >> The upshot of all of this is that ISO-639-3 codes should work. >> However, that leaves a mystery for me. If I store a triple with @en, >> and someone queries with @eng, are they supposed to match? In >> practical terms, do they match in TDB or any other common triple >> stores? > > > No. > > ""@en and ""@eng are different RDF terms. As is ""@en-uk. > > All the stores I know of treat language tags as (normalized) strings. That's perfectly clear and perfectly awful, at least for people who care about Persian, Dari, and that ilk. Thanks. > > SPARQL uses LANGMATCHES, which is the algorithm from RFC 4647 "Matching of > Language Tags". > > If you want semantic (ha!) equality, then canonicalizing on input is best. > Then worry about en-uk. > > Andy > > >> >> >> >> On Wed, Jul 2, 2014 at 12:34 PM, Andy Seaborne wrote: >>> >>> On 02/07/14 12:01, Benson Margulies wrote: >>>> >>>> >>>> I always see two-letter ISO-639-1 language codes. This isn't enough; >>>> not all languages have them. >>>> >>>> Does the spec specifically call for these, or does it also allow for -3? >>>> >>>> --benson >>>> >>> >>> RDF 1.1 Concepts: >>> >>> http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal >>> >>> so it's BCP 47 / RFC 5646 >>> >>> The grammars do not include the RFC grammar (because a big language tag >>> grammar would dwarf the rest). >>> >>> http://www.w3.org/TR/turtle/#grammar-production-LANGTAG >>> >>> [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* >>> >>> So it's neutral, and the grammars provide a more general match to language >>> codes. >>> >>> Jena has a language tag parser: LangTag. >>> >>> Andy >>> >
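For the SPARQL side of this, a sketch using the era's ARQ API (com.hp.hpl.jena.query); the example resources are illustrative. langMatches(lang(?o), "en") matches @en and @en-uk by RFC 4647 basic filtering, but still does not bridge @en and @eng:

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.vocabulary.RDFS;

public class LangMatchesDemo {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        m.createResource("http://example/x").addProperty(RDFS.label, "colour", "en-uk");
        m.createResource("http://example/y").addProperty(RDFS.label, "colour", "eng");

        // Finds the @en-uk label (subtag match) but not the @eng one.
        String q = "SELECT ?s ?o WHERE { ?s ?p ?o . "
                 + "FILTER ( langMatches(lang(?o), 'en') ) }";
        QueryExecution qe = QueryExecutionFactory.create(q, m);
        try {
            ResultSet rs = qe.execSelect();
            ResultSetFormatter.out(rs);
        } finally {
            qe.close();
        }
    }
}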
Re: Language codes
Andy, The upshot of all of this is that ISO-639-3 codes should work. However, that leaves a mystery for me. If I store a triple with @en, and someone queries with @eng, are they supposed to match? In practical terms, do they match in TDB or any other common triple stores? On Wed, Jul 2, 2014 at 12:34 PM, Andy Seaborne wrote: > On 02/07/14 12:01, Benson Margulies wrote: >> >> I always see two-letter ISO-639-1 language codes. This isn't enough; >> not all languages have them. >> >> Does the spec specifically call for these, or does it also allow for -3? >> >> --benson >> > > RDF 1.1 Concepts: > > http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal > > so it's BCP 47 / RFC 5646 > > The grammars do not include the RFC grammar (because a big language tag > grammar would dwarf the rest). > > http://www.w3.org/TR/turtle/#grammar-production-LANGTAG > > [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* > > So it's neutral, and the grammars provide a more general match to language codes. > > Jena has a language tag parser: LangTag. > > Andy >
Language codes
I always see two-letter ISO-639-1 language codes. This isn't enough; not all languages have them. Does the spec specifically call for these, or does it also allow for -3? --benson
Are language and type exclusive on literals?
I see how to create an untyped literal with a language, or a typed literal without one, but not a literal with both.
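For reference, the two creation paths in question, sketched with the Jena 2.x API; under RDF 1.0 a literal carries either a language tag or a datatype, never both, which is why no third variant exists:

import com.hp.hpl.jena.datatypes.xsd.XSDDatatype;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class LiteralKinds {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        // Plain literal with a language tag -- no datatype.
        Literal tagged = m.createLiteral("chat", "fr");
        // Typed literal -- no language tag.
        Literal typed = m.createTypedLiteral("42", XSDDatatype.XSDinteger);
        System.out.println(tagged.getLanguage() + " / " + tagged.getDatatypeURI()); // fr / null
        System.out.println(typed.getLanguage() + " / " + typed.getDatatypeURI());   // (empty) / the xsd:integer URI
    }
}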
Re: TDB versus an OntModel
On Mon, Jun 30, 2014 at 1:59 PM, Andy Seaborne wrote: > connected Thanks. So, if I want to 'commit every so often', I need to pass the dataset around.
Re: TDB versus an OntModel
Thanks. Does committing the ontModel do any good, or does it have to be the dataset? On Mon, Jun 30, 2014 at 11:35 AM, Andy Seaborne wrote: > Benson, > > The example has > dataset.begin(ReadWrite.WRITE); > ... > dataset.end(); > > and it generates a warning. For once, warnings matter! > > There is no commit. > > The idiom is: > > dataset.begin(ReadWrite.WRITE); > try { > ... > dataset.commit() ; > } finally { dataset.end() ; } > > > .end without .commit forces an abort, hence no data in the read transaction. > > Andy > > > On 29/06/14 15:27, Benson Margulies wrote: >> >> OK, the default branch (andy) is now buildable outside our office. >> >> On Sun, Jun 29, 2014 at 10:21 AM, Benson Margulies >> wrote: >>> >>> I will build you a test case, but for now, those two classes amount to >>> 10 calls to the following, where 'model' is the ontModel. >>> >>> My premise was that this was a stupid mistake. >>> >>> Actually, all this code is stuff I can share on github, come to think >>> of it, and it's quite small. >>> >>> https://github.com/benson-basis/tdb-case ... but it wants a parent POM >>> that you won't have. Gimme a few minutes ... >>> >>> >>> Individual entity = model.createIndividual(encodedUri("E", >>> item.getEntityId()), Res20140626.Entity); >>> Individual document = model.createIndividual(encodedUri("D", >>> item.getDocId()), Res20140626.Document); >>> // anonymous! >>> Individual mention = >>> model.createIndividual(Res20140626.Mention); >>> if (item.getEntityId().startsWith("Q")) { >>> model.createStatement(entity, OWL.sameAs, >>> String.format("http://www.wikidata.org/entity/%s", >>> item.getEntityId())); >>> } >>> >>> model.add(entity, Res20140626.hasEntityMention, mention); >>> model.add(mention, Res20140626.hasMentionEntity, entity); >>> model.add(mention, Res20140626.hasMentionDocument, document); >>> model.add(document, Res20140626.hasDocumentMention, mention); >>> if (typedLiterals) { >>> model.add(entity, Skos.notation, >>> model.createTypedLiteral(item.getEntityId())); >>> model.add(document, Skos.notation, >>> model.createTypedLiteral(item.getDocId())); >>> model.add(mention, Res20140626.hasIndocId, >>> model.createTypedLiteral(item.getIndocId())); >>> model.add(mention, Res20140626.hasMentionStart, >>> model.createTypedLiteral(item.getMentionStart())); >>> model.add(mention, Res20140626.hasMentionEnd, >>> model.createTypedLiteral(item.getMentionEnd())); >>> model.add(mention, Res20140626.hasMentionText, >>> model.createTypedLiteral(item.getMentionText())); >>> model.add(mention, Res20140626.hasConfidence, >>> model.createTypedLiteral(item.getConfidence())); >>> } else { >>> model.add(entity, Skos.notation, >>> model.createLiteral(item.getEntityId())); >>> model.add(document, Skos.notation, >>> model.createLiteral(item.getDocId())); >>> model.add(mention, Res20140626.hasIndocId, >>> model.createLiteral(item.getIndocId())); >>> model.add(mention, Res20140626.hasMentionStart, >>> model.createLiteral(Integer.toString(item.getMentionStart())));
>>> model.add(mention, Res20140626.hasMentionEnd, >>> model.createLiteral(Integer.toString(item.getMentionEnd()))); >>> model.add(mention, Res20140626.hasMentionText, >>> model.createLiteral(item.getMentionText())); >>> model.add(mention, Res20140626.hasConfidence, >>> model.createLiteral(Double.toString(item.getConfidence()))); >>> }
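Pulling Andy's idiom together into one self-contained sketch (Jena 2.x TDB of this era; the dataset directory is illustrative):

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.vocabulary.RDFS;

public class TxnIdiom {
    public static void main(String[] args) {
        Dataset dataset = TDBFactory.createDataset("/tmp/tdb-demo");

        // Write transaction: commit before end, or the writes are aborted.
        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            model.createResource("http://example/thing").addProperty(RDFS.label, "demo");
            dataset.commit();
        } finally {
            dataset.end();
        }

        // Read transaction: the committed triple is now visible.
        dataset.begin(ReadWrite.READ);
        try {
            System.out.println(dataset.getDefaultModel().size()); // 1
        } finally {
            dataset.end();
        }
    }
}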
Re: TDB versus an OntModel
OK, the default branch (andy) is now buildable outside our office. On Sun, Jun 29, 2014 at 10:21 AM, Benson Margulies wrote: > I will build you a test case, but for now, those two classes amount to > 10 calls to the following, where 'model' is the ontModel. > > My premise was that this was a stupid mistake. > > Actually, all this code is stuff I can share on github, come to think > of it, and it's quite small. > > https://github.com/benson-basis/tdb-case ... but it wants a parent POM > that you won't have. Gimme a few minutes ... > > > Individual entity = model.createIndividual(encodedUri("E", > item.getEntityId()), Res20140626.Entity); > Individual document = model.createIndividual(encodedUri("D", > item.getDocId()), Res20140626.Document); > // anonymous! > Individual mention = model.createIndividual(Res20140626.Mention); > if (item.getEntityId().startsWith("Q")) { > model.createStatement(entity, OWL.sameAs, > String.format("http://www.wikidata.org/entity/%s", > item.getEntityId())); > } > > model.add(entity, Res20140626.hasEntityMention, mention); > model.add(mention, Res20140626.hasMentionEntity, entity); > model.add(mention, Res20140626.hasMentionDocument, document); > model.add(document, Res20140626.hasDocumentMention, mention); > if (typedLiterals) { > model.add(entity, Skos.notation, > model.createTypedLiteral(item.getEntityId())); > model.add(document, Skos.notation, > model.createTypedLiteral(item.getDocId())); > model.add(mention, Res20140626.hasIndocId, > model.createTypedLiteral(item.getIndocId())); > model.add(mention, Res20140626.hasMentionStart, > model.createTypedLiteral(item.getMentionStart())); > model.add(mention, Res20140626.hasMentionEnd, > model.createTypedLiteral(item.getMentionEnd())); > model.add(mention, Res20140626.hasMentionText, > model.createTypedLiteral(item.getMentionText())); > model.add(mention, Res20140626.hasConfidence, > model.createTypedLiteral(item.getConfidence())); > } else { > model.add(entity, Skos.notation, > model.createLiteral(item.getEntityId())); > model.add(document, Skos.notation, > model.createLiteral(item.getDocId())); > model.add(mention, Res20140626.hasIndocId, > model.createLiteral(item.getIndocId())); > model.add(mention, Res20140626.hasMentionStart, > model.createLiteral(Integer.toString(item.getMentionStart()))); > model.add(mention, Res20140626.hasMentionEnd, > model.createLiteral(Integer.toString(item.getMentionEnd()))); > model.add(mention, Res20140626.hasMentionText, > model.createLiteral(item.getMentionText())); > model.add(mention, Res20140626.hasConfidence, > model.createLiteral(Double.toString(item.getConfidence()))); > } > > On Sun, Jun 29, 2014 at 6:20 AM, Andy Seaborne wrote: >> On 27/06/14 14:25, Benson Margulies wrote: >>> >>> here's a little test method. The second call to smallAssertions sees >>> no triples at all. There's a todo on the doc for ontologies, and I'll >>> cheerfully write some doc if someone will tell me what dumb thing I've >>> missed here. >>> >>> >> >> Hi Benson, >> >> There seem to be details that might matter but aren't included. Do you have >> a complete, standalone, test case?
>> >> >>> @Test >>> public void tdb() throws Exception { >>> Dataset dataset = >>> TDBFactory.createDataset(tempDataset.getAbsolutePath()); >>> dataset.begin(ReadWrite.WRITE); >>> Model model = dataset.getDefaultModel(); >>> OntModel ontModel = >>> ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, model); >>> ResRdfBuilder resRdfBuilder = new ResRdfBuilder(ontModel, 1); >>> ResTsvReader reader = new ResTsvReader(); >>> URL dataUrl = Resources.getResource(ResTsvReaderTest.class, >>> "data.txt"); >>> ByteSource dataSource = Resources.asByteSource(dataUrl); >>> reader.read(dataSource, resRdfBuilder); >> >> >> ResRdfBuilder and ResTsvReader.read -- what do they do? >> >> >>> smallAssertions(ontModel); >>> dataset.end(); >>> >>> dataset.begin(ReadWrite.READ); >>> model = dataset.getDefaultModel(); >>> smallAssertions(model); >>> dataset.end(); >>> } >>> >> >> Andy >>
Re: TDB versus an OntModel
I will build you a test case, but for now, those two classes amount to 10 calls to the following, where 'model' is the ontModel. My premise was that this was a stupid mistake. Actually, all this code is stuff I can share on github, come to think of it, and it's quite small. https://github.com/benson-basis/tdb-case ... but it wants a parent POM that you won't have. Gimme a few minutes ... Individual entity = model.createIndividual(encodedUri("E", item.getEntityId()), Res20140626.Entity); Individual document = model.createIndividual(encodedUri("D", item.getDocId()), Res20140626.Document); // anonymous! Individual mention = model.createIndividual(Res20140626.Mention); if (item.getEntityId().startsWith("Q")) { model.createStatement(entity, OWL.sameAs, String.format("http://www.wikidata.org/entity/%s", item.getEntityId())); } model.add(entity, Res20140626.hasEntityMention, mention); model.add(mention, Res20140626.hasMentionEntity, entity); model.add(mention, Res20140626.hasMentionDocument, document); model.add(document, Res20140626.hasDocumentMention, mention); if (typedLiterals) { model.add(entity, Skos.notation, model.createTypedLiteral(item.getEntityId())); model.add(document, Skos.notation, model.createTypedLiteral(item.getDocId())); model.add(mention, Res20140626.hasIndocId, model.createTypedLiteral(item.getIndocId())); model.add(mention, Res20140626.hasMentionStart, model.createTypedLiteral(item.getMentionStart())); model.add(mention, Res20140626.hasMentionEnd, model.createTypedLiteral(item.getMentionEnd())); model.add(mention, Res20140626.hasMentionText, model.createTypedLiteral(item.getMentionText())); model.add(mention, Res20140626.hasConfidence, model.createTypedLiteral(item.getConfidence())); } else { model.add(entity, Skos.notation, model.createLiteral(item.getEntityId())); model.add(document, Skos.notation, model.createLiteral(item.getDocId())); model.add(mention, Res20140626.hasIndocId, model.createLiteral(item.getIndocId())); model.add(mention, Res20140626.hasMentionStart, model.createLiteral(Integer.toString(item.getMentionStart()))); model.add(mention, Res20140626.hasMentionEnd, model.createLiteral(Integer.toString(item.getMentionEnd()))); model.add(mention, Res20140626.hasMentionText, model.createLiteral(item.getMentionText())); model.add(mention, Res20140626.hasConfidence, model.createLiteral(Double.toString(item.getConfidence()))); } On Sun, Jun 29, 2014 at 6:20 AM, Andy Seaborne wrote: > On 27/06/14 14:25, Benson Margulies wrote: >> >> here's a little test method. The second call to smallAssertions sees >> no triples at all. There's a todo on the doc for ontologies, and I'll >> cheerfully write some doc if someone will tell me what dumb thing I've >> missed here. >> >> > > Hi Benson, > > There seem to be details that might matter but aren't included. Do you have > a complete, standalone, test case? > > >> @Test >> public void tdb() throws Exception { >> Dataset dataset = >> TDBFactory.createDataset(tempDataset.getAbsolutePath()); >> dataset.begin(ReadWrite.WRITE); >> Model model = dataset.getDefaultModel(); >> OntModel ontModel = >> ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, model); >> ResRdfBuilder resRdfBuilder = new ResRdfBuilder(ontModel, 1); >> ResTsvReader reader = new ResTsvReader(); >> URL dataUrl = Resources.getResource(ResTsvReaderTest.class, >> "data.txt"); >> ByteSource dataSource = Resources.asByteSource(dataUrl); >> reader.read(dataSource, resRdfBuilder); > > > ResRdfBuilder and ResTsvReader.read -- what do they do?
> > >> smallAssertions(ontModel); >> dataset.end(); >> >> dataset.begin(ReadWrite.READ); >> model = dataset.getDefaultModel(); >> smallAssertions(model); >> dataset.end(); >> } >> > > Andy >
Possible change to the schemagen plugin
As of 0.5, the current version, the schemagen plugin defaults to writing into target/generated-sources. This directory is usually the home of multiple sets of generated source; for example, a typical Maven 3.0.4 build creates an 'annotations' directory in there. That directory isn't automatically compiled. In the simplest case, I find that I want to generate the sources and include them in the build. But there are less simple cases, where, for example, people want to run the plugin explicitly and check in the results. The existing configuration scheme is fine for the less-simple cases; for the simple case, it at least needs the help of the build-helper-maven-plugin (a sketch of that wiring follows below).

On the dev list, I've proposed a change to make it possible to do the simple case simply: to just provide a single output directory, defaulted to target/generated-sources/jena, that would be automatically added as a source root. We could:

a: not do this
b: make adding the source root an optional behavior and leave the pathname specification as it is
c: add a new means of pathname configuration that triggers the simple case
d: while we are at it, change the default to add the extra level of directory

Do users of the plugin have preferences?
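For reference, the workaround the simple case currently requires: a sketch of wiring build-helper-maven-plugin to register the generated directory as a source root. The version number is illustrative, and the source path assumes the schemagen plugin's current default output directory:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <version>1.8</version>
  <executions>
    <execution>
      <id>add-schemagen-sources</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <sources>
          <!-- Directory the schemagen plugin writes into -->
          <source>${project.build.directory}/generated-sources</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>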
Re: Typed literals: great stuff or a waste of space?
Chris, I'm completely lost in your remark about 'one thing posted twice'. Did I post something twice? On Fri, Jun 27, 2014 at 10:19 AM, Chris_Dollin wrote: > On Friday, June 27, 2014 02:55:43 PM Chris_Dollin wrote: > > One thing, posted twice. Why twice? No idea. Sorries. > > Chris
TDB versus an OntModel
here's a little test method. The second call to smallAssertions sees no triples at all. There's a todo on the doc for ontologies, and I'll cheerfully write some doc if someone will tell me what dumb thing I've missed here.

@Test
public void tdb() throws Exception {
    Dataset dataset = TDBFactory.createDataset(tempDataset.getAbsolutePath());
    dataset.begin(ReadWrite.WRITE);
    Model model = dataset.getDefaultModel();
    OntModel ontModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, model);
    ResRdfBuilder resRdfBuilder = new ResRdfBuilder(ontModel, 1);
    ResTsvReader reader = new ResTsvReader();
    URL dataUrl = Resources.getResource(ResTsvReaderTest.class, "data.txt");
    ByteSource dataSource = Resources.asByteSource(dataUrl);
    reader.read(dataSource, resRdfBuilder);
    smallAssertions(ontModel);
    dataset.end();

    dataset.begin(ReadWrite.READ);
    model = dataset.getDefaultModel();
    smallAssertions(model);
    dataset.end();
}
Typed literals: great stuff or a waste of space?
I specified some ranges on some data properties in a schema, and now all my literals come out typed in the N-Triples output. Is there a way to have these ranges in the schema, for documentation, without cluttering all the RDF?
URIs, inverse properties and creating statements, typed literals
I am trying to remove more rust from my OWL neurons. 1. Some item has a natural ID. It might not be composed of valid characters for a URI. http://answers.semanticweb.com/questions/3903/n3-syntax-for-url-encoding-on-jena does not give direction that I can see how to follow. What's best practice? 2. If I have a functional/inverse-functional pair, should I createStatement in both directions, or should I use some sort of inference device to fill in? I'm creating a lot of triples here, so I'm inclined to just make them myself for fear of a very slow traverse of a large model.
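On question 1, a common approach is to percent-encode the natural ID before splicing it into a URI. A minimal sketch, with an illustrative namespace; the encodedUri helper that appears in other messages in this archive presumably does something along these lines:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UriEncoding {
    private static final String NS = "http://example.com/id/";

    /** Percent-encode an arbitrary natural ID so it is safe inside a URI. */
    static String encodedUri(String naturalId) {
        try {
            // URLEncoder does form-encoding; turn '+' back into %20 for URI use.
            return NS + URLEncoder.encode(naturalId, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(encodedUri("order 42/special"));
        // http://example.com/id/order%2042%2Fspecial
    }
}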
Re: Feeling a bit stupid with schemagen
Tom, Yup, resaving as RDF did the trick. Now I just have to fix 3 bugs in the schemagen maven plugin. --benson On Thu, Jun 26, 2014 at 9:32 PM, Tom Emerson wrote: > Hi Benson, > > The file you are trying to put through schemagen isn't an OWL RDF > file, it's an OWL2 XML file. AFAIK Jena doesn't support OWL2 yet. > > -tree > > P.S. Hi > > > On Thu, Jun 26, 2014 at 9:13 PM, Benson Margulies > wrote: >> I just made an OWL schema, something I have not done in a long time. >> >> https://gist.github.com/benson-basis/6ab309f661156a532e78 >> >> I fed it to schemagen with Maven. The resulting Java class is trivial. >>
>> <plugin>
>>   <groupId>org.apache.jena</groupId>
>>   <artifactId>jena-maven-tools</artifactId>
>>   <version>0.5</version>
>>   <configuration>
>>     <includes>
>>       <include>src/main/ontologies/res.res20140626.owl</include>
>>     </includes>
>>     <fileOptions>
>>       <source>
>>         <input>default</input>
>>         <package-name>com.basistech.restordf</package-name>
>>         <ontology>true</ontology>
>>       </source>
>>     </fileOptions>
>>   </configuration>
>>   <executions>
>>     <execution>
>>       <id>schemagen</id>
>>       <goals>
>>         <goal>translate</goal>
>>       </goals>
>>     </execution>
>>   </executions>
>> </plugin>
>>
>> Here's the sadly trivial output. >> >> ? > > -- > Tom Emerson > tremer...@gmail.com > http://www.dreamersrealm.net/tree
Feeling a bit stupid with schemagen
I just made an OWL schema, something I have not done in a long time.

https://gist.github.com/benson-basis/6ab309f661156a532e78

I fed it to schemagen with Maven. The resulting Java class is trivial.

<plugin>
  <groupId>org.apache.jena</groupId>
  <artifactId>jena-maven-tools</artifactId>
  <version>0.5</version>
  <configuration>
    <includes>
      <include>src/main/ontologies/res.res20140626.owl</include>
    </includes>
    <fileOptions>
      <source>
        <input>default</input>
        <package-name>com.basistech.restordf</package-name>
        <ontology>true</ontology>
      </source>
    </fileOptions>
  </configuration>
  <executions>
    <execution>
      <id>schemagen</id>
      <goals>
        <goal>translate</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Here's the sadly trivial output.

?
Re: Is this prima facie evidence of a TDB bug?
Unfortunately, I've only seen this mishap once, over several (multi-hour) runs that get into trouble with memory. On Wed, Jun 13, 2012 at 8:02 AM, Andy Seaborne wrote: > On 12/06/12 18:45, Benson Margulies wrote: >> >> I'm using 0.9.0-incubating. >> >> 2012-06-12 13:27:49,090 [DefaultMessageListenerContainer-1] ERROR >> com.basistech.jug.rdfdb.task.RdfdbTaskEndpoint - Failed to >> add generate RDF for 9eb8b4cc9bbee1b8f9f1d6895a68b367 >> com.hp.hpl.jena.tdb.TDBException: Different ids for >> urn:jug:doc#de76105335a9c253f56384e905fd36196: allocated: expected >> [000022191FA3], got [2215E21F] >> at >> com.hp.hpl.jena.tdb.transaction.NodeTableTrans.inconsistent(NodeTableTrans.java:212) >> at >> com.hp.hpl.jena.tdb.transaction.NodeTableTrans.append(NodeTableTrans.java:200) >> at >> com.hp.hpl.jena.tdb.transaction.NodeTableTrans.writeNodeJournal(NodeTableTrans.java:306) >> at >> com.hp.hpl.jena.tdb.transaction.NodeTableTrans.commitPrepare(NodeTableTrans.java:266) >> at >> com.hp.hpl.jena.tdb.transaction.Transaction.prepare(Transaction.java:131) >> at >> com.hp.hpl.jena.tdb.transaction.Transaction.commit(Transaction.java:112) >> at >> com.hp.hpl.jena.tdb.transaction.DatasetGraphTxn.commit(DatasetGraphTxn.java:40) >> at >> com.hp.hpl.jena.tdb.transaction.DatasetGraphTransaction._commit(DatasetGraphTransaction.java:106) >> at >> com.hp.hpl.jena.tdb.migrate.DatasetGraphTrackActive.commit(DatasetGraphTrackActive.java:60) >> at >> com.hp.hpl.jena.sparql.core.DatasetImpl.commit(DatasetImpl.java:137) >> at >> com.basistech.jug.store.jena.RdfIndexStoreStore.resultsToModel(RdfIndexStoreStore.java:64) >> at >> com.basistech.jug.rdfdb.task.RdfdbTaskEndpoint.assignTask(RdfdbTaskEndpoint.java:129) >> at >> com.basistech.jug.task.TaskEndpointErrorProxy.assignTask(TaskEndpointErrorProxy.java:34) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >> at java.lang.reflect.Method.invoke(Method.java:597) > > This looks like something that has been fixed - bad news is that the DB is > broken. Is this happening deterministically or not? > > I can't be absolutely positive - there are two symptoms that can occur in > the node table: "inconsistent" and failing to rebuild a node. All errors and bugs > (causes) result in one of the two symptoms. > > If you could try a recent snapshot, that would be great. > > Andy
Re: TDB oom?
That answers that question. We were running in 0.5G. On Wed, Jun 13, 2012 at 7:52 AM, Andy Seaborne wrote: > On 13/06/12 12:13, Benson Margulies wrote: >> >> On Wed, Jun 13, 2012 at 5:20 AM, Damian Steer >> wrote: >>> >>> On 12/06/12 18:47, Benson Margulies wrote: >>>> >>>> I hit the problem I sent mail about most recently whilst trying to >>>> reproduce the following. Is there any reason to think that this is >>>> worse than TDB being unlucky enough to be the allocation straw >>>> that breaks the camel's back? >>> >>> >>> What are you running this on? On 64bit machines TDB should be >>> reasonably light on the heap. >> >> >> "A whole lot of triples"? It's sitting there stuffing triples into the >> store, some, as you see, are reifications, which results in some >> queries, and in other cases we're explicitly querying to decide what >> to insert. Running the whole business with -Xmx1g succeeds. If there >> is some more specific characterization of the process that would help, >> please let me know. I obviously can't deliver the entire circus as a >> test case. > > > You are running 64bit presumably. 1G should be enough for TDB (a bit on the > small side?) but of course it's competing for space with the rest of the > app. > > On the surface, TDB is the unlucky straw, although realistically it is using > a decent proportion of that 1G so it's putting itself in the firing line. > > The main heap memory consumer on a 64bit machine (mapped mode) is the node > table. > > I normally run with 1.5G. > > Andy >
Re: TDB oom?
On Wed, Jun 13, 2012 at 5:20 AM, Damian Steer wrote: > On 12/06/12 18:47, Benson Margulies wrote: >> I hit the problem I sent mail about most recently whilst trying to >> reproduce the following. Is there any reason to think that this is >> worse than TDB being unlucky enough to be the allocation straw >> that breaks the camel's back? > > What are you running this on? On 64bit machines TDB should be > reasonably light on the heap. "A whole lot of triples"? It's sitting there stuffing triples into the store, some, as you see, are reifications, which results in some queries, and in other cases we're explicitly querying to decide what to insert. Running the whole business with -Xmx1g succeeds. If there is some more specific characterization of the process that would help, please let me know. I obviously can't deliver the entire circus as a test case. > > Damian
Re: TDB oom?
With a bigger heap, I ran longer, and then got the following. "GC overhead limit exceeded" is described as a side effect of too many little objects created too fast.

2012-06-12 19:03:41,093 [DefaultMessageListenerContainer-1] WARN TDB - Transaction not commited or aborted: Transaction: 23359 : Mode=WRITE : State=ACTIVE : /data/benson/oap/dist/jug-dist/ris-install/bin/../data/
2012-06-12 19:03:41,099 [DefaultMessageListenerContainer-1] ERROR com.basistech.jug.rdfdb.task.RdfdbTaskEndpoint - Failed to add generate RDF for b671a2b222c20fced4d1ed68096f2a97
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.nio.HeapByteBuffer.asIntBuffer(HeapByteBuffer.java:366)
at com.hp.hpl.jena.tdb.base.buffer.PtrBuffer.<init>(PtrBuffer.java:47)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.formatBPTreeNode(BPTreeNodeMgr.java:259)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.overlay(BPTreeNodeMgr.java:204)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.access$000(BPTreeNodeMgr.java:36)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr$Block2BPTreeNode.fromBlock(BPTreeNodeMgr.java:144)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr$Block2BPTreeNode.fromBlock(BPTreeNodeMgr.java:119)
at com.hp.hpl.jena.tdb.base.page.PageBlockMgr.getRead(PageBlockMgr.java:69)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.getRead(BPTreeNodeMgr.java:105)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.getMgrRead(BPTreeNode.java:166)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.get(BPTreeNode.java:154)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.findHere(BPTreeNode.java:422)
at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.recordsPageId(BPTreeNode.java:270)
at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.iterator(BPlusTree.java:373)
at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.iterator(BPlusTree.java:361)
at com.hp.hpl.jena.tdb.index.TupleIndexRecord.findWorker(TupleIndexRecord.java:164)
at com.hp.hpl.jena.tdb.index.TupleIndexRecord.findOrScan(TupleIndexRecord.java:84)
at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performFind(TupleIndexRecord.java:78)
at com.hp.hpl.jena.tdb.index.TupleIndexBase.find(TupleIndexBase.java:88)
at com.hp.hpl.jena.tdb.index.TupleTable.find(TupleTable.java:173)
at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.find(NodeTupleTableConcrete.java:169)
at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:101)
at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:287)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)

On Tue, Jun 12, 2012 at 1:47 PM, Benson Margulies wrote: > I hit the problem I sent mail about most recently whilst trying to > reproduce the following.
Is there any reason to think that this is > worse than TDB being unlucky enough to be the allocation straw that > breaks the camel's back? > > java.lang.OutOfMemoryError: Java heap space > at java.util.LinkedHashMap.createEntry(LinkedHashMap.java:424) > at java.util.LinkedHashMap.addEntry(LinkedHashMap.java:406) > at java.util.HashMap.put(HashMap.java:385) > at > sun.util.resources.OpenListResourceBundle.loadLookup(OpenListResourceBundle.java:118) > at > sun.util.resources.OpenListResourceBundle.loadLookupTablesIfNecessary(OpenListResourceBundle.java:97) > at > sun.util.resources.OpenListResourceBundle.handleGetObject(OpenListResourceBundle.java:58) > at > sun.util.resources.TimeZoneNamesBundle.handleGetObject(TimeZoneNamesBundle.java:59) > at java.util.ResourceBundle.getObject(ResourceBundle.java:368) > at java.util.ResourceBundle.getObject(ResourceBundle.java:371) > at java.util.ResourceBundle.getStringArray(ResourceBundle.java:351) > at > sun.util.TimeZoneNameUtility.retrieveDisplayNames(TimeZoneNameU
Is this prima facie evidence of a TDB bug?
I'm using 0.9.0-incubating.

2012-06-12 13:27:49,090 [DefaultMessageListenerContainer-1] ERROR com.basistech.jug.rdfdb.task.RdfdbTaskEndpoint - Failed to add generate RDF for 9eb8b4cc9bbee1b8f9f1d6895a68b367
com.hp.hpl.jena.tdb.TDBException: Different ids for urn:jug:doc#de76105335a9c253f56384e905fd36196: allocated: expected [000022191FA3], got [2215E21F]
at com.hp.hpl.jena.tdb.transaction.NodeTableTrans.inconsistent(NodeTableTrans.java:212)
at com.hp.hpl.jena.tdb.transaction.NodeTableTrans.append(NodeTableTrans.java:200)
at com.hp.hpl.jena.tdb.transaction.NodeTableTrans.writeNodeJournal(NodeTableTrans.java:306)
at com.hp.hpl.jena.tdb.transaction.NodeTableTrans.commitPrepare(NodeTableTrans.java:266)
at com.hp.hpl.jena.tdb.transaction.Transaction.prepare(Transaction.java:131)
at com.hp.hpl.jena.tdb.transaction.Transaction.commit(Transaction.java:112)
at com.hp.hpl.jena.tdb.transaction.DatasetGraphTxn.commit(DatasetGraphTxn.java:40)
at com.hp.hpl.jena.tdb.transaction.DatasetGraphTransaction._commit(DatasetGraphTransaction.java:106)
at com.hp.hpl.jena.tdb.migrate.DatasetGraphTrackActive.commit(DatasetGraphTrackActive.java:60)
at com.hp.hpl.jena.sparql.core.DatasetImpl.commit(DatasetImpl.java:137)
at com.basistech.jug.store.jena.RdfIndexStoreStore.resultsToModel(RdfIndexStoreStore.java:64)
at com.basistech.jug.rdfdb.task.RdfdbTaskEndpoint.assignTask(RdfdbTaskEndpoint.java:129)
at com.basistech.jug.task.TaskEndpointErrorProxy.assignTask(TaskEndpointErrorProxy.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)