[OWLIM-discussion] trouble with large ontology in Owlim 2.9

2007-08-21 Thread John del Corral
I am having trouble loading a large ontology (about 400,000 statements)
into Sesame 1.26 with the Owlim 2.9.0 SAIL.  I use the Add(www)
action after logging into the repository through the Sesame client
interface, and enter the URL of our ontology in N-Triples format,
http://iri.columbia.edu/~jdcorral/ingrid/upload23161.nt
(~100MB).

I have checked the 'verify the data' button, and it checks out ok.

From the Sesame client screen, I can watch the progress of the load.
I see the total number of statements (439,599), and then the progress
of the load.  Typically about 100,000 statements are processed per 60 seconds.

When the count exceeds 200,000, I see this in a separate window
(one that is monitoring the catalina.out file for our Tomcat server):

Exception in thread "Thread-41" java.lang.ArrayIndexOutOfBoundsException: 1000
   at com.ontotext.trree.transitivity.PredicateMap.addPredicate(Unknown Source)
   at com.ontotext.trree.transitivity.Repository.put(Unknown Source)
   at com.ontotext.trree.transitivity.s.addStatement(Unknown Source)
   at org.openrdf.sesame.sailimpl.OWLIMSchemaRepository$LocalThreadPool$LocalWorkerThread.doJob(OWLIMSchemaRepository.java:490)
   at com.ontotext.trree.transitivity.ThreadPool$WorkerThread.run(Unknown Source)
Exception in thread "Thread-40" java.lang.ArrayIndexOutOfBoundsException: 1000
   at com.ontotext.trree.transitivity.PredicateMap.addPredicate(Unknown Source)
   at com.ontotext.trree.transitivity.Repository.put(Unknown Source)
   at com.ontotext.trree.transitivity.s.addStatement(Unknown Source)
   at org.openrdf.sesame.sailimpl.OWLIMSchemaRepository$LocalThreadPool$LocalWorkerThread.doJob(OWLIMSchemaRepository.java:490)
   at com.ontotext.trree.transitivity.ThreadPool$WorkerThread.run(Unknown Source)
Exception in thread "Thread-39" java.lang.ArrayIndexOutOfBoundsException: 1000
   at com.ontotext.trree.transitivity.PredicateMap.addPredicate(Unknown Source)
   at com.ontotext.trree.transitivity.Repository.put(Unknown Source)
   at com.ontotext.trree.transitivity.s.addStatement(Unknown Source)
   at org.openrdf.sesame.sailimpl.OWLIMSchemaRepository$LocalThreadPool$LocalWorkerThread.doJob(OWLIMSchemaRepository.java:490)
   at com.ontotext.trree.transitivity.ThreadPool$WorkerThread.run(Unknown Source)
Exception in thread "Thread-38" java.lang.ArrayIndexOutOfBoundsException: 1000
   at com.ontotext.trree.transitivity.PredicateMap.addPredicate(Unknown Source)
   at com.ontotext.trree.transitivity.Repository.put(Unknown Source)
   at com.ontotext.trree.transitivity.s.addStatement(Unknown Source)
   at org.openrdf.sesame.sailimpl.OWLIMSchemaRepository$LocalThreadPool$LocalWorkerThread.doJob(OWLIMSchemaRepository.java:490)
   at com.ontotext.trree.transitivity.ThreadPool$WorkerThread.run(Unknown Source)

I have loaded smaller (75,000 statements) ontologies into this repository,
but I get a failure with the large ontology.
Could someone else please try to load this large ontology and maybe give me
some feedback?

Thank you, John

-- 
John del Corral, IRI, Earth Inst. at Columbia Univ., Monell 107
Lamont-Doherty Earth Obs., 61 Route 9W, Palisades, NY 10964
+1 845-680-4437(v) +1 845-680-4864(F) [EMAIL PROTECTED]


___
OWLIM-discussion mailing list
OWLIM-discussion@ontotext.com
http://ontotext.com/mailman/listinfo/owlim-discussion_ontotext.com


Re: [OWLIM-discussion] trouble with large ontology in Owlim 2.9

2007-08-21 Thread Damyan Ognyanoff
Hi John,

Judging from the exception trace, your data uses more than 1000
URIs as predicates.

In the current version we hard-coded the maximum number of predicates to
1000, wrongly assuming that it would be sufficient for virtually every RDF
application in the world. We were wrong.
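
To illustrate the failure mode: on JVMs of that era the number in
"ArrayIndexOutOfBoundsException: 1000" is the offending index, which fits a
table with slots 0 to 999 receiving its 1001st entry. The Python sketch below
is purely hypothetical and is not OWLIM's code (the real PredicateMap source
is not shown in the trace); the class and function names are invented for
illustration, and Python's IndexError stands in for the Java exception.

```python
class FixedPredicateMap:
    """Hypothetical predicate table with a hard-coded capacity."""

    def __init__(self, capacity=1000):
        # Preallocated slots, analogous to a fixed-length Java array.
        self.slots = [None] * capacity
        self.index = {}

    def add_predicate(self, uri):
        # A predicate that was seen before keeps its existing slot.
        if uri in self.index:
            return self.index[uri]
        slot = len(self.index)
        # Raises IndexError once slot == capacity (no bounds check),
        # the Python analogue of ArrayIndexOutOfBoundsException.
        self.slots[slot] = uri
        self.index[uri] = slot
        return slot


def first_failing_slot(capacity):
    """Return the slot number at which a new predicate no longer fits."""
    table = FixedPredicateMap(capacity)
    n = 0
    while True:
        try:
            table.add_predicate(f"http://example.org/p{n}")
        except IndexError:
            return n
        n += 1


print(first_failing_slot(1000))  # prints 1000
```

With a capacity of 1000, the first 1000 distinct predicates fit and the next
new one fails at index 1000, matching the number in the trace.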

Can you confirm that you have more than 1000 predicates in your data?
Otherwise we should look at the inference rules, which could also lead to
such "undesired" inference.
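
One way to check this outside the repository is to count the distinct
predicate URIs directly in the N-Triples file. The following is a minimal
sketch, assuming well-formed N-Triples with one statement per line; the
function name and the sample data are made up for illustration. It only sees
asserted statements, so it says nothing about predicates that inference
might introduce.

```python
from typing import Iterable, Set


def distinct_predicates(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct predicate URIs from N-Triples lines."""
    predicates = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Subject and predicate contain no whitespace in N-Triples, so
        # the predicate is the second whitespace-separated token.
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1].startswith("<") and parts[1].endswith(">"):
            predicates.add(parts[1][1:-1])
    return predicates


# Made-up sample data, not taken from the ontology in question.
sample = [
    '<http://example.org/s1> <http://example.org/p1> "a literal with spaces" .',
    "<http://example.org/s2> <http://example.org/p1> <http://example.org/o> .",
    "_:b0 <http://example.org/p2> <http://example.org/o> .",
    "# a comment line",
    "",
]
print(len(distinct_predicates(sample)))  # prints 2
```

For a real file, pass the open file object instead of the sample list, for
example `distinct_predicates(open("upload23161.nt", encoding="utf-8"))`
after downloading it locally; a result above 1000 would confirm the
diagnosis.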

regards,
Damyan Ognyanoff,
Ontotext Lab.




Re: [OWLIM-discussion] trouble with large ontology in Owlim 2.9

2007-08-21 Thread John del Corral
Yes, Damyan, we do have more than 1000 URIs as predicates, and we expect
to be adding more.  There are 2 reasons for this.
1) We combine multiple ontologies in our repository.  Each has its own
set of predicates in its namespace.
2) We are cataloging global earth science datasets that have unique and
non-standard properties.
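
To see how the predicates spread across the combined ontologies, they can be
grouped by namespace. This is only a rough sketch: it assumes the namespace
is everything up to the last '#' or '/' in the URI, which is a common
convention rather than a rule, and the helper names and sample URIs are
invented for illustration.

```python
from collections import Counter


def namespace_of(iri):
    """Return the part of the URI up to and including the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[:cut + 1] if cut >= 0 else iri


def predicates_per_namespace(predicates):
    """Count distinct predicate URIs per namespace."""
    return Counter(namespace_of(p) for p in set(predicates))


# Made-up predicate URIs, not taken from the actual repository.
sample_predicates = [
    "http://example.org/a#p1",
    "http://example.org/a#p2",
    "http://example.org/a#p2",  # duplicates are counted once
    "http://example.org/b/q1",
]
for ns, count in sorted(predicates_per_namespace(sample_predicates).items()):
    print(ns, count)
```

Sorting the result by count would show which ontologies contribute most of
the predicates.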

In our computing environment, I think that we could support a larger
'maximum number of predicates'.  Our machine has 16GB RAM and 2 dual-core
Opteron processors.

Our smaller repository of bibliographic references loads 10X faster with
Owlim 2.9.0 than with 2.8.4, so keep up the good work.

John
