The customized ruleset is working as well. I'll keep it running to check
that it stays stable.

I ran into another, unrelated issue, so I'll open a separate thread for
it.

Thanks for your help!

David

On Sat, Mar 17, 2012 at 10:01 AM, David Riccitelli <[email protected]> wrote:

> Hi Alessandro,
>
> It's much better now. Disk usage in the
> sling/felix/bundle85/data/tdb-data/mgraph folder is steady at 1.4M.
>
> Open files were at ~700 at start-up and increased to ~1,600 after two
> tests. Now, after each test, they jump to ~2,100 and then drop back to
> ~1,600.
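
The open-file figures above can be tracked with a small script. This is a minimal sketch, assuming a Linux host where `/proc/<pid>/fd` lists one entry per open descriptor; the function names, the `proc_root` parameter and the tolerance value are illustrative and not part of Stanbol.

```python
import os


def count_open_files(pid, proc_root="/proc"):
    """Return the number of open file descriptors of process `pid` (Linux only)."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    # Each entry in /proc/<pid>/fd corresponds to one open descriptor.
    return len(os.listdir(fd_dir))


def baseline_is_stable(resting_counts, tolerance=50):
    """True if counts sampled at rest (between tests) stay within `tolerance`.

    `resting_counts` must contain at least one sample.
    """
    return max(resting_counts) - min(resting_counts) <= tolerance
```

Sampling `count_open_files` after each test run and passing the resting values to `baseline_is_stable` gives a quick signal of whether descriptors are being released or are accumulating.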
>
> And it's also much faster than before.
>
> I'll continue testing now with our customized ruleset.
>
> BR
> David
>
>
> On Fri, Mar 16, 2012 at 7:38 PM, Alessandro Adamou <[email protected]> wrote:
>
>> Hi David,
>>
>> after quite some work today I rewrote part of the Refactor Engine to
>> avoid creating useless graphs.
>>
>> Many were blank ontologies created along with the SEO scope. They are no
>> longer created.
>>
>> Many of the other graphs that you see are due to the fact that the engine
>> merges together the entity signatures into an OntoNet session. Every such
>> signature ends up resulting in its own ontology and therefore a graph in
>> Clerezza/TDB.
>>
>> I have not modified this second behaviour, but I have seen to it that the
>> refactor engine now destroys its own session *and its contents* when
>> computeEnhancements() completes. This means a lot of space occupied during
>> analysis but freed up right thereafter.
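
The cleanup behaviour described above (a session and its contents are destroyed as soon as the analysis completes) can be illustrated with a context manager. This is only a sketch of the pattern, not the refactor engine's code: `GraphStore` and `temporary_session` are hypothetical stand-ins, and the store is a plain in-memory dictionary rather than Clerezza/TDB.

```python
from contextlib import contextmanager


class GraphStore:
    """Hypothetical stand-in for a triple store holding named graphs."""

    def __init__(self):
        self.graphs = {}

    def create_graph(self, name):
        self.graphs[name] = set()
        return self.graphs[name]

    def delete_graph(self, name):
        self.graphs.pop(name, None)


@contextmanager
def temporary_session(store, session_id):
    """Track every graph created in the session and delete them all on exit."""
    created = []

    def add_graph(name):
        full_name = f"{session_id}/{name}"
        created.append(full_name)
        return store.create_graph(full_name)

    try:
        yield add_graph
    finally:
        # Runs even if the analysis raises, so no graph is left behind.
        for name in created:
            store.delete_graph(name)
```

The space is still occupied while the `with` block runs, which matches the trade-off mentioned above: high usage during analysis, freed right afterwards.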
>>
>> It's more brutal than I wanted it to be, but a better implementation will
>> come up once I add a couple of new features to OntoNet that should make the
>> process more reasonable.
>>
>> On the upside, the engine code is now smaller by some 250 lines.
>>
>> It would be super if you could update and try it out.
>>
>> Thanks
>>
>> Alessandro
>>
>> P.S. now I'm glad I added the "ontonet" prefix to those graph names...
>>
>>
>>
>> On 3/16/12 12:40 PM, David Riccitelli wrote:
>>
>>>> From what I've seen so far, yes. But it could depend on your engine
>>>> configuration using a richer set of rules.
>>>>
>>>
>>> The same thing happens when we use the default rule set (seo_rules.sem)
>>> from SVN.
>>>
>>> We did not customize any other part of the installation, with the
>>> exception of loading a local DBpedia index in sling/datafiles.
>>>
>>> David
>>>
>>> On Fri, Mar 16, 2012 at 12:27 PM, Alessandro Adamou <[email protected]> wrote:
>>>
>>>> On 3/16/12 11:16 AM, David Riccitelli wrote:
>>>>
>>>>> Is this issue happening to us only?
>>>>
>>>> From what I've seen so far, yes. But it could depend on your engine
>>>> configuration using a richer set of rules.
>>>>
>>>> Alessandro
>>>>
>>>>> On Fri, Mar 16, 2012 at 12:12 PM, Alessandro Adamou <[email protected]> wrote:
>>>>>
>>>>>> One thing that would be great to do is to detect the ontology ID
>>>>>> *before* creating the TripleCollection in Clerezza, so any mappings
>>>>>> could be done before storing.
>>>>>>
>>>>>> But I don't know how this can be done without a lot of code.
>>>>>>
>>>>>> Perhaps this could work: create an IndexedGraph, explore its content,
>>>>>> then create the Graph in the TcManager with the same content and the
>>>>>> right graph name, and finally clear the IndexedGraph.
>>>>>>
>>>>>> But it still means having twice the resource usage (disk+memory) for a
>>>>>> period.
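
The buffer-then-store idea above can be sketched as follows. This is an illustration only, under several assumptions: triples are plain `(subject, predicate, object)` string tuples, the persistent store is a dictionary keyed by graph name, the ontology ID is taken to be the subject typed as `owl:Ontology`, and `store_under_ontology_id` and `fallback_name` are hypothetical names. It does not use the Clerezza or OntoNet APIs.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"


def find_ontology_id(triples):
    """Return the subject typed as owl:Ontology, or None if there is none."""
    for s, p, o in triples:
        if p == RDF_TYPE and o == OWL_ONTOLOGY:
            return s
    return None


def store_under_ontology_id(triples, store, fallback_name):
    """Buffer in memory, detect the ID, then store once under the final name."""
    buffer = set(triples)  # in-memory stand-in for the temporary graph
    name = find_ontology_id(buffer) or fallback_name
    store[name] = set(buffer)  # persistent graph, created with its final name
    buffer.clear()  # release the temporary copy
    return name
```

Between the copy and the `clear()` call both copies exist, which is the temporary doubling of resource usage noted above.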
>>>>>>
>>>>>> Alessandro
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/16/12 10:56 AM, Alessandro Adamou wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> well, I guess that depends pretty much on how heavy the usage of OntoNet
>>>>>>> is in your Stanbol installation.
>>>>>>>
>>>>>>> Those are graphs created when OntoNet has to load an ontology from its
>>>>>>> content rather than from a Web URI, so it cannot know the ontology ID
>>>>>>> in advance.
>>>>>>>
>>>>>>> This happens e.g. by POSTing the ontology as the payload or by passing a
>>>>>>> GraphContentInputSource to the Java API.
>>>>>>>
>>>>>>> Now I do not know why these graphs are created (perhaps the refactor
>>>>>>> engine could be loading some), but I do know that a Clerezza graph in
>>>>>>> Jena TDB occupies a LOT of disk space.
>>>>>>>
>>>>>>> Suffice it to say that my bundle had stored nine graphs of <100 triples
>>>>>>> each. They took up about 1.8 GB on disk, but when I tried to make a
>>>>>>> zipfile out of them, it came out at about 2 MB!
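
The gap between on-disk size and zipped size suggests the files are highly compressible, possibly mostly empty. One way to look into this is to compare the apparent size of each graph directory with the space actually allocated for it. The following is a minimal sketch, assuming a POSIX file system where `st_blocks` is reported in 512-byte units; the function names are illustrative.

```python
import os


def directory_sizes(path):
    """Return (apparent_bytes, allocated_bytes) summed over all files under `path`."""
    apparent = allocated = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            apparent += st.st_size
            # st_blocks counts 512-byte units on POSIX; fall back to st_size elsewhere.
            blocks = getattr(st, "st_blocks", None)
            allocated += blocks * 512 if blocks is not None else st.st_size
    return apparent, allocated


def report(mgraph_dir):
    """Map each graph directory name to its (apparent, allocated) sizes."""
    return {
        entry.name: directory_sizes(entry.path)
        for entry in os.scandir(mgraph_dir)
        if entry.is_dir()
    }
```

Calling `report` on the `tdb-data/mgraph` folder returns one entry per stored graph; a large difference between the two numbers for a graph would indicate that most of its reported size is not actually allocated on disk.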
>>>>>>>
>>>>>>> Alessandro
>>>>>>>
>>>>>>>
>>>>>>> On 3/16/12 10:30 AM, David Riccitelli wrote:
>>>>>>>
>>>>>>>  Dears,
>>>>>>>
>>>>>>>> As I ran into disk issues, I found that this folder:
>>>>>>>>  sling/felix/bundleXXX/data/tdb-data/mgraph
>>>>>>>>
>>>>>>>> where XXX is the bundle of:
>>>>>>>>  Clerezza - SCB Jena TDB Storage Provider
>>>>>>>>  org.apache.clerezza.rdf.jena.tdb.storage
>>>>>>>>
>>>>>>>> took up almost 70 GB of disk space (at which point the disk was
>>>>>>>> full).
>>>>>>>>
>>>>>>>> These are some of the files I found inside:
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology889
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology1041
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology395
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology363
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology661
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology786
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology608
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology213
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology188
>>>>>>>> 193M ./ontonet%3A%3Ainputstream%3Aontology602
>>>>>>>>
>>>>>>>> Any clues?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David Riccitelli
>>>>>>>>
>>>>>>>> ********************************************************************************
>>>>>>>> InsideOut10 s.r.l.
>>>>>>>> P.IVA: IT-11381771002
>>>>>>>> Fax: +39 0110708239
>>>>>>>> ---
>>>>>>>> LinkedIn: http://it.linkedin.com/in/riccitelli
>>>>>>>> Twitter: ziodave
>>>>>>>> ---
>>>>>>>> Layar Partner
>>>>>>>> Network<http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
>>>>>>>> ********************************************************************************
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   --
>>>>>>>
>>>>>> M.Sc. Alessandro Adamou
>>>>>>
>>>>>> Alma Mater Studiorum - Università di Bologna
>>>>>> Department of Computer Science
>>>>>> Mura Anteo Zamboni 7, 40127 Bologna - Italy
>>>>>>
>>>>>> Semantic Technology Laboratory (STLab)
>>>>>> Institute for Cognitive Science and Technology (ISTC)
>>>>>> National Research Council (CNR)
>>>>>> Via Nomentana 56, 00161 Rome - Italy
>>>>>>
>>>>>>
>>>>>> "I will give you everything, so long as you do not demand anything."
>>>>>> (Ettore Petrolini, 1930)
>>>>>>
>>>>>> Not sent from my iSnobTechDevice
>>>>>>
>>>>>>
>>>>>>
>>>
>>
>
>
> --
> David Riccitelli
>


-- 
David Riccitelli

********************************************************************************
InsideOut10 s.r.l.
P.IVA: IT-11381771002
Fax: +39 0110708239
---
LinkedIn: http://it.linkedin.com/in/riccitelli
Twitter: ziodave
---
Layar Partner 
Network<http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
********************************************************************************
