Hi Alok,

On 24 April 2015 at 13:16:53, Alok K. Shukla (m...@alokkumarshukla.com) wrote:

Hi All 

I have the following scenario; please suggest the best approach to go about it.

Structured and unstructured content distributed across various content silos 
mainly comprised of ECM repositories.  

Using Stanbol as content upliftment engine; enrich the content in all the 

Use Solr to index and search the enriched content. 

I have the following concerns regarding the above scenario:
1. Approach for crawling the content ( ManifoldCF? ) 
For your scenario it is probably the best choice, yes. Although there are
some similar frameworks, ManifoldCF has great connector coverage and it is
not very complex to build your own connector. There is also the possibility
of enhancing the content with a “Stanbol transformation connector”. This is
something I started working on some time ago and I am planning to finish it
as soon as possible (https://issues.apache.org/jira/browse/CONNECTORS-1181).
The connector will enhance the content with the selected Stanbol chain and
will store the entity data as plain metadata indexable in the selected output
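
To make the idea of "entity data as plain metadata" a bit more concrete, here
is a minimal Python sketch. It is not the connector's code: the input shape
(dicts with "label", "type", "uri" and "confidence" keys), the function name
flatten_entities, the "entity_*" field names and the sample values are all
assumptions made up for illustration. It only shows how entity annotations
could be flattened into multi-valued fields that a search index can take as
ordinary document metadata.

```python
from collections import defaultdict


def flatten_entities(annotations, min_confidence=0.5):
    """Flatten entity annotations into multi-valued metadata fields.

    `annotations` is a hypothetical, simplified list of dicts; a real
    Stanbol response (RDF / JSON-LD) would have to be mapped to it first.
    """
    fields = defaultdict(list)
    for ann in annotations:
        # Skip low-confidence suggestions so they do not pollute the index.
        if ann.get("confidence", 0.0) < min_confidence:
            continue
        # One field per entity type, e.g. "entity_place", holding the labels.
        field = "entity_" + ann["type"].lower()
        if ann["label"] not in fields[field]:
            fields[field].append(ann["label"])
        # Keep the entity URIs as well, so results can link back to the entity.
        if ann["uri"] not in fields["entity_uri"]:
            fields["entity_uri"].append(ann["uri"])
    return dict(fields)


# Made-up sample data, only to show the shape of the result.
example = [
    {"label": "Paris", "type": "Place",
     "uri": "http://dbpedia.org/resource/Paris", "confidence": 0.93},
    {"label": "Paris", "type": "Place",
     "uri": "http://dbpedia.org/resource/Paris", "confidence": 0.81},
    {"label": "Bob Marley", "type": "Person",
     "uri": "http://dbpedia.org/resource/Bob_Marley", "confidence": 0.2},
]

print(flatten_entities(example))
```

With fields like these attached to each document, the output side only sees
plain metadata and needs no knowledge of Stanbol or RDF.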

2. Storage of enhancements ( with content? , triple store ? ) 
The “official” approach for this is the Stanbol ContentHub, available in
release 0.12.x but deprecated in later releases. There have been some
initiatives to improve and/or redesign the ContentHub, but nothing is
settled at the moment. You can check whether the ContentHub fulfills your
requirements; otherwise you would have to build your own architecture.

3. Connect enhanced content to Solr ( ManifoldCF? ) 
Check Answer 1.

Has anyone used ManifoldCF to achieve semantic search as described with content 
repositories ? Any markers ? 
I have been quite interested in this field for a long time, but some months
ago I left the company where we were working intensively on this, so now it
is hard for me to find the time to approach it properly. Anyway, there have
been great discussions in the community and I could help with ideas and
maybe some coding, so if you are interested in taking this further and
contributing it to (both) projects, I can definitely help.



