Hi Alok,

On 24 April 2015 at 13:16:53, Alok K. Shukla (m...@alokkumarshukla.com) wrote:
> Hi All
>
> So I have following scenario; suggest me the best approaches to go about it
>
> Structured and unstructured content distributed across various content silos mainly comprised of ECM repositories. Using Stanbol as content upliftment engine; enrich the content in all the repositories. Use Solr to index and search the enriched content.
>
> I have following concerns regarding above scenario:
>
> 1. Approach for crawling the content (ManifoldCF?)

For your scenario, yes, it is probably the best choice. There are some similar frameworks, but ManifoldCF has great connector coverage and building your own connector is not very complex. There is also the possibility of enhancing the content with a "Stanbol transformation connector". This is something I started to work on some time ago and I'm planning to finish it as soon as possible (https://issues.apache.org/jira/browse/CONNECTORS-1181). The connector will enhance the content with the selected Stanbol chain and will store the entity data as plain metadata, indexable by the selected output connector.

> 2. Storage of enhancements (with content?, triple store?)

The "official" approach for this is the Stanbol ContentHub, available in release 0.12.x but deprecated for later releases. There have been some initiatives to improve and/or redesign the ContentHub, but nothing is settled at the moment. You can check whether the ContentHub fulfills your requirements; otherwise you would have to build your own architecture.

> 3. Connect enhanced content to Solr (ManifoldCF?)

See answer 1.

> Has anyone used ManifoldCF to achieve semantic search as described with content repositories? Any markers?

I have been quite interested in this field for a long time, but some months ago I left the company where we were working intensively on this, so now it is hard for me to find the time to approach it properly.
Anyway, there have been great discussions in the community and I could help with ideas and maybe some coding, so if you are interested in taking this further and contributing it to both projects, I can definitely help.

Cheers,
Rafa

> Alok
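
As a rough illustration of the enhancement step described in answer 1, here is a minimal Python sketch. It is not the CONNECTORS-1181 code, only an outline of the idea under several assumptions: a Stanbol instance is reachable at a hypothetical `http://localhost:8080`, the chain is called `default`, and the enhancer response has already been brought into expanded JSON-LD form (a list of nodes keyed by full property IRIs), which may require a JSON-LD library in practice. The helper names `enhance` and `to_metadata`, the output field names, and the confidence threshold are all made up for the example.

```python
import json
import urllib.request

# Namespace of the Stanbol enhancement structure (fise) properties.
FISE = "http://fise.iks-project.eu/ontology/"


def enhance(text, base_url="http://localhost:8080", chain="default"):
    """POST plain text to a Stanbol enhancement chain and return the parsed JSON.

    base_url and chain are assumptions; adjust them to your installation.
    """
    req = urllib.request.Request(
        f"{base_url}/enhancer/chain/{chain}",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def _values(node, prop):
    """Return the plain values of one fise property on an expanded JSON-LD node."""
    out = []
    for item in node.get(FISE + prop, []):
        if isinstance(item, dict):
            out.append(item.get("@value", item.get("@id")))
        else:
            out.append(item)
    return out


def to_metadata(nodes, min_confidence=0.5):
    """Flatten entity annotations into multi-valued plain metadata fields.

    Assumes `nodes` is a list of expanded JSON-LD nodes. Nodes without an
    entity-reference (for example text annotations) are skipped, as are
    entity annotations below the hypothetical confidence threshold.
    """
    mapping = (
        ("entity_uri", "entity-reference"),
        ("entity_label", "entity-label"),
        ("entity_type", "entity-type"),
    )
    fields = {key: [] for key, _ in mapping}
    for node in nodes:
        if not _values(node, "entity-reference"):
            continue
        confidence = _values(node, "confidence")
        if confidence and float(confidence[0]) < min_confidence:
            continue
        for key, prop in mapping:
            for value in _values(node, prop):
                if value not in fields[key]:
                    fields[key].append(value)
    return fields
```

The resulting dictionary is the kind of "plain metadata" a crawler could attach to a document before handing it to a Solr output connector, where each key would map to a multi-valued field.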