Hi Mark,
On 26/02/14 15:57, Mark Loper wrote:
Hi, this is going to take some clarification so you understand what I’m trying
to accomplish. Bear with me.
I’m looking at Stanbol as a solution to some requirements that I have received
from my customer. They want dynamic and intelligent content with a social feel
across all of their content DBs. I’m looking at around 3PB of data stored in
various databases ranging from geospatial imagery, documents, images, and
video. I want to show that the semantic web can help them see, view and
find their data faster and more intelligently. I want to be able to feed
documents into Stanbol, get a tag cloud based on information in the object and
find out what objects are most related based upon the relationships that are
found over time.
Glad to see you are considering Stanbol for such an interesting use case :-)
My background is as a developer mainly dealing with geospatial image
processing, large object delivery over low coms, security, and OGC systems like
ESRI and OpenGeo. CMS and the semantic web are proving to be a large area of study
that I’m trying to get up to speed with. I did however get Stanbol up and
running quickly and very easily and have run a few documents through the
default config and have been happy with the results.
Here is a basic list of what I want to have happen, and I’m having trouble
finding a use case that describes anything close.
Let me try to shed some light on these requirements, and let's wait
for more suggestions from the community.
1. I can’t store the data, it needs to reside at the origin servers, so I’m
“enhancing” links to objects/metadata.
As long as you can gather that content and send it to Stanbol's Enhancer
API using one of the supported media types, that is not a problem.
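As a minimal sketch of what "sending content to the Enhancer" could look like: the snippet below builds a plain-text POST against the /enhancer endpoint and asks for the enhancements as Turtle. The base URL (a launcher listening on localhost:8080) and the helper names build_enhancer_request and enhance are assumptions made for illustration, so adjust them to your installation.

```python
import urllib.request


def build_enhancer_request(text, base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for the Stanbol Enhancer.

    base_url is an assumption: point it at wherever your launcher runs.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/enhancer",
        data=text.encode("utf-8"),
        headers={
            # The content you gathered from the origin server, sent as text.
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results serialized as Turtle.
            "Accept": "text/turtle",
        },
        method="POST",
    )


def enhance(text, base_url="http://localhost:8080"):
    """Send the text to the Enhancer and return the RDF response as a string."""
    request = build_enhancer_request(text, base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Only the extracted text travels to Stanbol here; the original object stays on its origin server, and you would keep its link or identifier next to the returned enhancements in your own store.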
2. I don’t have internet access, so this needs to live on a closed network. I
could load my own copies of things like DB-pedia, maybe.
That is exactly the way Stanbol works out of the box, with local sites,
although Stanbol can also use remote sites. For instance, as you might
know, a local DBpedia site with about 43K entities is created by default.
3. I want to grow the intelligence, not start out with everything. The
customer is most interested in what is recent, not what is 15 years old, so I
don’t need to consume all their data.
Initially, this is not relevant from a technical point of view.
4. I want to push every document a user looks at through the system, and then
over a short amount of time I expect I will have a decent and growing library
of connections between what is current and important.
Currently, Stanbol doesn't provide services for making sense of the
metadata extracted by the Enhancer. So that is something you would
have to build yourself.
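To give an idea of what such a custom layer might look like, here is a rough sketch that is not part of Stanbol: a small in-memory index that records which entity URIs were found in each document and ranks other documents by how many entities they share (Jaccard similarity). The class name EntityIndex, the document ids and the choice of Jaccard are all assumptions made for the example.

```python
from collections import defaultdict


class EntityIndex:
    """Hypothetical in-memory index of entity URIs per document."""

    def __init__(self):
        self.entities_by_doc = defaultdict(set)
        self.docs_by_entity = defaultdict(set)

    def add(self, doc_id, entity_uris):
        """Record the entities extracted for one document."""
        for uri in entity_uris:
            self.entities_by_doc[doc_id].add(uri)
            self.docs_by_entity[uri].add(doc_id)

    def related(self, doc_id, limit=5):
        """Return (other_doc_id, score) pairs, most similar first."""
        mine = self.entities_by_doc.get(doc_id, set())
        # Only documents sharing at least one entity can be related.
        candidates = set()
        for uri in mine:
            candidates |= self.docs_by_entity[uri]
        candidates.discard(doc_id)
        scored = []
        for other in candidates:
            theirs = self.entities_by_doc[other]
            # Jaccard similarity: shared entities over all entities.
            scored.append((len(mine & theirs) / len(mine | theirs), other))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(other, score) for score, other in scored[:limit]]
```

Calling add each time a user views a document would grow the index incrementally, which matches the idea of starting small and building up connections over time. A real deployment would need persistent storage rather than Python dictionaries.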
5. When looking at an image or video, I’d like the user to be able to tag that
object and based on that tag add that to the enhancements of that and other
objects in the system.
Currently, there are no engines to enhance images or videos, although
this functionality is in the backlog and has, for example, been proposed
as a possible GSoC project for this year. So you would have to tag that
content manually.
6. I want to display a tag cloud, and/or list of related documents based on
what Stanbol knows.
That could easily be achieved with a custom backend that stores the
enhancements. It is also possible, to some extent, within Stanbol by
storing them in a Clerezza graph.
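For the tag cloud part, a custom backend could be as simple as counting how many documents mention each entity label and mapping the counts to font sizes. The sketch below assumes you have already pulled the entity labels out of the stored enhancements; the function name tag_cloud and the size range are illustrative choices, not anything Stanbol provides.

```python
from collections import Counter


def tag_cloud(labels_per_doc, min_size=10, max_size=40):
    """Map each entity label to a font size based on document frequency.

    labels_per_doc: iterable of label lists, one list per document.
    """
    counts = Counter()
    for labels in labels_per_doc:
        # Count each label once per document, however often it occurs.
        counts.update(set(labels))
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    cloud = {}
    for label, n in counts.items():
        # Linear scaling; if every label is equally frequent, use max_size.
        scale = (n - low) / span if span else 1.0
        cloud[label] = round(min_size + scale * (max_size - min_size))
    return cloud
```

Manually entered user tags for images or videos could be fed into the same function as additional labels, so that they show up alongside the automatically extracted ones.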
I’m not looking for a solution to all this, I realize that much of it is
custom, but I feel that the Stanbol services are key to the picture. I can’t
find a good example of how this would all fit together, and I don’t think I
have the semantic / CMS knowledge to just plow forward. I am looking to have a
conversation that will get me moving. Any help you can provide would be greatly
appreciated.
Thank you,
Mark Loper
CTO
Cyren LLC
mark.lo...@cyrenllc.com
Hope that helps. Cheers,
Rafa