Re: Whither Stanbol

Rafa Haro Tue, 24 May 2016 02:50:38 -0700

Hi Antero,

I'll try to answer your questions as well as I can, but for some of them
Rupert could probably provide a better explanation. Besides your questions,
I will take a look at your documentation ASAP, but please feel free to
directly suggest any changes you think should be made. There is a mirror of
the Apache Stanbol codebase on GitHub. Making pull requests there would be
a very good way to ease documentation improvements. We can later take the
pull requests' diff files and apply them directly to a local SVN-based
copy.


On Mon, May 23, 2016 at 3:45 PM Antero Duarte <a.fduar...@gmail.com> wrote:

> Hi there,
>
> Okay, in order to keep this alive, I compiled a collection of the
> documentation that I have created related to Stanbol. I am sending this
> attached to this email as a zip file. If there is a better way to do it,
> just reply and tell me what it is.
>
> Some parts of the documentation are just an expansion on the official
> docs, so a lot of it will be repeated, just worded differently or with some
> extra thing that I found useful.
>
> To complement this, I have some specific questions about where stanbol is
> moving towards and I'd like to welcome anyone that knows the answer to any
> of them to reply to the email.
>
> What's the role of the Sesame Yard?
>     The reason why I ask this is because I was able to configure a kiwi
> repository in a marmotta instance and register it in stanbol as a remote
> Sesame Yard, but unlike the Solr yard, there seems to be no way of
> connecting this to an engine and put it on an enhancement chain. Doing this
> would allow greater flexibility as one could use marmotta as a remote
> triplestore. Is this implemented? Is it meant to work in a different way?
>

As far as I know, that is exactly the use case the Sesame Yard was planned
for. As you have also stated, others have reported on the list several
problems using it within a Linking Engine. So this is again a good
opportunity to compile a step-by-step tutorial and include it as part
of the documentation. Anyway, and Rupert could probably confirm this point,
you must take into account that the linking process is somewhat limited when
using a Sesame Yard. Several features of the Entity Lookup process in
Stanbol are tightly coupled to the Solr Yard, basically because there is no
way to achieve the same with SPARQL queries and also because triplestores
don't provide full-fledged full-text search support.
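
For illustration only, here is a minimal sketch of the kind of label lookup
that plain SPARQL 1.1 offers. The function name label_lookup_query and the
choice of rdfs:label are assumptions made for this example; this is not how
the Sesame Yard is implemented. A substring FILTER like this one has no
tokenization or relevance ranking, which hints at why it is hard to match
what a full-text index gives you.

```python
def label_lookup_query(term, limit=10):
    """Build a SPARQL 1.1 query that finds entities by label substring.

    Hypothetical helper for illustration; not part of Stanbol.
    """
    # Escape backslashes and double quotes so the term is a valid
    # SPARQL string literal, and lowercase it for a case-insensitive match.
    escaped = term.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?entity ?label WHERE {\n"
        "  ?entity rdfs:label ?label .\n"
        # Plain substring match: no stemming, no fuzzy matching, no scoring.
        f'  FILTER(CONTAINS(LCASE(STR(?label)), "{escaped}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(label_lookup_query("Paris"))
```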


>
> What is the current version of Solr bundled with Stanbol, and are we
> planning on moving on to some more recent version?
>

The Solr version currently in Stanbol's trunk is 4.4.0. I suppose we can
upgrade to Solr 5, but I don't see a major issue with that right now.


>
> What is the status of connecting to a remote Solr instance?
>     Stanbol already uses Solr in an embedded way so from an abstract
> perspective, it shouldn't be too hard to just plug it in to a remote
> instance of Solr possibly running in a different server. The advantages of
> this would be obviously the decoupling of function and storage, more
> flexibility and control over the Solr instance (i.e applying a
> visualisation layer like banana <https://github.com/lucidworks/banana> on
> it), but also an easier route to connect directly to the solr instance
> which I don't know how everyone else sees it, but I see it as just having
> more flexibility.
>

Again, AFAIK, this is already possible. The main limitation of using a
stand-alone Solr server is that you cannot use the current Stanbol FST
engine. FST is the best option if your knowledge base is very large, which
by the way would probably be the major reason for using an external Solr
cluster.


>
> What's both the current role and the "proposed" role of ontonet?
>     Is it supposed to define a namespace globally? For example, if I
> define an ontology in ontonet, I don't need to worry about defining it when
> I create a new custom vocabulary and I can just use it in the raw RDF data?
>

I can't help you with anything regarding Ontonet; I have never used it,
sorry.


>
> How far are we from accepting form data POST requests to the enhancer?
>     Frameworks and libraries like Express.js for node.js are deprecating
> the use of raw POST requests in favour of form data POST requests, is this
> something Stanbol will want to at least support?
>

Hmm, again I'm not sure. It sounds to me like this is already supported,
but I would need to check. If not, could you please create a Jira issue
for this?
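
To make the difference concrete, here is a minimal sketch of the two request
styles, built with the Python standard library and not actually sent. The
URL http://localhost:8080/enhancer assumes a local Stanbol launcher with
default settings, and the form field name "content" is an assumption that
should be checked against the running instance before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a local Stanbol launcher listening on its default port.
ENHANCER_URL = "http://localhost:8080/enhancer"


def raw_request(text, url=ENHANCER_URL):
    """Raw POST: the text to enhance is the request body itself."""
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "text/turtle",
        },
        method="POST",
    )


def form_request(text, url=ENHANCER_URL, field="content"):
    """Form POST: the text travels as a URL-encoded form field.

    The field name is hypothetical; verify it against your instance.
    """
    body = urlencode({field: text}).encode("ascii")
    return Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "text/turtle",
        },
        method="POST",
    )


if __name__ == "__main__":
    for req in (raw_request("Paris is in France"),
                form_request("Paris is in France")):
        print(req.get_method(), req.full_url, req.data)
```

Passing either object to urllib.request.urlopen would send it; comparing the
two responses is a quick way to check whether the form variant is accepted.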

Thanks!


>
> Sorry for this huge dump of information, but these are just some things
> that have been on my mind for quite a while and this seemed like the best
> timing for sharing them with the community. As I said before, feel free to
> comment on those if you know any answers, criticize my lack of research if
> anything I ask has been said somewhere by someone before and comment on the
> documentation I am providing (especially the places where I ask for help).
>
> Best Regards,
> Antero
>
> On Fri, 20 May 2016 at 23:06 Stefano Cossu <sco...@artic.edu> wrote:
>
>> Hello,
>>
>> Great to see so much feedback. As A. Soroka mentioned, some Fedora
>> adopters are already using Stanbol or looking into it. We at the Art
>> Institute of Chicago fall in the latter category.
>>
>> Reading and understanding the documentation has been tough indeed. I have
>> some use cases and I have been trying to figure out whether Stanbol is a
>> good fit for them, but I cannot match what I read in the docs with what I
>> have in my running Stanbol instance (for example, where is the content
>> hub?). Also, without a reasonably regular release schedule or a 1.x release
>> available, it is hard to rely on Stanbol for tasks beyond experimental or
>> ancillary.
>>
>> With a massive introduction of Linked Data concepts in the latest version
>> of Fedora I foresee it being just a matter of time until more folks will
>> start looking at something to resolve semantic integration issues. If that
>> is Stanbol's goal, it would be great to rely on a community project rather
>> than on individual implementations.
>>
>> The AIC has very limited developer resources, but we may be able to
>> contribute with use cases, ideas, testing, and spreading the word; and I am
>> sure that if enough awareness arises, more contribution may come from other
>> sides.
>>
>>
>> Thanks,
>>
>> Stefano
>>
>
>> On 05/20/2016 06:34 AM, Antero Duarte wrote:
>>
> Hi,
>>
>> I will gather all the documentation I have, create some comments on what
>> I don't really understand and essentially got to work on a trial-error
>> basis and then I will send these to everyone. I will also outline in the
>> same email some features I don't understand, some features that I think are
>> useful but don't know how to configure/ not sure if they are actually fully
>> implemented and a list of items that I came across that no longer apply/are
>> deprecated.
>>
>> Regards,
>> Antero
>>
>> On Fri, 20 May 2016 at 11:39 Rafa Haro <rh...@apache.org> wrote:
>>
>> Hi Antero,
>>
>>
>>>
>>> On Fri, May 20, 2016 at 12:23 PM Antero Duarte <a.fduar...@gmail.com> wrote:
>>>
>>> > Hi there,
>>> >
>>> > Stanbol is great and I would hate to see it die.
>>> >
>>>
>>> Couldn't agree more!
>>>
>>> >
>>> > About the lack of feedback from users/developers, I can only say that
>>> it
>>> > took quite a while for me to be able to reply to someone on this
>>> mailing
>>> > list because the learning curve is so steep. I bet a lot of people
>>> still
>>> > read and are interested in stanbol updates, but they just don't have
>>> the
>>> > technical know-how to be involved. I include myself in this group, I
>>> have
>>> > answered a couple of questions, but only really basic ones, as I fear
>>> my
>>> > knowledge of the platform as a whole doesn't allow me to answer more
>>> > complicated questions.
>>> >
>>> > I think one step that definitely needs to be taken is
>>> improving/updating
>>> > the existing documentation. I know for a fact that one thing that
>>> really
>>> > put me off when I first started using stanbol was that there was
>>> > documentation that was unclear, examples that were unable to be
>>> reproduced
>>> > for several reasons, and outdated documents that referenced components
>>> that
>>> > no longer existed in the latest stable release of stanbol (I'm not even
>>> > talking about the latest build from trunk).
>>> >
>>>
>>> That's true again imho. Developer documentation is also needed, not only
>>> end-user documentation. And probably some work on making the APIs more
>>> comprehensible.
>>>
>>>
>>> >
>>> > I have a couple of documents that I have written over time that made it
>>> > easier for me to understand how stanbol works and I could share these
>>> but
>>> > they would need to be reviewed by someone who understands stanbol a lot
>>> > better than me.
>>> >
>>>
>>> Please share; we can surely all benefit from it and improve the
>>> documentation.
>>>
>>>
>>> >
>>> > I understand that you have busy lives and as developers, you'd rather
>>> use
>>> > the little time you have to code than to write documentation, but if
>>> we can
>>> > make stanbol more approachable to newcomers, I believe the developer
>>> pool
>>> > would increase greatly and we could make Stanbol great again.
>>> >
>>>
>>> +1. It would also be great to have concrete examples of which features,
>>> components and so on are not clear enough or just deprecated in the
>>> current live documentation, so we can start with those.
>>>
>>> Thanks a lot!
>>>
>>>
>>> >
>>> > My two cents.
>>> >
>>> > Best Regards,
>>> > Antero Duarte
>>> >
>>>
>>> > On Fri, 20 May 2016 at 10:26 Rafa Haro <rh...@apache.org> wrote:
>>> >
>>> > > Hi Soroka,
>>> > >
>>> > > First of all, reading this kind of email is, in my opinion, a cause of
>>> > > happiness, as it is a new attempt to somehow reactivate the project. I
>>> share the
>>> > > same feeling about Apache Stanbol since sometime ago. More than one
>>> month
>>> > > ago, there was a Google Hangout meeting joined by some committers and
>>> > also
>>> > > users. We tried to sketch an immediate roadmap and planned to release
>>> > > version 1.0 in the following weeks after that meeting. We sent an
>>> email
>>> > to
>>> > > the list with the meeting minutes, but after that there was a lot of
>>> > > silence again.
>>> > >
>>> > > The main problem right now is probably the lack of quality
>>> time
>>> > to
>>> > > dedicate to the project for the current active committers. I can only
>>> > speak
>>> > > for myself: in my particular case, in the last year I have used
>>> Stanbol
>>> > for
>>> > > a couple of projects, we developed a couple of custom engines that
>>> we can
>>> > > prepare for contribution, but we never found the proper time to do
>>> this,
>>> > > among other things because it wasn't clear to us whether those engines
>>> could be
>>> > > useful for the community. And that is probably another symptom, we
>>> have
>>> > > been progressively losing feedback from users,
>>> developers....community:
>>> > > there are less and less messages in the mailing list every month.
>>> This
>>> > > scenario is probably not very motivating for attracting contributions
>>> > > and finding new committers. There are probably more reasons, such as
>>> > > Stanbol not being technically very approachable.
>>> > >
>>> > > Of course I'm not saying this situation is someone's fault. I'm not
>>> very
>>> > sure
>>> > > about the best recipe for improving the situation either.
>>> > >
>>> > > Thoughts?
>>> > >
>>>
>>> > > On Thu, May 19, 2016 at 5:49 PM A. Soroka <aj...@virginia.edu> wrote:
>>> > >
>>> > > > Hi, Stanbol folks!
>>> > > >
>>> > > > I'm writing to you on behalf of the community of Fedora Commons (
>>> > > > http://fedora-commons.org). Fedora is an information architecture
>>> with
>>> > > > open source reference implementation that has come into wide use
>>> over
>>> > the
>>> > > > last fifteen years in the "cultural heritage" world of libraries,
>>> > > archives,
>>> > > > museums, etc. For many years, we've been intensely concerned with
>>> the
>>> > > ideas
>>> > > > that go under the loose label of "the Semantic Web". In fact, the
>>> > latest
>>> > > > edition of Fedora is a Linked Data Platform implementation,
>>> amongst
>>> > > other
>>> > > > things.
>>> > > >
>>> > > > Several institutions using Fedora are also using Stanbol for
>>> various
>>> > > tasks
>>> > > > (supporting OpenRefine, metadata entity management, NER, etc.), and
>>> > some
>>> > > > discussion has occurred about its state and future potential. It's
>>> not
>>> > > > totally clear to us what kind of development community and
>>> commitment
>>> > > > therefrom currently exists. There has been discussion about a 1.0
>>> > release
>>> > > > of Stanbol, but there doesn't seem to be much other activity in the
>>> > > > codebase, with very few of the listed committers making commits.
>>> > > >
>>> > > > We were wondering if it is possible to get a better sense of the
>>> > > > near-mid-term future of the project. Is there a road map beyond
>>> the 1.0
>>> > > > release? Is Stanbol seeking new developers? What kinds of
>>> resources are
>>> > > > missing to put more vitality back into Stanbol? It's an excellent
>>> > project
>>> > > > filled with great ideas and we'd like to see it move forward.
>>> > > >
>>> > > > We'd be happy to get together for a telephone call / Google
>>> Hangout /
>>> > > > other meeting, if that seems useful!
>>> > > >
>>> > > > ---
>>> > > > A. Soroka
>>> > > > The University of Virginia Library
>>> > > >
>>> > > >
>>> > >
>>> >
>>>
>> --
>>
>> Stefano Cossu
>> Director of Application Services, Collections
>>
>> The Art Institute of Chicago
>> 116 S. Michigan Ave.
>> Chicago, IL 60603
>> 312-499-4026
>>
>
