Re: Move Stanbol to the Attic?

2020-01-31 Thread Rafa Haro
Per discussions with the board, +1 to move Stanbol to the attic.



On Wed, Jan 29, 2020 at 8:42 PM Dave Fisher  wrote:

> Hi -
>
> Development activity in the project is quiet and not progressing. This
> message is to warn the community that the project and codebase may be moved
> to the Apache Attic - http://attic.apache.org
>
> If the community can demonstrate additional development then Stanbol can
> be revitalized.
>
> Once in the Attic all code and prior releases are preserved and available.
>
> Regards,
> Dave
>


Re: Entity not found on referenced site 'dbpedia'

2019-09-24 Thread Rafa Haro
Hi Grzegorz,

The default dbpedia index is a subset of the 30k most prominent entities from
dbpedia, intended for testing. There is a specific indexer for dbpedia
that you can use to index the whole dataset, although it might need some
improvements for more recent dumps, since it was tested with a version
from several years ago.

Hope that helps
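
For anyone reproducing the lookup quoted below, here is a minimal Python
sketch of the same Entityhub request with the entity URI percent-encoded.
The host, port and /stanbol prefix are taken from the request in this
thread; the helper names are hypothetical, and treating HTTP 404 as
"entity not in the index" is an assumption about the response, not
something confirmed here.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base URL as it appears in the quoted request; adjust to your installation.
STANBOL_BASE = "http://localhost:8787/stanbol"


def entity_lookup_url(site: str, entity_uri: str, base: str = STANBOL_BASE) -> str:
    """Build the Entityhub lookup URL for one entity on one referenced site."""
    # urlencode percent-encodes the entity URI so it travels as a single
    # query parameter value.
    return f"{base}/entityhub/site/{site}/entity?{urlencode({'id': entity_uri})}"


def entity_exists(site: str, entity_uri: str, base: str = STANBOL_BASE) -> bool:
    """Return True if the site answers the lookup, False on HTTP 404.

    Assumption: a missing entity is reported with status 404.
    """
    request = Request(
        entity_lookup_url(site, entity_uri, base),
        headers={"Accept": "application/json"},
    )
    try:
        with urlopen(request):
            return True
    except HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other status is a real error, not a missing entity


MUSEUM = "http://dbpedia.org/resource/Museum_of_Anatolian_Civilizations"
print(entity_lookup_url("dbpedia", MUSEUM))
```

Calling entity_exists("dbpedia", MUSEUM) against a running instance would
show whether a given resource made it into the 30k-entity test index.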



On Wed, Sep 18, 2019 at 11:59 PM Grzegorz Trzeciak 
wrote:

> My guess is that it has to do with indexing (
> http://stanbol.apache.org/docs/trunk/customvocabulary.html) but then I
> thought there is a fallback to the original site with referenced sites?
>
> On Wed, Sep 18, 2019 at 23:38, Grzegorz Trzeciak wrote:
>
> > I've encountered it already a few times that some dbpedia resources are
> > not being found on the default dbpedia entityhub.
> > One such entity is
> > http://dbpedia.org/resource/Museum_of_Anatolian_Civilizations
> >
> > When called with:
> >
> >
> > http://localhost:8787/stanbol/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Museum_of_Anatolian_Civilizations
> >
> > The response I got is:
> > Entity 'http://dbpedia.org/resource/Museum_of_Anatolian_Civilizations'
> > not found on referenced site 'dbpedia'
> >
> > Do you have any idea as to what is happening and how to fix this?
> >
> > Grzegorz
> >
>


Re: Steps required for adding support for another language

2019-04-14 Thread Rafa Haro
By the way, in your case you shouldn't be using the opennlp ner engine; you
should be using opennlp chunking directly, together with the EntityLinking
engine (not Named Entity Linking).
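
To compare chains while experimenting, text can be posted to a named chain
over the Enhancer's REST interface. A minimal Python sketch, assuming the
/stanbol base URL that appears in another thread on this list and the
dbpedia-fst-linking chain mentioned in the quoted message below; the helper
name and the sample sentence are hypothetical, and the request is only
built here, not sent:

```python
from urllib.request import Request

# Assumed base URL (taken from another thread on this list); adjust as needed.
STANBOL_BASE = "http://localhost:8787/stanbol"


def build_enhance_request(text: str, chain: str, base: str = STANBOL_BASE) -> Request:
    """Prepare (but do not send) a POST of plain text to a named enhancement chain."""
    return Request(
        url=f"{base}/enhancer/chain/{chain}",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "application/json",
        },
        method="POST",
    )


req = build_enhance_request("Warszawa jest stolicą Polski.", "dbpedia-fst-linking")
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen should return the enhancement
results, provided an instance is running at that address and the chain
exists under that name.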

On Sun, Apr 14, 2019 at 22:27, Rafa Haro wrote:

> Yeah, ideally you will have to train OpenNLP models for Polish. But for
> testing, you can force the OpenNLP engines to use the models for a specific
> language (normally English). I would swear you can do that directly in the
> engines' configuration through the Felix console. The content will be
> processed as English and OpenNLP will do its best, but for languages with a
> similar syntax that is sometimes enough to at least get chunks with
> candidate tokens.
>
> Hope that helps
>
> PS: just out of curiosity, because I don't remember it right now and I won't
> have a laptop at hand for some days: which engines are involved in the
> fst-linking chain?
>
> On Sun, Apr 14, 2019 at 21:54, Grzegorz Trzeciak wrote:
>
>> OK, I've found the chain that at least captures some dbpedia entities:
>> dbpedia-fst-linking
>> I will be playing with various engine combinations to see what gets me
>> through the POC best, which leaves me with a question about the more
>> permanent solution.
>>
>> My understanding is that this would require building a language model for
>> opennlp, is that correct? Are there other requirements for adding language
>> support? I am trying to estimate the work effort required for such a task,
>> so any advice will be helpful.
>>
>> Also if you are aware of any resources that could be helpful, that would
>> be great.
>>
>> Thank you
>>
>> G.
>>
>> On Sun, Apr 14, 2019 at 21:10, Grzegorz Trzeciak wrote:
>>
>>> using default chain:
>>>
>>> - *tika* (optional, TikaEngine)
>>> - *langdetect* (required, LanguageDetectionEnhancementEngine)
>>> - *opennlp-sentence* (required, OpenNlpSentenceDetectionEngine)
>>> - *opennlp-token* (required, OpenNlpTokenizerEngine)
>>> - *opennlp-pos* (required, OpenNlpPosTaggingEngine)
>>> - *opennlp-ner* (required, NamedEntityExtractionEnhancementEngine)
>>> - *dbpediaLinking* (required, NamedEntityTaggingEngine)
>>> - *entityhubExtraction* (required, EntityLinkingEngine)
>>> - *dbpedia-dereference* (required, EntityDereferenceEngine)
>>>
>>>
>>> I will try disabling langdetect then.
>>>
>>> On Sun, Apr 14, 2019 at 21:08, Rafa Haro wrote:
>>>
>>>> Hi Grzegorz,
>>>>
>>>> Can you provide details about your enhancement chain? You can probably
>>>> try disabling language detection and forcing English as the language
>>>> for the whole chain.
>>>>
>>>> On Sun, Apr 14, 2019 at 20:52, Grzegorz Trzeciak <gtrzec...@gmail.com>
>>>> wrote:
>>>>
>>>> > I need to provide a proof of concept for a customer using Stanbol enhancer
>>>> > but the POC needs to be in Polish, only now I realised there is no support
>>>> > for Polish in Stanbol (other than language recognition). At the moment
>>>> > running the enhancer on a text only returns the recognized language, so my
>>>> > question is twofold:
>>>> >
>>>> > 1. Is there a quick and dirty way of making Stanbol work with Polish
>>>> > language (for POC only)
>>>> > 2. What are the steps necessary to implement the correct solution of
>>>> > supporting another language
>>>> >
>>>> > Thanks
>>>> >
>>>> > Grzegorz Trzeciak
>>>> >
>>>>
>>>


Re: Steps required for adding support for another language

2019-04-14 Thread Rafa Haro
Yeah, ideally you will have to train OpenNLP models for Polish. But for
testing, you can force the OpenNLP engines to use the models for a specific
language (normally English). I would swear you can do that directly in the
engines' configuration through the Felix console. The content will be
processed as English and OpenNLP will do its best, but for languages with a
similar syntax that is sometimes enough to at least get chunks with
candidate tokens.

Hope that helps

PS: just out of curiosity, because I don't remember it right now and I won't
have a laptop at hand for some days: which engines are involved in the
fst-linking chain?

On Sun, Apr 14, 2019 at 21:54, Grzegorz Trzeciak wrote:

> OK, I've found the chain that at least captures some dbpedia entities:
> dbpedia-fst-linking
> I will be playing with various engine combinations to see what gets me
> through the POC best, which leaves me with a question about the more
> permanent solution.
>
> My understanding is that this would require building a language model for
> opennlp, is that correct? Are there other requirements for adding language
> support? I am trying to estimate the work effort required for such a task,
> so any advice will be helpful.
>
> Also if you are aware of any resources that could be helpful, that would
> be great.
>
> Thank you
>
> G.
>
> On Sun, Apr 14, 2019 at 21:10, Grzegorz Trzeciak wrote:
>
>> using default chain:
>>
>> - *tika* (optional, TikaEngine)
>> - *langdetect* (required, LanguageDetectionEnhancementEngine)
>> - *opennlp-sentence* (required, OpenNlpSentenceDetectionEngine)
>> - *opennlp-token* (required, OpenNlpTokenizerEngine)
>> - *opennlp-pos* (required, OpenNlpPosTaggingEngine)
>> - *opennlp-ner* (required, NamedEntityExtractionEnhancementEngine)
>> - *dbpediaLinking* (required, NamedEntityTaggingEngine)
>> - *entityhubExtraction* (required, EntityLinkingEngine)
>> - *dbpedia-dereference* (required, EntityDereferenceEngine)
>>
>>
>> I will try disabling langdetect then.
>>
>> On Sun, Apr 14, 2019 at 21:08, Rafa Haro wrote:
>>
>>> Hi Grzegorz,
>>>
>>> Can you provide details about your enhancement chain? You can probably
>>> try disabling language detection and forcing English as the language
>>> for the whole chain.
>>>
>>> On Sun, Apr 14, 2019 at 20:52, Grzegorz Trzeciak <gtrzec...@gmail.com>
>>> wrote:
>>>
>>> > I need to provide a proof of concept for a customer using Stanbol enhancer
>>> > but the POC needs to be in Polish, only now I realised there is no support
>>> > for Polish in Stanbol (other than language recognition). At the moment
>>> > running the enhancer on a text only returns the recognized language, so my
>>> > question is twofold:
>>> >
>>> > 1. Is there a quick and dirty way of making Stanbol work with Polish
>>> > language (for POC only)
>>> > 2. What are the steps necessary to implement the correct solution of
>>> > supporting another language
>>> >
>>> > Thanks
>>> >
>>> > Grzegorz Trzeciak
>>> >
>>>
>>


Re: Steps required for adding support for another language

2019-04-14 Thread Rafa Haro
Hi Grzegorz,

Can you provide details about your enhancement chain? You can probably try
disabling language detection and forcing English as the language for the
whole chain.

On Sun, Apr 14, 2019 at 20:52, Grzegorz Trzeciak wrote:

> I need to provide a proof of concept for a customer using Stanbol enhancer
> but the POC needs to be in Polish, only now I realised there is no support
> for Polish in Stanbol (other than language recognition). At the moment
> running the enhancer on a text only returns the recognized language, so my
> question is twofold:
>
> 1. Is there a quick and dirty way of making Stanbol work with Polish
> language (for POC only)
> 2. What are the steps necessary to implement the correct solution of
> supporting another language
>
> Thanks
>
> Grzegorz Trzeciak
>


[ANNOUNCE] Please welcome Antonio Pérez as a Stanbol PMC member

2018-12-21 Thread Rafa Haro
Please, join me in welcoming Antonio David Pérez as a new Stanbol PMC
member. Congratulations Antonio!

The Apache Stanbol PMC


[ANNOUNCE] Please welcome Furkan Kamaci as a Stanbol committer

2018-12-21 Thread Rafa Haro
Please, join me in welcoming Furkan Kamaci as a new Apache Stanbol
committer. Looking forward to your contributions Furkan. Congratulations!

The Apache Stanbol PMC


Re: Trying to understand the level of activity in Stanbol community

2018-12-01 Thread Rafa Haro
Hi Furkan,

I have merged your PRs. Thanks again

On Thu, Nov 29, 2018 at 7:26 PM Rafa Haro  wrote:

> Hi Furkan,
>
> Sorry I didn't. This weekend is probably a good time for me. Will review
> and eventually merge them before next Monday
>
> Thanks!
> On Thu, Nov 29, 2018 at 17:48, Furkan KAMACI wrote:
>
>> Hi Rafa,
>>
>> Did you have a chance to check my PRs?
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>> On Mon, Nov 26, 2018 at 11:51 AM Rafa Haro  wrote:
>>
>>> Hi Furkan,
>>>
>>> Thanks! I will take a look ASAP in order to get them merged
>>>
>>> Cheers,
>>> Rafa
>>>
>>> On Sat, Nov 24, 2018 at 9:40 PM Furkan KAMACI 
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > I've just created STANBOL-1473, STANBOL-1474, STANBOL-1475 and
>>> > STANBOL-1476 and created PRs for them!
>>> >
>>> > Kind Regards,
>>> > Furkan KAMACI
>>> >
>>> > On Wed, Nov 21, 2018 at 5:39 PM Furkan KAMACI 
>>> > wrote:
>>> >
>>> > > Hi Rafa & Roman,
>>> > >
>>> > > I've already contributed to Apache Stanbol with my 15 issues. You can
>>> > > check them from here:
>>> > >
>>> > >
>>> > >
>>> >
>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20Stanbol%20AND%20reporter%20%3D%20kamaci%20
>>> > >
>>> > > @Rafa I'll check backlog and also file new issues to keep my
>>> > contribution.
>>> > >
>>> > > @Roman I believe that Apache Stanbol has a great future and I'll try
>>> to
>>> > > contribute its source code and improve its community as much as I
>>> can.
>>> > >
>>> > > Kind Regards,
>>> > > Furkan KAMACI
>>> > >
>>> > > On Wed, Nov 21, 2018 at 5:37 AM Roman Shaposhnik 
>>> wrote:
>>> > >
>>> > >> Hi Everyone,
>>> > >>
>>> > >> top-posting here to summarize a month worth of this thread:
>>> > >>   1. it sounds like there's still interest in keeping Stanbol alive
>>> > >>   2. it sounds like at least Phillip Rhodes, Dileepa Jayakody and
>>> > >>   Rafa Haro are enthusiastic enough about Stanbol's future
>>> > >>   and are willing to invest some amount of their time into it.
>>> > >>
>>> > >> Hence, I'd like to propose that we start a formal PMC vote on adding
>>> > >> these
>>> > >> individuals to the PMC in the hope that this will help to
>>> revitalize the
>>> > >> project.
>>> > >>
>>> > >> If there are no objections -- I'm going to start a PMC vote thread
>>> over
>>> > >> the
>>> > >> weekend.
>>> > >>
>>> > >> Thanks,
>>> > >> Roman.
>>> > >>
>>> > >> On 2018/10/25 02:50:03, Phillip Rhodes 
>>> > >> wrote:
>>> > >> > Hey guys, I'm definitely still very interested in Stanbol and
>>> would
>>> > >> > like to see it remain an active project here at ASF.  I haven't
>>> had
>>> > >> > much opportunity to contribute up to this point, but I remain
>>> > >> > committed to getting more deeply involved and starting to
>>> contribute.
>>> > >> > It's hard to commit to a specific level or a specific timeline,
>>> but I
>>> > >> > can say that my plan is definitely to try and get equipped to
>>> start
>>> > >> > contributing to Stanbol.
>>> > >> >
>>> > >> >
>>> > >> > Phil
>>> > >> > ~~~
>>> > >> > This message optimized for indexing by NSA PRISM
>>> > >> >
>>> > >> > On Thu, Oct 18, 2018 at 5:48 AM Dileepa Jayakody
>>> > >> >  wrote:
>>> > >> > >
>>> > >> > > Hi Roman, Rafa and fellow Stanbolers.
>>> > >> > >
>>> > >> > > I'm Dileepa Jayakody, a committer of Apache Stanbol project
>>> since
>>> > >> 2014.
>>> > >> > > While I was engaging in the dev list and development of the

Re: Trying to understand the level of activity in Stanbol community

2018-11-29 Thread Rafa Haro
Hi Furkan,

Sorry I didn't. This weekend is probably a good time for me. Will review
and eventually merge them before next Monday

Thanks!
On Thu, Nov 29, 2018 at 17:48, Furkan KAMACI wrote:

> Hi Rafa,
>
> Did you have a chance to check my PRs?
>
> Kind Regards,
> Furkan KAMACI
>
> On Mon, Nov 26, 2018 at 11:51 AM Rafa Haro  wrote:
>
>> Hi Furkan,
>>
>> Thanks! I will take a look ASAP in order to get them merged
>>
>> Cheers,
>> Rafa
>>
>> On Sat, Nov 24, 2018 at 9:40 PM Furkan KAMACI 
>> wrote:
>>
>> > Hi,
>> >
>> > I've just created STANBOL-1473, STANBOL-1474, STANBOL-1475 and
>> > STANBOL-1476 and created PRs for them!
>> >
>> > Kind Regards,
>> > Furkan KAMACI
>> >
>> > On Wed, Nov 21, 2018 at 5:39 PM Furkan KAMACI 
>> > wrote:
>> >
>> > > Hi Rafa & Roman,
>> > >
>> > > I've already contributed to Apache Stanbol with my 15 issues. You can
>> > > check them from here:
>> > >
>> > >
>> > >
>> >
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20Stanbol%20AND%20reporter%20%3D%20kamaci%20
>> > >
>> > > @Rafa I'll check backlog and also file new issues to keep my
>> > contribution.
>> > >
>> > > @Roman I believe that Apache Stanbol has a great future and I'll try
>> to
>> > > contribute its source code and improve its community as much as I can.
>> > >
>> > > Kind Regards,
>> > > Furkan KAMACI
>> > >
>> > > On Wed, Nov 21, 2018 at 5:37 AM Roman Shaposhnik 
>> wrote:
>> > >
>> > >> Hi Everyone,
>> > >>
>> > >> top-posting here to summarize a month worth of this thread:
>> > >>   1. it sounds like there's still interest in keeping Stanbol alive
>> > >>   2. it sounds like at least Phillip Rhodes, Dileepa Jayakody and
>> > >>   Rafa Haro are enthusiastic enough about Stanbol's future
>> > >>   and are willing to invest some amount of their time into it.
>> > >>
>> > >> Hence, I'd like to propose that we start a formal PMC vote on adding
>> > >> these
>> > >> individuals to the PMC in the hope that this will help to revitalize
>> the
>> > >> project.
>> > >>
>> > >> If there are no objections -- I'm going to start a PMC vote thread
>> over
>> > >> the
>> > >> weekend.
>> > >>
>> > >> Thanks,
>> > >> Roman.
>> > >>
>> > >> On 2018/10/25 02:50:03, Phillip Rhodes 
>> > >> wrote:
>> > >> > Hey guys, I'm definitely still very interested in Stanbol and would
>> > >> > like to see it remain an active project here at ASF.  I haven't had
>> > >> > much opportunity to contribute up to this point, but I remain
>> > >> > committed to getting more deeply involved and starting to
>> contribute.
>> > >> > It's hard to commit to a specific level or a specific timeline,
>> but I
>> > >> > can say that my plan is definitely to try and get equipped to start
>> > >> > contributing to Stanbol.
>> > >> >
>> > >> >
>> > >> > Phil
>> > >> > ~~~
>> > >> > This message optimized for indexing by NSA PRISM
>> > >> >
>> > >> > On Thu, Oct 18, 2018 at 5:48 AM Dileepa Jayakody
>> > >> >  wrote:
>> > >> > >
>> > >> > > Hi Roman, Rafa and fellow Stanbolers.
>> > >> > >
>> > >> > > I'm Dileepa Jayakody, a committer of Apache Stanbol project since
>> > >> 2014.
>> > >> > > While I was engaging in the dev list and development of the
>> project
>> > >> for
>> > >> > > couple of years, due to other engagements, I must admit my
>> > >> contributions
>> > >> > > have been very low since.
>> > >> > >
>> > >> > > However, I think there is a lot we can do to revive the project.
>> > >> > > Looking at the Jira tracker[1] I think we can start by fixing the
>> > >> bugs and
>> > >> > > work towards a new release. Would love to see Stanbol active
>> again,
>> > >

Re: Trying to understand the level of activity in Stanbol community

2018-11-26 Thread Rafa Haro
Hi Furkan,

Thanks! I will take a look ASAP in order to get them merged

Cheers,
Rafa

On Sat, Nov 24, 2018 at 9:40 PM Furkan KAMACI 
wrote:

> Hi,
>
> I've just created STANBOL-1473, STANBOL-1474, STANBOL-1475 and STANBOL-1476
> and created PRs for them!
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Nov 21, 2018 at 5:39 PM Furkan KAMACI 
> wrote:
>
> > Hi Rafa & Roman,
> >
> > I've already contributed to Apache Stanbol with my 15 issues. You can
> > check them from here:
> >
> >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20Stanbol%20AND%20reporter%20%3D%20kamaci%20
> >
> > @Rafa I'll check backlog and also file new issues to keep my
> contribution.
> >
> > @Roman I believe that Apache Stanbol has a great future and I'll try to
> > contribute its source code and improve its community as much as I can.
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Wed, Nov 21, 2018 at 5:37 AM Roman Shaposhnik  wrote:
> >
> >> Hi Everyone,
> >>
> >> top-posting here to summarize a month worth of this thread:
> >>   1. it sounds like there's still interest in keeping Stanbol alive
> >>   2. it sounds like at least Phillip Rhodes, Dileepa Jayakody and
> >>   Rafa Haro are enthusiastic enough about Stanbol's future
> >>   and are willing to invest some amount of their time into it.
> >>
> >> Hence, I'd like to propose that we start a formal PMC vote on adding
> >> these
> >> individuals to the PMC in the hope that this will help to revitalize the
> >> project.
> >>
> >> If there are no objections -- I'm going to start a PMC vote thread over
> >> the
> >> weekend.
> >>
> >> Thanks,
> >> Roman.
> >>
> >> On 2018/10/25 02:50:03, Phillip Rhodes 
> >> wrote:
> >> > Hey guys, I'm definitely still very interested in Stanbol and would
> >> > like to see it remain an active project here at ASF.  I haven't had
> >> > much opportunity to contribute up to this point, but I remain
> >> > committed to getting more deeply involved and starting to contribute.
> >> > It's hard to commit to a specific level or a specific timeline, but I
> >> > can say that my plan is definitely to try and get equipped to start
> >> > contributing to Stanbol.
> >> >
> >> >
> >> > Phil
> >> > ~~~
> >> > This message optimized for indexing by NSA PRISM
> >> >
> >> > On Thu, Oct 18, 2018 at 5:48 AM Dileepa Jayakody
> >> >  wrote:
> >> > >
> >> > > Hi Roman, Rafa and fellow Stanbolers.
> >> > >
> >> > > I'm Dileepa Jayakody, a committer of Apache Stanbol project since
> >> 2014.
> >> > > While I was engaging in the dev list and development of the project
> >> for
> >> > > couple of years, due to other engagements, I must admit my
> >> contributions
> >> > > have been very low since.
> >> > >
> >> > > However, I think there is a lot we can do to revive the project.
> >> > > Looking at the Jira tracker[1] I think we can start by fixing the
> >> bugs and
> >> > > work towards a new release. Would love to see Stanbol active again,
> >> and see
> >> > > the senior developers back on the list :)
> >> > >
> >> > > I will start to engage again as much as possible.
> >> > >
> >> > > Thanks,
> >> > > Dileepa
> >> > >
> >> > > [1] https://issues.apache.org/jira/projects/STANBOL/summary
> >> > >
> >> > > On Thu, Oct 18, 2018 at 1:15 PM Rafa Haro  wrote:
> >> > >
> >> > > > Hi Roman,
> >> > > >
> >> > > > I'm Rafa Haro, current PMC member of Apache Stanbol. I think we
> have
> >> > > > discussed this situation a couple of times in the past few months.
> >> There
> >> > > > were a similar inquiry recently and back then, at least 3 "active"
> >> > > > (including me) PMC members responded with similar answers: we
> would
> >> be
> >> > > > somehow available for bug fixing, attend the mailing lists and so
> >> on but,
> >> > > > because of lack of time, further contributions like working on the
> >> backlog,
> >> > > > community developmenthad to b

Re: Trying to understand the level of activity in Stanbol community

2018-11-20 Thread Rafa Haro
Hi Furkan,

Glad to hear that. Is there any specific open task that you want to start
addressing or any proposal for the backlog?

Cheers,
Rafa

On Tue, Nov 20, 2018 at 5:12 AM Muhammad Sajjad 
wrote:

> yes of course.
>
> On Sun, Nov 18, 2018 at 10:26 AM Furkan KAMACI 
> wrote:
>
> > Hi All,
> >
> > I'm *Furkan KAMACI*, a person who loves open source, Java, and Machine
> > Learning. I'm a Committer and PMC member of Apache Gora and Apache Nutch, a
> > Committer of Apache ManifoldCF, and also a member of The Apache Software
> > Foundation.
> >
> > I've developed software and managed teams that create products for
> > analyzing petabytes of data to run efficient Machine Learning and Search
> > algorithms on Big Data. I have 10+ years of work experience, including at
> > companies like Alcatel-Lucent/Nokia, and an academic background.
> > Currently, I have a company named LAGOM which works on Big Data and
> > Machine Learning and contributes to open source projects.
> >
> > I've contributed to many ASF projects throughout the years including Gora,
> > ManifoldCF, Nutch, Solr/Lucene and of course *Stanbol*. I was a GSoC
> > student for Stanbol 4 years ago but I couldn't dive deep into Stanbol
> > due to my health problems.
> >
> > I would like to take responsibility and contribute to *Apache Stanbol* as a
> > Committer/PMC.
> >
> > PS: My Apache and GitHub id is *kamaci*.
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Thu, Oct 25, 2018 at 5:50 AM Phillip Rhodes <
> motley.crue@gmail.com>
> > wrote:
> >
> > > Hey guys, I'm definitely still very interested in Stanbol and would
> > > like to see it remain an active project here at ASF.  I haven't had
> > > much opportunity to contribute up to this point, but I remain
> > > committed to getting more deeply involved and starting to contribute.
> > > It's hard to commit to a specific level or a specific timeline, but I
> > > can say that my plan is definitely to try and get equipped to start
> > > contributing to Stanbol.
> > >
> > >
> > > Phil
> > > ~~~
> > > This message optimized for indexing by NSA PRISM
> > >
> > > On Thu, Oct 18, 2018 at 5:48 AM Dileepa Jayakody
> > >  wrote:
> > > >
> > > > Hi Roman, Rafa and fellow Stanbolers.
> > > >
> > > > I'm Dileepa Jayakody, a committer of Apache Stanbol project since
> 2014.
> > > > While I was engaging in the dev list and development of the project
> for
> > > > couple of years, due to other engagements, I must admit my
> > contributions
> > > > have been very low since.
> > > >
> > > > However, I think there is a lot we can do to revive the project.
> > > > Looking at the Jira tracker[1] I think we can start by fixing the
> bugs
> > > and
> > > > work towards a new release. Would love to see Stanbol active again,
> and
> > > see
> > > > the senior developers back on the list :)
> > > >
> > > > I will start to engage again as much as possible.
> > > >
> > > > Thanks,
> > > > Dileepa
> > > >
> > > > [1] https://issues.apache.org/jira/projects/STANBOL/summary
> > > >
> > > > On Thu, Oct 18, 2018 at 1:15 PM Rafa Haro  wrote:
> > > >
> > > > > Hi Roman,
> > > > >
> > > > > I'm Rafa Haro, current PMC member of Apache Stanbol. I think we
> have
> > > > > discussed this situation a couple of times in the past few months.
> > > There
> > > > > were a similar inquiry recently and back then, at least 3 "active"
> > > > > (including me) PMC members responded with similar answers: we would
> > be
> > > > > somehow available for bug fixing, attend the mailing lists and so
> on
> > > but,
> > > > > because of lack of time, further contributions like working on the
> > > backlog,
> > > > > community developmenthad to be eventually abandoned. I think
> this
> > > > > situation is extensible to most of current committers. Within the
> > same
> > > > > threads, there were Stanbol's users that were interesting in taking
> > the
> > > > > project out of this "blockage" but the reality is that this
> situation
> > > has
> > > > > remained the same for months and the activity at the mailing li

Re: Trying to understand the level of activity in Stanbol community

2018-10-18 Thread Rafa Haro
Hi Roman,

I'm Rafa Haro, a current PMC member of Apache Stanbol. I think we have
discussed this situation a couple of times in the past few months. There
was a similar inquiry recently and back then at least 3 "active" PMC
members (including me) responded with similar answers: we would be
somehow available for bug fixing, attending the mailing lists and so on but,
because of lack of time, further contributions like working on the backlog
or community development had to be eventually abandoned. I think this
situation extends to most of the current committers. Within the same
threads, there were Stanbol users who were interested in taking the
project out of this "blockage", but the reality is that the situation has
remained the same for months and the activity on the mailing lists is
minimal. So maybe yeah, it is probably a shame, and maybe it is time to
archive the project.

I'm not sure what the implications of putting the project into the
Attic are from the organisation's point of view, but in my honest opinion it
is either that or, somehow, I don't know exactly how, trying to find or extend
the community with new members interested in further improvements and use
cases around Stanbol that could lead to resurrecting the project.

Warm regards from Seville

On Thu, Oct 18, 2018 at 3:15 AM Roman Shaposhnik  wrote:

> On Sun, Oct 7, 2018 at 9:47 AM Roman Shaposhnik  wrote:
> >
> > Hi!
> >
> > a few months ago Shane Curcuru started this discussion
> > around the level of PMC activity in the project:
> >
> https://lists.apache.org/thread.html/7f85aecc5180b85d888cf63a8136521739e5e1ee74ae167abd3ab9ff@%3Cdev.stanbol.apache.org%3E
> >
> > at the time it seemed like the consensus was: there's
> > enough folks interested in at least overseeing mechanics
> > of the PMC. However if a few months since it would seem
> > that we're back to where Shane's discussion started.
> >
> > Now, this is, of course, based on my cursory review of mailing
> > list and JIRA activity, but it seems that out of the current 22 PMC
> > members only one or two are active (and I don't mean writing
> > code active, but at least supporting "mechanics" of the PMC):
> > http://people.apache.org/phonebook.html?pmc=stanbol
> >
> > I'm wondering, perhaps, if it would benefit the project to consider
> > expanding the PMC to include the other 4 committers and/or
> > some of the project's active users.
> >
> > Thoughts?
>
> Hi Again!
>
> it seems that there hasn't been much of a feedback on this thread so
> I'm expanding it to users@ as well.
>
> The concern here is that unless the community comes up with at least 3
> reasonably active PMC
> members to make sure that the project can continue it may be end up
> "archived" into Apache Attic.
>
> Now, these PMC members do NOT have to be existing PMC members, if
> there's enough enthusiastic
> folks in the user community we can definitely look into "rebooting"
> the PMC that way.
>
> If anybody is interested in making sure that Stanbol continues as an
> active Apache project -- please
> reply to this thread.
>
> Thanks,
> Roman.
>


Re: Missing Stanbol Board Report - any PMC Member can file

2018-08-17 Thread Rafa Haro
As already discussed in other threads, it has been hard for me to dedicate
time to the project lately as well, but as a PMC member I intend to be
available to watch over the project.

On Thu, Aug 16, 2018 at 10:42 PM Phillip Rhodes 
wrote:

> FWIW:
>
> Not a PMC member, but I'm willing to declare that my goal is to attain
> that status.  I haven't contributed anything TO Stanbol yet, but I use
> it heavily within the scope of another OSS project I'm working on and
> I do intend to get involved with helping the core Stanbol project.
>
>
> Phil
>
>
> This message optimized for indexing by NSA PRISM
>
>
> On Sun, Aug 12, 2018 at 2:01 PM, Shane Curcuru 
> wrote:
> > Dear Apache Stanbol PMC (and community),
> >
> > The board report for Stanbol has not yet been submitted for this month's
> > board meeting on Wednesday. If you or another member of the PMC are
> > unable to get it in by twenty-four hours before meeting time, please let
> > the board know, and plan to report next month.
> >
> >   https://www.apache.org/foundation/board/reporting#how
> >
> > I see that elsewhere on dev@ people are discussing the activity level in
> > the project (and a couple of people are still interested, yay!).  From
> > the ASF board's point of view, what is most important is ensuring that
> > there are at least three PMC members who are active enough that they
> > could respond to security or other serious issues.
> >
> > That is, a mostly-dormant project is fine (because features are done, or
> > because people don't currently have time to build new ones), but the
> > board still needs to know there are PMC members watching over the
> > project, and providing quarterly reports.
> >
> > Any PMC member can submit a report for the project if the Chair is not
> > available this week.  The board would also appreciate seeing if there
> > are three PMC members who could reply to this email, so we know the PMC
> > still has sufficient attention here to provide oversight.
> >
> > Thanks,
> > --
> >
> > - Shane
> >   Director & Member
> >   The Apache Software Foundation
> > (on behalf of the ASF Board)
> >
>


Re: Stanbol development

2018-08-07 Thread Rafa Haro
Hi Phillip,

I think it is just a matter of lack of time. There are still some good
ideas/refactors in the backlog (some of them discussed not too long
ago), but we probably failed to build a broader community.

My time is quite limited and others like Rupert have expressed the same
recently, but I'm still willing and looking forward to putting some effort
into the project.

Cheers,
Rafa

On Wed, Aug 1, 2018 at 5:35 PM Phillip Rhodes 
wrote:

> So, just wondering... of the folks who have worked on Stanbol up until
> now, is there still interest in continuing to work on Stanbol among
> that group? That is, is the lull in activity because people aren't
> interested / available, or is because Stanbol is simply complete as is
> and doesn't *need* any more work?
>
> Are there features that people have in mind for adding to a
> (potential) next release?  If so, can anybody call out what some of
> the highest priority ones might be?  I'm still interested in possibly
> trying to help with some Stanbol activity, depending on what things
> are needed / wanted...
>
>
> Cheers,
>
> Phil
> 
> This message optimized for indexing by NSA PRISM
>


Re: GSOC 2018 - Squebi - 1327

2018-04-09 Thread Rafa Haro
Hi Kamila,

I could make an effort and try to co-mentor the project, but before that
you would have to modify and extend your proposal to cover the newly
discussed tasks. That could take you some time, and I'm not even sure you
are allowed to modify the proposal at this point.


On Fri, Apr 6, 2018 at 10:01 PM Kamila Molina Orellana <
kamila.molin...@gmail.com> wrote:

> Hi,
>
> Thanks Rafa, that sounds cool. That is the reason I wanted to leave open
> the possibility to work on a different activity. Will you be up to mentor
> this task with Andreas? According to the timeline [1], organizations are
> reviewing and selecting proposals.
>
>
> ~Regards,
> Kamila.
>
> [1] https://summerofcode.withgoogle.com/how-it-works/#timeline
>
> On Tue, Mar 27, 2018 at 8:39 AM, Rafa Haro <rh...@apache.org> wrote:
>
>> Hi Kamila,
>>
>> I'm probably late to the discussion because you have already made nice
>> progress on the proposal, but I just wanted to put another topic on the
>> table which in my opinion could be quite suitable for a GSoC project. It
>> has to do with a new EntityHub Yard implementation. ATM there are (as far
>> as I know) three different Yard implementations, based on Solr, Clerezza
>> and Sesame. The first one is suitable for Entity Linking and the others
>> could eventually be used as a classic triple store, enabling for example
>> SPARQL querying. We are still missing a Yard implementation that could
>> fulfil both use cases with a single technology, basically a triple store
>> with full-text search capabilities. There are at least a couple of them
>> available, like Jena TDB + Jena Text (formerly Jena LARQ) or Stardog,
>> which is a commercial triple store that provides a Community version as
>> well.
>>
>> I have been interested in this topic for years but never had the proper
>> time to work on it. Apart from the Yard implementation, a new Entity
>> Searcher would need to be implemented as well. @Rupert is a more
>> authoritative voice on this than anyone, as the main developer of that
>> part.
>>
>> My 2 cents,
>>
>> Rafa
>>
>> On Tue, Mar 27, 2018 at 5:55 AM Kamila Molina Orellana <
>> kamila.molin...@gmail.com> wrote:
>>
>> > Hi Rupert,
>> >
>> > Thanks for your answer.
>> >
>> > I was seeing STANBOL-320 mostly as data cleaning/management issue. I
>> will
>> > try to describe the scope and possible solutions.
>> >
>> > However, until we define that, I think it is better to leave my proposal
>> > with an open issue. Just to reiterate, I can say that I will exchange
>> the
>> > current SPARQL editor + an issue TBD (the deadline for the proposals is
>> in
>> > 12 hours). So, we can decide an issue or some work in the remaining
>> time. I
>> > said that because you can help me to estimate times, so I don't take an
>> > issue that is too small or too big.
>> >
>> > Finally, this is my proposal [1] in case you have some comments.
>> >
>> > Regards,
>> > ~Kamila.
>> >
>> > [1]
>> >
>> >
>> https://docs.google.com/document/d/1WmropX2Bu_0g10VX3ZRE8Jil5kGb8N9RwhSCTs4KFww/edit?usp=sharing
>> >
>> > On Mon, Mar 26, 2018 at 5:32 AM, Rupert Westenthaler <
>> > rupert.westentha...@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > Disambiguation is interesting for sure, but also a very broad topic.
>> > > So please make sure to describe scope and possible solutions (e.g.
>> > > approach + frameworks) to the problem.
>> > >
>> > > A dummy example based on STANBOL-320 this could be
>> > >
>> > > * approach: process extracted named entities; filter all with more
>> > > than 5 words and 50 chars
>> > > * frameworks: none
>> > >
>> > > A more realistic example could be
>> > >
>> > > * approach: summary of the approach + links to some papers this
>> > > approach is based on or related to
>> > > * frameworks: links to the frameworks used to implement the approach.
>> > > If required also links to the datasets needed for learning ...
>> > >
>> > > hope this helps
>> > > best
>> > > Rupert
>> > >
>> > > On Sun, Mar 25, 2018 at 11:51 PM, Kamila Molina Orellana
>> > > <kamila.molin...@gmail.com> wrote:
>> > > > Hi,
>> > > >
>> > > > I am working into my proposal. I think it would be a good

Re: Stanbol 1.0 Enhancer chain issue

2018-03-09 Thread Rafa Haro
Hi Shenal,

Which kind of site were you creating? A Managed or a Referenced site?

Cheers,
Rafa

On Mon, Feb 19, 2018 at 7:42 AM Silva, Shenal 
wrote:

> Hi All,
>
> I have set up the latest version (1.0) of Stanbol and created a custom
> vocabulary as per the following document which was given in one of the mail
> trails
> https://drive.google.com/drive/u/0/folders/0BxAMBxFN_NcgOVB1SFlUTTlOZ2M
>
> The vocabulary/ enhancer chains get created without any issue. However,
> entities are not extracted as expected. When I try the same setup on
> version 0.12 the entities are extracted as expected.
>
> Java version: Oracle JDK 8 - "1.8.0_131"
> Tomcat Version: 7.0.85
>
> Am I missing anything here which is not in the document above?
>
> Thanx & Regards,
> Shenal Silva
>


Re: GSOC 2018 - Squebi - 1327

2018-03-09 Thread Rafa Haro
Hi Andreas,

Can you or the student elaborate a little on the idea for the project?

Thanks

On Fri, Mar 9, 2018 at 8:26 AM Andreas Kuckartz  wrote:

> Hi Kamila and Rupert,
>
> Sorry for my very belated reply.
>
> The intention of tagging the issue with GSOC 2018 was that it could be
> resolved in a GSOC-project.
>
> Yes, it alone would not be enough for such a project and therefore
> significantly more work would need to be included.
>
> The main problem seems to be to have two mentors.
>
> Is there any of the other Stanbol developers who can and likes to fill
> such a role? It would really help to support Stanbol.
>
> Only then should we attempt to develop the scope of the GSOC project.
>
> Cheers,
> Andreas
> ---
>
> Kamila Molina Orellana wrote:
> > Thanks Rupert for you answer. I understand it. I was really interested in
> > Stanbol, so I guess I will keep looking at some other project for GSoC.
> > However, If anyone is interested in mentoring, I am still looking forward
> > to contribute.
> >
> > Regards,
> > ~Kamila.
> >
> > On Mon, Mar 5, 2018 at 7:02 AM, Rupert Westenthaler <
> > rupert.westentha...@gmail.com> wrote:
> >
> >> Hi Kamila
> >>
> >> Stanbol has very little ongoing development - mostly fixing bugs.
> >> While there are for sure topics of interest - especially related to
> >> information extraction and classification - I do not know if we would
> >> be able to find 2 developers who plan to be active enough for mentoring.
> >>
> >> For me the GSoC 2018 timeline is troublesome as I will be mostly
> >> offline in the final evaluation period (6 - 21 August). So I would have
> >> a hard time providing last-minute feedback and doing the final
> >> evaluation.
> >>
> >> best
> >> Rupert
> >>
> >>
> >> On Sun, Mar 4, 2018 at 6:31 PM, Kamila Molina Orellana
> >>  wrote:
> >>> Hi Rupert.
> >>>
> >>> Thanks so much Rupert for your answer. Do you think the project will go
> >> for
> >>> GSoC? I have been looking at Squebi, and if it is too short, we can try
> >> to
> >>> fix some other issue.
> >>>
> >>> Regards,
> >>> ~Kamila.
> >>>
> >>> On Wed, Feb 28, 2018 at 1:09 AM, Rupert Westenthaler <
> >>> rupert.westentha...@gmail.com> wrote:
> >>>
>  Hi Kamila,
> 
>  Sorry for the late response, but I wanted wait for Andreas to answer
>  this as I do not really agree with him tagging STANBOL-1327 with
>  GSoC2018.
> 
>  IMHO this would be not a good topic as it is much to simple and small
>  (I would expect a maximum of 2 working days to complete this feature).
>  Maybe Andreas can add some comments about his intentions when marking
>  this issue with GSoC2018.
> 
>  best
>  Rupert
> 
>  On Mon, Feb 26, 2018 at 4:58 AM, Kamila Molina Orellana
>   wrote:
> > Hi,
> >
> > I have been working in my proposal and wanted to make the most of
>  bounding
> > period by interacting with the community and the tool. I have been
>  working
> > in the proposal and would like to share it with my prospective
> mentor.
> > Anyone from Apache who would like to mentorship?
> >
> > Regards,
> > ~Kamila.
> >
> > On Tue, Feb 20, 2018 at 9:31 PM, Kamila Molina Orellana <
> > kamila.molin...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I am interested in participating in GSOC 2018 and been looking at
> the
> >> issue and playing a bit with Stanbol. I am a student from the
>  University of
> >> Cuenca. I have been working with Semantic technologies and currently
> >> in
>  my
> >> third year of college.
> >>
> >> Well just a quick overview of what I understand, please correct me
> >> if I
>  am
> >> wrong. The idea will be to exchange the actual Sparql Endpoint for
>  Squebi.
> >> Then when you go to /sparql, you'll have Squebi functionalities,
> >> right?
> >>
> >> I see you use Fremaker to build the website, OSGI to load modules.
> >> The
> >> actual Sparql Endpoint is loading in bundle/list.xml (
> >> org.apache.stanbol.commons.web.sparql), but when is this file
> >> loaded?
> >> Where is the module that manages the Sparql Services? I mean, the WS
> >> to
> >> make updates and select queries.
> >>
> >>
> >> Regards,
> >> ~Kamila.
> >>
> 
> 
> 
>  --
>  | Rupert Westenthaler rupert.westentha...@gmail.com
>  | Bodenlehenstraße 11  ++43-699-11108907
>  | A-5500 Bischofshofen
>  | REDLINK.CO ..
> >> ..
>  ..
>  | http://redlink.co/
> 
> >>
> >>
> >>
> >> --
> >> | Rupert Westenthaler rupert.westentha...@gmail.com
> >> | Bodenlehenstraße 11  ++43-699-11108907
> >> | A-5500 

Re: ASF Board Report for Stanbol - Repeated Reminder for December 2017

2017-11-20 Thread Rafa Haro
Hi ajs6f, Philip,

There has been a separate thread at priv...@stanbol.apache.org where more
than 3 PMC members have confirmed that they are active. Everyone (myself
included) agreed that they have limited time to contribute further to the
project codebase at the moment, but also agreed to be available to support
new contributors.

Regards,
Rafa

On Mon, Nov 20, 2017 at 4:02 PM Phillip Rhodes 
wrote:

> I'm really hoping there are enough PMC members to keep Stanbol active.
> I'm personally very interested in getting more involved, although I
> don't know enough to be a PMC candidate anytime terribly soon.   I'd
> like to get there eventually though.  :-)
>
>
> Phil
>
> This message optimized for indexing by NSA PRISM
>
>
> On Fri, Nov 17, 2017 at 9:51 AM, ajs6f  wrote:
> > I assume that Phil Steitz sent this message to dev@ after failing to
> get any response by sending to private@.
> >
> > Do we still have _any_ active PMC members to actually get a report in?
> >
> > Do we have _three_ active PMC members left? If not, it's time to start
> an orderly progression towards the Attic. We don't want to just dump a
> codebase on them (much as I'm sure it happens).
> >
> > Moving to the Attic would be a shame, because there are really good
> ideas in Stanbol, ideas that could have a future. Some of the components
> might be able to find other homes for a new life, but not without active
> PMC/committer work. I certainly don't pretend to understand the complex
> module architecture well enough to excise individual components or
> rearchitect towards a simpler modularization.
> >
> > ajs6f ; Jena PMC
> >
> >
> >> On Nov 16, 2017, at 12:04 PM, Phil Steitz  wrote:
> >>
> >> Dear Stanbol community,
> >>
> >> In the governance model at the ASF the board delegates responsibility
> for
> >> managing projects to PMCs. To enable the board to provide oversight
> across the
> >> foundation, PMCs are tasked with providing the board with a quarterly
> >> report on the health of the project. The board has noticed that the
> reports
> >> for Stanbol have been missed for a number of months.
> >>
> >> The reports to the board are normally written by the PMC chair but all
> PMC
> >> members have an individual responsibility to ensure that a report is
> >> submitted. If the PMC chair is not available then any PMC member can
> submit
> >> the report. If you need help with this process, please reach out to
> >> bo...@apache.org
> >>
> >> Please ensure that a report for Stanbol is submitted to the board for
> the
> >> next meeting.
> >>
> >> If the PMC chair is not going to be available for an extended period of
> time
> >> it may make sense to rotate the PMC chair. Rotating the PMC chair does
> not
> >> mean the current chair has failed. People's situations and interests
> change,
> >> and rotation is good as it allows more people to become familiar with
> that
> >> role. Again, if assistance is required with this process, please feel
> free to
> >> reach out to bo...@apache.org
> >>
> >> As projects mature, they will naturally reach a point where activity
> reduces
> >> to a level that the project is no longer sustainable. At Apache,
> projects
> >> reach this stage when there are no longer 3 active PMC members providing
> >> oversight. Projects that reach this stage are placed in the attic[1].
> If
> >> Stanbol has reached this point, please reach out to the Attic project to
> >> arrange transfer. On the other hand, if your project is mostly dormant
> but
> >> still has at least three active PMC members it can stay in that state
> for as
> >> long as needed. If your project is in such a state, please mention that
> in
> >> your report and verify the PMC's state at regular intervals.
> >>
> >> Finally, if you have any questions please feel free to reach out to
> >> bo...@apache.org.
> >>
> >> Thanks,
> >> The ASF Board
> >>
> >> [1] http://attic.apache.org/
> >
>


Re: trying to replace default opennlp with stanford models

2017-08-04 Thread Rafa Haro
Hi Laura,

Stanbol's Stanfordnlp is a wrapper that turns Stanford CoreNLP into a
server that processes text and returns annotations in the concrete format
expected by Stanbol. On the Stanbol side, the RESTful NLP engine basically
acts as a client of that server (
https://stanbol.apache.org/docs/trunk/components/enhancer/nlp/restfulnlpanalysisservice
)

Regards

On Wed, Aug 2, 2017 at 9:45 AM Laura Bostan  wrote:

> Hi,
>
> I am currently working on replacing the default opennlp with stanford
> models in stanbol configuration. For that I tried to integrate this release
>  https://github.com/westei/stanbol-stanfordnlp/releases  into my own
> stanbol configuration so that it will be used together with the enhancer
> and other chains I created.
>
> My goal is to compare the NER tools and check which is better for my needs
> and also if possible try to combine them and check if they are better
> together.
> I guess, one solution could be that a bundle out of the stanford models can
> be made and then being loaded and used in the current configuration.
>
> How can we create a bundle out of the stanford models?
>
> Thank you,
> Laura
>


Re: Report Time - Input Required

2017-05-04 Thread Rafa Haro
Hi Fabian,

From my side, some weeks ago I fixed a couple of serious bugs that were
preventing EntityHub from correctly indexing entities with Managed Sites.
They basically broke the whole use case of enhancing with a Managed Site.

Over the next few weeks I will also be working on extending the EntityHub
REST API so that Managed Sites can be created through a REST request. So
far, it is only possible to create them manually through the Felix Console.

That's all from my side

On Thu, May 4, 2017 at 10:57 AM Fabian Christ 
wrote:

> Hi Stanbolers,
>
> we have to report to the board since we (my fault) missed the last
> report slots we should not miss the upcoming one.
>
> So, what has happened and what is going to happen?
>
>
> --
> Fabian
> http://twitter.com/fctwitt
>


Re: Contributing to Stanbol - Getting involved

2017-03-31 Thread Rafa Haro
Hi All,

Although I still agree that organizing a couple of technical sessions on
the codebase is a great idea, it is proving complicated to bring Rupert in
(I can assure you he is quite busy), so I would also like to encourage you
to start getting your hands dirty. As with any other complex project (and
this one is), getting started is not going to be easy, but 2-4 hours of
Rupert explaining the technical details is not going to change that much,
mainly because 2-4 hours is only enough time for a general overview.

So, as Fabian said, you can try any of the currently open issues or just
start implementing anything you are interested in. The project also has a
developers mailing list where we can try to discuss and solve detailed
technical stuff.

Don't be shy :-)

On Fri, Mar 31, 2017 at 10:36 AM Michal Krajňanský <
michal.krajnan...@gmail.com> wrote:

> Hi Stanbolers and Mr. Soroka,
>
> I was curious, if there has been any development in the organization of the
> meeting. Would it perhaps make sense to create a doodle, so that anyone
> interested can vote for the suitable time? If I can somehow help make this
> meeting happen, I will be more than happy to.
>
> Cheers,
> Michal Krajnansky
> Research Specialist
> Konica Minolta Laboratory Europe
> http://research.konicaminolta.eu/
>
> On Mon, Mar 20, 2017 at 2:50 PM A. Soroka  wrote:
>
> > This is a nice précis of how to get involved with pretty much any Apache
> > project. On one particular point:
> >
> > > * There is no technical barrier to start contributing to Stanbol.
> >
> > Well, no, not any unsurmountable technical barriers. But it is not an
> easy
> > code base to approach (I count over 230 Maven modules!). There are a lot
> of
> > important underlying technologies like JAX-RS, OSGi (and the slightly
> > obscure Sling launcher system), and an extensive set of relationships to
> > other projects (e.g. Marmotta, Solr, etc.). And Stanbol has a lot of
> great
> > functionality with which to become familiar!
> >
> > So my hope for a synchronous meeting is rooted in the desire for some
> > "bootstrapping" discussion that could help people like me (who are ready
> > and willing to do work) get to do that work with as little overhead as
> > possible, while updating and writing documentation to help the _next_
> > "generation" of Stanbol contributors.
> >
> > ---
> > A. Soroka
> > The University of Virginia Library
> >
> > > On Mar 20, 2017, at 4:54 AM, Fabian Christ <
> christ.fab...@googlemail.com>
> > wrote:
> > >
> > > Hi Stanbolers,
> > >
> > > there have been some questions in the direction of how to start
> > > contributing to Stanbol. Especially from the viewpoint of a company. I
> > > would like to point to some basics here at Apache.
> > >
> > > * The ASF projects are about people and not so much about companies.
> > > However, companies are very welcome to let their employees contribute
> > > to projects during their working hours.
> > > * There is no technical barrier to start contributing to Stanbol. For
> > > people, who are not yet committers to the SVN yet, the process is to
> > > file a JIRA ticket and attach a patch file to that ticket for desired
> > > changes. A Stanbol committer has to pick this patch and apply it to
> > > the code base.
> > > * People who started to contribute to Stanbol and show interest will
> > > be invited to become official Stanbol committers.
> > > * All discussions about the project and its contributions have to
> > > happen on the dev mailing list. There should be no discussions in some
> > > other channel that are not archived or visible to the world. The
> > > mantra here is "If it is not on the list, it did not happen."
> > >
> > > So everyone with interest in the Stanbol technology is very welcome to
> > > start any technical discussion about improvements on the dev list,
> > > file JIRA ticket, submit patches. This is the way the group of
> > > contributors will grow.
> > >
> > > Best
> > > Fabian
> > >
> > > --
> > > Fabian
> > > http://twitter.com/fctwitt
> >
> >
>


Re: Using Custom Vocabularies

2017-03-22 Thread Rafa Haro
Hi Habib, Rupert

I would like to make another clarification. You should be using language
annotations for the fields that you are planning to map to rdfs:label.
Actually, I recently fixed a bug that was preventing language-annotated
fields (xml:lang) from being correctly indexed in Solr when using Managed
Sites. I first tried to configure an Entity Linking engine not to use any
language when looking up entities, and it wasn't working. That is probably
another bug to solve. Therefore, at this moment, if you want to recognize
entities from a Managed Site, a language is mandatory.

This also implies that Entity Linking against Managed Sites is broken in
version 1.0.0 and only works if you check out the current trunk.
Cheers,
Rafa

On Wed, Mar 22, 2017 at 12:54 PM Rupert Westenthaler <
rupert.westentha...@gmail.com> wrote:

> Hi Habib
>
> In addition to my response to your other mail some minutes ago ...
>
> When configuring the YardSite make sure you configure the current
> FieldMappings for your Ontology
>
> Given the example
>
> <http://www.sample-xo.com/KM/smpl#1CGI>
>   rdf:type smpl:ATR ;
>   smpl:acronym "CGI" ;
>   smpl:comment "Digital Services and Secure Connectivity for
> Governments and key Industrial Sectors and Infrastructure, incl.
> Security Solutions" ;
>   smpl:level "1" ;
>   smpl:refinedBy <http://www.sample-xo.com/KM/smpl#2CYS> ;
>   smpl:refinedBy <http://www.sample-xo.com/KM/smpl#2INT> ;
>   smpl:title "Secure Connectivity for Government and Industries" ;
> .
>
> you will need the mappings
>
> http://www.sample-xo.com/KM/smpl#title > rdfs:label
> http://www.sample-xo.com/KM/smpl#acronym > rdfs:label
>
> to ensure that values of those fields are mapped to the rdfs:label -
> the field used by default for EntityLinking.
>
> Without those mappings linking will not work unless you configure the
> engine to link against http://www.sample-xo.com/KM/smpl#title
>
> Hope this helps
> best
> Rupert
>
> On Fri, Mar 17, 2017 at 2:56 PM, Habibrahman A <habibrahma...@tcs.com>
> wrote:
> > Thank you very much, Rafa, for the response.
> >
> > It is going to be a dynamic one, hence we need to use the Managed site i
> > believe. Kindly correct me if am wrong.
> >
> > I'm attaching (in google drive) the sample ontology that we are using.
> > Also the screen shots of the configuration we made. It would be great if
> > you could review and let us know where we are missing.
> >
> > Also, if you can point to some documents/books to understand the labels,
> > would be great. I've gone through the link that talks about the
> > EntityLinking but couldn't get clarity on the labels part because of my
> > limited knowledge.
> >
> >
> > https://drive.google.com/open?id=0BxAMBxFN_NcgOVB1SFlUTTlOZ2M
> >
> >
> > Thanks & Regards
> > Habib
> >
> >
> >
> >
> > From:   Rafa Haro <rh...@apache.org>
> > To: dev@stanbol.apache.org
> > Date:   03/17/2017 06:04 PM
> > Subject:Re: Using Custom Vocabularies
> >
> >
> >
> > Hi Habib,
> >
> > What you are doing is ok but with some exceptions. First of all, if your
> > domain ontology is not something dynamic (you are not planning to change
> > it
> > very often) you might prefer to use a Referenced Site instead of Managed
> > Site which is more suitable for dynamic uses cases. Basically, you would
> > need to use the EntityHub genericrdf indexer tool as explained here:
> > https://stanbol.apache.org/docs/trunk/customvocabulary.html (Using a
> > Referenced Site section).
> >
> > By configuring the indexing part, the most important part is to properly
> > model the fields mappings, specially those fields that must be considered
> > as labels as you main use case is to Enhance content with that domain
> > ontology. Regarding this, I can't see any good candidate for labels in
> the
> > piece of ontology that you posted. Take into account that the Entity
> > Linker
> > has been designed for linking mostly concepts names (labels) and not
> whole
> > sentences like "Digital Product Design and Factory". That doesn't sound
> as
> > a label for me, but you can try anyway.
> >
> > Hope that helps,
> > Rafa
> >
> > On Fri, Mar 17, 2017 at 12:23 PM Habibrahman A <habibrahma...@tcs.com>
> > wrote:
> >
> >> Dear Team,
> >>
> >> We're using Stanbol for semantic search in one of our projects about
> >> Knowledge Management.
> >>
> >> We need to use th

Re: Using Custom Vocabularies

2017-03-17 Thread Rafa Haro
Hi Habib,

What you are doing is OK, with some caveats. First of all, if your domain
ontology is not dynamic (you are not planning to change it very often), you
might prefer a Referenced Site over a Managed Site, which is better suited
to dynamic use cases. Basically, you would need to use the EntityHub
genericrdf indexer tool as explained here:
https://stanbol.apache.org/docs/trunk/customvocabulary.html (Using a
Referenced Site section).
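
If you do stay with the Managed Site route, the curl upload from your
message (quoted below) could also be scripted. This is only a rough sketch:
the function name is made up, and the base URL and site name are simply
copied from your curl command, so adjust them to your installation.

```python
import urllib.request


def build_entity_upload(turtle_bytes, site="customSite",
                        base="http://localhost:8080"):
    """Build (without sending) the POST request that the quoted curl
    command performs: a Turtle payload sent to the site's entity endpoint."""
    url = "{}/entityhub/site/{}/entity".format(base, site)
    return urllib.request.Request(
        url,
        data=turtle_bytes,
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumes ATRA.ttl is in the current directory, as in the curl example.
    with open("ATRA.ttl", "rb") as handle:
        request = build_entity_upload(handle.read())
    with urllib.request.urlopen(request) as response:
        print(response.status)
```

Building the request separately from sending it makes it easy to check the
URL and headers before anything touches the server.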

When configuring the indexing, the most important part is to model the
field mappings properly, especially for those fields that must be treated
as labels, since your main use case is to enhance content with that domain
ontology. In this regard, I can't see any good candidate for labels in the
piece of ontology that you posted. Take into account that the Entity Linker
has been designed mostly for linking concept names (labels) and not whole
sentences like "Digital Product Design and Factory". That doesn't sound
like a label to me, but you can try anyway.

Hope that helps,
Rafa

On Fri, Mar 17, 2017 at 12:23 PM Habibrahman A 
wrote:

> Dear Team,
>
> We're using Stanbol for semantic search in one of our projects about
> Knowledge Management.
>
> We need to use the domain specific ontology provided to us.
>
> Below are the steps we did.
>
> 1. Created a new SolrYard
> 2. Create a new Site (Managed Site) and linked the yard
> 3. Using 'curl' command, loaded the ontology to the site
> curl -i -X POST -H "Content-Type:text/turtle" -T {ATRA.ttl} \
>   "http://localhost:8080/entityhub/site/customSite/entity"
> 4. Created a EntityHub Linking and referred the site created in step 2
> 5. Created a custom Enhancement Chain and added the EntityHub Linking
> created in step 4.
>
> When we run search (with sentences from the ontology), we couldn't see the
> entities defined being identified, but the results are different from when
> default chain is run. So we think something is happening, but not sure
> what and how.
>
> For eg.,
> The below is part of the ontology
> ##
> 
>   rdf:type ATRb:ATR ;
>   ATRb:acronym "PDP" ;
>   ATRb:comment "Digital Product Development Process and fully integrated
> Digitized Production and Supply Chain (> 2x Acceleration)" ;
>   ATRb:level "1" ;
>   ATRb:refinedBy  ;
>   ATRb:refinedBy  ;
>   ATRb:refinedBy  ;
>   ATRb:title "Digital Product Design and Factory" ;
> #
>   when we search for 'Digital Product design and factory', we do not see
> anything in the list of entities identified. The enhancement chain just
> identifies the language.
>
> Our enhancement chain contains the below engines
>
> tika;optional
> langdetect
> opennlp-sentence
> opennlp-token
> opennlp-pos
> opennlp-chunker
> sampleEntityLinking
> dbpedia-disamb-linking
> disambiguation-mlt
> dbpedia-dereference
>
>
> We would like to know, if what we are doing is correct ? Also, why we are
> not seeing the entities from the ontology that is loaded.
>
> Also, we were told that we need to create a pipeline to actually have the
> semantic search implemented.
>
> Pipeline steps being,
> 1. Load input documents to an external Solr
> 2. Configure the Managed site, enhancement engine and the chain
> 3. Hit Enhancement chain with the search text
> 4. Parse the output and extract the entities identified
> 5. Hit Solr, which has the input documents indexed, with the identified
> entities.
>
> Is this correct understanding.
>
> We're very new to Stanbol and search technologies as such.
>
> Your help is much appreciated.
>
> Thanks & Regards,
> Habib
> Thanks & Regards
> Habib Rahman
> Manufacturing TEG - Java CoE
> Tata Consultancy Services
> Ph:- +91446616 9247
> Cell:- 9094765645
> Mailto: habibrahma...@tcs.com
> Website: http://www.tcs.com
>
>
>


Re: Installation-Test Error

2017-03-16 Thread Rafa Haro
Hi Christian,

A typical workflow for me consists of calling the enhancer for each
document and retrieving all the entity annotations plus the dereferenced
fields. How to store the entities is, again, pretty use case dependent. If
you want to use Solr or ES for some kind of Semantic Search, take into
account that you will probably need to flatten the entity data somehow. You
could use different but related cores or indexes to try to preserve the
structure of the entity data.
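
As a rough sketch of that workflow (not a definitive implementation): the
first function builds the enhancer call for one document, the second
flattens the resulting annotations into one flat search document. The base
URL, the "application/json" Accept header, the simplified annotation dicts
and the "_ss" field suffixes are all assumptions on my part; parsing the
real enhancer response into that simplified shape is left out.

```python
import urllib.parse
import urllib.request

# Assumption: base URL of the running Stanbol launcher; adjust as needed.
STANBOL_BASE = "http://localhost:8080"


def build_enhancer_request(text, chain=None, base=STANBOL_BASE):
    """Build (without sending) a POST request that asks the enhancer,
    or a named enhancement chain, to analyse one document."""
    if chain is None:
        path = "/enhancer"
    else:
        path = "/enhancer/chain/" + urllib.parse.quote(chain, safe="")
    return urllib.request.Request(
        base + path,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "application/json",  # assumed response format
        },
        method="POST",
    )


def flatten_for_solr(doc_id, text, annotations, min_confidence=0.5):
    """Flatten entity annotations into one flat, multi-valued document.

    `annotations` is a hypothetical, already simplified list of dicts
    with "entity", "label" and "confidence" keys.
    """
    kept = [a for a in annotations if a["confidence"] >= min_confidence]
    return {
        "id": doc_id,
        "text": text,
        "entity_uri_ss": [a["entity"] for a in kept],
        "entity_label_ss": [a["label"] for a in kept],
    }
```

Keeping the entity URIs and labels as parallel multi-valued fields is the
simplest way to flatten; anything richer (types, dereferenced properties)
probably belongs in a separate, related core as mentioned above.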

Hope that helps,
Rafa

On Thu, Mar 16, 2017 at 2:59 PM Christian Herrmann <
c...@christian-herrmann.info> wrote:

> Hi Rafa,
>
> basically, I want to be able to index documents based on their content as
> well as derived (stanbol) information. As result, I want to be able to
> semantically search through documents (based on ontologys) or structure the
> information space (based on pre-defined ontologys or tags (NER), etc.).
> Therefore, I probably would have to built up an index and make it
> searchable (solr). So my question basically is, how to get stanbol involved
> within the ETL ...
>
> Clear enough to help me out here? ;-)
>
> Best, Christian
>
> 2017-03-16 11:50 GMT+01:00 Rafa Haro <rh...@apache.org>:
>
> > Hi Christian,
> >
> > AFAIK, there is not such kind of wiki page or documentation. Take into
> > account that the integration of Stanbol is a use case dependent matter.
> If
> > you further explain your concrete requirements we can try to give some
> > guidelines or at least a piece of advice :-)
> >
> > Cheers
> >
> > On Thu, Mar 16, 2017 at 10:04 AM Christian Herrmann <
> > c...@christian-herrmann.info> wrote:
> >
> > > Hi Rafa,
> > >
> > > regarding the content hub discussion: Cannot manage to get the old
> > versions
> > > compiled (broken dependencies, probably because of new releases). Is
> > there
> > > a wiki page or similar that describes how to integrate stanbol 1.0 with
> > > solr or any other persistance database? The current documentation seems
> > to
> > > be kind of outdated and I can't find something online.
> > >
> > > Thx and regards
> > > Christian
> > >
> > > 2017-03-03 10:05 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > >
> > > > Hi Christian. The tests probably don't work properly on Windows.
> > > >
> > > > On Fri, Mar 3, 2017 at 10:00, Christian Herrmann <
> > > > c...@christian-herrmann.info> wrote:
> > > >
> > > > > OK - same build failure with this revision. Shall I change JDK?
> > > > >
> > > > > 2017-03-02 15:18 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > > > >
> > > > > > Hi Christian,
> > > > > >
> > > > > > ContentHub was discontinued for versions beyond 0.12.x. If you
> want
> > > to
> > > > > use
> > > > > > it you would need to download that version
> > > > > >
> > > > > > On Thu, Mar 2, 2017 at 3:03 PM Christian Herrmann <
> > > > > > c...@christian-herrmann.info> wrote:
> > > > > >
> > > > > > I was able to build Stanbol skipping the tests - nevertheless, I
> am
> > > not
> > > > > > sure, whether everything went correct, e.g. I don't have the
> > > contenthub
> > > > > > available ...
> > > > > >
> > > > > > 2017-03-01 12:05 GMT+01:00 Christian Herrmann <
> > > > > c...@christian-herrmann.info
> > > > > > >:
> > > > > >
> > > > > > > Oracle
> > > > > > >
> > > > > > > 2017-03-01 11:57 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > > > > > >
> > > > > > >> Oracle JDK or Open JDK?
> > > > > > >>
> > > > > > >> On Wed, Mar 1, 2017 at 11:49 AM Christian Herrmann <
> > > > > > >> c...@christian-herrmann.info> wrote:
> > > > > > >>
> > > > > > >> > Hi Rafa,
> > > > > > >> >
> > > > > > >> > C:\>java -showversion
> > > > > > >> > java version "1.8.0_121"
> > > > > > >> > Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> > > > > > >> > Ja

Re: Installation-Test Error

2017-03-16 Thread Rafa Haro
Hi Christian,

AFAIK, there is no such wiki page or documentation. Bear in mind that
integrating Stanbol depends heavily on the use case. If you explain your
concrete requirements in more detail, we can try to give some guidelines
or at least a piece of advice :-)

Cheers

On Thu, Mar 16, 2017 at 10:04 AM Christian Herrmann <
c...@christian-herrmann.info> wrote:

> Hi Rafa,
>
> regarding the content hub discussion: I cannot manage to get the old
> versions compiled (broken dependencies, probably because of new releases).
> Is there a wiki page or similar that describes how to integrate Stanbol 1.0
> with Solr or any other persistence database? The current documentation
> seems to be rather outdated and I can't find anything online.
>
> Thx and regards
> Christian
>
> 2017-03-03 10:05 GMT+01:00 Rafa Haro <rh...@apache.org>:
>
> > Hi Christian. The tests probably don't work properly on Windows.
> >
> > On Fri, Mar 3, 2017 at 10:00, Christian Herrmann <
> > c...@christian-herrmann.info> wrote:
> >
> > > OK - same build failure with this revision. Shall I change JDK?
> > >
> > > 2017-03-02 15:18 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > >
> > > > Hi Christian,
> > > >
> > > > ContentHub was discontinued for versions beyond 0.12.x. If you want
> to
> > > use
> > > > it you would need to download that version
> > > >
> > > > On Thu, Mar 2, 2017 at 3:03 PM Christian Herrmann <
> > > > c...@christian-herrmann.info> wrote:
> > > >
> > > > I was able to build Stanbol skipping the tests - nevertheless, I am
> not
> > > > sure, whether everything went correct, e.g. I don't have the
> contenthub
> > > > available ...
> > > >
> > > > 2017-03-01 12:05 GMT+01:00 Christian Herrmann <
> > > c...@christian-herrmann.info
> > > > >:
> > > >
> > > > > Oracle
> > > > >
> > > > > 2017-03-01 11:57 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > > > >
> > > > >> Oracle JDK or Open JDK?
> > > > >>
> > > > >> On Wed, Mar 1, 2017 at 11:49 AM Christian Herrmann <
> > > > >> c...@christian-herrmann.info> wrote:
> > > > >>
> > > > >> > Hi Rafa,
> > > > >> >
> > > > >> > C:\>java -showversion
> > > > >> > java version "1.8.0_121"
> > > > >> > Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> > > > >> > Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
> > > > >> >
> > > > >> > 2017-03-01 11:12 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > > > >> >
> > > > >> > > Hi Christian,
> > > > >> > >
> > > > >> > > Which JVM version are you using?
> > > > >> > >
> > > > >> > > On Tue, Feb 28, 2017 at 10:53 PM Christian Herrmann <
> > > > >> > > c...@christian-herrmann.info> wrote:
> > > > >> > >
> > > > >> > > Hi all,
> > > > >> > >
> > > > >> > > I am facing an installation error when trying to build stanbol
> > > > >> (Revision
> > > > >> > > 1767958 from https://svn.apache.org/repos/asf/stanbol/trunk)
> > on a
> > > > >> > windows
> > > > >> > > environment; it's a test failure... anyone can help me?
> > > > >> > >
> > > > >> > > Logs attached.
> > > > >> > >
> > > > >> > > Thank you in advance!
> > > > >> > >
> > > > >> > > [INFO]
> > > > >> > > 
> > > > >> 
> > > > >> > > [INFO] BUILD FAILURE
> > > > >> > > [INFO]
> > > > >> > > 
> > > > >> 
> > > > >> > > [INFO] Total time: 19:42 min
> > > > >> > > [INFO] Finished at: 2017-02-28T20:47:42+01:00
> > > > >> > > [INFO] Final Memory: 378M/966M
> > > > >> > > [INFO]
> > > > >> > > ---

Re: Installation-Test Error

2017-03-03 Thread Rafa Haro
Hi Christian. The tests probably don't work properly on Windows.

On Fri, Mar 3, 2017 at 10:00, Christian Herrmann <
c...@christian-herrmann.info> wrote:

> OK - same build failure with this revision. Shall I change JDK?
>
> 2017-03-02 15:18 GMT+01:00 Rafa Haro <rh...@apache.org>:
>
> > Hi Christian,
> >
> > ContentHub was discontinued for versions beyond 0.12.x. If you want to
> use
> > it you would need to download that version
> >
> > On Thu, Mar 2, 2017 at 3:03 PM Christian Herrmann <
> > c...@christian-herrmann.info> wrote:
> >
> > I was able to build Stanbol skipping the tests - nevertheless, I am not
> > sure, whether everything went correct, e.g. I don't have the contenthub
> > available ...
> >
> > 2017-03-01 12:05 GMT+01:00 Christian Herrmann <
> c...@christian-herrmann.info
> > >:
> >
> > > Oracle
> > >
> > > 2017-03-01 11:57 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > >
> > >> Oracle JDK or Open JDK?
> > >>
> > >> On Wed, Mar 1, 2017 at 11:49 AM Christian Herrmann <
> > >> c...@christian-herrmann.info> wrote:
> > >>
> > >> > Hi Rafa,
> > >> >
> > >> > C:\>java -showversion
> > >> > java version "1.8.0_121"
> > >> > Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> > >> > Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
> > >> >
> > >> > 2017-03-01 11:12 GMT+01:00 Rafa Haro <rh...@apache.org>:
> > >> >
> > >> > > Hi Christian,
> > >> > >
> > >> > > Which JVM version are you using?
> > >> > >
> > >> > > On Tue, Feb 28, 2017 at 10:53 PM Christian Herrmann <
> > >> > > c...@christian-herrmann.info> wrote:
> > >> > >
> > >> > > Hi all,
> > >> > >
> > >> > > I am facing an installation error when trying to build stanbol
> > >> (Revision
> > >> > > 1767958 from https://svn.apache.org/repos/asf/stanbol/trunk) on a
> > >> > windows
> > >> > > environment; it's a test failure... anyone can help me?
> > >> > >
> > >> > > Logs attached.
> > >> > >
> > >> > > Thank you in advance!
> > >> > >
> > >> > > [INFO]
> > >> > > 
> > >> 
> > >> > > [INFO] BUILD FAILURE
> > >> > > [INFO]
> > >> > > 
> > >> 
> > >> > > [INFO] Total time: 19:42 min
> > >> > > [INFO] Finished at: 2017-02-28T20:47:42+01:00
> > >> > > [INFO] Final Memory: 378M/966M
> > >> > > [INFO]
> > >> > > 
> > >> 
> > >> > > [ERROR] Failed to execute goal
> > >> > > org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test
> > >> (default-test)
> > >> > > on project org.apache.stanbol.enhancer.engine.topic: There are
> test
> > >> > > failures.
> > >> > > [ERROR]
> > >> > > [ERROR] Please refer to
> > >> > >
> C:\stanbol\enhancement-engines\topic\engine\target\surefire-reports
> > >> for
> > >> > > the
> > >> > > individual test results.
> > >> > > [ERROR] -> [Help 1]
> > >> > > [ERROR]
> > >> > > [ERROR] To see the full stack trace of the errors, re-run Maven
> with
> > >> the
> > >> > -e
> > >> > > switch.
> > >> > > [ERROR] Re-run Maven using the -X switch to enable full debug
> > logging.
> > >> > > [ERROR]
> > >> > > [ERROR] For more information about the errors and possible
> > solutions,
> > >> > > please read the following articles:
> > >> > > [ERROR] [Help 1]
> > >> > > http://cwiki.apache.org/confluence/display/MAVEN/
> > MojoFailureException
> > >> > > [ERROR]
> > >> > > [ERROR] After correcting the problems, you can resume the build
> with
> > >> the
> > 

Re: Installation-Test Error

2017-03-01 Thread Rafa Haro
Oracle JDK or Open JDK?

On Wed, Mar 1, 2017 at 11:49 AM Christian Herrmann <
c...@christian-herrmann.info> wrote:

> Hi Rafa,
>
> C:\>java -showversion
> java version "1.8.0_121"
> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>
> 2017-03-01 11:12 GMT+01:00 Rafa Haro <rh...@apache.org>:
>
> > Hi Christian,
> >
> > Which JVM version are you using?
> >
> > On Tue, Feb 28, 2017 at 10:53 PM Christian Herrmann <
> > c...@christian-herrmann.info> wrote:
> >
> > Hi all,
> >
> > I am facing an installation error when trying to build stanbol (Revision
> > 1767958 from https://svn.apache.org/repos/asf/stanbol/trunk) on a
> windows
> > environment; it's a test failure... anyone can help me?
> >
> > Logs attached.
> >
> > Thank you in advance!
> >
> > [INFO]
> > 
> > [INFO] BUILD FAILURE
> > [INFO]
> > 
> > [INFO] Total time: 19:42 min
> > [INFO] Finished at: 2017-02-28T20:47:42+01:00
> > [INFO] Final Memory: 378M/966M
> > [INFO]
> > 
> > [ERROR] Failed to execute goal
> > org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test)
> > on project org.apache.stanbol.enhancer.engine.topic: There are test
> > failures.
> > [ERROR]
> > [ERROR] Please refer to
> > C:\stanbol\enhancement-engines\topic\engine\target\surefire-reports for
> > the
> > individual test results.
> > [ERROR] -> [Help 1]
> > [ERROR]
> > [ERROR] To see the full stack trace of the errors, re-run Maven with the
> -e
> > switch.
> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> > [ERROR]
> > [ERROR] For more information about the errors and possible solutions,
> > please read the following articles:
> > [ERROR] [Help 1]
> > http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> > [ERROR]
> > [ERROR] After correcting the problems, you can resume the build with the
> > command
> > [ERROR]   mvn  -rf :org.apache.stanbol.enhancer.engine.topic
> >
> > The surefire-reports tell me:
> >
> > 
> > ---
> > Test set: org.apache.stanbol.enhancer.engine.topic.TopicEngineTest
> > 
> > ---
> > Tests run: 14, Failures: 1, Errors: 13, Skipped: 0, Time elapsed: 1.172
> sec
> > <<< FAILURE! - in
> org.apache.stanbol.enhancer.engine.topic.TopicEngineTest
> > testCrossValidation(org.apache.stanbol.enhancer.
> > engine.topic.TopicEngineTest)
> >  Time elapsed: 0.625 sec  <<< FAILURE!
> > java.lang.AssertionError: Default directory is not an absolute path
> > at sun.nio.fs.WindowsFileSystem.(WindowsFileSystem.java:61)
> > at
> > sun.nio.fs.WindowsFileSystemProvider.(WindowsFileSystemProvider.
> > java:53)
> > at
> > sun.nio.fs.DefaultFileSystemProvider.create(DefaultFileSystemProvider.
> > java:36)
> > at
> > java.nio.file.FileSystems$DefaultFileSystemHolder.getDefaultProvider(
> > FileSystems.java:108)
> > at
> > java.nio.file.FileSystems$DefaultFileSystemHolder.
> > access$000(FileSystems.java:89)
> > at
> > java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(
> > FileSystems.java:98)
> > at
> > java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(
> > FileSystems.java:96)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at
> > java.nio.file.FileSystems$DefaultFileSystemHolder.
> > defaultFileSystem(FileSystems.java:96)
> > at
> > java.nio.file.FileSystems$DefaultFileSystemHolder.<
> > clinit>(FileSystems.java:90)
> > at java.nio.file.FileSystems.getDefault(FileSystems.java:176)
> > at java.io.File.toPath(File.java:2234)
> > at javax.crypto.JarVerifier.getSystemEntropy(JarVerifier.java:828)
> > at javax.crypto.JarVerifier.testSignatures(JarVerifier.java:742)
> > at javax.crypto.JarVerifier.access$400(JarVerifier.java:37)
> > at javax.crypto.JarVerifier$1.run(JarVerifier.java:222)
> > at javax.crypto.JarVerifier$1.run(JarVerifier.java:187)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.crypto.JarVerifier.(JarVerifier.jav

Re: Installation-Test Error

2017-03-01 Thread Rafa Haro
Hi Christian,

Which JVM version are you using?

On Tue, Feb 28, 2017 at 10:53 PM Christian Herrmann <
c...@christian-herrmann.info> wrote:

Hi all,

I am facing an installation error when trying to build Stanbol (Revision
1767958 from https://svn.apache.org/repos/asf/stanbol/trunk) in a Windows
environment; it's a test failure. Can anyone help me?

Logs attached.

Thank you in advance!

[INFO]

[INFO] BUILD FAILURE
[INFO]

[INFO] Total time: 19:42 min
[INFO] Finished at: 2017-02-28T20:47:42+01:00
[INFO] Final Memory: 378M/966M
[INFO]

[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test)
on project org.apache.stanbol.enhancer.engine.topic: There are test
failures.
[ERROR]
[ERROR] Please refer to
C:\stanbol\enhancement-engines\topic\engine\target\surefire-reports for the
individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the
command
[ERROR]   mvn  -rf :org.apache.stanbol.enhancer.engine.topic

The surefire-reports tell me:

---
Test set: org.apache.stanbol.enhancer.engine.topic.TopicEngineTest
---
Tests run: 14, Failures: 1, Errors: 13, Skipped: 0, Time elapsed: 1.172 sec
<<< FAILURE! - in org.apache.stanbol.enhancer.engine.topic.TopicEngineTest
testCrossValidation(org.apache.stanbol.enhancer.engine.topic.TopicEngineTest)
 Time elapsed: 0.625 sec  <<< FAILURE!
java.lang.AssertionError: Default directory is not an absolute path
at sun.nio.fs.WindowsFileSystem.(WindowsFileSystem.java:61)
at
sun.nio.fs.WindowsFileSystemProvider.(WindowsFileSystemProvider.java:53)
at
sun.nio.fs.DefaultFileSystemProvider.create(DefaultFileSystemProvider.java:36)
at
java.nio.file.FileSystems$DefaultFileSystemHolder.getDefaultProvider(FileSystems.java:108)
at
java.nio.file.FileSystems$DefaultFileSystemHolder.access$000(FileSystems.java:89)
at
java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:98)
at
java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:96)
at java.security.AccessController.doPrivileged(Native Method)
at
java.nio.file.FileSystems$DefaultFileSystemHolder.defaultFileSystem(FileSystems.java:96)
at
java.nio.file.FileSystems$DefaultFileSystemHolder.<clinit>(FileSystems.java:90)
at java.nio.file.FileSystems.getDefault(FileSystems.java:176)
at java.io.File.toPath(File.java:2234)
at javax.crypto.JarVerifier.getSystemEntropy(JarVerifier.java:828)
at javax.crypto.JarVerifier.testSignatures(JarVerifier.java:742)
at javax.crypto.JarVerifier.access$400(JarVerifier.java:37)
at javax.crypto.JarVerifier$1.run(JarVerifier.java:222)
at javax.crypto.JarVerifier$1.run(JarVerifier.java:187)
at java.security.AccessController.doPrivileged(Native Method)
at javax.crypto.JarVerifier.(JarVerifier.java:186)
at javax.crypto.JceSecurity.loadPolicies(JceSecurity.java:317)
at javax.crypto.JceSecurity.setupJurisdictionPolicies(JceSecurity.java:262)
at javax.crypto.JceSecurity.access$000(JceSecurity.java:48)
at javax.crypto.JceSecurity$1.run(JceSecurity.java:80)
at java.security.AccessController.doPrivileged(Native Method)
at javax.crypto.JceSecurity.(JceSecurity.java:77)
at javax.crypto.JceSecurityManager.(JceSecurityManager.java:65)
at javax.crypto.Cipher.getConfiguredPermission(Cipher.java:2587)
at javax.crypto.Cipher.getMaxAllowedKeyLength(Cipher.java:2611)
at sun.security.ssl.CipherSuite$BulkCipher.isUnlimited(CipherSuite.java:535)
at sun.security.ssl.CipherSuite$BulkCipher.(CipherSuite.java:507)
at sun.security.ssl.CipherSuite.(CipherSuite.java:614)
at
sun.security.ssl.SSLContextImpl.getApplicableCipherSuiteList(SSLContextImpl.java:293)
at sun.security.ssl.SSLContextImpl.access$100(SSLContextImpl.java:41)
at
sun.security.ssl.SSLContextImpl$AbstractTLSContext.(SSLContextImpl.java:424)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at java.security.Provider$Service.getImplClass(Provider.java:1634)
at java.security.Provider$Service.newInstance(Provider.java:1592)
at sun.security.jca.GetInstance.getInstance(GetInstance.java:236)
at sun.security.jca.GetInstance.getInstance(GetInstance.java:164)
at javax.net.ssl.SSLContext.getInstance(SSLContext.java:156)
at javax.net.ssl.SSLContext.getDefault(SSLContext.java:96)
at 

Re: The Future of Apache Stanbol

2017-02-24 Thread Rafa Haro
Hi,

I won't be available next Monday, but maybe we can organize more than
one meeting :-). Again, for this to make any sense we need @Rupert
to join these meetings. He is, by far, the most experienced Stanbol
developer and the one with the broadest knowledge. Please, @Rupert, raise
your voice man :-)

Cheers,
Rafa

On Fri, Feb 24, 2017 at 3:33 PM Michal Krajňanský <
michal.krajnan...@gmail.com> wrote:

> Hi Mr. Soroka,
>
> Thank you for the quick reply and the call proposal. It will be my
> pleasure to meet you virtually.
>
> The suggested time, Monday 27th at 10AM EST, works well, so you may
> consider it settled. I will also try to involve the managing director of
> our department located in Brno, Czech Republic, Matej Dusik.
>
> I am looking forward to having a fruitful discussion.
>
> Best Regards,
>
> Michal Krajnansky
>
> On Fri, Feb 24, 2017 at 3:22 PM A. Soroka <aj...@virginia.edu> wrote:
>
> > We had a few volunteers to begin learning the codebase to take it
> forward,
> > and willingness from at least some of the current committers to teach,
> but
> > I think we need a bit more organization! :grin:
> >
> > I will make a concrete suggestion. Would it be possible to have a video
> > call (perhaps with Google Hangout) about this on (just throwing out a
> date
> > here) this coming week, Monday 27 February, at 10AM EST?
> >
> > I would be able to attend. I realize that we are a far-flung group, so
> I'm
> > just throwing out that date to get us started. The most important thing,
> of
> > course, is to get as many current committers involved as is practical.
> >
> > ---
> > A. Soroka
> > The University of Virginia Library
> >
> > > On Feb 24, 2017, at 8:50 AM, Michal Krajňanský <
> > michal.krajnan...@gmail.com> wrote:
> > >
> > > Dear Stanbol users,
> > >
> > > I was wondering, if there were any results of the discussion about the
> > > Apache Stanbol future.
> > >
> > > I work for a R team of Konica Minolta Laboratory Europe, and we have
> > been
> > > using Stanbol enhancement pipeline in our prototypes concerning
> > information
> > > extraction from unstructured data.
> > >
> > > We are highly interested in the continuing evolution of the Stanbol
> > > project, and willing to actively support it. Is here anyone who could
> > tell
> > > us, what would be a good way to approach the existing Stanbol
> > stakeholders,
> > > and support the project by active development and possibly via other
> > ways?
> > >
> > >
> > > Michal Krajnansky
> > > Research Specialist Junior, Konica Minolta Laboratory Europe
> > >
> > >
> > >
> > > On Mon, Jan 23, 2017 at 6:14 PM A. Soroka <aj...@virginia.edu> wrote:
> > >
> > >> Perhaps we can start a page for people to put their name down for
> this?
> > I
> > >> couldn't find an Apache wiki site-- is there somewhere the developers
> > think
> > >> appropriate?
> > >>
> > >> ---
> > >> A. Soroka
> > >> Apache Jena / The University of Virginia Library
> > >>
> > >>> On Jan 23, 2017, at 12:08 PM, Aaron Coburn <acob...@amherst.edu>
> > wrote:
> > >>>
> > >>> I would also be very interested.
> > >>>
> > >>> We use the entityhub component quite a bit at our institution, and I
> > >> would be happy to be involved.
> > >>>
> > >>> Aaron Coburn
> > >>>
> > >>>
> > >>>> On Jan 22, 2017, at 12:30 PM, Antero Duarte <a.fduar...@gmail.com>
> > >> wrote:
> > >>>>
> > >>>> I would definitely be up for that!
> > >>>>
> > >>>> On Sat, 21 Jan 2017, 3:33 p.m. A. Soroka, <aj...@virginia.edu>
> wrote:
> > >>>>
> > >>>>> Ditto. If there are other folks who would be interested in
> > >> participating
> > >>>>> in something like this, now would be a good time to raise your
> voice!
> > >>>>>
> > >>>>> ---
> > >>>>> A. Soroka
> > >>>>> The University of Virginia Library
> > >>>>>
> > >>>>>
> > >>>>>> On Jan 21, 2017, at 10:20 AM, Andrew Valencik <and...@affin.io>
> > >> wrote:
> > >>>>>>
> > >>>&

Re: The Future of Apache Stanbol

2017-01-18 Thread Rafa Haro
I wouldn't mind being involved in that, but it would be almost "mandatory"
to contribute some developer documentation as an outcome of those meetings
:-). @Rupert, we especially need you here :-)

On Wed, Jan 18, 2017 at 4:09 PM A. Soroka <aj...@virginia.edu> wrote:

> > I agree that the barrier to contribution is very high. I recall having
> issues with the documentation initially and the only
> > available book on Stanbol was not sufficient.
> >
> > If there were renewed interest in bringing on other developers, I would
> > be interested in investing the time to learn the codebase.
>
> I second this!
>
> Perhaps (I know it's very difficult to organize synchronous time for a
> globally-distributed group but perhaps) we could try to organize a boot
> camp meeting on-line? In other words, those people who (like Andrew and
> myself) would be willing to contribute as part of a larger effort could get
> some virtual time with one or more committers/PMC members to take an
> in-depth tour of the system from the developer point of view and hear about
> the outstanding architectural issues, maybe start to figure out points of
> contribution.
>
> I realize this would make a lot of demands on the committers involved, but
> it might be a way to inject some fuel into the effort.
>
> Just an idea...
>
> ---
> A. Soroka
> The University of Virginia Library
>
> > On Jan 18, 2017, at 9:54 AM, Andrew Valencik <and...@affin.io> wrote:
> >
> > Hello!
> >
> > We use Stanbol in production to annotate text with entities as part of
> some
> > of our data products.
> > We do this via the REST API.
> > Originally we were using the content hub to store all the documents but
> saw
> > higher than expected failures.
> > The entity engines seem a bit more resilient to varying content types /
> > encoding.
> >
> > I agree that the barrier to contribution is very high.
> > I recall having issues with the documentation initially and the only
> > available book on Stanbol was not sufficient.
> >
> > If there were renewed interest in bringing on other developers, I would
> > be interested in investing the time to learn the codebase.
> >
> > Thanks!
> >
> > On Wed, Jan 18, 2017 at 5:53 AM Raffaele Palmieri <
> > raffaele.palmi...@gmail.com> wrote:
> >
> >> Dear community,
> >> we are using version 0.12 with the content hub. I find Stanbol very
> >> flexible for enhancing content, especially unstructured content.
> >> As regards connections with other projects, we have given some
> >> thought to using it with Apache Marmotta and NoSQL backends for big
> >> data scenarios, and also with Apache Manifold to enrich
> >> existing document repositories, which is a common request.
> >> Regards,
> >> Raffaele.
> >>
> >>
> >> 2017-01-16 22:41 GMT+01:00 Antero Duarte <a.fduar...@gmail.com>:
> >>
> >>> Hi there,
> >>> Stanbol is very useful for me! Greatest and easiest tool for us to do
> NLP
> >>> and linked data. Has there been any discussion to move towards a nosql
> >>> storage solution, or is solr still the best thing for us? Also, what
> >> about
> >>> upgrading solr? How much work would be involved in that? Anyway, great
> >>> tool, really hope this doesn't die!!!
> >>>
> >>> Regards,
> >>> Antero
> >>>
> >>> On Mon, 16 Jan 2017, 4:43 p.m. Bertrand Delacretaz, <
> >>> bdelacre...@apache.org>
> >>> wrote:
> >>>
> >>>> On Mon, Jan 16, 2017 at 2:28 PM, Rafa Haro <rh...@apache.org> wrote:
> >>>>> ...I participated
> >>>>> also in the development of the Java client, so I could take the
> >>>>> responsibility of bringing that one as well...
> >>>>
> >>>> FWIW, I won't be involved in decisions about this as I left the
> >>>> Stanbol PMC a while ago - I'm just commenting from a community point
> >>>> of view, as an experienced Apache member.
> >>>>
> >>>> A while ago Stanbol was "larger" and more focused on its core, but as
> >>>> its community becomes smaller (IIUC) it's probably good to bring
> >>>> everyone here, as much as possible, even if it means a slightly less
> >>>> focused codebase. This can also help recruiting more active committers
> >>>> and PMC members by involving them directly here.
> >>>>
> >>>> That might make Stanbol more sustainable, as a community of people who
> >>>> need similar functionality.
> >>>>
> >>>> -Bertrand
> >>>>
> >>>
> >>
> > --
> >
> > Andrew Valencik
> >
> > Data Scientist
> >
> > Affinio <http://www.affinio.com> | Twitter <http://twitter.com/valencik>
> |
> > LinkedIn <https://www.linkedin.com/in/andrew-valencik-472b2aa4>
> >
>
>


Re: The Future of Apache Stanbol

2017-01-16 Thread Rafa Haro
Hi Bertrand,

There were initiatives in the past to bring Stanbol REST API clients into
the codebase, but they were rejected at the time. There is also a Java
client that has been available for a long time under the Apache License.
I will be more than happy to contribute the Python client. I also
participated in the development of the Java client, so I could take
responsibility for bringing that one in as well.

Cheers,
Rafa

On Mon, Jan 16, 2017 at 2:21 PM Bertrand Delacretaz <bdelacre...@apache.org>
wrote:

> Hi,
>
> On Mon, Jan 16, 2017 at 11:48 AM, Rafa Haro <rh...@apache.org> wrote:
> > ...I started to develop one: https://github.com/rafaharo/pystanbol ...
>
> How about bringing this to Apache Stanbol?
>
> As this is a small community, getting it all together here might make
> it more sustainable, as opposed to fragmented efforts.
>
> -Bertrand
>


Re: The Future of Apache Stanbol

2017-01-16 Thread Rafa Haro
Hi Fabian and Devs,

We also use Stanbol widely, both in customer projects and within our main
product. I admit that we should be contributing more than we currently
do, but sometimes it is difficult to find the time to turn something you
have customized for your concrete needs into a more generic contribution.

Apart from that, it seems that Stanbol nowadays covers a couple of
well-known use cases that most end users adopt and where, in my opinion,
further contributions and improvements are unlikely to arise. Also in my
opinion, one reason is the complexity of the code. Another reason, given
that there is very little activity on the list about this, is that the
Stanbol Enhancer features seem to be enough for end users.

I think we are not failing in the objective of making Apache Stanbol a
great tool for developers. We are probably failing, I don't know why, in
the objective of building and maintaining a community.

Those are just my thoughts

On Mon, Jan 16, 2017 at 11:48 AM Rafa Haro <rh...@apache.org> wrote:

> Hi Arthi,
>
> I started to develop one: https://github.com/rafaharo/pystanbol
>
> It only covers Enhancer for now. Contributions are more than welcome.
>
> Rafa
>
> On Mon, Jan 16, 2017 at 11:41 AM <arthi.ven...@wipro.com> wrote:
>
> Hi,
>  Stanbol is a great solution for entity extraction and many NLP problems.
> I have used it for different pilots and customer implementations.
> I also plan to use same in future.
> The community is also great and very helpful.
> If there is a way for non Java programmer  say a Python programmer to more
> easily set up and consume the Rest services more folks can use.
>
> Thanks and Regards,
> Arthi
>
>
>
> -Original Message-
> From: Bertrand Delacretaz [mailto:bdelacre...@apache.org]
> Sent: Monday, January 16, 2017 3:40 PM
> To: dev@stanbol.apache.org
> Subject: Re: The Future of Apache Stanbol
>
> ** This mail has been sent from an external source **
>
> Hi,
>
> On Mon, Jan 16, 2017 at 9:33 AM, Fabian Christ <
> christ.fab...@googlemail.com> wrote:
> > ...Maybe the time for Stanbol is over after 5 years of being a top
> > level Apache project. The ASF has the concept of moving projects to
> > the attic once there is not enough interest or community for a project
> > anymore
>
> To be precise, an ASF project has to move to Attic if there are less than
> 3 active PMC members, which is the minimum required to vote on releases.
>
> Or if the project is unable to respond to security or other critical bug
> reports, due to lack of available contributors.
>
> It's also fine to move to Attic voluntarily if people think the above
> criteria might not be met for much longer, of course - I just wanted to
> clarify the requirements.
>
> -Bertrand
> The information contained in this electronic message and any attachments
> to this message are intended for the exclusive use of the addressee(s) and
> may contain proprietary, confidential or privileged information. If you are
> not the intended recipient, you should not disseminate, distribute or copy
> this e-mail. Please notify the sender immediately and destroy all copies of
> this message and any attachments. WARNING: Computer viruses can be
> transmitted via email. The recipient should check this email and any
> attachments for the presence of viruses. The company accepts no liability
> for any damage caused by any virus transmitted by this email.
> www.wipro.com
>
>


Re: The Future of Apache Stanbol

2017-01-16 Thread Rafa Haro
Hi Arthi,

I started to develop one: https://github.com/rafaharo/pystanbol

It only covers Enhancer for now. Contributions are more than welcome.

Rafa
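
For anyone who would rather not depend on a client library, the Enhancer
can also be called with plain HTTP. The sketch below uses only the Python
standard library. It assumes a Stanbol launcher listening on
http://localhost:8080 with the Enhancer at /enhancer and a JSON response
format; the host, port, Accept value and helper names are illustrative
assumptions, not something confirmed in this thread.

```python
import urllib.request

# Assumed launcher address; adjust it to your own installation.
ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=ENHANCER_URL, accept="application/json"):
    """Build (but do not send) a POST request carrying plain text."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumed response format; other RDF serializations may be offered.
            "Accept": accept,
        },
        method="POST",
    )


def enhance(text, url=ENHANCER_URL, accept="application/json", timeout=30):
    """Send the text to a running Stanbol instance and return the raw body."""
    request = build_enhancer_request(text, url, accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With an instance running, enhance("Paris is the capital of France.")
should return the enhancement results as text; parsing them is left to the
caller.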

On Mon, Jan 16, 2017 at 11:41 AM <arthi.ven...@wipro.com> wrote:

> Hi,
>  Stanbol is a great solution for entity extraction and many NLP problems.
> I have used it for different pilots and customer implementations.
> I also plan to use same in future.
> The community is also great and very helpful.
> If there is a way for non Java programmer  say a Python programmer to more
> easily set up and consume the Rest services more folks can use.
>
> Thanks and Regards,
> Arthi
>
>
>
> -Original Message-
> From: Bertrand Delacretaz [mailto:bdelacre...@apache.org]
> Sent: Monday, January 16, 2017 3:40 PM
> To: dev@stanbol.apache.org
> Subject: Re: The Future of Apache Stanbol
>
> ** This mail has been sent from an external source **
>
> Hi,
>
> On Mon, Jan 16, 2017 at 9:33 AM, Fabian Christ <
> christ.fab...@googlemail.com> wrote:
> > ...Maybe the time for Stanbol is over after 5 years of being a top
> > level Apache project. The ASF has the concept of moving projects to
> > the attic once there is not enough interest or community for a project
> > anymore
>
> To be precise, an ASF project has to move to Attic if there are less than
> 3 active PMC members, which is the minimum required to vote on releases.
>
> Or if the project is unable to respond to security or other critical bug
> reports, due to lack of available contributors.
>
> It's also fine to move to Attic voluntarily if people think the above
> criteria might not be met for much longer, of course - I just wanted to
> clarify the requirements.
>
> -Bertrand
>


Re: [VOTE] Release Apache Stanbol 1.0.0 RC0

2016-11-08 Thread Rafa Haro
Hi Ian,

Yes, it is already uploaded to Apache's distribution repo. I just have not
had the time to update the webpage yet. I will do it during the day and
let you know here. In any case, we don't release binary distributions yet;
that is still under discussion due to the special nature of the Stanbol
launchers. Therefore, I'm afraid you would still need to build the project,
but we can help you with that. To start with, all you need to set up is
Java 8 and Maven 3.x.

Regards
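
As a rough pre-flight check before building, the Java major version can be
read from the text that `java -version` prints. The sketch below only
parses such a string; the function name is hypothetical, and the pattern
assumes the usual `version "..."` banner format.

```python
import re


def java_major_version(version_output):
    """Return the Java major version found in a version banner, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_121" means Java 8; newer scheme: "11.0.2" means 11.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first
```

A banner such as `java version "1.8.0_121"` yields 8, which matches the
Java 8 requirement mentioned above.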

On Tue, Nov 8, 2016 at 10:42 AM Stewart, Ian <ian.stew...@kcl.ac.uk> wrote:

> Hi. Did this get released? Will the Apache stanbol homepage be updated? I
> am not a developer, and have build problems with both 0.12 and this build,
> so I am rather hoping that a pre-compiled version will be released.
>
> On 2016-10-17 09:53 (+0100), Fabian Christ wrote:
> > Hi,
> >
> > thanks for the release. To be honest, I did not check it because of lack
> > of time. Just one remark for future releases.
> >
> > We do not vote on SVN tags as these may change. The vote is binding
> > for the release source package that is prepared by the release
> > maintainer and archived in the Apache archives. This and only this
> > package is the reference for the release.
> >
> > The fact that there is an SVN tag to the same sources is nice but
> > technically not binding or necessary.
> >
> > Best,
> > Fabian
> >
> >
> > 2016-09-26 8:24 GMT+02:00 Andreas Kuckartz :
> > > I am having problems building the package. I will investigate and report
> > > before tomorrow.
> > >
> > > Cheers,
> > > Andreas
> > > ---
> > >
> > >
> > > Rafa Haro wrote:
> > >> Hi devs,
> > >>
> > >> Please vote on whether to release Apache Stanbol 1.0.0 RC0. This is the
> > >> first 1.x.x release and the first release since version 0.12 (more than 2
> > >> years ago). Therefore, it is not easy to summarize all the changes since
> > >> then. Please refer to https://issues.apache.org/jira/browse/STANBOL for an
> > >> exhaustive list of issues fixed in this version.
> > >>
> > >> The release source code can be found at the following tag:
> > >>
> > >> http://svn.apache.org/repos/asf/stanbol/tags/apache-stanbol-1.0.0/
> > >>
> > >> The release includes the complete Apache Stanbol stack with all components.
> > >> The release artifacts are staged at:
> > >>
> > >> https://repository.apache.org/content/repositories/orgapachestanbol-1009/
> > >>
> > >> and the source packages here:
> > >>
> > >> https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/
> > >>
> > >> You can check the staged Maven artifacts using the script in 'releasing'
> > >> ./check_staged_release.sh 1009 [tmp-directory]
> > >>
> > >> PGP release signing keys are available at:
> > >>
> > >> https://people.apache.org/keys/group/stanbol.asc
> > >>
> > >> The vote will be open for 72 hours
> > >>
> >
> >
> >
> > --
> > Fabian
> > http://twitter.com/fctwitt
> >
>
>


Re: [VOTE] Release Apache Stanbol 1.0.0 RC0

2016-09-23 Thread Rafa Haro
Three positive votes and more than 72 hours: we can consider the release
accepted.

Thanks guys. I will proceed with the pertinent actions and will make an
official announcement.

Cheers,
Rafa
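
Editor's sketch of the acceptance check applied above. It assumes the usual
ASF convention for release votes (at least three binding +1 votes, more
binding +1 than -1, and a voting window that is customarily at least 72
hours). The class name, method name and timestamps are illustrative only and
are not part of any Stanbol or ASF tooling.

```java
import java.time.Duration;
import java.time.Instant;

class VoteTally {
    // Assumed thresholds, following the common ASF release-vote convention.
    static final int MIN_BINDING_PLUS_ONES = 3;
    static final Duration MIN_OPEN = Duration.ofHours(72);

    /** Returns true if the vote has enough binding +1s and was open long enough. */
    static boolean passes(int bindingPlus, int bindingMinus, Instant opened, Instant closed) {
        boolean enoughVotes = bindingPlus >= MIN_BINDING_PLUS_ONES && bindingPlus > bindingMinus;
        boolean openLongEnough = Duration.between(opened, closed).compareTo(MIN_OPEN) >= 0;
        return enoughVotes && openLongEnough;
    }

    public static void main(String[] args) {
        // Illustrative timestamps only; the exact time zone of the vote mail is unknown.
        Instant opened = Instant.parse("2016-09-16T16:38:00Z");
        Instant closed = Instant.parse("2016-09-23T00:00:00Z");
        System.out.println("Vote passes: " + passes(3, 0, opened, closed));
    }
}
```

Non-binding votes are welcome and useful as test reports, but they would not
be counted in the first argument of this check.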

On Thu, Sep 22, 2016 at 8:41 AM Cristian Petroaca <
cristian.petro...@gmail.com> wrote:

> Tested without running junits (due to my aforementioned junit failure) on
> Windows 7 with Java 1.8.0_77-b03. All fine.
>
> +1 from me
>
> On Thu, Sep 22, 2016 at 2:23 AM, Aaron Coburn <acob...@amherst.edu> wrote:
>
> > Hello,
> >
> > +1 (non-binding)
> >
> > I was able to successfully build and deploy the war files for Stanbol
> > 1.0.0. Some simple testing shows everything working properly. I was also
> > able to run the genericrdf indexer over a custom vocabulary (80 million
> > triples) with success.
> >
> > I look forward to working more with this release!
> >
> > Thanks everyone.
> > Aaron Coburn
> >
> >
> > > On Sep 21, 2016, at 3:45 PM, A. Soroka <aj...@virginia.edu> wrote:
> > >
> > > Thank you for moving the release process forward!
> > >
> > > https://github.com/apache/stanbol/pull/5
> > >
> > > ---
> > > A. Soroka
> > > The University of Virginia Library
> > >
> > >> On Sep 21, 2016, at 3:27 PM, Rafa Haro <rh...@apache.org> wrote:
> > >>
> > >> Thanks a lot Soroka!
> > >> On Wed, Sep 21, 2016 at 9:00 PM, A. Soroka <aj...@virginia.edu> wrote:
> > >>
> > >>> I was able to build the same release code on OS X 10.10.5 using Java
> > >>> 1.8.0_40 via mvn clean install. As a simple test, I was able to use
> the
> > >>> EntityHub Generic indexer to index a medium-sized vocabulary of
> > interest to
> > >>> my site without difficulty. I did notice that the default config for
> > the
> > >>> Generic indexer uses the namespace prefix bio: for
> > >>> http://vocab.org/bio/0.1/, which is apparently no longer an included
> > >>> preset. This throws a warning during operation [1]. Also, the usage
> > note
> > >>> for the Generic indexer refers to
> > >>> "org.apache.stanbol.indexing.core-*-jar-with-dependencies.jar" when
> > in fact
> > >>> the artifact is now called
> > >>> "org.apache.stanbol.entityhub.indexing.genericrdf-*.jar". I can send
> > a PR
> > >>> for these minor annoyances if that would be useful.
> > >>>
> > >>>
> > >>> ---
> > >>> A. Soroka
> > >>> The University of Virginia Library
> > >>>
> > >>> [1] E.g.
> > >>>
> > >>> 14:40:46,767 [main] WARN  mapping.FieldMappingUtils - Unable to parse
> > >>> fieldMapping because of unknown namespace prefix
> > >>> java.lang.IllegalArgumentException: The prefix 'bio' is unknown (not
> > >>> mapped to an namespace) by the Stanbol Namespace Prefix Mapping
> > Service.
> > >>> Please change the configuration to use the full URI instead of
> 'bio:*'!
> > >>>
> > >>>
> > >>>
> > >>>> On Sep 20, 2016, at 2:07 PM, steve reinders <steve...@gmail.com>
> > wrote:
> > >>>>
> > >>>> All,
> > >>>>
> > >>>> - I downloaded from
> > >>> https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/
> > >>>> to OSX 10.11.6 ( 9:00 PM US CST Mon Sep 19 )
> > >>>> - using 1.8.0_25 ( oracle )
> > >>>> - built w/mvn install
> > >>>> - only problem was CELI license
> > >>>> - produced org.apache.stanbol.launchers.full-1.0.0.jar
> > >>>>
> > >>>> TopicClassifier looks to have been built fine.
> > >>>>
> > >>>> Is this the source's correct source ?
> > >>>>
> > >>>> BTW is the CMS REST interface in ? Can't tell easily in Jira and I
> > knew
> > >>> it
> > >>>> was pulled in earlier version.
> > >>>>
> > >>>> Thanks
> > >>>>
> > >>>> Steve
> > >>>>
> > >>>>
> > >>>> On Tue, Sep 20, 2016 at 12:00 PM, Rafa Haro <rh...@apache.org>
> wrote:
> > >>>>
> > >>>>> Hi Cristian,
> > >>>>>

Re: [VOTE] Release Apache Stanbol 1.0.0 RC0

2016-09-21 Thread Rafa Haro
Thanks a lot Soroka!
On Wed, Sep 21, 2016 at 9:00 PM, A. Soroka <aj...@virginia.edu> wrote:

> I was able to build the same release code on OS X 10.10.5 using Java
> 1.8.0_40 via mvn clean install. As a simple test, I was able to use the
> EntityHub Generic indexer to index a medium-sized vocabulary of interest to
> my site without difficulty. I did notice that the default config for the
> Generic indexer uses the namespace prefix bio: for
> http://vocab.org/bio/0.1/, which is apparently no longer an included
> preset. This throws a warning during operation [1]. Also, the usage note
> for the Generic indexer refers to
> "org.apache.stanbol.indexing.core-*-jar-with-dependencies.jar" when in fact
> the artifact is now called
> "org.apache.stanbol.entityhub.indexing.genericrdf-*.jar". I can send a PR
> for these minor annoyances if that would be useful.
>
>
> ---
> A. Soroka
> The University of Virginia Library
>
> [1] E.g.
>
> 14:40:46,767 [main] WARN  mapping.FieldMappingUtils - Unable to parse
> fieldMapping because of unknown namespace prefix
> java.lang.IllegalArgumentException: The prefix 'bio' is unknown (not
> mapped to an namespace) by the Stanbol Namespace Prefix Mapping Service.
> Please change the configuration to use the full URI instead of 'bio:*'!
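
Editor's sketch of what the warning above is asking for: a field mapping
written as 'bio:*' can only be resolved if the prefix is known, otherwise the
configuration has to spell out the full URI. The class below is a
hypothetical stand-in and is NOT the Stanbol Namespace Prefix Mapping Service
API; only the namespace http://vocab.org/bio/0.1/ is taken from the message.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixExpander {
    // Hypothetical in-memory prefix table; the real service is configured elsewhere.
    private final Map<String, String> prefixes = new HashMap<>();

    void register(String prefix, String namespace) {
        prefixes.put(prefix, namespace);
    }

    /** Expands "prefix:local" to a full URI; full URIs are returned unchanged. */
    String expand(String name) {
        if (name.contains("://")) {
            return name; // already a full URI, nothing to resolve
        }
        int colon = name.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a prefixed name: " + name);
        }
        String prefix = name.substring(0, colon);
        String namespace = prefixes.get(prefix);
        if (namespace == null) {
            // Mirrors the situation reported in the log: the prefix is not mapped.
            throw new IllegalArgumentException("The prefix '" + prefix + "' is unknown");
        }
        return namespace + name.substring(colon + 1);
    }

    public static void main(String[] args) {
        PrefixExpander expander = new PrefixExpander();
        expander.register("bio", "http://vocab.org/bio/0.1/");
        System.out.println(expander.expand("bio:event"));
    }
}
```

With the prefix missing from the table, the same call fails, which is why
writing the full URI in the indexer configuration avoids the warning.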
>
>
>
> > On Sep 20, 2016, at 2:07 PM, steve reinders <steve...@gmail.com> wrote:
> >
> > All,
> >
> > - I downloaded from
> https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/
> > to OSX 10.11.6 ( 9:00 PM US CST Mon Sep 19 )
> > - using 1.8.0_25 ( oracle )
> > - built w/mvn install
> > - only problem was CELI license
> > - produced org.apache.stanbol.launchers.full-1.0.0.jar
> >
> > TopicClassifier looks to have been built fine.
> >
> > Is this the source's correct source ?
> >
> > BTW is the CMS REST interface in ? Can't tell easily in Jira and I knew
> it
> > was pulled in earlier version.
> >
> > Thanks
> >
> > Steve
> >
> >
> > On Tue, Sep 20, 2016 at 12:00 PM, Rafa Haro <rh...@apache.org> wrote:
> >
> >> Hi Cristian,
> >>
> >> Apparently the Topic Annotation engine is only compiling with OpenJDK
> >> 1.8.x. As far as I know, that code has remained untouched since long
> time
> >> ago, but we should probably remove that dependency (although I don't
> see it
> >> a problem for not going ahead with the release).
> >>
> >> By the way, I have checked the artifacts and signatures, built also from
> >> source without problems
> >>
> >> Therefore +1 for me
> >>
> >> Cheers,
> >> Rafa
> >>
> >> On Mon, Sep 19, 2016 at 10:37 PM Rafa Haro <rh...@apache.org> wrote:
> >>
> >>> Hi Cristian,
> >>>
> >>> I build it directly from the SVN tag and didn't have any problem. I
> will
> >>> check tomorrow the source packages
> >>>
> >>> Thanks,
> >>> Rafa
> >>> On Mon, Sep 19, 2016 at 10:21 PM, Cristian Petroaca <
> >>> cristian.petro...@gmail.com> wrote:
> >>>
> >>>> Hi guys,
> >>>>
> >>>> I downloaded the sources from here https://dist.apache.org/repos/
> >>>> dist/dev/stanbol/1.0.0/
> >>>> <https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/> and did a
> "mvn
> >>>> install" but some tests failed with:
> >>>> Tests run: 7, Failures: 0, Errors: 5, Skipped: 0, Time elapsed: 4.33
> sec
> >>>> <<< FAILURE! - in org.apache.stanbol.enhancer.
> >> engine.topic.TopicEngineTest
> >>>>
> >>>> testCrossValidation(org.apache.stanbol.enhancer.
> >> engine.topic.TopicEngineTest)
> >>>> Time elapsed: 2.108 sec  <<< ERROR!
> >>>> java.lang.NoClassDefFoundError: Could not initialize class
> >>>> sun.security.provider.SecureRandom$SeederHolder
> >>>>at
> >>>> sun.security.provider.SecureRandom.engineNextBytes(
> >> SecureRandom.java:221)
> >>>>at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
> >>>>at java.util.UUID.randomUUID(UUID.java:145)
> >>>>at
> >>>>
> >>>> org.apache.stanbol.enhancer.engine.topic.TopicClassificationEngine.
> >> addConcept(TopicClassificationEngine.java:790)
> >>>>at
> >>>>
> >>>> org.apache.stanbol.enhancer.engine

Re: [VOTE] Release Apache Stanbol 1.0.0 RC0

2016-09-20 Thread Rafa Haro
Hi Cristian,

Apparently the Topic Annotation engine only compiles with OpenJDK 1.8.x. As
far as I know, that code has remained untouched for a long time, but we
should probably remove that dependency (although I don't see it as a reason
not to go ahead with the release).

By the way, I have checked the artifacts and signatures, and also built from
source without problems.

Therefore +1 for me

Cheers,
Rafa
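
Editor's sketch: a quick way to check whether the failure in the quoted trace
is tied to the JVM setup rather than to Stanbol code. The trace shows
UUID.randomUUID() calling into java.security.SecureRandom, so this small
program exercises the same path outside the test suite. The class name is
illustrative; nothing here is part of the Stanbol code base.

```java
import java.security.SecureRandom;
import java.util.UUID;

class SecureRandomCheck {
    /** Returns true if this JVM can produce a random (version 4) UUID. */
    static boolean canGenerateUuid() {
        try {
            // randomUUID() obtains its bytes from java.security.SecureRandom.
            UUID id = UUID.randomUUID();
            return id.version() == 4;
        } catch (LinkageError e) {
            // NoClassDefFoundError and ExceptionInInitializerError are both LinkageErrors.
            System.err.println("Secure random provider failed to initialize: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        byte[] probe = new byte[16];
        new SecureRandom().nextBytes(probe); // same call that appears in the trace
        System.out.println("UUID generation works: " + canGenerateUuid());
    }
}
```

If this runs cleanly on the same JDK that fails in the build, the cause is
more likely in the test environment than in the JDK itself.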

On Mon, Sep 19, 2016 at 10:37 PM Rafa Haro <rh...@apache.org> wrote:

> Hi Cristian,
>
> I build it directly from the SVN tag and didn't have any problem. I will
> check tomorrow the source packages
>
> Thanks,
> Rafa
> On Mon, Sep 19, 2016 at 10:21 PM, Cristian Petroaca <
> cristian.petro...@gmail.com> wrote:
>
>> Hi guys,
>>
>> I downloaded the sources from here https://dist.apache.org/repos/
>> dist/dev/stanbol/1.0.0/
>> <https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/> and did a "mvn
>> install" but some tests failed with:
>> Tests run: 7, Failures: 0, Errors: 5, Skipped: 0, Time elapsed: 4.33 sec
>> <<< FAILURE! - in org.apache.stanbol.enhancer.engine.topic.TopicEngineTest
>>
>> testCrossValidation(org.apache.stanbol.enhancer.engine.topic.TopicEngineTest)
>>  Time elapsed: 2.108 sec  <<< ERROR!
>> java.lang.NoClassDefFoundError: Could not initialize class
>> sun.security.provider.SecureRandom$SeederHolder
>> at
>> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:221)
>> at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
>> at java.util.UUID.randomUUID(UUID.java:145)
>> at
>>
>> org.apache.stanbol.enhancer.engine.topic.TopicClassificationEngine.addConcept(TopicClassificationEngine.java:790)
>> at
>>
>> org.apache.stanbol.enhancer.engine.topic.TopicClassificationEngine.addConcept(TopicClassificationEngine.java:825)
>> at
>>
>> org.apache.stanbol.enhancer.engine.topic.TopicEngineTest.initArtificialTrainingSet(TopicEngineTest.java:537)
>> at
>>
>> org.apache.stanbol.enhancer.engine.topic.TopicEngineTest.testCrossValidation(TopicEngineTest.java:475)
>>
>> Java version:
>> java version "1.8.0_77"
>> Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
>> Java HotSpot(TM) 64-Bit Server VM (build 25.77-b03, mixed mode)
>>
>>
>> My java version is the Oracle one, not OpenJDK.
>> I looked that class up, SecureRandom, but I can't find it in Oracle's
>> documentation in that package but rather here:
>> https://docs.oracle.com/javase/7/docs/api/java/security/SecureRandom.html
>>
>> Not sure if this is the problem.
>>
>> Cristian
>>
>> On Sat, Sep 17, 2016 at 1:55 PM, Antonio David Pérez Morales <
>> adperezmora...@gmail.com> wrote:
>>
>> > Tested
>> >
>> > +1 for me
>> >
>> > Regards
>> >
>> > On Sep 16, 2016, 6:38 PM, "Rafa Haro" <rh...@apache.org> wrote:
>> >
>> > > Hi devs,
>> > >
>> > > Please vote on whether to release Apache Stanbol 1.0.0 RC0. This is the
>> > > first 1.x.x release and the first release since version 0.12 (more
>> than 2
>> > > years ago). Therefore, it is not easy to summarize all the changes
>> since
>> > > then. Please refer to https://issues.apache.org/jira/browse/STANBOL
>> for
>> > an
>> > > exhaustive list of issues fixed in this version.
>> > >
>> > > The release source code can be found at the following tag:
>> > >
>> > > http://svn.apache.org/repos/asf/stanbol/tags/apache-stanbol-1.0.0/
>> > >
>> > > The release includes the complete Apache Stanbol stack with all
>> > components.
>> > > The release artifacts are staged at:
>> > >
>> > > https://repository.apache.org/content/repositories/
>> > orgapachestanbol-1009/
>> > >
>> > > and the source packages here:
>> > >
>> > > https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/
>> > >
>> > > You can check the staged Maven artifacts using the script in
>> 'releasing'
>> > > ./check_staged_release.sh 1009 [tmp-directory]
>> > >
>> > > PGP release signing keys are available at:
>> > >
>> > > https://people.apache.org/keys/group/stanbol.asc
>> > >
>> > > The vote will be open for 72 hours
>> > >
>> >
>>
>


Re: Allowing specific modules only

2016-09-20 Thread Rafa Haro
Hi Yauhen,

I'm not sure I completely understand. Could you please elaborate a bit on
your question?

Thanks,
Rafa

On Tue, Sep 20, 2016 at 2:17 PM klim klim  wrote:

> Hi dear dev-team of Stanbol,
>
> 1) Could you give me advice on how to build the project with only
> particular bundles included (in my case I need only enhancer and entityhub)?
> 2) Is there a way to configure Stanbol using only property files? At the
> moment I have to copy them from an already configured instance, and the files
> are rewritten on every start. Again, I'd be eager to know at least for the two
> modules (entityhub and enhancer).
>
> 3) Let me give my vote on 1.0.0 later, as I still sometimes face issues
> downloading third-party stuff.
>
>
> Thank you
>
> Best,
> — Yauhen


Re: [VOTE] Release Apache Stanbol 1.0.0 RC0

2016-09-19 Thread Rafa Haro
Hi Cristian,

I built it directly from the SVN tag and didn't have any problems. I will
check the source packages tomorrow.

Thanks,
Rafa
On Mon, Sep 19, 2016 at 10:21 PM, Cristian Petroaca <
cristian.petro...@gmail.com> wrote:

> Hi guys,
>
> I downloaded the sources from here https://dist.apache.org/repos/
> dist/dev/stanbol/1.0.0/
> <https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/> and did a "mvn
> install" but some tests failed with:
> Tests run: 7, Failures: 0, Errors: 5, Skipped: 0, Time elapsed: 4.33 sec
> <<< FAILURE! - in org.apache.stanbol.enhancer.engine.topic.TopicEngineTest
>
> testCrossValidation(org.apache.stanbol.enhancer.engine.topic.TopicEngineTest)
>  Time elapsed: 2.108 sec  <<< ERROR!
> java.lang.NoClassDefFoundError: Could not initialize class
> sun.security.provider.SecureRandom$SeederHolder
> at
> sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:221)
> at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
> at java.util.UUID.randomUUID(UUID.java:145)
> at
>
> org.apache.stanbol.enhancer.engine.topic.TopicClassificationEngine.addConcept(TopicClassificationEngine.java:790)
> at
>
> org.apache.stanbol.enhancer.engine.topic.TopicClassificationEngine.addConcept(TopicClassificationEngine.java:825)
> at
>
> org.apache.stanbol.enhancer.engine.topic.TopicEngineTest.initArtificialTrainingSet(TopicEngineTest.java:537)
> at
>
> org.apache.stanbol.enhancer.engine.topic.TopicEngineTest.testCrossValidation(TopicEngineTest.java:475)
>
> Java version:
> java version "1.8.0_77"
> Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 25.77-b03, mixed mode)
>
>
> My java version is the Oracle one, not OpenJDK.
> I looked that class up, SecureRandom, but I can't find it in Oracle's
> documentation in that package but rather here:
> https://docs.oracle.com/javase/7/docs/api/java/security/SecureRandom.html
>
> Not sure if this is the problem.
>
> Cristian
>
> On Sat, Sep 17, 2016 at 1:55 PM, Antonio David Pérez Morales <
> adperezmora...@gmail.com> wrote:
>
> > Tested
> >
> > +1 for me
> >
> > Regards
> >
> > On Sep 16, 2016, 6:38 PM, "Rafa Haro" <rh...@apache.org> wrote:
> >
> > > Hi devs,
> > >
> > > Please vote on whether to release Apache Stanbol 1.0.0 RC0. This is the
> > > first 1.x.x release and the first release since version 0.12 (more
> than 2
> > > years ago). Therefore, it is not easy to summarize all the changes
> since
> > > then. Please refer to https://issues.apache.org/jira/browse/STANBOL
> for
> > an
> > > exhaustive list of issues fixed in this version.
> > >
> > > The release source code can be found at the following tag:
> > >
> > > http://svn.apache.org/repos/asf/stanbol/tags/apache-stanbol-1.0.0/
> > >
> > > The release includes the complete Apache Stanbol stack with all
> > components.
> > > The release artifacts are staged at:
> > >
> > > https://repository.apache.org/content/repositories/
> > orgapachestanbol-1009/
> > >
> > > and the source packages here:
> > >
> > > https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/
> > >
> > > You can check the staged Maven artifacts using the script in
> 'releasing'
> > > ./check_staged_release.sh 1009 [tmp-directory]
> > >
> > > PGP release signing keys are available at:
> > >
> > > https://people.apache.org/keys/group/stanbol.asc
> > >
> > > The vote will be open for 72 hours
> > >
> >
>


[VOTE] Release Apache Stanbol 1.0.0 RC0

2016-09-16 Thread Rafa Haro
Hi devs,

Please vote on whether to release Apache Stanbol 1.0.0 RC0. This is the
first 1.x.x release and the first release since version 0.12 (more than 2
years ago). Therefore, it is not easy to summarize all the changes since
then. Please refer to https://issues.apache.org/jira/browse/STANBOL for an
exhaustive list of issues fixed in this version.

The release source code can be found at the following tag:

http://svn.apache.org/repos/asf/stanbol/tags/apache-stanbol-1.0.0/

The release includes the complete Apache Stanbol stack with all components.
The release artifacts are staged at:

https://repository.apache.org/content/repositories/orgapachestanbol-1009/

and the source packages here:

https://dist.apache.org/repos/dist/dev/stanbol/1.0.0/

You can check the staged Maven artifacts using the script in 'releasing'
./check_staged_release.sh 1009 [tmp-directory]

PGP release signing keys are available at:

https://people.apache.org/keys/group/stanbol.asc

The vote will be open for 72 hours
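
Editor's sketch for reviewers of the source package: computing a digest of the
downloaded archive so it can be compared with a published checksum. This
complements, and does not replace, verifying the PGP signature against the
keys listed above. The class name and the choice of SHA-256 are assumptions
made for illustration; use whichever checksum is actually published next to
the archive, and pass the real file path as the first argument.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ReleaseDigest {
    /** Reads the stream to the end and returns its digest as lowercase hex. */
    static String hexDigest(InputStream in, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported digest: " + algorithm, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the downloaded source archive (supplied by the caller).
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(hexDigest(in, "SHA-256"));
        }
    }
}
```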


Re: Welcome our new PMC members Cristian and Rafa

2016-08-02 Thread Rafa Haro
Thanks a lot guys!


On Mon, Aug 1, 2016 at 4:35 PM, Tommaso Teofili <
tommaso.teof...@gmail.com> wrote:

> Welcome onboard Cristian and Rafa!
>
> Regards,
> Tommaso
>
> On Mon, Aug 1, 2016 at 3:27 PM, Rupert Westenthaler <
> rupert.westentha...@gmail.com> wrote:
>
> > A hearty welcome to Rafa and Cristian.
> >
> > best
> > Rupert
> >
> > On Fri, Jul 29, 2016 at 8:48 AM, Dileepa Jayakody <djayak...@zaizi.com>
> > wrote:
> > > Congratulations Rafa and Cristian!
> > >
> > > On Thu, Jul 28, 2016 at 12:24 PM, Fabian Christ <fchr...@apache.org>
> > wrote:
> > >
> > >> Dear Stanbolers,
> > >>
> > >> the official records have been updated. Please welcome our new Stanbol
> > >> PMC members:
> > >>
> > >> * Cristian Petroaca
> > >> * Rafa Haro
> > >>
> > >> We are glad to have them in our PMC and are looking forward to get
> > >> Stanbol back on a more stable ground with a special focus on releases.
> > >>
> > >> Best
> > >> Fabian
> > >>
> > >
> >
> >
> >
> > --
> > | Rupert Westenthaler rupert.westentha...@gmail.com
> > | Bodenlehenstraße 11  ++43-699-11108907
> > | A-5500 Bischofshofen
> > | REDLINK.CO
> >
> > | http://redlink.co/
> >
>


Re: Get AnalyzedText Content Part via REST

2016-07-19 Thread Rafa Haro
Hi,

As far as I know, the NLP stack within Stanbol currently doesn't include
any of that information in the Enhancement structure (the output), probably
because it could be large and verbose. In my opinion, a request flag for
asking for it would be nice to have, but it is not implemented right now.
In practice, the NLP stage is used as a necessary step for further analysis
engines.

Cheers,
Rafa
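
Editor's sketch of the single-engine request described in the quoted message:
a POST of plain text to the per-engine endpoint with the
"Accept: multipart/form-data" header. The URL pattern, header and port are
taken from that message; the class name, method names and sample text are
illustrative. As explained above, the response will not carry the
AnalysedText part, so this only shows how one call is made, not a way to
chain NLP engines over REST.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class EngineRequest {
    /** Builds the per-engine URL in the form used in the quoted message. */
    static String engineUrl(String baseUrl, String engineName) {
        return baseUrl + "/enhancer/engine/" + engineName + "?outputContent=*/*";
    }

    /** Posts plain text to the given URL and returns the HTTP status code. */
    static int postText(String url, String text) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
        con.setRequestProperty("Accept", "multipart/form-data");
        try (OutputStream out = con.getOutputStream()) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return con.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        // Assumes a Stanbol launcher is running locally on port 8080.
        String url = engineUrl("http://localhost:8080", "langdetect");
        System.out.println(url + " -> HTTP " + postText(url, "Paris is the capital of France."));
    }
}
```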

On Tue, Jul 19, 2016 at 10:18 AM mzl  wrote:

> Hello,
>
> how do I get the AnalysedText content part when using the REST API?
>
> I'm trying to execute an enhancement chain via the REST API by calling
> each enhancement engine with the result of the preceding one. This works
> for the langdetect engine, but when I call some other engines, like
> the opennlp-sentence engine, the result seems to contain no additional
> information. The error log is clean:
>   "19.07.2016 10:03:51.190 *INFO* [qtp298757786-36]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> Execution of Chain opennlp-sentenceChain finished after 3ms for
> ContentItem
> 
> 19.07.2016 10:03:51.190 *INFO* [qtp298757786-36]
> org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager >
> processed ContentItem
>  with
> Chain 'opennlp-sentenceChain' in 2ms | chain:[opennlp-sentence: 2ms
> (100%)], concurrency: 1.0 (0%)"
> But the result seems to not contain the AnalysedText part which causes
> errors in later steps.
>
>
>
> The engines I try to call:
>   1. Tika
>   2. langdetect
>   3. opennlp-sentence
>   4. opennlp-token
>   5. opennlp-pos
>   6. opennlp-chunker
>
>
>
> What I did so far:
>   - Get Stanbol code from https://svn.apache.org/repos/asf/stanbol/trunk/
>   - Build Stanbol with mvn clean install -DskipTests
>   - Start Stanbol with java -Xmx1g -jar
> ./org.apache.stanbol.launchers.stable-1.0.0-SNAPSHOT.jar
>   - Use Firefox HttpRequester Plugin to send a request to the
> langdetect-Engine at
>   http://localhost:8080/enhancer/engine/langdetect?outputContent=*/*
> with "Accept multipart/form-data" Header
>   - Received enhanced content (see below)
>   - Send this with content type "multipart/form-data; charset=UTF-8;
> boundary=contentItem-U9u25OIBks0JM-j1GP" to
> http://localhost:8080/enhancer/engine/opennlp-pos?outputContent=*/*
>   - Received the response shown below
>   - Send the response to the opennlp-chunker at
> "http://localhost:8080/enhancer/engine/opennlp-chunker?outputContent=*/*"
> with "multipart/form-data; charset=UTF-8;
> boundary=contentItem-KFLGIdIWg8rZZ7AF_"
>   - Receive response with status 200 OK and content equal to the content
> of the request
>   - error log:
>  > 19.07.2016 10:14:24.360 *WARN* [Thread-9]
> org.apache.stanbol.enhancer.nlp.utils.NlpEngineHelper The Enhancement
> Engine 'opennlp-chunker (impl: OpenNlpChunkingEngine)' CAN NOT enhance
> ContentItem InMemoryContentItem
> uri=[urn:content-item-sha1-ccfad800c413a3ba0297c202badb0eaebb4a57ce],
> content=[size:204 bytes;;mime-type:text/plain], metadata=[8 triples],
>
> parts=[,
> ,
> <
> http://stanbol.apache.org/ontology/enhancer/executionmetadata#ChainExecution
> >]
> because the AnalysedText ContentPart is missing. Users might want to add
> an EnhancementEngine that creates the AnalysedText ContentPart such as
> the POSTaggingEngine (o.a.stanbol.enhancer.engines.opennlp.pos)!
>
>
>
>
> ###
> Content from langdetect:
>
> ###
>
> --contentItem-U9u25OIBks0JM-j1GP
> Content-Disposition: form-data; name="metadata";
> filename="urn:content-item-sha1-ccfad800c413a3ba0297c202badb0eaebb4a57ce"
> Content-Type: application/ld+json; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> [ {
>   "@id" : "urn:enhancement-25ded47b-bdd2-cef8-abee-7ea258c03390",
>   "http://fise.iks-project.eu/ontology/confidence" : [ {
>     "@value" : "0.959994881431",
>     "@type" : "http://www.w3.org/2001/XMLSchema#double"
>   } ],
>   "http://fise.iks-project.eu/ontology/extracted-from" : [ {
>     "@id" : "urn:content-item-sha1-ccfad800c413a3ba0297c202badb0eaebb4a57ce"
>   } ],
>   "http://purl.org/dc/terms/created" : [ {
>     "@value" : "2016-07-14T15:12:09.795Z",
>     "@type" : "http://www.w3.org/2001/XMLSchema#dateTime"
>   } ],
>   "http://purl.org/dc/terms/creator" : [ {
>     "@value" : "org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine"
>   } ],
>   "http://purl.org/dc/terms/language" : [ {
>     "@value" : "en"
>   } ],
>   "http://purl.org/dc/terms/type" : [ {
>     "@id" : "http://purl.org/dc/terms/LinguisticSystem"
>   } ],
>   "@type" : [ "http://fise.iks-project.eu/ontology/Enhancement",
>     "http://fise.iks-project.eu/ontology/TextAnnotation" ]
> } ]
> --contentItem-U9u25OIBks0JM-j1GP
> Content-Disposition: form-data; name="content"
> Content-Type: 

Re: Releasing Plan

2016-06-24 Thread Rafa Haro
Hi Fabian. Let me go through those links in detail. I was in charge once of
releasing Apache ManifoldCF so I will try to remember how I did it as well.
In Stanbol I suppose that the process should be similar

Thanks,
Rafa
On Fri, Jun 24, 2016 at 11:14 AM, Fabian Christ <
christ.fab...@googlemail.com> wrote:

> Hi,
>
> from my perspective there is not much to hand over. The trick is to
> generate release bundles, upload them to the Apache Nexus instance and
> call a vote.
>
> Rafa, are you willing to become a release manager for Stanbol? That
> would be great! You would need access to https://repository.apache.org
> - as this is the staging repository.
>
> The challenge in Stanbol was the overwhelming number of components
> that have to be released. This can be tricky, and handling the Maven
> Release plugin can become a pain in itself. I have not used the
> toolchain since the last release, but in theory it should work as
> described here [1,2].
>
> For each release you could decide to release each component for itself
> or release a big TAR ball that has to be generated somehow and
> contains everything to be released. Then we have only one large
> artifact to vote on.
>
> [1] http://stanbol.apache.org/development/release-management.html
> [2] http://www.apache.org/dev/release-publishing.html
>
> Best
> Fabian
>
> 2016-06-16 17:57 GMT+02:00 Rafa Haro <rh...@apache.org>:
> > Hi Reto
> >
> > On Thu, Jun 16, 2016 at 4:47 PM Reto Gmür <r...@apache.org> wrote:
> >
> >> Hi Rafa
> >>
> >> There was a delay between the discussion on the list and me committing
> >> the changes. The reason is also that some little issues had to be fixed
> >> in clerezza so that all of Stanbol works.
> >>
> >> But the problem is not the change between earlier SNAPSHOT version and
> >> the release but the lack of release. Developing against a SNAPSHOT
> >> version should be the exception, the whole apache release and licensing
> >> process depends on artifacts actually being released if people just rely
> >> on the subversion code rather than releases this misses a very important
> >> aspect of the Apache way and licensing.
> >>
> >
> > I totally agree with you and, at the end, I was saying exactly the same
> in
> > my first email. I'm using SNAPSHOT because the only released version I
> > could use was 0.12 and I needed to work with 1.X. So yeah, the problem is
> > we are not releasing because current trunk was probably ready for a first
> > 1.0.0 release.
> >
> > Apart from that, being aware of that situation, I would have probably
> > notified such a change in this mailing list, at least to give us the
> chance
> > to locally or in a fork tag a version and not force us to adapt our
> private
> > code. But, of course, the real problem is the lack of releasing since too
> > much time ago.
> >
> > Let's try to fix this situation then. Is anyone apart from Fabian familiar
> > with the release process?
> >
> > I'm more than happy to learn and help
> >
> > Cheers,
> > Rafa
> >
> >
> >>
> >> So the problem is that we're not releasing. When an issue is fixed it
> >> should never take so long till the fix is available to the public, that
> >> is, till there is a release.
> >>
> >> Cheers,
> >> Reto
> >>
> >> On Wed, 15 Jun 2016, at 12:32, Rafa Haro wrote:
> >> > Hi all,
> >> >
> >> > We should get back to the releasing discussion as soon as possible.
> >> > Yesterday, I had to adapt the code of several custom engines that we
> used
> >> > for an internal project in my company because I was relying on version
> >> > 1.0.0-SNAPSHOT and like a month ago Clerezza version was updated.
> This is
> >> > not nice, because probably there are more people out there developing
> >> > with
> >> > 1.0.0-SNAPSHOT artifacts (because those are the only ones that we can
> get
> >> > using maven if you want to use "version 1.x") which code is not going
> now
> >> > to compile at all like it happened to me yesterday.
> >> >
> >> > I'm not saying that the update to the new Clerezza API shouldn't have
> >> > been done and published as a SNAPSHOT version, but I would have expected
> >> > it at least to be announced here first, and to wait until a first 1.0.0
> >> > version was released, so that people who had been working with
> >> > 1.0.0-SNAPSHOT prior to this major change wouldn't see their code
> >> > affected.
> >> >
> >> > Anyway, Fabian, could we speed up the handover for the releasing
> process?
> >> >
> >> > By the way, I have updated Clerezza bundlelists dependencies for all
> the
> >> > launchers, because only the full launchers was updated in the last
> commit
> >> >
> >> > Cheers,
> >> > Rafa
> >>
>
>
>
> --
> Fabian
> http://twitter.com/fctwitt
>


Re: Tika engine failing with PDF files. Problem with PDFBox

2016-06-22 Thread Rafa Haro
Hi Antero,

It seems clear that there are some PDFBox dependencies that are not being
included as bundles in the specific launcher that you are using. I have
just compiled and unit tested a fresh copy (from the trunk) of the tika
engine and it hasn't failed. PDF files are used in the engine unit tests.
So, it could be an old issue already solved. I have not used tika engine
too much, did you configure it in some way that could be causing this?

Cheers,
Rafa
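
Editor's sketch on reading the quoted error. In Java, "NoClassDefFoundError:
Could not initialize class X" generally means that the static initializer of
X already failed earlier in the same JVM; the first failure is reported as an
ExceptionInInitializerError carrying the real cause, and only later uses
produce the shorter message. The demo below simulates this with a made-up
class (it does not use PDFBox), so it may be worth searching the log for an
earlier ExceptionInInitializerError mentioning PDPage.

```java
class StaticInitDemo {
    /** Stand-in for a class whose static initializer fails (simulated cause). */
    static class Broken {
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("missing dependency (simulated)");
        }
    }

    /** Touches Broken and reports which kind of error the JVM raised. */
    static String touch() {
        try {
            return String.valueOf(Broken.VALUE);
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The first use triggers the static initializer; later uses find the class unusable.
        System.out.println("first use:  " + touch());
        System.out.println("second use: " + touch());
    }
}
```

In an OSGi launcher a failing static initializer is often caused by a package
that the bundle cannot see, which would be consistent with the missing
dependency suspected above, but that remains an assumption until the first
error is found in the log.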

On Tue, Jun 21, 2016 at 6:14 PM Antero Duarte  wrote:

> Hi there,
>
> Recently I've been trying to get enhancements from a PDF file, and the
> Apache tika engine fails and logs the following error:
>
> 21.06.2016 17:02:56.824 *ERROR* [Thread-12]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler
> Unexpected Exception while processing ContentItem
>  with
> EnhancementJobManager: class
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.pdfbox.pdmodel.PDPage
> at
> org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:212)
> at
> org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:218)
> at
> org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:184)
> at
>
> org.apache.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:212)
> at
> org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:340)
> at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:106)
> at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:143)
> at
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at
>
> org.apache.stanbol.enhancer.engines.tika.TikaEngine$1.run(TikaEngine.java:275)
> at java.security.AccessController.doPrivileged(Native Method)
> at
>
> org.apache.stanbol.enhancer.engines.tika.TikaEngine.computeEnhancements(TikaEngine.java:256)
> at
>
> org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.processEvent(EnhancementJobHandler.java:280)
> at
>
> org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.handleEvent(EnhancementJobHandler.java:198)
> at
>
> org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415)
> at
>
> org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:118)
> at
>
> org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run(AsyncDeliverTasks.java:159)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> Execution of Chain tikaChain failed after 14ms for ContentItem
> 
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
>  finished: true
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
>  state:failed
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
>  chain:tikaChain
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
>  content-item:
> 
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> executions:
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl -
> tika completed
> 21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl Error
> Message: Enhancement Chain failed because of required Engiat
>
> org.apache.stanbol.enhancer.jersey.resource.AbstractEnhancerResource.enhanceFromData(AbstractEnhancerResource.java:213)
> at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
>
> 

Re: Releasing Plan

2016-06-16 Thread Rafa Haro
Hi Reto

On Thu, Jun 16, 2016 at 4:47 PM Reto Gmür <r...@apache.org> wrote:

> Hi Rafa
>
> There was a delay between the discussion on the list and me committing
> the changes. The reason is also that some little issues had to be fixed
> in clerezza so that all of Stanbol works.
>
> But the problem is not the change between earlier SNAPSHOT version and
> the release but the lack of release. Developing against a SNAPSHOT
> version should be the exception, the whole apache release and licensing
> process depends on artifacts actually being released if people just rely
> on the subversion code rather than releases this misses a very important
> aspect of the Apache way and licensing.
>

I totally agree with you and, in the end, I was saying exactly the same
thing in my first email. I'm using a SNAPSHOT because the only released
version I could use was 0.12 and I needed to work with 1.x. So yes, the
problem is that we are not releasing, since the current trunk was probably
already ready for a first 1.0.0 release.

Apart from that, being aware of that situation, I would probably have
announced such a change on this mailing list, at least to give us the chance
to tag a version locally or in a fork instead of being forced to adapt our
private code. But, of course, the real problem is that we have not released
for far too long.

Let's try to fix this situation then. Is anyone apart from Fabian familiar
with the release process?

I'm more than happy to learn and help.

Cheers,
Rafa


>
> So the problem is that we're not releasing. When an issue is fixed it
> should never take so long till the fix is available to the public, that
> is, till there is a release.
>
> Cheers,
> Reto
>
> On Wed, 15 Jun 2016, at 12:32, Rafa Haro wrote:
> > Hi all,
> >
> > We should get back to the release discussion as soon as possible.
> > Yesterday, I had to adapt the code of several custom engines that we use
> > for an internal project at my company, because I was relying on version
> > 1.0.0-SNAPSHOT and about a month ago the Clerezza version was updated.
> > This is not nice, because there are probably more people out there
> > developing against 1.0.0-SNAPSHOT artifacts (those are the only ones we
> > can get using Maven if you want to use "version 1.x") whose code is now
> > not going to compile at all, as happened to me yesterday.
> >
> > I'm not saying that the update to the new Clerezza API shouldn't have
> > been done and released as a SNAPSHOT version, but I would have expected
> > it at least to be announced here first, and to wait until a first 1.0.0
> > version had been released and published, so that people who had been
> > working with 1.0.0-SNAPSHOT prior to this major change didn't see their
> > code affected.
> >
> > Anyway, Fabian, could we speed up the handover of the release process?
> >
> > By the way, I have updated the Clerezza bundlelist dependencies for all
> > the launchers, because only the full launchers were updated in the last
> > commit.
> > Cheers,
> > Rafa
>


Releasing Plan

2016-06-15 Thread Rafa Haro
Hi all,

We should get back to the release discussion as soon as possible.
Yesterday, I had to adapt the code of several custom engines that we use
for an internal project at my company, because I was relying on version
1.0.0-SNAPSHOT and about a month ago the Clerezza version was updated. This
is not nice, because there are probably more people out there developing
against 1.0.0-SNAPSHOT artifacts (those are the only ones we can get using
Maven if you want to use "version 1.x") whose code is now not going to
compile at all, as happened to me yesterday.

I'm not saying that the update to the new Clerezza API shouldn't have been
done and released as a SNAPSHOT version, but I would have expected it at
least to be announced here first, and to wait until a first 1.0.0 version
had been released and published, so that people who had been working with
1.0.0-SNAPSHOT prior to this major change didn't see their code affected.

Anyway, Fabian, could we speed up the handover of the release process?

By the way, I have updated the Clerezza bundlelist dependencies for all the
launchers, because only the full launchers were updated in the last commit.

Cheers,
Rafa


Re: Report Time - Input

2016-06-08 Thread Rafa Haro
Hi Fabian,

There have been some contributions lately in terms of documentation
improvements. We will be including them in the coming weeks.

Rafa

On Wed, Jun 8, 2016 at 1:35 PM Fabian Christ  wrote:

> Hi,
>
> here is my draft for the June report:
>
> Apache Stanbol provides a set of reusable components for semantic content
> management.
>
> The project is doing fine at a low level of activity. The major blocker is
> still the stuck release process. People do not find the time to cut new
> releases or to hand over the process to other PMC members.
>
> The project is focused on easing the use of Stanbol with resources like
> Docker images. In combination with an improved release process, Stanbol
> should become easier to use for end users.
>
> The project voted for two new PMC members. The board has been notified but
> we need to wait for the 72 hours period to end until we invite them
> officially.
>
> The vote for new PMC members is a first step to hand over the planned
> improvements and the release process to a new generation of people.
>
> Subscribers on the dev list: 231
>
> Last new committer was
>   Cristian Petroaca on May 7th, 2015
>
> Last stack release was:
>   Apache Stanbol 0.12 on Mar 2nd, 2014
>
> Last component release was:
>   Apache Stanbol Partial Security Release RC2 on June 5th, 2014
>   org.apache.stanbol.commons.security.reactor-20140602
>
> 2016-06-08 13:12 GMT+02:00 Fabian Christ :
> > Hi Stanbolers,
> >
> > it is report time again. Please, provide some input for the June 2016
> report.
> >
> > Best
> > Fabian
>


Re: Need Help Configuring EntityhubLinkingEngine

2016-06-01 Thread Rafa Haro
Hi Nathan, can you check the language of the entity labels in your dataset?

Cheers,
Rafa
On Wed, Jun 1, 2016 at 7:30 PM, Nathan Breit <br...@ecohealthalliance.org>
wrote:

> Thanks for your assistance Rafa. Unfortunately, I'm still stuck. I used the
> following longer test string that was detected as en, "This is really
> English text and dengue hemorrhagic fever is a disease." However, there
> were still no entity annotations returned. This was printed in my
> error.log:
> ```
> 01.06.2016 13:14:40.641 *INFO* [Thread-7]
> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine language
> identified as en
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> EntityLinking Statistics:
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> - overal: 7ms (text processing: 6%, lookup: 91%, matching 0%, ranking
> 0%, other 3%)
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Text Processing: 0.399572ms [count: 5 | time: 0.0799144ms
> (max:0.366414, min:0.007158)]
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Vocabulary Lookup: 6.356819ms [count: 4 | time: 1.58920475ms
> (max:2.560572, min:0.893326)]
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> - cache hits: 1 (25.0%)
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - 0 query results (0 filtered - NaN%)
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Label Matching: 0.003802ms [count: 4 | time: 9.505E-4ms (max:0.001065,
> min:8.85E-4)]
> 01.06.2016 13:14:40.670 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Suggestion Ranking: 0.0ms [count: 0 | time: NaNms (max:-1.0E-6,
> min:9.223372036854775E12)]
> 01.06.2016 13:14:40.671 *INFO* [qtp621234008-38]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> Execution of Chain doidEnhancerChain finished after 36ms for ContentItem
> 
> 01.06.2016 13:14:40.672 *INFO* [qtp621234008-38]
> org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager > processed
> ContentItem
>  with Chain
> 'doidEnhancerChain' in 34ms | chain:[langid: 6ms (18%), tika: 0ms (0%),
> opennlp-sentence: 1ms (3%), opennlp-token: 0ms (0%), opennlp-pos: 3ms (9%),
> opennlp-ner: 5ms (15%), dbpediaLinking: 1ms (3%), entityhubExtraction: 18ms
> (53%), doidEnhancer: 9ms (26%)], concurrency: 1.0 (0%)
> ```
> I'm not sure what to make of NER mentions in the logs. My enhancement chain
> does not include a NER, unless it is being invoked by another enhancer like
> opennlp-pos.
> Regards,
> -Nathan
>
> On Wed, Jun 1, 2016 at 5:32 PM, Rafa Haro <rh...@apache.org> wrote:
>
> > Hi Nathan,
> >
> > You are testing the enhancer with a very short sentence, and the Language
> > Detection engine is identifying 'no' (probably Norwegian) as the sentence
> > language. By default, Stanbol uses the identified language code both for
> > loading the OpenNLP models for that language and for entity lookup, where
> > only entity labels in that language are searched. There are a couple of
> > things you can do to avoid empty annotations in these situations:
> >
> > 1. Force the language code as a header in your request (the curl request
> > in this case).
> > 2. Configure English ('en'), or whatever language you know your dataset
> > has entity labels for, as the Default Matching Language, which is missing
> > in your configuration. More information here:
> > https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
> >
> > Also, you would probably like to disable the NER engines for this kind of
> > entity.
> >
> > Hope that helps,
> > Rafa
> >
> > On Tue, May 31, 2016 at 6:13 PM Nathan Breit <
> br...@ecohealthalliance.org>
> > wrote:
> >
> > > Hello,
> > > I am trying to configure the Entityhub linking engine to use an
> Entityhub
> > > site with vocabulary from the Disease Ontology (
> > > http://disease-ontology.org/),
> > > but when I enhance text with it, labels from the ontology are not being
> > > annotated in the text. I am looking for advice on 

Re: Need Help Configuring EntityhubLinkingEngine

2016-06-01 Thread Rafa Haro
Hi Nathan,

You are testing the enhancer with a very short sentence, and the Language
Detection engine is identifying 'no' (probably Norwegian) as the sentence
language. By default, Stanbol uses the identified language code both for
loading the OpenNLP models for that language and for entity lookup, where
only entity labels in that language are searched. There are a couple of
things you can do to avoid empty annotations in these situations:

1. Force the language code as a header in your request (the curl request in
this case).
2. Configure English ('en'), or whatever language you know your dataset has
entity labels for, as the Default Matching Language, which is missing in
your configuration. More information here:
https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking

Also, you would probably like to disable the NER engines for this kind of
entity.
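
As an illustration of the first option, here is a minimal sketch that builds
the same kind of enhancer request with an explicit language. It assumes that
the header in question is the standard HTTP Content-Language header; the
helper name build_enhancer_request and the base URL http://localhost:8080 are
placeholders for illustration, not part of Stanbol.

```python
from urllib.request import Request


def build_enhancer_request(base_url, chain, text, language=None):
    """Build (but do not send) a POST request for an enhancement chain."""
    headers = {
        "Accept": "application/json",
        "Content-Type": "text/plain",
    }
    if language:
        # Assumption: the enhancer reads the content language from this header.
        headers["Content-Language"] = language
    url = f"{base_url.rstrip('/')}/enhancer/chain/{chain}"
    return Request(url, data=text.encode("utf-8"), headers=headers, method="POST")


# Placeholder base URL; replace it with the address of your own instance.
req = build_enhancer_request(
    "http://localhost:8080",
    "doidEnhancerChain",
    "Avoid dengue hemorrhagic fever.",
    language="en",
)
```

Passing req to urllib.request.urlopen would send the request; whether the
language is honoured depends on the Stanbol version and the chain
configuration.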

Hope that helps,
Rafa

On Tue, May 31, 2016 at 6:13 PM Nathan Breit 
wrote:

> Hello,
> I am trying to configure the Entityhub linking engine to use an Entityhub
> site with vocabulary from the Disease Ontology (
> http://disease-ontology.org/),
> but when I enhance text with it, labels from the ontology are not being
> annotated in the text. I am looking for advice on how to debug this. Here
> is what I've tried so far:
> - I used the genericrdf indexing tool to import the Disease Ontology into a
> new Entityhub site. When I used the entityhub /find API endpoint to search
> for the name "dengue hemorrhagic fever" a result from the Disease Ontology
> was returned.
> - I configured and built an EntityhubLinkingEngine and a WeightedChain
> containing the linking engine. They show up on the Stanbol admin site and
> felix console. These are the config files:
>
> https://github.com/ecohealthalliance/t11/tree/master/ansible/roles/stanbol/templates/enhancer
> - When I used the following API call to enhance text containing the same
> term I was able to find using the /find endpoint, the language detected is
> the only annotation returned.
>
> curl -X POST -H "Accept: application/json" -H "Content-type: text/plain"
> --data "Avoid dengue hemorrhagic fever."
> http://54.197.175.163:3000/enhancer/chain/doidEnhancerChain
>
> This appears in the Stanbol error.log when the enhancement runs:
>
> ```
> 31.05.2016 12:05:06.204 *INFO* [Thread-5]
> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine language
> identified as no
> 31.05.2016 12:05:06.206 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> No NER Model for person and language no available!
> 31.05.2016 12:05:06.206 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> No NER Model for organization and language no available!
> 31.05.2016 12:05:06.207 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> No NER Model for location and language no available!
> 31.05.2016 12:05:06.210 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> EntityLinking Statistics:
> 31.05.2016 12:05:06.210 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> - overal: 2ms (text processing: 4%, lookup: 127%, matching 0%, ranking
> 0%, other -31%)
> 31.05.2016 12:05:06.210 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Text Processing: 0.071543ms [count: 4 | time: 0.01788575ms
> (max:0.051031, min:0.005928)]
> 31.05.2016 12:05:06.211 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Vocabulary Lookup: 2.541598ms [count: 3 | time: 0.84719933ms
> (max:1.190281, min:0.667284)]
> 31.05.2016 12:05:06.211 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> - cache hits: 1 (33.32%)
> 31.05.2016 12:05:06.211 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - 0 query results (0 filtered - NaN%)
> 31.05.2016 12:05:06.211 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Label Matching: 0.00218ms [count: 3 | time: 7.267E-4ms
> (max:7.55E-4, min:7.04E-4)]
> 31.05.2016 12:05:06.211 *INFO* [Thread-5]
>
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
>   - Suggestion Ranking: 0.0ms [count: 0 | time: NaNms (max:-1.0E-6,
> min:9.223372036854775E12)]
> 31.05.2016 12:05:06.214 *INFO* [qtp1118916813-38]
> org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> Execution of Chain doidEnhancerChain finished after 14ms for ContentItem
> 
> 31.05.2016 12:05:06.215 *INFO* [qtp1118916813-38]
> org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager > processed
> ContentItem
>  with Chain
> 

Re: Whither Stanbol

2016-05-24 Thread Rafa Haro
ack of research if
> anything I ask has been said somewhere by someone before and comment on the
> documentation I am providing (especially the places where I ask for help).
>
> Best Regards,
> Antero
>
> On Fri, 20 May 2016 at 23:06 Stefano Cossu <sco...@artic.edu> wrote:
>
>> Hello,
>>
>> Great to see so much feedback. As A. Soroka mentioned, some Fedora
>> adopters are already using Stanbol or looking into it. We at the Art
>> Institute of Chicago fall in the latter category.
>>
>> Reading and understanding the documentation has been tough indeed. I have
>> some use cases and I have been trying to figure out whether Stanbol is a
>> good fit for them, but I cannot match what I read in the docs with what I
>> have in my running Stanbol instance (for example, where is the content
>> hub?). Also, without a reasonably regular release schedule or a 1.x release
>> available, it is hard to rely on Stanbol for tasks beyond experimental or
>> ancillary.
>>
>> With a massive introduction of Linked Data concepts in the latest version
>> of Fedora I foresee it being just a matter of time until more folks will
>> start looking at something to resolve semantic integration issues. If that
>> is Stanbol's goal, it would be great to rely on a community project rather
>> than on individual implementations.
>>
>> The AIC has very limited developer resources, but we may be able to
>> contribute with use cases, ideas, testing, and spreading the word; and I am
>> sure that if enough awareness arises, more contribution may come from other
>> sides.
>>
>>
>> Thanks,
>>
>> Stefano
>>
>
>> On 05/20/2016 06:34 AM, Antero Duarte wrote:
>>
> Hi,
>>
>> I will gather all the documentation I have, create some comments on what
>> I don't really understand and essentially got to work on a trial-error
>> basis and then I will send these to everyone. I will also outline in the
>> same email some features I don't understand, some features that I think are
>> useful but don't know how to configure/ not sure if they are actually fully
>> implemented and a list of items that I came across that no longer apply/are
>> deprecated.
>>
>> Regards,
>> Antero
>>
>> On Fri, 20 May 2016 at 11:39 Rafa Haro <rh...@apache.org> wrote:
>>
>> Hi Antero,
>>
>>
>>>
>>> On Fri, May 20, 2016 at 12:23 PM Antero Duarte <a.fduar...@gmail.com> wrote:
>>>
>>> > Hi there,
>>> >
>>> > Stanbol is great and I would hate to see it die.
>>> >
>>>
>>> Couldn't agree more!
>>>
>>> >
>>> > About the lack of feedback from users/developers, I can only say that it
>>> > took quite a while for me to be able to reply to someone on this mailing
>>> > list because the learning curve is so steep. I bet a lot of people still
>>> > read and are interested in stanbol updates, but they just don't have the
>>> > technical know-how to be involved. I include myself in this group, I have
>>> > answered a couple of questions, but only really basic ones, as I fear my
>>> > knowledge of the platform as a whole doesn't allow me to answer more
>>> > complicated questions.
>>> >
>>> > I think one step that definitely needs to be taken is improving/updating
>>> > the existing documentation. I know for a fact that one thing that really
>>> > put me off when I first started using stanbol was that there was
>>> > documentation that was unclear, examples that could not be reproduced
>>> > for several reasons, and outdated documents that referenced components
>>> > that no longer existed in the latest stable release of stanbol (I'm not
>>> > even talking about the latest build from trunk).
>>> >
>>>
>>> That's true again, imho. Developer documentation is needed too, not only
>>> end-user documentation, and probably some work on making the APIs easier
>>> to understand.
>>>
>>>
>>> >
>>> > I have a couple of documents that I have written over time that made it
>>> > easier for me to understand how stanbol works and I could share these
>>> but
>>> > they would need to be reviewed by someone who understands stanbol a

Re: Whither Stanbol

2016-05-20 Thread Rafa Haro
Hi Antero,

On Fri, May 20, 2016 at 12:23 PM Antero Duarte <a.fduar...@gmail.com> wrote:

> Hi there,
>
> Stanbol is great and I would hate to see it die.
>

Couldn't agree more!

>
> About the lack of feedback from users/developers, I can only say that it
> took quite a while for me to be able to reply to someone on this mailing
> list because the learning curve is so steep. I bet a lot of people still
> read and are interested in stanbol updates, but they just don't have the
> technical know-how to be involved. I include myself in this group, I have
> answered a couple of questions, but only really basic ones, as I fear my
> knowledge of the platform as a whole doesn't allow me to answer more
> complicated questions.
>
> I think one step that definitely needs to be taken is improving/updating
> the existing documentation. I know for a fact that one thing that really
> put me off when I first started using stanbol was that there was
> documentation that was unclear, examples that could not be reproduced
> for several reasons, and outdated documents that referenced components that
> no longer existed in the latest stable release of stanbol (I'm not even
> talking about the latest build from trunk).
>

That's true again, imho. Developer documentation is needed too, not only
end-user documentation, and probably some work on making the APIs easier to
understand.


>
> I have a couple of documents that I have written over time that made it
> easier for me to understand how stanbol works and I could share these but
> they would need to be reviewed by someone who understands stanbol a lot
> better than me.
>

Please share them; we can all surely benefit from them and use them to
improve the documentation.


>
> I understand that you have busy lives and as developers, you'd rather use
> the little time you have to code than to write documentation, but if we can
> make stanbol more approachable to newcomers, I believe the developer pool
> would increase greatly and we could make Stanbol great again.
>

+1. It would also be great to have concrete examples of which features,
components and so on are not clear enough, or are simply deprecated, in the
current live documentation, so we can start with those.

Thanks a lot!


>
> My two cents.
>
> Best Regards,
> Antero Duarte
>
> On Fri, 20 May 2016 at 10:26 Rafa Haro <rh...@apache.org> wrote:
>
> > Hi Soroka,
> >
> > First of all, reading this kind of email is, in my opinion, a cause for
> > happiness, as a new attempt to somehow reactivate the project. I have
> > had the same feeling about Apache Stanbol for some time. More than a
> > month ago, there was a Google Hangout meeting joined by some committers
> > and also users. We tried to sketch an immediate roadmap and planned to
> > release version 1.0 in the weeks following that meeting. We sent an email
> > to the list with the meeting minutes, but after that there was a lot of
> > silence again.
> >
> > The main problem right now is probably the lack of quality time that the
> > current active committers can dedicate to the project. I can only speak
> > for myself: in the last year I have used Stanbol for a couple of
> > projects, and we developed a couple of custom engines that we could
> > prepare for contribution, but we never found the proper time to do this,
> > among other things because it wasn't clear to us whether those engines
> > would be useful for the community. And that is probably another symptom:
> > we have been progressively losing feedback from users, developers and the
> > community, and there are fewer and fewer messages on the mailing list
> > every month. This scenario is probably not very motivating for attracting
> > contributions and finding new committers. There are probably more
> > reasons, such as Stanbol not being technically very friendly to approach.
> >
> > Of course I'm not saying this situation is anyone's fault. I'm not very
> > sure about the best recipe for improving the situation either.
> >
> > Thoughts?
> >
> > On Thu, May 19, 2016 at 5:49 PM A. Soroka <aj...@virginia.edu> wrote:
> >
> > > Hi, Stanbol folks!
> > >
> > > I'm writing to you on behalf of the community of Fedora Commons (
> > > http://fedora-commons.org). Fedora is an information architecture with
> > > open source reference implementation that has come into wide use over the
> > > last fifteen years in the "cultural heritage" world of libraries, archives,
> > > museums, etc. For many years, we've been intensely concerned with the ideas
> > > that go 

Re: March Report Input

2016-03-26 Thread Rafa Haro
Hi devs,

Apologies for the delay. These are the notes from last week's meeting:

- The last stable version released is 0.12.1. Version 0.13 would be the last
0.1x release, and development would continue only on 1.x releases (the
current project trunk). Version 1.0 would be released ASAP, without the new
Clerezza API, which would be included as part of the next 1.x release.

- Binary releases will include some scripts for downloading models with
incompatible licenses, such as the OpenNLP models for some languages.

- Fabian to hand over the release process to the community.

- To ease the use of Stanbol for end users, resources like Docker images are
planned.

- Most of the call attendees agreed on a common use case: easing the
incremental indexing of a dataset stored in a triple store within a
ManagedSite.

- Rupert to take a look at the pending issues and the most blocking bugs, and
propose a release on the list.

- Things to improve ASAP: end-user documentation, technical documentation,
removing outdated documentation, including tutorials...

Does anyone miss something?

Cheers,
Rafa
On Wed, Mar 16, 2016 at 7:33 PM, Yauhen Klimovich <
klimovich@gmail.com> wrote:

> Hi all,
> could you also let me join?
>
>
> > On Mar 16, 2016, at 4:11 PM, Rafa Haro <rh...@apache.org> wrote:
> >
> > Absolutely not Andrew, please join us
> >
> > Cheers,
> > Rafa
> >
> > On Wed, Mar 16, 2016 at 3:04 PM Andrew Valencik <and...@affin.io> wrote:
> >
> >> Hi Folks,
> >>
> >> Any objection to an interested newbie joining the call to listen?
> >> (Assuming you don't hit the Hangout limit with existing developers)
> >>
> >>
> >> On Wed, Mar 16, 2016 at 9:38 AM Rafa Haro <rh...@apache.org> wrote:
> >>
> >>> Hi,
> >>>
> >>> The link for today's Hangout:
> >>> https://hangouts.google.com/hangouts/_/c4kmtbmw6ze57gvzvllqmv52jee
> >>>
> >>> See you in a while!
> >>>
> >>> Cheers,
> >>> Rafa
> >>>
> >>> On Mon, Mar 14, 2016 at 5:17 PM Rafa Haro <rh...@apache.org> wrote:
> >>>
> >>>> Hi Rupert,
> >>>>
> >>>> It would be nice for everyone to at least take a look at Jira to check
> >>>> the current project situation.
> >>>>
> >>>> Also, from my side, I have a couple of ideas or features to discuss
> >>>>
> >>>> Cheers,
> >>>> Rafa
> >>>>
> >>>> On Mon, Mar 14, 2016 at 7:42 AM Rupert Westenthaler <
> >>>> rupert.westentha...@gmail.com> wrote:
> >>>>
> >>>>> That's great. Let's talk on Wednesday evening. As usual, results need
> >>>>> to be documented and published (and possibly voted on) on the mailing
> >>>>> lists.
> >>>>>
> >>>>> @Rafa: Do we need to do anything in preparation of the call?
> >>>>>
> >>>>> best
> >>>>> Rupert
> >>>>>
> >>>>>
> >>>>> On Fri, Mar 11, 2016 at 10:51 AM, Rafa Haro <rh...@apache.org>
> wrote:
> >>>>>> Seems like next Wednesday (March 16th) at 5pm is a good fit for
> >>>>> everyone.
> >>>>>>
> >>>>>> I think we can close the survey now :-)
> >>>>>>
> >>>>>> On Tue, Mar 8, 2016 at 2:27 PM Rafa Haro <rh...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> This is the link to the Doodle for the meeting date:
> >>>>>>>
> >>>>>>> http://doodle.com/poll/w5giz2egtb4kpyyc
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Rafa
> >>>>>>>
> >>>>>>> On Tue, Mar 8, 2016 at 12:14 PM Rupert Westenthaler <
> >>>>>>> rupert.westentha...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I think the new Sling Provisioning model [1] could be really
> >> helpful
> >>>>>>>> in implementing this. But ATM I do not have the time to look
> >> further
> >>>>>>>> into this option. We should anyway migrate your launchers to this
> >>>>>>>> model as the current m

Re: March Report Input

2016-03-19 Thread Rafa Haro
Absolutely not Andrew, please join us

Cheers,
Rafa

On Wed, Mar 16, 2016 at 3:04 PM Andrew Valencik <and...@affin.io> wrote:

> Hi Folks,
>
> Any objection to an interested newbie joining the call to listen?
> (Assuming you don't hit the Hangout limit with existing developers)
>
>
> On Wed, Mar 16, 2016 at 9:38 AM Rafa Haro <rh...@apache.org> wrote:
>
> > Hi,
> >
> > The link for today's Hangout:
> > https://hangouts.google.com/hangouts/_/c4kmtbmw6ze57gvzvllqmv52jee
> >
> > See you in a while!
> >
> > Cheers,
> > Rafa
> >
> > On Mon, Mar 14, 2016 at 5:17 PM Rafa Haro <rh...@apache.org> wrote:
> >
> > > Hi Rupert,
> > >
> > > It would be nice for everyone to take a look at least into Jira to
> check
> > > current project situation
> > >
> > > Also, from my side, I have a couple of ideas or features to discuss
> > >
> > > Cheers,
> > > Rafa
> > >
> > > On Mon, Mar 14, 2016 at 7:42 AM Rupert Westenthaler <
> > > rupert.westentha...@gmail.com> wrote:
> > >
> > >> That's great. Lets talk on Wednesday evening. As usual results need to
> > >> be documented and published on the mailing (eventually voted) on the
> > >> mailing lists.
> > >>
> > >> @Rafa: Do we need to do anything in preparation of the call?
> > >>
> > >> best
> > >> Rupert
> > >>
> > >>
> > >> On Fri, Mar 11, 2016 at 10:51 AM, Rafa Haro <rh...@apache.org> wrote:
> > >> > Seems like next Wednesday (March 16th) at 5pm is a good fit for
> > >> everyone.
> > >> >
> > >> > I think we can close the survey now :-)
> > >> >
> > >> > On Tue, Mar 8, 2016 at 2:27 PM Rafa Haro <rh...@apache.org> wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> This is the link to the Doodle for the meeting date:
> > >> >>
> > >> >> http://doodle.com/poll/w5giz2egtb4kpyyc
> > >> >>
> > >> >> Cheers,
> > >> >> Rafa
> > >> >>
> > >> >> On Tue, Mar 8, 2016 at 12:14 PM Rupert Westenthaler <
> > >> >> rupert.westentha...@gmail.com> wrote:
> > >> >>
> > >> >>> Hi,
> > >> >>>
> > >> >>> I think the new Sling Provisioning model [1] could be really
> helpful
> > >> >>> in implementing this. But ATM I do not have the time to look
> further
> > >> >>> into this option. We should anyway migrate your launchers to this
> > >> >>> model as the current model is deprecated by now.
> > >> >>>
> > >> >>> best
> > >> >>> Rupert
> > >> >>>
> > >> >>> [1]
> > >> https://sling.apache.org/documentation/development/slingstart.html
> > >> >>>
> > >> >>> On Tue, Mar 8, 2016 at 11:41 AM, Rafa Haro <rh...@apache.org>
> > wrote:
> > >> >>> > Hi all,
> > >> >>> >
> > >> >>> > Regarding the releases, we should find a way for distributing
> > >> binaries
> > >> >>> > releases in order to prevent final users to build Stanbol by
> > >> themselves.
> > >> >>> > The problem with OpenNLP models could be probably resolved by
> > >> including
> > >> >>> a
> > >> >>> > downloading tool within the releases.
> > >> >>> >
> > >> >>> > IMHO an important part of the Hangout discussion should be
> release
> > >> 1.0
> > >> >>> cut.
> > >> >>> > And probably at some point we are going to need to stop
> supporting
> > >> >>> 0.12.x
> > >> >>> > development.
> > >> >>> >
> > >> >>> > wdyt?
> > >> >>> >
> > >> >>> > Cheers,
> > >> >>> > Rafa
> > >> >>> >
> > >> >>> > On Tue, Mar 8, 2016 at 10:13 AM Rupert Westenthaler <
> > >> >>> > rupert.westentha...@gmail.com> wrote:
> > >> >>> >
> > >> >>> >> Hi all,
> >

Re: March Report Input

2016-03-19 Thread Rafa Haro
Sorry guys, new link:
https://talkgadget.google.com/hangouts/_/diigxjahdfbsdosyhoeul2rlxqe?authuser=2=en

Cheers,
Rafa

On Wed, Mar 16, 2016 at 4:11 PM Rafa Haro <rh...@apache.org> wrote:

> Absolutely not Andrew, please join us
>
> Cheers,
> Rafa
>
> On Wed, Mar 16, 2016 at 3:04 PM Andrew Valencik <and...@affin.io> wrote:
>
>> Hi Folks,
>>
>> Any objection to an interested newbie joining the call to listen?
>> (Assuming you don't hit the Hangout limit with existing developers)
>>
>>
>> On Wed, Mar 16, 2016 at 9:38 AM Rafa Haro <rh...@apache.org> wrote:
>>
>> > Hi,
>> >
>> > The link for today's Hangout:
>> > https://hangouts.google.com/hangouts/_/c4kmtbmw6ze57gvzvllqmv52jee
>> >
>> > See you in a while!
>> >
>> > Cheers,
>> > Rafa
>> >
>> > On Mon, Mar 14, 2016 at 5:17 PM Rafa Haro <rh...@apache.org> wrote:
>> >
>> > > Hi Rupert,
>> > >
>> > > It would be nice for everyone to take a look at least into Jira to
>> check
>> > > current project situation
>> > >
>> > > Also, from my side, I have a couple of ideas or features to discuss
>> > >
>> > > Cheers,
>> > > Rafa
>> > >
>> > > On Mon, Mar 14, 2016 at 7:42 AM Rupert Westenthaler <
>> > > rupert.westentha...@gmail.com> wrote:
>> > >
>> > >> That's great. Lets talk on Wednesday evening. As usual results need
>> to
>> > >> be documented and published on the mailing (eventually voted) on the
>> > >> mailing lists.
>> > >>
>> > >> @Rafa: Do we need to do anything in preparation of the call?
>> > >>
>> > >> best
>> > >> Rupert
>> > >>
>> > >>
>> > >> On Fri, Mar 11, 2016 at 10:51 AM, Rafa Haro <rh...@apache.org>
>> wrote:
>> > >> > Seems like next Wednesday (March 16th) at 5pm is a good fit for
>> > >> everyone.
>> > >> >
>> > >> > I think we can close the survey now :-)
>> > >> >
>> > >> > On Tue, Mar 8, 2016 at 2:27 PM Rafa Haro <rh...@apache.org> wrote:
>> > >> >
>> > >> >> Hi all,
>> > >> >>
>> > >> >> This is the link to the Doodle for the meeting date:
>> > >> >>
>> > >> >> http://doodle.com/poll/w5giz2egtb4kpyyc
>> > >> >>
>> > >> >> Cheers,
>> > >> >> Rafa
>> > >> >>
>> > >> >> On Tue, Mar 8, 2016 at 12:14 PM Rupert Westenthaler <
>> > >> >> rupert.westentha...@gmail.com> wrote:
>> > >> >>
>> > >> >>> Hi,
>> > >> >>>
>> > >> >>> I think the new Sling Provisioning model [1] could be really
>> helpful
>> > >> >>> in implementing this. But ATM I do not have the time to look
>> further
>> > >> >>> into this option. We should anyway migrate your launchers to this
>> > >> >>> model as the current model is deprecated by now.
>> > >> >>>
>> > >> >>> best
>> > >> >>> Rupert
>> > >> >>>
>> > >> >>> [1]
>> > >> https://sling.apache.org/documentation/development/slingstart.html
>> > >> >>>
>> > >> >>> On Tue, Mar 8, 2016 at 11:41 AM, Rafa Haro <rh...@apache.org>
>> > wrote:
>> > >> >>> > Hi all,
>> > >> >>> >
>> > >> >>> > Regarding the releases, we should find a way for distributing
>> > >> binaries
>> > >> >>> > releases in order to prevent final users to build Stanbol by
>> > >> themselves.
>> > >> >>> > The problem with OpenNLP models could be probably resolved by
>> > >> including
>> > >> >>> a
>> > >> >>> > downloading tool within the releases.
>> > >> >>> >
>> > >> >>> > IMHO an important part of the Hangout discussion should be
>> release
>> > >> 1.0
>> > >> >>> cut.
>> > >> >>> > And probably at some point we are going to need to

Re: March Report Input

2016-03-14 Thread Rafa Haro
Hi Rupert,

It would be good if everyone at least took a look at Jira to check the
current state of the project.

Also, from my side, I have a couple of ideas or features to discuss.

Cheers,
Rafa

On Mon, Mar 14, 2016 at 7:42 AM Rupert Westenthaler <
rupert.westentha...@gmail.com> wrote:

> That's great. Lets talk on Wednesday evening. As usual results need to
> be documented and published on the mailing (eventually voted) on the
> mailing lists.
>
> @Rafa: Do we need to do anything in preparation of the call?
>
> best
> Rupert
>
>
> On Fri, Mar 11, 2016 at 10:51 AM, Rafa Haro <rh...@apache.org> wrote:
> > Seems like next Wednesday (March 16th) at 5pm is a good fit for everyone.
> >
> > I think we can close the survey now :-)
> >
> > On Tue, Mar 8, 2016 at 2:27 PM Rafa Haro <rh...@apache.org> wrote:
> >
> >> Hi all,
> >>
> >> This is the link to the Doodle for the meeting date:
> >>
> >> http://doodle.com/poll/w5giz2egtb4kpyyc
> >>
> >> Cheers,
> >> Rafa
> >>
> >> On Tue, Mar 8, 2016 at 12:14 PM Rupert Westenthaler <
> >> rupert.westentha...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I think the new Sling Provisioning model [1] could be really helpful
> >>> in implementing this. But ATM I do not have the time to look further
> >>> into this option. We should anyway migrate your launchers to this
> >>> model as the current model is deprecated by now.
> >>>
> >>> best
> >>> Rupert
> >>>
> >>> [1] https://sling.apache.org/documentation/development/slingstart.html
> >>>
> >>> On Tue, Mar 8, 2016 at 11:41 AM, Rafa Haro <rh...@apache.org> wrote:
> >>> > Hi all,
> >>> >
> >>> > Regarding the releases, we should find a way for distributing
> binaries
> >>> > releases in order to prevent final users to build Stanbol by
> themselves.
> >>> > The problem with OpenNLP models could be probably resolved by
> including
> >>> a
> >>> > downloading tool within the releases.
> >>> >
> >>> > IMHO an important part of the Hangout discussion should be release
> 1.0
> >>> cut.
> >>> > And probably at some point we are going to need to stop supporting
> >>> 0.12.x
> >>> > development.
> >>> >
> >>> > wdyt?
> >>> >
> >>> > Cheers,
> >>> > Rafa
> >>> >
> >>> > On Tue, Mar 8, 2016 at 10:13 AM Rupert Westenthaler <
> >>> > rupert.westentha...@gmail.com> wrote:
> >>> >
> >>> >> Hi all,
> >>> >>
> >>> >> Regarding the path forward the main points are
> >>> >>
> >>> >> * releasing 0.12.1
> >>> >> * @Fabian would be nice if you could help me with this
> >>> >> * deciding how to deal with the Clerezza API changes
> >>> >> * upgrading will means maintaining 0.12.1 and 1.0.0 becomes much
> >>> >> more difficult because
> >>> >> * delaying the upgrade means to use outdated Clerezza
> dependencies
> >>> >> * do we want to update before a full 1.0.0 release?
> >>> >> * path to a full 1.0.0 release
> >>> >>
> >>> >> best
> >>> >> Rupert
> >>> >>
> >>> >>
> >>> >> On Mon, Mar 7, 2016 at 12:29 PM, Cristian Petroaca
> >>> >> <cristian.petro...@gmail.com> wrote:
> >>> >> > Hi Fabian,
> >>> >> >
> >>> >> > I've been working on Event Extraction Engine. Trying to figure out
> >>> if a
> >>> >> > rule based approach makes sense.
> >>> >> >
> >>> >> > Cristian
> >>> >> >
> >>> >> > On Mon, Mar 7, 2016 at 1:21 PM, Rafa Haro <rh...@apache.org>
> wrote:
> >>> >> >
> >>> >> >> HI Fabian, Rupert,
> >>> >> >>
> >>> >> >> We have been working on regular expression based fact extraction
> >>> >> engine. I
> >>> >> >> just need to prettify it a little and it will be ready for
> >>> contribution.
> >>> >> >> @Rupert I will send the doodle ASAP.

Re: March Report Input

2016-03-11 Thread Rafa Haro
Seems like next Wednesday (March 16th) at 5pm is a good fit for everyone.

I think we can close the survey now :-)

On Tue, Mar 8, 2016 at 2:27 PM Rafa Haro <rh...@apache.org> wrote:

> Hi all,
>
> This is the link to the Doodle for the meeting date:
>
> http://doodle.com/poll/w5giz2egtb4kpyyc
>
> Cheers,
> Rafa
>
> On Tue, Mar 8, 2016 at 12:14 PM Rupert Westenthaler <
> rupert.westentha...@gmail.com> wrote:
>
>> Hi,
>>
>> I think the new Sling Provisioning model [1] could be really helpful
>> in implementing this. But ATM I do not have the time to look further
>> into this option. We should anyway migrate your launchers to this
>> model as the current model is deprecated by now.
>>
>> best
>> Rupert
>>
>> [1] https://sling.apache.org/documentation/development/slingstart.html
>>
>> On Tue, Mar 8, 2016 at 11:41 AM, Rafa Haro <rh...@apache.org> wrote:
>> > Hi all,
>> >
>> > Regarding the releases, we should find a way for distributing binaries
>> > releases in order to prevent final users to build Stanbol by themselves.
>> > The problem with OpenNLP models could be probably resolved by including
>> a
>> > downloading tool within the releases.
>> >
>> > IMHO an important part of the Hangout discussion should be release 1.0
>> cut.
>> > And probably at some point we are going to need to stop supporting
>> 0.12.x
>> > development.
>> >
>> > wdyt?
>> >
>> > Cheers,
>> > Rafa
>> >
>> > On Tue, Mar 8, 2016 at 10:13 AM Rupert Westenthaler <
>> > rupert.westentha...@gmail.com> wrote:
>> >
>> >> Hi all,
>> >>
>> >> Regarding the path forward the main points are
>> >>
>> >> * releasing 0.12.1
>> >> * @Fabian would be nice if you could help me with this
>> >> * deciding how to deal with the Clerezza API changes
>> >> * upgrading will means maintaining 0.12.1 and 1.0.0 becomes much
>> >> more difficult because
>> >> * delaying the upgrade means to use outdated Clerezza dependencies
>> >> * do we want to update before a full 1.0.0 release?
>> >> * path to a full 1.0.0 release
>> >>
>> >> best
>> >> Rupert
>> >>
>> >>
>> >> On Mon, Mar 7, 2016 at 12:29 PM, Cristian Petroaca
>> >> <cristian.petro...@gmail.com> wrote:
>> >> > Hi Fabian,
>> >> >
>> >> > I've been working on Event Extraction Engine. Trying to figure out
>> if a
>> >> > rule based approach makes sense.
>> >> >
>> >> > Cristian
>> >> >
>> >> > On Mon, Mar 7, 2016 at 1:21 PM, Rafa Haro <rh...@apache.org> wrote:
>> >> >
>> >> >> HI Fabian, Rupert,
>> >> >>
>> >> >> We have been working on regular expression based fact extraction
>> >> engine. I
>> >> >> just need to prettify it a little and it will be ready for
>> contribution.
>> >> >> @Rupert I will send the doodle ASAP.
>> >> >>
>> >> >> Cheers,
>> >> >> Rafa
>> >> >>
>> >> >> On Mon, Mar 7, 2016 at 10:59 AM Rupert Westenthaler <
>> >> >> rupert.westentha...@gmail.com> wrote:
>> >> >>
>> >> >> > Hi Fabian,
>> >> >> >
>> >> >> > Nothing special from my side.
>> >> >> >
>> >> >> > Not for the report - but just to let you know -
>> >> >> >
>> >> >> > I just implemented a new feature that allows to wait for
>> (re-)creation
>> >> >> > of FST corpora after changes in the Solr Index. While doing so I
>> >> >> > discovered a major Issue with the FST linking engine
>> (STANBOL-1448)
>> >> >> > that can especially affect users of Entityhub Managed Sites.
>> >> >> >
>> >> >> > Both together will now allow to use FST linking that does
>> consider the
>> >> >> > most current version of the SolrCore (given that the Core is small
>> >> >> > enough that building FST models does not take longer as 10sec).
>> >> >> >
>> >> >> > I would also like to go on with the discussed Hangout. @Rafa, can
>> you
>> >> >>

Re: March Report Input

2016-03-08 Thread Rafa Haro
Hi all,

This is the link to the Doodle for the meeting date:

http://doodle.com/poll/w5giz2egtb4kpyyc

Cheers,
Rafa

On Tue, Mar 8, 2016 at 12:14 PM Rupert Westenthaler <
rupert.westentha...@gmail.com> wrote:

> Hi,
>
> I think the new Sling Provisioning model [1] could be really helpful
> in implementing this. But ATM I do not have the time to look further
> into this option. We should anyway migrate your launchers to this
> model as the current model is deprecated by now.
>
> best
> Rupert
>
> [1] https://sling.apache.org/documentation/development/slingstart.html
>
> On Tue, Mar 8, 2016 at 11:41 AM, Rafa Haro <rh...@apache.org> wrote:
> > Hi all,
> >
> > Regarding the releases, we should find a way for distributing binaries
> > releases in order to prevent final users to build Stanbol by themselves.
> > The problem with OpenNLP models could be probably resolved by including a
> > downloading tool within the releases.
> >
> > IMHO an important part of the Hangout discussion should be release 1.0
> cut.
> > And probably at some point we are going to need to stop supporting 0.12.x
> > development.
> >
> > wdyt?
> >
> > Cheers,
> > Rafa
> >
> > On Tue, Mar 8, 2016 at 10:13 AM Rupert Westenthaler <
> > rupert.westentha...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> Regarding the path forward the main points are
> >>
> >> * releasing 0.12.1
> >> * @Fabian would be nice if you could help me with this
> >> * deciding how to deal with the Clerezza API changes
> >> * upgrading will means maintaining 0.12.1 and 1.0.0 becomes much
> >> more difficult because
> >> * delaying the upgrade means to use outdated Clerezza dependencies
> >> * do we want to update before a full 1.0.0 release?
> >> * path to a full 1.0.0 release
> >>
> >> best
> >> Rupert
> >>
> >>
> >> On Mon, Mar 7, 2016 at 12:29 PM, Cristian Petroaca
> >> <cristian.petro...@gmail.com> wrote:
> >> > Hi Fabian,
> >> >
> >> > I've been working on Event Extraction Engine. Trying to figure out if
> a
> >> > rule based approach makes sense.
> >> >
> >> > Cristian
> >> >
> >> > On Mon, Mar 7, 2016 at 1:21 PM, Rafa Haro <rh...@apache.org> wrote:
> >> >
> >> >> HI Fabian, Rupert,
> >> >>
> >> >> We have been working on regular expression based fact extraction
> >> engine. I
> >> >> just need to prettify it a little and it will be ready for
> contribution.
> >> >> @Rupert I will send the doodle ASAP.
> >> >>
> >> >> Cheers,
> >> >> Rafa
> >> >>
> >> >> On Mon, Mar 7, 2016 at 10:59 AM Rupert Westenthaler <
> >> >> rupert.westentha...@gmail.com> wrote:
> >> >>
> >> >> > Hi Fabian,
> >> >> >
> >> >> > Nothing special from my side.
> >> >> >
> >> >> > Not for the report - but just to let you know -
> >> >> >
> >> >> > I just implemented a new feature that allows to wait for
> (re-)creation
> >> >> > of FST corpora after changes in the Solr Index. While doing so I
> >> >> > discovered a major Issue with the FST linking engine (STANBOL-1448)
> >> >> > that can especially affect users of Entityhub Managed Sites.
> >> >> >
> >> >> > Both together will now allow to use FST linking that does consider
> the
> >> >> > most current version of the SolrCore (given that the Core is small
> >> >> > enough that building FST models does not take longer as 10sec).
> >> >> >
> >> >> > I would also like to go on with the discussed Hangout. @Rafa, can
> you
> >> >> > create a Doodle so that we can find a date for such a meeting. Also
> >> >> > feel free to assign me some task for the preparation of such.
> >> >> >
> >> >> >
> >> >> > best
> >> >> > Rupert
> >> >> >
> >> >> >
> >> >> > On Sat, Mar 5, 2016 at 7:48 AM, Fabian Christ
> >> >> > <christ.fab...@googlemail.com> wrote:
> >> >> > > Hi Stanbolers,
> >> >> > >
> >> >> > > even if we have repor

Re: March Report Input

2016-03-08 Thread Rafa Haro
Hi all,

Regarding the releases, we should find a way to distribute binary releases
so that end users do not have to build Stanbol themselves. The problem with
the OpenNLP models could probably be solved by including a download tool
in the releases.

IMHO an important part of the Hangout discussion should be cutting the 1.0
release. And at some point we will probably need to stop supporting 0.12.x
development.

wdyt?

Cheers,
Rafa

On Tue, Mar 8, 2016 at 10:13 AM Rupert Westenthaler <
rupert.westentha...@gmail.com> wrote:

> Hi all,
>
> Regarding the path forward the main points are
>
> * releasing 0.12.1
> * @Fabian would be nice if you could help me with this
> * deciding how to deal with the Clerezza API changes
> * upgrading will means maintaining 0.12.1 and 1.0.0 becomes much
> more difficult because
> * delaying the upgrade means to use outdated Clerezza dependencies
> * do we want to update before a full 1.0.0 release?
> * path to a full 1.0.0 release
>
> best
> Rupert
>
>
> On Mon, Mar 7, 2016 at 12:29 PM, Cristian Petroaca
> <cristian.petro...@gmail.com> wrote:
> > Hi Fabian,
> >
> > I've been working on Event Extraction Engine. Trying to figure out if a
> > rule based approach makes sense.
> >
> > Cristian
> >
> > On Mon, Mar 7, 2016 at 1:21 PM, Rafa Haro <rh...@apache.org> wrote:
> >
> >> HI Fabian, Rupert,
> >>
> >> We have been working on regular expression based fact extraction
> engine. I
> >> just need to prettify it a little and it will be ready for contribution.
> >> @Rupert I will send the doodle ASAP.
> >>
> >> Cheers,
> >> Rafa
> >>
> >> On Mon, Mar 7, 2016 at 10:59 AM Rupert Westenthaler <
> >> rupert.westentha...@gmail.com> wrote:
> >>
> >> > Hi Fabian,
> >> >
> >> > Nothing special from my side.
> >> >
> >> > Not for the report - but just to let you know -
> >> >
> >> > I just implemented a new feature that allows to wait for (re-)creation
> >> > of FST corpora after changes in the Solr Index. While doing so I
> >> > discovered a major Issue with the FST linking engine (STANBOL-1448)
> >> > that can especially affect users of Entityhub Managed Sites.
> >> >
> >> > Both together will now allow to use FST linking that does consider the
> >> > most current version of the SolrCore (given that the Core is small
> >> > enough that building FST models does not take longer as 10sec).
> >> >
> >> > I would also like to go on with the discussed Hangout. @Rafa, can you
> >> > create a Doodle so that we can find a date for such a meeting. Also
> >> > feel free to assign me some task for the preparation of such.
> >> >
> >> >
> >> > best
> >> > Rupert
> >> >
> >> >
> >> > On Sat, Mar 5, 2016 at 7:48 AM, Fabian Christ
> >> > <christ.fab...@googlemail.com> wrote:
> >> > > Hi Stanbolers,
> >> > >
> >> > > even if we have reported last month, we have to report again this
> month
> >> > to
> >> > > be back on the regular schedule.
> >> > >
> >> > > So anything you would like to see in the report? Remember that we
> have
> >> to
> >> > > report on the project and its progress. We do not report technical
> >> issues
> >> > > or implementation details here.
> >> > >
> >> > > Best
> >> > > Fabian
> >> >
> >> >
> >> >
> >> > --
> >> > | Rupert Westenthaler rupert.westentha...@gmail.com
> >> > | Bodenlehenstraße 11  ++43-699-11108907
> >> > | A-5500 Bischofshofen
> >> > | REDLINK.CO
> >> >
> >>
> ..
> >> > | http://redlink.co/
> >> >
> >>
>
>
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11  ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO
> ..
> | http://redlink.co/
>


Re: March Report Input

2016-03-07 Thread Rafa Haro
Hi Fabian, Rupert,

We have been working on a regular-expression-based fact extraction engine. I
just need to polish it a little and it will be ready for contribution.
@Rupert, I will send the Doodle ASAP.

Cheers,
Rafa

On Mon, Mar 7, 2016 at 10:59 AM Rupert Westenthaler <
rupert.westentha...@gmail.com> wrote:

> Hi Fabian,
>
> Nothing special from my side.
>
> Not for the report - but just to let you know -
>
> I just implemented a new feature that allows to wait for (re-)creation
> of FST corpora after changes in the Solr Index. While doing so I
> discovered a major Issue with the FST linking engine (STANBOL-1448)
> that can especially affect users of Entityhub Managed Sites.
>
> Both together will now allow to use FST linking that does consider the
> most current version of the SolrCore (given that the Core is small
> enough that building FST models does not take longer as 10sec).
>
> I would also like to go on with the discussed Hangout. @Rafa, can you
> create a Doodle so that we can find a date for such a meeting. Also
> feel free to assign me some task for the preparation of such.
>
>
> best
> Rupert
>
>
> On Sat, Mar 5, 2016 at 7:48 AM, Fabian Christ
>  wrote:
> > Hi Stanbolers,
> >
> > even if we have reported last month, we have to report again this month
> to
> > be back on the regular schedule.
> >
> > So anything you would like to see in the report? Remember that we have to
> > report on the project and its progress. We do not report technical issues
> > or implementation details here.
> >
> > Best
> > Fabian
>
>
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11  ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO
> ..
> | http://redlink.co/
>


Re: [REPORT] Apache Stanbol - February 2016

2016-02-18 Thread Rafa Haro
Sure Reto, that's exactly the idea ;-)

On Thu, Feb 18, 2016 at 12:28 PM Reto Gmür <r...@apache.org> wrote:

> I would like to participate. However, don't forget the Apache way and
> focus on the list, which is the only way to involve everybody: if it's
> not on the list, it didn't happen. So let's have the call as it might
> give some inspiration, but let's keep the important discussions and
> especially decision making on the list.
>
> Cheers,
> Reto
>
> On Thu, Feb 18, 2016, at 09:43, klim klim wrote:
> > Hi all,
> >
> > I’m in,
> > just let me know the date and time.
> >
> > thank you
> >
> > my best,
> > -Yauhen-
> >
> >
> > > On Feb 18, 2016, at 11:41 AM, Cristian Petroaca <
> cristian.petro...@gmail.com> wrote:
> > >
> > > +1 for the report.
> > >
> > > I'd also like to join the call.
> > >
> > > Regards,
> > > Cristian
> > >
> > > On Thu, Feb 18, 2016 at 10:34 AM, Dileepa Jayakody <
> > > dileepajayak...@gmail.com> wrote:
> > >
> > >> +1 from my side as well for the report.
> > >>
> > >> I would like to join the call as well.
> > >>
> > >> Thanks,
> > >> Dileepa
> > >>
> > >> On Thu, Feb 18, 2016 at 1:10 PM, Antonio David Pérez Morales <
> > >> adperezmora...@gmail.com> wrote:
> > >>
> > >>> Hi
> > >>>
> > >>> +1 for the report
> > >>>
> > >>> I'm available any time you decide this or next week to have the call
> > >>>
> > >>> Regards
> > >>>
> > >>> 2016-02-17 19:41 GMT+01:00 Susheel Kumar <susheel2...@gmail.com>:
> > >>>
> > >>>> Sure, I would love to join any call to discuss the latest and
> project
> > >>>> direction.
> > >>>>
> > >>>> Thanks,
> > >>>> Susheel
> > >>>>
> > >>>> On Fri, Feb 12, 2016 at 5:57 AM, Rafa Haro <rh...@apache.org>
> wrote:
> > >>>>
> > >>>>> Hi Devs,
> > >>>>>
> > >>>>> +1 from my side too. Anyone would join me next week anytime for a
> > >>>> Hangout?
> > >>>>> Although we can always discuss further in the list, I think it
> could
> > >>> be a
> > >>>>> good idea to discuss on a call on project direction, next releases,
> > >>> most
> > >>>>> urgent issues, priorities
> > >>>>>
> > >>>>> What do you guys think?
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Rafa
> > >>>>>
> > >>>>> On Fri, Feb 12, 2016 at 11:10 AM Rupert Westenthaler <
> > >>>>> rupert.westentha...@gmail.com> wrote:
> > >>>>>
> > >>>>>> The report is fine with me
> > >>>>>>
> > >>>>>> On Thu, Feb 11, 2016 at 11:48 AM, Fabian Christ <
> > >> fchr...@apache.org>
> > >>>>>> wrote:
> > >>>>>>> Hi Stanbolers,
> > >>>>>>>
> > >>>>>>> since I missed the last two slots in December and January to
> > >> report
> > >>>> to
> > >>>>>>> the board we have to submit a report this month.
> > >>>>>>>
> > >>>>>>> I have submitted the report below to the board.
> > >>>>>>>
> > >>>>>>> Best
> > >>>>>>> Fabian
> > >>>>>>>
> > >>>>>>> Status report for the Apache Stanbol Project - February, 2016
> > >>>>>>>
> > >>>>>>> Apache Stanbol provides a set of reusable components for semantic
> > >>>>> content
> > >>>>>>> management.
> > >>>>>>>
> > >>>>>>> First, apologies for the missed reports. We had to report in
> > >>>> December
> > >>>>>>> but missed that plus the January report.
> > >>>>>>>
> > >>>>>>> There are no issues which require board attention at the moment.
> > >>>>> However,
> > >>>>

Re: Hello Stanbol Community

2016-02-03 Thread Rafa Haro
Hi Susheel,

That sounds like a perfect use case for Stanbol. There are some nice guides
you can follow, for example:
https://stanbol.apache.org/docs/trunk/customvocabulary.html

Cheers!
Rafa

On Wed, Feb 3, 2016 at 5:49 PM Susheel Kumar  wrote:

> Hi,
>
> I am looking into annotating solr documents  for semantic enrichment using
> external private taxonomy in RDF format and wondering if this project is
> active and should give a try.
>
> Please suggest.
>
>
> Thanks,
> Susheel
>


Re: Re: question on working with custom vocabularies

2016-01-12 Thread Rafa Haro
Hi Kata,

There is probably a problem with your linking configuration. The first
thing I would check is whether your entities have labels in the language that
Stanbol detects automatically for the text.

If your instance is reachable at a public URL, I can take a look if you
want.
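
To make the label check concrete, here is a rough sketch (not part of Stanbol)
that counts the language tags on rdfs:label literals in an N-Triples export of
the vocabulary. It assumes the RDF source is available as N-Triples with one
triple per line; the function name label_languages and the sample data are
made up for illustration. In the log quoted in this thread the linking engine
searched the English ("@en") and the untagged label fields, so a vocabulary
that only carries, say, "@de" labels would plausibly return nothing for
English text.

```python
import re
from collections import Counter

RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"

# A literal object with an optional language tag, followed by the final dot.
# Typed literals ("..."^^<datatype>) deliberately do not match.
LITERAL = re.compile(
    r'"(?:[^"\\]|\\.)*"(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*))?\s*\.\s*$'
)


def label_languages(ntriples: str) -> Counter:
    """Count rdfs:label literals per language tag ('' means no tag)."""
    counts = Counter()
    for line in ntriples.splitlines():
        # Subject, predicate, and the rest of the line (object plus dot).
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or parts[1] != RDFS_LABEL:
            continue
        match = LITERAL.match(parts[2])
        if match:
            counts[match.group(1) or ""] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data; replace with the content of your own dump.
    sample = "\n".join([
        '<http://example.org/Berlin> ' + RDFS_LABEL + ' "Berlin"@de .',
        '<http://example.org/Koeln> ' + RDFS_LABEL + ' "Köln"@de .',
        '<http://example.org/Vienna> ' + RDFS_LABEL + ' "Vienna"@en .',
        '<http://example.org/Graz> ' + RDFS_LABEL + ' "Graz" .',
    ])
    for lang, count in sorted(label_languages(sample).items()):
        print(lang or "(no language)", count)
```

If the language that Stanbol detects for your text does not show up in the
counts (and there are no untagged labels either), the linking engine has
nothing to match against for that language.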

Cheers,
Rafa

On Tue, Jan 12, 2016 at 11:36 AM Lejtovicz, Katalin <
katalin.lejtov...@oeaw.ac.at> wrote:

> Hi Rafa,
>
>
>
> Thanks for your reply!
>
> The index file was deleted and the new one copied to the datafiles folder,
> but on another machine I also tried out a new installation of Stanbol, and
> copied the index file over, and it didn’t work there either.
>
>
>
> The EntityHub via the API works, I checked that first, when I noticed the
> problem. I get back my entities.
>
> Do you probably have any clue, what the problem can be? The index contains
> the entities, and I can also query them via EntityHub, but the solr query
> seems to be ‘incorrect’ (the first one, which is logged from Stanbol
> doesn’t return any results, but the second, that I tried out works fine).
>
>
>
> Thanks in advance!
>
>
>
> Best regards,
>
> Kata
>
>
>
> >Hi Kata,
>
> >
>
> >Have you overwritten the old solr index in the datafiles folder or have
> you
>
> >started from the scratch after fixing the encoding of the RDF files?
>
> >
>
> >Just a hint: you can check if your entities have been indexed by querying
>
> >then with the EntityHub API at Stanbol Web interface
>
> >
>
> >Hope that helps,
>
> >Rafa
>
> >
>
> >On Mon, Jan 11, 2016 at 7:19 PM Lejtovicz, Katalin <
>
> >katalin.lejtov...@oeaw.ac.at>> wrote:
>
> >
>
> >> Dear All,
>
> >>
>
> >> I have some problem with using custom vocabularies to enhance my
> content.
>
> >> I created an index with Stanbol from a vocabulary, deployed the .jar
> file
>
> >> and copied the solr index file to the datafiles folder, and created an
>
> >> EntityHub Linking Engine, plus a weighted chain, where the following
>
> >> pipeline was configured: langdetect, opennlp-sentence, opennlp-token,
>
> >> opennlp-pos, opennlp-chunker, and the an EntityHub Linking Engine for my
>
> >> custom vocab.
>
> >>
>
> >> It worked fine, when text was pasted in this enhancement chain in the
> user
>
> >> interface of Stanbol, entities were found. However we had an encoding
>
> >> problem in our RDF resource from which the index was built, so entities
>
> >> with umlaut (eg. ö, ä) were not found. We corrected the encoding of the
> RDF
>
> >> and I ran the indexing process again with the same config files, but
> with
>
> >> the new RDF resource.
>
> >> I again deployed (.jar and solr zip), and created the entityhub Linking
>
> >> Engine, plus the same Weighted Chain as above specified.
>
> >> Now I don't get any results, when I paste text in the text field of this
>
> >> chain in Stanbol.
>
> >>
>
> >> I configured log files, so that I can see what is happening. The
> linkable,
>
> >> matchable tokens, etc. are defined correctly eg. 'Berlin' in the
> sentence
>
> >> 'Berlin is a big city' is defined as linkable token:
>
> >>
>
> >> 11.01.2016 16:14:05.667 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.SectionData -
>
> >> TokenData: 'Berlin'[linkable=true(linkabkePos=true)|
>
> >> matchable=true(matchablePos=true)| alpha=true| seachLength=true|
>
> >> upperCase=true]
>
> >>
>
> >> Also it is sent to the solr index, but from there, no results come back:
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker ---
>
> >> preocess Token 0: Berlin (lemma: null) linkable=true, matchable=true |
>
> >> chunk: Chunk: [0, 6] Berlin
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> -
>
> >> 1:'is' (lemma: null) linkable=false, matchable=false
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> -
>
> >> 2:'a' (lemma: null) linkable=false, matchable=false
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> 
>
> >> searchStrings [Berlin]
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> >>
>
> >> request entities [0-20] entities ...
>
> >> 11.01.2016 16:14:05.669 *DEBUG* [Thread-9]
>
> >>
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker   <
>
> >> found 0 entities ...
>
> >>
>
> >> I also looked at the solr.log, the query looks like this:
>
> >> (((@en\/rdfs\:label\/:"Berlin")) OR ((@\/rdfs\:label\/:"Berlin")))
>
> >> hits=0 status=0 QTime=1
>
> >>
>
> >>
>
> >> I installed solr and copied the index file over to execute the above
>
> >> query. It does not result any Solr Documents, but the following one
> does:
>
> >> (((_\!@en\/rdfs\:label\/:" Berlin ")) OR ((_\!@\/rdfs\:label\/:" Berlin
>
> >> ")))
>
> >>
>
> >> Can 

Re: question on working with custom vocabularies

2016-01-11 Thread Rafa Haro
Hi Kata,

Have you overwritten the old Solr index in the datafiles folder, or have you
started from scratch after fixing the encoding of the RDF files?

Just a hint: you can check whether your entities have been indexed by querying
them with the EntityHub API in the Stanbol web interface.
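
A minimal sketch of that check, assuming the entity lookup URL pattern that has been posted on this list before (/entityhub/site/<site>/entity?id=<uri>). The base URL, the site name "myvocab" and the entity URI are placeholders, and treating HTTP 404 as "not indexed" is an assumption:

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def entity_lookup_url(base, site, entity_uri):
    """Build the lookup URL for one entity on one EntityHub site."""
    query = urlencode({"id": entity_uri})
    return f"{base.rstrip('/')}/entityhub/site/{site}/entity?{query}"


def is_indexed(base, site, entity_uri):
    """Return True if the site answers the lookup, False on HTTP 404 (assumed)."""
    try:
        with urlopen(entity_lookup_url(base, site, entity_uri)) as resp:
            return resp.status == 200
    except HTTPError as err:
        if err.code == 404:
            return False
        raise


# Placeholder values; replace them with your own host, site and entity URI.
url = entity_lookup_url(
    "http://localhost:8080", "myvocab", "http://example.org/vocab/Berlin"
)
print(url)
```

Calling is_indexed(...) with the same three arguments needs a running Stanbol instance; building the URL does not.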

Hope that helps,
Rafa

On Mon, Jan 11, 2016 at 7:19 PM Lejtovicz, Katalin <
katalin.lejtov...@oeaw.ac.at> wrote:

> Dear All,
>
> I have a problem using custom vocabularies to enhance my content.
> I created an index with Stanbol from a vocabulary, deployed the .jar file,
> copied the Solr index file to the datafiles folder, and created an
> EntityHub Linking Engine plus a weighted chain with the following
> pipeline: langdetect, opennlp-sentence, opennlp-token,
> opennlp-pos, opennlp-chunker, and then an EntityHub Linking Engine for my
> custom vocab.
>
> It worked fine: when text was pasted into this enhancement chain in the
> Stanbol user interface, entities were found. However, we had an encoding
> problem in the RDF resource from which the index was built, so entities
> with umlauts (e.g. ö, ä) were not found. We corrected the encoding of the RDF
> and I ran the indexing process again with the same config files, but with
> the new RDF resource.
> I deployed again (.jar and Solr zip) and created the EntityHub Linking
> Engine plus the same weighted chain as specified above.
> Now I don't get any results when I paste text into the text field of this
> chain in Stanbol.
>
> I configured log files so that I can see what is happening. The linkable and
> matchable tokens, etc. are defined correctly, e.g. 'Berlin' in the sentence
> 'Berlin is a big city' is defined as a linkable token:
>
> 11.01.2016 16:14:05.667 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.SectionData -
> TokenData: 'Berlin'[linkable=true(linkabkePos=true)|
> matchable=true(matchablePos=true)| alpha=true| seachLength=true|
> upperCase=true]
>
> It is also sent to the Solr index, but no results come back from there:
> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker ---
> preocess Token 0: Berlin (lemma: null) linkable=true, matchable=true |
> chunk: Chunk: [0, 6] Berlin
> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker -
> 1:'is' (lemma: null) linkable=false, matchable=false
> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker -
> 2:'a' (lemma: null) linkable=false, matchable=false
> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker   >>
> searchStrings [Berlin]
> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker>
> request entities [0-20] entities ...
> 11.01.2016 16:14:05.669 *DEBUG* [Thread-9]
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker   <
> found 0 entities ...
>
> I also looked at the solr.log, the query looks like this:
> (((@en\/rdfs\:label\/:"Berlin")) OR ((@\/rdfs\:label\/:"Berlin")))
> hits=0 status=0 QTime=1
>
>
> I installed Solr and copied the index file over to execute the above
> query. It does not return any Solr documents, but the following one does:
> (((_\!@en\/rdfs\:label\/:" Berlin ")) OR ((_\!@\/rdfs\:label\/:" Berlin
> ")))
>
> Can someone help me find what I am missing?
> Is it a configuration issue when I am creating the index? (What is strange is
> that I used the same config files for the incorrectly encoded RDF resource
> file, and that index worked.)
> Or is it a Stanbol issue?
>
> Thanks for any hints/help!
>
> Best regards,
> Kata
>
>


Re: Sling 8 release breaks Stanbol Builds (was Build failed in Jenkins: stanbol-0.12 #92)

2015-10-19 Thread Rafa Haro
Nope, the bundlelist is missing.
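
For anyone who wants to check this themselves, here is a small sketch that maps the coordinate from the build error to the path where the file would live under the standard Maven repository layout. It assumes the artifact type ("xml") is used directly as the file extension; the helper name is made up for illustration:

```python
def artifact_path(coordinate):
    """Map 'groupId:artifactId:type[:classifier]:version' to a repository path."""
    parts = coordinate.split(":")
    if len(parts) == 4:
        group, artifact, ext, version = parts
        classifier = ""
    elif len(parts) == 5:
        group, artifact, ext, classifier, version = parts
    else:
        raise ValueError(f"unexpected coordinate: {coordinate!r}")
    # Standard layout: group/with/slashes/artifact/version/artifact-version[-classifier].ext
    suffix = f"-{classifier}" if classifier else ""
    return f"{group.replace('.', '/')}/{artifact}/{version}/{artifact}-{version}{suffix}.{ext}"


REPO = "https://repo.maven.apache.org/maven2/"
path = artifact_path("org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8")
print(REPO + path)
```

If that URL is missing while the directory for version 8 exists, the launchpad artifact was deployed without its bundlelist attachment.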

On Mon, Oct 19, 2015 at 4:14 PM, Rafa Haro <rharoapa...@gmail.com> wrote:

> The artifact anyway seems to be there:
> https://repo.maven.apache.org/maven2/org/apache/sling/org.apache.sling.launchpad/8/
> On Mon, Oct 19, 2015 at 3:56 PM, Rafa Haro <rharoapa...@gmail.com> wrote:
>> Hi Rupert, 
>> I’m suffering this issue too, in my case not working directly with the trunk 
>> but with a recent revision of the project. I have been building my own 
>> launchers with this revision for a couple of months now and never had this 
>> problem that we are having today.
>> Cheers,
>> Rafa
>> On Mon, Oct 19, 2015 at 3:47 PM, Rupert Westenthaler
>> <rupert.westentha...@gmail.com> wrote:
>>> Hi all,
>>> looks like the Sling 8 release has broken Stanbol Builds.
>>> [ERROR] Failed to execute goal on project
>>> org.apache.stanbol.launchers.full: Could not resolve dependencies for
>>> project 
>>> org.apache.stanbol:org.apache.stanbol.launchers.full:jar:0.12.1-SNAPSHOT:
>>> Could not find artifact
>>> org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8 in
>>> central (http://repo1.maven.org/maven2) -> [Help 1]
>>> I have no Idea why the maven-launchpad-plugin tries to get the
>>> `org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8`.
>>> Based on the documentation on [1] the referenced file would not even
>>> get used as stanbol uses
>>> false
>>> I also tried to set the  property but with no success.
>>> I also found the sling-start-plugin [2]. Looks a bit like a
>>> replacement for the maven sling plugin ...
>>> Would be nice if someone could provide some more information on this.
>>> Otherwise I will try to ask over on the sling lists and/or do some
>>> additional digging.
>>> best
>>> Rupert
>>> [1] 
>>> https://sling.apache.org/documentation/development/maven-launchpad-plugin.html#general-configuration
>>> [2] https://sling.apache.org/documentation/development/slingstart.html
>>> -- Forwarded message --
>>> From: Apache Jenkins Server <jenk...@builds.apache.org>
>>> Date: Mon, Oct 19, 2015 at 3:03 PM
>>> Subject: Build failed in Jenkins: stanbol-0.12 #92
>>> To: dev@stanbol.apache.org, rwes...@apache.org
>>> See <https://builds.apache.org/job/stanbol-0.12/92/changes>
>>> Changes:
>>> [rwesten] minor: reverted change in the pom.xml of the
>>> integration-test module from the last commit (r1709397)
>>> [rwesten] fix for STANBOL-1443: The FastLRUCacheManager is now thread
>>> save; Minor: The FST Linking engine does no longer force a new
>>> Searcher to be opened during initialization; The Enhancer Stress Test
>>> Tool does no longer use Assert.assert*(..) during response processing.
>>> This caused responses not to be marked as success or failed. Now
>>> exceptions are thrown instead.
>>> --
>>> [...truncated 36460 lines...]
>>> [JENKINS] Archiving
>>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/pom.xml>
>>> to 
>>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-SNAPSHOT/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.pom
>>> [JENKINS] Archiving
>>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/target/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.jar>
>>> to 
>>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35.jar
>>> [JENKINS] Archiving
>>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/target/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT-sources.jar>
>>> to 
>>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35-sources.jar
>>> No artifacts from stanbol-0.12 » Apache Stanbol Commons Solr Installer
>>> #91 to compare, so performing full copy of artifacts
>>> [JENKINS] Archiving
>>> <https://builds.apache.org/job/stanbol-0.12/ws/cmsadapter/core/pom.xml>
>>> to 
>>> org.apache.stanbol/org.apache.stanbol.cmsadapter.core/0.12.1-SNAPSHOT/org.apache.stanbol.cmsadapter.core-0.12.1-SNAPSHOT.pom
>>> [JENKINS] Archiving
>>> <https://builds.apache.org/job/stanbol-0.12/ws/cmsadapter/core/target/org.apache.stanbol.cmsadapter.core-0.1

Re: Sling 8 release breaks Stanbol Builds (was Build failed in Jenkins: stanbol-0.12 #92)

2015-10-19 Thread Rafa Haro
The artifacts were last updated on October 13th, so there has probably been an
undetected error in the artifact deployment. We should open an issue
for this in Sling’s JIRA.

On Mon, Oct 19, 2015 at 4:15 PM, Rafa Haro <rharoapa...@gmail.com> wrote:

> Nop, The bundlelist is missing
> On Mon, Oct 19, 2015 at 4:14 PM, Rafa Haro <rharoapa...@gmail.com> wrote:
>> The artifact anyway seems to be there:
>> https://repo.maven.apache.org/maven2/org/apache/sling/org.apache.sling.launchpad/8/
>> On Mon, Oct 19, 2015 at 3:56 PM, Rafa Haro <rharoapa...@gmail.com> wrote:
>>> Hi Rupert, 
>>> I’m suffering this issue too, in my case not working directly with the 
>>> trunk but with a recent revision of the project. I have been building my 
>>> own launchers with this revision for a couple of months now and never had 
>>> this problem that we are having today.
>>> Cheers,
>>> Rafa
>>> On Mon, Oct 19, 2015 at 3:47 PM, Rupert Westenthaler
>>> <rupert.westentha...@gmail.com> wrote:
>>>> Hi all,
>>>> looks like the Sling 8 release has broken Stanbol Builds.
>>>> [ERROR] Failed to execute goal on project
>>>> org.apache.stanbol.launchers.full: Could not resolve dependencies for
>>>> project 
>>>> org.apache.stanbol:org.apache.stanbol.launchers.full:jar:0.12.1-SNAPSHOT:
>>>> Could not find artifact
>>>> org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8 in
>>>> central (http://repo1.maven.org/maven2) -> [Help 1]
>>>> I have no Idea why the maven-launchpad-plugin tries to get the
>>>> `org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8`.
>>>> Based on the documentation on [1] the referenced file would not even
>>>> get used as stanbol uses
>>>> false
>>>> I also tried to set the  property but with no success.
>>>> I also found the sling-start-plugin [2]. Looks a bit like a
>>>> replacement for the maven sling plugin ...
>>>> Would be nice if someone could provide some more information on this.
>>>> Otherwise I will try to ask over on the sling lists and/or do some
>>>> additional digging.
>>>> best
>>>> Rupert
>>>> [1] 
>>>> https://sling.apache.org/documentation/development/maven-launchpad-plugin.html#general-configuration
>>>> [2] https://sling.apache.org/documentation/development/slingstart.html
>>>> -- Forwarded message --
>>>> From: Apache Jenkins Server <jenk...@builds.apache.org>
>>>> Date: Mon, Oct 19, 2015 at 3:03 PM
>>>> Subject: Build failed in Jenkins: stanbol-0.12 #92
>>>> To: dev@stanbol.apache.org, rwes...@apache.org
>>>> See <https://builds.apache.org/job/stanbol-0.12/92/changes>
>>>> Changes:
>>>> [rwesten] minor: reverted change in the pom.xml of the
>>>> integration-test module from the last commit (r1709397)
>>>> [rwesten] fix for STANBOL-1443: The FastLRUCacheManager is now thread
>>>> save; Minor: The FST Linking engine does no longer force a new
>>>> Searcher to be opened during initialization; The Enhancer Stress Test
>>>> Tool does no longer use Assert.assert*(..) during response processing.
>>>> This caused responses not to be marked as success or failed. Now
>>>> exceptions are thrown instead.
>>>> --
>>>> [...truncated 36460 lines...]
>>>> [JENKINS] Archiving
>>>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/pom.xml>
>>>> to 
>>>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-SNAPSHOT/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.pom
>>>> [JENKINS] Archiving
>>>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/target/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.jar>
>>>> to 
>>>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35.jar
>>>> [JENKINS] Archiving
>>>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/target/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT-sources.jar>
>>>> to 
>>>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35-sources.jar
>>>> No artifacts from stanbol

Re: Sling 8 release breaks Stanbol Builds (was Build failed in Jenkins: stanbol-0.12 #92)

2015-10-19 Thread Rafa Haro
Hi Rupert,

I’m hitting this issue too, in my case not working directly with the trunk
but with a recent revision of the project. I have been building my own
launchers with this revision for a couple of months now and never had the
problem that we are having today.

Cheers,
Rafa

On Mon, Oct 19, 2015 at 3:47 PM, Rupert Westenthaler
 wrote:

> Hi all,
> looks like the Sling 8 release has broken Stanbol Builds.
> [ERROR] Failed to execute goal on project
> org.apache.stanbol.launchers.full: Could not resolve dependencies for
> project 
> org.apache.stanbol:org.apache.stanbol.launchers.full:jar:0.12.1-SNAPSHOT:
> Could not find artifact
> org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8 in
> central (http://repo1.maven.org/maven2) -> [Help 1]
> I have no Idea why the maven-launchpad-plugin tries to get the
> `org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8`.
> Based on the documentation on [1] the referenced file would not even
> get used as stanbol uses
> false
> I also tried to set the  property but with no success.
> I also found the sling-start-plugin [2]. Looks a bit like a
> replacement for the maven sling plugin ...
> Would be nice if someone could provide some more information on this.
> Otherwise I will try to ask over on the sling lists and/or do some
> additional digging.
> best
> Rupert
> [1] 
> https://sling.apache.org/documentation/development/maven-launchpad-plugin.html#general-configuration
> [2] https://sling.apache.org/documentation/development/slingstart.html
> -- Forwarded message --
> From: Apache Jenkins Server 
> Date: Mon, Oct 19, 2015 at 3:03 PM
> Subject: Build failed in Jenkins: stanbol-0.12 #92
> To: dev@stanbol.apache.org, rwes...@apache.org
> See 
> Changes:
> [rwesten] minor: reverted change in the pom.xml of the
> integration-test module from the last commit (r1709397)
> [rwesten] fix for STANBOL-1443: The FastLRUCacheManager is now thread
> save; Minor: The FST Linking engine does no longer force a new
> Searcher to be opened during initialization; The Enhancer Stress Test
> Tool does no longer use Assert.assert*(..) during response processing.
> This caused responses not to be marked as success or failed. Now
> exceptions are thrown instead.
> --
> [...truncated 36460 lines...]
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-SNAPSHOT/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.pom
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35.jar
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35-sources.jar
> No artifacts from stanbol-0.12 » Apache Stanbol Commons Solr Installer
> #91 to compare, so performing full copy of artifacts
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.cmsadapter.core/0.12.1-SNAPSHOT/org.apache.stanbol.cmsadapter.core-0.12.1-SNAPSHOT.pom
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.cmsadapter.core/0.12.1-20151019.125700-32/org.apache.stanbol.cmsadapter.core-0.12.1-20151019.125700-32.jar
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.cmsadapter.core/0.12.1-20151019.125700-32/org.apache.stanbol.cmsadapter.core-0.12.1-20151019.125700-32-sources.jar
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.launchers.full/0.12.1-SNAPSHOT/org.apache.stanbol.launchers.full-0.12.1-SNAPSHOT.pom
> [JENKINS] Archiving
> 
> to 
> org.apache.stanbol/org.apache.stanbol.contenthub.search.reactor/0.12.1-SNAPSHOT/org.apache.stanbol.contenthub.search.reactor-0.12.1-SNAPSHOT.pom
> [JENKINS] Archiving
> 
> to 
> 

Re: Sling 8 release breaks Stanbol Builds (was Build failed in Jenkins: stanbol-0.12 #92)

2015-10-19 Thread Rafa Haro
The artifact seems to be there anyway:

https://repo.maven.apache.org/maven2/org/apache/sling/org.apache.sling.launchpad/8/

On Mon, Oct 19, 2015 at 3:56 PM, Rafa Haro <rharoapa...@gmail.com> wrote:

> Hi Rupert, 
> I’m suffering this issue too, in my case not working directly with the trunk 
> but with a recent revision of the project. I have been building my own 
> launchers with this revision for a couple of months now and never had this 
> problem that we are having today.
> Cheers,
> Rafa
> On Mon, Oct 19, 2015 at 3:47 PM, Rupert Westenthaler
> <rupert.westentha...@gmail.com> wrote:
>> Hi all,
>> looks like the Sling 8 release has broken Stanbol Builds.
>> [ERROR] Failed to execute goal on project
>> org.apache.stanbol.launchers.full: Could not resolve dependencies for
>> project 
>> org.apache.stanbol:org.apache.stanbol.launchers.full:jar:0.12.1-SNAPSHOT:
>> Could not find artifact
>> org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8 in
>> central (http://repo1.maven.org/maven2) -> [Help 1]
>> I have no Idea why the maven-launchpad-plugin tries to get the
>> `org.apache.sling:org.apache.sling.launchpad:xml:bundlelist:8`.
>> Based on the documentation on [1] the referenced file would not even
>> get used as stanbol uses
>> false
>> I also tried to set the  property but with no success.
>> I also found the sling-start-plugin [2]. Looks a bit like a
>> replacement for the maven sling plugin ...
>> Would be nice if someone could provide some more information on this.
>> Otherwise I will try to ask over on the sling lists and/or do some
>> additional digging.
>> best
>> Rupert
>> [1] 
>> https://sling.apache.org/documentation/development/maven-launchpad-plugin.html#general-configuration
>> [2] https://sling.apache.org/documentation/development/slingstart.html
>> -- Forwarded message --
>> From: Apache Jenkins Server <jenk...@builds.apache.org>
>> Date: Mon, Oct 19, 2015 at 3:03 PM
>> Subject: Build failed in Jenkins: stanbol-0.12 #92
>> To: dev@stanbol.apache.org, rwes...@apache.org
>> See <https://builds.apache.org/job/stanbol-0.12/92/changes>
>> Changes:
>> [rwesten] minor: reverted change in the pom.xml of the
>> integration-test module from the last commit (r1709397)
>> [rwesten] fix for STANBOL-1443: The FastLRUCacheManager is now thread
>> save; Minor: The FST Linking engine does no longer force a new
>> Searcher to be opened during initialization; The Enhancer Stress Test
>> Tool does no longer use Assert.assert*(..) during response processing.
>> This caused responses not to be marked as success or failed. Now
>> exceptions are thrown instead.
>> --
>> [...truncated 36460 lines...]
>> [JENKINS] Archiving
>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/pom.xml>
>> to 
>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-SNAPSHOT/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.pom
>> [JENKINS] Archiving
>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/target/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT.jar>
>> to 
>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35.jar
>> [JENKINS] Archiving
>> <https://builds.apache.org/job/stanbol-0.12/ws/commons/solr/install/target/org.apache.stanbol.commons.solr.install-0.12.1-SNAPSHOT-sources.jar>
>> to 
>> org.apache.stanbol/org.apache.stanbol.commons.solr.install/0.12.1-20151019.122302-35/org.apache.stanbol.commons.solr.install-0.12.1-20151019.122302-35-sources.jar
>> No artifacts from stanbol-0.12 » Apache Stanbol Commons Solr Installer
>> #91 to compare, so performing full copy of artifacts
>> [JENKINS] Archiving
>> <https://builds.apache.org/job/stanbol-0.12/ws/cmsadapter/core/pom.xml>
>> to 
>> org.apache.stanbol/org.apache.stanbol.cmsadapter.core/0.12.1-SNAPSHOT/org.apache.stanbol.cmsadapter.core-0.12.1-SNAPSHOT.pom
>> [JENKINS] Archiving
>> <https://builds.apache.org/job/stanbol-0.12/ws/cmsadapter/core/target/org.apache.stanbol.cmsadapter.core-0.12.1-SNAPSHOT.jar>
>> to 
>> org.apache.stanbol/org.apache.stanbol.cmsadapter.core/0.12.1-20151019.125700-32/org.apache.stanbol.cmsadapter.core-0.12.1-20151019.125700-32.jar
>> [JENKINS] Archiving
>> <https://builds.apache.org/job/stanbol-0.12/ws/cmsadapter/core/target/org.apache.stanbol.cmsadapter.core-0.12.1-SNAPSHOT-sources.jar>
>> to 
>> org.apache.st

Re: September Report Input

2015-09-03 Thread Rafa Haro
Hi Fabian,

As we agreed, I have started to reorganize the issues in Jira. I closed a lot of
issues that were very old or already fixed. I think we can now review the
current status, and we should probably schedule a release.

Cheers,
Rafa

On Thu, Sep 3, 2015 at 10:59 AM, Rupert Westenthaler
 wrote:

> Hi all,
> The main two things for me are:
> * I have updated the JSON-LD implementation (STANBOL-1439) and also
> created a pull request for java-jsonld [1] with the mid-term goal of
> getting rid of the Stanbol-specific JSON-LD code completely and using a
> lib managed by the JSON-LD community
> * I integrated IXA Pipes Nerc with Apache Stanbol (see [2] for
> details): IXA Pipes Nerc provides high quality Named Entity
> Recognition Models. Those can now be used with the OpenNLP Named
> Entity Recognition Engine in Apache Stanbol.
> I also worked on several smaller bug fixes and improvements related to
> the use of Apache Stanbol within Fusepool and Redlink.
> best
> Rupert
> [1] https://github.com/jsonld-java/jsonld-java/pull/146
> [2] http://stanbol.markmail.org/thread/cxo4jloxesdkj2sr
> On Thu, Sep 3, 2015 at 10:50 AM, Fabian Christ  wrote:
>> Hi Stanbolers,
>>
>> it is report time again and I need input for the September 2015 report.
>>
>> What has happened?
>>
>> Best
>> Fabian
> -- 
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11  ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO 
> ..
> | http://redlink.co/

Re: Problem with Referenced Site and Enhancer

2015-08-07 Thread Rafa Haro
Hi Mano,

I have never tried to configure a ReferencedSite without a local index (i.e.
using only the remote dataset), so I can’t help you much right now.
According to the documentation, the remote site is apparently used only for
dereferencing and can’t be used alone for enhancing (that is, for searching
for entities using SPARQL).

Let’s see if someone else can shed light on this. Sorry :-(

On Thu, Aug 6, 2015 at 5:25 PM, Mano Swerts mano.swe...@aca-it.be wrote:

 Hi all,
 I want to use Stanbol to enhance content, but I do not succeed in setting
 it up. We use the following tools:
- *SkosJS* (for a non-technical user to manage a taxonomy)
- *Apache Marmotta* (contains the data. It is linked to SkosJS. It was
not possible to link SkosJS with Stanbol)
- *Stanbol* (use the data in Apache Marmotta to enhance content)
 I registered Marmotta as a Referenced Site through SPARQL. My entities are
 available through this Referenced Site.
 I created an Enhancer Engine coupled to the Referenced Site, which is added
 to the default chain. It is detected and used by Stanbol, but I get no
 results when enhancing content.
 I looked at the existing DBPedia setup and noticed that it uses a Solr Yard
 and Cache. I think this might be the clue, but when I link my Reference
 Site to my newly created Cache it is suddenly not available anymore.
 Therefore the enhancers won't work.
 I simply created the Solr Yard and Entityhub Cache through the Apache Felix
 Web Console.
 I noticed that there is a folder called indexes which contains indexes
 from DBPedia, but none for my own Yard. I also read something about
 generating indexes manually, but in our use case this does not seem
 feasible. When users add new data using SkosJS, it should be automatically
 made available to the Enhancer.
 Can somebody help me with this?
 Many thanks.
 Mano Swerts

Re: Problem with Referenced Site and Enhancer

2015-08-07 Thread Rafa Haro
Hi Mano,

I understand what you mean now. You are making a wrong assumption about
Stanbol. The Entityhub Referenced Site local cache doesn’t work that way: it
doesn’t automatically synchronize with the remote site. Like any other cache, as
far as I know, it works by storing locally those remote entities that you have
retrieved through dereferencing at some point. That means that when you configure
it for the first time, Stanbol is not going to retrieve all the entities with
something like a SELECT ?s ?p ?o SPARQL query.
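
To illustrate the point, here is a toy read-through cache. It is only a sketch of the behaviour described above, not Stanbol code; the class, the method names and the sample URIs are all made up:

```python
class ReadThroughCache:
    """Toy cache: an entity is stored locally only after it was dereferenced."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: entity id -> entity or None
        self._local = {}

    def get(self, entity_id):
        """Dereference one entity, filling the local store on a miss."""
        if entity_id not in self._local:
            entity = self._fetch_remote(entity_id)
            if entity is None:
                return None
            self._local[entity_id] = entity
        return self._local[entity_id]

    def search_local(self, label):
        """Label search sees only what is already stored locally."""
        return sorted(eid for eid, e in self._local.items() if e["label"] == label)


# Stand-in for the remote dataset.
remote = {
    "urn:ex:berlin": {"label": "Berlin"},
    "urn:ex:paris": {"label": "Paris"},
}
cache = ReadThroughCache(remote.get)

before = cache.search_local("Berlin")  # nothing has been dereferenced yet
cache.get("urn:ex:berlin")             # explicit lookup fills the local store
after = cache.search_local("Berlin")
print(before, after)
```

A search over the local store finds an entity only after someone has looked it up explicitly, which is why a freshly configured cache gives the enhancer nothing to match against.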




Full synchronization with a triple store is, in my opinion, an extremely
interesting use case, because it is very natural, as you are doing right now,
to store RDF data in a triple store and want it immediately available for
enhancing. The thing is, the SolrYard is the only Yard that really works for
enhancing. So there should be a way to synchronize a triple store backend with
a Stanbol SolrYard, but this is not easy to architect from the Stanbol
point of view, and it would also mean coupling to a concrete triple store, where
you would need to add the module that pushes to Stanbol.




So far, the best option would be to use a ManagedSite with a SolrYard instead
of a ReferencedSite and use the EntityHub REST API for pushing entities. You
are using SKOSjs, so maybe you can extend it to communicate directly with the
ManagedSite. I suppose that wouldn’t be straightforward either, because any
editing action in SKOSjs that ultimately relies on concrete SPARQL queries must
have a corresponding REST call to the EntityHub.
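
As a rough sketch of what such a push could look like from a client: the endpoint path, the PUT method and the text/turtle content type below are assumptions modelled on the entity lookup URL, so please check them against the EntityHub documentation for managed sites. The host, site name and entity are placeholders. The code only builds the request, it does not send it:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_push_request(base, site, entity_uri, turtle):
    """Build (but do not send) a request that pushes one entity as Turtle."""
    query = urlencode({"id": entity_uri})
    url = f"{base.rstrip('/')}/entityhub/site/{site}/entity?{query}"  # assumed path
    return Request(
        url,
        data=turtle.encode("utf-8"),
        method="PUT",  # assumed method
        headers={"Content-Type": "text/turtle"},
    )


# Placeholder entity with a single SKOS label.
TURTLE = (
    "<http://example.org/vocab/Berlin> "
    '<http://www.w3.org/2004/02/skos/core#prefLabel> "Berlin"@en .\n'
)
req = build_push_request(
    "http://localhost:8080", "myvocab", "http://example.org/vocab/Berlin", TURTLE
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; every editing action in the taxonomy editor would need to trigger a call like this.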




Cheers,

Rafa

On Fri, Aug 7, 2015 at 11:23 AM, Mano Swerts mano.swe...@aca-it.be
wrote:

 Hi Rafa,
 Thank you for the reply!
 I have no issue with the fact that a local index is required, but the local
 index should update automatically (or check for changes in a cron) to make
 sure that it is up to date. It is not useful if the client needs to contact
 us every time they make a change so that we can update the indexes.
 I also saw a module in the Stanbol source code called jennatdb. Is this
 maybe an option? I have no problem with switching to Jena instead of
 Marmotta.
 Hopefully somebody else in this mailing list can help me.
 Kind regards.
 Mano Swerts
 On Fri, Aug 7, 2015 at 11:07 AM, Rafa Haro rharoapa...@gmail.com wrote:
 Hi Mano,




 I have never tried to configure a ReferencedSite without a local index
 (i.e. using only the remote dataset) so I couldn’t help you right now too
 much. According to the documentation, apparently the remote site is used
 only for dereferencing but can’t be used alone for enhancing (for searching
 for entities using SPARQL).




 Let’s see if someone else can shed light on this. Sorry :-(

 On Thu, Aug 6, 2015 at 5:25 PM, Mano Swerts mano.swe...@aca-it.be wrote:

  Hi all,
  I want to use Stanbol to enhance content, but I do not succeed in setting
  it up. We use the following tools:
 - *SkosJS* (for a non-technical user to manage a taxonomy)
 - *Apache Marmotta* (contains the data. It is linked to SkosJS. It was
 not possible to link SkosJS with Stanbol)
 - *Stanbol* (use the data in Apache Marmotta to enhance content)
  I registered Marmotta as a Referenced Site through SPARQL. My entities
 are
  available through this Referenced Site.
  I created an Enhancer Engine coupled to the Referenced Site, which is
 added
  to the default chain. It is detected and used by Stanbol, but I get no
  results when enhancing content.
  I looked at the existing DBPedia setup and noticed that it uses a Solr
 Yard
  and Cache. I think this might be the clue, but when I link my Reference
  Site to my newly created Cache it is suddenly not available anymore.
  Therefore the enhancers won't work.
  I simply created the Solr Yard and Entityhub Cache through the Apache
 Felix
  Web Console.
  I noticed that there is a folder called indexes which contains indexes
  from DBPedia, but none for my own Yard. I also read something about
  generating indexes manually, but in our use case this does not seem
  feasible. When users add new data using SkosJS, it should be automatically
  made available to the Enhancer.
  Can somebody help me with this?
  Many thanks.
  Mano Swerts


Re: Report time again. Query for input

2015-06-18 Thread Rafa Haro
Hi Rupert,

On Thu, Jun 18, 2015 at 12:33 PM, Rupert Westenthaler 
rupert.westentha...@gmail.com wrote:

 Hi all,

 I like the Idea of creating a roadmap. @Rafa would you be able to
 drive this process.


Sure, no problem. The very first thing we need to do is to close release
1.0. For that, Rupert is probably the best placed person to say which
current issues must be included. I wouldn't add any other functionality
until this version is released. We also have to decide how it is going to be
released in terms of binaries, resolving all the issues related to third-party
resources like OpenNLP models.

After that, brainstorming should start for the next releases. First, which
components are going to be deprecated. For the rest, we set up a roadmap
around desired functionality or feedback from the users. I can take a look
at Jira, harvest the tickets that are interesting for the future, and
start closing old and deprecated ones.

How does that sound?



 I am around until the end of July; after that I am on vacation for ~1
 month. If possible I would like to have this sorted out before my
 vacation.

 best
 Rupert


 On Wed, Jun 17, 2015 at 2:00 PM, Rafa Haro rh...@apache.org wrote:
  Hi Fabian,
 
  On Wed, Jun 17, 2015 at 1:34 PM, Fabian Christ 
 christ.fab...@googlemail.com
  wrote:
 
  Hi,
 
  yes we have indeed the situation where we do not have a project owner
  who is doing some oversight of the project. I am the current chair of
  the project but also do not find the time to manage the project. I
  would have resigned as a chair if there was a person who would like to
  be in charge and become the shepherd of the project.
 
 
  Maybe it is easier for you and everyone after the reorganization. Most of
  us have very limited time for Stanbol, but this might change if we
  are able to catch up on everything, get a clean release with the current
  codebase, and build a new roadmap with feedback from everyone who is using
  Stanbol on a regular basis and wants to join the discussion.
 
 
 
  Anything that addresses this issue is +1 from me. If there is someone
  standing up and would like to clean up - go for it.
 
  In my opinion we should shrink the code base a lot and remove anything
  that was not touched for two years. Same for the JIRA issues. Just
  keep a small Stanbol kernel around the enhancer that is manageable. We
  started Stanbol with a much larger team during a research project but
  today we have to focus. And Rafa is right that this should be part of
  roadmap planning.
 
 
  +1. I can help on this. I can also help with some ideas for the future
 and
  start discussions around them.
 
 
 
  Regarding a chat or meeting: We have to discuss this in the open on
  the mailing list. We have the credo that anything that was not
  discussed on the ML did not happen. So, just discuss the matters here
  and only move to chat or whatever if really necessary.
 
 
  ok. Let's see if we can bring together a great number of actual
 committers
  for this.
 
  Cheers,
  Rafa
 
 
 
  Best
   - Fabian
 
  2015-06-16 14:35 GMT+02:00 Dileepa Jayakody dileepajayak...@gmail.com
 :
   +1 Rafa. I too would like to join the discussion.
  
  
   On Tue, Jun 16, 2015 at 5:19 PM, Rafa Haro rh...@apache.org wrote:
  
   Hi Fabian, Rupert,
  
   I have been taking a look at the Stanbol Jira and there is certainly a
   great mess there. For example, there are a lot of issues opened more than
   one or two years ago. What do you think about having a session for
   properly organizing this? We can meet on the IRC channel or wherever you
   prefer and talk about releases, expectations, roadmap. I have the
   impression that there is a complete lack of planning for the project. I
   know that I'm probably not the best placed person to say this
   because I'm far from being a daily contributor. But maybe we can
   motivate committers again by building a great backlog and planning properly
   where the project should go. I have been using Stanbol lately for two real
   use cases and I could share my impressions with you and provide feedback
   on functionality that would have been nice to have while developing the
   projects.

   It would also be great to know how other Apache projects build their
   roadmaps. Maybe we can find a model to follow.
  
   Cheers,
   Rafa
  
   On Mon, Jun 15, 2015 at 11:22 AM, Rupert Westenthaler 
   rupert.westentha...@gmail.com wrote:
  
On Mon, Jun 15, 2015 at 8:14 AM, Fabian Christ fchr...@apache.org
 
   wrote:
 here is my draft for the report. Anything to add?
   
Hi Fabian
   
Nothing that is so important for adding to the Report.
   
On Mon, Jun 15, 2015 at 8:14 AM, Fabian Christ fchr...@apache.org
 
   wrote:
 The project is doing well but on a slow level. Main issue at the
  moment
 may be that there is no one who is cutting releases from time to
  time.
 The last release was cut one year ago. The project is aware

Re: Report time again. Query for input

2015-06-17 Thread Rafa Haro
Hi Fabian,

On Wed, Jun 17, 2015 at 1:34 PM, Fabian Christ christ.fab...@googlemail.com
 wrote:

 Hi,

 yes we have indeed the situation where we do not have a project owner
 who is doing some oversight of the project. I am the current chair of
 the project but also do not find the time to manage the project. I
 would have resigned as a chair if there was a person who would like to
 be in charge and become the shepherd of the project.


Maybe it is easier for you and everyone after the reorganization. Most of
us have very limited time for Stanbol, but this could change if we
are able to catch up on everything, get a clean release with the current
codebase, and build a new roadmap with feedback from everyone who uses
Stanbol on a regular basis and wants to join the discussion.



 Anything that addresses this issue is +1 from me. If there is someone
 standing up and would like to clean up - go for it.

 In my opinion we should shrink the code base a lot and remove anything
 that was not touched for two years. Same for the JIRA issues. Just
 keep a small Stanbol kernel around the enhancer that is manageable. We
 started Stanbol with a much larger team during a research project but
 today we have to focus. And Rafa is right that this should be part of
 roadmap planning.


+1. I can help with this. I can also contribute some ideas for the future
and start discussions around them.



 Regarding a chat or meeting: We have to discuss this in the open on
 the mailing list. We have the credo that anything that was not
 discussed on the ML did not happen. So, just discuss the matters here
 and only move to chat or whatever if really necessary.


OK. Let's see if we can bring together a good number of current committers
for this.

Cheers,
Rafa



 Best
  - Fabian

 2015-06-16 14:35 GMT+02:00 Dileepa Jayakody dileepajayak...@gmail.com:
  +1 Rafa. I too would like to join the discussion.
 
 
  On Tue, Jun 16, 2015 at 5:19 PM, Rafa Haro rh...@apache.org wrote:
 
  Hi Fabian, Rupert,
 
  I have been taking a look at the Stanbol Jira and there is certainly a
  great mess there. For example, there are a lot of issues that were opened
  more than one or two years ago. What do you think about having a session
  to organize this properly? We can meet on the IRC channel or wherever you
  prefer and talk about releases, expectations, and the roadmap. I have the
  impression that there is a complete lack of planning for the project. I
  know that I'm probably not the best person to say this, because I'm far
  from being a daily contributor. But maybe we can motivate committers again
  by building a good backlog and properly planning where the project should
  go. I have been using Stanbol lately for two real use cases, and I could
  share my impressions with you and provide feedback on functionality that
  would have been nice to have while developing those projects.

  It would also be great to know how other Apache projects build their
  roadmaps. Maybe we can find a model to follow.
 
  Cheers,
  Rafa
 
  On Mon, Jun 15, 2015 at 11:22 AM, Rupert Westenthaler 
  rupert.westentha...@gmail.com wrote:
 
   On Mon, Jun 15, 2015 at 8:14 AM, Fabian Christ fchr...@apache.org
  wrote:
here is my draft for the report. Anything to add?
  
   Hi Fabian
  
   Nothing that is so important for adding to the Report.
  
   On Mon, Jun 15, 2015 at 8:14 AM, Fabian Christ fchr...@apache.org
  wrote:
The project is doing well but on a slow level. Main issue at the
 moment
may be that there is no one who is cutting releases from time to
 time.
The last release was cut one year ago. The project is aware of this
issue and is working on it.
   
  
   That's very true!

   Slow Progress ...

   * I updated most of the dependencies (STANBOL-1419, for both 0.12.1 and
   trunk). So we are again using the newest Apache Sling Launcher and Apache
   Felix, the new Apache Sling logging with Logback support, ...
   * Added support for ixa-nerc - Named Entity Recognition models. They
   provide much better quality models for OpenNLP. Languages: English,
   Spanish, German, Italian and Dutch.
   * I am also working together with Alfonso Noriega on adding support
   for the Stanford NLP sentiment annotation. For Stanbol this will mean
   that the Restful NLP engine will support Sentiment Annotations.

   Another issue is questions about the Contenthub and CMS Adapter that
   we cannot really handle, as the original developers are no longer
   active. During the dependency upgrades (STANBOL-1419) I noticed that
   those components do not have any integration tests. So while I fixed
   all related module-level issues (as reported by the unit tests), I was
   unable to ensure that everything was working as intended at runtime
   (as I could for all other components that do have integration tests).

   About Releases:

   I am happy to help with releases. It would be best if you, Fabian, could
   show me everything I need to know

Re: Report time again. Query for input

2015-06-16 Thread Rafa Haro
Hi Fabian, Rupert,

I have been taking a look at the Stanbol Jira and there is certainly a
great mess there. For example, there are a lot of issues that were opened
more than one or two years ago. What do you think about having a session
to organize this properly? We can meet on the IRC channel or wherever you
prefer and talk about releases, expectations, and the roadmap. I have the
impression that there is a complete lack of planning for the project. I
know that I'm probably not the best person to say this, because I'm far
from being a daily contributor. But maybe we can motivate committers again
by building a good backlog and properly planning where the project should
go. I have been using Stanbol lately for two real use cases, and I could
share my impressions with you and provide feedback on functionality that
would have been nice to have while developing those projects.

It would also be great to know how other Apache projects build their
roadmaps. Maybe we can find a model to follow.

Cheers,
Rafa

On Mon, Jun 15, 2015 at 11:22 AM, Rupert Westenthaler 
rupert.westentha...@gmail.com wrote:

 On Mon, Jun 15, 2015 at 8:14 AM, Fabian Christ fchr...@apache.org wrote:
  here is my draft for the report. Anything to add?

 Hi Fabian

 Nothing that is so important for adding to the Report.

 On Mon, Jun 15, 2015 at 8:14 AM, Fabian Christ fchr...@apache.org wrote:
  The project is doing well but on a slow level. Main issue at the moment
  may be that there is no one who is cutting releases from time to time.
  The last release was cut one year ago. The project is aware of this
  issue and is working on it.
 

 That's very true!

 Slow Progress ...

 * I updated most of the dependencies (STANBOL-1419, for both 0.12.1 and
 trunk). So we are again using the newest Apache Sling Launcher and Apache
 Felix, the new Apache Sling logging with Logback support, ...
 * Added support for ixa-nerc - Named Entity Recognition models. They
 provide much better quality models for OpenNLP. Languages: English,
 Spanish, German, Italian and Dutch.
 * I am also working together with Alfonso Noriega on adding support
 for the Stanford NLP sentiment annotation. For Stanbol this will mean
 that the Restful NLP engine will support Sentiment Annotations.

 Another issue is questions about the Contenthub and CMS Adapter that
 we cannot really handle, as the original developers are no longer
 active. During the dependency upgrades (STANBOL-1419) I noticed that
 those components do not have any integration tests. So while I fixed
 all related module-level issues (as reported by the unit tests), I was
 unable to ensure that everything was working as intended at runtime
 (as I could for all other components that do have integration tests).

 About Releases:

 I am happy to help with releases. It would be best if you, Fabian, could
 show me everything I need to know to create releases on my own in the
 future.

 best
 Rupert

 --
 | Rupert Westenthaler rupert.westentha...@gmail.com
 | Bodenlehenstraße 11  ++43-699-11108907
 | A-5500 Bischofshofen
 | REDLINK.CO
 ..
 | http://redlink.co/



Re: Status of dev.iks-project.eu server

2015-05-19 Thread Rafa Haro
Hi Rupert,




We can provide the launcher along with a downloader tool for the non-Apache
dependencies. Another possibility is to provide a link on the Stanbol website
to a ready-to-use Docker image on Docker Hub.




What do you guys think?



--
Sent from Mailbox

On Tue, May 19, 2015 at 2:22 PM, Rupert Westenthaler
rupert.westentha...@gmail.com wrote:

 Hi Rafa,

 On Mon, May 18, 2015 at 10:03 PM, Rafa Haro rharoapa...@gmail.com wrote:
  I would rather provide a ready-to-use binary or package, as other
  projects such as Solr do.

 If you want to work on this you have my full support. The problem
 with ready-to-use is that the OpenNLP models are not under the Apache
 License and therefore we cannot include them in a binary release.
 Therefore the current Stanbol Launchers (stable, full and full-war)
 can NOT be distributed as binary releases.

 What we could provide is
 * a binary release of the software
 * excluding any OpenNLP models
 * a small custom vocabulary configured with a plain entity linking chain.
 * AFAIK we could also include the default DBpedia index, but without
 NLP models for linking this does not seem to be very useful.

 best
 Rupert




 Cheers

 Rafa



 --
 Sent from Mailbox

 On Mon, May 18, 2015 at 8:25 PM, Andreas Kuckartz a.kucka...@ping.de
 wrote:

 Dileepa Jayakody wrote:
  Thanks Fabian, Rupert and Rafa.

  I agree with Rafa, that there is no reason for us to have a live hosted
  Stanbol instance.

 A bit late, but I do not agree with that.

 The best way for many people to see the value of Stanbol is to see it in
 action. And the easiest way to do that is by using an online demo. An
 online demo can be used instantly while otherwise software needs to be
 downloaded and installed. That is a difference which should not be
 underestimated even for the main target audiences of Stanbol.

 Cheers,
 Andreas
 -- 
 | Rupert Westenthaler rupert.westentha...@gmail.com
 | Bodenlehenstraße 11  ++43-699-11108907
 | A-5500 Bischofshofen
 | REDLINK.CO 
 ..
 | http://redlink.co/

Re: Welcome our new commiter: Cristian Petroaca

2015-05-19 Thread Rafa Haro
Welcome, Cristian! I'm quite interested in the coreference resolution you
have been working on. I will try to test it soon!



--
Sent from Mailbox

On Tue, May 19, 2015 at 6:39 PM, Dileepa Jayakody
dileepajayak...@gmail.com wrote:

 Welcome Cristian !
 On Tue, May 19, 2015 at 9:12 PM, Antonio David Perez Morales 
 ape...@zaizi.com wrote:
 Welcome on board Cristian!!


 On Tue, May 19, 2015 at 3:27 PM, Cristian Petroaca 
 cristian.petro...@gmail.com wrote:

  Hi everybody,
 
  Thank you. It's great to be here. I'll try to involve myself in Stanbol
  as much as possible in the future.
 
  Regards,
  Cristian
 
  On Tue, May 19, 2015 at 12:54 PM, Tommaso Teofili 
  tommaso.teof...@gmail.com
   wrote:
 
   Welcome Cristian, looking forward to your contributions!
  
   Regards,
   Tommaso
  
   2015-05-19 10:30 GMT+02:00 Fabian Christ fchr...@apache.org:
  
Dear Stanbolers,
   
please welcome our new committer Cristian Petroaca!
   
Welcome Cristian! We are looking forward to working with you on the
future of Stanbol.
   
Best
 - Fabian
   
  
 



Re: Status of dev.iks-project.eu server

2015-05-18 Thread Rafa Haro
I would rather provide a ready-to-use binary or package, as other projects
such as Solr do.




Cheers

Rafa



--
Sent from Mailbox

On Mon, May 18, 2015 at 8:25 PM, Andreas Kuckartz a.kucka...@ping.de
wrote:

 Dileepa Jayakody wrote:
  Thanks Fabian, Rupert and Rafa.

  I agree with Rafa, that there is no reason for us to have a live hosted
  Stanbol instance.

 A bit late, but I do not agree with that.

 The best way for many people to see the value of Stanbol is to see it in
 action. And the easiest way to do that is by using an online demo. An
 online demo can be used instantly while otherwise software needs to be
 downloaded and installed. That is a difference which should not be
 underestimated even for the main target audiences of Stanbol.

 Cheers,
 Andreas

Re: Enhancer engine breaking client software

2015-05-14 Thread Rafa Haro
Hi John, 




Could you please open an issue at
https://github.com/rafaharo/apache-stanbol-client and include the stack trace
of the NPE? I will take a look at it as soon as possible.



Cheers,

Rafa




On Wednesday, May 13, 2015 at 7:18 PM, john.2.crowt...@bt.com wrote:
Hello All,


I have been building an enhancer engine which I want to add to one of the
default chains, dbpedia-proper-noun.


I am using the Java Stanbol client to query the Stanbol server and extract
places, names and organisations.


When I add my enhancement engine to the chain I get a null pointer in
javax.ws.rs.ProcessingException on the client side during the read entity
method.


If I comment out the engine metadata editing below, my client software works 
fine with the extra engine on the chain.


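// Create a text annotation for this engine on the content item.
UriRef textAnnotation = EnhancementEngineHelper.createTextEnhancement(ci, this);

// Type the annotation with a custom (example) ontology class.
metadata.add(new TripleImpl(textAnnotation, DCTERMS.type,
        new UriRef("http://example.org/ontology/LengthEnhancement")));

// Attach the extra information as a plain literal comment.
metadata.add(new TripleImpl(textAnnotation, RDFS.comment,
        new PlainLiteralImpl(" --- this is where I want to add information ")));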



How can I add information to the metadata without breaking my client?


Best Regards,


John Crowther

Re: Status of dev.iks-project.eu server

2015-05-08 Thread Rafa Haro
+1 for hosting them on GitHub



On 8 May 2015 at 8:19:03, Rupert Westenthaler
(rupert.westentha...@gmail.com) wrote:

Hi all,  

The original server had a hardware issue. I have already asked for a
replacement, and one will be provided. The current plan is to
make the resources required to build already released versions of Stanbol
available at their original location (so that older releases work
again). But for ongoing development I suggest switching to a different
location. I would suggest creating a GitHub or Bitbucket project to
manage (and possibly also release) those artifacts.

Just as a reminder: This is mainly about the OpenNLP models. As the  
license of those is unclear we can not host them on Apache  
Infrastructure (for more information search the OpenNLP mailing  
lists).  

best  
Rupert  


On Thu, May 7, 2015 at 11:20 AM, Rafa Haro rh...@apache.org wrote:  
 I would say we shouldn’t….there is no reason in my opinion  
  
 Cheers,  
 Rafa  
  
  
 On 7 May 2015 at 11:15:56, Fabian Christ
 (christ.fab...@googlemail.com) wrote:
  
 Hi Dileepa,  
  
 I have no further knowledge about this server but it may be the case  
 that the money for running this service ran out as the IKS project has  
 ended in 2012.  
  
 Do we need a live system?  
  
 Best  
 - Fabian  
  
 2015-04-08 15:22 GMT+02:00 Dileepa Jayakody dileepajayak...@gmail.com:  
 Hi Guys,  
  
 Can I please know the status of dev.iks-project.eu server?  
 Can we bring it back online soon?  
  
 Thanks,  
 Dileepa  
  
  
  
 --  
 Fabian  
 http://twitter.com/fctwitt  



--  
| Rupert Westenthaler rupert.westentha...@gmail.com  
| Bodenlehenstraße 11 ++43-699-11108907  
| A-5500 Bischofshofen  
| REDLINK.CO 
..  
| http://redlink.co/  


Re: Anyone know how to performance tune Stanbol

2015-04-08 Thread Rafa Haro
Hi Aree,

Can you provide more details about your instance? Which engines are you using, 
configuration and so on?

Cheers,
Rafa


On 8 April 2015 at 5:44:36, Aree Cohen (aree.co...@sbs.com.au) wrote:

Hi Guys,


Stanbol is incredibly slow for our project. Is anyone able to give tips on how
to performance-tune it? Thank you.


Aree 


Re: Super Help - Build Stanbol

2015-03-28 Thread Rafa Haro
You should be able to build it using just Java 7.x and Maven 3. The test that
is failing for you shouldn't be failing (it's not failing for me), but
anyway, if you want to have it built and running, just skip the tests with
Maven.

Cheers,
Rafa

On Saturday, 28 March 2015, Joni Hoppen j...@aquare.la wrote:

 Hi Rafa,

 My problem is that I can't find a way to get Stanbol compiled and running.
 Every time I try, different problems occur. I am trying to install it on my
 Mac and on CentOS 6.5. Do you have it running?

 Below is the very last message I got from one of my attempts. I would be
 glad if anyone could help.

 Cheers
 Joni







 Results :

 Failed tests:
   TextAnalyzerTest.testMultipleSentenceDefaultConfig:139->checkSingleSentence:152 null

 Tests in error:
   OpenNLPTest.testLoadModelByName:145 » EOF Unexpected end of ZLIB input stream
   OpenNLPTest.testLoadIncompatibleModelByName » Unexpected exception, expected...
   OpenNLPTest.testLoadEnTokenizer:57 » EOF Unexpected end of ZLIB input stream
   OpenNLPTest.testLoadEnSentence:81 » EOF Unexpected end of ZLIB input stream
   OpenNLPTest.testLoadEnChunker:109 » EOF Unexpected end of ZLIB input stream
   OpenNLPTest.testLoadEnNER:124 » EOF Unexpected end of ZLIB input stream

 Tests run: 18, Failures: 1, Errors: 6, Skipped: 0

 [INFO]
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Stanbol Commons OpenNLP Utilities and Models FAILURE [ 24.099 s]
 [INFO] Apache Stanbol Commons Testing Jar Executor  SKIPPED
 [INFO] Apache Stanbol Commons Testing HTTP Testing Library SKIPPED
 [INFO] Apache Stanbol Commons Testing Stanbol Utilities ... SKIPPED
 [INFO] Apache Stanbol Commons OWL Bundle .. SKIPPED
 [INFO] Apache Stanbol Commons Launchpad ... SKIPPED
 [INFO] Apache Stanbol Commons Security Core ... SKIPPED
 [INFO] Apache Stanbol Commons Basic Authenticator . SKIPPED







 Joni Hoppen

 Financial and Operations Director

 Aquarela Inovação Tecnológica do Brasil

 www.aquare.la/pt

 j...@aquare.la

 (48) 3304 1137
 (48) 9623 4864

 - Original message -

 From: Rafa Haro rh...@apache.org
 To: dev@stanbol.apache.org
 Sent: Friday, 27 March 2015 20:09:38
 Subject: Re: Super Help - Build Stanbol

 Hi Joni

 What exactly are your problems?

 Cheers

 On Friday, 27 March 2015, Joni Hoppen j...@aquare.la wrote:

  Hello folks, how have you been doing?

  For months, perhaps more than 6 months, I have been trying to get
  Stanbol up and running, but no matter what I do I simply can't get it
  done. Is there anyone who can provide a tutorial that works, or suggest
  a Linux platform and versions? I would very much appreciate knowing
  whether the difficulties with installation are technical or political.
  If the second case applies, is it solvable? Are there alternatives to
  Stanbol?

  Many thanks for your valuable attention. I wish you all a restful weekend.
 
 
 
 
  Joni Hoppen
 
  Managing Director
 
 
 
 
  Aquarela Inovação Tecnológica do Brasil
 
  www.aquare.la
 
  j...@aquare.la
 
  (48) 3304 1137
  (48) 9623 4864
 
 




Re: Super Help - Build Stanbol

2015-03-27 Thread Rafa Haro
Hi Joni

What exactly are your problems?

Cheers

On Friday, 27 March 2015, Joni Hoppen j...@aquare.la wrote:

 Hello folks, how have you been doing?

 For months, perhaps more than 6 months, I have been trying to get
 Stanbol up and running, but no matter what I do I simply can't get it
 done. Is there anyone who can provide a tutorial that works, or suggest
 a Linux platform and versions? I would very much appreciate knowing
 whether the difficulties with installation are technical or political.
 If the second case applies, is it solvable? Are there alternatives to
 Stanbol?

 Many thanks for your valuable attention. I wish you all a restful weekend.




 Joni Hoppen

 Managing Director




 Aquarela Inovação Tecnológica do Brasil

 www.aquare.la

 j...@aquare.la

 (48) 3304 1137
 (48) 9623 4864




Re: Recommended Backup Strategy

2015-03-26 Thread Rafa Haro
Hi Aree and Bradley, 

About the backup, it really depends on your use cases. Most of the Stanbol
services are completely stateless, which (initially) means you don't need a
backup. If you are using a custom dataset or some custom configuration, you
will probably want to back those up. For the datasets, if you used the
generic indexer to build them, the output of that tool can be your backup:
at the end you will have a Solr index that you can install in a clean Stanbol
whenever you want, and a bundle for the EntityHub site that you can also
install at any moment.

Regarding configuration, probably the best solution is to build your own
Stanbol launcher with your final configuration. With all these resources you
can easily replicate your instance.

Hope that helps. Cheers,
Rafa


On 25 March 2015 at 23:31:18, Aree Cohen (aree.co...@sbs.com.au) wrote:

Perfect timing. We are in the same boat. We're about to deliver a critical 
project using Stanbol and need help with setting up our production environment. 
 
Moreover, does anyone know the best strategy for replication between two 
Stanbol servers. Is rsyncing a feasible strategy?  

Any help with this would be extremely appreciated.  

Thanks,  
Aree  

  
From: Bradley Falzon b...@teambrad.net  
Sent: Thursday, 26 March 2015 9:14 AM  
To: dev@stanbol.apache.org  
Subject: Recommended Backup Strategy  

Hi All,  

We're beginning initial testing of Stanbol, and wanted to know the  
recommended approach to backups. We've tried checking the mailing list and  
stanbol.apache.org for best practices, to no avail.  

Are there any features built in? We're really unsure about the internal
workings and the best strategy here.

Thoughts?  

--  
Bradley Falzon  
b...@teambrad.net  




Re: Subscribing for mail

2015-03-16 Thread Rafa Haro
Hi Rajendram, 

This issue was initially planned for the GSoC 2014 edition, and I personally
moved it to GSoC 2015 because it is still pending and quite interesting. But,
as far as I know, there isn't yet any Apache Stanbol committer willing to
mentor this year. So I would wait a couple of days to see if someone is
interested in mentoring the project, and if not, it is probably a good idea
for you to move on to another project.

This is my fault anyway, because I should have asked the project members
first before assigning the new labels. I apologize for that.

Cheers,
Rafa


On 16 March 2015 at 14:10:12, Rajendram Layansan (layan...@gmail.com)
wrote:

Hi,  

I want to apply for GSoC 2015 with this project:
https://issues.apache.org/jira/browse/STANBOL-1007?filter=12330297
I need more information about how I can proceed. Can anyone help me?

Thank you,

Layansan.  


Re: New in Stanbol - compiling problems

2015-03-10 Thread Rafa Haro
Hi Leandro, 

Can you possibly provide more information about the problem you are
experiencing? The build error would help :-)

Cheers,
Rafa


On 10 March 2015 at 16:59:33, Leandro Ramos (lra...@gmail.com) wrote:

Hi,  

I started with Stanbol recently and am having some trouble with the first
build, specifically in the test cases inside OpenNLP NER. Does anyone know if
the current version has problems downloading the English NER models? Or is
there some other adjustment that I need to make?

My environment is Eclipse Luna, Subclipse 1.10 and Maven 3.2.5. I just did
the SVN checkout and a mvn clean install, and I didn't change the
configuration files.

Regards,  

Leandro.  


Re: nlp enhancer wordlift-stanbol out of date?

2015-03-09 Thread Rafa Haro
Hi Patrick, 

The GitHub project that you linked in your previous email is not the
stanbol-freeling server. The correct link is the following:
https://github.com/insideout10/stanbol-freeling. I have been using it recently
without any problems. Try that one.

Hope that helps.
Cheers, Rafa


On 9 March 2015 at 8:56:19, Patrick Kirsch (pkir...@zscho.de) wrote:

Hey list,  

as described in
https://stanbol.apache.org/docs/trunk/components/enhancer/nlp/  

I tried to use freeling together with the given instructions at:  
https://github.com/insideout10/wordlift-stanbol  

But this module does not build against latest stanbol versions (of  
course, I filed detailed issues to the wordlift-stanbol maintainer).  

Furthermore the wordlift-stanbol project is rather quiet (~2013.02).  

This brings me to my question, is anyone using this wordlift-stanbol  
framework or has usage experience?  

Regards,  
Patrick  




Model for Linking Data

2015-03-04 Thread Rafa Haro
Hi all, 

Recently, while working on a post-processing engine, I realized that it is
currently not straightforward to deal with the data produced by the Linking
engines. In my opinion, the problem is that there is currently no easy way to
relate the results of the NLP analysis to the results of the Linking process.
After NLP analysis, all the extracted Spans (tokens, sentences, chunks and so
on) are stored in an AnalyzedText object [1]. This model has a nice API and
really eases the work of the next engines within a chain. However, the results
of the Linking engines are currently only stored in the Clerezza graph holding
the metadata of a ContentItem, mainly as Text and Entity Annotations. Although
there are some helpers for dealing with the annotations within the graph, a
developer writing a, let's say, post-linking engine really misses a way to
find, for example, the Text and Entity Annotations associated with a given
span. The only way I have found, short of starting to work on a proper
solution, is to locate the spans associated with a Text Annotation by using
its start and end offsets.
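
To make the offset-based workaround concrete, here is a minimal sketch of the
lookup. It deliberately does not use the real Stanbol classes: Span is a
hypothetical stand-in that carries only a type and start/end character
offsets, and spansCoveredBy is an illustrative helper name. In a real engine
the spans would come from the AnalyzedText and the two offsets from the start
and end values stored on the Text Annotation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an NLP span (token, chunk, sentence); offsets are [start, end). */
final class Span {
    final String type;
    final int start;
    final int end;

    Span(String type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }

    @Override
    public String toString() {
        return type + "[" + start + "," + end + ")";
    }
}

final class SpanLookup {

    /** Returns every span that lies completely inside the annotation's [start, end) range. */
    static List<Span> spansCoveredBy(List<Span> spans, int annotationStart, int annotationEnd) {
        List<Span> covered = new ArrayList<Span>();
        for (Span span : spans) {
            if (span.start >= annotationStart && span.end <= annotationEnd) {
                covered.add(span);
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        // Text: "Paris is the capital of France."
        List<Span> spans = Arrays.asList(
                new Span("Sentence", 0, 31),
                new Span("Token", 0, 5),     // Paris
                new Span("Token", 6, 8),     // is
                new Span("Token", 24, 30));  // France
        // A Text Annotation selecting "Paris" would carry start=0 and end=5.
        System.out.println(spansCoveredBy(spans, 0, 5));
    }
}
```

A linear scan like this is fine for a single lookup, but it has to be repeated
for every annotation, which is part of why a direct link between annotations
and spans would be preferable.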

I would like to start a discussion here about the best design for tackling this 
problem.

Cheers,
Rafa

[1] - https://stanbol.apache.org/docs/trunk/components/enhancer/nlp/analyzedtext




Re: Searching RDF graph using SKOS thesaurus

2014-11-12 Thread Rafa Haro
Hi Mark,

You can solve your problem in Stanbol if you link or merge both graphs into
a single one and create a site with it. After indexing the merged graph,
you can use the EntityHub API, specifically the find service
(/entityhub/site/find), to search for your label and then move to all the
nodes associated with that SKOS label using an LDPath expression. Please take
a look at the EntityHub REST API documentation.
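
As a rough illustration of the call described above, the following sketch only
builds the request URL for the find service; it does not contact a server.
Several things in it are assumptions rather than facts from this thread: the
host and port, the site name "people", and the ex:heldBy property used in the
LDPath program are hypothetical, and the name, field and ldpath query
parameters should be checked against the EntityHub REST API documentation
before relying on them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class EntityhubFindRequest {

    /** Builds a GET URL for the find service of one referenced site (parameter names assumed). */
    static String findUrl(String baseUrl, String site, String label, String field, String ldpath) {
        return baseUrl + "/entityhub/site/" + site + "/find"
                + "?name=" + encode(label)
                + "&field=" + encode(field)
                + "&ldpath=" + encode(ldpath);
    }

    /** URL-encodes one query parameter value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String url = findUrl(
                "http://localhost:8080",                          // hypothetical Stanbol instance
                "people",                                         // hypothetical site name
                "President",                                      // label to search for
                "http://www.w3.org/2004/02/skos/core#altLabel",   // search the SKOS alternative labels
                // Hypothetical LDPath program: follow ex:heldBy from the concept to the people.
                "holders = <http://example.org/heldBy> :: xsd:anyURI;");
        System.out.println(url);
    }
}
```

The LDPath program is passed in by the caller on purpose, since the right path
depends entirely on how the two graphs were linked when they were merged.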

Hope that helps. Cheers,
Rafa


On 11 November 2014 at 20:34:01, cont...@linkeddatatools.com
(cont...@linkeddatatools.com) wrote:

Hi, here is an example of what I'm trying to achieve. Does Fusepool,  
or another solution, achieve this goal?  

I have an RDF graph in a graph store:  

==  

<foaf:Person rdf:ID="johnsmith">
  <foaf:firstName>John</foaf:firstName>
  <foaf:lastName>Smith</foaf:lastName>
  <ex:role>Managing Director</ex:role>
</foaf:Person>

==  

I have the following SKOS vocabulary:  

==  

ex:role rdf:type skos:Concept;
  skos:prefLabel "Managing Director"@en;
  skos:altLabel "MD"@en;
  skos:altLabel "President"@en;
  skos:altLabel "CEO"@en.

==  

If I search for anyone with the role 'President', I want to return  
John Smith (rdf:ID=johnsmith) - because 'President' is an  
alternative label for 'Managing Director'.  

Is this possible using an already established best practice, or framework?  

Please let me know if any further examples are required.  


Best wishes  

Mark  

Quoting Reto Gmür r...@apache.org:  

 Hi Linked Data Tools  
  
 One difficulty might arise because ContentHub has the index and the facets  
 in lucene only and other metadata in an RDF graph. So for example if  
 contenthub provides a facet Paris you only have the label without any  
 association to the URI, so it won't be possible to get additional  
 properties of the resource. This is why in the fusepool project we've
 chosen to build a store that stores all the data in an RDF graph and builds  
 a lucene index on top of it. The code is here  
 https://github.com/fusepool/fusepool-ecs, its apache licensed and btw.  
 fusepool would be happy to donate it to the stanbol project.  
  
 Cheers,  
 Reto  
  
 On Mon, Nov 10, 2014 at 7:54 PM, cont...@linkeddatatools.com wrote:  
  
 Hi, I posted a similar message to the IKS mailing list, but understand  
 from the response that this mailing list is no longer administrated.  
  
 Stanbol is a great tool and I'm having some success with it; particularly  
 the entity extractor tool.  
  
 I have a requirement and, I am not sure the best way to approach this and  
 whether a best practice for this sort of problem has already been  
 established.  
  
 I have an RDF graph - one in accordance with the FOAF ontology - and I  
 have a controlled vocabulary in the form of a SKOS RDF graph, which  
 contains a set of literal string terms and their semantic equivalents (e.g.  
 'President' - 'Managing Director' - 'Chief Executive' - 'MD' -  
 etc.).  
  
 I would like to search the literal strings in the FOAF graph for the  
 occurrence of the string literals, and their equivalents as defined by the  
 SKOS thesaurus.  
  
 I can suggest one approach to this problem, but I fear it may be quite  
 inefficient and take a long time, namely:  
  
 - Query the RDF graph using SPARQL for all string literals.  
 - Pass each string literal to the Stanbol Entity Extractor, having  
 uploaded the SKOS thesaurus to the Stanbol Entity Hub.  
  
 Now this seems quite long-winded. Further, I'm not even clear from the
 documentation whether the Stanbol Entity Extractor is capable of using SKOS  
 vocabularies to map string literals to entities. Is Stanbol capable of  
 extracting entities using a SKOS vocabulary?  
  
 This seems a fairly common thing to do (semantic search of an RDF graph  
 using a thesaurus) - is there some better way of solving this problem using  
 an already established strategy?  
  
  
 Many thanks!  
  
 Linked Data Tools  
  
  



Re: Searching RDF graph using SKOS thesaurus

2014-11-12 Thread Rafa Haro
Hi Mark, Reto,


On 12 November 2014 at 11:45:43, Reto Gmür (r...@apache.org) wrote:

On Wed, Nov 12, 2014 at 10:16 AM, Rafa Haro rh...@apache.org wrote:  

 Hi Mark,  
  
 You can solve your problem in Stanbol if you link or merge both graphs
 into a single one and create a site from it. After indexing the merged
 graph, you can use the EntityHub API, specifically the find service
 (/entityhub/site/find), to search for your label and then move to all the
 nodes associated with that SKOS label using an LDPath expression. Please
 take a look at the EntityHub REST API documentation.
  

Just for completeness: after merging the two graphs (or even without), you
can also use SPARQL.


Yep, that’s true :-). I probably forgot to mention that if you are planning to 
enrich documents using both graphs, the LDPath approach is also available.
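
As a rough illustration of the lookup discussed in this thread (whether it is done through the EntityHub find service plus LDPath, or with SPARQL), the core idea is to expand the search term into all labels of the same SKOS concept and then match the FOAF literals against that set. The sketch below uses plain Python data structures only; it is not Stanbol code, and the concept URI ex:ManagingDirector, the function names and the variable names are assumptions made up for the example.

```python
# Illustrative sketch only: plain Python data stands in for the merged
# FOAF + SKOS graph. None of these names come from Stanbol.

# SKOS side: hypothetical concept URI -> its labels (prefLabel, then altLabels).
SKOS_LABELS = {
    "ex:ManagingDirector": ["Managing Director", "MD", "President", "CEO"],
}

# FOAF side: (subject, property, literal) triples taken from Mark's example.
FOAF_TRIPLES = [
    ("ex:johnsmith", "foaf:firstName", "John"),
    ("ex:johnsmith", "foaf:lastName", "Smith"),
    ("ex:johnsmith", "ex:role", "Managing Director"),
]


def expand_label(term, skos_labels):
    """Return every label sharing a concept with `term` (case-insensitive)."""
    wanted = term.casefold()
    expanded = set()
    for labels in skos_labels.values():
        if any(label.casefold() == wanted for label in labels):
            expanded.update(label.casefold() for label in labels)
    # Fall back to the term itself when the thesaurus does not know it.
    return expanded or {wanted}


def find_subjects(term, prop, triples, skos_labels):
    """Find subjects whose `prop` literal equals `term` or an equivalent label."""
    labels = expand_label(term, skos_labels)
    return sorted({s for s, p, o in triples if p == prop and o.casefold() in labels})


print(find_subjects("President", "ex:role", FOAF_TRIPLES, SKOS_LABELS))
# prints ['ex:johnsmith']
```

Expanding the term once and then filtering avoids passing every literal of the FOAF graph through an extractor one by one, which was the concern raised in the original question.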

Cheers,
Rafa


Cheers,  
Reto  



  
 Hope that helps. Cheers,  
 Rafa  
  
  
 On 11 November 2014 at 20:34:01, cont...@linkeddatatools.com
 (cont...@linkeddatatools.com) wrote:
  
 Hi, here is an example of what I'm trying to achieve. Does Fusepool,  
 or another solution, achieve this goal?  
  
 I have an RDF graph in a graph store:  
  
 ==  
  
 <foaf:Person rdf:ID="johnsmith">
   <foaf:firstName>John</foaf:firstName>
   <foaf:lastName>Smith</foaf:lastName>
   <ex:role>Managing Director</ex:role>
 </foaf:Person>
  
 ==  
  
 I have the following SKOS vocabulary:  
  
 ==  
  
 ex:role rdf:type skos:Concept;
     skos:prefLabel "Managing Director"@en;
     skos:altLabel "MD"@en;
     skos:altLabel "President"@en;
     skos:altLabel "CEO"@en.
  
 ==  
  
 If I search for anyone with the role 'President', I want to return  
 John Smith (rdf:ID=johnsmith) - because 'President' is an  
 alternative label for 'Managing Director'.  
  
 Is this possible using an already established best practice, or framework?  
  
 Please let me know if any further examples are required.  
  
  
 Best wishes  
  
 Mark  
  
 Quoting Reto Gmür r...@apache.org:  
  
  Hi Linked Data Tools  
   
  One difficulty might arise because ContentHub keeps the index and the facets
  in Lucene only and other metadata in an RDF graph. So, for example, if
  ContentHub provides a facet "Paris", you only have the label without any
  association to the URI, so it won't be possible to get additional
  properties of the resource. This is why in the Fusepool project we've
  chosen to build a store that keeps all the data in an RDF graph and builds
  a Lucene index on top of it. The code is here:
  https://github.com/fusepool/fusepool-ecs. It's Apache licensed, and by the way
  Fusepool would be happy to donate it to the Stanbol project.
   
  Cheers,  
  Reto  
   
  On Mon, Nov 10, 2014 at 7:54 PM, cont...@linkeddatatools.com wrote:  
   
  Hi, I posted a similar message to the IKS mailing list, but understand  
  from the response that this mailing list is no longer administrated.  
   
  Stanbol is a great tool and I'm having some success with it;  
 particularly  
  the entity extractor tool.  
   
  I have a requirement, and I am not sure of the best way to approach it or
  whether a best practice for this sort of problem has already been
  established.
   
  I have an RDF graph - one in accordance with the FOAF ontology - and I  
  have a controlled vocabulary in the form of a SKOS RDF graph, which  
  contains a set of literal string terms and their semantic equivalents  
 (e.g.  
  'President' - 'Managing Director' - 'Chief Executive' - 'MD' -  
  etc.).  
   
  I would like to search the literal strings in the FOAF graph for the  
  occurrence of the string literals, and their equivalents as defined by  
 the  
  SKOS thesaurus.  
   
  I can suggest one approach to this problem, but I fear it may be quite  
  inefficient and take a long time, namely:  
   
  - Query the RDF graph using SPARQL for all string literals.  
  - Pass each string literal to the Stanbol Entity Extractor, having  
  uploaded the SKOS thesaurus to the Stanbol Entity Hub.  
   
  Now this seems quite long-winded. Further, I'm not even clear from the
  documentation whether the Stanbol Entity Extractor is capable of using SKOS
  vocabularies to map string literals to entities. Is Stanbol capable of
  extracting entities using a SKOS vocabulary?
   
  This seems a fairly common thing to do (semantic search of an RDF graph  
  using a thesaurus) - is there some better way of solving this problem  
 using  
  an already established strategy?  
   
   
  Many thanks!  
   
  Linked Data Tools  
   
   
  
  


Re: Searching RDF graph using SKOS thesaurus

2014-11-11 Thread Rafa Haro
Hi, 

Can you clarify this with an example please?

Cheers,
Rafa


On 10 November 2014 at 19:55:26, cont...@linkeddatatools.com
(cont...@linkeddatatools.com) wrote:

Hi, I posted a similar message to the IKS mailing list, but understand  
from the response that this mailing list is no longer administrated.  

Stanbol is a great tool and I'm having some success with it;  
particularly the entity extractor tool.  

I have a requirement, and I am not sure of the best way to approach it
or whether a best practice for this sort of problem has already been
established.

I have an RDF graph - one in accordance with the FOAF ontology - and I  
have a controlled vocabulary in the form of a SKOS RDF graph, which  
contains a set of literal string terms and their semantic equivalents  
(e.g. 'President' - 'Managing Director' - 'Chief Executive' -  
'MD' - etc.).  

I would like to search the literal strings in the FOAF graph for the  
occurrence of the string literals, and their equivalents as defined by  
the SKOS thesaurus.  

I can suggest one approach to this problem, but I fear it may be quite  
inefficient and take a long time, namely:  

- Query the RDF graph using SPARQL for all string literals.  
- Pass each string literal to the Stanbol Entity Extractor, having  
uploaded the SKOS thesaurus to the Stanbol Entity Hub.  

Now this seems quite long-winded. Further, I'm not even clear from
the documentation whether the Stanbol Entity Extractor is capable of  
using SKOS vocabularies to map string literals to entities. Is Stanbol  
capable of extracting entities using a SKOS vocabulary?  

This seems a fairly common thing to do (semantic search of an RDF  
graph using a thesaurus) - is there some better way of solving this  
problem using an already established strategy?  


Many thanks!  

Linked Data Tools  


Re: CMS Integration

2014-10-30 Thread Rafa Haro
Hi Alok, 

Depending on the kind of architecture you want for your system, there are
different possibilities. Let me list some of them:

1. CMS Adapter: I have played around with it in the past but never tested it
seriously, so I can’t speak from experience. Depending on the CMS, other
users have reported all kinds of problems in the past. According to the
documentation, it lets you represent your repository as an RDF graph (which
will probably also let you run SPARQL queries over this representation
later) and feed the Stanbol ContentHub directly for Semantic Search.

2. ContentHub: this component
(https://stanbol.apache.org/docs/trunk/components/contenthub/contenthub5min)
allows users to define custom semantic cores on top of Solr. It is no longer
supported in the current version of Stanbol (1.0), but it is supported in the
0.12.* releases. The documentation is more or less clear, but basically what
you can do with ContentHub is define a custom schema using an LDPath program.
The LDPath program defines a set of fields to be stored in Solr and how to
populate those fields from the Enhancer results. The workflow is the
following: you take the content out of your CMS and send it to the ContentHub
through a REST API. The content is enriched with a configured chain. The
Enhancement Structure resulting from the enrichment process is parsed using
the configured LDPath program. As a result, you get a list of field values to
be stored in Solr. Besides these fields, the textual content is also stored in
Solr by default, and the Enhancement Structure is stored in a Clerezza graph
with a unique id for your index. So in the end you have a graph relating your
content to entities.

3. Use Apache ManifoldCF: Apache ManifoldCF is an effort to provide an open
source framework for connecting source content repositories, such as Microsoft
SharePoint, EMC Documentum, Alfresco or any CMIS-compatible CMS, to target
repositories or indexes, such as Apache Solr or ElasticSearch. ManifoldCF
lets you crawl the content of your CMS with support for “incremental
crawling”, i.e., it handles deletions, additions, modifications, etc. of the
content in your CMS. ManifoldCF recently added support for Transformation
Connectors, which basically allow the content to be processed before it is
indexed. I’m currently working on a Stanbol Transformation Connector that,
following the ContentHub use case, will make it possible to enrich the
content with Stanbol and store the extracted entity information as plain
metadata. I will be contributing this to ManifoldCF in the coming weeks.
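
All three options share one step: content taken out of the CMS is sent to Stanbol over REST and comes back as RDF enhancements. A minimal sketch of that step could look like the following. It targets the Enhancer endpoint rather than the ContentHub one, and the host, port, path and helper names are assumptions for a local launcher, so adjust them to your installation.

```python
import urllib.request

# Assumption: a local Stanbol launcher exposing the Enhancer at this URL.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=STANBOL_ENHANCER_URL, accept="text/turtle"):
    """Build (but do not send) a POST request carrying plain-text CMS content."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,  # serialization requested for the enhancements
        },
        method="POST",
    )


def enhance(text, timeout=30):
    """Send the content to Stanbol and return the enhancement results as text."""
    request = build_enhancer_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds the request, so this runs without a Stanbol instance.
    demo = build_enhancer_request("Paris is the capital of France.")
    print(demo.get_method(), demo.full_url)
```

Keeping the request construction separate from the network call makes it easy to plug the same function into a CMS export job or a crawler.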

Hope this email helps.
Cheers,
Rafa


On 29 October 2014 at 7:09:07, Alok K. Shukla (m...@alokkumarshukla.com)
wrote:

Hi everyone  

I would like to use Stanbol with an existing CMS for Semantic Search. From the
documentation of the CMS Adapter, I gather that it would be the starting point
for the task. Can someone please guide me, especially with building indexes and
how entities would be created out of CMS data? Any help would be highly
appreciated.

Thanks  
Alok  

Sent from my iPhone

Re: Improving AIda-light Disambiguation Engine

2014-10-24 Thread Rafa Haro

Hi Rupert!

Nice to see great feedback on this topic! I wanted to comment on part of your
previous email:

On 24 October 2014 at 12:31:39, Rupert Westenthaler
(rupert.westentha...@gmail.com) wrote:

I fear that each disambiguation approach will come with its own data 
model. Mainly because the way data is kept is central for performance. 
Also based on what I have seen up to now keeping everything in-memory 
is the way to go. For Aida-Light I suggest to keep the current 
solutions. The requirement of about 50GByte RAM for Yago is anyways 
quite OK. If we can add support for more focused datasets one will 
often end up with far less entities. This is also a way to keep 
resource requirements down.

I completely understand your point, but I partially disagree. I find the memory
consumption requirement quite tough. It may prevent a lot of people from giving
it a try or experimenting with it. Actually, Chalitha had serious problems
finding a machine to test it on. I agree that it is probably difficult to find
an architecture that is valid for any disambiguation approach, but focusing
only on Aida-Light, I think it is worth providing another data management
solution as well. It will certainly affect performance, but at least it will
let people use it without a super machine. I’m actually thinking of
alternatives that can “simulate” memory access using the disk, like LevelDB or
similar.
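
To make the idea a bit more concrete, here is a small sketch of what such a disk-backed lookup could look like. It is not Aida-Light code: Python's standard shelve module stands in for LevelDB or a similar store, and the class name, the mention-to-candidates layout and the yago: identifiers are all assumptions chosen for the example.

```python
import os
import shelve
import tempfile
from functools import lru_cache


class DiskBackedCandidates:
    """Keeps a mention -> candidate-entities map on disk with a small RAM cache."""

    def __init__(self, path, cache_size=1024):
        # shelve is a stdlib stand-in for LevelDB or a similar key-value store.
        self._db = shelve.open(path)
        # Only the most recently used lookups are kept in memory.
        self._cached_get = lru_cache(maxsize=cache_size)(self._read)

    def _read(self, mention):
        return tuple(self._db.get(mention, ()))

    def put(self, mention, candidates):
        self._db[mention] = list(candidates)
        self._cached_get.cache_clear()  # keep the cache consistent after writes

    def get(self, mention):
        return self._cached_get(mention)

    def close(self):
        self._db.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = DiskBackedCandidates(os.path.join(tmp, "candidates"))
        # Hypothetical entity identifiers, for illustration only.
        store.put("Paris", ["yago:Paris", "yago:Paris_Hilton"])
        print(store.get("Paris"))
        store.close()
```

The trade-off is the one under discussion in this thread: every cache miss costs a disk read, so lookups are slower than a fully in-memory model, but memory use is bounded by the cache size rather than by the dataset size.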

Makes sense?

Cheers,

Rafa



Re: Improving AIda-light Disambiguation Engine

2014-10-24 Thread Rafa Haro
Touché

:-

On Friday, October 24, 2014, Rupert Westenthaler 
rupert.westentha...@gmail.com wrote:

 Hi Rafa

 On Fri, Oct 24, 2014 at 12:47 PM, Rafa Haro rh...@apache.org wrote:
  I completely understand your point, but I partially disagree. I find the
 memory consumption requirement quite tough. It may prevent a lot of people
 from giving it a try or experimenting with it. Actually, Chalitha had serious
 problems finding a machine to test it on. I agree that it is probably
 difficult to find an architecture that is valid for any disambiguation
 approach, but focusing only on Aida-Light, I think it is worth providing
 another data management solution as well. It will certainly affect
 performance, but at least it will let people use it without a super machine.
 I’m actually thinking of alternatives that can “simulate” memory access using
 the disk, like LevelDB or similar.
 
  Makes sense?
 

 IMO no and here is why:

 * If you need to solve this problem for a developer - find a way to
 trim down the size of the dataset - yago - so that he can test in
 setups with 5Gbyte of RAM
 * For a real installation you do not want to solve the problem as for
 dedicated machines memory is really cheap. Implementing an alternative
 that can use some hard drive technology would just be so much slower
 that you would always opt for the more memory option.

 So if your concern is about ease of deployment for developers or
 demoability I would recommend to work on a way to extract meaningful
 sub-sets of Yago that can be managed by Aida-Light setups requiring 
 10GByte of RAM.

 best
 Rupert


 --
 | Rupert Westenthaler rupert.westentha...@gmail.com
 | Bodenlehenstraße 11  ++43-699-11108907
 | A-5500 Bischofshofen
 | REDLINK.CO
 ..
 | http://redlink.co/



Re: Improving AIda-light Disambiguation Engine

2014-10-23 Thread Rafa Haro
Hi Chalitha, 

Thanks for coming back to Stanbol after GSoC!! I have it on my TODO list to
review the code in depth and merge it into Stanbol. If you are planning to make
more improvements, we can keep it in a branch for now and merge it into trunk
once it is mature enough.

Using the project outcome as a baseline, I can suggest a first list of possible
improvements:

1. Extend Aida-Light to support other datasets. We would need to check how
tightly the disambiguation algorithms are coupled to the information provided
by YAGO and try to turn them into a generic approach.

2. Current Aida-Light architecture. Currently, all the data is preloaded in
memory, forcing the user to use a high-end machine. We have discussed this
several times, but maybe it is time to finally decide on a proper backend
strategy for supporting disambiguation. The current Yards are probably not
sufficient or suitable.

3. Stanbol Disambiguation API. Another (almost) eternal discussion. Can we 
design an extensible API for supporting different disambiguation approaches?

I will start by creating the branch ASAP Chalitha. Thanks!!

Cheers,
Rafa

On 23 October 2014 at 9:28:35, chalitha udara Perera
(chalithaud...@gmail.com) wrote:

Hi all,  

As you know, for GSoC I integrated the YAGO
knowledge base and the Aida-light disambiguation server
with Stanbol. But there are many improvements that can be
applied to the current disambiguation engine. For example, the current
engine only works with the YAGO site, but it would be more useful
if it supported other sites such as dbpedia.

I would like to continue contributing to Stanbol.
As mentioned above, I can start by making the disambiguation
engine usable with dbpedia entities. What do you think?
Any feedback will be greatly appreciated.

cheers,  
Chalitha  
--  
J.M Chalitha Udara Perera  

*Department of Computer Science and Engineering,*  
*University of Moratuwa,*  
*Sri Lanka*  


Re: Missing some HTTP services on 1624013

2014-09-18 Thread Rafa Haro
Hi Fabio, 

Which services were you expecting? Some components, like ContentHub or Ontonet,
are not included in the current trunk version but are still maintained in the
0.12 branch, as far as I know.

Cheers,
Rafa


On 17 September 2014 at 23:04:11, Fabio Ricci (fabio.ri...@semweb.ch)
wrote:

Dear community  

I am just starting to work with this highly promising framework!
After one day of figuring things out on my OS X Mavericks installation, my
Stanbol instance runs but shows only /enhancer /topic /entityhub as services.

I am using the latest Stanbol revision (1624013) from trunk. It compiles and
builds the target(s).

I start stanbol using the full target snapshot:  

java -Xmx4g -jar full/target/original-org.apache.stanbol.launchers.full-1.0.0-SNAPSHOT.jar -p9321

My stanbol instance is showing only  /enhancer /topic /entityhub as services.  

Where are the other services? What should I do in order to get all the other 
services registered and running?  

I am grateful for every hint on this !  

Thank you very much in advance  

Kind regards   

Fabio  


Re: Missing some HTTP services on 1624013

2014-09-18 Thread Rafa Haro
Hi Fabio, 

If you want to work with all those components, you should probably check out the
0.12 branch: https://svn.apache.org/repos/asf/stanbol/branches/release-0.12/

Hope that helps,
Rafa


On 18 September 2014 at 10:21:03, Fabio Ricci (fabio.ri...@semweb.ch)
wrote:

Dear Rafa  

thank you so much for your question and hint.
Well, I got the code from http://svn.apache.org/repos/asf/stanbol/trunk, and all
the literature (Stanbol website, books) says that all the components should
already be included there.

The latest revision of http://svn.apache.org/repos/asf/stanbol/trunk
(1624013) = HEAD has the comment: merged implementation for STANBOL-1391 from
0.12.1 to trunk - hence I expected to find all the components in the HEAD
revision (as the literature suggests).

The components I needed are depicted in 
https://stanbol.apache.org/docs/trunk/components/   
- and they are the same as in http://dev.iks-project.eu:8081/   :  
/contenthub  
/enhancer_VIE  
/sparql  
/ontonet  
/rules  
/reasoners  
/cmsadapter  

Or shall I maybe take the ready-made launcher (giving up all that nice coding)
under http://dev.iks-project.eu/downloads/stanbol-launchers/ ?
Is there a way to get a current and complete version?

Cheers  
Fabio  


From: Rafa Haro rharoapa...@gmail.com
Reply-To: dev@stanbol.apache.org dev@stanbol.apache.org
Date: 18 September 2014 at 09:54:25
To: dev@stanbol.apache.org dev@stanbol.apache.org, Fabio Ricci fabio.ri...@semweb.ch
Subject: Re: Missing some HTTP services on 1624013

Hi Fabio,   

Which services were you expecting? Some components, like ContentHub or Ontonet,
are not included in the current trunk version but are still maintained in the
0.12 branch, as far as I know.

Cheers,  
Rafa  


On 17 September 2014 at 23:04:11, Fabio Ricci (fabio.ri...@semweb.ch)
wrote:

Dear community  

I am just starting to work with this highly promising framework!
After one day of figuring things out on my OS X Mavericks installation, my
Stanbol instance runs but shows only /enhancer /topic /entityhub as services.

I am using the latest Stanbol revision (1624013) from trunk. It compiles and
builds the target(s).

I start stanbol using the full target snapshot:  

java -Xmx4g -jar full/target/original-org.apache.stanbol.launchers.full-1.0.0-SNAPSHOT.jar -p9321

My stanbol instance is showing only  /enhancer /topic /entityhub as services.  

Where are the other services? What should I do in order to get all the other 
services registered and running?  

I am grateful for every hint on this !  

Thank you very much in advance  

Kind regards   

Fabio  


Re: ApacheCon CFP closes June 25

2014-06-20 Thread Rafa Haro

FSTs integration!

On 18/06/14 19:55, Rupert Westenthaler wrote:

Hi Fabian, all

I also plan to participate and also to submit a presentation about
Stanbol. But I am not yet sure about the exact topic. Suggestions are
very welcome!

best
Rupert


On Wed, Jun 18, 2014 at 9:25 AM, Rafa Haro rh...@apache.org wrote:

Hi Fabian,

Zaizi, as a company, is going to submit a proposal that involves Stanbol
and other Apache projects as well.

Cheers,
Rafa

On 18/06/14 08:43, Fabian Christ wrote:


Hi,

is anybody going to visit ApacheCon EU and/or even better present Stanbol
there?

Best,
   - Fabian

2014-06-10 22:15 GMT+02:00 Fabian Christ fchr...@apache.org:

Dear Stanbolers,

As you may be aware, ApacheCon will be held this year in Budapest, on
November 17-23. (See http://apachecon.eu for more info.)

The Call For Papers for that conference is still open, but will be
closing soon. We need your talk proposals, to represent Stanbol at
ApacheCon. We need all kinds of talks - deep technical talks, hands-on
tutorials, introductions for beginners, or case studies about the
awesome stuff you're doing with Stanbol.

Please consider submitting a proposal, at
http://events.linuxfoundation.org//events/apachecon-europe/program/cfp

Thanks!









Re: ApacheCon CFP closes June 25

2014-06-18 Thread Rafa Haro

Hi Fabian,

Zaizi, as a company, is going to submit a proposal that involves
Stanbol and other Apache projects as well.


Cheers,
Rafa

On 18/06/14 08:43, Fabian Christ wrote:

Hi,

is anybody going to visit ApacheCon EU and/or even better present Stanbol there?

Best,
  - Fabian

2014-06-10 22:15 GMT+02:00 Fabian Christ fchr...@apache.org:

Dear Stanbolers,

As you may be aware, ApacheCon will be held this year in Budapest, on
November 17-23. (See http://apachecon.eu for more info.)

The Call For Papers for that conference is still open, but will be
closing soon. We need your talk proposals, to represent Stanbol at
ApacheCon. We need all kinds of talks - deep technical talks, hands-on
tutorials, introductions for beginners, or case studies about the
awesome stuff you're doing with Stanbol.

Please consider submitting a proposal, at
http://events.linuxfoundation.org//events/apachecon-europe/program/cfp

Thanks!




Re: [] Apache Stanbol Partial Security Release 0.99

2014-06-02 Thread Rafa Haro

Hi,

On 02/06/14 08:35, Sergio Fernández wrote:

Hi,

is the stuff really tested by a broader community? Personally I've had
to disable it in all my launchers due to several issues with other
components. So I'd like to clarify that before casting my vote.

I'm in exactly the same situation as Sergio. I honestly haven't tested
it because of the problems in the first releases of the component.


Thanks.

Cheers,

PS: as Andreas pointed out, the versioning is also quite confusing... I
thought the community had decided to switch from the old version policy
(individual per module) to a common one for all modules.



On 02/06/14 00:16, Reto Gmür wrote:

Hi community,

Given that the 1.0.0 release might take some more discussion I've tailored
a mini-release of two modules. The 0.12 security.core module doesn't work
with jersey >= 2.0 because of what I believe to be a bug in Jersey
(JERSEY-1926) but a work-around working with all Jersey versions is
straightforward and has been in trunk for almost exactly one year. Apart
from the patch to work around JERSEY-1926 affecting security.core and
authentication.basic the new release also incorporates the code
simplification patch provided by Furkan Kamaci (STANBOL-1317).

Solved issues:
- STANBOL-1094
- STANBOL-1317

SVN-Tag:
https://svn.apache.org/repos/asf/stanbol/tags/org.apache.stanbol.commons.security-0.99/ 



Staging repos
https://repository.apache.org/content/repositories/orgapachestanbol-1006/ 



Source tarball:
http://repository.apache.org/content/repositories/orgapachestanbol-1006/org/apache/stanbol/org.apache.stanbol.commons.security/0.99/org.apache.stanbol.commons.security-0.99-source-release.tar.gz 



Detached signature:
https://repository.apache.org/content/repositories/orgapachestanbol-1006/org/apache/stanbol/org.apache.stanbol.commons.security/0.99/org.apache.stanbol.commons.security-0.99-source-release.tar.gz.asc 



PGP release keys
https://dist.apache.org/repos/dist/release/stanbol/KEYS

The vote will be open for at least 48 hours.
Thanks for reviewing this release and for voting!

Cheers,
Reto







Re: [] Apache Stanbol Partial Security Release 0.99

2014-06-02 Thread Rafa Haro

really ? :-)

On 02/06/14 10:32, Reto Gmür wrote:

On Mon, Jun 2, 2014 at 9:09 AM, Rafa Haro rh...@apache.org wrote:


Hi,

On 02/06/14 08:35, Sergio Fernández wrote:

  Hi,

is the stuff really tested by a broader community? Personally I've had to
disable it in all my launchers due to several issues with other components. So
I'd like to clarify that before casting my vote.


I'm in exactly the same situation as Sergio. I honestly haven't tested
it because of the problems in the first releases of the component.


Hi Rafa

You voted +1 to the previous release of these components on February 25th.
What testing could you do back then, that you can no longer do now?

Cheers,
Reto





Thanks.

Cheers,

PS: as Andreas pointed out, the versioning is also quite confusing... I
thought the community had decided to switch from the old version policy
(individual per module) to a common one for all modules.


On 02/06/14 00:16, Reto Gmür wrote:


Hi community,

Given that the 1.0.0 release might take some more discussion I've
tailored
a mini-release of two modules. The 0.12 security.core module doesn't work
with jersey >= 2.0 because of what I believe to be a bug in Jersey
(JERSEY-1926) but a work-around working with all Jersey versions is
straight forward and has been in trunk for almost exactly one year. Apart
from the patch to work around JERSEY-1926 affecting security.core and
authentication.basic the new release also incorporates the code
simplification patch provided by Furkan Kamaci (STANBOL-1317).

Solved issues:
- STANBOL-1094
- STANBOL-1317

SVN-Tag:
https://svn.apache.org/repos/asf/stanbol/tags/org.apache.stanbol.commons.security-0.99/

Staging repos
https://repository.apache.org/content/repositories/orgapachestanbol-1006/

Source tarball:
http://repository.apache.org/content/repositories/orgapachestanbol-1006/org/apache/stanbol/org.apache.stanbol.commons.security/0.99/org.apache.stanbol.commons.security-0.99-source-release.tar.gz

Detached signature:
https://repository.apache.org/content/repositories/orgapachestanbol-1006/org/apache/stanbol/org.apache.stanbol.commons.security/0.99/org.apache.stanbol.commons.security-0.99-source-release.tar.gz.asc

PGP release keys
https://dist.apache.org/repos/dist/release/stanbol/KEYS

The vote will be open for at least 48 hours.
Thanks for reviewing this release and for voting!

Cheers,
Reto






Re: PDF Description Extraction For Linked data

2014-05-30 Thread Rafa Haro
Hi Maatari

On Thursday, May 29, 2014, Maatari Daniel Okouya okouy...@yahoo.fr wrote:

 Many thanks Rafa,

 one last question. In some cases I will already have the metadata available
 in another format that will need to be translated into RDF.

 Let’s assume I have done that. What I will get is basically a set of
 instance resources described with the vocabularies of my choice.

 That’s where the data lifting process you are talking about comes into play.
 To be linked to the LOD, I would need to link my description to other
 datasets available on the LOD.

 Is there a way/pipeline available in Stanbol that starts from an RDF
 description and links it to the LOD? To be honest, I have already spotted
 things like Datalift and Silk, but I was just wondering if something like
 that was available with Stanbol.

If I have understood correctly, I would say that you can use a Google Refine
extension that uses Stanbol to reconcile your current data with LOD datasets
imported into Stanbol, such as DBpedia. You can find more information here:

https://code.google.com/p/lmf/wiki/GoogleRefineUsersDocumentation

Hope that helps.
Cheers,

Rafa



 Many thanks,

 -M-
 --
 Maatari Daniel Okouya
 Sent with Airmail

 On 29 May 2014 at 09:24:54, Rafa Haro (rh...@apache.org) wrote:

 Hi Maatari,

 On 29/05/14 02:27, Maatari Daniel Okouya wrote:

  Rafa,

  Many thanks for your detailed answer.

  It seems to me from your detailed answer that I had not completely
 grasped the concepts behind Stanbol. Its primary purpose is semantically
 annotating the content of a file for the purpose of semantic search.
 One could repurpose the enhancing infrastructure to get the generated
 description and apply some SPARQL rules to get the description in the
 desired format, but it is not geared toward linked data out of the box.
 By that I mean generating a description that you could publish as is, which
 is what I was looking for. As you say, the best match here is the description
 returned by the Topic annotation engine and maybe a few things extracted by
 Tika.

 Well, the primary purpose or use case doesn't necessarily have to be
 Semantic Search. I would say that Stanbol helps with the task of extracting
 semantic metadata from content (semantic lifting). It is true that the most
 common form of metadata extraction is Entity Linking, and there is a
 reason for that: Stanbol was born as a tool for Content Management Systems,
 where companies are expected to manage domain vocabularies that can be
 used to enrich the enterprise content. Anyway, the enhancer has been
 modularized around extraction engines, so you can perfectly well implement
 an engine for your use case and take advantage of the Stanbol APIs to express
 your extracted metadata as RDF.


  I mean i still need to read a bit, but this is what i get for now, from
 your explanation and my readings.

  Am I close ?

 I think so :-). Cheers

 Rafa


  Best,
  -M-
  --
 Maatari Daniel Okouya
 Sent with Airmail

 On 28 May 2014 at 13:46:00, Rafa Haro (rh...@apache.org) wrote:

  Hi Maatari,

 On 27/05/14 21:05, Maatari Daniel Okouya wrote:
  Hi ,
 
  Completing my previous question, I think it would be better for me to
 give the bigger picture of what I’m trying to achieve.

  I have been charged with helping to disseminate the publication
 content of my organisation. Most of it is in PDF.

  Therefore, I need a process to produce a meaningful RDF description of
 our content that links as much as possible to the LOD cloud and LOV (Linked
 Open Vocabularies). Hence I need to use common core vocabularies as much as
 I can, i.e. Dublin Core, schema.org, BIBO, FOAF, etc., and reference entities
 from DBpedia, for instance.

  Searching the web for how to automatically generate these
 descriptions, which would include creator, publisher, primaryTopic, subject,
 thematic, etc., it seemed to me that Apache Stanbol was the best match.

 With Stanbol you can enrich your content with your own vocabularies or
 datasets from the LOD cloud, as long as you first import them as a site.
 Let's say that the out-of-the-box enrichment process consists of linking
 pieces of text (like entities'/concepts' names/labels) with entities
 within your datasets.
 



