Thanks Fabian.

Here are the answers:

> 1) You introduce a new endpoint http://localhost:8080/api/tasks

Correct. The API front-end is meant to focus on developer adoption by
providing APIs that ease integration.

> 2) The endpoint consumes JSON that either has the HTML content or a URL
> pointing to HTML content

When the consumer posts:
 - a 'content', it is sent straight to the enhancement chain.
 - a URL, it is parsed with Readability and the output content is then sent
to the enhancement chain. Note that if Readability detects that the URL
points to an article split across multiple pages (and therefore multiple
URLs), it will then load the content from all the related URLs.
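
A minimal sketch of the two kinds of submission described above, written in Python for brevity. Only the endpoint and the 'content' property come from this thread; the 'url' property name and the helper name build_task_request are assumptions made for illustration, so check TaskService.java for the real field names.

```python
import json
import urllib.request

# Endpoint quoted in the thread.
TASKS_ENDPOINT = "http://localhost:8080/api/tasks"


def build_task_request(content=None, url=None):
    """Build (but do not send) a POST request for the tasks endpoint."""
    # Exactly one of the two inputs is expected.
    if (content is None) == (url is None):
        raise ValueError("provide exactly one of content or url")
    # 'content' is named in the thread; 'url' is an assumed property name.
    payload = {"content": content} if content is not None else {"url": url}
    return urllib.request.Request(
        TASKS_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the task to a running instance and, since the API is synchronous, return the result in the response body.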

> 3) The accepted media-type is also defined in the JSON file for the request

The HTTP *Accept* header is currently ignored. It would probably be more
correct to drop the *mimeType* property and rely solely on the *Accept*
header.
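
One possible transition, sketched in Python under stated assumptions: honour the Accept header when it names a concrete media type, and fall back to the mimeType property otherwise. The function names and the application/rdf+xml default are hypothetical, and this is not how the facade currently behaves.

```python
def preferred_media_type(accept_header):
    """Return the media range with the highest q-value, or None."""
    best, best_q = None, -1.0
    for part in (accept_header or "").split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0]
        if not media:
            continue
        q = 1.0  # a media range without a q parameter has weight 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > best_q:
            best, best_q = media, q
    return best


def resolve_media_type(accept_header, payload, default="application/rdf+xml"):
    """Prefer the Accept header, then the payload's mimeType, then a default."""
    media = preferred_media_type(accept_header)
    if media and media != "*/*":
        return media
    # The default value is an assumption, not taken from the facade.
    return payload.get("mimeType", default)
```

Once clients have moved to the header, the mimeType fallback could be removed without changing the rest of the logic.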

> 4) Using readability the HTML is cleaned and then some enhancement chain is
> triggered. Which chain is used here?

The default chain is used unless the consumer specifies which chain to use
by setting the chainName property in the JSON payload [1].
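
The selection rule can be illustrated with a short Python sketch. It assumes the request is forwarded over Stanbol's REST enhancer paths (/enhancer for the default chain, /enhancer/chain/{name} for a named one); the facade itself may well call the enhancer in-process instead, and the function name enhancer_path is hypothetical.

```python
from urllib.parse import quote


def enhancer_path(payload):
    """Pick the enhancer path for a task payload."""
    chain = payload.get("chainName")
    if not chain:
        # No chainName given: use the default chain.
        return "/enhancer"
    # Percent-encode the name so it is safe as a single path segment.
    return "/enhancer/chain/" + quote(chain, safe="")
```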

> 5) The usual enhancement RDF is returned to the user

Correct.

BR,
David

[1] ln 95:
https://github.com/insideout10/stanbol-facade/blob/master/stanbol-facade-api/src/main/java/io/insideout/stanbol/facade/services/TaskService.java




On Mon, Jan 14, 2013 at 12:21 PM, Fabian Christ <[email protected]> wrote:

> Hi David,
>
> nice idea. First let me summarize what this contribution is about to see if
> I understood it correctly.
>
> 1) You introduce a new endpoint http://localhost:8080/api/tasks
> 2) The endpoint consumes JSON that either has the HTML content or a URL
> pointing to HTML content
> 3) The accepted media-type is also defined in the JSON file for the request
> 4) Using readability the HTML is cleaned and then some enhancement chain is
> triggered. Which chain is used here?
> 5) The usual enhancement RDF is returned to the user
>
> Is this what it does?
>
> Thanks,
>   - Fabian
>
>
> 2013/1/14 David Riccitelli <[email protected]>
>
> > Hello,
> >
> > I would like to introduce one more contribution for Apache Stanbol.
> >
> > It is not an engine, but an HTTP API for Stanbol which pre-processes and
> > submits analysis tasks, and returns the result synchronously to the
> > consumer. It aims to simplify development integrations and to provide a
> > powerful pre-processing API for analysis of URLs.
> >
> > It integrates the *Readability* library, in order to support URL
> > submissions:
> >  - loading contents from remote URLs and
> >  - cleaning them up of all the surrounding noise.
> >
> > Readability is the same library behind the *Reader* function of Safari
> > that many users know already.
> >
> > To summarize:
> >
> >    - extremely simple APIs to ease prototyping, integration and usage
> >    - support for textual contents
> >    - support for URLs
> >    - *for URLs, preprocessing of HTML pages to capture the actual URL
> >    content while skipping noise such as ads, menus and so forth*
> >    - synchronous access (for asynchronous access see idntik.it)
> >
> > You can find more information and the source code here:
> > https://github.com/insideout10/stanbol-facade
> >
> > Shall I open a JIRA to discuss a possible integration in the trunk?
> >
> > BR,
> > David Riccitelli
> >
> > -- check the Swagger for WordLift <http://bit.ly/VtoM5H>
> >
> >
> > ********************************************************************************
> > InsideOut10 s.r.l.
> > P.IVA: IT-11381771002
> > Fax: +39 0110708239
> > ---
> > LinkedIn: http://it.linkedin.com/in/riccitelli
> > Twitter: ziodave
> > ---
> > Layar Partner Network
> > <http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
> >
> >
> > ********************************************************************************
> >
>
>
>
> --
> Fabian
> http://twitter.com/fctwitt
>



-- 
David Riccitelli

