Thanks for the valuable feedback. To summarize:

a) Unless there's a specific use case, the API could be implemented inside a pre-processing engine (Fabian).
b) Other tools exist to extract content from HTML, such as Boilerpipe and Goose (Goose was initially based on Readability). It could be worth trying these tools as well, to understand which one performs best and eventually let the consumer choose the most suitable tool for the requested analysis (Andrea).
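
To make a) and b) a bit more concrete, here is a rough sketch of what I understand by "a configurable text extraction step in front of the other engines". It is plain Python for brevity, not Stanbol code. All names (TextCollector, EXTRACTORS, run_chain) are hypothetical, and the two extractors are naive stand-ins built on the standard library, not Readability, Boilerpipe or Goose.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Optional


class TextCollector(HTMLParser):
    """Collects text nodes, always skipping <script> and <style> contents.

    If keep_only is set (e.g. "p"), only text inside that element is kept.
    """

    def __init__(self, keep_only: Optional[str] = None) -> None:
        super().__init__()
        self.keep_only = keep_only
        self.skip_depth = 0
        self.keep_depth = 0
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag == self.keep_only:
            self.keep_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == self.keep_only:
            self.keep_depth = max(0, self.keep_depth - 1)

    def handle_data(self, data):
        wanted = self.keep_only is None or self.keep_depth > 0
        if self.skip_depth == 0 and wanted and data.strip():
            self.parts.append(data.strip())


def _extract(html: str, keep_only: Optional[str] = None) -> str:
    parser = TextCollector(keep_only)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical registry: the consumer picks an extractor by name.
EXTRACTORS: Dict[str, Callable[[str], str]] = {
    "all-text": lambda html: _extract(html),
    "paragraphs": lambda html: _extract(html, keep_only="p"),
}


def run_chain(html: str, extractor: str,
              engines: List[Callable[[str], object]]) -> List[object]:
    """Run the chosen extractor first, then feed its text to each engine."""
    if extractor not in EXTRACTORS:
        raise ValueError("unknown extractor: " + extractor)
    text = EXTRACTORS[extractor](html)
    return [engine(text) for engine in engines]
```

In this sketch the extractor is simply the first step of the chain, and each later engine receives the extracted text rather than the raw HTML. A call such as run_chain(page, "paragraphs", [some_engine]) would select the extractor per request. Whether the real engines can be wired like this is exactly what I am unsure about.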

About point a), I have a question. Since the API allows selecting the Enhancement Chain, how would that work if we move the API into an engine? Can the engine be executed outside the scope of an enhancement chain?

Shall we move this discussion to a JIRA issue?

Thanks,
David

On Mon, Jan 14, 2013 at 12:53 PM, Andrea Di Menna <[email protected]> wrote:

> Hi David,
>
> what is the performance of Readability compared with other text extraction
> tools like Boilerpipe [1] or Goose [2]?
>
> I think it would be interesting to extend your approach to configurable
> text extraction engines.
> As Fabian is suggesting, to create specialized enhancement engines which
> extract text from HTML contents and feed them to other enhancement engines.
>
> Regards,
> Andrea
>
> [1] http://code.google.com/p/boilerpipe/
> [2] https://github.com/jiminoc/goose/wiki
>
> 2013/1/14 Fabian Christ <[email protected]>
>
> > 2013/1/14 David Riccitelli <[email protected]>
> >
> > > > 1) You introduce an new endpoint http://localhost:8080/api/tasks
> > >
> > > Correct. Ideally the API front-end focuses more on developer adoption
> > > so to provide APIs that ease integration.
> > >
> >
> > At this point I am not sure how we want to add such an API layer. Or even
> > if we want such a thing at all. It may be confusing for people.
> >
> > Why not add your service as an enhancement engine that can be configured
> > as the first engine in an enhancement chain. That would be the natural
> > way of doing it with the stuff we have right now.
> >
> > What is the benefit of having another REST API facade? Are there more use
> > cases for it?
> >
> > Thanks,
> > - Fabian
> >
> > --
> > Fabian
> > http://twitter.com/fctwitt
> >
>

--
David Riccitelli

-- check the Swagger for WordLift <http://bit.ly/VtoM5H>
********************************************************************************
InsideOut10 s.r.l.
P.IVA: IT-11381771002
Fax: +39 0110708239
---
LinkedIn: http://it.linkedin.com/in/riccitelli
Twitter: ziodave
---
Layar Partner Network<http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
********************************************************************************
