A big [X] +1 (non-binding): Accept Any23 into the Apache Incubator
(I am also interested in contributing.)

Thanks,
Enrico Daga

On 27 September 2011 16:39, Ramirez, Paul M (388J) <paul.m.rami...@jpl.nasa.gov> wrote:
> Hey All,
>
> +1
>
> Thanks,
> Paul Ramirez
>
> On Sep 26, 2011, at 10:18 PM, Mattmann, Chris A (388J) wrote:
>
>> Hi Folks,
>>
>> OK, the proposal period has died down and I'm now calling a formal VOTE on
>> the Any23 proposal located here:
>>
>> http://wiki.apache.org/incubator/Any23Proposal
>>
>> Proposal text copied at the bottom of this email. I'll leave the VOTE open
>> through the rest of the week, and close it around Saturday, October 1,
>> early AM PDT.
>>
>> Please VOTE:
>>
>> [ ] +1 Accept Any23 into the Apache Incubator
>> [ ] +0 Don't care
>> [ ] -1 Don't Accept Any23 into the Apache Incubator because...
>>
>> Thanks!
>>
>> Cheers,
>> Chris
>>
>> P.S. Here's my +1
>>
>> Proposal Text:
>>
>> = Any23 =
>> == Abstract ==
>> The following proposal is about ''Anything To Triples'' (Any23 for short),
>> defined as a Java library, a Web service and a set of command line tools to
>> extract and validate structured data in [[http://www.w3.org/RDF/|RDF]]
>> format from a variety of Web documents and markup formats. Any23 is what is
>> informally called an ''RDF Distiller''.
>>
>> == Proposal ==
>> Any23 "Anything to Triples" is a library written in Java 6 and released
>> under the Apache 2.0 License. It provides a set of extractors for scraping
>> semantic markup (such as [[http://microformats.org/|Microformats]],
>> [[http://www.w3.org/TR/rdfa-syntax/|RDFa]] and
>> [[http://www.w3.org/TR/microdata/|Microdata]]) from several sources (HTML4,
>> XHTML5, CSV), a set of data validations, and a set of parsers and writers to
>> handle the main RDF transport formats (RDF/XML, N-Triples, N-Quads, Turtle).
>> The library provides a command line tool for dealing with data extraction,
>> conversion and validation, and a REST service implementation.
>> The library is plugin based, allowing the hot loading of new extractors and
>> validators. Any23 enables third-party developers to access structured data
>> from Web pages without the need to implement ad-hoc scraping techniques. In
>> this sense, Any23 will relieve developers from building complex solutions
>> when developing data acquisition pipelines and processes targeted at
>> semantically marked-up Web data.
>>
>> == Background ==
>> Any23 was initially developed at [[http://www.deri.ie/|DERI (Digital
>> Enterprise Research Institute)]] as the main component of the RDF extraction
>> pipeline used in [[http://sindice.com/|Sindice (the Semantic Web Index)]],
>> and is now evolving in a joint effort with [[http://www.fbk.eu/|FBK
>> (Fondazione Bruno Kessler)]]. At present the official Any23
>> [[http://developers.any23.org|developers page]] contains all the
>> documentation, while the code is maintained on
>> [[http://code.google.com/p/any23/|Google Code]]. An official up-to-date
>> showcase [[http://any23.org|demo]] is also available.
>>
>> == Rationale ==
>> Providing and maintaining a robust, standard and up-to-date library for
>> extracting and validating semantic markup from heterogeneous sources would
>> provide large benefits to the entire Open Source community. Researchers and
>> academic projects have been adopting RDF-related technologies for years,
>> while industry is now moving toward Semantic Web technologies more
>> concretely. Several industry initiatives related to the
>> [[http://en.wikipedia.org/wiki/Semantic_Web|Web of Data]] are taking place
>> in these months.
>> [[http://schema.org|Schema.org]], for example, is an initiative sponsored by
>> [[http://www.google.com/about/corporate/company/|Google Inc]],
>> [[http://info.yahoo.com/center/us/yahoo/|Yahoo Inc]] and
>> [[http://www.microsoft.com/about/companyinformation/en/us/default.aspx|Microsoft
>> Corporation]] to structure the data in a harmonized way on
>> [[http://dev.w3.org/html5/spec/Overview.html|HTML5]] pages.
>> [[http://schema.org|Schema.org]] builds on the
>> [[http://dev.w3.org/html5/md/|HTML5 Microdata]] native specification.
>> [[http://ogp.me/|OpenGraphProtocol]] is the open standard sponsored by
>> [[https://www.facebook.com/pages/Facebooking/114721225206500|Facebook Inc]]
>> to include metadata in HTML page headers.
>> [[http://ogp.me/|OpenGraphProtocol]], initially based on
>> [[http://www.w3.org/TR/xhtml-rdfa-primer/|RDFa]], makes it possible to
>> describe the content of a Web page, and its underlying vocabulary could be
>> directly represented using RDF.
>>
>> = Current Status =
>> == Meritocracy ==
>> The historical Any23 team believes in meritocracy and has always acted as a
>> community. A mailing list, an open issue tracker and other communication
>> channels have been in use since the first release. Adoption by a larger
>> community, such as Apache, is the natural evolution for Any23. Moreover,
>> the Apache standards will reinforce the existing Any23 community practices
>> and will be a foundation for future committers' involvement.
>>
>> == Core Developers ==
>> In alphabetical order:
>>
>> * Davide Palmisano <dpalmisano at gmail dot com>
>> * Giovanni Tummarello <giovanni dot tummarello at deri dot org>
>> * Michele Mostarda <michele dot mostarda at gmail dot com>
>> * Richard Cyganiak <richard at cyganiak dot de>
>> * Reto Bachmann-Gmuer <reto at apache dot org>
>> * Simone Tripodi <simonetripodi at apache dot org>
>> * Szymon Danielczyk <danielczyk.szymon at gmail dot com>
>> * Tommaso Teofili <tommaso at apache dot org>
>>
>> == Alignment ==
>> The main aim of the project is to develop and maintain a full-featured
>> semantic markup distiller that can be used by other Apache projects that
>> need an RDF extraction tool. The Any23 library core is written using the
>> following Apache libraries.
>>
>> * [[http://commons.apache.org/lang/|Apache Commons Lang]]
>> * [[http://hc.apache.org/httpclient-3.x/|Apache Commons HTTP Client]]
>> * [[http://commons.apache.org/codec/|Apache Commons Codec]]
>> * [[http://tika.apache.org/|Apache Tika]]
>> * [[http://commons.apache.org/cli/|Apache Commons CLI]]
>> * [[http://poi.apache.org/|Apache POI]]
>>
>> The Any23 service is targeted to run within any compliant Servlet container
>> like Tomcat.
>>
>> = Known Risks =
>> == Orphaned Products ==
>> The increasing number of Any23 adopters and the rising interest in
>> Semantic Web related technologies lead us to believe that there is minimal
>> risk of this work being abandoned by the community. Moreover, Any23
>> has already been used in production by Sindice.com and other DERI projects
>> for years.
>>
>> == Inexperience with Open Source ==
>> All of the committers have experience working in one or more open source
>> projects inside and outside the ASF.
>>
>> == Homogeneous Developers ==
>> The initial committers are geographically distributed across Europe
>> with no one company being associated with a majority of the developers.
>> Many of these initial developers are experienced Apache committers already
>> and all are experienced with working in distributed development communities.
>>
>> == Reliance on Salaried Developers ==
>> To the best of our knowledge, most of the initial committers are paid to
>> develop code for this project because Any23 is used in their organizations'
>> infrastructure. In any case, some of the core historical developers (some
>> of them no longer paid by the original companies behind Any23) are still
>> committing even though Any23 is not used in their current organizations.
>> Any23 has already proven its ability to attract external developers.
>>
>> == Relationships with Other Apache Products ==
>> In recent years, other projects relying on the Semantic Web technology
>> stack, such as Apache Clerezza, Stanbol and Jena, have entered the ASF
>> incubation process. This can be seen as evidence of the consolidation and
>> growing adoption of such technologies. Apart from the specifics of those
>> projects, which share the same underlying stack, Any23 could be employed
>> in any project needing a reliable framework to access structured semantic
>> markup. The Any23 core could also easily be released as an
>> [[http://wiki.apache.org/nutch/PluginCentral|Apache Nutch Plugin]] and then
>> used to conveniently fill
>> [[http://www.openrdf.org/doc/sesame2/system/ch05.html|SAIL-compliant]]
>> triple stores.
>>
>> == An Excessive Fascination with the Apache Brand ==
>> Even though the Any23 community recognizes the power and attractiveness of
>> the ASF brand, we are well aware of our already established role in
>> the wider Semantic Web developer community. Any23 has already proved its
>> reliability in closely supporting all the new specifications coming from the
>> Microformats communities, our major contributors in terms of issues opened
>> for new feature requests.
>> Furthermore, we are convinced that we can enthusiastically bring new and
>> fresh energy into the ASF in order to improve our vision, insight and
>> knowledge about the other projects and, most importantly, to have the
>> possibility of enlarging our small community with talented and passionate
>> developers.
>>
>> = Documentation =
>> Any23 Documentation
>>
>> 1. [[http://developers.any23.org/|Any23 Project Homepage]]
>> 1. [[http://code.google.com/p/any23/|Any23 Developer Homepage]]
>> 1. [[http://any23.org/|Any23 Live Demo]]
>>
>> Any23 Related Specifications
>>
>> 1. [[http://www.w3.org/RDF/|RDF]]
>> 1. [[http://www.w3.org/TR/html5/|HTML5]]
>> 1. [[http://www.w3.org/TR/rdfa-syntax/|RDFa]]
>> 1. [[http://www.w3.org/TR/microdata/|Microdata]]
>> 1. [[http://microformats.org/|Microformats]]
>> 1. [[http://www.w3.org/TR/rdf-syntax-grammar/|RDF/XML]]
>> 1. [[http://www.w3.org/TeamSubmission/turtle/|Turtle]]
>> 1. [[http://www.w3.org/TR/rdf-testcases/#ntriples|N-Triples]]
>> 1. [[http://sw.deri.org/2008/07/n-quads/|N-Quads]]
>>
>> Other Any23 Documentation
>>
>> 1. [[http://www.slideshare.net/dpalmisano/distilling-the-web-of-data-drop-by-drop-with-java|Any23 presentation on Slideshare]]
>>
>> = Initial Source =
>> The initial source comprises code developed on
>> [[http://code.google.com/p/any23/|GoogleCode]] licensed under the Apache
>> License 2.0 (to be contributed under Grant from Giovanni Tummarello for
>> Any23).
>>
>> = Source and Intellectual Property Submission Plan =
>> Source code will be moved from the
>> [[http://code.google.com/p/any23/|GoogleCode]] space into the SVN space of
>> the podling.
>>
>> = External Dependencies =
>> All the external dependencies (and their licenses) used by Any23 follow:
>>
>> * [[http://nekohtml.sourceforge.net/|Nekohtml]] (Apache 2.0)
>> * [[http://www.openrdf.org|OpenRDF Sesame]] (BSD-style license)
>> * [[http://jetty.codehaus.org/jetty/|Jetty]] (Apache License 2.0 and Eclipse Public License 1.0)
>> * [[http://code.google.com/p/jspf/|Java Simple Plugin Framework]] (new BSD License)
>> * [[http://code.google.com/p/boilerpipe/|Boilerpipe]] (Apache License 2.0)
>> * [[http://www.slf4j.org/|slf4j]] (MIT License)
>> * [[http://www.junit.org/|junit]] (Common Public License - v 1.0)
>> * [[http://mockito.org/|Mockito]] (MIT License)
>>
>> = Cryptography =
>> The project does not handle cryptography in any way.
>>
>> = Required Resources =
>> * Mailing lists
>>   * any23-private (with moderated subscriptions)
>>   * any23-dev
>>   * any23-user
>>   * any23-commits
>> * Subversion directory
>>   * https://svn.apache.org/repos/asf/incubator/any23
>> * Website
>>   * Confluence (ANY23)
>> * Issue Tracking
>>   * JIRA (ANY23)
>>
>> = Initial Committers =
>> Names of initial committers - in alphabetical order - with current ASF
>> status:
>>
>> * Chris Mattmann <mattmann at apache dot org> (Member)
>> * Davide Palmisano <dpalmisano at gmail dot com> (ICLA signed)
>> * Giovanni Tummarello <giovanni dot tummarello at deri dot org> (ICLA signed)
>> * Lewis John !McGibbney <lewismc at apache dot org> (PMC Member)
>> * Michele Mostarda <michele dot mostarda at gmail dot com> (ICLA signed)
>> * Paul Ramirez <pramirez at apache dot org> (Member)
>> * Reto Bachmann-Gmuer <reto at apache dot org> (Committer)
>> * Szymon Danielczyk <danielczyk.szymon at gmail dot com> (ICLA signed)
>>
>> = Sponsors =
>> == Champion ==
>> * Chris Mattmann <mattmann at apache dot org> (Member)
>>
>> == Nominated Mentors ==
>> * Chris Mattmann <mattmann at apache dot org>
>> * Paul Ramirez <pramirez at apache dot org>
>> * Simone Tripodi <simonetripodi at apache dot org>
>> * Tommaso Teofili <tommaso at apache dot org>
>>
>> == Sponsoring Entity ==
>> * Tika PMC
>>
>> = Other interested people (in alphabetical order) =
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattm...@nasa.gov
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>

--
Enrico Daga

--
http://www.enridaga.net
skype: enri-pan