+1

Best regards
Pierre

On Thu, 10 Aug 2017 at 21:09 John D. Ament <johndam...@apache.org> wrote:

> +1 to accept
>
> On Thu, Aug 10, 2017 at 3:03 PM Steve Lawrence <stephen.d.lawre...@gmail.com> wrote:
>
> > Hi All,
> >
> > Based on the discussion on the incubator mailing list [1], I would like
> > to start a VOTE to bring the Daffodil project in as an Apache incubator
> > podling.
> >
> > The ASF voting rules are described:
> >
> > https://www.apache.org/foundation/voting.html
> >
> > A vote for accepting a new Apache Incubator podling is a majority vote
> > for which only Incubator PMC member votes are binding.
> >
> > This vote will run for at least 72 hours. Please VOTE as follows
> > [] +1 Accept Daffodil into the Apache Incubator
> > [] +0 Abstain.
> > [] -1 Do not accept Daffodil into the Apache Incubator because ...
> >
> > The proposal is listed below, but you can also access it on the wiki:
> >
> > https://wiki.apache.org/incubator/DaffodilProposal
> >
> > Thank you,
> > - Steve
> >
> > [1] https://lists.apache.org/thread.html/190d73e84508d2deaa6cfde1be197cb70ca4caddfb215bc269b3e44f@%3Cgeneral.incubator.apache.org%3E
> >
> > = Daffodil Proposal =
> >
> > == Abstract ==
> >
> > Daffodil is an implementation of the Data Format Description Language
> > (DFDL) used to convert between fixed format data and XML/JSON.
> >
> > == Proposal ==
> >
> > The Data Format Description Language (DFDL) is a specification,
> > developed by the Open Grid Forum, capable of describing many data
> > formats, including both textual and binary, scientific and numeric,
> > legacy and modern, commercial record-oriented, and many industry and
> > military standards. It defines a language that is a subset of W3C XML
> > schema to describe the logical format of the data, and annotations
> > within the schema to describe the physical representation.
> >
> > Daffodil is an open source implementation of the DFDL specification that
> > uses these DFDL schemas to parse fixed format data into an infoset,
> > which is most commonly represented as either XML or JSON. This allows
> > the use of well-established XML or JSON technologies and libraries to
> > consume, inspect, and manipulate fixed format data in existing
> > solutions. Daffodil is also capable of the reverse by serializing or
> > "unparsing" an XML or JSON infoset back to the original data format.
> >
> > == Background ==
> >
> > Many different software solutions need to consume and manage data,
> > including data directed routing, databases, data analysis, data
> > cleansing, data visualizing, and more. A key aspect of such solutions is
> > the need to transform the data into an easily consumable format.
> > Usually, this means that for each unique data format, one develops a
> > tool that can read and extract the necessary information, often leading
> > to ad-hoc and data-format-specific description systems. Such systems are
> > often proprietary, not well tested, and incompatible, leading to vendor
> > lock-in, flawed software, and increased training costs. DFDL is a new
> > standard, with version 1.0 completed in October of 2016, that solves
> > these problems by defining an open standard to describe many different
> > data formats and how to parse and unparse between the data and XML/JSON.
> >
> > Two closed source implementations of DFDL currently exist. The first was
> > created by IBM and is now part of their IBM® Integration Bus product.
> > The second was created by the European Space Agency, called DFDL4S or
> > "DFDL for Space" targeted at the challenges of their satellite data
> > processing.
> >
> > Around 2005, Pacific Northwest National Lab created Defuddle, built as
> > an open source implementation and proof of concept of the draft DFDL
> > specification and a test bed to feed new concepts into specification
> > development.
> > Primary development of Defuddle was eventually taken over
> > by the National Center for Supercomputing Applications (NCSA). However,
> > due to evolution of the DFDL specification and architectural and
> > performance issues with Defuddle, around 2009, NCSA restarted the
> > project with the new name of Daffodil, with a goal of implementing the
> > complete DFDL specification. Daffodil development continued at NCSA
> > until around 2012, at which point development slowed due to budget
> > limitations. Shortly thereafter, primary development was picked up by
> > Tresys Technology where it continues today, with contributions from
> > other entities such as the Navy Research Lab, the Air Force Research
> > Lab, MITRE, and Booz Allen Hamilton. In February of 2015, Daffodil
> > version 1.0.0 was released, including support for the DFDL features
> > needed to parse many common file formats. Daffodil version 2.0.0 is
> > expected to be released in August of 2017, which will include unparse
> > support with one-to-one parsing feature parity.
> >
> > Entities including IBM, MITRE, NATO NCI Agency, Northrop-Grumman, Quark
> > Security, Raytheon, and Tresys Technology have developed DFDL schemas
> > for many data formats from varying technology domains, including PNG,
> > GIF, BMP, PCAP, HL7, EDIFACT, NACHA, vCard, iCalendar, and MIL-STD-2045,
> > many of which are publicly available on the DFDL Schemas GitHub. There
> > are also a number of military-application data formats, the
> > specifications of which are not public, which have historically been
> > very difficult and expensive to process, and for which DFDL schemas have
> > been created or are actively in development; these include
> > MIL-STD-6040/USMTF ATO, MIL-STD-6017/VMF, and MIL-STD-6016/NATO STANAG
> > 5516 (aka "Link16").
> >
> > == Rationale ==
> >
> > Numerous software solutions exist that consume, inspect, analyze, and
> > transform data, many of which can be found in the Apache Software
> > Foundation (ASF).
> > In order for tools like these to consume new types of
> > data, custom extensions are usually required, often with high
> > development and testing costs. Daffodil fills a clear gap in many of
> > these solutions, providing a simple and low cost way to transform data
> > to XML or JSON, which many of these tools natively support already. With
> > the upcoming 2.0.0 release, the Daffodil project will have achieved a
> > level of functionality in both parse and unparse that, when integrated
> > into existing solutions, could provide for a new method to quickly
> > enable support for new data formats.
> >
> > == Initial Goals ==
> >
> > * Relicense the existing code from the University of Illinois/NCSA Open
> > Source License to the Apache License version 2.0, working with Apache
> > Legal to ensure correctness, and with Daffodil contributors to get their
> > permission.
> > * Move the existing codebase, documentation, bugs, and mailing lists to
> > the Apache hosted infrastructure
> > * Establish a formal release process and schedule, allowing for
> > dependable release cycles in a manner consistent with the Apache
> > development process.
> > * Build relationships with ASF projects to add Daffodil support where
> > appropriate
> > * Grow the community to establish a diversity of background and
> > expertise.
> >
> > == Current Status ==
> >
> > === Meritocracy ===
> >
> > All initial committers are familiar with the principles of meritocracy.
> > The Daffodil project has followed the model of meritocracy in the past,
> > providing multiple outside entities commit access based on the quality
> > of their contributions. In order to grow the Daffodil user base and
> > development community, we are dedicated to continuing to operate
> > Daffodil as a meritocracy.
> >
> > A key ingredient in a meritocracy of developers is open group code
> > review.
> > The Daffodil project has operated in this mode throughout its
> > existence and this provides a forum to improve the code, verify code
> > quality, and educate new developers on the code base.
> >
> > === Community ===
> >
> > Daffodil has a small community of users and developers. Although primary
> > Daffodil development is done by Tresys Technology, a handful of other
> > contributions have come from other entities including the Navy Research
> > Lab, the Air Force Research Lab, MITRE, and Booz Allen Hamilton. In
> > addition to developers, multiple users of Daffodil have created DFDL
> > schemas, including entities such as MITRE, IBM, Raytheon, Quark
> > Security, and Tresys Technology. The DFDL Schemas GitHub community has
> > been created as a place for DFDL schemas to be published. The Daffodil
> > project also makes use of mailing lists, HipChat, and Confluence
> > Questions to build a community of users and system for support.
> >
> > === Core Developers ===
> >
> > The core developers of Daffodil are employed by Tresys Technology. We
> > will work to grow the community among a more diverse set of developers
> > and industries.
> >
> > === Alignment ===
> >
> > Daffodil was created as an open source project with a philosophy
> > consistent with The Apache Way. A strong belief in meritocracy,
> > community involvement in decisions, openness, and ensuring a high level
> > of quality in code, documentation, and testing are some of our shared
> > core beliefs.
> >
> > Further, as mentioned in the Rationale section, Daffodil fills a gap
> > that exists in many ASF projects, including NiFi, Spark, Storm, Hadoop,
> > Tika, and others. In order for tools like these to consume new types of
> > data, custom extensions are usually required. Rather than create such
> > extensions, Daffodil provides an easy and standards-compliant way to
> > transform data to XML or JSON, which many of these tools already
> > natively support.
> >
> > == Known Risks ==
> >
> > === Orphaned Products ===
> >
> > The current core developers are the leading contributors in the space of
> > DFDL and wish to see it flourish. Though there is some risk that the
> > initial committers all come from the same company, a goal of entering
> > into incubation is to grow the development community to minimize the
> > risk of reliance on a single company.
> >
> > === Inexperience with Open Source ===
> >
> > The Daffodil project began as an open source project and has continued
> > that model throughout development. This includes public bug tracking,
> > git revision control, automated builds and tests, and a public wiki for
> > documentation.
> >
> > Additionally, the current core developers and initial committers all
> > work for a company that relies on, believes in, promotes, and has led or
> > contributed to many open source software projects, including SELinux
> > Userspace, OpenSCAP, CLIP, refpolicy, setools, RPM, and others. As such,
> > there is low risk related to inexperience with open source software and
> > processes.
> >
> > === Homogeneous Developers ===
> >
> > The proposed initial committers come from a single entity, though we are
> > committed to growing the Daffodil development community to include a
> > broad group of additional committers from a wide array of industries.
> >
> > === Reliance on Salaried Developers ===
> >
> > The proposed initial committers are paid by their employer to contribute
> > to the Daffodil project. We expect that Daffodil development will
> > continue with salaried developers, and are committed to growing the
> > community to include non-salaried developers as well.
> >
> > === Relationship with other Apache Projects ===
> >
> > As mentioned in the Alignment section, Daffodil fills a clear gap in
> > numerous other ASF projects that consume and manage large amounts of
> > data.
> >
> > As a specific example, Daffodil developers have created a Daffodil
> > Apache NiFi Processor, currently in use in data transfer solutions,
> > which allows one to ingest non-native data into an Apache NiFi pipeline
> > as XML or JSON. This processor was well received by the Apache NiFi
> > developers, with positive comments about the concise API and how it
> > could handle non-native data. Daffodil developers have also successfully
> > prototyped integration with Apache Spark. We believe Daffodil could
> > provide a strong benefit to many other ASF projects that handle fixed
> > format data. We anticipate working closely with such ASF projects to
> > include Daffodil where applicable to increase their ability to support
> > new data formats with minimal effort.
> >
> > Daffodil also depends on existing ASF projects, including Apache Commons
> > and Apache Xerces.
> >
> > === An Excessive Fascination with the Apache Brand ===
> >
> > Although the Apache brand may certainly help to attract more
> > contributors, publicity is not the reason for this proposal. We believe
> > Daffodil could provide a great benefit to the ASF and the numerous data
> > focused projects that comprise it, as described in the Rationale and
> > Alignment sections. We hope to build a strong and vibrant community
> > built around The Apache Way, and not dependent on a single company.
> >
> > === Documentation ===
> >
> > Daffodil documentation can be found at:
> >
> > * https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL
> >
> > Information about DFDL can be found at:
> >
> > * https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> > * https://www.ibm.com/support/knowledgecenter/en/SSMKHH_9.0.0/com.ibm.etools.mft.doc/df20060_.htm
> >
> > Public examples of DFDL Schemas can be found at:
> >
> > * https://github.com/DFDLSchemas
> >
> > == Initial Source ==
> >
> > The Daffodil git repo goes back to mid-2011 with approximately 20
> > different contributors and feedback from many users and developers. The
> > core codebase is written in Scala and includes both a Scala and Java
> > API, along with Javadocs and Scaladocs for API usage. The initial code
> > will come from the git repository currently hosted by NCSA at the
> > University of Illinois:
> >
> > https://opensource.ncsa.illinois.edu/bitbucket/projects/DFDL/repos/daffodil/
> >
> > == Source and Intellectual Property Submission ==
> >
> > The complete Daffodil code is licensed under the University of
> > Illinois/NCSA Open Source License. Much of the current codebase has been
> > developed by Tresys Technology, who is open to relicensing the code to
> > the Apache License version 2.0 and donating the source to the ASF.
> > Contacts at NCSA are also open to relicensing their contributions to
> > Apache v2. We plan to contact the other contributors and ask for
> > permission to relicense and donate their contributed code. For those
> > who decline or cannot be contacted, their code will be removed or
> > replaced. We will work closely with Apache Legal to ensure all issues
> > related to relicensing are acceptable.
> >
> > == External Dependencies ==
> >
> > We believe all current dependencies are compatible with the ASF
> > guidelines. Our dependency licenses come from the following license
> > styles: Apache v2, BSD, MIT, and ICU.
> > The list of current Daffodil
> > dependencies and their licenses is documented here:
> >
> > https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Dependencies+and+Licenses
> >
> > == Cryptography ==
> >
> > None
> >
> > == Required Resources ==
> >
> > === Mailing Lists ===
> >
> > * comm...@daffodil.incubator.apache.org
> > * d...@daffodil.incubator.apache.org
> > * priv...@daffodil.incubator.apache.org
> > * u...@daffodil.incubator.apache.org
> >
> > === Source Control ===
> >
> > git://git.apache.org/incubator-daffodil.git
> >
> > === Issue Tracking ===
> >
> > JIRA Daffodil (DFDL)
> >
> > === Initial Committers ===
> >
> > * Beth Finnegan <efinnegan at tresys dot com>
> > * Dave Thompson <dthompson at tresys dot com>
> > * Josh Adams <jadams at tresys dot com>
> > * Mike Beckerle <mbeckerle at tresys dot com>
> > * Steve Lawrence <slawrence at tresys dot com>
> > * Taylor Wise <twise at tresys dot com>
> >
> > === Affiliations ===
> >
> > * Beth Finnegan (Tresys Technology)
> > * Dave Thompson (Tresys Technology)
> > * Josh Adams (Tresys Technology)
> > * Mike Beckerle (Tresys Technology)
> > * Steve Lawrence (Tresys Technology)
> > * Taylor Wise (Tresys Technology)
> >
> > == Sponsors ==
> >
> > === Champion ===
> >
> > * John D. Ament
> >
> > === Nominated Mentors ===
> >
> > * Dave Fisher
> > * John D. Ament
> > *
> >
> > === Sponsoring Entity ===
> >
> > We request the Apache Incubator to sponsor this project.

--
Pierre Smits

ORRTIZ.COM <http://www.orrtiz.com>
OFBiz based solutions & services

OFBiz Extensions Marketplace
http://oem.ofbizci.net/oci-2/