Hi -

To me you have two parts. One fits Apache and the other would need to be outside.

(1) Open Source Software, which is the library, service and CLI tools. This is something that an Apache community could grow around and that could be governed in the Apache Way. This part can be incubated.

(2) Open Data. Justin refers to Kibble and Pony Mail, which are incubating projects mostly around consuming Apache community data. I would point out that the data portion of your community could be hosted elsewhere, by some community members or by others outside of Apache PMCs.

Here is a real example. The Apache Tika, PDFBox and POI PMCs all share a set of regression test documents (https://openpreservation.org/blog/2016/10/04/apache-tikas-regression-corpus-tika-1302/), with tooling from community member Dominik Stadler (https://github.com/centic9/CommonCrawlDocumentDownload). The documents are retrieved from Common Crawl (http://commoncrawl.org), which uses the AWS Public Dataset Program (https://aws.amazon.com/opendata/public-datasets/).

Regards,
Dave

> On Jul 2, 2019, at 1:59 PM, Alejandro Caceres <acace...@hyperiongray.com> wrote:
>
> Hi Matt,
>
> Thanks for the response. You are sort of correct, I would say the end goal
> is a service - an open source engine that is able to grab and ingest this
> highly unstructured security information and turn it into something useful
> - then provide that back to the user in a few different forms. One would be
> a web services API for general use exposed to the Internet (a service, like
> you said), and another would be a series of command line tools and
> libraries that others can use to ingest this information easily. the third
> goal would be: not only is the code open source, but all data used in the
> application is available itself, so this could easily be used to run a
> personal node of this information for an organization, scylla.sh is simply
> my instance that I expose to the Internet at large for those that don't
> want to run a "full node". If that is more palatable to the ASF I'm glad to
> make that the focus.
> In other words: I'm not married to any model here.
>
> I knew coming in that it's a bit unconventional for Apache, but, I think,
> it is a unique and powerful project that would increase engagement from the
> infosec community in which I personally, as well as my R&D company have
> some good visibility from. In other words, just testing the waters to see
> how this is received by ASF :).
>
> Alex
>
>
> On Tue, Jul 2, 2019 at 3:44 PM Matt Sicker <boa...@gmail.com> wrote:
>
>> I'm a little unclear about the scope of the project here. This project
>> looks more like a service, and I don't know of any ASF projects that
>> exist to provide services outside the ASF.
>>
>> On Tue, 2 Jul 2019 at 14:28, Alejandro Caceres
>> <acace...@hyperiongray.com> wrote:
>>>
>>> Hey Folks,
>>>
>>> I'm interested in submitting a project as a seedling and am looking exactly
>>> where to start. The project is already off the ground, being used by many,
>>> is stable, reasonably mature (it's in alpha release), open source, and
>>> already Apache licensed. I've been looking at a lot of resources to how
>>> best to submit this to Apache and from what I understand I need to:
>>>
>>> Find a "champion/mentor" for the project and a "sponsor" -> submit an
>>> incubator application -> wait (or do i submit for a vote on general@?) ->
>>> ... -> profit :)
>>>
>>> For a bit more context, my project is http://scylla.sh or
>>> https://github.com/acaceres2176/scylla. This project aggregates and makes
>>> searchable database leaks and other information security data that is easy
>>> for attackers to find (they have blackhat and underground resources) but
>>> difficult for security professionals trying to defend their network (they
>>> cannot buy stolen data, are not plugged into the blackhat hacker community,
>>> and frankly generally don't know "where to start").
>>> The Scylla engine aims to even the playing field by making this data
>>> available and completely free for everyone. The feed is meant to power
>>> threat intelligence engines to aid in the defense of both large corporate
>>> networks, but also be accessible to an average user who wants to check what
>>> information of theirs has been leaked. It's a passion project of mine and
>>> have been working on it for several months already. We have several
>>> terabytes of data and good attention from the infosec community.
>>>
>>> Anyway, sorry for the brain dump above, but I suppose I should mainly ask -
>>> where do I go from here? Do I simply ask this mailing list if there is a
>>> sponsor and champion willing to bring this in as a podling?
>>>
>>> Thanks!
>>> Alex
>>>
>>> --
>>> ___
>>>
>>> Alejandro Caceres
>>> Hyperion Gray, LLC
>>> Owner/CTO
>>
>> --
>> Matt Sicker <boa...@gmail.com>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>
> --
> ___
>
> Alejandro Caceres
> Hyperion Gray, LLC
> Owner/CTO

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
For additional commands, e-mail: dev-h...@community.apache.org