Re: [Rdkit-discuss] chemfp preprint
On Mon, Mar 25, 2019 at 11:36 PM Geoffrey Hutchison <geoff.hutchi...@gmail.com> wrote:

> > Sometimes, I wish there was a rdkit consortium/NPO (so that donations
> > are tax deductible), so that rdkit could be massively funded by all its
> > commercial users, and even accepting individual donations.
>
> I don't want to hijack the thread, so please feel free to take this
> off-list with anyone interested.

It's been pretty thoroughly hijacked. :-)
It's an important topic, so I'm going to start a new thread for this.

> I think it's an interesting idea in general in open chemistry. We have set
> up an Open Chemistry collective - this receives $$ from Google Summer of
> Code. The "host" is the Open Source Collective, a 501c6 non-profit in the
> United States
> (https://docs.opencollective.com/help/hosts/open-source-collective)
>
> The collective isn't perfect, it skims 5% for transaction fees and
> overhead, but it's:
> - completely transparent for donations
> - completely transparent for expenses
> - allows both one-time and recurring donations
>
> Greg can correct me - I think we handled the $$ to RDKit from Google
> Summer of Code 2018 before we set this up, but it's certainly there to use.
> You can create your own RDKit collective pretty easily too:
> https://opencollective.com/open-chemistry

Yeah, the travel grants for the two RDKit GSoC students were handled via a different mechanism.

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] chemfp preprint
On Mar 25, 2019, at 04:05, Francois Berenger wrote:

> Sometimes, I wish there was a rdkit consortium/NPO (so that donations are tax
> deductible), so that rdkit could be massively funded by all its commercial
> users, and even accepting individual donations.

Setting up such an organization is not difficult. It does take time, money, and effort, which add overhead to the funding process. It also requires people who are willing to do that sort of work. I was on the board of the Open Bioinformatics Foundation, and involved with the Python Software Foundation. I know that I am *not* that sort of person.

In any case, as Geoff Hutchison points out, there are umbrella organizations like the Open Source Collective which can handle most of that overhead.

My question is, why would users - commercial or otherwise - be willing to fund such work in the first place?

As far as I can tell, nearly everyone uses free and open source software because it is available at no cost. Users are rarely willing to pay for software freedom, or for economic benefits like avoiding vendor lock-in.

And it seems like commercial users are often more willing to use an internal fork of a project than to work with upstream to develop new features. This might be because it's easier to work with existing staff or existing contracting arrangements than to figure out how to get upstream to do the work, set up a new contractual relationship, and take the risk that it isn't done on schedule.

My conjecture is that there are several issues at play.

1) Most end users don't realize there is a funding problem for many FOSS projects. Package managers like pip/PyPI, Conda, Homebrew and apt make it *really easy* to install a large number of packages without knowing anything about the funding or staffing status of each underlying project.
Consider that one of the early business models for PyMol was the idea that people would be willing to pay for pre-compiled packages from the main developer, even though the source code was available for free as open source. That business model somewhat worked then. It would not work now.

2) The proponents of "open source" in the late 1990s emphasized the volunteer nature of open source, going so far as to argue that there was a "gift culture" (using E. Raymond's term). The implication is that there was a sort of social contract, where donations of source code would be met with other sorts of payment, including job/consulting offers and non-trivial amounts of reciprocal code contributions.

This has not turned out to be true, with rare exceptions. Instead, I think the association with volunteerism and gifts has caused people to avoid talking about fund raising. This should seem particularly odd, as many volunteer organizations outside of computing have funding drives.

3) FOSS developers who distribute at no cost are ignoring any capital value in the software. They can only earn income from gifts (which are rare) or through labor (e.g., consulting). This places them at a funding disadvantage compared to proprietary software vendors, who can amortize labor costs across multiple sales.

To be clear, I am only talking about self-funded FOSS projects. My paper mentions a few other funding models, like research grants at universities, or in-house projects funded by the ability to reduce costs. In the latter case, the minor additional costs for releasing the project as FOSS can be justified by even small benefits.

4) The pricing of per-unit sales of FOSS software, either institutional sales like what I tried with chemfp, or end-user sales like PyMol, should factor in the likelihood that customers will redistribute the software further, and by doing so reduce the market size.
This factor is hard to estimate, and is in general higher for universities than for pharmaceutical companies, which makes it harder to give universities the kind of significant discount that proprietary vendors can offer.

5) In my paper I bring up the "free rider problem" as a way to think of the issues. To be clear, this is only a *problem* if people expect anything back from releasing and/or maintaining an open source software project. (Or don't expect people to insist on support, like I have received for the no-cost/open source version of chemfp.)

Suppose I want to add a new feature to mmpdb, the matched molecular pair program which I helped develop and which has been contributed to the RDKit project. I might go around to various users and ask for development funding as a consortium. 20 organizations might be interested, and each one willing to pay 50% of the development cost, which means in principle I could get 10x the cost of labor, which provides the extra profit that could go towards documentation, testing, and general support.

However, it's also easy for each of those 20 organizations to think that someone else ("Let George do it", as the Stanford Encyclopedia of Philosophy article explains) is going t
Re: [Rdkit-discuss] chemfp preprint
> Sometimes, I wish there was a rdkit consortium/NPO (so that donations are tax
> deductible), so that rdkit could be massively funded by all its commercial
> users, and even accepting individual donations.

I don't want to hijack the thread, so please feel free to take this off-list with anyone interested.

I think it's an interesting idea in general in open chemistry. We have set up an Open Chemistry collective - this receives $$ from Google Summer of Code. The "host" is the Open Source Collective, a 501c6 non-profit in the United States (https://docs.opencollective.com/help/hosts/open-source-collective)

The collective isn't perfect, it skims 5% for transaction fees and overhead, but it's:
- completely transparent for donations
- completely transparent for expenses
- allows both one-time and recurring donations

Greg can correct me - I think we handled the $$ to RDKit from Google Summer of Code 2018 before we set this up, but it's certainly there to use. You can create your own RDKit collective pretty easily too:
https://opencollective.com/open-chemistry

One big benefit is that OpenCollective handles all the legal paperwork and accounting.

-Geoff

PS One regret is that I haven't had need of chemfp in house, or I would have pushed some $$ towards Andrew.
Re: [Rdkit-discuss] chemfp preprint
On 23/03/2019 04:39, Andrew Dalke wrote:

> Hi RDKit users,
>
> This week I submitted a paper about chemfp for publication. I also submitted
> a preprint on ChemRxiv, which was just accepted.
>
> For those interested, it's at
> https://chemrxiv.org/articles/The_Chemfp_Project/7877846 .
>
> It's a rather long paper as it covers many aspects about the chemfp project,
> including the FPS and FPB formats, search algorithms, details about the
> different ways to compute a popcount, and memory bandwidth and latency
> bottlenecks. On a non-technical level I also describe some of the
> difficulties I ran into trying to run chemfp as "commercial free software."

The part about funding free software is quite interesting (I just skimmed through this part of the paper, sorry).

Sometimes, I wish there was a rdkit consortium/NPO (so that donations are tax deductible), so that rdkit could be massively funded by all its commercial users, and even accepting individual donations.

When you think about Linux, several developers are paid, either by the Linux Foundation (I think) or by large companies using Linux, to work on the Linux kernel full-time. I guess it gives them a lot of manpower to push their open-source project forward and maintain it in the long run.

> Let me know of any corrections or improvements, or any other feedback you
> might have.
>
> Cheers,
>
>    Andrew
>    da...@dalkescientific.com
Re: [Rdkit-discuss] chemfp preprint
Yes, we all love ref 57.

-
| Markus Sitzmann
| markus.sitzm...@gmail.com

> On 22. Mar 2019, at 20:39, Andrew Dalke wrote:
>
> Hi RDKit users,
>
> This week I submitted a paper about chemfp for publication. I also submitted
> a preprint on ChemRxiv, which was just accepted.
>
> For those interested, it's at
> https://chemrxiv.org/articles/The_Chemfp_Project/7877846 .
>
> It's a rather long paper as it covers many aspects about the chemfp project,
> including the FPS and FPB formats, search algorithms, details about the
> different ways to compute a popcount, and memory bandwidth and latency
> bottlenecks. On a non-technical level I also describe some of the
> difficulties I ran into trying to run chemfp as "commercial free software."
>
> Let me know of any corrections or improvements, or any other feedback you
> might have.
>
> Cheers,
>
>    Andrew
>    da...@dalkescientific.com
[Rdkit-discuss] chemfp preprint
Hi RDKit users,

This week I submitted a paper about chemfp for publication. I also submitted a preprint on ChemRxiv, which was just accepted.

For those interested, it's at https://chemrxiv.org/articles/The_Chemfp_Project/7877846 .

It's a rather long paper as it covers many aspects about the chemfp project, including the FPS and FPB formats, search algorithms, details about the different ways to compute a popcount, and memory bandwidth and latency bottlenecks. On a non-technical level I also describe some of the difficulties I ran into trying to run chemfp as "commercial free software."

Let me know of any corrections or improvements, or any other feedback you might have.

Cheers,

   Andrew
   da...@dalkescientific.com