Hi Joerg and Paul, thank you for your prompt answers, and thanks to everybody for their contributions.
I would like to focus my questions on R binary objects that represent data that was not entirely computer-generated (that is, for which the source code cannot be summarised by a mathematical formula and simple starting values). Note also that many other programs, LibreOffice for instance, can store unformatted textual data as a binary object. Therefore "binary object" does not mean that the content is impractical to retrieve.

My first question is: to what extent do we need to verify that the object can be regenerated?

 - The starting point is a source package with an R binary object.

 - With this starting point only, it may be impossible to know whether it has a source or not. Did the upstream developer type the results by hand in an R session, for instance when collecting data from a table in a printed report? Did he collect his data in a file that is not provided in the source package? Or does he need a combination of data and scripts to regenerate the binary object? Unless the answer can be found on the Internet, one has to ask the author directly.

 - If we have to ask, how long do we need to wait for the answer, and what is the conclusion if there is no answer?

My second question is: to what extent do we need the source?

 - When the R binary object is a table that has been generated by hand, my understanding is that it does not matter which format Upstream prefers, since it is trivial for anybody to export the R object into his favorite format for modification.

 - When the data in the R binary object has been produced by processing another data file, how far back do we need to go? This is an important question, because at the end of the chain of rebuildability there can be gigabytes of data.

 - When the source of the binary object is not strictly necessary for making relevant modifications, can we distribute the package in Debian?
My last question is: given the answers to the previous questions, what do we do with the R packages that are already in the archive and also contain data that is editable as is but does have an original source? Who will do it, and what is the timeline in case of inaction?

Also, since the case of pictures has been discussed, here is a parallel between R objects and PNG files.

 1) In the PNG file's metadata, there is a field that can indicate, for instance, that it was made by Inkscape. However, from the presence of that field one cannot conclude whether the SVG source still exists, or whether it exists on the computer of a contributor but the upstream developers decided to discard it.

 2) If a program displays an image in PNG format and does not use its SVG source, one can regret that the source is not available, but this does not prevent anybody from editing the PNG, or even replacing it entirely.

 3) One could consider scanning the Debian archive for PNG files made with Inkscape that have no corresponding SVG file in the source package. Would such packages be non-free? If yes, how long would you wait before removing the package?

While writing this answer, I also read Don's email advocating for Debian to take the lead and change the current practice in the R community, which prefers to distribute data as R binary objects in the source packages. This is laudable, but I expect that it will take time, and it needs people who have roots in both communities. In the current situation, which I describe as "active bitrotting", we do not apply the same rules to the packages that enter the archive and to the packages that are already in it, which causes the packages under active development to become obsolete each time new dependencies cannot enter Debian. Given the rotten tomatoes flying at my face because I can no longer update the r-cran-ggplot2 package, I do not feel fit for the task of negotiating with the R community to change its traditions.
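To make point 1) of the PNG parallel concrete, here is a minimal Python sketch of how the metadata of a PNG file could be inspected. It is only an illustration, not a tool that anybody has run on the archive. The function names are hypothetical, and the assumption that Inkscape identifies itself through a `Software` text field containing the word "inkscape" would need to be checked against real files.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_text_fields(data):
    """Return the keyword/value pairs of all tEXt chunks in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    fields = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # A tEXt body is "keyword NUL text", both encoded in Latin-1.
            keyword, _, value = body.partition(b"\x00")
            fields[keyword.decode("latin-1")] = value.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length
    return fields


def claims_inkscape(data):
    """Assumption: Inkscape names itself in the 'Software' field."""
    return "inkscape" in png_text_fields(data).get("Software", "").lower()


def make_chunk(ctype, body):
    """Build one PNG chunk; used here only to create a test file in memory."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# A synthetic, pixel-less "PNG" carrying only the metadata of interest.
sample = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Software\x00www.inkscape.org")
    + make_chunk(b"IEND", b"")
)
print(png_text_fields(sample))
print(claims_inkscape(sample))
```

Even when such a check answers "yes", it only says which program wrote the file. It says nothing about whether the SVG still exists anywhere, which is exactly the difficulty described in point 1).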
In any case, I think that we need clear guidelines that help to foresee whether an R package is acceptable in Debian or not, so that we can better decide whether to undertake the work at all. Currently, my take would be to move packages to non-free. This would also allow us to ship the PDF documentation that we currently delete.

Cheers,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan