Hi, and thanks Steffen and Christian for your kind words.

I believe I am seeing hash collisions in the cpid when installing the client on Debian and Mint. I also believe the Mint package is the unchanged Debian one, inherited via Ubuntu.

The symptom I am seeing is that when a new computer is added to my little farm, the PrimeGrid (PG) server sometimes takes it to be an existing host. This is bad for two reasons, and irritating for a third.

1. I cannot rely on setting the default location for new computers, because the new machine will come up in whatever location the doppelganger had. This means that it may download and start crunching work that, for example, will run for longer than that host has really got.

2. If the doppelganger had work in progress, then that work is marked as abandoned. That means that a new task is sent to someone else, wasting the collective time of the project.

(PG have installed two workarounds that ensure that if I go on crunching I do not lose credit. If a task completes and is shown as abandoned at the time of completion, it is sent for validation as if it were not abandoned. If a task trickles up, it reverts to being in progress or overdue, and when it subsequently reports it goes for validation. Provided either of these happens before the WU is deleted from the server, the user gets credit. Neither feature is standard on other projects, or so I understand.)

With credit assured, provided I finish the work, I face a moral dilemma when the allegedly abandoned work is 10% into a 20 day task. If I abort it I lose credit, but if I continue it I am getting the last 80% of the credit for work I know is now being done by TWO other machines, which is a waste of the project's resources.

3. (A lesser irritation.) When I am testing out different settings (running with and without hyperthreading, say), the mixing up of historic hosts makes it harder for me to track which host was doing what when.

I have seen this happen among three laptops, running Linux Mint MATE 17.1, Cinnamon 18, and Cinnamon 18.1. Two of these laptops have the same CPU model, i7-6500U, but the third has a model number that looks rather different, m5 6y54.
The CPUs are similar in that they are all at the expensive end of the mobile processor range. Each time this has happened with these laptops, the respective OS was installed from a live CD/USB, and boinc was installed with synaptic, searching for the boinc meta package.

The first time it happened, in March 2016, I was told that I had provoked the problem by using the same USB ethernet dongle, so that the MAC address was the same. So I went out and bought another couple of dongles, and labelled them for the respective machines. I honestly believe I have not swapped them around inadvertently.

This week (Jan 2017) the same happened again, involving one of the original two laptops and one that had not been involved before. Different CPU, different USB dongle, even different kernel versions, as I had not yet updated the older machine's kernel at that time. Different manufacturer, so different hardware on the motherboard, etc etc.

The oddest feature is that after updating from both laptops a number of times, all of a sudden the server was showing them as separate machines. It had correctly assigned all 8 tasks issued to the new machine to that machine, and correctly assigned all the historic tasks and stats to the old machine. So I am wondering how it did that. Perhaps it is not the cpid at all; perhaps it is the server software being too clever?

This effect also leaves oddities on the server, like this one from my first experience of the issue:

http://www.primegrid.com/show_host_detail.php?hostid=512618

As you can see, the computer has different creation and last contact times, so you might think it had contacted the server at least twice. But by the server's own count, it has done so zero times. Maybe you can see how that makes sense (apart from it being a tunnelling effect of your quantum computing module ;)

I am now told on the PG forum that "Linux sometimes fails to pick up the MAC address".
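
One way to test the "fails to pick up the MAC address" claim on a given machine is to list what the kernel itself reports for each interface. The following is a minimal sketch, assuming a Linux host with sysfs mounted at /sys. The function name `list_macs` is made up for illustration, and this says nothing about how the BOINC client itself looks up the address.

```python
from pathlib import Path


def list_macs(sys_net="/sys/class/net"):
    """Return {interface: MAC} as reported by the kernel via sysfs.

    Skips the loopback interface and any interface whose address is
    missing, unreadable, or all zeros.
    """
    macs = {}
    base = Path(sys_net)
    if not base.is_dir():
        return macs
    for iface in sorted(base.iterdir()):
        if iface.name == "lo":
            continue
        try:
            mac = (iface / "address").read_text().strip().lower()
        except OSError:
            continue  # no address file here, or it is not readable
        if mac and mac != "00:00:00:00:00:00":
            macs[iface.name] = mac
    return macs


if __name__ == "__main__":
    found = list_macs()
    if not found:
        print("no usable MAC address found")
    for name, mac in found.items():
        print(f"{name}\t{mac}")
```

Running this on each laptop with its own dongle plugged in, and comparing the output, would at least show whether the machines present distinct addresses at the OS level.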

ALSO, I have seen this among my collection of 11 desktop machines, 2 of which are identical apart from MAC address; 1 is an NFS server, and 8 are diskless, loading their OS from the server using PXE and root=/dev/nfs. The server runs Linux Mint 18.1. The other desktop machines run a minimal Debian command line OS: netinstall plus ssh plus boinc-client. These are cloned, but the boinc directories are re-initialised each time to contain only the four config files in /etc/boinc-client and softlinks to them from /var/lib/boinc, plus a minimal account_www.primegrid.xml that provides my weak auth code. In particular, there is no contamination of the <host_cpid> value, as the file that holds that value is not cloned.

Running the diskless machines one at a time works fine, but it does seem random whether a machine picks up the history of its own hardware, or of one of the other machines. I am not sure about this yet; I am still collecting data.

In any case, my preference would be to start each freshly booted machine as a new machine on the PG server, allowing me to merge them manually. (I have been around long enough to remember when that was the norm after a re-install, and personally I preferred that.) If I do not turn a machine on for months on end, then when it is powered up I want it to be at the default location, not wherever that physical hardware was located last time it was used. I do understand that this old behaviour changed because of specific requests from users who had their own reasons for wanting hardware continuity.

I am fairly sure there is at least one bug here, possibly a different bug in each of the two scenarios. I find it interesting that so far, in ten months, there has not yet been a case of confusion between a laptop and a desktop; at some point the hardware becomes sufficiently different to avoid ambiguity.

So, FIRST, I believe there is a bug that means that the cpid is sometimes independent of the MAC address.

SECOND, I am requesting a user-selectable option that allows a user, at install time, to choose to switch hardware continuity on or off.

I believe the second could be achieved by asking a question in the post-install trigger. If the user wanted hardware continuity off, the script would create a cpid based on a freshly generated uuid. There could even be a three-way option: based on hardware, based on hostname, or a fresh-every-install cpid (the last not being a hash of anything on the system, but a randomly generated uuid with the punctuation stripped out). This option would not be offered where the post-install trigger found a pre-existing state file with a pre-existing cpid.

As a workaround, that same option would solve the issues created if there is, in fact, a bug in the client's cpid generation routine.

Unless you guys can suggest a good reason why not, I intend to make this change to the .deb on my own system, and see what happens. I have spent too much time clearing up the messes that false allegations of "abandonment" make, by which I mean when tasks on a different set of hardware get marked as abandoned. If you know of a reason why this idea is unwise, as a home project, please let me know in the next few days. I am also offering this to you as something you may (or may not) want to roll out more generally.

I also do not know whether to make a bug report. This effect comes and goes: I can think I have an effective way to avoid it (as with buying new USB dongles), and then that can fall in a heap. I cannot (yet) produce a definitive recipe that reliably demonstrates this effect, so up to now I have not filed a bug report. Do you think I should?

I would value your thoughts on any of the above. And if it does turn out to be provoked by me in a way I have not yet thought of, I would be glad to know that, too.

Is there anything else you need to know from me at this time?
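
The three-way option described above could be sketched roughly as follows. This is only an illustration of the idea, not the client's own cpid routine and not a tested post-install script. The function names, the mode names and the state file path are assumptions, and the MD5-of-a-string step merely stands in for whatever hash the client really uses.

```python
import hashlib
import re
import socket
import uuid
from pathlib import Path

# Assumed location of the client state file; adjust for the local install.
STATE_FILE = Path("/var/lib/boinc/client_state.xml")


def existing_cpid(state_file=STATE_FILE):
    """Return the <host_cpid> already in the state file, or None."""
    try:
        text = Path(state_file).read_text(errors="replace")
    except OSError:
        return None
    match = re.search(r"<host_cpid>\s*([0-9a-fA-F]+)\s*</host_cpid>", text)
    return match.group(1) if match else None


def make_cpid(mode, mac=None, hostname=None):
    """Build a 32-hex-digit identifier for the chosen continuity mode.

    mode is one of "hardware", "hostname" or "fresh" (hypothetical names).
    """
    if mode == "hardware":
        if not mac:
            raise ValueError("hardware mode needs a MAC address")
        # Stand-in hash: same MAC gives the same id on every install.
        return hashlib.md5(mac.lower().encode()).hexdigest()
    if mode == "hostname":
        name = hostname or socket.gethostname()
        return hashlib.md5(name.encode()).hexdigest()
    if mode == "fresh":
        # Random UUID with the hyphens stripped; not a hash of anything.
        return uuid.uuid4().hex
    raise ValueError(f"unknown mode: {mode!r}")


def choose_cpid(mode, state_file=STATE_FILE, **kwargs):
    """Keep a pre-existing cpid if there is one, else generate one."""
    return existing_cpid(state_file) or make_cpid(mode, **kwargs)
```

For example, `choose_cpid("fresh")` on a freshly re-initialised machine would yield a new identifier on every install, while a machine that already has a state file would keep the value it has.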

River~~

On 22 January 2017 at 15:17, Steffen Möller <steffen_moel...@gmx.de> wrote:

> Gianfranco is the more active one on the boinc Debian+Ubuntu packages,
> but, anyway, I do not think the other readers on this list mind you
> telling us about your concerns right here, in particular since this may
> also be relevant for packages of other distributions. So, go ahead.
>
> Steffen
>
>
> On 22/01/2017 14:45, Christian Beer wrote:
> > Hi,
> >
> > if this is a packaging related problem than it's better to directly
> > contact the package maintainer but the Debian maintainer is also reading
> > this email list so you may try it out here before opening a Debian bug
> > report.
> >
> > Regards
> > Christian
> >
> > On 21.01.2017 18:44, trueriver wrote:
> >> hi everyone,
> >>
> >> before I launch into a description and some questions, may I check this is
> >> the right place to ask about problems that seem to occur with running Boinc
> >> on multiple Linux machines?
> >>
> >> I am wondering, in particular, if the install triggers in the .deb can be
> >> improved to avoid a particular issue. I may be offering to assist with
> >> that, depending what the issue turns out to be.
> >>
> >> So, is this the right place to ask, and if not can you kindly signpost me
> >> to the right place please?
> >>
> >> regards,
> >> River~~
> >> _______________________________________________
> >> boinc_dev mailing list
> >> boinc_dev@ssl.berkeley.edu
> >> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> >> To unsubscribe, visit the above URL and
> >> (near bottom of page) enter your email address.