Hi,

I first came into contact with xCAT through our HPC system installed in 2015, with xCAT version … hm …

# nodels --version
Version 2.9.1 (git commit 7f6043fffd62d482931b17b60f9488eb5754fdc1, built Thu Mar 19 03:25:35 EDT 2015)

2.9.1 seems to be it. The base system is CentOS 7.x. Since the system was an en bloc purchase, we never updated xCAT; I just adapted it to our needs and then let it do its thing over the years. I made some small changes, like fixing up /etc/hostname in the initrd (not sure if that was a specific mixup in our setup with long and short hostnames) and recently the fix for CVE-2023-27486 (being rather annoyed that /root/.ssh/id_rsa would _ever_ be delivered out to cluster nodes; it should always have been a separate directory where I consciously copied a key or had it generated). But nothing to rock the boat. CentOS upgrades up to 7.9 didn't hurt things. We did stick to a certain vanilla kernel build with our patches, though.

The system will be out of production in the near future and we do not know what the next installation will be using. I intended to share a main point of my local hacking, but somehow never got around to it, and I somehow figured that the obvious stuff would appear upstream anyway. Example: I enabled squashfs+overlayfs for us with a few lines and I gather that is a standard thing now.

Obvious to me back in 2015 was that the distribution of stateless filesystem images was slowed down unnecessarily by them being served via HTTP from the admin node over the 1GbE interface. Booting a cluster of 400 nodes took ages because of that (well, a quarter to half an hour or so). Is this still the current mechanism?

While you could make the admin node part of the high-speed network (InfiniBand in our case), or just use 10GbE as the baseline today, it just feels right to me to scale out the distribution capacity with the number of compute nodes.

Is anyone interested in that? Should I propose a formal change to xCAT for that feature? Did I miss an equivalent option that exists now in current xCAT?
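A rough back-of-the-envelope check of the boot time mentioned above, as a sketch under stated assumptions: an image of about 400 MiB per node (the size given in the ChangeLog of the second attached patch), 400 nodes, and roughly 112 MiB/s of usable payload on a single 1GbE link (an assumed figure, not a measurement). The function name `serve_seconds` is made up for this illustration.

```shell
#!/bin/sh
# Estimate how long one server needs to push the same image to every node
# over a single link. All arguments are integers; the result is in seconds.
#   $1 = number of nodes
#   $2 = image size in MiB
#   $3 = usable link throughput in MiB/s (assumed, not measured)
serve_seconds() {
  echo $(( $1 * $2 / $3 ))
}

# 400 nodes, ~400 MiB each, 1GbE at an assumed ~112 MiB/s usable
secs=$(serve_seconds 400 400 112)
echo "1GbE: about $secs s, or $(( secs / 60 )) min"

# Same load over an assumed ~1120 MiB/s (10GbE) link
secs10=$(serve_seconds 400 400 1120)
echo "10GbE: about $secs10 s"
```

With these numbers the single 1GbE link needs a bit under 24 minutes for the image transfer alone, which is in the quarter-to-half-hour range; a faster link shrinks the figure, but it still grows linearly with the node count.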
I only found some consulting company boasting about having implemented torrents with xCAT for a customer, but nothing official. I'll describe what I did anyway. Six steps follow.

1. I got hold of a minimal torrent program: ctorrent from https://sourceforge.net/p/dtorrent/

2. I wrote the first of the two attached patches to support the cluster use case with /dev/loop0 for reading the rootimg (see also https://sourceforge.net/p/dtorrent/patches/5/ ); the second patch then followed to fix a memory issue (see also https://sourceforge.net/p/dtorrent/patches/7/ ).

3. I applied a rather small change to the xcatroot dracut script to download the image via ctorrent in the initrd and prepare seeding later.

---------------8<---------------------
Index: share/xcat/netboot/rh/dracut_033/xcatroot
===================================================================
--- share/xcat/netboot/rh/dracut_033/xcatroot (Revision 833)
+++ share/xcat/netboot/rh/dracut_033/xcatroot (Revision 834)
@@ -21,6 +21,12 @@
   /tmp/updateflag $MASTER $XCATIPORT "installstatus netbooting"
 fi
+if [ -e /rootimg.torrent ]; then
+
+  ctorrent -s /rootimg.sfs -e 0 /rootimg.torrent
+
+else
+
 if [ ! -z "$imgurl" ]; then
   if [ xhttp = x${imgurl%%:*} ]; then
     NFS=0
@@ -43,6 +49,9 @@
     ROOTDIR=/${ROOTDIR#*/}
   fi
 fi
+
+fi # torrent
+
 #echo 0 > /proc/sys/vm/zone_reclaim_mode #Avoid kernel bug
 if [ -r /rootimg.sfs ]; then
@@ -61,6 +70,15 @@
   mkdir -p $NEWROOT/rw
   mount --move /ro $NEWROOT/ro
   mount --move /rw $NEWROOT/rw
+  if [ -e /rootimg.torrent ]; then
+    # Prepare for seeding the rootimg.
+    # Note that this demands the patched dnh3.2.2thor1 ctorrent binary.
+    mkdir $NEWROOT/.sysdist
+    cp /usr/bin/ctorrent /rootimg.torrent $NEWROOT/.sysdist
+    rrz_distfile=$(ctorrent -x /rootimg.torrent | grep rootimg.sfs | cut -f 2 -d ' ')
+    mkdir -p $NEWROOT/.sysdist/$(dirname $rrz_distfile)
+    ln -s /dev/loop0 $NEWROOT/.sysdist/$rrz_distfile
+  fi
 elif [ -r /rootimg.gz ]; then
   echo Setting up RAM-root tmpfs.
   if [ -z $rootlimit ];then
--------------->8---------------------

4. Include the torrent stuff in the image generation script.

---------------8<--------------------
#!/bin/sh

scriptdir=$(cd $(dirname $0) && pwd)
PATH=$scriptdir:$PATH

sysbase=centos79
osimage=$sysbase-x86_64-stateless-gpu
imgdir=/install/netboot/$sysbase/x86_64/gpu
xcatinitrd=$imgdir/initrd-stateless.gz

# normal image generation
# packimage, etc.

# stop main seeding service on admin node
service rrz-dist-mainseed stop
# Create torrent file for efficient distribution.
torrfile=gpu-$sysbase-$timecode-rootimg.torrent
cd /install/dist
ctorrent -t \
  -s $torrfile \
  -u http://$admin_ip:81/announce \
  os/gpu-$sysbase-$timecode/rootimg.sfs
# start seeding again, picking up added .torrent
service rrz-dist-mainseed start

# Disable that in case of weird boot trouble.
# It pulls out lots of drivers/firmware that is not
# obviously needed for booting.
# Initrd loading without torrent is the new bottleneck.
rrz-initrd-reduce $xcatinitrd

# Insert torrent client and torrent file into initrd.
# If that is disabled, standard HTTP download using the
# URL from pxelinux config is done.
initrdir=$(rrz-initrd-unpack $xcatinitrd)
cp -v $scriptdir/ctorrent $initrdir/usr/bin
cp -v /install/dist/$torrfile $initrdir/rootimg.torrent
rrz-initrd-pack $xcatinitrd $initrdir
rrz-initrd-rmdir "$initrdir"

rrz-initrd-ucode $xcatinitrd

# Yes, update the actual copy of the initrd that is used
# during netboot.
cp -v $xcatinitrd $bootinitrd
--------------->8--------------------

5. Added a seed service to syncfiles:

cat /usr/lib/systemd/system/rrz-dist-seed.service
[Unit]
Description=ctorrent node seed for image distribution
After=network.target

[Service]
# It might be that the fresh torrent file is not available right away
# inside /work (?!), so restarting may be needed to really get
# an instance up.
Restart=always
RestartSec=10
WorkingDirectory=/.sysdist
# Not starting as user yet, because root perm needed for preparation
# User=sysdist
ExecStartPre=/bin/chmod 0640 /dev/loop0
ExecStartPre=/bin/chown :sysdist /dev/loop0
ExecStart=/bin/su sysdist -c '/.sysdist/ctorrent -q -m 5 -M 20 -U 102400 -e -1 rootimg.torrent'

[Install]
WantedBy=multi-user.target

6. Before all that … have the main seeder service:

[root@adm1 xcat]# cat /install/rrz/rrz-dist-mainseed.sh
#!/bin/bash
# called as a system service

pids=
for torr in /install/dist/*.torrent
do
  /install/rrz/ctorrent -q -m 5 -M 20 -U 60000 -e -1 "$torr" &
  # $! is the PID of the job just started in the background
  pid=$!
  pids+=" $pid"
  echo "torrent for $torr with PID $pid"
done
trap "kill $pids" EXIT
wait
# end

And as tracker, I built an instance of opentracker (below 90K binary, ctorrent is around 310K, not stripped) from http://erdgeist.org/arts/software/opentracker/ (snapshot: http://src.rrz.uni-hamburg.de/files/src/_unsorted/opentracker-20151001.tar.bz2) with the config file boiling down to three lines:

listen.udp.workers 6
listen.tcp_udp $admin_ip:81
tracker.user nobody

and this simple call as systemd service:

[Service]
ExecStart=/install/rrz/opentracker -f /install/rrz/opentracker.conf

Now this is a long mail, but a rather complete description of the steps I took to make booting of my stateless nodes so fast that I haven't worried about the image distribution part since mid/end of 2015. Now, at the end of the system lifetime, I start to worry a bit about what will come next …

Is there interest in the xCAT community to pick this up? One might have to adopt/fork ctorrent, while opentracker seems to be alive, although the author didn't bother to name a release yet. In the closed loop where we use ctorrent as the only client/server with this tracker server, this might be acceptable. To me, more acceptable than bloating the initrd again with some other torrent software more than a few 100K big. Should this be supported in xCAT upstream?
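The process bookkeeping of the main seeder script in step 6 can be tried out in isolation. The following is a minimal sketch, not the production script: `sleep` stands in for the ctorrent instances, and the function name `start_workers` is hypothetical. It relies on `$!`, which the shell sets to the PID of the most recently started background job, whereas `$?` would only hold an exit status.

```shell
#!/bin/sh
# Start one background worker per argument and collect their PIDs.
# `sleep` is only a stand-in for a long-running seeder process.
pids=
start_workers() {
  for item in "$@"
  do
    sleep 30 &
    pid=$!               # PID of the job just put into the background
    pids="$pids $pid"
    echo "worker for $item with PID $pid"
  done
}

start_workers a.torrent b.torrent

# Stop all workers; in a service script this would sit in an EXIT trap.
# $pids is deliberately unquoted so that each PID becomes its own argument.
kill $pids
wait
echo "all workers stopped"
```

Keeping the PIDs in one space-separated string mirrors the original script and works in plain POSIX sh; a bash array would be an equally valid choice.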
Having 100G networking in the admin node might make this obsolete, but this just means that we could scale to a few thousand nodes more without impacting a single network link. In clusters, when you can distribute a load, you should think twice before _not_ doing it, right?


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
HPC @ Universität Hamburg
diff -ru ./btconfig.cpp ../ctorrent-dnh3.3.2thor1/btconfig.cpp
--- ./btconfig.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/btconfig.cpp 2015-10-02 09:01:16.966502705 +0200
@@ -35,6 +35,7 @@
 unsigned char arg_flg_convert_filenames = 0;
 char *arg_file_to_download = (char *)0;
 unsigned char arg_verbose = 0;
+unsigned char arg_quiet = 0;
 unsigned char arg_allocate = 0;
 unsigned char arg_daemon = 0;
diff -ru ./btconfig.h ../ctorrent-dnh3.3.2thor1/btconfig.h
--- ./btconfig.h 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/btconfig.h 2015-10-02 09:00:40.851835517 +0200
@@ -48,6 +48,7 @@
 extern unsigned char arg_flg_convert_filenames;
 extern char *arg_file_to_download;
 extern unsigned char arg_verbose;
+extern unsigned char arg_quiet;
 extern unsigned char arg_allocate;
 extern unsigned char arg_daemon;
diff -ru ./btfiles.cpp ../ctorrent-dnh3.3.2thor1/btfiles.cpp
--- ./btfiles.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/btfiles.cpp 2015-10-02 07:57:55.800456576 +0200
@@ -641,10 +641,10 @@
         }
       }else{
         if( !check_exist) check_exist = 1;
+        /* ThOr: I want to seed /dev/loop0 */
         if( !(S_IFREG & sb.st_mode) ){
-          CONSOLE.Warning(1, "error, file \"%s\" is not a regular file.", fn);
-          return -1;
-        }
+          CONSOLE.Warning(1, "file \"%s\" is not a regular file, assuming you know more than me about it", fn);
+        } else
         if(sb.st_size != pbt->bf_length){
           CONSOLE.Warning(1,"error, file \"%s\" size doesn't match; must be %llu",
             fn, (unsigned long long)(pbt->bf_length));
diff -ru ./ChangeLog ../ctorrent-dnh3.3.2thor1/ChangeLog
--- ./ChangeLog 2008-06-15 02:09:43.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/ChangeLog 2015-10-02 09:10:02.914657947 +0200
@@ -2,6 +2,19 @@
 Enhanced CTorrent Change Log
 _________________________________________________________________
+ Changes for "dnh3.3.2thor1" hack
+ * Continue and skip size check if existing file is not a regular
+   one. Idea: I want to seed /dev/loop0 with a squashfs image
+   on the other end. This indeed works even without write permissions
+   to the file. Good thing, don't want to wreck the booted system.
+ * Speed up things when "Quitting" by simulating what Ctrl+C does
+   after seeding.
+ * Added -q for quiet(er) operation for running it as a service with
+   systemd, which doesn't like applications daemonizing themselves.
+   Also, I hope I get a reliable service file with that at all.
+   (With -d, I need systemctl daemon-reload all the time to make
+   ctorrent really start from the service file. Very weird.)
+
  Changes for "dnh3.3.2" Release
  * [1954631] Fixed incorrect ordering of dictionary keys in created
diff -ru ./configure ../ctorrent-dnh3.3.2thor1/configure
--- ./configure 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/configure 2015-10-02 08:56:36.178090875 +0200
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2.
+# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2thor1.
 #
 # Report bugs to <http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com>.
 #
@@ -574,8 +574,8 @@
 # Identity of this package.
 PACKAGE_NAME='Enhanced CTorrent'
 PACKAGE_TARNAME='ctorrent'
-PACKAGE_VERSION='dnh3.3.2'
-PACKAGE_STRING='Enhanced CTorrent dnh3.3.2'
+PACKAGE_VERSION='dnh3.3.2thor1'
+PACKAGE_STRING='Enhanced CTorrent dnh3.3.2thor1'
 PACKAGE_BUGREPORT='http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com'
 ac_unique_file="ctorrent.cpp"
@@ -1216,7 +1216,7 @@
 # Omit some internal or obsolete options to make the list less imposing.
 # This message is too long to be a string in the A/UX 3.1 sh.
 cat <<_ACEOF
-\`configure' configures Enhanced CTorrent dnh3.3.2 to adapt to many kinds of systems.
+\`configure' configures Enhanced CTorrent dnh3.3.2thor1 to adapt to many kinds of systems.
 Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1282,7 +1282,7 @@
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-   short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2:";;
+   short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2thor1:";;
   esac
   cat <<\_ACEOF
@@ -1376,7 +1376,7 @@
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-Enhanced CTorrent configure dnh3.3.2
+Enhanced CTorrent configure dnh3.3.2thor1
 generated by GNU Autoconf 2.61
 Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001,
@@ -1390,7 +1390,7 @@
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
-It was created by Enhanced CTorrent $as_me dnh3.3.2, which was
+It was created by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
 generated by GNU Autoconf 2.61. Invocation command line was
   $ $0 $@
@@ -2060,7 +2060,7 @@
 # Define the identity of the package.
  PACKAGE=ctorrent
- VERSION=dnh3.3.2
+ VERSION=dnh3.3.2thor1
 cat >>confdefs.h <<_ACEOF
@@ -9777,7 +9777,7 @@
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by Enhanced CTorrent $as_me dnh3.3.2, which was
+This file was extended by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
 generated by GNU Autoconf 2.61. Invocation command line was
   CONFIG_FILES = $CONFIG_FILES
@@ -9830,7 +9830,7 @@
 _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF
 ac_cs_version="\\
-Enhanced CTorrent config.status dnh3.3.2
+Enhanced CTorrent config.status dnh3.3.2thor1
 configured by $0, generated by GNU Autoconf 2.61,
   with options \\"`echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`\\"
diff -ru ./console.cpp ../ctorrent-dnh3.3.2thor1/console.cpp
--- ./console.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/console.cpp 2015-10-02 09:06:39.493531278 +0200
@@ -350,7 +350,8 @@
 int Console::IntervalCheck(fd_set *rfdp, fd_set *wfdp)
 {
-  Status(0);
+  if(!arg_quiet)
+    Status(0);
   if( m_oldfd >= 0 ){
     FD_CLR(m_oldfd, rfdp);
@@ -445,7 +446,8 @@
       }
       if( '0' != pending ){
         m_streams[O_INPUT]->SetInputMode(K_CHARS);
-        Status(1);
+        if(!arg_quiet)
+          Status(1);
       }
     }else{ // command character received
@@ -574,7 +576,8 @@
       BTCONTENT.CacheConfigure();
       break;
     default:
-      Status(1);
+      if(!arg_quiet)
+        Status(1);
       break;
     }
     if( 10==++count ) inc *= 2;
@@ -590,7 +593,8 @@
       OperatorMenu("");
       break;
     default:
-      Status(1);
+      if(!arg_quiet)
+        Status(1);
       break;
     }
diff -ru ./ctorrent.cpp ../ctorrent-dnh3.3.2thor1/ctorrent.cpp
--- ./ctorrent.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/ctorrent.cpp 2015-10-02 09:20:18.284994066 +0200
@@ -141,7 +141,7 @@
   if( 0==strncmp(argv[1], "-t", 2) )
     opts = "tc:l:ps:u:";
-  else opts = "aA:b:cC:dD:e:E:fi:I:M:m:n:P:p:s:S:Tu:U:vxX:z:hH";
+  else opts = "aA:b:cC:dD:e:E:fi:I:M:m:n:P:p:s:S:Tu:U:vxX:z:qhH";
   while( (c=getopt(argc, argv, opts)) != -1 )
     switch( c ){
@@ -334,6 +334,10 @@
       arg_verbose = 1;
       break;
+    case 'q': // be quiet
+      arg_quiet = 1;
+      break;
+
     case 'd': // daemon mode (fork to background)
       arg_daemon++;
       break;
@@ -387,6 +391,7 @@
     "Decode metainfo (torrent) file only, don't download");
   fprintf(stderr, "%-15s %s\n", "-c", "Check pieces only, don't download");
   fprintf(stderr, "%-15s %s\n", "-v", "Verbose output (for debugging)");
+  fprintf(stderr, "%-15s %s\n", "-q", "Less verbose output (no progress indication)");
   fprintf(stderr,"\nDownload Options:\n");
   fprintf(stderr, "%-15s %s\n", "-e int",
diff -ru ./downloader.cpp ../ctorrent-dnh3.3.2thor1/downloader.cpp
--- ./downloader.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/downloader.cpp 2015-10-02 08:36:58.918936358 +0200
@@ -47,6 +47,14 @@
         if( arg_ctcs ) CTCS.Send_Status();
       }
     }
+    else
+    {
+      /* ThOr: Trying to speed things up. This is what the SIGINT
+         handler does. Without that, I habe at least 14 seconds delay
+         with ctorrent "Quitting". */
+      Tracker.ClearRestart();
+      Tracker.SetStoped();
+    }
     maxfd = -1;
     maxsleep = -1;
@@ -104,7 +112,6 @@
       if( maxsleep <= -100 ) maxsleep = 0;
       else if( maxsleep <= 0 || maxsleep > MAX_SLEEP ) maxsleep = MAX_SLEEP;
     }
-
     timeout.tv_sec = (long)maxsleep;
     timeout.tv_usec = (long)( (maxsleep-(long)maxsleep) * 1000000 );
diff -ru ./version.m4 ../ctorrent-dnh3.3.2thor1/version.m4
--- ./version.m4 2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/version.m4 2015-10-02 08:40:36.275934703 +0200
@@ -1,5 +1,5 @@
 m4_define([m4_PACKAGE_NAME], [Enhanced CTorrent])
 m4_define([m4_PACKAGE_TARNAME], [ctorrent])
-m4_define([m4_PACKAGE_VERSION], [dnh3.3.2])
+m4_define([m4_PACKAGE_VERSION], [dnh3.3.2thor1])
 m4_define([m4_PACKAGE_BUGREPORT], [http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com])
diff -ruN ../ctorrent-dnh3.3.2thor1/btcontent.cpp ./btcontent.cpp
--- ../ctorrent-dnh3.3.2thor1/btcontent.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ./btcontent.cpp 2015-12-03 16:42:39.000000000 +0100
@@ -53,7 +53,7 @@
   (max_uint64_t((ca)->bc_off,(roff)) <= \
    min_uint64_t(((ca)->bc_off + (ca)->bc_len - 1),(roff + rlen - 1)))
-
+size_t btcache_counter = 0;
 btContent BTCONTENT;
 static void Sha1(char *ptr,size_t len,unsigned char *dm)
@@ -627,6 +627,7 @@
       m_cache_used -= p->bc_len;
       delete []p->bc_buf;
+      //CONSOLE.Warning(0, "ThOr: CacheClean: deleting entry %zu: %p", p->id, (void*)p);
       delete p;
     }
   }
@@ -749,7 +750,11 @@
     CacheEval();
   }else m_cache_size = 0;
-  if( m_cache_size < m_cache_used && !m_flush_failed ) CacheClean(0);
+  if( m_cache_size < m_cache_used && !m_flush_failed )
+  {
+    //CONSOLE.Warning(0, "ThOr: btcontent:%i CacheClean", __LINE__);
+    CacheClean(0);
+  }
 }
 int btContent::NeedFlush() const
@@ -849,6 +854,7 @@
       m_cache_used -= p->bc_len;
       delete []p->bc_buf;
+      //CONSOLE.Warning(0, "ThOr: UnCache: deleting entry %zu: %p", p->id, (void*)p);
       delete p;
     }
     m_cache[idx] = (BTCACHE *)0;
@@ -912,6 +918,7 @@
       m_cache_used -= p->bc_len;
       delete []p->bc_buf;
+      //CONSOLE.Warning(0, "ThOr: CachePrep: deleting entry %p", (void*)p);
       delete p;
     }
   }
@@ -988,12 +995,16 @@
   CONSOLE.Debug("Read to %s %d/%d/%d", buf?"buffer":"cache",
     (int)(off / m_piece_length), (int)(off % m_piece_length), (int)len);
-  if( m_cache_size < m_cache_used + len ) CacheClean(len);
+  if( m_cache_size < m_cache_used + len )
+  {
+    //CONSOLE.Warning(0, "ThOr: btcontent:%i CacheClean", __LINE__);
+    CacheClean(len);
+  }
   // Note, there is no failure code from CacheClean(). If nothing can be done
   // to increase the cache size, we allocate what we need anyway.
   if( 0==method && buf && m_btfiles.IO(buf, off, len, method) < 0 ) return -1;
-
+
   pnew = new BTCACHE;
 #ifndef WINDOWS
   if( !pnew )
@@ -1014,6 +1025,8 @@
     delete pnew;
     return -1;
   }
+  pnew->id = ++btcache_counter;
+  //CONSOLE.Warning(0, "ThOr: btcontent: created cache entry %zu: %p", pnew->id, (void*)pnew);
   pnew->bc_off = off;
   pnew->bc_len = len;
   pnew->bc_f_flush = method;
@@ -1366,7 +1379,11 @@
 #ifdef HAVE_WORKING_FORK
   if( cfg_cache_size ){ // maybe free some cache before forking
     CacheEval();
-    if( m_cache_size < m_cache_used && !m_flush_failed ) CacheClean(0);
+    if( m_cache_size < m_cache_used && !m_flush_failed )
+    {
+      //CONSOLE.Warning(0, "ThOr: btcontent:%i CacheClean", __LINE__);
+      CacheClean(0);
+    }
   }
   pid_t r;
   if( (r = fork()) < 0 ){
diff -ruN ../ctorrent-dnh3.3.2thor1/btcontent.h ./btcontent.h
--- ../ctorrent-dnh3.3.2thor1/btcontent.h 2008-06-15 02:00:19.000000000 +0200
+++ ./btcontent.h 2015-12-03 14:33:56.000000000 +0100
@@ -22,6 +22,7 @@
   struct _btcache *bc_prev;
   struct _btcache *age_next;
   struct _btcache *age_prev;
+  size_t id;
 }BTCACHE;
 typedef struct _btflush{
diff -ruN ../ctorrent-dnh3.3.2thor1/btrequest.cpp ./btrequest.cpp
--- ../ctorrent-dnh3.3.2thor1/btrequest.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ./btrequest.cpp 2015-12-03 16:58:42.433868212 +0100
@@ -10,6 +10,7 @@
 #include "compat.h"
 #endif
+size_t btslice_counter = 0;
 static void _empty_slice_list(PSLICE *ps_head)
 {
@@ -234,7 +235,8 @@
 #ifndef WINDOWS
   if( !n ) return -1;
 #endif
-
+  n->id = ++btslice_counter;
+  //CONSOLE.Warning(0, "ThOr: Insert slice %zu: %p", n->id, (void*)n);
   n->index = idx;
   n->offset = off;
   n->length = len;
@@ -267,6 +269,8 @@
   if( !n ) return -1;
 #endif
+  n->id = ++btslice_counter;
+  //CONSOLE.Warning(0, "ThOr: Add slice %zu: %p", n->id, (void*)n);
   n->next = (PSLICE) 0;
   n->index = idx;
   n->offset = off;
@@ -312,6 +316,7 @@
   if( u ) u->next = n->next;
   else rq_head = n->next;
   if( rq_send == n ) rq_send = n->next;
+  //CONSOLE.Warning(0,"ThOr: RequestQueue::Remove: deleting slice %zu: %p", n->id, (void*)n);
   delete n;
   return 0;
@@ -409,6 +414,7 @@
   if(plen) *plen = rq_head->length;
   if( rq_send == rq_head ) rq_send = n;
+  //CONSOLE.Warning(0,"ThOr: RequestQueue::Pop: deleting slice %zu: %p", rq_head->id, (void*)rq_head);
   delete rq_head;
   rq_head = n;
@@ -636,6 +642,7 @@
     //check if off & len match any slice
     //remove the slice if so
     rq.SetHead(pending_array[i]);
+    //CONSOLE.Warning(0, "ThOr: btrequest:%i Remove", __LINE__);
     if( rq.Remove(idx, off, len) == 0 ){
       r = 1;
       pending_array[i] = rq.GetHead();
diff -ruN ../ctorrent-dnh3.3.2thor1/btrequest.h ./btrequest.h
--- ../ctorrent-dnh3.3.2thor1/btrequest.h 2008-06-15 02:00:19.000000000 +0200
+++ ./btrequest.h 2015-12-03 14:54:46.000000000 +0100
@@ -14,6 +14,7 @@
   size_t length;
   time_t reqtime;
   struct _slice *next;
+  size_t id;
 }SLICE,*PSLICE;
 class RequestQueue
diff -ruN ../ctorrent-dnh3.3.2thor1/ChangeLog ./ChangeLog
--- ../ctorrent-dnh3.3.2thor1/ChangeLog 2015-12-01 18:12:38.245855390 +0100
+++ ./ChangeLog 2015-12-03 16:46:09.000000000 +0100
@@ -2,6 +2,13 @@
 Enhanced CTorrent Change Log
 _________________________________________________________________
+ Changes for "dnh3.3.2thor2" hack
+ * hacking around some use-after-free reported by valgrind
+ * investigating double free issues and inadvertedly fixing them
+   by the above?! We had random crashes of ctorrent on a cluster
+   of about 400 nodes using it to fetch ca. 400 MiB each from
+   each other while booting.
+
  Changes for "dnh3.3.2thor1" hack
  * Continue and skip size check if existing file is not a regular
    one. Idea: I want to seed /dev/loop0 with a squashfs image
diff -ruN ../ctorrent-dnh3.3.2thor1/configure ./configure
--- ../ctorrent-dnh3.3.2thor1/configure 2015-12-01 18:12:44.459713043 +0100
+++ ./configure 2015-12-03 17:01:22.923195795 +0100
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2thor1.
+# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2thor2.
 #
 # Report bugs to <http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com>.
 #
@@ -581,8 +581,8 @@
 # Identity of this package.
 PACKAGE_NAME='Enhanced CTorrent'
 PACKAGE_TARNAME='ctorrent'
-PACKAGE_VERSION='dnh3.3.2thor1'
-PACKAGE_STRING='Enhanced CTorrent dnh3.3.2thor1'
+PACKAGE_VERSION='dnh3.3.2thor2'
+PACKAGE_STRING='Enhanced CTorrent dnh3.3.2thor2'
 PACKAGE_BUGREPORT='http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com'
 PACKAGE_URL=''
@@ -1270,7 +1270,7 @@
 # Omit some internal or obsolete options to make the list less imposing.
 # This message is too long to be a string in the A/UX 3.1 sh.
 cat <<_ACEOF
-\`configure' configures Enhanced CTorrent dnh3.3.2thor1 to adapt to many kinds of systems.
+\`configure' configures Enhanced CTorrent dnh3.3.2thor2 to adapt to many kinds of systems.
 Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1336,7 +1336,7 @@
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-   short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2thor1:";;
+   short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2thor2:";;
   esac
   cat <<\_ACEOF
@@ -1433,7 +1433,7 @@
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-Enhanced CTorrent configure dnh3.3.2thor1
+Enhanced CTorrent configure dnh3.3.2thor2
 generated by GNU Autoconf 2.61
 Copyright (C) 2012 Free Software Foundation, Inc.
@@ -2070,7 +2070,7 @@
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
-It was created by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
+It was created by Enhanced CTorrent $as_me dnh3.3.2thor2, which was
 generated by GNU Autoconf 2.61. Invocation command line was
   $ $0 $@
@@ -2736,7 +2736,7 @@
 # Define the identity of the package.
  PACKAGE=ctorrent
- VERSION=dnh3.3.2thor1
+ VERSION=dnh3.3.2thor2
 cat >>confdefs.h <<_ACEOF
@@ -7322,7 +7322,7 @@
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
+This file was extended by Enhanced CTorrent $as_me dnh3.3.2thor2, which was
 generated by GNU Autoconf 2.61. Invocation command line was
   CONFIG_FILES = $CONFIG_FILES
@@ -7388,7 +7388,7 @@
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-Enhanced CTorrent config.status dnh3.3.2thor1
+Enhanced CTorrent config.status dnh3.3.2thor2
 configured by $0, generated by GNU Autoconf 2.61,
   with options \\"\$ac_cs_config\\"
diff -ruN ../ctorrent-dnh3.3.2thor1/peer.cpp ./peer.cpp
--- ../ctorrent-dnh3.3.2thor1/peer.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ./peer.cpp 2015-12-03 16:42:39.000000000 +0100
@@ -525,6 +525,7 @@
     idx = get_nl(msgbuf + H_LEN + H_BASE_LEN);
     off = get_nl(msgbuf + H_LEN + H_BASE_LEN + H_INT_LEN);
     len = get_nl(msgbuf + H_LEN + H_BASE_LEN + H_INT_LEN * 2);
+    //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
     if( reponse_q.Remove(idx,off,len) < 0 ){
       if( m_state.local_choked &&
           m_last_timestamp - m_unchoke_timestamp >
@@ -664,6 +665,7 @@
         m_cancel_time = now;
       }
       next = ps->next;
+      //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
       request_q.Remove(ps->index, ps->offset, ps->length);
     }
     if( request_q.IsEmpty() ){
@@ -843,6 +845,7 @@
     // (then the request is already in Pending).
     if( f_requested && !BTCONTENT.FlushFailed() ){
       // This removes only the first instance; re-queued request is safe.
+      //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
       request_q.Remove(idx,off,len);
       m_req_out--;
       if( RequestSlice(idx,off,len) < 0 ){
@@ -852,6 +855,7 @@
       }
     }
   }else{ // saved or had the data
+    //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
     request_q.Remove(idx,off,len);
     if( f_requested ) m_req_out--;
     // Check for & cancel requests for this slice from other peers in initial
diff -ruN ../ctorrent-dnh3.3.2thor1/peerlist.cpp ./peerlist.cpp
--- ../ctorrent-dnh3.3.2thor1/peerlist.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ./peerlist.cpp 2015-12-03 16:42:39.000000000 +0100
@@ -82,8 +82,10 @@
 void PeerList::CloseAll()
 {
   PEERNODE *p;
+  //CONSOLE.Warning(0, "ThOr: PeerList::CloseAll()");
   for( p = m_head; p; ){
     m_head = p->next;
+    //CONSOLE.Warning(0, "ThOr: deleting peernode %p with peer %p", (void*)p, (void*)(p->peer));
     delete (p->peer);
     delete p;
     p = m_head;
@@ -194,6 +196,7 @@
   m_peers_count++;
   p->peer = peer;
   p->next = m_head;
+  //CONSOLE.Warning(0,"ThOr: added peernode %p with peer %p", (void*)p, (void*)(p->peer));
   m_head = p;
   return 0;
@@ -612,8 +615,9 @@
   }
   if( mark < slots && data[mark].count == 1 ) m_dup_req_pieces++;
   CONSOLE.Debug("%d dup req pieces", (int)m_dup_req_pieces);
+  size_t datamarkidx = data[mark].idx;
   delete []data;
-  return (mark < slots) ? data[mark].idx : BTCONTENT.GetNPieces();
+  return (mark < slots) ? datamarkidx : BTCONTENT.GetNPieces();
 }
 void PeerList::FindValuedPieces(BitField &bf, const btPeer *proposer,
diff -ruN ../ctorrent-dnh3.3.2thor1/rate.cpp ./rate.cpp
--- ../ctorrent-dnh3.3.2thor1/rate.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ./rate.cpp 2015-12-03 16:42:39.000000000 +0100
@@ -25,6 +25,7 @@
   m_late = 0;
   m_ontime = m_update_nominal = 0;
   m_lastrate.lasttime = (time_t)0;
+  m_lastrate.recent = 0;
   m_nominal = DEFAULT_SLICE_SIZE / 8; // minimum "acceptable" rate
 }
diff -ruN ../ctorrent-dnh3.3.2thor1/tracker.cpp ./tracker.cpp
--- ../ctorrent-dnh3.3.2thor1/tracker.cpp 2008-06-15 02:00:19.000000000 +0200
+++ ./tracker.cpp 2015-12-03 16:42:39.000000000 +0100
@@ -463,7 +463,7 @@
 int btTracker::SendRequest()
 {
-  char *event,*str_event[] = {"started","stopped","completed" };
+  const char *event,*str_event[] = {"started","stopped","completed" };
   char REQ_BUFFER[2*MAXPATHLEN];
   struct sockaddr_in addr;
diff -ruN ../ctorrent-dnh3.3.2thor1/version.m4 ./version.m4
--- ../ctorrent-dnh3.3.2thor1/version.m4 2015-12-01 18:12:38.247855344 +0100
+++ ./version.m4 2015-12-03 17:00:51.685910589 +0100
@@ -1,5 +1,5 @@
 m4_define([m4_PACKAGE_NAME], [Enhanced CTorrent])
 m4_define([m4_PACKAGE_TARNAME], [ctorrent])
-m4_define([m4_PACKAGE_VERSION], [dnh3.3.2thor1])
+m4_define([m4_PACKAGE_VERSION], [dnh3.3.2thor2])
 m4_define([m4_PACKAGE_BUGREPORT], [http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com])
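The reason the first patch (the btfiles.cpp change) tolerates non-regular files is the seeding layout set up in the initrd: the path named inside the torrent becomes a symlink to the loop device that already holds the image. A minimal sketch of that layout in a scratch directory, with the assumptions labelled: `/dev/null` stands in for `/dev/loop0`, the path `os/gpu-example/rootimg.sfs` is a made-up example of what `ctorrent -x` might report, and `prepare_seed_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Build the directory layout a seeder expects, but point the payload path
# at a device node instead of copying the image a second time.
#   $1 = seed directory
#   $2 = file path as stored in the torrent (relative)
#   $3 = device that already holds the image
prepare_seed_dir() {
  mkdir -p "$1/$(dirname "$2")" &&
  ln -s "$3" "$1/$2"
}

seed_dir=$(mktemp -d)
distfile=os/gpu-example/rootimg.sfs   # made-up example path
# /dev/null stands in for /dev/loop0 in this sketch
prepare_seed_dir "$seed_dir" "$distfile" /dev/null

# The entry resolves to a device, not a regular file, which is exactly the
# case an unpatched regular-file check would refuse.
if [ -f "$seed_dir/$distfile" ]; then
  kind="regular file"
else
  kind="not a regular file"
fi
echo "$distfile -> $(readlink "$seed_dir/$distfile") ($kind)"
```

The `-f` test follows the symlink, so it sees the device node behind it, much like the `stat`-based check in the patched code does.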