Hi,

I first came into contact with xCAT through our HPC system, installed
in 2015, with xCAT version … hm …

# nodels --version
Version 2.9.1 (git commit 7f6043fffd62d482931b17b60f9488eb5754fdc1, built Thu Mar 19 03:25:35 EDT 2015)

2.9.1 seems to be it. The base system is CentOS 7.x. Since the system
was an en bloc purchase, we never updated xCAT; I just adapted it
to our needs and then let it do its thing over the years. I made some
small changes, like fixing up /etc/hostname in the initrd (not sure if
that was a specific mixup in our setup with long and short hostnames)
and recently the fix for CVE-2023-27486 (being rather annoyed that
/root/.ssh/id_rsa would _ever_ be delivered out to cluster nodes; it
should always have been a separate directory where I consciously copy
a key or have one generated). But nothing to rock the boat. CentOS
upgrades up to 7.9 didn't hurt things. We did stick to a certain
vanilla kernel build with our patches, though.

The system will be out of production in the near future and we do not
know what the next installation will be using. I intended to share a
main point of my local hacking, but somehow never got around to it, and
I figured that the obvious stuff would appear upstream anyway.
Example: I enabled squashfs+overlayfs for us with a few lines and I
gather that is a standard thing now.

Obvious to me back in 2015 was that the distribution of stateless
filesystem images was slowed down unnecessarily by them being served
via HTTP from the admin node over its 1GbE interface. Booting a cluster
of 400 nodes took ages because of that (well, a quarter to half an hour
or so).
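
A back-of-the-envelope check of that figure, as a sketch only: it
assumes about 400 MiB per node (the size mentioned in the ctorrent
ChangeLog further down) and roughly 112 MiB/s of usable payload rate on
1GbE, which is an assumed value, not a measurement. The helper name is
made up for illustration.

```shell
#!/bin/sh
# Rough estimate of how long a single server link needs to push one
# image to every node, ignoring protocol overhead and contention.
# transfer_seconds NODES IMAGE_MIB LINK_MIB_PER_S  (hypothetical helper)
transfer_seconds() {
  echo $(( $1 * $2 / $3 ))
}

nodes=400            # cluster size from the text
image_mib=400        # approximate image size per node
link_mib_per_s=112   # assumption: usable payload rate of a 1GbE link

secs=$(transfer_seconds "$nodes" "$image_mib" "$link_mib_per_s")
echo "about $(( secs / 60 )) minutes over a single 1GbE link"
```

With these inputs the estimate comes out at 23 minutes, which sits
inside the observed quarter to half an hour.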

Is this still the current mechanism? While you could make the admin
node part of the high-speed network (Infiniband in our case), or just
use 10GbE as a baseline today, it just feels right to me to scale out
the distribution capacity with the number of compute nodes.

Is anyone interested in that? Should I propose a formal change to xCAT
for that feature? Did I miss an equivalent option that exists now in
current xCAT? I only found some consulting company boasting about
having implemented torrents with xCAT for a customer, but nothing
official.

I'll describe what I did, anyway. 6 steps follow.

1. I got hold of a minimal torrent program: ctorrent from

        https://sourceforge.net/p/dtorrent/

2. I wrote the first of the two attached patches to support the cluster
   use-case with /dev/loop0 for reading the rootimg, see

        https://sourceforge.net/p/dtorrent/patches/5/

   The second patch then followed to fix a memory issue, see

        https://sourceforge.net/p/dtorrent/patches/7/

3. I applied a rather small change to the xcatroot dracut script to
   download the image via ctorrent in initrd and prepare seeding later.

---------------8<---------------------
Index: share/xcat/netboot/rh/dracut_033/xcatroot
===================================================================
--- share/xcat/netboot/rh/dracut_033/xcatroot   (Revision 833)
+++ share/xcat/netboot/rh/dracut_033/xcatroot   (Revision 834)
@@ -21,6 +21,12 @@
 /tmp/updateflag $MASTER $XCATIPORT "installstatus netbooting"
 fi
 
+if [ -e /rootimg.torrent ]; then
+
+  ctorrent -s /rootimg.sfs -e 0 /rootimg.torrent
+
+else
+
 if [ ! -z "$imgurl" ]; then
        if [ xhttp = x${imgurl%%:*} ]; then
                NFS=0
@@ -43,6 +49,9 @@
                ROOTDIR=/${ROOTDIR#*/} 
        fi
 fi
+
+fi # torrent
+
 #echo 0 > /proc/sys/vm/zone_reclaim_mode #Avoid kernel bug
 
 if [ -r /rootimg.sfs ]; then
@@ -61,6 +70,15 @@
   mkdir -p $NEWROOT/rw
   mount --move /ro $NEWROOT/ro
   mount --move /rw $NEWROOT/rw
+  if [ -e /rootimg.torrent ]; then
+    # Prepare for seeding the rootimg.
+    # Note that this demands the patched dnh3.3.2thor1 ctorrent binary.
+    mkdir $NEWROOT/.sysdist
+    cp /usr/bin/ctorrent /rootimg.torrent $NEWROOT/.sysdist
+    rrz_distfile=$(ctorrent -x /rootimg.torrent | grep rootimg.sfs | cut -f 2 -d ' ')
+    mkdir -p $NEWROOT/.sysdist/$(dirname $rrz_distfile)
+    ln -s /dev/loop0 $NEWROOT/.sysdist/$rrz_distfile
+  fi
 elif [ -r /rootimg.gz ]; then
   echo Setting up RAM-root tmpfs.
   if [ -z $rootlimit ];then
--------------->8---------------------
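
The seeding preparation in the last hunk boils down to this: the
torrent records a relative path for its payload, and a symlink at that
path makes the client read the loop device that backs the mounted
squashfs instead of a regular file. A minimal sketch of just that
layout step follows. The function name is hypothetical, and it assumes,
like the patch, that rootimg.sfs ended up on /dev/loop0.

```shell
#!/bin/sh
# prepare_seed_dir SEED_ROOT REL_PATH DEVICE  (hypothetical helper)
# Creates SEED_ROOT/REL_PATH as a symlink to DEVICE, including any
# intermediate directories named in the torrent's payload path.
prepare_seed_dir() {
  seed_root=$1
  rel_path=$2
  device=$3
  mkdir -p "$seed_root/$(dirname "$rel_path")" &&
    ln -s "$device" "$seed_root/$rel_path"
}

# Example call, mirroring the patch:
#   prepare_seed_dir "$NEWROOT/.sysdist" "$rrz_distfile" /dev/loop0
```

This only works with a client that accepts a non-regular file as
payload, which is what the first ctorrent patch is for.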

4. Include the torrent stuff in the image generation script.

---------------8<--------------------
#!/bin/sh

scriptdir=$(cd $(dirname $0) && pwd)
PATH=$scriptdir:$PATH
sysbase=centos79
osimage=$sysbase-x86_64-stateless-gpu
imgdir=/install/netboot/$sysbase/x86_64/gpu
xcatinitrd=$imgdir/initrd-stateless.gz

# normal image generation
# packimage, etc.

# stop main seeding service on admin node
service rrz-dist-mainseed stop

# Create torrent file for efficient distribution.
torrfile=gpu-$sysbase-$timecode-rootimg.torrent
cd /install/dist
ctorrent -t \
  -s $torrfile  \
  -u http://$admin_ip:81/announce \
  os/gpu-$sysbase-$timecode/rootimg.sfs

# start seeding again, picking up added .torrent
service rrz-dist-mainseed start

# Disable this step in case of weird boot trouble.
# It strips lots of drivers/firmware that are not
# obviously needed for booting.
# Initrd loading (without torrent) is the new bottleneck.
rrz-initrd-reduce $xcatinitrd

# Insert torrent client and torrent file into initrd.
# If that is disabled, standard HTTP download using the
# URL from pxelinux config is done.
initrdir=$(rrz-initrd-unpack $xcatinitrd)
cp -v $scriptdir/ctorrent $initrdir/usr/bin
cp -v /install/dist/$torrfile $initrdir/rootimg.torrent
rrz-initrd-pack $xcatinitrd $initrdir
rrz-initrd-rmdir "$initrdir"

rrz-initrd-ucode $xcatinitrd

# Yes, update the actual copy of the initrd that is used
# during netboot.
cp -v $xcatinitrd $bootinitrd
--------------->8--------------------
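
The rrz-initrd-* helpers are site-local and not shown here. For
readers who want to try the insertion step, the following is a sketch
of what unpack/pack stand-ins could look like. Everything in it is an
assumption: the function names are made up, and it only handles a plain
gzip-compressed newc cpio initrd without a prepended microcode archive.

```shell
#!/bin/sh
# Hypothetical stand-ins for the site-local unpack/pack helpers.

# initrd_unpack INITRD  -> prints the directory the initrd was unpacked to.
# INITRD should be an absolute path, since cpio runs inside the new dir.
initrd_unpack() {
  dir=$(mktemp -d) || return 1
  ( cd "$dir" && gzip -dc "$1" | cpio -idm ) || return 1
  echo "$dir"
}

# initrd_pack INITRD DIR  -> writes DIR's content as gzip'd newc cpio.
initrd_pack() {
  ( cd "$2" && find . | cpio -o -H newc | gzip -9 ) > "$1"
}

# Example, following the flow of the script above:
#   initrdir=$(initrd_unpack "$xcatinitrd")
#   cp ctorrent "$initrdir/usr/bin"
#   cp some.torrent "$initrdir/rootimg.torrent"
#   initrd_pack "$xcatinitrd" "$initrdir"
```

If the initrd carries early microcode in front of the main archive,
this simple round trip is not sufficient.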


5. Added a seed service to syncfiles:

cat /usr/lib/systemd/system/rrz-dist-seed.service 
[Unit]
Description=ctorrent node seed for image distribution
After=network.target

[Service]
# It might be that the fresh torrent file is not available right away
# inside /work (?!), so restarting may be needed to really get
# an instance up.
Restart=always
RestartSec=10
WorkingDirectory=/.sysdist
# Not starting as user yet, because root perm needed for preparation
# User=sysdist
ExecStartPre=/bin/chmod 0640 /dev/loop0
ExecStartPre=/bin/chown :sysdist /dev/loop0
ExecStart=/bin/su sysdist -c '/.sysdist/ctorrent -q -m 5 -M 20 -U 102400 -e -1 rootimg.torrent'

[Install]
WantedBy=multi-user.target


6. Before all that, the main seeder service has to run on the admin node:

[root@adm1 xcat]# cat /install/rrz/rrz-dist-mainseed.sh
#!/bin/bash
# called as a system service
pids=
for torr in /install/dist/*.torrent
do
  /install/rrz/ctorrent -q -m 5 -M 20 -U 60000 -e -1 "$torr" &
  pid=$!
  pids+=" $pid"
  echo "torrent for $torr with PID $pid"
done

trap "kill $pids" EXIT

wait
# end
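
When collecting the PIDs of background jobs in a loop like this, the
variable to read is `$!` (PID of the most recent background job), while
`$?` only holds an exit status. A small stand-alone sketch of the
pattern, with `sleep` standing in for the torrent client:

```shell
#!/bin/sh
# Start a few background jobs and remember their PIDs for cleanup.
pids=
for n in 1 2 3
do
  sleep 10 >/dev/null 2>&1 &
  pids="$pids $!"      # $! is the PID of the job just put in the background
done

# Single quotes: $pids is expanded when the trap fires, not when it is set.
trap 'kill $pids 2>/dev/null' EXIT

echo "started:$pids"
```

In a real service script a final `wait` keeps the script alive until
the children exit; it is left out here so the sketch returns at once.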

And as tracker, I built an instance of opentracker (below 90K binary,
ctorrent is around 310K, not stripped) from

        http://erdgeist.org/arts/software/opentracker/

(snapshot: 
http://src.rrz.uni-hamburg.de/files/src/_unsorted/opentracker-20151001.tar.bz2)
with the config file boiling down to three lines:

listen.udp.workers 6
listen.tcp_udp $admin_ip:81
tracker.user    nobody

and this simple call as systemd service:

[Service]
ExecStart=/install/rrz/opentracker -f /install/rrz/opentracker.conf
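
Since `$admin_ip` in the config above is a placeholder and not
something opentracker expands itself, the file has to be written with
the real address filled in. A sketch of generating it from a shell
variable follows. The function name and the example address are made
up; the three config lines are the ones quoted above.

```shell
#!/bin/sh
# write_tracker_conf ADMIN_IP PORT OUTFILE  (hypothetical helper)
# Writes the three-line tracker config with the address filled in.
write_tracker_conf() {
  cat > "$3" <<EOF
listen.udp.workers 6
listen.tcp_udp $1:$2
tracker.user    nobody
EOF
}

# Example (address is a placeholder):
#   write_tracker_conf 10.0.0.1 81 /install/rrz/opentracker.conf
```

The port has to match the announce URL baked into the torrent file in
step 4.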


Now this is a long mail, but a rather complete description of the steps
I took to make booting of my stateless nodes so fast that I haven't
worried about the image distribution part since mid/end of 2015. Now, at
the end of the system's lifetime, I am starting to worry a bit about
what will come next …

Is there interest in the xCAT community to pick this up? One might have
to adopt/fork ctorrent, while opentracker seems to be alive, although
the author hasn't bothered to name a release yet. In the closed loop
where we use ctorrent as the only client/server with this tracker
server, this might be acceptable. To me, more acceptable than bloating
the initrd again with some other torrent software that is more than a
few 100K in size.

Should this be supported in xCAT upstream? Having 100G networking in
the admin node might make this obsolete, but this just means that we
could scale to a few thousand nodes more without impacting a single
network link. In clusters, when you can distribute a load, you should
think twice before _not_ doing it, right?
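
To put rough numbers on the single-link argument, here is a sketch that
counts how many nodes one admin link can serve completely within a
time budget. The 60-second budget, the roughly 400 MiB image and the
112 MiB/s of usable rate per Gbit/s are all assumptions for
illustration, and the helper name is made up.

```shell
#!/bin/sh
# nodes_within BUDGET_S LINK_MIB_PER_S IMAGE_MIB  (hypothetical helper)
# Number of nodes a single link can serve completely within the budget,
# ignoring protocol overhead and server-side limits.
nodes_within() {
  echo $(( $1 * $2 / $3 ))
}

budget_s=60      # assumption: acceptable time for image distribution
image_mib=400    # approximate image size per node

echo "1GbE:   $(nodes_within "$budget_s" 112 "$image_mib") nodes"
echo "100GbE: $(nodes_within "$budget_s" 11200 "$image_mib") nodes"
```

Under these assumptions a 1GbE link covers 16 nodes and a 100GbE link
1680 nodes in a minute, so the faster link moves the limit but does not
remove it.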


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
HPC @ Universität Hamburg
diff -ru ./btconfig.cpp ../ctorrent-dnh3.3.2thor1/btconfig.cpp
--- ./btconfig.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/btconfig.cpp	2015-10-02 09:01:16.966502705 +0200
@@ -35,6 +35,7 @@
 unsigned char arg_flg_convert_filenames = 0;
 char *arg_file_to_download = (char *)0;
 unsigned char arg_verbose = 0;
+unsigned char arg_quiet = 0;
 unsigned char arg_allocate = 0;
 unsigned char arg_daemon = 0;
 
diff -ru ./btconfig.h ../ctorrent-dnh3.3.2thor1/btconfig.h
--- ./btconfig.h	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/btconfig.h	2015-10-02 09:00:40.851835517 +0200
@@ -48,6 +48,7 @@
 extern unsigned char arg_flg_convert_filenames;
 extern char *arg_file_to_download;
 extern unsigned char arg_verbose;
+extern unsigned char arg_quiet;
 extern unsigned char arg_allocate;
 extern unsigned char arg_daemon;
 
diff -ru ./btfiles.cpp ../ctorrent-dnh3.3.2thor1/btfiles.cpp
--- ./btfiles.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/btfiles.cpp	2015-10-02 07:57:55.800456576 +0200
@@ -641,10 +641,10 @@
       }
     }else{
       if( !check_exist) check_exist = 1;
+      /* ThOr: I want to seed /dev/loop0 */ 
       if( !(S_IFREG & sb.st_mode) ){
-        CONSOLE.Warning(1, "error, file \"%s\" is not a regular file.", fn);
-        return -1;
-      }
+        CONSOLE.Warning(1, "file \"%s\" is not a regular file, assuming you know more than me about it", fn);
+      } else
       if(sb.st_size != pbt->bf_length){
         CONSOLE.Warning(1,"error, file \"%s\" size doesn't match; must be %llu",
                 fn, (unsigned long long)(pbt->bf_length));
diff -ru ./ChangeLog ../ctorrent-dnh3.3.2thor1/ChangeLog
--- ./ChangeLog	2008-06-15 02:09:43.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/ChangeLog	2015-10-02 09:10:02.914657947 +0200
@@ -2,6 +2,19 @@
                         Enhanced CTorrent Change Log
      _________________________________________________________________
 
+   Changes for "dnh3.3.2thor1" hack
+     * Continue and skip size check if existing file is not a regular
+       one. Idea: I want to seed /dev/loop0 with a squashfs image
+       on the other end. This indeed works even without write permissions
+       to the file. Good thing, don't want to wreck the booted system.
+     * Speed up things when "Quitting" by simulating what Ctrl+C does
+       after seeding. 
+     * Added -q for quiet(er) operation for running it as a service with
+       systemd, which doesn't like applications daemonizing themselves.
+       Also, I hope I get a reliable service file with that at all.
+       (With -d, I need systemctl daemon-reload all the time to make
+       ctorrent really start from the service file. Very weird.)
+
    Changes for "dnh3.3.2" Release
 
      * [1954631] Fixed incorrect ordering of dictionary keys in created
diff -ru ./configure ../ctorrent-dnh3.3.2thor1/configure
--- ./configure	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/configure	2015-10-02 08:56:36.178090875 +0200
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2.
+# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2thor1.
 #
 # Report bugs to <http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com>.
 #
@@ -574,8 +574,8 @@
 # Identity of this package.
 PACKAGE_NAME='Enhanced CTorrent'
 PACKAGE_TARNAME='ctorrent'
-PACKAGE_VERSION='dnh3.3.2'
-PACKAGE_STRING='Enhanced CTorrent dnh3.3.2'
+PACKAGE_VERSION='dnh3.3.2thor1'
+PACKAGE_STRING='Enhanced CTorrent dnh3.3.2thor1'
 PACKAGE_BUGREPORT='http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com'
 
 ac_unique_file="ctorrent.cpp"
@@ -1216,7 +1216,7 @@
   # Omit some internal or obsolete options to make the list less imposing.
   # This message is too long to be a string in the A/UX 3.1 sh.
   cat <<_ACEOF
-\`configure' configures Enhanced CTorrent dnh3.3.2 to adapt to many kinds of systems.
+\`configure' configures Enhanced CTorrent dnh3.3.2thor1 to adapt to many kinds of systems.
 
 Usage: $0 [OPTION]... [VAR=VALUE]...
 
@@ -1282,7 +1282,7 @@
 
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-     short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2:";;
+     short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2thor1:";;
    esac
   cat <<\_ACEOF
 
@@ -1376,7 +1376,7 @@
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-Enhanced CTorrent configure dnh3.3.2
+Enhanced CTorrent configure dnh3.3.2thor1
 generated by GNU Autoconf 2.61
 
 Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001,
@@ -1390,7 +1390,7 @@
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
 
-It was created by Enhanced CTorrent $as_me dnh3.3.2, which was
+It was created by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
 generated by GNU Autoconf 2.61.  Invocation command line was
 
   $ $0 $@
@@ -2060,7 +2060,7 @@
 
 # Define the identity of the package.
  PACKAGE=ctorrent
- VERSION=dnh3.3.2
+ VERSION=dnh3.3.2thor1
 
 
 cat >>confdefs.h <<_ACEOF
@@ -9777,7 +9777,7 @@
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by Enhanced CTorrent $as_me dnh3.3.2, which was
+This file was extended by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
 generated by GNU Autoconf 2.61.  Invocation command line was
 
   CONFIG_FILES    = $CONFIG_FILES
@@ -9830,7 +9830,7 @@
 _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF
 ac_cs_version="\\
-Enhanced CTorrent config.status dnh3.3.2
+Enhanced CTorrent config.status dnh3.3.2thor1
 configured by $0, generated by GNU Autoconf 2.61,
   with options \\"`echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`\\"
 
diff -ru ./console.cpp ../ctorrent-dnh3.3.2thor1/console.cpp
--- ./console.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/console.cpp	2015-10-02 09:06:39.493531278 +0200
@@ -350,7 +350,8 @@
 
 int Console::IntervalCheck(fd_set *rfdp, fd_set *wfdp)
 {
-  Status(0);
+  if(!arg_quiet)
+    Status(0);
 
   if( m_oldfd >= 0 ){
     FD_CLR(m_oldfd, rfdp);
@@ -445,7 +446,8 @@
       }
       if( '0' != pending ){
           m_streams[O_INPUT]->SetInputMode(K_CHARS);
-          Status(1);
+          if(!arg_quiet)
+            Status(1);
       }
 
     }else{     // command character received
@@ -574,7 +576,8 @@
           BTCONTENT.CacheConfigure();
           break;
         default:
-          Status(1);
+          if(!arg_quiet)
+            Status(1);
           break;
         }
         if( 10==++count ) inc *= 2;
@@ -590,7 +593,8 @@
         OperatorMenu("");
         break;
       default:
-        Status(1);
+        if(!arg_quiet)
+          Status(1);
         break;
       }
 
diff -ru ./ctorrent.cpp ../ctorrent-dnh3.3.2thor1/ctorrent.cpp
--- ./ctorrent.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/ctorrent.cpp	2015-10-02 09:20:18.284994066 +0200
@@ -141,7 +141,7 @@
 
   if( 0==strncmp(argv[1], "-t", 2) )
     opts = "tc:l:ps:u:";
-  else opts = "aA:b:cC:dD:e:E:fi:I:M:m:n:P:p:s:S:Tu:U:vxX:z:hH";
+  else opts = "aA:b:cC:dD:e:E:fi:I:M:m:n:P:p:s:S:Tu:U:vxX:z:qhH";
 
   while( (c=getopt(argc, argv, opts)) != -1 )
     switch( c ){
@@ -334,6 +334,10 @@
       arg_verbose = 1;
       break;
 
+    case 'q':                   // be quiet
+      arg_quiet = 1;
+      break;
+
     case 'd':			// daemon mode (fork to background)
       arg_daemon++;
       break;
@@ -387,6 +391,7 @@
     "Decode metainfo (torrent) file only, don't download");
   fprintf(stderr, "%-15s %s\n", "-c", "Check pieces only, don't download");
   fprintf(stderr, "%-15s %s\n", "-v", "Verbose output (for debugging)");
+  fprintf(stderr, "%-15s %s\n", "-q", "Less verbose output (no progress indication)");
 
   fprintf(stderr,"\nDownload Options:\n");
   fprintf(stderr, "%-15s %s\n", "-e int",
diff -ru ./downloader.cpp ../ctorrent-dnh3.3.2thor1/downloader.cpp
--- ./downloader.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/downloader.cpp	2015-10-02 08:36:58.918936358 +0200
@@ -47,6 +47,14 @@
         if( arg_ctcs ) CTCS.Send_Status();
       }
     }
+    else
+    {
+       /* ThOr: Trying to speed things up. This is what the SIGINT
+          handler does. Without that, I have at least 14 seconds delay
+          with ctorrent "Quitting". */
+       Tracker.ClearRestart();
+       Tracker.SetStoped();
+    }
 
     maxfd = -1;
     maxsleep = -1;
@@ -104,7 +112,6 @@
       if( maxsleep <= -100 ) maxsleep = 0;
       else if( maxsleep <= 0 || maxsleep > MAX_SLEEP ) maxsleep = MAX_SLEEP;
     }
-
     timeout.tv_sec = (long)maxsleep;
     timeout.tv_usec = (long)( (maxsleep-(long)maxsleep) * 1000000 );
 
diff -ru ./version.m4 ../ctorrent-dnh3.3.2thor1/version.m4
--- ./version.m4	2008-06-15 02:00:19.000000000 +0200
+++ ../ctorrent-dnh3.3.2thor1/version.m4	2015-10-02 08:40:36.275934703 +0200
@@ -1,5 +1,5 @@
 m4_define([m4_PACKAGE_NAME],      [Enhanced CTorrent])
 m4_define([m4_PACKAGE_TARNAME],   [ctorrent])
-m4_define([m4_PACKAGE_VERSION],   [dnh3.3.2])
+m4_define([m4_PACKAGE_VERSION],   [dnh3.3.2thor1])
 m4_define([m4_PACKAGE_BUGREPORT], [http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com])
 
diff -ruN ../ctorrent-dnh3.3.2thor1/btcontent.cpp ./btcontent.cpp
--- ../ctorrent-dnh3.3.2thor1/btcontent.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ./btcontent.cpp	2015-12-03 16:42:39.000000000 +0100
@@ -53,7 +53,7 @@
 (max_uint64_t((ca)->bc_off,(roff)) <= \
  min_uint64_t(((ca)->bc_off + (ca)->bc_len - 1),(roff + rlen - 1)))
 
-
+size_t btcache_counter = 0;
 btContent BTCONTENT;
 
 static void Sha1(char *ptr,size_t len,unsigned char *dm)
@@ -627,6 +627,7 @@
 
       m_cache_used -= p->bc_len;
       delete []p->bc_buf;
+      //CONSOLE.Warning(0, "ThOr: CacheClean: deleting entry %zu: %p", p->id, (void*)p);
       delete p;
     }
   }
@@ -749,7 +750,11 @@
     CacheEval();
   }else m_cache_size = 0;
 
-  if( m_cache_size < m_cache_used && !m_flush_failed ) CacheClean(0);
+  if( m_cache_size < m_cache_used && !m_flush_failed )
+  {
+     //CONSOLE.Warning(0, "ThOr: btcontent:%i CacheClean", __LINE__);
+     CacheClean(0);
+  }
 }
 
 int btContent::NeedFlush() const
@@ -849,6 +854,7 @@
 
      m_cache_used -= p->bc_len;
      delete []p->bc_buf;
+     //CONSOLE.Warning(0, "ThOr: UnCache: deleting entry %zu: %p", p->id, (void*)p);
      delete p;
   }
   m_cache[idx] = (BTCACHE *)0;
@@ -912,6 +918,7 @@
 
       m_cache_used -= p->bc_len;
       delete []p->bc_buf;
+      //CONSOLE.Warning(0, "ThOr: CachePrep: deleting entry %p", (void*)p);
       delete p;
     }
   }
@@ -988,12 +995,16 @@
     CONSOLE.Debug("Read to %s %d/%d/%d", buf?"buffer":"cache",
       (int)(off / m_piece_length), (int)(off % m_piece_length), (int)len);
 
-  if( m_cache_size < m_cache_used + len ) CacheClean(len);
+  if( m_cache_size < m_cache_used + len )
+  {
+     //CONSOLE.Warning(0, "ThOr: btcontent:%i CacheClean", __LINE__);
+     CacheClean(len);
+  }
   // Note, there is no failure code from CacheClean().  If nothing can be done
   // to increase the cache size, we allocate what we need anyway.
   
   if( 0==method && buf && m_btfiles.IO(buf, off, len, method) < 0 ) return -1;
-  
+ 
   pnew = new BTCACHE;
 #ifndef WINDOWS
   if( !pnew )
@@ -1014,6 +1025,8 @@
     delete pnew;
     return -1;
   }
+  pnew->id = ++btcache_counter;
+  //CONSOLE.Warning(0, "ThOr: btcontent: created cache entry %zu: %p", pnew->id, (void*)pnew);
   pnew->bc_off = off;
   pnew->bc_len = len;
   pnew->bc_f_flush = method;
@@ -1366,7 +1379,11 @@
 #ifdef HAVE_WORKING_FORK
   if( cfg_cache_size ){  // maybe free some cache before forking
     CacheEval();
-    if( m_cache_size < m_cache_used && !m_flush_failed ) CacheClean(0);
+    if( m_cache_size < m_cache_used && !m_flush_failed )
+    {
+       //CONSOLE.Warning(0, "ThOr: btcontent:%i CacheClean", __LINE__);
+       CacheClean(0);
+    }
   }
   pid_t r;
   if( (r = fork()) < 0 ){
diff -ruN ../ctorrent-dnh3.3.2thor1/btcontent.h ./btcontent.h
--- ../ctorrent-dnh3.3.2thor1/btcontent.h	2008-06-15 02:00:19.000000000 +0200
+++ ./btcontent.h	2015-12-03 14:33:56.000000000 +0100
@@ -22,6 +22,7 @@
   struct _btcache *bc_prev;
   struct _btcache *age_next;
   struct _btcache *age_prev;
+  size_t id;
 }BTCACHE;
 
 typedef struct _btflush{
diff -ruN ../ctorrent-dnh3.3.2thor1/btrequest.cpp ./btrequest.cpp
--- ../ctorrent-dnh3.3.2thor1/btrequest.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ./btrequest.cpp	2015-12-03 16:58:42.433868212 +0100
@@ -10,6 +10,7 @@
 #include "compat.h"
 #endif
 
+size_t btslice_counter = 0;
 
 static void _empty_slice_list(PSLICE *ps_head)
 {
@@ -234,7 +235,8 @@
 #ifndef WINDOWS
   if( !n ) return -1;
 #endif
-
+  n->id = ++btslice_counter;
+  //CONSOLE.Warning(0, "ThOr: Insert slice %zu: %p", n->id, (void*)n);
   n->index = idx;
   n->offset = off;
   n->length = len;
@@ -267,6 +269,8 @@
   if( !n ) return -1;
 #endif
 
+  n->id = ++btslice_counter;
+  //CONSOLE.Warning(0, "ThOr: Add slice %zu: %p", n->id, (void*)n);
   n->next = (PSLICE) 0;
   n->index = idx;
   n->offset = off;
@@ -312,6 +316,7 @@
 
   if( u ) u->next = n->next; else rq_head = n->next;
   if( rq_send == n ) rq_send = n->next;
+  //CONSOLE.Warning(0,"ThOr: RequestQueue::Remove: deleting slice %zu: %p", n->id, (void*)n);
   delete n;
 
   return 0;
@@ -409,6 +414,7 @@
   if(plen) *plen = rq_head->length;
 
   if( rq_send == rq_head ) rq_send = n;
+  //CONSOLE.Warning(0,"ThOr: RequestQueue::Pop: deleting slice %zu: %p", rq_head->id, (void*)rq_head);
   delete rq_head;
 
   rq_head = n;
@@ -636,6 +642,7 @@
         //check if off & len match any slice
         //remove the slice if so
         rq.SetHead(pending_array[i]);
+        //CONSOLE.Warning(0, "ThOr: btrequest:%i Remove", __LINE__);
         if( rq.Remove(idx, off, len) == 0 ){
           r = 1;
           pending_array[i] = rq.GetHead();
diff -ruN ../ctorrent-dnh3.3.2thor1/btrequest.h ./btrequest.h
--- ../ctorrent-dnh3.3.2thor1/btrequest.h	2008-06-15 02:00:19.000000000 +0200
+++ ./btrequest.h	2015-12-03 14:54:46.000000000 +0100
@@ -14,6 +14,7 @@
    size_t length;
    time_t reqtime;
    struct _slice *next;
+   size_t id;
 }SLICE,*PSLICE;
 
 class RequestQueue
diff -ruN ../ctorrent-dnh3.3.2thor1/ChangeLog ./ChangeLog
--- ../ctorrent-dnh3.3.2thor1/ChangeLog	2015-12-01 18:12:38.245855390 +0100
+++ ./ChangeLog	2015-12-03 16:46:09.000000000 +0100
@@ -2,6 +2,13 @@
                         Enhanced CTorrent Change Log
      _________________________________________________________________
 
+   Changes for "dnh3.3.2thor2" hack
+     * hacking around some use-after-free reported by valgrind
+     * investigating double free issues and inadvertently fixing them
+       by the above?! We had random crashes of ctorrent on a cluster
+       of about 400 nodes using it to fetch ca. 400 MiB each from
+       each other while booting.
+
    Changes for "dnh3.3.2thor1" hack
      * Continue and skip size check if existing file is not a regular
        one. Idea: I want to seed /dev/loop0 with a squashfs image
diff -ruN ../ctorrent-dnh3.3.2thor1/configure ./configure
--- ../ctorrent-dnh3.3.2thor1/configure	2015-12-01 18:12:44.459713043 +0100
+++ ./configure	2015-12-03 17:01:22.923195795 +0100
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2thor1.
+# Generated by GNU Autoconf 2.61 for Enhanced CTorrent dnh3.3.2thor2.
 #
 # Report bugs to <http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com>.
 #
@@ -581,8 +581,8 @@
 # Identity of this package.
 PACKAGE_NAME='Enhanced CTorrent'
 PACKAGE_TARNAME='ctorrent'
-PACKAGE_VERSION='dnh3.3.2thor1'
-PACKAGE_STRING='Enhanced CTorrent dnh3.3.2thor1'
+PACKAGE_VERSION='dnh3.3.2thor2'
+PACKAGE_STRING='Enhanced CTorrent dnh3.3.2thor2'
 PACKAGE_BUGREPORT='http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com'
 PACKAGE_URL=''
 
@@ -1270,7 +1270,7 @@
   # Omit some internal or obsolete options to make the list less imposing.
   # This message is too long to be a string in the A/UX 3.1 sh.
   cat <<_ACEOF
-\`configure' configures Enhanced CTorrent dnh3.3.2thor1 to adapt to many kinds of systems.
+\`configure' configures Enhanced CTorrent dnh3.3.2thor2 to adapt to many kinds of systems.
 
 Usage: $0 [OPTION]... [VAR=VALUE]...
 
@@ -1336,7 +1336,7 @@
 
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-     short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2thor1:";;
+     short | recursive ) echo "Configuration of Enhanced CTorrent dnh3.3.2thor2:";;
    esac
   cat <<\_ACEOF
 
@@ -1433,7 +1433,7 @@
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-Enhanced CTorrent configure dnh3.3.2thor1
+Enhanced CTorrent configure dnh3.3.2thor2
 generated by GNU Autoconf 2.61
 
 Copyright (C) 2012 Free Software Foundation, Inc.
@@ -2070,7 +2070,7 @@
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
 
-It was created by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
+It was created by Enhanced CTorrent $as_me dnh3.3.2thor2, which was
 generated by GNU Autoconf 2.61.  Invocation command line was
 
   $ $0 $@
@@ -2736,7 +2736,7 @@
 
 # Define the identity of the package.
  PACKAGE=ctorrent
- VERSION=dnh3.3.2thor1
+ VERSION=dnh3.3.2thor2
 
 
 cat >>confdefs.h <<_ACEOF
@@ -7322,7 +7322,7 @@
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by Enhanced CTorrent $as_me dnh3.3.2thor1, which was
+This file was extended by Enhanced CTorrent $as_me dnh3.3.2thor2, which was
 generated by GNU Autoconf 2.61.  Invocation command line was
 
   CONFIG_FILES    = $CONFIG_FILES
@@ -7388,7 +7388,7 @@
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-Enhanced CTorrent config.status dnh3.3.2thor1
+Enhanced CTorrent config.status dnh3.3.2thor2
 configured by $0, generated by GNU Autoconf 2.61,
   with options \\"\$ac_cs_config\\"
 
diff -ruN ../ctorrent-dnh3.3.2thor1/peer.cpp ./peer.cpp
--- ../ctorrent-dnh3.3.2thor1/peer.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ./peer.cpp	2015-12-03 16:42:39.000000000 +0100
@@ -525,6 +525,7 @@
       idx = get_nl(msgbuf + H_LEN + H_BASE_LEN);
       off = get_nl(msgbuf + H_LEN + H_BASE_LEN + H_INT_LEN);
       len = get_nl(msgbuf + H_LEN + H_BASE_LEN + H_INT_LEN * 2);
+      //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
       if( reponse_q.Remove(idx,off,len) < 0 ){
         if( m_state.local_choked &&
             m_last_timestamp - m_unchoke_timestamp >
@@ -664,6 +665,7 @@
       m_cancel_time = now;
     }
     next = ps->next;
+    //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
     request_q.Remove(ps->index, ps->offset, ps->length);
   }
   if( request_q.IsEmpty() ){
@@ -843,6 +845,7 @@
       // (then the request is already in Pending).
       if( f_requested && !BTCONTENT.FlushFailed() ){
         // This removes only the first instance; re-queued request is safe.
+        //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
         request_q.Remove(idx,off,len);
         m_req_out--;
         if( RequestSlice(idx,off,len) < 0 ){
@@ -852,6 +855,7 @@
         }
       }
     }else{  // saved or had the data
+      //CONSOLE.Warning(0, "ThOr: peer:%i Remove", __LINE__);
       request_q.Remove(idx,off,len);
       if( f_requested ) m_req_out--;
       // Check for & cancel requests for this slice from other peers in initial
diff -ruN ../ctorrent-dnh3.3.2thor1/peerlist.cpp ./peerlist.cpp
--- ../ctorrent-dnh3.3.2thor1/peerlist.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ./peerlist.cpp	2015-12-03 16:42:39.000000000 +0100
@@ -82,8 +82,10 @@
 void PeerList::CloseAll()
 {
   PEERNODE *p;
+  //CONSOLE.Warning(0, "ThOr: PeerList::CloseAll()");
   for( p = m_head; p; ){
     m_head = p->next;
+    //CONSOLE.Warning(0, "ThOr: deleting peernode %p with peer %p", (void*)p, (void*)(p->peer));
     delete (p->peer);
     delete p;
     p = m_head;
@@ -194,6 +196,7 @@
   m_peers_count++;
   p->peer = peer;
   p->next = m_head;
+  //CONSOLE.Warning(0,"ThOr: added peernode %p with peer %p", (void*)p, (void*)(p->peer));
   m_head = p;
   return 0;
 
@@ -612,8 +615,9 @@
   }
   if( mark < slots && data[mark].count == 1 ) m_dup_req_pieces++;
   CONSOLE.Debug("%d dup req pieces", (int)m_dup_req_pieces);
+  size_t datamarkidx = data[mark].idx;
   delete []data;
-  return (mark < slots) ? data[mark].idx : BTCONTENT.GetNPieces();
+  return (mark < slots) ? datamarkidx : BTCONTENT.GetNPieces();
 }
 
 void PeerList::FindValuedPieces(BitField &bf, const btPeer *proposer,
diff -ruN ../ctorrent-dnh3.3.2thor1/rate.cpp ./rate.cpp
--- ../ctorrent-dnh3.3.2thor1/rate.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ./rate.cpp	2015-12-03 16:42:39.000000000 +0100
@@ -25,6 +25,7 @@
   m_late = 0;
   m_ontime = m_update_nominal = 0;
   m_lastrate.lasttime = (time_t)0;
+  m_lastrate.recent = 0;
   m_nominal = DEFAULT_SLICE_SIZE / 8;  // minimum "acceptable" rate
 }
 
diff -ruN ../ctorrent-dnh3.3.2thor1/tracker.cpp ./tracker.cpp
--- ../ctorrent-dnh3.3.2thor1/tracker.cpp	2008-06-15 02:00:19.000000000 +0200
+++ ./tracker.cpp	2015-12-03 16:42:39.000000000 +0100
@@ -463,7 +463,7 @@
 
 int btTracker::SendRequest()
 {
-  char *event,*str_event[] = {"started","stopped","completed" };
+  const char *event,*str_event[] = {"started","stopped","completed" };
   char REQ_BUFFER[2*MAXPATHLEN];
   struct sockaddr_in addr;
 
diff -ruN ../ctorrent-dnh3.3.2thor1/version.m4 ./version.m4
--- ../ctorrent-dnh3.3.2thor1/version.m4	2015-12-01 18:12:38.247855344 +0100
+++ ./version.m4	2015-12-03 17:00:51.685910589 +0100
@@ -1,5 +1,5 @@
 m4_define([m4_PACKAGE_NAME],      [Enhanced CTorrent])
 m4_define([m4_PACKAGE_TARNAME],   [ctorrent])
-m4_define([m4_PACKAGE_VERSION],   [dnh3.3.2thor1])
+m4_define([m4_PACKAGE_VERSION],   [dnh3.3.2thor2])
 m4_define([m4_PACKAGE_BUGREPORT], [http://sourceforge.net/projects/dtorrent/ or dhol...@ct.boxmail.com])
 