Re: [PATCH 3/4] pci-arbiter: Fix a -Wstringop-truncation warning

2019-11-03 Thread James Clarke
On 3 Nov 2019, at 16:20, Samuel Thibault  wrote:
> 
> Hello,
> 
> Joan Lledó via Bug reports for the GNU Hurd, le dim. 03 nov. 2019 10:38:28 
> +0100, a ecrit:
>> * pci-arbiter/pcifs.c:
>>* create_dir_entry:
>>  Limit to NAME_SIZE-1 when calling strncpy().
>>  Finish entry->name with '\0'.
>>* create_fs_tree:
>>  memset() to 0 the directory entry.
>>  Limit to NAME_SIZE-1 all calls to
>>   snprintf() and strncpy().
> 
> Applied, thanks!
> 
>> @@ -206,7 +208,7 @@ create_fs_tree (struct pcifs * fs)
>>e_stat = list->stat;
>>e_stat.st_mode &= ~S_IROOT;   /* Remove the root mode */
>>memset (entry_name, 0, NAME_SIZE);
>> -  snprintf (entry_name, NAME_SIZE, "%04x", device->domain);
>> +  snprintf (entry_name, NAME_SIZE - 1, "%04x", device->domain);
> 
> Perhaps replace the whole memset with just setting
> entry_name[NAME_SIZE-1] = 0
> ? and ditto below.

snprintf guarantees NUL termination, so this now over-truncates.

James




Re: mapped-time interface

2019-07-19 Thread James Clarke
On 19 Jul 2019, at 23:15, Almudena Garcia  wrote:
> 
> Forwarded to bug-hurd maillist 
> 
> -- Forwarded message -
> De: Andrew Eggenberger 
> Date: vie., 19 jul. 2019 a las 20:11
> Subject: mapped-time interface
> To: 
> 
> 
> This gnu Mach reference manual page [1] refers to a “mapped-time interface” 
> for accessing the current time. A type is given and a usage example, but I 
> haven’t found a way to get a pointer to the value. Am I missing something? Is 
> the interface not available in user space?

See hurd's libshouldbeinlibc/maptime.[ch]; it's exposed through /dev/time and/or
the "time" mach device.

James

> 
> 1: https://www.gnu.org/software/hurd/gnumach-doc/Host-Time.html
> -- 
> Andrew Eggenberger




Re: Error with gnumach compilation

2019-02-15 Thread James Clarke
On 15 Feb 2019, at 13:33, Almudena Garcia  wrote:
> El vie., 15 feb. 2019 a las 14:28, James Clarke () 
> escribió:
>> On 15 Feb 2019, at 13:21, Samuel Thibault  wrote:
>> > 
>> > Almudena Garcia, le ven. 15 févr. 2019 14:13:17 +0100, a ecrit:
>> >> This is defined in imps/cpu_number.h , included in kern/cpu_number.h
>> > 
>> > cswitch.S includes i386/cpu_number.h, not kern/cpu_number.h
>> > 
>> > Really, make sure it gets defined, that's very most probably the issue,
>> > or else it's the CPU_NUMBER macro which is not actually valid assembly.
>> 
>> Well, I had checked before sending my email, and i386/cpu_number.h does not
>> define CPU_NUMBER, though I was unaware of kern/cpu_number.h. The latter does
>> define a cpu_number function, but it's a C header, not for use in assembly.
>> Your `smp` branch fixes this in [1] by making CPU_NUMBER a macro that calls
>> cpu_number (though I might suggest that you make the macro do nothing for
>> NCPUS==1). Honestly, this is not a hard bug to find, it took all of a few
>> minutes for me. You've had more than enough information from us to pinpoint 
>> the
>> problem; this mailing list is really not for simple programming questions 
>> like
>> this.
>> 
>> James
>> 
>> [1] 
>> https://github.com/AlmuHS/GNUMach_SMP/commit/371df36e565f4408737948ccc3d25acf2e1ccb57
> 
> I removed this macro tonight, to write a better solution.
> 
> https://github.com/AlmuHS/GNUMach_SMP/commit/342b7d62168bcaca944d01c0550b899da5d7f0c5
> 
> I've got to enabled correctly this macro, and feels that CPU_NUMBER assembly 
> macro is enabled, but compiler shows another error
> 
> ../i386/i386/cswitch.S: Mensajes del ensamblador:
> ../i386/i386/cswitch.S:46: Error: operandos inválidos (secciones *UND* y 
> *UND*) para «+»
> ../i386/i386/cswitch.S:64: Error: operandos inválidos (secciones *UND* y 
> *UND*) para «+»
> ../i386/i386/cswitch.S:116: Error: operandos inválidos (secciones *UND* y 
> *UND*) para «+»

This is now getting really tiring. It's complaining about both operands to +
being undefined, on lines where you're using CPU_NUMBER. CPU_NUMBER is defined
by you as:

#define CPU_NUMBER(reg) \
movzbl  APIC_LOCAL_VA+APIC_LOCAL_UNIT_ID+3,reg

Which of those operands for + look like they could be undefined symbols to you?
Seriously, you need to learn to fix these things for yourself.

James




Re: Error with gnumach compilation

2019-02-15 Thread James Clarke
On 15 Feb 2019, at 13:21, Samuel Thibault  wrote:
> 
> Almudena Garcia, le ven. 15 févr. 2019 14:13:17 +0100, a ecrit:
>> This is defined in imps/cpu_number.h , included in kern/cpu_number.h
> 
> cswitch.S includes i386/cpu_number.h, not kern/cpu_number.h
> 
> Really, make sure it gets defined, that's very most probably the issue,
> or else it's the CPU_NUMBER macro which is not actually valid assembly.

Well, I had checked before sending my email, and i386/cpu_number.h does not
define CPU_NUMBER, though I was unaware of kern/cpu_number.h. The latter does
define a cpu_number function, but it's a C header, not for use in assembly.
Your `smp` branch fixes this in [1] by making CPU_NUMBER a macro that calls
cpu_number (though I might suggest that you make the macro do nothing for
NCPUS==1). Honestly, this is not a hard bug to find, it took all of a few
minutes for me. You've had more than enough information from us to pinpoint the
problem; this mailing list is really not for simple programming questions like
this.

James

[1] 
https://github.com/AlmuHS/GNUMach_SMP/commit/371df36e565f4408737948ccc3d25acf2e1ccb57


Re: Enable SMP support

2018-06-08 Thread James Clarke
On 8 Jun 2018, at 18:06, Joshua Branson  wrote:
> 
> Almudena Garcia  writes:
> 
>> Hi all:
>> 
>> Reading this post in Hurd FAQ, I read that Mach has SMP support, but It was 
>> disabled because the Linux device drivers glue code isn't thread-safe.
>> 
>> https://www.gnu.org/software/hurd/faq/smp.html
>> 
>> Then, I ask . Are there any form to enable this SMP support in GNU Mach? (At 
>> my own risk).
>> 
>> I would like to test it.
>> 
>> P.D.: It's only a curiosity, not an urgency
> 
> That actually sounds like really fun!  haha.  I bet it would involve
> diving into the mach code...I believe that the Hurd currently uses
> drivers from linux via DDE.  A lot of that code was shoved into
> GNU/Mach.  You'd have to pull it out, or find the commandline option to
> not compile it in...but I don't know how to do it.

From gnumach's configfrag.ac:

> # Multiprocessor support is still broken.
> AH_TEMPLATE([MULTIPROCESSOR], [set things up for a uniprocessor])
> mach_ncpus=1
> AC_DEFINE_UNQUOTED([NCPUS], [$mach_ncpus], [number of CPUs])
> [if [ $mach_ncpus -gt 1 ]; then]
>   AC_DEFINE([MULTIPROCESSOR], [1], [set things up for a multiprocessor])
> [fi]

So enabling it is just a case of tweaking that constant or making it a
configurable option.

James




Re: Samuel: Do you have permission to _enable_ the gccgo patches again?

2018-03-26 Thread James Clarke
On 26 Mar 2018, at 18:20, Svante Signell  wrote:
> On Mon, 2018-03-26 at 19:06 +0200, Samuel Thibault wrote:
>> Svante Signell, on lun. 26 mars 2018 18:50:20 +0200, wrote:
>>> I just saw:
>>> https://anonscm.debian.org/viewvc/gcccvs?view=revision=10084
>>> 
>>> This is really a large step BACK:
>> 
>> It's not a step back, it's just fixing something that is completely
>> wrong.
> 
> What is really wrong is that Matthias Klose removed the Hurd patches. Adding
> them back is a piece of cake for you (or him), see #894080.
> 
> Next step is to send them to golang-dev for upstream adoption. gcc-patches 
> seems
> to be a null sink. Sending patches there gives no reply whatsoever: Recent
> examples gdb and gccgo.

The best place for them is https://go-review.googlesource.com against 
go-frontend.

James




Re: [PATCH] LwIP translator

2017-12-18 Thread James Clarke
On 18 Dec 2017, at 17:09, Samuel Thibault <samuel.thiba...@gnu.org> wrote:
> James Clarke, on lun. 18 déc. 2017 17:06:28 +, wrote:
>> On 18 Dec 2017, at 16:28, Samuel Thibault <samuel.thiba...@gnu.org> wrote:
>>> Joan Lledó, on lun. 18 déc. 2017 17:10:42 +0100, wrote:
>>>> 2017-12-18 14:46 GMT+01:00 Samuel Thibault <samuel.thiba...@gnu.org>:
>>>>> We need to know what is not yet in upstream, what will
>>>>> eventually get to upstream, and what may not get to
>>>>> upstream.
>>>> 
>>>> There're also some patches that are in upstream, I think it would be
>>>> simpler for me to first upgrade liblwip to 2.0.3 and then take that
>>>> version as start point to create the list of pending patches.
>>> 
>>> Ok.  In the meanwhile I have prepared an upload for lwip to Debian, I
>>> have borrowed some code I could find in your tree and hacked a bit, see 
>>> 
>>> git clone https://anonscm.debian.org/collab-maint/lwip.git
>> 
>> You missed the initial /git/ there (and in Vcs-Git).
> 
> It looks like dh-make is bogus then, that's what set this URL.

*sigh* Fixed in 2016, but no upload yet[1] (also [2] more recently).
I will ping the bug...

James

[1] 
https://anonscm.debian.org/cgit/collab-maint/dh-make.git/commit/?id=8bc8e0fab2f8293f8946e51768802532a6470d7f
[2] 
https://anonscm.debian.org/cgit/collab-maint/dh-make.git/commit/?id=b5a2903efa716e898c141397a6e471ec6fd9febb




Re: [PATCH] LwIP translator

2017-12-18 Thread James Clarke
On 18 Dec 2017, at 16:28, Samuel Thibault  wrote:
> Joan Lledó, on lun. 18 déc. 2017 17:10:42 +0100, wrote:
>> 2017-12-18 14:46 GMT+01:00 Samuel Thibault :
>>> We need to know what is not yet in upstream, what will
>>> eventually get to upstream, and what may not get to
>>> upstream.
>> 
>> There're also some patches that are in upstream, I think it would be
>> simpler for me to first upgrade liblwip to 2.0.3 and then take that
>> version as start point to create the list of pending patches.
> 
> Ok.  In the meanwhile I have prepared an upload for lwip to Debian, I
> have borrowed some code I could find in your tree and hacked a bit, see 
> 
> git clone https://anonscm.debian.org/collab-maint/lwip.git

You missed the initial /git/ there (and in Vcs-Git). Also there's already
src:lwipv6 in the archive which last saw an upload in 2012, so might be worth
unifying them if possible (it seems to be a fork from a while ago[1], probably
before IPv6 was added upstream).

Regards,
James

[1] https://sourceforge.net/p/view-os/code/HEAD/tree/trunk/lwipv6/




Re: [PATCH] eth-multiplxer: Implement ds_device_close()

2017-09-20 Thread James Clarke
On 20 Sep 2017, at 11:08, Joan Lledó  wrote:
> 
> ---
> eth-multiplexer/Makefile  | 2 +-
> eth-multiplexer/device_impl.c | 5 +
> 2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/eth-multiplexer/Makefile b/eth-multiplexer/Makefile
> index cefa0abd..2b19de6d 100644
> --- a/eth-multiplexer/Makefile
> +++ b/eth-multiplexer/Makefile
> @@ -29,6 +29,6 @@ LCLHDRS = ethernet.h util.h vdev.h netfs_impl.h
> HURDLIBS = ports ihash iohelp fshelp shouldbeinlibc netfs bpf
> LDLIBS = -lpthread
> 
> -CFLAGS += -I$(top_srcdir)/libbpf
> +CFLAGS += -I$(top_srcdir)/libbpf -ggdb -O0

I don't think you meant to include this hunk? :)

James




Re: [PATCH] LwIP translator

2017-08-24 Thread James Clarke
Hi Joan,

On 24 Aug 2017, at 18:49, Joan Lledó  wrote:
> ...
> diff --git a/config.make.in b/config.make.in
> index dfbf0c1..f74220e 100644
> --- a/config.make.in
> +++ b/config.make.in
> @@ -99,6 +99,13 @@ HAVE_SUN_RPC = @HAVE_SUN_RPC@
> # Whether we found libgcrypt.
> HAVE_LIBGCRYPT = @HAVE_LIBGCRYPT@
> 
> +# Whether we found libgcrypt.

Copy-paste error :)

James




Re: gnumach 1.7 issue

2016-06-04 Thread James Clarke
> On 4 Jun 2016, at 15:04, James Clarke <jrt...@jrtc27.com> wrote:
> 
>> On 3 Jun 2016, at 19:36, Samuel Thibault <samuel.thiba...@gnu.org> wrote:
>> 
>> Samuel Thibault, on Fri 03 Jun 2016 20:35:13 +0200, wrote:
>>> Adam Richards, on Fri 03 Jun 2016 19:30:40 +0100, wrote:
>>>> On rebooting and reverting to  gnumach-1.6-486.gz this is not an issue.
>>> 
>>> It could be worth bisecting the issue on the upstream git repository.
>> 
>> (there are really not much difference between the last 1.6 snapshot
>> 2:1.6+git20160502-1 and the 1.7 snapshot 2:1.7+git20160522-1)
> 
> 1.7 seems to break running Xorg for me (running latest MATE). I can’t
> move the mouse, opening htop via ssh hangs at a black screen, and then
> subsequent ssh attempts can’t connect, so I have to perform a hard
> reboot. If I don’t send *any* mouse events, htop loads, but locks up
> as soon as I move the mouse. MATE’s clock does continue to update every
> second, so some things still work. Booting 1.6+git20160502 instead works
> fine.
> 
> Looking at the changes[1], some variables types have been changed (many
> int -> unsigned, and a few int -> dev_t in xen/console.{c,h}), but I
> doubt that’s it. My guess is that it’s one of:
> 
> * b4d07d3 Fix pageout deadlock
> * 1db202e vm_map: back allocations with a red-black tree
> 
> I shall try building a kernel with 1db202e reverted.

Reverting 1db202e indeed fixes it. I also rebuilt the same source tree with
the patch commented out in d/p/series just to be sure there wasn’t a
toolchain issue, and that build *doesn’t* work, so 1db202e is either buggy
or exposing other bugs.

James



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: Coppyright assignment (was: RFC: Lightweight synchronization mechanism for gnumach)

2016-03-03 Thread James Clarke
> On 3 Mar 2016, at 14:47, Justus Winter <4win...@informatik.uni-hamburg.de> 
> wrote:
> Quoting Samuel Thibault (2016-02-28 22:29:52)
>> We'll probably want to have a copyright assignement from you for this
>> work.  I have pasted below the form for this.  If you're to contribute
>> more to GNU/Hurd, it'd be useful that you already assign copyright for
>> GNUMACH, HURD, GLIBC.
> 
> You forgot MIG.

Unless it’s changed since I assigned copyright last year, MIG is an individual 
project and you can’t assign copyright for it.

James


signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [PATCH] Check for a return value in netfs_make_peropen before using it in netfs_make_protid.

2016-02-07 Thread James Clarke
On 7 Feb 2016, at 14:51, Flavio Cruz  wrote:
> -  cred = netfs_make_protid (netfs_make_peropen (node, flags, cookie2),
> - user);
> +  po = netfs_make_peropen (node, flags, cookie2);
> +  if (! po)
> +return errno;

You need to free user.

James



signature.asc
Description: Message signed with OpenPGP using GPGMail


[PATCH] Fixed leaks in _netfs_translator_callback2_fn

2016-02-07 Thread James Clarke
* libnetfs/trans-callback.c (_netfs_translator_callback2_fn): Fixed leaking
iouser and peropen structs on error.
---
 libnetfs/trans-callback.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libnetfs/trans-callback.c b/libnetfs/trans-callback.c
index 3f1aee6..7816bd1 100644
--- a/libnetfs/trans-callback.c
+++ b/libnetfs/trans-callback.c
@@ -67,7 +67,10 @@ _netfs_translator_callback2_fn (void *cookie1, void 
*cookie2, int flags,
 
   po = netfs_make_peropen (node, flags, cookie2);
   if (! po)
-return errno;
+{
+  iohelp_free_iouser (user);
+  return errno;
+}
 
   cred = netfs_make_protid (po, user);
   if (cred)
@@ -79,6 +82,7 @@ _netfs_translator_callback2_fn (void *cookie1, void *cookie2, 
int flags,
 }
   else
 {
+  netfs_release_peropen (po);
   iohelp_free_iouser (user);
   return errno;
 }
-- 
2.7.1




Re: [PATCH] Fixed leaks in _netfs_translator_callback2_fn

2016-02-07 Thread James Clarke
> On 7 Feb 2016, at 23:10, Samuel Thibault  wrote:
> Flávio Cruz, on Sun 07 Feb 2016 23:57:25 +0100, wrote:
>> Maybe here we should do it as follows:
>> 
>> err = errno;
>> netfs_release_peropen (po);
>> iohelp_free_iouser (user);
>> return err;
> 
> Yes, you never know what they could be doing to errno.
> 
> Samuel

Does that include changing

>   po = netfs_make_peropen (node, flags, cookie2);
>   if (! po)
> -return errno;
> +{
> +  iohelp_free_iouser (user);
> +  return errno;
> +}

to

>   po = netfs_make_peropen (node, flags, cookie2);
>   if (! po)
> -return errno;
> +{
> +  err = errno;
> +  iohelp_free_iouser (user);
> +  return err;
> +}

or can we assume the free functions don’t set errno?

James


signature.asc
Description: Message signed with OpenPGP using GPGMail


[PATCH v2] Fixed leaks in _netfs_translator_callback2_fn

2016-02-07 Thread James Clarke
* libnetfs/trans-callback.c (_netfs_translator_callback2_fn): Fixed leaking
iouser and peropen structs on error.
---
 libnetfs/trans-callback.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/libnetfs/trans-callback.c b/libnetfs/trans-callback.c
index 3f1aee6..ed21aa2 100644
--- a/libnetfs/trans-callback.c
+++ b/libnetfs/trans-callback.c
@@ -67,7 +67,11 @@ _netfs_translator_callback2_fn (void *cookie1, void 
*cookie2, int flags,
 
   po = netfs_make_peropen (node, flags, cookie2);
   if (! po)
-return errno;
+{
+  err = errno;
+  iohelp_free_iouser (user);
+  return err;
+}
 
   cred = netfs_make_protid (po, user);
   if (cred)
@@ -79,8 +83,10 @@ _netfs_translator_callback2_fn (void *cookie1, void 
*cookie2, int flags,
 }
   else
 {
+  err = errno;
+  netfs_release_peropen (po);
   iohelp_free_iouser (user);
-  return errno;
+  return err;
 }
 }
 
-- 
2.7.1




Re: PATCH: Hurd FTBFS with perl 5.22

2016-01-04 Thread James Clarke
On 4 Jan 2016, at 22:13, Samuel Thibault  wrote:
> Svante Signell, on Mon 04 Jan 2016 23:09:00 +0100, wrote:
>> Obviously the !defined(@val) is
>> no longer allowed, and I don't know how to rewrite that condition.
> 
> Perhaps juste !@val?
> I have no idea, I don't know perl. But probably worth trying and
> investigating, and I believe you can do it.

It is indeed just !@val (this use of defined was deprecated in 5.6.1 and raised 
warnings since 5.16[1]). Here's the simplified patch (untested):

> Index: hurd-0.7/libdde-linux26/lib/src/kernel/timeconst.pl
> ===
> --- hurd-0.7.orig/libdde-linux26/lib/src/kernel/timeconst.pl
> +++ hurd-0.7/libdde-linux26/lib/src/kernel/timeconst.pl
> @@ -369,10 +369,10 @@ if ($hz eq '--can') {
>   die "Usage: $0 HZ\n";
>   }
> 
>   @val = @{$canned_values{$hz}};
> - if (!defined(@val)) {
> + if (!@val) {
>   @val = compute_values($hz);
>   }
>   output($hz, @val);
>  }
>  exit 0;


James

[1] 
https://metacpan.org/pod/release/RJBS/perl-5.22.0/pod/perldelta.pod#defined-array-and-defined-hash-are-now-fatal-errors



signature.asc
Description: Message signed with OpenPGP using GPGMail


[PATCH] sulogin: Use fallback method on the Hurd for detecting consoles

2015-10-09 Thread James Clarke
Signed-off-by: James Clarke <jrt...@jrtc27.com>
---
 login-utils/sulogin-consoles.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/login-utils/sulogin-consoles.c b/login-utils/sulogin-consoles.c
index 39d24d2..1b05b38 100644
--- a/login-utils/sulogin-consoles.c
+++ b/login-utils/sulogin-consoles.c
@@ -612,6 +612,14 @@ int detect_consoles(const char *device, int fallback, 
struct list_head *consoles
 #ifdef TIOCGDEV
unsigned int devnum;
 #endif
+#ifdef __GNU__
+   /*
+* The Hurd always gives st_rdev as 0, which causes this
+* method to select the first terminal it finds.
+*/
+   close(fd);
+   goto fallback;
+#endif
DBG(dbgprint("trying device/fallback file descriptor"));
 
if (fstat(fd, ) < 0) {
-- 
2.5.3




Re: Hurd Login Utility

2015-09-30 Thread James Clarke

This seems to be caused by a segfault, so I imagine this is not intended! 
Tracking down the cause...

James

On Tue, 29 Sep 2015, James Clarke wrote:


Whilst looking through the code in utils/login.c, I noticed a security issue. 
Even if --paranoid is set, if you give it a UID that doesn’t exist (login 
treats it as a UID if the first character is a digit, with no fallback to 
treating it as a username), it will exit without prompting for a password (and 
of course prompts for a password if it is a valid UID!). Is this intentional?
I was also thinking that login should prompt for a username if not provided on 
the command line, as with Linux and BSD. This would in fact let us get rid of 
/bin/loginpr (currently we go via bash just to prompt for a username, and then 
exec login, which seems unnecessary). Thoughts?

James



[PATCH 1/1] Add missing null checks in libshouldbeinlibc

2015-09-30 Thread James Clarke
The getpwnam_r and similar functions only return non-zero on error, but not
finding the given name/UID/GID does not count as an error. When they return 0,
the value of the result (*result when looking at the arguments in the man pages)
still needs to be checked for null.

* libshouldbeinlibc/idvec-rep.c (lookup_uid): Check result for null.
(lookup_gid): Likewise.
* libshouldbeinlibc/idvec-verify.c (verify_passwd): Likewise.
(verify_id): Likewise.
---
 libshouldbeinlibc/idvec-rep.c| 4 ++--
 libshouldbeinlibc/idvec-verify.c | 7 ---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/libshouldbeinlibc/idvec-rep.c b/libshouldbeinlibc/idvec-rep.c
index 16408a4..4fc7712 100644
--- a/libshouldbeinlibc/idvec-rep.c
+++ b/libshouldbeinlibc/idvec-rep.c
@@ -129,7 +129,7 @@ lookup_uid (uid_t uid)
 {
   char buf[1024];
   struct passwd _pw, *pw;
-  if (getpwuid_r (uid, &_pw, buf, sizeof buf, ) == 0)
+  if (getpwuid_r (uid, &_pw, buf, sizeof buf, ) == 0 && pw)
 return strdup (pw->pw_name);
   else
 return 0;
@@ -141,7 +141,7 @@ lookup_gid (gid_t gid)
 {
   char buf[1024];
   struct group _gr, *gr;
-  if (getgrgid_r (gid, &_gr, buf, sizeof buf, ) == 0)
+  if (getgrgid_r (gid, &_gr, buf, sizeof buf, ) == 0 && gr)
 return strdup (gr->gr_name);
   else
 return 0;
diff --git a/libshouldbeinlibc/idvec-verify.c b/libshouldbeinlibc/idvec-verify.c
index 4d9b6db..4019a04 100644
--- a/libshouldbeinlibc/idvec-verify.c
+++ b/libshouldbeinlibc/idvec-verify.c
@@ -107,7 +107,8 @@ verify_passwd (const char *password,
  return pw->pw_passwd;
}
 
-  if (getpwuid_r (wheel_uid, &_pw, lookup_buf, sizeof lookup_buf, ))
+  if (getpwuid_r (wheel_uid, &_pw, lookup_buf, sizeof lookup_buf, )
+ || ! pw)
return errno ?: EINVAL;
 
   sys_encrypted = check_shadow (pw);
@@ -266,7 +267,7 @@ verify_id (uid_t id, int is_group, int multiple,
  {
struct group _gr, *gr;
if (getgrgid_r (id, &_gr, id_lookup_buf, sizeof id_lookup_buf, )
-   == 0)
+   == 0 && gr)
  {
if (!gr->gr_passwd || !*gr->gr_passwd)
  return (*verify_fn) ("", id, 1, gr, verify_hook);
@@ -278,7 +279,7 @@ verify_id (uid_t id, int is_group, int multiple,
  {
struct passwd _pw, *pw;
if (getpwuid_r (id, &_pw, id_lookup_buf, sizeof id_lookup_buf, )
-   == 0)
+   == 0 && pw)
  {
if (strcmp (pw->pw_passwd, SHADOW_PASSWORD_STRING) == 0)
  {
-- 
2.5.3




[PATCH 0/1] Add missing null checks in libshouldbeinlibc

2015-09-30 Thread James Clarke
This stops /bin/login segfaulting when giving it a bad UID. However, the fact
that the UID does not exist is still leaked, since
libshouldbeinlibc/idvec-verify.c:verify_id falls back on asking for the root
password, and indicates this by changing the prompt to "Password for root".

James Clarke (1):
  Add missing null checks in libshouldbeinlibc

 libshouldbeinlibc/idvec-rep.c| 4 ++--
 libshouldbeinlibc/idvec-verify.c | 7 ---
 2 files changed, 6 insertions(+), 5 deletions(-)

-- 
2.5.3




Hurd Login Utility

2015-09-29 Thread James Clarke
Whilst looking through the code in utils/login.c, I noticed a security issue. 
Even if --paranoid is set, if you give it a UID that doesn’t exist (login 
treats it as a UID if the first character is a digit, with no fallback to 
treating it as a username), it will exit without prompting for a password (and 
of course prompts for a password if it is a valid UID!). Is this intentional?
I was also thinking that login should prompt for a username if not provided on 
the command line, as with Linux and BSD. This would in fact let us get rid of 
/bin/loginpr (currently we go via bash just to prompt for a username, and then 
exec login, which seems unnecessary). Thoughts?

James




Re: (Newbie question) Failed building hurd on Debian GNU Hurd

2015-09-24 Thread James Clarke
That’s an issue with the current Hurd repo on Savannah. You need this patch 
http://darnassus.sceen.net/gitweb/teythoon/packaging/hurd.git/blob/HEAD:/debian/patches/proc-task-notify-0005-proc-fix-build.patch,
 from what I gather (at least it fixes it for me).

James

> On 24 Sep 2015, at 13:50, Braione Pietro  
> wrote:
> 
> Hello Samuel.
>> Il giorno 15/set/2015, alle ore 17:28, Samuel Thibault 
>>  ha scritto:
>> 
>> Hello,
>> 
>> Braione Pietro, le Tue 15 Sep 2015 12:31:16 +, a écrit :
>>> ../libports/libports.so: undefined reference to 
>>> `mach_port_set_protected_payload’
>>> ../libports/libports.so: undefined reference to 
>>> `mach_port_clear_protected_payload’
>> 
>> These are new features which were added to GNU Mach after the 2013
>> release.  To get them you need at least a newer glibc in addition to the
>> newer gnumach.  
> 
> Since I want to be able to build the head version, I downloaded the 2015 
> Debian distribution and rebuilt everything (except for mig, but I don’t think 
> this is the issue). I strictly followed the instructions at 
> http://www.gnu.org/software/hurd/microkernel/mach/gnumach/building.html and 
> http://www.gnu.org/software/hurd/hurd/building.html, in their non-Debian 
> variant, plus make install of gnumach. It still fails, but now while linking 
> the proc server:
> 
> …
> make -C proc all
> …
> mgt.o: In function `S_mach_notify_new_task’:
> /root/hurd/build/proc/../../proc/mgt.c:1081: undefined reference to 
> `mach_notify_new_task’
> collect2: error: ld returned 1 exit status
> 
> Can I suppose that I need a newer glibc? If yes, which of the many branches 
> should I check out?
> Best,
> Pietro




Re: [PATCH] Add support for ANSI.SYS SCP/RCP escape codes

2015-09-12 Thread James Clarke
Currently the new "apt" command-line tool relies on them (when it installs 
packages, it keeps a progress bar at the bottom, but shows a log above), 
although they've recently patched it to use the more widely-available ESC 7/8 
(though the patch is not yet in sid).
I chose SCP/RCP, because ANSI.SYS called them Save/Restore Cursor Position, and 
everywhere seems to refer to them by that acronym (they're even listed on 
Wikipedia in the table of codes, despite not being in ECMA-48, nor being 
implemented by the VT100).

James

> On 12 Sep 2015, at 10:15, Samuel Thibault <samuel.thiba...@gnu.org> wrote:
> 
> Hello,
> 
> James Clarke, le Sat 12 Sep 2015 00:42:05 +0100, a écrit :
>> This adds support for CSI s and u, which are equivalent to ESC 7 and 8,
>> saving/restoring the cursor position.
> 
> Just curious: in which case did you see it a problem, or is it just for
> completeness? (which can always become convenient anyway)
> 
>> +case 's':/* ANSI.SYS: Save cursor and attributes.  */
>> +  /* Save cursor position: .  */
> 
> Why scp?  AIUI from terminfo(5), scp is the capname for set_color_pair,
> the tcap code for save cursor position is sc.
> 
>> +  display->cursor.saved_x = user->cursor.col;
>> +  display->cursor.saved_y = user->cursor.row;
>> +  break;
>> +case 'u':/* ANSI.SYS: Restore cursor and attributes.  */
>> +  /* Restore cursor position: .  */
>> +  user->cursor.col = display->cursor.saved_x;
>> +  user->cursor.row = display->cursor.saved_y;
>> +  /* In case the screen was larger before:  */
>> +  limit_cursor (display);
>> +  break;
>> case 'l':
>>   /* Reset mode.  */
>>   for (i = 0; i < parse->nparams; i++)
>> -- 
>> 2.5.1
>> 
>> 



[PATCH] Add support for ANSI.SYS SCP/RCP escape codes

2015-09-11 Thread James Clarke
This adds support for CSI s and u, which are equivalent to ESC 7 and 8,
saving/restoring the cursor position.

* console/display.c (handle_esc_bracket): Added support for CSI s and u.
---
 console/display.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/console/display.c b/console/display.c
index eb420fd..98c70f5 100644
--- a/console/display.c
+++ b/console/display.c
@@ -1210,6 +1210,18 @@ handle_esc_bracket (display_t display, char op)
   user->cursor.col -= (parse->params[0] ?: 1);
   limit_cursor (display);
   break;
+case 's':  /* ANSI.SYS: Save cursor and attributes.  */
+  /* Save cursor position: .  */
+  display->cursor.saved_x = user->cursor.col;
+  display->cursor.saved_y = user->cursor.row;
+  break;
+case 'u':  /* ANSI.SYS: Restore cursor and attributes.  */
+  /* Restore cursor position: .  */
+  user->cursor.col = display->cursor.saved_x;
+  user->cursor.row = display->cursor.saved_y;
+  /* In case the screen was larger before:  */
+  limit_cursor (display);
+  break;
 case 'l':
   /* Reset mode.  */
   for (i = 0; i < parse->nparams; i++)
-- 
2.5.1




[PATCH] Install port-deref-deferred.h header for ports.h

2015-09-10 Thread James Clarke
* libports/Makefile (installhdrs): Add port-deref-deferred.h for ports.h
---
 libports/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libports/Makefile b/libports/Makefile
index b8b82ee..af881f8 100644
--- a/libports/Makefile
+++ b/libports/Makefile
@@ -38,7 +38,7 @@ SRCS = create-bucket.c create-class.c \
  claim-right.c transfer-right.c create-port-noinstall.c create-internal.c \
  interrupted.c extern-inline.c port-deref-deferred.c
 
-installhdrs = ports.h
+installhdrs = ports.h port-deref-deferred.h
 
 HURDLIBS= ihash
 LDLIBS += -lpthread
-- 
2.5.1




Re: [RFC] Fix printk not handling ANSI escape codes

2015-08-29 Thread James Clarke
Yes, that was a mistake, it should call kd_putc_esc.

James

 On 29 Aug 2015, at 11:55, Samuel Thibault samuel.thiba...@gnu.org wrote:
 
 James Clarke, le Sat 29 Aug 2015 11:53:57 +0100, a écrit :
 I never gave that as an option?
 
 ? I understand even less.  AIUI, before kd_start had esc support; with
 your patch it doesn't have any more.  Probably Mach/Hurd never made use
 of it, but that's probably not a reason for removing the support, you
 never know who might want to make use of it someday.
 
 Samuel



Re: [RFC] Fix printk not handling ANSI escape codes

2015-08-29 Thread James Clarke
Yes, it should be; since I didn't catch that I'd guess nothing in Mach/Hurd is 
actually using escape codes.

I was trying to decide whether putc should handle escape codes or whether it 
should be left as outputting raw bytes to the display. As you can see I chose 
the latter, but do you think that was the right decision?

James

 On 29 Aug 2015, at 11:20, Samuel Thibault samuel.thiba...@gnu.org wrote:
 
 James Clarke, le Sat 29 Aug 2015 11:15:29 +0100, a écrit :
 @@ -1069,33 +1068,12 @@ kdstart(struct tty *tp)
break;
if ((tp-t_outq.c_cc = 0) || (ch = getc(tp-t_outq)) == -1)
break;
 -c = ch;
/*
 * Drop priority for long screen updates. ttstart() calls us at
 * spltty.
 */
o_pri = splsoftclock();/* block timeout */
 -if (c == (K_ESC)) {
 ...
 -}
 +kd_putc(ch);
 
 I don't understand: shouldn't that be kd_putc_esc?
 
 Samuel



Re: [RFC] Fix printk not handling ANSI escape codes

2015-08-29 Thread James Clarke
I never gave that as an option?

James

 On 29 Aug 2015, at 11:44, Samuel Thibault samuel.thiba...@gnu.org wrote:
 
 James Clarke, le Sat 29 Aug 2015 11:42:38 +0100, a écrit :
 Yes, it should be; since I didn't catch that I'd guess nothing in Mach/Hurd 
 is actually using escape codes.
 
 I was trying to decide whether putc should handle escape codes or whether it 
 should be left as outputting raw bytes to the display. As you can see I 
 chose the latter, but do you think that was the right decision?
 
 Well, I don't see why we would want to forbid the use of ESC sequences
 in Mach/Hurd :)
 
 Samuel



[RFC] Fix printk not handling ANSI escape codes

2015-08-29 Thread James Clarke
* i386/i386at/kd.c (kdstart): Moved escape sequence handling to new
kd_putc_esc function.
(kd_putc_esc): New function with logic from kdstart.
(kdcnputc): Call kd_putc_esc rather than kd_putc to allow for ANSI
escape codes.
* i386/i386at/kd.h (kd_putc_esc): New function.
---
 i386/i386at/kd.c | 63 +++-
 i386/i386at/kd.h |  1 +
 2 files changed, 40 insertions(+), 24 deletions(-)

diff --git a/i386/i386at/kd.c b/i386/i386at/kd.c
index 5656e83..52ba0b6 100644
--- a/i386/i386at/kd.c
+++ b/i386/i386at/kd.c
@@ -1059,7 +1059,6 @@ kdstart(struct tty *tp)
 {
spl_t   o_pri;
int ch;
-   unsigned char   c;
 
if (tp-t_state  TS_TTSTOP)
return;
@@ -1069,33 +1068,12 @@ kdstart(struct tty *tp)
break;
if ((tp-t_outq.c_cc = 0) || (ch = getc(tp-t_outq)) == -1)
break;
-   c = ch;
/*
 * Drop priority for long screen updates. ttstart() calls us at
 * spltty.
 */
o_pri = splsoftclock(); /* block timeout */
-   if (c == (K_ESC)) {
-   if (esc_spt == esc_seq) {
-   *(esc_spt++)=(K_ESC);
-   *(esc_spt) = '\0';
-   } else {
-   kd_putc((K_ESC));
-   esc_spt = esc_seq;
-   }
-   } else {
-   if (esc_spt - esc_seq) {
-   if (esc_spt - esc_seq  K_MAXESC - 1)
-   esc_spt = esc_seq;
-   else {
-   *(esc_spt++) = c;
-   *(esc_spt) = '\0';
-   kd_parseesc();
- }
-   } else {
-   kd_putc(c);
-   }
-   }
+   kd_putc(ch);
splx(o_pri);
}
if (tp-t_outq.c_cc = TTLOWAT(tp)) {
@@ -1237,6 +1215,43 @@ kd_bellon(void)
 
 /*
  *
+ * Function kd_putc_esc():
+ *
+ * This function puts a character on the screen, handling escape
+ * sequences.
+ *
+ * input   : character to be displayed (or part of an escape code)
+ * output  : character is displayed, or some action is taken
+ *
+ */
+void
+kd_putc_esc(u_char c)
+{
+   if (c == (K_ESC)) {
+   if (esc_spt == esc_seq) {
+   *(esc_spt++)=(K_ESC);
+   *(esc_spt) = '\0';
+   } else {
+   kd_putc((K_ESC));
+   esc_spt = esc_seq;
+   }
+   } else {
+   if (esc_spt - esc_seq) {
+   if (esc_spt - esc_seq  K_MAXESC - 1)
+   esc_spt = esc_seq;
+   else {
+   *(esc_spt++) = c;
+   *(esc_spt) = '\0';
+   kd_parseesc();
+   }
+   } else {
+   kd_putc(c);
+   }
+   }
+}
+
+/*
+ *
  * Function kd_putc():
  *
  * This function simply puts a character on the screen.  It does some
@@ -2950,7 +2965,7 @@ kdcnputc(dev_t dev, int c)
/* Note that tab is handled in kd_putc */
if (c == '\n')
kd_putc('\r');
-   kd_putc(c);
+   kd_putc_esc(c);
 
return 0;
 }
diff --git a/i386/i386at/kd.h b/i386/i386at/kd.h
index 0cfed69..60cee7e 100644
--- a/i386/i386at/kd.h
+++ b/i386/i386at/kd.h
@@ -702,6 +702,7 @@ extern void kd_setleds1 (u_char);
 extern void kd_setleds2 (void);
 extern void cnsetleds (u_char);
 extern void kdreboot (void);
+extern void kd_putc_esc (u_char);
 extern void kd_putc (u_char);
 extern void kd_parseesc (void);
 extern void kd_down (void);
-- 
2.5.1




[PATCH] Fix race condition in ext2fs when remounting

2015-08-27 Thread James Clarke
On some systems, ext2fs.static would regularly hang at startup, as a
race condition meant it would process paging requests while remounting.
To fix this, libpager has been altered to allow inhibiting and resuming
its worker threads, and ext2fs uses this to inhibit paging while
remounting.

* console/pager.c (pager_requests): New variable.
(user_pager_init): Updated call to pager_start_workers to use new
pager_requests variable.
* daemons/runsystem.sh: Removed artificial delay working around the race
condition.
* ext2fs/ext2fs.c (diskfs_reload_global_state): Call new
inhibit_ext2_pager and resume_ext2_pager functions, and leave sblock as
non-NULL so it will be munmapped.
* ext2fs/ext2fs.h (inhibit_ext2_pager,resume_ext2_pager): New functions.
* ext2fs/pager.c (file_pager_requests): New variable.
(create_disk_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_ext2_pager,resume_ext2_pager): New functions.
* fatfs/fatfs.h (inhibit_fat_pager,resume_fat_pager): New functions.
* fatfs/pager.c (file_pager_requests): New variable.
(create_fat_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_fat_pager,resume_fat_pager): New functions.
* libdiskfs/disk-pager.c (diskfs_disk_pager_requests): New variable.
(diskfs_start_disk_pager): Updated call to pager_start_workers to use
new diskfs_disk_pager_requests variable.
* libdiskfs/diskfs-pager.h (diskfs_disk_pager_requests): New variable.
* libpager/demuxer.c (struct pager_requests): Renamed struct requests to
struct pager_requests. Replaced queue with queue_in and queue_out
pointers. Added inhibit_wakeup field.
(pager_demuxer): Updated to use new queue_in/queue_out pointers. Only
wake up workers if not inhibited.
(worker_func): Updated to use new queue_in/queue_out pointers. Final
worker thread to sleep notifies the inhibit_wakeup condition variable.
(pager_start_workers): Added out parameter for the requests instance.
Allocate heap space shared by both queues. Initialise new inhibit_wakeup
condition.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/pager.h (struct pager_requests): Public forward definition.
(pager_start_workers): Added out parameter for the requests instance.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/queue.h (queue_empty): New function.
* storeio/pager.c (pager_requests): New variable.
(init_dev_paging): Updated call to pager_start_workers to use new
pager_requests variable.
---
 console/pager.c  |   3 +-
 daemons/runsystem.sh |   3 --
 ext2fs/ext2fs.c  |  12 -
 ext2fs/ext2fs.h  |   6 +++
 ext2fs/pager.c   |  33 -
 fatfs/fatfs.h|   2 +
 fatfs/pager.c|  33 -
 libdiskfs/disk-pager.c   |   3 +-
 libdiskfs/diskfs-pager.h |   1 +
 libpager/demuxer.c   | 119 ---
 libpager/pager.h |  28 ++-
 libpager/queue.h |   8 
 storeio/pager.c  |   3 +-
 13 files changed, 227 insertions(+), 27 deletions(-)

diff --git a/console/pager.c b/console/pager.c
index 5e13ba4..818e49d 100644
--- a/console/pager.c
+++ b/console/pager.c
@@ -42,6 +42,7 @@ struct user_pager_info
 
 /* We need a separate bucket for the pager ports.  */
 static struct port_bucket *pager_bucket;
+static struct pager_requests *pager_requests;
 
 
 /* Implement the pager_clear_user_data callback from the pager library. */
@@ -133,7 +134,7 @@ user_pager_init (void)
 error (5, errno, Cannot create pager bucket);
 
   /* Start libpagers worker threads.  */
-  err = pager_start_workers (pager_bucket);
+  err = pager_start_workers (pager_bucket, pager_requests);
   if (err)
 error (5, err, Cannot start pager worker threads);
 }
diff --git a/daemons/runsystem.sh b/daemons/runsystem.sh
index ae25a7d..5d0ad01 100644
--- a/daemons/runsystem.sh
+++ b/daemons/runsystem.sh
@@ -118,9 +118,6 @@ esac
 /hurd/mach-defpager
 
 # This is necessary to make stat / return the correct device ids.
-# Work around a race condition (probably in the root translator).
-for i in `seq 1 10` ; do : ; done # XXX
-
 fsysopts / --update --readonly
 
 # Finally, start the actual init.
diff --git a/ext2fs/ext2fs.c b/ext2fs/ext2fs.c
index d0fdfe7..03c9eed 100644
--- a/ext2fs/ext2fs.c
+++ b/ext2fs/ext2fs.c
@@ -207,10 +207,20 @@ main (int argc, char **argv)
 error_t
 diskfs_reload_global_state ()
 {
+  error_t err;
+
   pokel_flush (global_pokel);
   pager_flush (diskfs_disk_pager, 1);
-  sblock = NULL;
+
+  /* libdiskfs is not responsible for inhibiting paging.  */
+  err = inhibit_ext2_pager ();
+  if (err)
+return err;
+
   get_hypermetadata ();
   map_hypermetadata ();
+
+  resume_ext2_pager ();
+
   return 0;
 }
diff --git a/ext2fs/ext2fs.h b/ext2fs/ext2fs.h
index 96d8e9d..a744685 100644
--- a/ext2fs/ext2fs.h
+++ b/ext2fs/ext2fs.h
@@ -201,6 +201,12 @@ struct user_pager_info
 /* Set up the disk pager.  */
 void 

Re: [PATCH v3] Fix race condition in ext2fs when remounting

2015-08-15 Thread James Clarke
Not sure how I missed that; will grep for any others.

James

 On 14 Aug 2015, at 13:16, Justus Winter 4win...@informatik.uni-hamburg.de 
 wrote:
 
 Quoting James Clarke (2015-07-23 19:33:42)
 diff --git a/libdiskfs/disk-pager.c b/libdiskfs/disk-pager.c
 index 008aa2d..33b109c 100644
 --- a/libdiskfs/disk-pager.c
 +++ b/libdiskfs/disk-pager.c
 @@ -24,6 +24,7 @@
 __thread struct disk_image_user *diskfs_exception_diu;
 
 struct pager *diskfs_disk_pager;
 +struct requests *diskfs_disk_pager_requests;
 
 struct pager_requests
 
 Justus



[PATCH v3] Fix race condition in ext2fs when remounting

2015-07-23 Thread James Clarke
On some systems, ext2fs.static would regularly hang at startup, as a
race condition meant it would process paging requests while remounting.
To fix this, libpager has been altered to allow inhibiting and resuming
its worker threads, and ext2fs uses this to inhibit paging while
remounting.

* console/pager.c (pager_requests): New variable.
(user_pager_init): Updated call to pager_start_workers to use new
pager_requests variable.
* daemons/runsystem.sh: Removed artificial delay working around the race
condition.
* ext2fs/ext2fs.c (diskfs_reload_global_state): Call new
inhibit_ext2_pager and resume_ext2_pager functions, and leave sblock as
non-NULL so it will be munmapped.
* ext2fs/ext2fs.h (inhibit_ext2_pager,resume_ext2_pager): New functions.
* ext2fs/pager.c (file_pager_requests): New variable.
(create_disk_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_ext2_pager,resume_ext2_pager): New functions.
* fatfs/fatfs.h (inhibit_fat_pager,resume_fat_pager): New functions.
* fatfs/pager.c (file_pager_requests): New variable.
(create_fat_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_fat_pager,resume_fat_pager): New functions.
* libdiskfs/disk-pager.c (diskfs_disk_pager_requests): New variable.
(diskfs_start_disk_pager): Updated call to pager_start_workers to use
new diskfs_disk_pager_requests variable.
* libdiskfs/diskfs-pager.h (diskfs_disk_pager_requests): New variable.
* libpager/demuxer.c (struct pager_requests): Renamed struct requests to
struct pager_requests. Replaced queue with queue_in and queue_out
pointers. Added inhibit_wakeup field.
(pager_demuxer): Updated to use new queue_in/queue_out pointers. Only
wake up workers if not inhibited.
(worker_func): Updated to use new queue_in/queue_out pointers. Final
worker thread to sleep notifies the inhibit_wakeup condition variable.
(pager_start_workers): Added out parameter for the requests instance.
Allocate heap space shared by both queues. Initialise new inhibit_wakeup
condition.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/pager.h (struct pager_requests): Public forward definition.
(pager_start_workers): Added out parameter for the requests instance.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/queue.h (queue_empty): New function.
* storeio/pager.c (pager_requests): New variable.
(init_dev_paging): Updated call to pager_start_workers to use new
pager_requests variable.
---
 console/pager.c  |   3 +-
 daemons/runsystem.sh |   3 --
 ext2fs/ext2fs.c  |  12 -
 ext2fs/ext2fs.h  |   6 +++
 ext2fs/pager.c   |  33 -
 fatfs/fatfs.h|   2 +
 fatfs/pager.c|  33 -
 libdiskfs/disk-pager.c   |   3 +-
 libdiskfs/diskfs-pager.h |   1 +
 libpager/demuxer.c   | 119 ---
 libpager/pager.h |  28 ++-
 libpager/queue.h |   8 
 storeio/pager.c  |   3 +-
 13 files changed, 227 insertions(+), 27 deletions(-)

diff --git a/console/pager.c b/console/pager.c
index 5e13ba4..818e49d 100644
--- a/console/pager.c
+++ b/console/pager.c
@@ -42,6 +42,7 @@ struct user_pager_info
 
 /* We need a separate bucket for the pager ports.  */
 static struct port_bucket *pager_bucket;
+static struct pager_requests *pager_requests;
 
 
 /* Implement the pager_clear_user_data callback from the pager library. */
@@ -133,7 +134,7 @@ user_pager_init (void)
 error (5, errno, Cannot create pager bucket);
 
   /* Start libpagers worker threads.  */
-  err = pager_start_workers (pager_bucket);
+  err = pager_start_workers (pager_bucket, pager_requests);
   if (err)
 error (5, err, Cannot start pager worker threads);
 }
diff --git a/daemons/runsystem.sh b/daemons/runsystem.sh
index ae25a7d..5d0ad01 100644
--- a/daemons/runsystem.sh
+++ b/daemons/runsystem.sh
@@ -118,9 +118,6 @@ esac
 /hurd/mach-defpager
 
 # This is necessary to make stat / return the correct device ids.
-# Work around a race condition (probably in the root translator).
-for i in `seq 1 10` ; do : ; done # XXX
-
 fsysopts / --update --readonly
 
 # Finally, start the actual init.
diff --git a/ext2fs/ext2fs.c b/ext2fs/ext2fs.c
index d0fdfe7..03c9eed 100644
--- a/ext2fs/ext2fs.c
+++ b/ext2fs/ext2fs.c
@@ -207,10 +207,20 @@ main (int argc, char **argv)
 error_t
 diskfs_reload_global_state ()
 {
+  error_t err;
+
   pokel_flush (global_pokel);
   pager_flush (diskfs_disk_pager, 1);
-  sblock = NULL;
+
+  /* libdiskfs is not responsible for inhibiting paging.  */
+  err = inhibit_ext2_pager ();
+  if (err)
+return err;
+
   get_hypermetadata ();
   map_hypermetadata ();
+
+  resume_ext2_pager ();
+
   return 0;
 }
diff --git a/ext2fs/ext2fs.h b/ext2fs/ext2fs.h
index 96d8e9d..a744685 100644
--- a/ext2fs/ext2fs.h
+++ b/ext2fs/ext2fs.h
@@ -201,6 +201,12 @@ struct user_pager_info
 /* Set up the disk pager.  */
 void 

[PATCH] Fix race condition in ext2fs when remounting

2015-07-22 Thread James Clarke
On some systems, ext2fs.static would regularly hang at startup, as a
race condition meant it would process paging requests while reounting.
To fix this, libpager has been altered to allow inhibiting and resuming
its worker threads.

* console/pager.c (pager_requests): New variable.
(user_pager_init): Updated call to pager_start_workers to use new
pager_requests variable.
* daemons/runsystem.sh: Removed artificial delay working around the race
condition.
* ext2fs/ext2fs.c (diskfs_reload_global_state): Call new
inhibit_ext2_pager and resume_ext2_pager functions, and leave sblock as
non-NULL so it will be munmapped.
* ext2fs/ext2fs.h (inhibit_ext2_pager,resume_ext2_pager): New functions.
* ext2fs/pager.c (file_pager_requests): New variable.
(create_disk_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_ext2_pager,resume_ext2_pager): New functions.
* fatfs/fatfs.h (inhibit_fat_pager,resume_fat_pager): New functions.
* fatfs/pager.c (file_pager_requests): New variable.
(create_fat_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_fat_pager,resume_fat_pager): New functions.
* libdiskfs/disk-pager.c (diskfs_disk_pager_requests): New variable.
(diskfs_start_disk_pager): Updated call to pager_start_workers to use
new diskfs_disk_pager_requests variable.
* libdiskfs/diskfs-pager.h (diskfs_disk_pager_requests): New variable.
* libpager/demuxer.c (struct pager_requests): Renamed struct requests to
struct pager_requests. Replaced queue with queue_in and queue_out
pointers. Added inhibit_wakeup field.
(pager_demuxer): Updated to use new queue_in/queue_out pointers. Only
wake up workers if not inhibited.
(worker_func): Updated to use new queue_in/queue_out pointers. Final
worker thread to sleep notifies the inhibit_wakeup condition variable.
(pager_start_workers): Added out parameter for the requests instance.
Allocate heap space shared by both queues. Initialise new inhibit_wakeup
condition.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/pager.h (struct pager_requests): Public forward definition.
(pager_start_workers): Added out parameter for the requests instance.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/queue.h (queue_empty): New function.
* storeio/pager.c (pager_requests): New variable.
(init_dev_paging): Updated call to pager_start_workers to use new
pager_requests variable.
---
 console/pager.c  |   3 +-
 daemons/runsystem.sh |   3 --
 ext2fs/ext2fs.c  |  12 -
 ext2fs/ext2fs.h  |   6 +++
 ext2fs/pager.c   |  29 +++-
 fatfs/fatfs.h|   2 +
 fatfs/pager.c|  29 +++-
 libdiskfs/disk-pager.c   |   3 +-
 libdiskfs/diskfs-pager.h |   1 +
 libpager/demuxer.c   | 119 ---
 libpager/pager.h |  28 ++-
 libpager/queue.h |   8 
 storeio/pager.c  |   3 +-
 13 files changed, 219 insertions(+), 27 deletions(-)

diff --git a/console/pager.c b/console/pager.c
index 5e13ba4..818e49d 100644
--- a/console/pager.c
+++ b/console/pager.c
@@ -42,6 +42,7 @@ struct user_pager_info
 
 /* We need a separate bucket for the pager ports.  */
 static struct port_bucket *pager_bucket;
+static struct pager_requests *pager_requests;
 
 
 /* Implement the pager_clear_user_data callback from the pager library. */
@@ -133,7 +134,7 @@ user_pager_init (void)
 error (5, errno, Cannot create pager bucket);
 
   /* Start libpagers worker threads.  */
-  err = pager_start_workers (pager_bucket);
+  err = pager_start_workers (pager_bucket, pager_requests);
   if (err)
 error (5, err, Cannot start pager worker threads);
 }
diff --git a/daemons/runsystem.sh b/daemons/runsystem.sh
index ae25a7d..5d0ad01 100644
--- a/daemons/runsystem.sh
+++ b/daemons/runsystem.sh
@@ -118,9 +118,6 @@ esac
 /hurd/mach-defpager
 
 # This is necessary to make stat / return the correct device ids.
-# Work around a race condition (probably in the root translator).
-for i in `seq 1 10` ; do : ; done # XXX
-
 fsysopts / --update --readonly
 
 # Finally, start the actual init.
diff --git a/ext2fs/ext2fs.c b/ext2fs/ext2fs.c
index d0fdfe7..03c9eed 100644
--- a/ext2fs/ext2fs.c
+++ b/ext2fs/ext2fs.c
@@ -207,10 +207,20 @@ main (int argc, char **argv)
 error_t
 diskfs_reload_global_state ()
 {
+  error_t err;
+
   pokel_flush (global_pokel);
   pager_flush (diskfs_disk_pager, 1);
-  sblock = NULL;
+
+  /* libdiskfs is not responsible for inhibiting paging.  */
+  err = inhibit_ext2_pager ();
+  if (err)
+return err;
+
   get_hypermetadata ();
   map_hypermetadata ();
+
+  resume_ext2_pager ();
+
   return 0;
 }
diff --git a/ext2fs/ext2fs.h b/ext2fs/ext2fs.h
index 96d8e9d..a744685 100644
--- a/ext2fs/ext2fs.h
+++ b/ext2fs/ext2fs.h
@@ -201,6 +201,12 @@ struct user_pager_info
 /* Set up the disk pager.  */
 void create_disk_pager (void);
 
+/* Inhibit the disk pager.  */

[PATCH v2] Fix race condition in ext2fs when remounting

2015-07-22 Thread James Clarke
On some systems, ext2fs.static would regularly hang at startup, as a
race condition meant it would process paging requests while remounting.
To fix this, libpager has been altered to allow inhibiting and resuming
its worker threads, and ext2fs uses this to inhibit paging while
remounting.

* console/pager.c (pager_requests): New variable.
(user_pager_init): Updated call to pager_start_workers to use new
pager_requests variable.
* daemons/runsystem.sh: Removed artificial delay working around the race
condition.
* ext2fs/ext2fs.c (diskfs_reload_global_state): Call new
inhibit_ext2_pager and resume_ext2_pager functions, and leave sblock as
non-NULL so it will be munmapped.
* ext2fs/ext2fs.h (inhibit_ext2_pager,resume_ext2_pager): New functions.
* ext2fs/pager.c (file_pager_requests): New variable.
(create_disk_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_ext2_pager,resume_ext2_pager): New functions.
* fatfs/fatfs.h (inhibit_fat_pager,resume_fat_pager): New functions.
* fatfs/pager.c (file_pager_requests): New variable.
(create_fat_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_fat_pager,resume_fat_pager): New functions.
* libdiskfs/disk-pager.c (diskfs_disk_pager_requests): New variable.
(diskfs_start_disk_pager): Updated call to pager_start_workers to use
new diskfs_disk_pager_requests variable.
* libdiskfs/diskfs-pager.h (diskfs_disk_pager_requests): New variable.
* libpager/demuxer.c (struct pager_requests): Renamed struct requests to
struct pager_requests. Replaced queue with queue_in and queue_out
pointers. Added inhibit_wakeup field.
(pager_demuxer): Updated to use new queue_in/queue_out pointers. Only
wake up workers if not inhibited.
(worker_func): Updated to use new queue_in/queue_out pointers. Final
worker thread to sleep notifies the inhibit_wakeup condition variable.
(pager_start_workers): Added out parameter for the requests instance.
Allocate heap space shared by both queues. Initialise new inhibit_wakeup
condition.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/pager.h (struct pager_requests): Public forward definition.
(pager_start_workers): Added out parameter for the requests instance.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/queue.h (queue_empty): New function.
* storeio/pager.c (pager_requests): New variable.
(init_dev_paging): Updated call to pager_start_workers to use new
pager_requests variable.
---
 console/pager.c  |   3 +-
 daemons/runsystem.sh |   3 --
 ext2fs/ext2fs.c  |  12 -
 ext2fs/ext2fs.h  |   6 +++
 ext2fs/pager.c   |  29 +++-
 fatfs/fatfs.h|   2 +
 fatfs/pager.c|  29 +++-
 libdiskfs/disk-pager.c   |   3 +-
 libdiskfs/diskfs-pager.h |   1 +
 libpager/demuxer.c   | 119 ---
 libpager/pager.h |  28 ++-
 libpager/queue.h |   8 
 storeio/pager.c  |   3 +-
 13 files changed, 219 insertions(+), 27 deletions(-)

diff --git a/console/pager.c b/console/pager.c
index 5e13ba4..818e49d 100644
--- a/console/pager.c
+++ b/console/pager.c
@@ -42,6 +42,7 @@ struct user_pager_info
 
 /* We need a separate bucket for the pager ports.  */
 static struct port_bucket *pager_bucket;
+static struct pager_requests *pager_requests;
 
 
 /* Implement the pager_clear_user_data callback from the pager library. */
@@ -133,7 +134,7 @@ user_pager_init (void)
 error (5, errno, Cannot create pager bucket);
 
   /* Start libpagers worker threads.  */
-  err = pager_start_workers (pager_bucket);
+  err = pager_start_workers (pager_bucket, pager_requests);
   if (err)
 error (5, err, Cannot start pager worker threads);
 }
diff --git a/daemons/runsystem.sh b/daemons/runsystem.sh
index ae25a7d..5d0ad01 100644
--- a/daemons/runsystem.sh
+++ b/daemons/runsystem.sh
@@ -118,9 +118,6 @@ esac
 /hurd/mach-defpager
 
 # This is necessary to make stat / return the correct device ids.
-# Work around a race condition (probably in the root translator).
-for i in `seq 1 10` ; do : ; done # XXX
-
 fsysopts / --update --readonly
 
 # Finally, start the actual init.
diff --git a/ext2fs/ext2fs.c b/ext2fs/ext2fs.c
index d0fdfe7..03c9eed 100644
--- a/ext2fs/ext2fs.c
+++ b/ext2fs/ext2fs.c
@@ -207,10 +207,20 @@ main (int argc, char **argv)
 error_t
 diskfs_reload_global_state ()
 {
+  error_t err;
+
   pokel_flush (global_pokel);
   pager_flush (diskfs_disk_pager, 1);
-  sblock = NULL;
+
+  /* libdiskfs is not responsible for inhibiting paging.  */
+  err = inhibit_ext2_pager ();
+  if (err)
+return err;
+
   get_hypermetadata ();
   map_hypermetadata ();
+
+  resume_ext2_pager ();
+
   return 0;
 }
diff --git a/ext2fs/ext2fs.h b/ext2fs/ext2fs.h
index 96d8e9d..a744685 100644
--- a/ext2fs/ext2fs.h
+++ b/ext2fs/ext2fs.h
@@ -201,6 +201,12 @@ struct user_pager_info
 /* Set up the disk pager.  */
 void 

Re: [PATCH] Fix race condition in ext2fs when remounting

2015-07-22 Thread James Clarke
Perhaps; I was following what diskfs_remount does when inhibiting RPCs, which 
stay inhibited on error.

James

 On 23 Jul 2015, at 00:51, Diego Nieto Cid dnie...@gmail.com wrote:
 
 Hi
 
 This is me being picky about a corner case :-)
 
 2015-07-22 19:42 GMT-03:00 James Clarke jrt...@jrtc27.com:
 +error_t
 +inhibit_ext2_pager (void)
 +{
 +  error_t err;
 +
 +  /* The file pager can rely on the disk pager, so inhibit the file
 + pager first.  */
 +
 +  err = pager_inhibit_workers (file_pager_requests);
 +  if (err)
 +return err;
 +
 +  err = pager_inhibit_workers (diskfs_disk_pager_requests);
 +  return err;
 +}
 
 It looks like the file pager workers will remain inhibited if the
 'pager_inhibit_workers' function
 fails to inhibit the disk pager. fatfs is affected by this problem too.
 
 Should a call to 'pager_resume_workers' be inserted before returning
 in case of error?
 
 Regards



[PATCH] WIP: Fix ext2fs remount race condition

2015-07-21 Thread James Clarke
---
 ext2fs/ext2fs.c  |  18 +++
 ext2fs/ext2fs.h  |   6 +++
 ext2fs/pager.c   |  35 +-
 fatfs/fatfs.h|   2 +
 fatfs/pager.c|  35 +-
 libdiskfs/disk-pager.c   |   3 +-
 libdiskfs/diskfs-pager.h |   1 +
 libpager/demuxer.c   | 122 +++
 libpager/pager.h |  23 -
 9 files changed, 230 insertions(+), 15 deletions(-)

diff --git a/ext2fs/ext2fs.c b/ext2fs/ext2fs.c
index d0fdfe7..200c210 100644
--- a/ext2fs/ext2fs.c
+++ b/ext2fs/ext2fs.c
@@ -207,10 +207,28 @@ main (int argc, char **argv)
 error_t
 diskfs_reload_global_state ()
 {
+  error_t err;
+
   pokel_flush (global_pokel);
   pager_flush (diskfs_disk_pager, 1);
+
+  /* Paging must specifically be inhibited; if not, a paging request
+ could be handled while sblock is still NULL.
+ While some RPCs are inhibited when this function is called by
+ libdiskfs, paging RPCs are still enabled. Even if we were to
+ inhibit paging RPCs, libpager has its own pool of workers to handle
+ requests asynchronously, which libports is unaware of, so requests
+ can be handled even after the relevant RPCs are disabled.  This is
+ all dealt with by {inhibit,resume}_disk_pager.  */
+  err = inhibit_disk_pager ();
+  if (err)
+return err;
+
   sblock = NULL;
   get_hypermetadata ();
   map_hypermetadata ();
+
+  resume_disk_pager ();
+
   return 0;
 }
diff --git a/ext2fs/ext2fs.h b/ext2fs/ext2fs.h
index 96d8e9d..d5f400e 100644
--- a/ext2fs/ext2fs.h
+++ b/ext2fs/ext2fs.h
@@ -201,6 +201,12 @@ struct user_pager_info
 /* Set up the disk pager.  */
 void create_disk_pager (void);
 
+/* Inhibit the disk pager.  */
+error_t inhibit_disk_pager (void);
+
+/* Resume the disk pager.  */
+void resume_disk_pager (void);
+
 /* Call this when we should turn off caching so that unused memory object
ports get freed.  */
 void drop_pager_softrefs (struct node *node);
diff --git a/ext2fs/pager.c b/ext2fs/pager.c
index b56c923..f41107f 100644
--- a/ext2fs/pager.c
+++ b/ext2fs/pager.c
@@ -34,6 +34,10 @@ struct port_bucket *disk_pager_bucket;
 /* A ports bucket to hold file pager ports.  */
 struct port_bucket *file_pager_bucket;
 
+/* Stores a reference to the requests instance used by the file pager so its
+   worker threads can be inhibited and resumed.  */
+struct requests *file_pager_requests;
+
 pthread_spinlock_t node_to_page_lock = PTHREAD_SPINLOCK_INITIALIZER;
 
 
@@ -1217,11 +1221,40 @@ create_disk_pager (void)
   file_pager_bucket = ports_create_bucket ();
 
   /* Start libpagers worker threads.  */
-  err = pager_start_workers (file_pager_bucket);
+  err = pager_start_workers (file_pager_requests, file_pager_bucket);
   if (err)
 ext2_panic (can't create libpager worker threads: %s, strerror (err));
 }
 
+error_t
+inhibit_disk_pager (void)
+{
+  error_t err;
+
+  /* Inhibiting RPCs is not sufficient, nor is it in fact necessary.
+ Since libpager has its own pool of workers, requests can still be
+ handled after RPCs have been inhibited, so pager_inhibit_workers
+ must be used. In fact, RPCs will not be inhibited; the requests
+ will just queue up inside libpager, and will be handled once the
+ workers are resumed.
+ The file pager can rely on the disk pager, so inhibit the file
+ pager first.  */
+
+  err = pager_inhibit_workers (file_pager_requests);
+  if (err)
+return err;
+
+  err = pager_inhibit_workers (diskfs_disk_pager_requests);
+  return err;
+}
+
+void
+resume_disk_pager (void)
+{
+  pager_resume_workers (diskfs_disk_pager_requests);
+  pager_resume_workers (file_pager_requests);
+}
+
 /* Call this to create a FILE_DATA pager and return a send right.
NODE must be locked.  */
 mach_port_t
diff --git a/fatfs/fatfs.h b/fatfs/fatfs.h
index 3c3d836..54c3426 100644
--- a/fatfs/fatfs.h
+++ b/fatfs/fatfs.h
@@ -121,6 +121,8 @@ extern struct dirrect dr_root_node;
 void drop_pager_softrefs (struct node *);
 void allow_pager_softrefs (struct node *);
 void create_fat_pager (void);
+error_t inhibit_fat_pager (void);
+void resume_fat_pager (void);
 
 void flush_node_pager (struct node *node);
 
diff --git a/fatfs/pager.c b/fatfs/pager.c
index 10d1fc9..3d860d1 100644
--- a/fatfs/pager.c
+++ b/fatfs/pager.c
@@ -29,6 +29,10 @@ struct port_bucket *disk_pager_bucket;
 /* A ports bucket to hold file pager ports.  */
 struct port_bucket *file_pager_bucket;
 
+/* Stores a reference to the requests instance used by the file pager so its
+   worker threads can be inhibited and resumed.  */
+struct requests *file_pager_requests;
+
 /* Mapped image of the FAT.  */
 void *fat_image;
 
@@ -776,11 +780,40 @@ create_fat_pager (void)
   file_pager_bucket = ports_create_bucket ();
 
   /* Start libpagers worker threads.  */
-  err = pager_start_workers (file_pager_bucket);
+  err = pager_start_workers (file_pager_requests, file_pager_bucket);
   if (err)
 error (2, err, can't create libpager 

Re: VirtualBox Hangs Pre-Init Due To Ext2FS Fault

2015-07-20 Thread James Clarke
I have now got a patch that drains down the queue, and it does successfully 
stop sblock from being dereferenced etc when we actually reload. However, 
sometimes thread 5 (the same one that would dereference sblock) seems to get 
stuck in vm_fault_continue (at least according to the kernel debugger), so I 
need to do some more debugging to see why.

James

 On 19 Jul 2015, at 15:00, Richard Braun rbr...@sceen.net wrote:
 
 On Sun, Jul 19, 2015 at 02:25:14PM +0100, James Clarke wrote:
 Yeah, I tried inhibiting both buckets, but the paging RPCs still got 
 through, so my guess was that libports's inhibit/resume methods weren't able 
 to deal with libpager's own threads. The thing is I don't think we currently 
 keep track of any reference to the main/worker threads, as 
 pager_start_workers just takes a bucket and returns void. Is there a way we 
 can instead make the main thread and/or workers able to block 
 ports_inhibit_X_rpcs like normal RPC handlers and be cancelled etc? If 
 possible I think that would be a cleaner solution.
 
 To continue our discussion on IRC:
 
 No, it would definitely not be a cleaner solution, just an ugly hack.
 Since paging doesn't occur as part of an RPC, you just can't use RPC
 stuff to manage it. I suggest building rwlock-based synrchonization
 functions specific to the pager workers.
 
 -- 
 Richard Braun



Re: VirtualBox Hangs Pre-Init Due To Ext2FS Fault

2015-07-19 Thread James Clarke
Yeah, I tried inhibiting both buckets, but the paging RPCs still got through, 
so my guess was that libports's inhibit/resume methods weren't able to deal 
with libpager's own threads. The thing is I don't think we currently keep track 
of any reference to the main/worker threads, as pager_start_workers just takes 
a bucket and returns void. Is there a way we can instead make the main thread 
and/or workers able to block ports_inhibit_X_rpcs like normal RPC handlers and 
be cancelled etc? If possible I think that would be a cleaner solution.

James

 On 19 Jul 2015, at 13:50, Justus Winter 4win...@informatik.uni-hamburg.de 
 wrote:
 
 Hello James :)
 
 Quoting James Clarke (2015-07-15 22:20:57)
 I had a look today at what's happening, and it's that the *file*
 pager is trying to read from disk. Any thoughts?
 
 There is another thing I forgot.  libpager is special, it has its own
 demuxer (see libpager/demuxer.c) that writes requests into a queue,
 and a pool of workers that process requests from said queue.
 
 The thing is, when we inhibit the pager RPCs, we merely prevent new
 ones from being enqueued, but we don't prevent the workers from
 processing already enqueued requests.  So we indeed need to add
 functions to inhibit and restart paging to libpager that know about
 the queue.
 
 Justus



Re: VirtualBox Hangs Pre-Init Due To Ext2FS Fault

2015-07-15 Thread James Clarke
As discussed in IRC, this successfully stopped the disk pager from 
dereferencing sblock. However, it was still hanging at boot a lot of the time 
(seemingly if and only if I booted in normal mode ie not recovery mode, but 
that's probably just a timing thing).

I had a look today at what's happening, and it's that the *file* pager is 
trying to read from disk. Any thoughts?

James

 On 14 Jul 2015, at 20:54, Justus Winter 4win...@informatik.uni-hamburg.de 
 wrote:
 
 Hi James :)
 
 you found a long-standing bug in ext2fs.  Fixing it allows us to get
 rid of the ugly workaround in daemons/runsystem.sh (look for `XXX').
 
 Quoting Richard Braun (2015-07-13 10:16:14)
 On Sun, Jul 12, 2015 at 12:56:31PM +0100, James Clarke wrote:
 That doesn’t seem to boot at all. I had tried changing it to inhibiting all 
 RPCs (it looks like you’ve inhibited an extra class?), but it seems that 
 paging is needed? Perhaps part of ext2fs gets paged out, and it needs to be 
 paged in when remounting?
 
 Remounting can require paging out, yes.
 
 See diskfs_reload_global_state in ext2fs :
 
 diskfs_reload_global_state ()
 {
  pokel_flush (global_pokel);
  pager_flush (diskfs_disk_pager, 1);
 
 So I guess we need to inhibit the RPCs here, not before calling
 diskfs_reload_global_state, then do:
 
  get_hypermetadata ();
  map_hypermetadata ();
 
 And reenable them here.
 
  return 0;
 }
 
 I guess that means changing the diskfs API.  James, do you want to
 give it a shot?
 
 In the mean time, enjoy my hacky workaround:
 http://nonmonolithic.org/ext2fs.static
 
 Cheers,
 Justus



Re: VirtualBox Hangs Pre-Init Due To Ext2FS Fault

2015-07-12 Thread James Clarke
That doesn’t seem to boot at all. I had tried changing it to inhibiting all 
RPCs (it looks like you’ve inhibited an extra class?), but it seems that paging 
is needed? Perhaps part of ext2fs gets paged out, and it needs to be paged in 
when remounting?

James

 On 12 Jul 2015, at 00:27, Justus Winter 4win...@informatik.uni-hamburg.de 
 wrote:
 
 Quoting James Clarke (2015-07-11 22:33:44)
 I did some more digging around today. I think what’s happening is
 that ext2fs tries to handle a pager RPC while the disk is being
 remounted
 
 Sounds plausible.  Could you try:
 
 http://darnassus.sceen.net/~teythoon/ext2fs.static
 
 I'll send the patch as follow-up.
 
 Justus




Re: VirtualBox Hangs Pre-Init Due To Ext2FS Fault

2015-07-11 Thread James Clarke
I did some more digging around today. I think what’s happening is that ext2fs 
tries to handle a pager RPC while the disk is being remounted.

We do call ports_inhibit_class_rpcs, which will wait until all RPCs for that 
class have finished. However, we call this with diskfs_protoid_class, which 
does *not* include the pager ports. These are added to _pager_class 
(libpager/priv.h) in pager_create (libpager/pager-create.c:32) and 
disk_pager_bucket (ext2fs/pager.c) in create_disk_pager (ext2fs/pager.c), and 
so as a result I believe we can get pager RPCs while remounting, leading to the 
call to ext2_getblk. Below is the stack for the call to ext2_getblk that leads 
to dereferencing sblock when it is NULL:

 0  ext2fs/getblk.c:253 (ext2_getblk)
 1  ext2fs/pager.c:147 (find_block)
 2  ext2fs/pager.c:244 (file_pager_read_page)
 3  ext2fs/pager.c:550 (pager_read_page)
 4  libpager/data-request.c:113 (_pager_S_memory_object_data_request)
 5  libpager/memory_objectServer.c:443 (_Xmemory_object_data_request)
 6  libpager/demuxer.c:215 (worker_func)
 7  libpthread/pthread/pt-create.c:64 (entry_point)

James Clarke

 On 27 Jun 2015, at 20:34, Richard Braun rbr...@sceen.net wrote:
 
 On Sat, Jun 27, 2015 at 03:39:58PM +0100, James Clarke wrote:
 I have been suffering a lot from my Hurd system (running in VirtualBox) 
 hanging at startup, just after Hurd server bootstrap... but before INIT: 
 version 2.88 booting.
 
 I have been able to trace it back to getblk.c:248 (unsigned long 
 addr_per_block = EXT2_ADDR_PER_BLOCK (sblock);) in ext2_getblk. It faults 
 because sblock is NULL.
 
 I have traced the execution with debugging statements, and what seems to 
 happen is as follows:
 
 1. diskfs_remount is called (because root is remounted as rw)
 2. RPCs are inhibited
 3. diskfs_reload_global_state is called
 4. sblock is set to NULL
 5. While this is happening, ext2_getblk is called
 
 If you’re lucky, the superblock is read and sblock is set to point to this 
 data before 5 (or at least before it gets to dereferencing sblock). If not, 
 sblock is still NULL and thus a page fault is raised, causing the system to 
 be stuck.
 
 Does anyone have an idea how this situation could be occurring?
 
 My initial thought would be how could it not happen ?.
 
 Despite diskfs_remount calling ports_inhibit_class_rpcs, other threads
 can very well be running to process previously received messages. There
 seems to be no other form of access synchronization such as locks in
 diskfs_reload_global_state.
 
 Can you get the call trace leading to ext2_getblk ? I'm not sure about
 backtrace(3) in static executables but it might be worth trying.
 
 -- 
 Richard Braun




VirtualBox Hangs Pre-Init Due To Ext2FS Fault

2015-06-27 Thread James Clarke
Hi,
I have been suffering a lot from my Hurd system (running in VirtualBox) hanging 
at startup, just after Hurd server bootstrap... but before INIT: version 
2.88 booting.

I have been able to trace it back to getblk.c:248 (unsigned long addr_per_block 
= EXT2_ADDR_PER_BLOCK (sblock);) in ext2_getblk. It faults because sblock is 
NULL.

I have traced the execution with debugging statements, and what seems to happen 
is as follows:

1. diskfs_remount is called (because root is remounted as rw)
2. RPCs are inhibited
3. diskfs_reload_global_state is called
4. sblock is set to NULL
5. While this is happening, ext2_getblk is called

If you’re lucky, the superblock is read and sblock is set to point to this data 
before 5 (or at least before it gets to dereferencing sblock). If not, sblock 
is still NULL and thus a page fault is raised, causing the system to be stuck.

Does anyone have an idea how this situation could be occurring?

James Clarke