* Juha Autero <[EMAIL PROTECTED]> [2005-09-19 19:07]:
>
> Our customer has a problem with dazuko. It causes kernel panic on Red
> Hat Enterprise Linux 3 with hugemem kernel. We manage to reproduce
> kernel panic in our end which probably means that it is not kernel
> configuration problem.
>
> Kernel panic happens when our program tries to connect Dazuko. We get
> following error messages to our log:
> Dazuko error: writing to device: Bad address
>
> [ ... ]
>
> What worries me is that when googling hugemem I found a blog comment
> that said:"Incidentally, the 4G/4G patch had the nice side-benefit of
> exposing numerous bugs in the use of user pointers in drivers, most of
> which were quickly resolved."
> <http://www.orablogs.com/mt-bin/mt-comments271276.cgi?entry_id=1363>

Well, Dazuko has not been having any problems with "use of user pointers in drivers" for at least two years. We are quite aware of the fact that not all the world is Linux. :)

After searching the web I understand that the hugemem 4G/4G split patch does not change the size of pointers; they still have 32 bits (on ia32, that is). So there should not be a problem passing them around in long variables. Translation between user space and kernel space addresses is already done by means of copyin and copyout. There is no direct use of user pointers in the kernel module.

What you experience might be some kind of signedness problem. The hugemem approach increases the probability of applications using "high" addresses above 2G. I'm not sure how dazuko_strtoul() handles these cases. You may fetch a new version of dazuko_core.c from CVS, which changed dazuko_strtoul() to use and return unsigned long values.

If this does not remove the problem, it would be interesting to learn which addresses get mangled and what kind of damage they suffer from (the code is in dazuko_core.c:dazuko_handle_user_request(), search for "RA="). Could you check the RA= or ra= text representation against what the pointer looks like after dazuko_strtoul() conversion? Resetting the ll_request and user_request pointers to NULL will even avoid panics or faults and make the request fail immediately.

A different approach is to hand pointers from user space to the kernel in %llu text representation and to use unsigned long long for the conversion inside the kernel. This will be attacked next. The assumption that unsigned long long will always be big enough to hold an address should be safe. It's a pity that there is no clean and portable way to detect the presence and type of uintptr_t. :( The int64_t/int32_t detection in dazuko_transport.c is a mess and is actually only done to silence compiler warnings when assigning between integer types and pointers.
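
To make the suspected signedness issue and the %llu idea concrete, here is a minimal user-space sketch. It is not taken from dazuko_core.c: parse_ull(), through_signed32(), format_ra() and parse_ra() are hypothetical helpers, the "RA=<decimal>" layout is assumed only for illustration, and uintptr_t is assumed to be available (C99 user space). The int32_t stands in for a signed long on ia32.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, NOT the real dazuko_strtoul(): parse leading decimal
 * digits into an unsigned long long, stopping at the first non-digit. */
unsigned long long parse_ull(const char *s)
{
    unsigned long long v = 0;
    while (*s >= '0' && *s <= '9')
        v = v * 10ULL + (unsigned long long)(*s++ - '0');
    return v;
}

/* Simulates an address passing through a signed 32-bit variable (what a
 * signed "long" is on ia32) and being widened afterwards. Converting a
 * value above INT32_MAX to int32_t is implementation-defined; gcc and
 * clang wrap modulo 2^32, which is what this sketch assumes. */
unsigned long long through_signed32(unsigned long long addr)
{
    int32_t narrow = (int32_t)(uint32_t)addr;
    return (unsigned long long)(long long)narrow; /* sign-extends */
}

/* Assumed text layout "RA=<decimal>": write a pointer as %llu text. */
int format_ra(char *buf, size_t len, const void *p)
{
    return snprintf(buf, len, "RA=%llu", (unsigned long long)(uintptr_t)p);
}

/* Reverse of format_ra(): skip the "RA=" prefix and rebuild the pointer. */
const void *parse_ra(const char *text)
{
    return (const void *)(uintptr_t)parse_ull(text + 3);
}

/* Usage example: the unsigned round trip preserves the pointer, while the
 * signed 32-bit detour only preserves addresses below 2G. */
void self_check(void)
{
    char buf[32];
    int dummy = 0;

    format_ra(buf, sizeof buf, &dummy);
    assert(parse_ra(buf) == (const void *)&dummy);

    assert(through_signed32(0x7FFFFFFFULL) == 0x7FFFFFFFULL); /* below 2G: intact */
    assert(through_signed32(0xC0000000ULL) != 0xC0000000ULL); /* above 2G: mangled */
}
```

Comparing the RA= text with the value after conversion, as asked above, would show whether a "high" address suffers this kind of sign extension. Whether the real code path behaves like through_signed32() is exactly what remains to be confirmed.
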
Are you aware of some live/rescue system CD with a hugemem kernel on it so we can easily reproduce the problem here? That would be very nice to diagnose the problem and confirm it's fixed.

virtually yours
Gerhard Sittig   pgp fingerprint AF29 3CD2 A531 F5A8 5F42 CB9A 1B7F 59F8 BA7A 9EE5
-- 
Gerhard Sittig
Software Engineer
H+BEDV Datentechnik GmbH
Lindauer Strasse 21, 88069 Tettnang, Germany
tel +49 (0) 7542-500500, fax +49 (0) 7542-500576
