presto phase II [LSARC/2008/389 FastTrack timeout 06/25/2008]

2008-06-25 Thread Irene Huang
Closing as approved.

--Irene
Artem Kachitchkine wrote:

 I have no further questions.

 -Artem

 This case is due to time out on Tuesday; if there are any further issues,
 please send an email before then.

 Thanks

 --Irene
 Halton Huo wrote:
 On Wed, 2008-06-18 at 23:48 -0700, Artem Kachitchkine wrote:
  
 The "signal between X and Y" or "X sends signal to Y" language is 
 confusing. DBus signals in general are not peer to peer; they are 
 messages broadcast on a bus. An arbitrary number of applications can 
 listen on the bus. It is possible, however, to establish a peer-to-peer 
 DBus connection between two applications, much like FIFOs or 
 System V message queues. You need to specify:
 
 Artem,

 Thanks for your review.

 Here are the updates to this part; please review.

   Network Printer (via SNMP):
   - Enable the network printer discovery service,
     svc:/network/device-discovery/printers:snmp
   - The hald network printer add-on broadcasts an SNMP GET.
   - Any network printer that is SNMP capable then responds to it.
   - The SNMP agent then populates the HAL device tree with the
     network printer data.
   - hald detects the changes in the HAL device tree, deduces that
     these are printers, and sends out the DeviceAdded DBus signal.
   - ospm-applet, which is a daemon in the user's session, waits for
     and responds to these signals. Based on the UDI (Unique Device
     Identifier) it receives from hald, it looks up the rest of the
     data in the HAL device tree. It then adds print queues for
     these printers in the background until all are done.
   - ospm-applet pops up a generic message as a notification bubble,
     notifying the user that network print queues have been added.
   - ospm-applet also sends out a DBus message, PrinterAdded.
   - If the Print Manager is running at the time, it will be notified
     by the PrinterAdded message and will refresh its view
     immediately, showing the newly added queues. Otherwise, these
     messages are ignored.
  
 - the path of the object(s) that implement the 
 org.opensolaris.ospm.applet interface

 - which of the many possible DBus buses (system? session?) or 
 private connections the object is instantiated on

 - signal parameters, if any (UDI? queue name?)
 
 We're using two DBus signals. One is a system one, DeviceAdded;
 the other is our application-customized one, PrinterAdded.
 1.  DeviceAdded
     path: /org/freedesktop/Hal/Manager
     interface: org.freedesktop.Hal.Manager
     bus: system
 2.  PrinterAdded
     path: /org/opensolaris/ospm/applet
     interface: org.opensolaris.ospm.applet
     bus: session
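
 As a rough illustration of how a listener might subscribe to the two
 signals listed above, the sketch below composes a D-Bus match rule from
 a path, interface and member name. The helper name build_match_rule is
 hypothetical and is not part of ospm-applet or the proposal; only the
 rule keys (type, path, interface, member) come from the D-Bus
 specification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the proposal): format a D-Bus match rule
 * selecting one signal by object path, interface and member name.
 * Returns 0 on success, -1 if an argument is missing or buf is too small.
 */
int build_match_rule(char *buf, size_t len, const char *path,
                     const char *iface, const char *member)
{
    int n;

    assert(buf != NULL && len > 0);
    if (path == NULL || iface == NULL || member == NULL)
        return -1;

    n = snprintf(buf, len,
        "type='signal',path='%s',interface='%s',member='%s'",
        path, iface, member);
    /* snprintf returns the length it wanted to write; detect truncation. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

 The resulting string would then be registered on a connection to the
 bus named in the list (system for DeviceAdded, session for
 PrinterAdded), for example with libdbus's dbus_bus_add_match().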

 Where should I mention this in the ARC document?

 Thanks,
 Halton.

  
 Also your proposal has three variations on the signal name: 
 PrinterAdded, printerAdded and Printeradded. Which one is it?

 -Artem
 

   






Xsane [LSARC FastTrack 2008/385 timeout 06/24/2008]

2008-06-25 Thread Irene Huang
Closing as approved.

--Irene
Irene Huang wrote:
 Hi, all

 I am sponsoring this case. The timeout is set to be 06/24/2008
 XSane is targeting opensolaris 2008/11 (hopefully).

 Please review the attached proposal.

 --Irene




Meta Tracker - A Desktop Search Tool [LSARC/2008/375 FastTrack timeout 07/01/2008]

2008-06-25 Thread Irene Huang
Resetting time out to be July 1st.

--Irene
Robert Kinsella - Sun Microsystems Ireland - Software Engineer wrote:
 Darren J Moffat wrote:
 Stephen Browne wrote:

 On Tue, 2008-06-24 at 14:52, Darren J Moffat wrote:
 Jerry Tan wrote:
  When tracker is integrated, it will be disabled by default.
   Users can run gnome-session-properties to enable it.

 How does that work with TX?  Does that enable an instance per 
 labelled zone or only in the global zone?

 Since the startup is configurable it will be configurable per zone.  
 The default for the zones will be the same as the default for the 
 global zone.

 How is that done with gnome-session-properties ?  I don't believe 
 that is label aware, is it ?

 If I select gnome-session-properties from the menu as 
 Preferences-Sessions  I see no indication that it is able to set 
 separate policy per label and isn't it running in the global zone ?.  
 Am I missing something ?

 Hi Darren,
 When an application/preference dialog is launched, it displays the 
 label it is launched in.

 If it is launched from e.g. the internal zone, the window is labeled 
 Internal. Any settings changed in the window labeled Internal will affect 
 the settings for the internal zone for that user.

 Some settings are only applicable in the Global zone, these preference 
 dialogs / applications are always launched in the global zone.
 To review a list of these (global zone only preference 
 dialogs/applications)  see /usr/share/gnome/TrustedPathExecutables


 Bob





libtasn1 for OpenSolaris [LSARC/2008/390 FastTrack timeout 06/25/2008]

2008-06-25 Thread Irene Huang
Hi, Mike

If you are OK with the license indicated in the copyright file, please 
let me know and I would like to close this case as approved.

Thanks

--Irene
Irene Huang wrote:
 Generally, we don't include license information in the manpage. For
 Opensolaris project, we ship a copyright file in each package, I think
 that will do. 

 --Irene
 On Tue, 2008-06-24 at 10:10 +0800, Jeff Cai wrote:
   
 On Mon, 2008-06-23 at 16:24 -0700, Mike Oliver wrote:
 
 Irene Huang wrote:
   
 ...
 4.2. Interfaces:
  Exported Interfaces
  Interface               Classification  Comments
  ----------------------  --------------  --------------
  ...
  /usr/lib/libtasn1.so    Volatile        Shared library
  ...
   
 Why is this library not versioned in the usual manner?  (E.g.
 libtasn1.so.1, accompanied by a .so symlink pointing to the
 current version for use by the normal linker environment.)
   
 Sorry, this is my mistake.

 /usr/lib/libtasn1.so          Volatile  Symbolic link
 /usr/lib/libtasn1.so.3        Volatile  Symbolic link
 /usr/lib/libtasn1.so.3.0.15   Volatile  Shared library

 Spec will be changed accordingly.

 
 I don't know whether there's a SAC Best Practice for bringing
 the license terms of libraries to the notice of developers who
 might wish to use those libraries.  AFAICT this one is LGPL;
 does that need to be mentioned anywhere?
   
 I'll mention that in libtasn1 man page.

 Thanks

 Jeff


 
 Mike.
   

   




Fast Reboot PSARC/2008/382

2008-06-25 Thread Sherry Moore
This is an e-mail note indicating that the dry-run information will be
suppressed from man pages and is reclassified as Project Private, per
discussions between the project team, Jerry and Garrett.

Sherry
-- 
Sherry Moore, Solaris Core Kernel   http://blogs.sun.com/sherrym



Fast Reboot PSARC/2008/382

2008-06-25 Thread Garrett D'Amore
Sherry Moore wrote:
 This is an e-mail note indicating that the dry-run information will be
 suppressed from man pages and is reclassified as Project Private, per
 discussions between the project team, Jerry and Garrett.

 Sherry
   
Thank you.  Just for the clarity of the record, I think this means that 
the project team agrees that the dry run options to reboot, as well as 
to uadmin(2), are project private.

Btw, given that the dry run option to uadmin is private, it seems that 
you probably could just skip modifying reboot for dry-run.  For the 
internal testing purposes for which I think this is intended, uadmin(1M) 
should be adequate for testing.  See the work done by the CPR project, 
where they use various different subcommands to uadmin for testing.

-- Garrett



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread James Carlson
Darren J Moffat writes:
 The behavior of HP-UX and AIX is unknown to the author of this case,
 and since they are closed source there is no easy way to determine what
 their implementation does.

AIX prints an empty string, as though "" had been passed in.  Our
HP-UX box has locked up, or I'd give you that one as well.  :-/

 This isn't a scalable way to approach the problem and hurts the
 reputation of Solaris and OpenSolaris releases.  It also hinders the

The sad thing here is that it's really the bug-ridden application code
that mishandles NULL pointers that's of poor quality, so it's not
OpenSolaris's reputation that should be at stake.

So, with this one under our belts, should we also fix up the str*(3C)
family of functions so that they quietly ignore NULL pointers as well?
An application that's incautious with NULL can't possibly just make
that mistake with printf alone, can it?

Is NULL the only bad pointer worth caring about?  What sorts of bad
pointer checks need to be made so that malfunctioning applications can
continue running without dropping core?  How deep does the rabbit hole
go?

-- 
James Carlson, Solaris Networking              james.d.carlson at sun.com
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Garrett D'Amore
My only possible objection to this is that this change might mask 
certain kinds of bugs which might be useful to catch.

Can we have an environment variable to turn off this behavior for 
debugging purposes (perhaps leaving it turned off by default in a debug 
build of ON?)  A perhaps better solution might be to place an assert() 
in printf for the string not being NULL.  (Hmmm... do we even *have* 
debug/non-debug versions of libc as we do for the kernel?)

-- Garrett

Darren J Moffat wrote:
 Template Version: @(#)sac_nextcase 1.66 04/17/08 SMI
 This information is Copyright 2008 Sun Microsystems
 1. Introduction
 1.1. Project/Component Working Name:
libc printf behaviour for NULL string
 1.2. Name of Document Author/Supplier:
Author:  Darren Moffat
 1.3  Date of This Document:
   25 June, 2008
 4. Technical Description

 Background
 ----------
 The current behavior of the printf(3C) family of functions in libc when
 passed a NULL value for a string format is undefined and usually
 results in a SEGV and crashed application.

 The workaround for applications written to depend on this behavior is to
 set LD_PRELOAD=/usr/lib/0@0.so.1 (or the 64 bit equivalent).  The
 workaround isn't always easy to apply (or it comes too late: data has
 been lost or corrupted by that point).

 Some will often state, myself included, that you shouldn't assume that
 the printf(3C) family will deal with a NULL argument for a string and
 that arguments should be checked before calling printf(3C).

 The behavior of the SunOS 4.x printf(3C) and that of the still shipping
 binary compatibility library /usr/4lib/libc.so.1 was to use the string
 "(null)" if the argument for a %s was NULL.

 This is also the behavior in current versions of GNU libc (as used
 on most (maybe all) mainstream Linux kernel based distributions) 
 and the libc for FreeBSD, NetBSD, OpenBSD.

 The behavior of HP-UX and AIX is unknown to the author of this case,
 and since they are closed source there is no easy way to determine what
 their implementation does.

 It is reasonably common to find FOSS applications that, when run on
 Solaris/OpenSolaris, fail due to the behavior of printf(3C) in
 libc.  The GNOME libraries in snv_92 even appear to make this
 assumption in some places (the g_log function), based on cores I've
 seen for gnome-settings-daemon and pidgin.  I know that the Sun GNOME
 team has fixed such issues in the past.  In both of those cases it
 wasn't the application that caused it but an assumption in some library
 code they both use for error handling.

 In some FOSS communities the upstream authors are often not interested
 in changing their source and view the failure as a Solaris bug (taking
 the stance that Linux and *BSD don't have this issue).  Some are more
 accommodating and have taken a halfway step, having their code check
 that 0@0.so.1 is LD_PRELOAD'd when running on Solaris; sometimes the
 authors will fix the code.

 This isn't a scalable way to approach the problem and hurts the
 reputation of Solaris and OpenSolaris releases.  It also hinders the
 building of binaries for upstream FOSS components targeting an
 OpenSolaris release repository.  There is a large volume of software
 not originally authored on Solaris/OpenSolaris that is critical to the
 success of OpenSolaris.   So a permanent fix for this is needed that
 scales well in OpenSolaris and upstream community developer time.


 Proposal
 --------

 This case proposes to change the default Solaris/OpenSolaris libc
 behavior for the printf(3C) family so that it reverts to the SunOS 4.x
 behavior of printing "(null)" instead of (likely) causing an
 application crash.  This change will apply to the XPG and wide char
 variants as well.
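
 For readers who want to see the effect, the following is a minimal
 caller-side sketch of the substitution described above. It is not the
 libc change itself; the helper names str_or_null and format_greeting
 are hypothetical. It only shows the output that a %s conversion would
 produce under the proposed behavior.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: stand in the literal "(null)" for a NULL string. */
const char *str_or_null(const char *s)
{
    return (s != NULL) ? s : "(null)";
}

/*
 * Format "Hello, <name>" into buf without ever handing NULL to %s.
 * Returns the snprintf result (the length that was needed).
 */
int format_greeting(char *buf, size_t len, const char *name)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "Hello, %s", str_or_null(name));
}
```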

 There are no documentation changes from this case, as the current
 Solaris documentation says nothing about the behavior of printf(3C)
 family when passed a NULL.

 Since no application should be depending on the current behavior of
 getting a SEGV when passing a NULL, there is no need to make this
 change configurable.  In fact doing so could cause even more harm than
 the current situation.

 There are no interface taxonomy changes.

 The release binding for this change is patch.

 6. Resources and Schedule
 6.4. Steering Committee requested information
   6.4.1. Consolidation C-team Name:
   ON
 6.5. ARC review type: FastTrack
 6.6. ARC Exposure: open

   




Fast Reboot PSARC/2008/382

2008-06-25 Thread Garrett D'Amore
Sherry Moore wrote:
 On Wed, Jun 25, 2008 at 04:56:06AM -0700, Garrett D'Amore wrote:
   
 Sherry Moore wrote:
 
 This is an e-mail note indicating that the dry-run information will be
 suppressed from man pages and is reclassified as Project Private, per
 discussions between the project team, Jerry and Garrett.

 Sherry
   
   
 Thank you.  Just for the clarity of the record, I think this means that the 
 project team agrees that the dry run options to reboot, as well as to 
 uadmin(2), are project private.
 

 Yes.
   

Thank you.

   
 Btw, given that the dry run option to uadmin is private, it seems that you 
 probably could just skip modifying reboot for dry-run.  For the internal 
 testing purposes for which I think this is intended, uadmin(1M) should be 
 adequate for testing.  See the work done by the CPR project, where they use 
 various different subcommands to uadmin for testing.
 

 I believe that's implementation detail that the project team can choose
 to implement.
   

Agreed.  I was just offering some friendly implementation advice, not 
architectural guidance.

-- Garrett



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Shawn Walker
2008/6/25 James Carlson james.d.carlson at sun.com:
 Darren J Moffat writes:
 The behavior of HP-UX and AIX is unknown to the author of this case,
 and since they are closed source there is no easy way to determine what
 their implementation does.

 AIX prints an empty string, as though "" had been passed in.  Our
 HP-UX box has locked up, or I'd give you that one as well.  :-/

According to Sun's developer pages [1], HP-UX has the same behaviour.

-- 
Shawn Walker

[1] http://developers.sun.com/solaris/articles/portingUNIXapps.html



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Joerg Schilling
James Carlson james.d.carlson at sun.com wrote:

 Darren J Moffat writes:
  The behavior of HP-UX and AIX is unknown to the author of this case,
  and since they are closed source there is no easy way to determine what
  their implementation does.

 AIX prints an empty string, as though "" had been passed in.  Our
 HP-UX box has locked up, or I'd give you that one as well.  :-/

HP-UX also prints an empty string like: 

Jörg

-- 
 EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   js at cs.tu-berlin.de(uni)  
   schilling at fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily



Updated NPIV support for xVM [PSARC/2008/404 FastTrack timeout 07/02/2008]

2008-06-25 Thread Mark Carlson

Template Version: @(#)sac_nextcase 1.66 04/17/08 SMI
This information is Copyright 2008 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
 Updated NPIV support for xVM
1.2. Name of Document Author/Supplier:
 Author:  Jack Meng
1.3  Date of This Document:
25 June, 2008
4. Technical Description

1.3.1. Date this project was conceived:
N/A

   1.4. Name of Major Document Customer(s)/Consumer(s):
1.4.1. The PAC or CPT you expect to review your project:
   Solaris PAC
1.4.2. The ARC(s) you expect to review your project:
   PSARC
1.4.3. The Director/VP who is Sponsoring this project:
   Scott.Tracy at Sun.COM
1.4.4. The name of your business unit:
   Solaris SOFTWARE Group

   1.5. Email Aliases:
1.5.1. Responsible Manager: Roger.Dong at sun.com
1.5.2. Responsible Engineer: Jack.Meng at sun.com
1.5.3. Marketing Manager:
1.5.4. Interest List: npiv-iteam at sun.com

2. Project Summary
   2.1. Project Description:
Support NPIV devices in Solaris xVM.
   2.2. Risks and Assumptions:
This work is for Solaris xVM hosts only; therefore guest domains configured
with NPIV devices may not be migratable to hosts running on other platforms,
e.g., Linux.

3. Business Summary
   3.1. Problem Area:
N/A

   3.2. Market/Requester:
N/A

   3.3. Business Justification:
N/A

   3.4. Competitive Analysis:
N/A

   3.5. Opportunity Window/Exposure:
N/A

   3.6. How will you know when you are done?:
Users are able to use NPIV within virtual machines in xVM.

4. Technical Description:
4.1. Details:
This project introduces two extensions to the Solaris xVM utilities for
configuring NPIV devices with paravirtualized guest domains. NPIV is
enabled in Solaris by PSARC 2007/501; refer to section 5 for more info.

The first extension attaches a specified LUN from a virtual FC port to a
guest domain. The xVM hypervisor in Solaris is extended to accept a new
type of blk device, npiv, and to trigger the corresponding script, which
creates the virtual port on the specified physical port, discovers the LUN
on the specified target and finally attaches it to the guest domain as a
normal blk device.

The second extension attaches a specified virtual FC port to a guest domain
as a pseudo device. 'Pseudo' means there will be no corresponding frontend
in the guest domain for that virtual FC port. The xVM hypervisor in Solaris
is extended to accept a new kind of device, pseudo, and to trigger a script
to work on different pseudo devices. Currently the only pseudo device is the
NPIV port, and the corresponding script creates the virtual port on the
specified physical port and then:
1) attaches the existing LUNs from that virtual port to the guest domain
2) registers a script for device sysevents happening on the virtual port,
   so that LUNs added or deleted afterwards are attached to or detached
   from the guest domain.

Either way, the npiv device can be migrated if the destination host has
the specified physical FC port and that port is on the same fabric as the
physical port on the source host.

4.2. Bug/RFE Number(s):
6713736 NPIV lun support in XVM 
6713700 Dynamic blk dev support in kernel

4.3. In Scope:
N/A

4.4. Out of Scope:
N/A

4.5. Interfaces:
N/A

4.6. Doc Impact:
Man page: virsh(1M), xm(1M)
System Administration Guide: Virtualization Using the Solaris Operating 
System  

4.7. Admin/Config Impact:
Introduces a new format of options in 'virsh' and 'xm'. Refer to
docs listed in 4.6 for details.

4.8. HA Impact:
N/A

4.9. I18N/L10N Impact:
N/A

4.10. Packaging & Delivery:
N/A

4.11. Security Impact:
N/A

4.12. Dependencies:
N/A

5. Reference Documents:
http://sac.sfbay/PSARC/2007/501/

6. Resources and Schedule:

   6.1. Projected Availability:
Solaris Nevada B94/B95

   6.2. Cost of Effort:
N/A 

   6.3. Cost of Capital Resources:
N/A

   6.4. Product Approval Committee requested information:
6.4.1. Consolidation or Component Name:
xvm, on
6.4.3. Type of CPT Review and Approval expected:
FastTrack
6.4.4. Project Boundary Conditions:
N/A
6.4.5. Is this a necessary project for OEM agreements:
N/A
6.4.6. Notes:
N/A
6.4.7. Target RTI Date/Release:
Nevada B94/B95
6.4.8. Target Code Design Review Date:
25/06/2008
6.4.9. Update approval addition:
N/A
6.6.1. 

libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Darren J Moffat
Garrett D'Amore wrote:
 My only possible objection to this is that this change might mask 
 certain kinds of bugs which might be useful to catch.

Doesn't seem to be a problem for other platforms.

 Can we have an environment variable to turn off this behavior for 
 debugging purposes (perhaps leaving it turned off by default in a debug 
 build of ON?)  

I don't think that is a good idea. None of the other platforms I viewed 
have this and we didn't have this with SunOS 4.x either.  More 
importantly, I don't like the idea of having to check an environment 
variable on every printf(3C) family call; there could be a noticeable 
performance hit for that.

Basically I see the current behaviour as a long-standing regression from 
SunOS 4.x and was really tempted to not even file an ARC case but just 
do a bug fix.  Let's not over-design this or beat it to death in ARC 
unless there is a standards reason or sound architectural reason why 
this is the wrong thing to do (and given the behaviour of all the other 
platforms I'd have a hard time believing that).

  A perhaps better solution might be to place an assert()
 in printf for the string not being NULL.  (Hmmm... do we even *have* 
 debug/non-debug versions of libc as we do for the kernel?)

No we don't, but then with DTrace we don't need to either.  If the goal 
is to find applications that could be fixed to not do this then use 
DTrace with the pid provider to find them.

-- 
Darren J Moffat



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Stefan Teleman


James Carlson wrote:
 Stefan Teleman writes:

 Darren J Moffat wrote:

 The sad thing here is that it's really the bug-ridden application code
 that mishandles NULL pointers that's of poor quality, so it's not
 OpenSolaris's reputation that should be at stake.
 Oh I agree completely but we are the odd one out here.
 I apologize for interjecting in this discussion, but wouldn't the character 
 string "(null)" or "(nichts)" or "NULL" being printed on stdout/stderr act 
 as a clear indicator of the bug, and of its precise location?
 
 Actually, no, it's not.  You know its apparent location in the stdout
 character stream, but nothing about where the problem might be in the
 code.

In other words, the fact that I see "(null)" instead of some other printable 
value, printed out, provides me with absolutely no indication as to which char* 
pointer was NULL, in the sequence of arguments passed to printf(3C) and friends.

That is because I do not have a reasonably defined set of expectations with 
respect to what *should* be the output of printf(3C) and friends.

Speaking only for myself: when I see the string "(null)" printed out, when in 
fact I was expecting "Giraffe", I do not think "Oh, the stdout character stream 
contains the string \"(null)\". How odd. I wonder what could have caused 
this..".

I think "Why is Giraffe NULL, when it shouldn't be ?".

--Stefan

-- 
Stefan Teleman
Sun Microsystems, Inc.
Stefan.Teleman at Sun.COM




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Glenn Fowler

On Wed, 25 Jun 2008 12:30:24 -0400 James Carlson wrote:
 Garrett D'Amore writes:
  My leaning is to #1.  It seems like we're trying to make bad 
  applications happy, to satisfy what is probably a small minority of 
  developers who feel that such use of NULL should be legal (despite 
  documentation to the contrary) -- and who are unwilling to use a 
  perfectly reasonable workaround, at a potential cost to the greater set 
  of well-behaved applications.

 I think what's missing there is that this is (unfortunately) not just
 a minority of developers.  The bulk of user-space software looks like
 this these days.  People are just plain careless, ...

this is a bit harsh
people tend to code to what their local system tolerates
even the most meticulous coders can fall prey to this
by assuming their favorite standard implementation actually implements the 
standard

e.g., glibc implements posix
well, yes, it might, but it also makes choices on some implementation defined 
behavior
that could be mistaken for standard behavior, sometimes even after
meticulous reading of the standard

-- Glenn Fowler -- AT&T Research, Florham Park NJ --




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Nicolas Williams
On Wed, Jun 25, 2008 at 09:39:55AM -0700, Garrett D'Amore wrote:
 Is the next step really to start checking for null arguments to other 
 string functions?  What about null pointers passed to other library 
 routines, such as free(), qsort(), bsearch()?

free(NULL) is already allowed, always.

To provide a general answer: if it were to turn out that, say,
strlen(NULL) works (e.g., returns 0) on Linux and *BSD and that *many*
applications depend on this behaviour, then we may have to consider
making our strlen() do the same.  If this were to violate some standard,
then that will complicate the decision process -- we may need to resort
to compile-/link-time behaviour selections (for libraries and
executables both).
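
To make the distinction concrete: free(NULL) is defined by the C standard
to do nothing, while strlen(NULL) is undefined. A NULL-tolerant strlen of
the kind hypothesised above could look like the wrapper below. The names
tolerant_strlen and release are hypothetical, chosen for illustration;
they are not proposed libc interfaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: treat a NULL string as having length 0. */
size_t tolerant_strlen(const char *s)
{
    return (s != NULL) ? strlen(s) : 0;
}

/*
 * Free *p and clear it. Calling this twice on the same variable is safe
 * because the second call reaches free(NULL), which the C standard
 * defines as a no-op.
 */
void release(char **p)
{
    assert(p != NULL);
    free(*p);
    *p = NULL;
}
```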



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread James Carlson
Stefan Teleman writes:
  Actually, no, it's not.  You know its apparent location in the stdout
  character stream, but nothing about where the problem might be in the
  code.
 
 In other words, the fact that I see "(null)" instead of some other printable 
 value, printed out, provides me with absolutely no indication as to which 
 char* pointer was NULL, in the sequence of arguments passed to printf(3C) 
 and friends.

In many cases, it doesn't tell you which executable was involved, or
which loadable module in that executable, or anything about the
conditions that caused the problem.

 Speaking only for myself: when I see the string "(null)" printed out, when in 
 fact I was expecting "Giraffe", I do not think "Oh, the stdout character 
 stream contains the string \"(null)\". How odd. I wonder what could have 
 caused this..".
 
 I think "Why is Giraffe NULL, when it shouldn't be ?".

I suspect that's because you may be the developer of that software,
and thus know the locations and contents of the printf() strings
themselves.

Try it as a user:

Installing software modules.
Unable to load module (null).
Operation complete; 327 modules loaded.

Does that tell you anything useful?  What executable failed?  Was it a
library function or the main program?  Could it have been one of the
modules that was (apparently) being loaded by way of dlopen?

It may as well have said "something bad happened."
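
By contrast, a check made in the caller can carry the context that a bare
"(null)" lacks. The sketch below is purely illustrative:
describe_load_failure and its message format are hypothetical and are not
taken from any real installer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical diagnostic helper: describe a failed module load.
 * When the module name is missing, say so and name the caller instead
 * of letting "%s" print a placeholder.
 */
int describe_load_failure(char *buf, size_t len, const char *module,
                          const char *caller)
{
    assert(buf != NULL && len > 0 && caller != NULL);
    if (module == NULL)
        return snprintf(buf, len,
            "%s: internal error: module name is NULL", caller);
    return snprintf(buf, len, "%s: unable to load module %s",
        caller, module);
}
```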

Glenn Fowler writes:
 On Wed, 25 Jun 2008 12:30:24 -0400 James Carlson wrote:
  I think what's missing there is that this is (unfortunately) not just
  a minority of developers.  The bulk of user-space software looks like
  this these days.  People are just plain careless, ...
 
 this is a bit harsh
 people tend to code to what their local system tolerates

I've thought about it a bit, and I'm going to stand by it.  Even if
your system 'tolerates' something as an implementation artifact, that
doesn't make it right.

In order for this to be something other than simple carelessness, I
think the developer would have had to have seen that "(null)" output,
and decided, "eh, that's good enough."  If that's what happened, then
what I wrote probably isn't harsh enough.

 e.g., glibc implements posix
 well, yes, it might, but it also makes choices on some implementation defined 
 behavior
 that could be mistaken for standard behavior, sometimes even after
 meticulous reading of the standard

Regardless of whether glibc implements POSIX, I don't think that
ignoring bad pointers is good programming practice.  Perhaps it's just
my opinion alone, but I think failing to consider whether a pointer
ought to be NULL and doing something about it reflects a lack of due
care.

Sure; that can affect anyone.  If you find it in my code, feel free to
point the example out as an instance of carelessness on my part.

(I'm not seeking a Gary Hart moment here, but rather saying that I
don't think the criticism is unfair.)

-- 
James Carlson, Solaris Networking              james.d.carlson at sun.com
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677



fbconfig configuration utility for Xorg [PSARC/2008/396 FastTrack timeout 06/30/2008]

2008-06-25 Thread Eric Sultan
With no substantive unresolved issues remaining, this case was approved 
during this morning's PSARC meeting.

To summarize the modifications to the original proposal, here is the 
revised interface table:

Interfaces exported:
   /usr/sbin/fbconfig                     Uncommitted        modified program
   /usr/lib/fbconfig/fbconf_xorg          Project Private    new
   /usr/lib/fbconfig/libfbconf_xorg.so    Project Private    new
   /usr/lib/fbconfig/libSUNWkfb_conf.so   Project Private    new, kfb-specific
   /etc/X11/xorg.conf                     External Volatile


  -- Eric




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Bart Smaalders
James Carlson wrote:

 Regardless of whether glibc implements POSIX, I don't think that
 ignoring bad pointers is good programming practice.  Perhaps it's just
 my opinion alone, but I think failing to consider whether a pointer
 ought to be NULL and doing something about it reflects a lack of due
 care.

Indeed... but we're just being particularly picky about printf... in
the libc malloc implementation, we were far more accommodating
of broken programs - the existing implementation allows the same
pointer to be freed multiple times (so long as there was no intervening
call to {re,m,c}alloc), and realloc works if given a pointer that was
already freed.  These are much more pernicious and dangerous than allowing
a NULL pointer to be passed to %s format specifiers.  Also, we print
NaN when an illegal floating point number is passed to printf rather than
just raising a FP exception.
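
The NaN case is easy to check. C99 specifies that the f, e and g
conversions render a NaN as text ("nan", possibly signed or with a
suffix) rather than failing. The sketch below formats a value and reports
whether that text appeared; format_double is a hypothetical helper name.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format x with "%f" into buf.
 * Returns 1 if the output contains the text "nan", else 0.
 */
int format_double(char *buf, size_t len, double x)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%f", x);
    return strstr(buf, "nan") != NULL;
}
```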

We can either choose to be compatible w/ virtually everyone else, or
we can continue to be particular about  printf's string arguments.
Personally, I'd vote for compatibility.

- Bart


-- 
Bart Smaalders  Solaris Kernel Performance
barts at cyber.eng.sun.com  http://blogs.sun.com/barts
You will contribute more with mercurial than with thunderbird.



libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Glenn Fowler

On Wed, 25 Jun 2008 13:07:51 -0400 James Carlson wrote:
 Glenn Fowler writes:
  On Wed, 25 Jun 2008 12:30:24 -0400 James Carlson wrote:
   I think what's missing there is that this is (unfortunately) not just
   a minority of developers.  The bulk of user-space software looks like
   this these days.  People are just plain careless, ...
  
  this is a bit harsh
  people tend to code to what their local system tolerates

 I've thought about it a bit, and I'm going to stand by it.  Even if
 your system 'tolerates' something as an implementation artifact, that
 doesn't make it right.

 In order for this to be something other than simple carelessness, I
 think the developer would have had to have seen that "(null)" output,
 and decided, "eh, that's good enough."  If that's what happened, then
 what I wrote probably isn't harsh enough.

as is the case with many of my bugs, they are data dependent and my
regression tests never hit all of the cases cooked up by ingenious users

so instead of "eh, that's good enough" it's "didn't think of that"
and the next release will have a fix and companion test(s) -- different
from carelessness

-- Glenn Fowler -- ATT Research, Florham Park NJ --




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Gordon Ross

Darren Moffat wrote:

 James Carlson wrote:
[...]
  So, with this one under our belts, should we also fix up the str*(3C)
  family of functions so that they quietly ignore NULL pointers as well?
 
 The goal of this case was parity with the other mentioned libc 
 implementations.  I have looked at what the others do for strlen(NULL) 
 and they will SEGV on that.  I haven't looked at every str*(3C)
 function.

That's fine, but FYI, the Microsoft C runtime does this
substitution of "" for NULL in the str* functions.
(Or they used to.  I haven't tried recently.)

  An application that's incautious with NULL can't possibly just make
  that mistake with printf alone, can it?
 
 Probably not but this is a safety net that is available on other
 platforms.  Similar safety nets for the str*(3C) functions don't at
 initial glance appear to exist.
 
 If the application/lib is that free and loose with NULL then we still
 have the ability to LD_PRELOAD=0@0.so.1 if the code can't be fixed.

And for the record, that is not a sufficient solution,
because then you won't trap on other errant NULL pointers.
But again, OK, not this case.

 This case is about fixing the very commonly encountered case and the 
 case where Solaris is disastrously different to the common platforms.
 
  Is NULL the only bad pointer worth caring about?  What sorts of bad
  pointer checks need to be made so that malfunctioning applications can
  continue running without dropping core?  How deep does the rabbit hole
  go?
 
 The Rabbit hole is very deep but this case is just about getting dinner
 for tonight, someone else can explore the rest of the warren.

Understood.  Later discussion is concerned with what to replace
the null pointer with.  Here's a suggestion for that:

In libc:printf
#pragma weak _printf_null_str_replacement
const char *
_printf_null_str_replacement(void) { return (""); }

and in printf
if (str_ptr == NULL)
	str_ptr = _printf_null_str_replacement();

and then let whatever links with libc provide something
different if it wants to.  I.e. to get SIGSEGV:
provide a function that returns NULL instead.





libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Garrett D'Amore
Jyri Virkki wrote:
 Another ARC gone wild thread? Can we keep the lessons of the past
 months in mind?

 I read the materials and there are no exported interface changes, no
 imported interface changes and not even any documentation changes.

 Only an implementation change to something formally defined as
 undefined, so while your code reviewers should have something to say
 if the implementation chooses to, say, reboot the system, code reviews
 are not in scope for ARC.

 So there's actually nothing for ARC to review here.. why file this case?
 My vote is you close it approved automatic and go fix the bug already.


   

I think someone else commented on the potential for changes in 
debuggability and performance.  ARC review is not inappropriate.




 Everything below is just me adding to the noise, so ignore for this
 case purposes.

 James Carlson wrote:
   
 An application that's incautious with NULL can't possibly just make
 that mistake with printf alone, can it?
 

 They're not being incautious with NULLs, they (C developers) do it
 because printf is known and documented to handle it.

 Oh, not on OpenSolaris?  Too bad for us, nobody cares. A great way to
 make people avoid adopting OpenSolaris is to make sure the apps they
 run successfully everywhere else crash only on OpenSolaris.

 GNU printf is documented to print '(null)', so no big surprise
 developers rely on documented behavior.

   If you accidentally pass a null pointer as the argument for a `%s'
   conversion, the GNU library prints it as `(null)'. We think this is
   more useful than crashing.

 http://www.gnu.org/software/libtool/manual/libc/Other-Output-Conversions.html

 (The text goes on to say "But it's not good practice to pass a null
 argument intentionally" but in true human/developer nature, people
 don't pay attention to that. Once the behavior has been promised and
 implemented, people will use it.)
   

So if this is the case, then let's just follow suit.  But let's do so
explicitly, with similar language in our printf() documentation, rather
than just silently doing something.  I'd assume for familiarity that
the same "(null)" string should be used, as well.

Admittedly, I'm not thrilled with this (I want my cycles back!) but I'm 
OK with it, particularly if we just go ahead and document it as an 
acceptable practice.  (In particular, I'm thinking about all the cases 
of (x ? x : "null") that are in debug statements around.  If you're 
going to make me burn the cycles to make the test in libc, at least let 
me reclaim them in my other code. :-)

-- Garrett



[zfs-discuss] zfs primarycache and secondarycache properties [PSARC/2008/393 FastTrack timeout 06/27/2008]

2008-06-25 Thread eric kustarz

On Jun 25, 2008, at 11:49 AM, Darren Reed wrote:

 This would seem to be a significant use case for the model of having
 non-overlapping data types in each of the two caches.  Since no reply
 was received on zfs-discuss, I'm redirecting it to psarc to indicate  
 that
 this question isn't closed.

I see some comments, but no direct question.  So what is the question?

eric



 Darren J Moffat wrote:
 Darren Reed wrote:

 So I spent some time thinking about different directions you could  
 build
 on this in the future, for example:
 1) controlling the size of the ARC/L2ARC by controlling the cache  
 size
 2) specifying different backing storage for primary/secondary cache
 3) having more than two levels of cache
 ...none of which is precluded by current efforts.

 With (2), if the backing storage for each cache is different and  
 it is slower
 to access the secondary cache than the primary, then you may not  
 want
 metadata to be stored in the secondary cache for performance  
 reasons.

 As an example, you might be using NVRAM (be it flash or otherwise)
 for the primary cache and ordinary RAM for the secondary.  In this  
 case
 you probably don't want any metadata to be stored in the secondary
 cache (power failure issues) but  the same may not hold for user  
 data.
 But I'm probably wrong about that.


 I doubt you would be, the primarycache is system memory not a cache
 device.  The secondarycache is the L2ARC devices specified with the
 cache vdev type to zpool so your example would be the other way
 around.






2008/403 [libc printf behaviour for NULL string]

2008-06-25 Thread Glenn Skinner
Date: Wed, 25 Jun 2008 15:45:44 -0500
From: Rick Matthews Richard.Matthews at sun.com
Subject: Re: libc printf behaviour for NULL string [PSARC/2008/403
FastTrack timeout 07/02/2008]

...
About the case:

It seems to me this is about being bug compatible with other
implementations.  This one doesn't seem particularly offensive.
WRT compatibility, is this more a gang of four issue as to
whether this is the familiarity we want?

Speaking both as a PSARC member and a gang of four member, yes, this
case definitely helps achieve the sort of familiarity we're aiming
for.  From my perspective, it's a no-brainer.

I was a charter member of the purity camp whose position I see many of
the posters to this case arguing.  But that kind of purity is a luxury
we can no longer afford.

-- Glenn




[zfs-discuss] zfs primarycache and secondarycache properties [PSARC/2008/393 FastTrack timeout 06/27/2008]

2008-06-25 Thread Darren Reed
eric kustarz wrote:

 On Jun 25, 2008, at 12:53 PM, Darren Reed wrote:

 eric kustarz wrote:

 On Jun 25, 2008, at 12:02 PM, Darren Reed wrote:

 eric kustarz wrote:

 On Jun 25, 2008, at 11:49 AM, Darren Reed wrote:

 This would seem to be a significant use case for the model of having
 non-overlapping data types in each of the two caches.  Since no 
 reply
 was received on zfs-discuss, I'm redirecting it to psarc to 
 indicate that
 this question isn't closed.

 I see some comments, but no direct question.  So what is the 
 question?

 If the primary and secondary cache are different media, especially 
 in the case
 of one being non-volatile, shouldn't it be possible to allow the 
 user to specify
 that they want to use the non-volatile cache for meta data without 
 requiring
 them to forgo caching user data in a volatile cache?

 Sure:
 # zfs set primarycache=all tank/fs
 # zfs set secondarycache=metadata tank/fs

 ARC (server memory) is the primary cache, l2ARC (SSD) is the 
 secondary cache.

 eric

 Oh. are you saying that because metadata is directly specified to be
 cached in one place, it won't also be cached in the other?  The case
 didn't make that behaviour clear, if so.

 No - the ARC will cache both data and metadata.  The l2ARC will only 
 cache metadata.



 the desire would be primary=user data, secondary=meta data...

 Desire for what workload?  You would have to *always* go to the 
 secondary cache (or disk) for metadata in order to get to the data 
 cached in the primary cache.  I don't see a sensible use case for this 
 - this is why we are not allowing a data only option.  But we've been 
 over this already.

Ugh, brain fade... I was thinking of the caches as being in parallel 
rather than layered.

Darren




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Nicolas Williams
On Wed, Jun 25, 2008 at 09:20:54AM -1000, Joseph Kowalski wrote:
 Jyri Virkki wrote:
 I read the materials and there are no exported interface changes, no
 imported interface changes and not even any documentation changes.
   
 Sigh,...
 
 But it's a very visible semantic change.

Yes: it lets certain apps run!

:)

Note: no tongue in cheek.



[zfs-discuss] zfs primarycache and secondarycache properties [PSARC/2008/393 FastTrack timeout 06/27/2008]

2008-06-25 Thread Darren Reed
eric kustarz wrote:

 On Jun 25, 2008, at 12:02 PM, Darren Reed wrote:

 eric kustarz wrote:

 On Jun 25, 2008, at 11:49 AM, Darren Reed wrote:

 This would seem to be a significant use case for the model of having
 non-overlapping data types in each of the two caches.  Since no reply
 was received on zfs-discuss, I'm redirecting it to psarc to 
 indicate that
 this question isn't closed.

 I see some comments, but no direct question.  So what is the question?

 If the primary and secondary cache are different media, especially in 
 the case
 of one being non-volatile, shouldn't it be possible to allow the user 
 to specify
 that they want to use the non-volatile cache for meta data without 
 requiring
 them to forgo caching user data in a volatile cache?

 Sure:
 # zfs set primarycache=all tank/fs
 # zfs set secondarycache=metadata tank/fs

 ARC (server memory) is the primary cache, l2ARC (SSD) is the secondary 
 cache.

 eric

Oh. are you saying that because metadata is directly specified to be
cached in one place, it won't also be cached in the other?  The case
didn't make that behaviour clear, if so.

the desire would be primary=user data, secondary=meta data...

Darren

 Darren J Moffat wrote:
 Darren Reed wrote:

 So I spent some time thinking about different directions you 
 could build
 on this in the future, for example:
 1) controlling the size of the ARC/L2ARC by controlling the cache 
 size
 2) specifying different backing storage for primary/secondary cache
 3) having more than two levels of cache
 ...none of which is precluded by current efforts.

 With (2), if the backing storage for each cache is different and 
 it is slower
 to access the secondary cache than the primary, then you may not 
 want
 metadata to be stored in the secondary cache for performance 
 reasons.

 As an example, you might be using NVRAM (be it flash or otherwise)
 for the primary cache and ordinary RAM for the secondary.  In 
 this case
 you probably don't want any metadata to be stored in the secondary
 cache (power failure issues) but  the same may not hold for user 
 data.
 But I'm probably wrong about that.


 I doubt you would be, the primarycache is system memory not a
 cache device.  The secondarycache is the L2ARC devices specified
 with the cache vdev type to zpool so your example would be the
 other way around.







2008/403 [libc printf behaviour for NULL string]

2008-06-25 Thread Joseph Kowalski

Darren (several timezones removed),

When you get through this thread, I need your position (as submitter) on 
the proposed binding of Patch.  From there, I can decide what to do with 
this case.

- thanks,

- jek3




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Joseph Kowalski
Jyri Virkki wrote:
 I read the materials and there are no exported interface changes, no
 imported interface changes and not even any documentation changes.
   
Sigh,...

But it's a very visible semantic change.

- jek3




libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Rick Matthews
Nicolas Williams wrote:
 On Wed, Jun 25, 2008 at 09:20:54AM -1000, Joseph Kowalski wrote:
   
 Jyri Virkki wrote:
 
 I read the materials and there are no exported interface changes, no
 imported interface changes and not even any documentation changes.
  
   
 Sigh,...

 But it's a very visible semantic change.
 

 Yes: it lets certain apps run!

 :)

 Note: no tongue in cheek.
   
About the fast track length and protocol:

I picked Nico's message to reply to for no particular reason except the
header should ensure it be recorded properly (note: use the ARC and
case as part of the header).

This thread is also getting a bit long. Darren, is it time to make it a
discussion at PSARC rather than an email trail?

About the case:

It seems to me this is about being bug compatible with other
implementations.  This one doesn't seem particularly offensive.
WRT compatibility, is this more a gang of four issue as to
whether this is the familiarity we want?

-- 
Rick Matthews                   email: Rick.Matthews at sun.com
Sun Microsystems, Inc.          phone: +1(651) 554-1518
1270 Eagan Industrial Road      phone(internal): 54418
Suite 160                       fax:   +1(651) 554-1540
Eagan, MN 55121-1231 USA        main:  +1(651) 554-1500



[zfs-discuss] zfs primarycache and secondarycache properties [PSARC/2008/393 FastTrack timeout 06/27/2008]

2008-06-25 Thread Darren Reed
eric kustarz wrote:

 On Jun 25, 2008, at 11:49 AM, Darren Reed wrote:

 This would seem to be a significant use case for the model of having
 non-overlapping data types in each of the two caches.  Since no reply
 was received on zfs-discuss, I'm redirecting it to psarc to indicate 
 that
 this question isn't closed.

 I see some comments, but no direct question.  So what is the question?

If the primary and secondary cache are different media, especially in
the case of one being non-volatile, shouldn't it be possible to allow
the user to specify that they want to use the non-volatile cache for
meta data without requiring them to forgo caching user data in a
volatile cache?

Darren

 Darren J Moffat wrote:
 Darren Reed wrote:

 So I spent some time thinking about different directions you could 
 build
 on this in the future, for example:
 1) controlling the size of the ARC/L2ARC by controlling the cache size
 2) specifying different backing storage for primary/secondary cache
 3) having more than two levels of cache
 ...none of which is precluded by current efforts.

 With (2), if the backing storage for each cache is different and it 
 is slower
 to access the secondary cache than the primary, then you may not want
 metadata to be stored in the secondary cache for performance reasons.

 As an example, you might be using NVRAM (be it flash or otherwise)
 for the primary cache and ordinary RAM for the secondary.  In this 
 case
 you probably don't want any metadata to be stored in the secondary
 cache (power failure issues) but  the same may not hold for user data.
 But I'm probably wrong about that.


 I doubt you would be, the primarycache is system memory not a cache
 device.  The secondarycache is the L2ARC devices specified with the
 cache vdev type to zpool so your example would be the other way around.






libc printf behaviour for NULL string [PSARC/2008/403 FastTrack timeout 07/02/2008]

2008-06-25 Thread Jyri Virkki

Another ARC gone wild thread? Can we keep the lessons of the past
months in mind?

I read the materials and there are no exported interface changes, no
imported interface changes and not even any documentation changes.

Only an implementation change to something formally defined as
undefined, so while your code reviewers should have something to say
if the implementation chooses to, say, reboot the system, code reviews
are not in scope for ARC.

So there's actually nothing for ARC to review here.. why file this case?
My vote is you close it approved automatic and go fix the bug already.






Everything below is just me adding to the noise, so ignore for this
case purposes.

James Carlson wrote:

 An application that's incautious with NULL can't possibly just make
 that mistake with printf alone, can it?

They're not being incautious with NULLs, they (C developers) do it
because printf is known and documented to handle it.

Oh, not on OpenSolaris?  Too bad for us, nobody cares. A great way to
make people avoid adopting OpenSolaris is to make sure the apps they
run successfully everywhere else crash only on OpenSolaris.

GNU printf is documented to print '(null)', so no big surprise
developers rely on documented behavior.

  If you accidentally pass a null pointer as the argument for a `%s'
  conversion, the GNU library prints it as `(null)'. We think this is
  more useful than crashing.

http://www.gnu.org/software/libtool/manual/libc/Other-Output-Conversions.html

(The text goes on to say "But it's not good practice to pass a null
argument intentionally" but in true human/developer nature, people
don't pay attention to that. Once the behavior has been promised and
implemented, people will use it.)



Garrett D'Amore wrote:

 Is the next step really to start checking for null arguments to other 
 string functions?  What about null pointers passed to other library 
 routines, such as free(), qsort(), bsearch()?

I didn't see Darren propose that so not this case. But if you'd like
to go research all those functions to see if there are some other
areas where there is a serious disconnect between the de facto industry
standards and the OpenSolaris implementation, hurting OpenSolaris
adoption, it would be useful info to share later.


-- 
Jyri J. Virkki - jyri.virkki at sun.com - Sun Microsystems