from:"Kais Belgaied"

[2] FOSS case: Yersinia Layer 2 Attack Tool [PSARC/2009/643 FastTrack timeout 12/01/2009]

2009-11-25 Thread Kais Belgaied

the case should've used the term multi-protocol packet generation 
instead of attack tool.

At this point I'm withdrawing the case, and will be consulting with the 
project team on whether to submit it back as a full case or deliver it 
under /contrib, as was suggested here and off-line.


Kais


On 11/25/09 07:06, John Fischer wrote:
 Kais,

 Although this case might be a familiarity case I think it fails the
 non-controversial condition for a fast track.  Garrett and Darren
 have already questioned it raising the controversial question.  I
 also question the need to supply this in our repositories.  I would
 suggest that it be a full case.

 John

[2] FOSS case: Yersinia Layer 2 Attack Tool [PSARC/2009/643 FastTrack timeout 12/01/2009]

2009-11-24 Thread Kais Belgaied

FOSS questionnaire and draft man page have been placed in the case 
directory.

Kais.

FOSS case: Yersinia Layer 2 Attack Tool [PSARC/2009/643 FastTrack timeout 12/01/2009]

2009-11-24 Thread Kais Belgaied


Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
 FOSS case: Yersinia Layer 2 Attack Tool
1.2. Name of Document Author/Supplier:
 Author:  Si-wei Liu
1.3  Date of This Document:
24 November, 2009

2. Project Summary
   2.1 Project Description

   This project introduces the package of yersinia 0.7.1 into the
   SFW consolidation.

4. Technical Description

   Yersinia implements several attacks for the following protocols:
   Spanning Tree (STP), Cisco Discovery (CDP), Dynamic Host
   Configuration (DHCP), Hot Standby Router (HSRP), Dynamic Trunking
   (DTP), 802.1q, Inter-Switch Link Protocol (ISL), and VLAN
   Trunking (VTP). It helps the pen-tester in different tasks, such
   as becoming the root role in the Spanning Tree, creating virtual
   CDP neighbors, setting up rogue DHCP servers, becoming the active
   router in a HSRP scenario, enabling trunk, performing ARP
   spoofing over VLAN hopping, adding or deleting VLANs (via VTP),
   and more. 

   yersinia is quite portable and runs on a variety of platforms.

   Command name   Notes
   ============   ============================================
   yersinia       Penetration testing tool for layer 2 attacks


5. Interfaces 

   Exported interface               Classification   Interface type
   ==============================   ==============   ================
   SUNWyersinia                     Uncommitted      Package name
   /usr/bin/yersinia                Uncommitted      Command
   /usr/share/man/man8/yersinia.8   Uncommitted      Manpage

   Imported interface               Classification   Interface type
   ==============================   ==============   ================
   /usr/lib/libnet.so.1.1.2.1       Volatile         Library provided
                                                     by SUNWlibnet

   Yersinia does not use any environment variable.

   draft man page and FOSS questionnaire to follow


6. Resources and Schedule:
   6.4. Product Approval Committee requested information:
6.4.1. Consolidation or Component Name:
   SFW
   6.5. ARC review type: FastTrack
   6.6. ARC Exposure: open


Public GLDv3 Interfaces [PSARC/2009/638 FastTrack timeout 11/26/2009]

2009-11-21 Thread Kais Belgaied

+1

Kais

On 11/20/09 11:25, Nicolas Droux wrote:
 Seb,

 My intent is to include this entry point as part of the GLDv3 APIs 
 being committed. It is documented in the mac(9F) man page draft [1],
 but I did not list it in the overview. I will update the spec to list 
 it there as well.

 Nicolas.

 [1] Available in the materials for the case, see
 http://arc.opensolaris.org/caselog/PSARC/2009/638/materials/man/mac-9f.txt 



 Sebastien Roy wrote:
 I haven't reviewed the materials fully yet, but a quick string search of
 the spec doesn't turn up any references to mac_init_ops(), and this
 function must be called in drivers' _init() routines.  This may have
 been an oversight.

 -Seb

Network Auto-Magic (NWAM) Phase 1 Updates [PSARC/2009/577 FastTrack timeout 10/29/2009]

2009-10-28 Thread Kais Belgaied

On 10/22/09 10:40, Sebastien Roy wrote:
 11. Add default-route properties to IP Interface NCUs
 Two new properties, ipv4-default-route and ipv6-default-route, allow
 the user to specify statically configured default router address(es),
 to be associated with a specific interface.  This provides a static
 alternative to a DHCP-specified default router, which may be associated
 with an interface if DHCP is in use.
   

alternative to the DHCP-specified default router (returned by dhcpinfo I 
presume) or in addition to it, as a second default router?

 17. Change upgrade behavior
 Upon upgrade, earlier nwam link and interface configuration will be
 imported into the User NCP.  However, the Automatic NCP will be active
 by default.  The rationale for this change is that the default config
 implemented in earlier nwam versions is the same as the Automatic NCP
 behavior, and we expect that most users will not have made changes, and
 therefore will want the Automatic NCP.  The previously discussed change
 with respect to automatic addition/removal of inserted/removed links
 makes this especially desirable.  Users who actually modified their
 earlier configuration (which should be a small minority) can switch to
 the User NCP to get their changes.

 There is one exception: if any static addresses are specified in the
 llp file, it is very clear that the user did in fact modify that file;
 therefore, if a static address is found, the User NCP will be active
 upon upgrade.

 Location profiles did not exist in earlier versions of NWAM, so any
 configuration that NWAM does based on Location specifications may
 overwrite previous system configuration.  On upgrade, the existing
 configuration will be saved into a User location.  This location will
 be activated if it includes an nsswitch.conf file which uses a nameservice
 other than DNS (i.e. a nameservice that cannot be configured by NWAM
   

you mean other than DNS *and* files?

 based solely on information obtained from the network).
   

Kais

OVF Support in virt-convert [PSARC/2009/548 FastTrack timeout 10/16/2009]

2009-10-20 Thread Kais Belgaied

 On 10/19/09 07:12, Sebastien Roy wrote:
 Kais,

 Are you satisfied with Susan's answers to your questions?
 

almost there.


 On 10/12/09 11:17, Susan Kamm-Worrell wrote:

 The open source virt-convert import does not yet support the TAR (ova) 
 format.

 It does support an input of a directory that contains the OVF package 
 files or an input of the ovf file directly.  If specifying the ovf file
 directly the ovf file will describe the other files required by
 the OVF package.

OK,
Could you give an example or list in the text of the draft man page what 
the content of
such dir looks like?
Can you clarify if there are files ignored in the package content dir 
while importing (I'm thinking about the cert files for integrity
checks).

Kais.

OpenSSL RSA keys by reference in PKCS#11 keystores through the PKCS11 engine [PSARC/2009/555 FastTrack timeout 10/20/2009]

2009-10-20 Thread Kais Belgaied

On 10/19/09 01:21, Darren J Moffat wrote:

 While these are all good points we (Solaris) don't own the 
 documentation for these APIs and we didn't design them.  These are 
 OpenSSL APIs that are documented in OpenSSL documentation and we don't 
 modify those docs.

is there any doc that comes from Sun where we could capture these 
gotchas that a developer will encounter?

Kais.


 What you have said is true regardless of which OpenSSL ENGINE is in 
 use and isn't unique to the Solaris provided pkcs11 engine.

OpenSSL RSA keys by reference in PKCS#11 keystores through the PKCS11 engine [PSARC/2009/555 FastTrack timeout 10/20/2009]

2009-10-20 Thread Kais Belgaied


   I may add a note to our openssl(5) draft change that high level 
 API must be used for that, and can add an example of few such functions 
 so that a user can get the picture. Is that OK?
   

sounds good.

+1.

Kais
   thanks, Jan.

OpenSSL RSA keys by reference in PKCS#11 keystores through the PKCS11 engine [PSARC/2009/555 FastTrack timeout 10/20/2009]

2009-10-16 Thread Kais Belgaied

+0.75


a couple of questions below


 + OpenSSL can access RSA keys in PKCS#11 keystores using the
 + following functions of the ENGINE API:
 +
 +   EVP_PKEY *ENGINE_load_private_key(ENGINE *e,
 +   const char *key_id, UI_METHOD *ui_method,
 +   void *callback_data)
 +
 +   EVP_PKEY *ENGINE_load_public_key(ENGINE *e,
 +   const char *key_id, UI_METHOD *ui_method,
 +   void *callback_data)
   

given the semantics described in the case, these functions will fail for 
multiple reasons: bad argument, key not found,
bad internal state (engine hasn't initialized or hasn't authenticated to 
the token). Yet the return value
can be either NULL: failure or Not NULL: a matching key was retrieved.
It will be more helpful to give the app developers some info as to the 
reason of failure, so that they
know what to do when the load function returns NULL.

Possibly Missing:
--
1. Need to mention somewhere that the caller of the load functions is 
responsible for calling EVP_PKEY_free().

2. Since the private parts of the on-token keys are never read by the 
engine, there is an implication for all OpenSSL access routines, like 
EVP_PKEY_copy_parameters(), EVP_PKEY_get1_RSA(), etc. They will all 
fail when the pkey arg comes from a token.
Rather than chasing the dozens of functions that use RSA private keys in 
openssl, maybe it suffices to document that EVP_Decrypt() and 
EVP_PKEY_free() are the only routines that can use an RSA private key 
by reference.

Kais.

Pass-through iconv code conversion [PSARC/2009/561 FastTrack timeout 10/21/2009]

2009-10-16 Thread Kais Belgaied

+1

Kais

OVF Support in virt-convert [PSARC/2009/548 FastTrack timeout 10/16/2009]

2009-10-11 Thread Kais Belgaied

- Scope of this case:
  Typically, you don't get a naked .ovf file. You get an OVF package, 
  which is a tarball of the content (actual vmdk's of a disk image, a 
  possible .iso if needed for startup, an optional signing cert ...) 
  along with the .ovf that describes the metadata for configuring 
  the VM. The package is what VMware's OVF tool exports and imports. It 
  seems to me that stopping at importing and producing only the .ovf 
  falls short of delivering a complete answer and leaves the user on 
  his/her own to go assemble the needed parts.

- Interoperability:
  OVF defines 3 levels of conformance, depending on the attributes and 
optional extensions implemented.
  What is the conformance level of  OVF files produced/exported by 
virt-convert?
  On the import side, what level is understood by this implementation?

- Evolution
  Any tying to a particular version of the format?

Kais.
 
On 10/09/09 12:57, Sebastien Roy wrote:
 The Open Virtualization Format (OVF) is the latest industry wide 
 format used to move guest VMs between different v12n platforms.

 In following the upstream virt-install project, virt-convert  
 has been enhanced to auto-detect the OVF format as an input type 
 or to have it explicitly set on use as follows:

 usage: virt-convert -i ovf inputdir|input.vmx|input.ovf [outputdir|output.xml]

 The manpage has also been updated to reflect the support of the OVF format.
 (see the man page in the materials directory).

 3.  Input/output formats

 virt-convert now supports both .vmx and OVF format input files.

 4.  Interface table

 An additional input format is listed in the interface table describing
 the still changing OVF specification.

 virt-convert command line      Uncommitted
 virt-convert output            Not-an-interface
 virt-instance output format    Uncommitted
 VMX input format               Volatile
 OVF input format               Volatile

 5.  References

 PSARC/2008/579 virt-convert

flowadm(1m) remote_port flow attribute [PSARC/2009/488 FastTrack timeout 09/21/2009]

2009-09-22 Thread Kais Belgaied

This case has its +1 and timed out yesterday. Marking it closed.

Kais.

Dynamic Ring Grouping on NICs [PSARC/2009/501 FastTrack timeout 09/25/2009]

2009-09-18 Thread Kais Belgaied


Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
 Dynamic Ring Grouping on NICs
1.2. Name of Document Author/Supplier:
 Author:  Venu Iyer
1.3  Date of This Document:
18 September, 2009
4. Technical Description
I'm filing this fasttrack for Venu Iyer. The release binding is patch.
The interface taxonomy is Uncommitted

Background
==========

Project Crossbow (PSARC/2006/357) enables creating hardware-based MAC
clients (some of these MAC clients are data links such as VNICs) both on
the RX and TX side. We define hardware-based MAC clients as having dedicated
hardware resources; an RX hardware-based MAC client will have one or more RX
rings for exclusive use, while a TX hardware-based MAC client will have one or
more TX rings for exclusive use.  MAC clients that are not hardware-based
(RX or TX) share hardware resources with other MAC clients; such MAC
clients will not have any TX or RX rings exclusively reserved for them.
MAC clients may be hardware-based on RX, but not on TX (and vice-versa).

Currently, when a NIC registers with MAC it informs MAC if it supports
dedicated hardware RX or TX rings. MAC assigns hardware rings to MAC
clients as groups, where a group may contain 1 or more hardware rings.

dladm show-phys is currently used to show how RX rings are used by MAC
clients.

# dladm show-phys -H nxge4
LINK     GROUP  GROUPTYPE  RINGS  CLIENTS
nxge4    0      RX         3      nxge4
nxge4    1      RX         1      vnic1

which says we have 1 RX hardware-based MAC client, vnic1, with 1 ring.
nxge4, the primary MAC client, is using 3 rings, but will share
these with any other MAC client that is subsequently created on the
data link nxge4 (i.e. if vnic2 is created on nxge4, MAC clients vnic2 and
nxge4 will share group 0, and hence the 3 rings), e.g.:

# dladm show-phys -H nxge4
LINK     GROUP  GROUPTYPE  RINGS  CLIENTS
nxge4    0      RX         3      nxge4,vnic2
nxge4    1      RX         1      vnic1

Information about TX rings is not shown by the show-phys subcommand.

Today, an administrator can specify that a VNIC must be hardware-based on the
RX side (using the -H option to dladm create-vnic). However, there is
no way for an administrator to specify

o that a MAC client (VNIC or primary MAC client) should be software
  based, i.e. should not have any dedicated hardware resource,

o that a MAC client should be hardware or software based on TX.

o the number of RX or TX rings needed for a MAC client.

Proposal
========

This proposal gives administrative control over whether a MAC client
should be hardware-based or not (RX and TX) and also allows them to
specify the number of RX or TX rings that a MAC client needs, if it is
hardware-based.

We introduce two properties for a link:

rxringcnt: The number of RX rings needed.
txringcnt: The number of TX rings needed.

The values for these properties could be:

0 : This link must not be assigned any hardware rings of the
specified type.

x > 0 : This link needs x rings.

If the property is not specified for a link, the system will attempt
to maximize the hardware resource utilization by making this MAC client
hardware-based depending on ring availability.

E.g:

# dladm create-vnic -p rxringcnt=0 -l nxge0 vnic1

Will create vnic1 which will not be RX hardware-based.

# dladm create-vnic -p txringcnt=2 -l nxge0 vnic2

Will create vnic2 that will be TX hardware-based with 2 TX rings.

# dladm create-vnic -p rxringcnt=2,txringcnt=2 -l nxge0 vnic3

Will create vnic3 which will be both RX and TX hardware-based, with 2
RX rings and 2 TX rings.
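
The rxringcnt/txringcnt semantics above can be sketched as a small Python
model. This is only an illustration: the function name, the single-ring
default for an unset property, and the error raised when too few rings are
free are assumptions, not behaviour stated in the proposal.

```python
from typing import Optional


def resolve_ring_request(ringcnt: Optional[int], available: int) -> int:
    """Return the number of dedicated rings a MAC client would receive.

    ringcnt is None -> property not set: best effort
    ringcnt == 0    -> software-based, no dedicated rings
    ringcnt > 0     -> the link needs exactly that many rings
    """
    if available < 0:
        raise ValueError("available ring count cannot be negative")
    if ringcnt is None:
        # Assumption: "best effort" is modelled as one ring when any is free.
        return 1 if available > 0 else 0
    if ringcnt < 0:
        raise ValueError("ring count cannot be negative")
    if ringcnt == 0:
        return 0
    if ringcnt > available:
        # Assumption: the proposal does not say what happens in this case.
        raise ValueError("not enough free rings")
    return ringcnt
```

With 4 free RX rings, a request of rxringcnt=2 yields 2 dedicated rings and
rxringcnt=0 yields none, matching the vnic1 and vnic3 examples above.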

Modifying the RX or TX rings assigned to an existing link, say nxge0,
can be done using set-linkprop,

e.g. if nxge0 needs to be given 2 RX rings:

# dladm set-linkprop -p rxringcnt=2 nxge0

or for a VNIC, say vnic1, as:

# dladm set-linkprop -p txringcnt=2 vnic1

The rings assigned to a link can be viewed using show-linkprop as:

# dladm show-linkprop nxge0
LINK     PROPERTY     PERM  VALUE  DEFAULT  POSSIBLE
...
nxge0    rxringcnt    rw    2      --       0-4
nxge0    txringcnt    rw    5      --       0-6
...


These new properties obsolete the -H option of dladm create-vnic (i.e.
the -H option will be removed).

Given that we allow specifying RX and TX rings for links, we need a way
to display how many rings are available for use. Additionally, we need to
provide the number of hardware-based MAC clients that can be created on the RX
and TX side.

We introduce 4 additional read-only properties to display this information:

rxringavailcnt: The total number of RX rings available for use,
i.e. not exclusively given to any MAC client.

IP_DONTFRAG socket option [PSARC/2009/494 FastTrack timeout 09/23/2009]

2009-09-16 Thread Kais Belgaied

+1

Kais,

On 09/16/09 09:03, Sebastien Roy wrote:
 I'm submitting this fast-track for Erik Nordmark, it times out on
 09/23/2009.  The release binding is Patch.

 Background:
 --

 Busy DNS servers, and other servers that do UDP request/response 
 protocols, typically want to avoid Path MTU discovery since path MTU 
 discovery both adds latency (a packet would be dropped by routers 
 instead of being fragmented and forwarded) and adds state on the server 
 (Path MTU state would be created for the destination IP address).

 That is counterproductive when there are lots of clients that do a 
 single UDP request/response to the server.

 Details:
 ---

 For IPv6 we have had two socket options to control this (from RFC 3542):
   IPV6_USE_MIN_MTU and IPV6_DONTFRAG.

 But there is no standard for IPv4.
 However, FreeBSD implements an IP_DONTFRAG socket option, which is used 
 by the BIND DNS server software.

 This case is to introduce IP_DONTFRAG in Solaris.

   Exported Interfaces
   -------------------

   Interface      Classification   Comments
   ---------------------------------------
   IP_DONTFRAG    Committed        ip(7P)
   ---------------------------------------


 Man page updates:
 

 Add this text to ip(7P) after IP_TOS:
   IP_DONTFRAG  If enabled (the default) then the Don't
                Fragment flag is set on IP packets. Disabling
                the option means that Don't Fragment will not
                be set which will result in not creating any
                Path MTU state due to this socket.
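
A minimal sketch of how a UDP server might disable the option as described
in the man page text above. It is written in Python for brevity and rests on
assumptions: the helper name is made up, and Python's socket module may not
expose IP_DONTFRAG on a given platform, so the constant is looked up
defensively and the helper simply reports whether the option could be set.

```python
import socket


def disable_dont_fragment(sock: socket.socket) -> bool:
    """Try to clear IP_DONTFRAG on sock; return True only if it was applied."""
    # Assumption: the constant may be missing from this platform's socket module.
    opt = getattr(socket, "IP_DONTFRAG", None)
    if opt is None:
        return False
    try:
        # A value of 0 disables the option, so Don't Fragment is not set.
        sock.setsockopt(socket.IPPROTO_IP, opt, 0)
    except OSError:
        return False
    return True
```

A server would call this once on its UDP socket before serving requests and
fall back to default behaviour when it returns False.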

netstat -r flags for blackhole and reject routes [PSARC/2009/495 FastTrack timeout 09/23/2009]

2009-09-16 Thread Kais Belgaied

+1

Kais.

On 09/16/09 09:10, Sebastien Roy wrote:
 I'm submitting this fast-track for Erik Nordmark.  It times out on
 09/23/2009.  The release binding is Minor due to the change in semantics
 of the netstat -r 'B' route flag.

flowadm(1m) remote_port flow attribute [PSARC/2009/488 FastTrack timeout 09/21/2009]

2009-09-15 Thread Kais Belgaied

On 09/15/09 07:16, Garrett D'Amore wrote:
 +1

 I guess this is mainly useful for sites that reuse the same connection 
 (or reconnect on the same port) repeatedly, e.g. for making offsite 
 backups or somesuch?

yep.

Kais.


-- Garrett

flowadm(1m) remote_port flow attribute [PSARC/2009/488 FastTrack timeout 09/21/2009]

2009-09-14 Thread Kais Belgaied


Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
 flowadm(1m) remote_port flow attribute
1.2. Name of Document Author/Supplier:
 Author:  Kais Belgaied
1.3  Date of This Document:
14 September, 2009
4. Technical Description
flowadm(1m) currently offers only the specification of the local port
with TCP and UDP transports. That addresses the need to express bandwidth
and priority constraints for services, identified by the transport and
the local port number.
Support for transport + remote port is needed to allow the creation of flows
that describe and regulate outbound communication to remote services.
This case adds a new flow attribute remote_port to flowadm(1m)
with the same restrictions and interface taxonomy as the existing flow
attributes.
The draft man page will be placed in the case directory.
The release binding is patch.
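
To illustrate the difference between matching on the local port and on the
remote port, here is a small Python model of flow classification. It is a
sketch under assumptions: the Flow class, the classify() helper, the flow
names and the port numbers are all hypothetical and say nothing about how
flowadm or the kernel classifier is implemented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Flow:
    name: str
    transport: str                      # "tcp" or "udp"
    local_port: Optional[int] = None    # identifies a service on this host
    remote_port: Optional[int] = None   # identifies a service on the peer


def classify(flows: Sequence[Flow], transport: str,
             local_port: int, remote_port: int) -> Optional[str]:
    """Return the name of the first flow matching the packet, if any."""
    for flow in flows:
        if flow.transport != transport:
            continue
        if flow.local_port is not None and flow.local_port != local_port:
            continue
        if flow.remote_port is not None and flow.remote_port != remote_port:
            continue
        return flow.name
    return None


flows = [
    Flow("httpd", "tcp", local_port=80),      # inbound: local service
    Flow("backup", "tcp", remote_port=873),   # outbound: remote service
]
```

A connection from an ephemeral local port to remote port 873 falls into the
"backup" flow, which a local-port-only attribute could not express.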

6. Resources and Schedule
6.4. Steering Committee requested information
6.4.1. Consolidation C-team Name:
ON
6.5. ARC review type: FastTrack
6.6. ARC Exposure: open

Opinion for PSARC review - PSARC/2009/364 dlstat and flowstat

2009-09-04 Thread Kais Belgaied

On 09/04/09 08:04, Sebastien Roy wrote:
 On Wed, 2009-09-02 at 12:10 -0700, Kais Belgaied wrote:
   
 3.  Interfaces

 Exported Interfaces
 Interface Name        Classification   Comments
 --------------------  ---------------  ---------
 /usr/sbin/dlstat      Committed        SUNWcsu
 /usr/sbin/flowstat    Committed        SUNWcnetr
 

 The newly Obsolete interface should be listed there, no?

   

yes I added
dladm show-* -s        Obsolete   See draft man pages.
flowadm show-flow -s   Obsolete

to the text.

thanks,
Kais.

Opinion for PSARC review - PSARC/2009/364 dlstat and flowstat

2009-09-02 Thread Kais Belgaied

The project's modified doc per requested spec update and off-line 
editorial comments is in the finals.materials of the
case directory.  Attached is the opinion for PSARC review by 09/09/09.

   Kais.
   http://blogs.sun.com/kais

-- next part --
An embedded and charset-unspecified text was scrubbed...
Name: opinion.ascii
URL: 
http://mail.opensolaris.org/pipermail/opensolaris-arc/attachments/20090902/9d1d1a0d/attachment.ksh

Opinion for PSARC review - PSARC/2009/436 Anti-spoofing Link Protection

2009-08-26 Thread Kais Belgaied

The project's modified doc per requested spec updates is in the 
finals.materials of the case directory.
Attached is the opinion for PSARC review by 09/03/2009.

Kais.
http://blogs.sun.com/kais
-- next part --
An embedded and charset-unspecified text was scrubbed...
Name: opinion.ascii
URL: 
http://mail.opensolaris.org/pipermail/opensolaris-arc/attachments/20090826/0afbe354/attachment.ksh

pool dladm link property [PSARC/2009/448 FastTrack timeout 08/25/2009]

2009-08-19 Thread Kais Belgaied

On 08/19/09 01:58, Darren J Moffat wrote:

 Looks perfectly reasonable and has the user interface I'd expect so +1 
 from me.


+1 from me too

Kais.

PSARC/2009/374 libxmlsec

2009-07-08 Thread Kais Belgaied


   The libxmlsec source code tarball will be in the SFW gate and use
   a build harness similar to other libxml2 libraries.  It will be
   configured and compiled with only its libxmlsec-openssl module
   to support OpenSSL as the underlying encryption library.
   The libxmlsec-openssl crypto module is libxmlsec's default
   module, is MIT licensed and can make use of Sun's OpenSSL crypto
   engine to use the Userland Encryption Framework.  Legal approval for
   this usage is covered by OSR 7806.
   A follow up ARC cases may be filed after RFE#6479874 integrates
   in our OpenSSL implementation to improve crypto engine usage.
   A future ARC case could also switch us from using the OpenSSL
   module to a new module with more direct access to the crypto framework.
   Such a module would first need to be integrated in the community
   project.

so, any dependency on a particular version of OpenSSL's lib{crypto,ssl} ?

doesn't this case require an ARC contract against PSARC/2003/500 for the 
import of OpenSSL ?

Kais

PSARC/2009/374 libxmlsec

2009-07-08 Thread Kais Belgaied

On 07/08/09 13:36, Nicolas Williams wrote:
 On Wed, Jul 08, 2009 at 04:24:10PM -0400, Will Young wrote:
   
 On Wed, 2009-07-08 at 15:19 -0500, Nicolas Williams wrote:
 
 On Wed, Jul 08, 2009 at 04:06:53PM -0400, Will Young wrote:
   
 On Wed, 2009-07-08 at 12:11 -0700, Kais Belgaied wrote:
 
 so, any dependency on a particular version of OpenSSL's lib{crypto,ssl} ?

 doesn't this case require an ARC contract against PSARC/2003/500 for the 
 import of OpenSSL ?
   
 I no longer saw a need for a contract as of the integration of:
 6806387 Move OpenSSL from ON to SFW
 
 The move alone could not imply a change of interface stability.
 PSARC/2006/555 (Move OpenSSL to /usr) did not change the interface
 stability of any part of OpenSSL (see section 4.5 of the one-pager).
   
  It's my understanding that within one gate no contract is needed
 (updating any element of the gate one is responsible for
 examining/updating related elements.)  Is that not accurate?
 


well, looking back at PSARC/2003/500, it exports the interfaces as 
Project Private, not Consolidation Private.
So the contract would still be needed.

Alternatively, the supplier of PSARC/2003/500 can judge if it is the 
right thing for the openssl libs' visibility to be upgraded to 
Consolidation Private.

Kais.

 Ah, sorry, another thinko on my part.

PSARC/2009/374 libxmlsec

2009-07-08 Thread Kais Belgaied

On 07/08/09 13:48, Garrett D'Amore wrote:

 Actually, it depends. A contract would be required if the interfaces 
 are Project Private, even within the same consolidation. If the 
 interfaces are Consolidation Private, then yes, no contract would be 
 required.

 Actually, with that last statement, it's clear that moving a subsystem 
 which has Consolidation Private interfaces to a new subsystem has 
 *ARCHITECTURAL* impact. With that in mind, I hope such moves are 
 properly reviewed at ARC. (Something for folks doing such moves to 
 consider...)

yep.

Also, I haven't seen an answer to my other question about dependency on 
a particular version of openssl libs.

Kais


 - Garrett

Crossbow Import of Interrupt Affinity Interfaces [PSARC/2009/382 Self Review]

2009-07-07 Thread Kais Belgaied


Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
 Crossbow Import of Interrupt Affinity Interfaces
1.2. Name of Document Author/Supplier:
 Author:  Rajagopal Kunhappan
1.3  Date of This Document:
07 July, 2009
4. Technical Description
This fasttrack covers the changes required for the Crossbow architecture
to import the DDI interfaces below, introduced by PSARC 2009/340. 
The interface taxonomy is Contracted Project Private. A copy of
the contract is placed in this case's directory.

Since the architectural impact of the interrupt affinity DDI interfaces and
their consumers has already been reviewed and approved with PSARC 2009/340,
I am marking this case as approved, pending the managers' emails
accepting the contract's terms. I'll be happy to set a timer if
needed.

typedef processorid_t ddi_intr_target_t;

int ddi_intr_get_affinity(ddi_intr_handle_t h, ddi_intr_target_t *tgt_p);
int ddi_intr_set_affinity(ddi_intr_handle_t h, ddi_intr_target_t tgt);

Crossbow framework will be a consumer of these interfaces.

More details on Crossbow requirements:
1) Crossbow provides a framework by which NIC resources such as Rx and
Tx rings are exposed to the MAC layer. The MAC layer doles out these
resources to VNICs when they get created while reserving a fixed amount
for the primary NIC. CPUs, on which the processing of packets takes
place, can be specified at VNIC creation time or later.  If they are
specified, the interrupts associated with the Rx/Tx rings need to be
re-targeted to the specified CPUs. A mechanism by which a specific MSI-X
interrupt can be re-targeted to a different CPU is needed. This is for
the virtualization part of Crossbow.

2) For optimal performance of regular NICs (as well as VNICs), the poll
thread associated with an Rx ring should be bound to the same CPU as the
interrupt CPU. So given an interrupt handle and a CPU, a mechanism is
needed to re-target the interrupt to the specified CPU.

The above 2 requirements are addressed by the interfaces introduced in
PSARC 2009/340. 

6. Resources and Schedule
6.4. Steering Committee requested information
6.4.1. Consolidation C-team Name:
ON
6.5. ARC review type: Automatic
6.6. ARC Exposure: open

inception review summary of PSARC/2009/364 - dlstat and flowstat

2009-06-24 Thread Kais Belgaied

Below are the main architectural issues from the PSARC/2009/364 
inception review,
to be addressed before the commitment review.

- Use of kstats as underlying stats, rather than a new way.
  The project team will need to justify why kstats aren't suitable for
  accumulating and reporting the counters needed to be extracted
  by dlstat and flowstat

- Drop the verb (show, show-history, reset) from the subcommand, replacing
  it with simple options. This is consistent with the rest of the
  {vm,io,net,..}stat commands in the system.

- Multiple issues with reset stats:
  - The loss of information accumulated since boot time may hurt the
    diagnosability of problems with flows and links.
  - The action needs to be privileged.
  - suggestion: move the reset subcommand to {dl,flow}adm, with expected
    effect of resetting the state of the datalink/flow, which includes the
    usage statistics thereof.

- Clarify change of interface commitment level of existing show-* statistic
  related subcommands of dladm and flowadm, from 'committed' to
  'committed obsolete'

- The '-o' (+ or -) option: be consistent with the rest of the commands in
  general (ps, dladm, etc.). Use -o with the exact list of columns to
  be displayed.

- Before commitment, the man pages need to be updated to reflect the changes
  from the inception review.

Kais.

inception review summary of PSARC/2009/364 - dlstat and flowstat

2009-06-24 Thread Kais Belgaied

Thanks Jim (and good to hear from you). I captured this issue as jdc-01 
in the issues file,
and Shri indicated off-line that he will answer and follow up on 
crossbow-discuss at opensolaris.org.

Kais.

On 06/24/09 12:09, James Carlson wrote:
 Kais Belgaied writes:
   
 - Multiple issues with reset stats:
   - The loss of information accumulated since boot time may hurt the
 diagnosability of problem with flows and links.
   - The action needs to be privileged.
   - suggestion: move the reset subcommand to {dl,flow}adm, with expected
 effect of resetting the state of the datalink/flow, which includes the
 usage statistics thereof.
 

 I suggest getting rid of reset altogether.  Besides being
 fundamentally incompatible with SNMP instrumentation, reset just
 isn't necessary or complete as long as you have decent delta-
 calculating tools.  (And I'd really rather have a good way of
 computing deltas -- especially as a non-privileged user -- than having
 an inaccessible way to nuke kernel counters.)
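
The delta-calculating approach suggested above can be sketched in a few
lines of Python. This is an illustration only: the function name, the
counter names and the numbers are made up, and counter wraparound is
ignored for simplicity.

```python
from typing import Dict


def counter_deltas(previous: Dict[str, int],
                   current: Dict[str, int]) -> Dict[str, int]:
    """Difference between two snapshots of monotonically increasing counters."""
    # A counter absent from the earlier snapshot is treated as starting at 0.
    return {name: value - previous.get(name, 0)
            for name, value in current.items()}


# Two snapshots taken some interval apart (made-up numbers).
before = {"ipackets": 1000, "opackets": 400}
after = {"ipackets": 1750, "opackets": 460}
deltas = counter_deltas(before, after)
```

Because the tool only reads the counters and subtracts, it needs no
privilege and leaves the since-boot totals intact for other consumers.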

Materials for PSARC/2009/364 (dlstat and flowstat) submitted for review on 06/24/2009

2009-06-18 Thread Kais Belgaied

The materials have been submitted in the case directory and should be 
reflected on the opensolaris.org shortly.

The case still needs an intern, so, volunteers welcome.

Kais.

Interrupt affinity interfaces and PCITool enhancements [PSARC/2009/340 FastTrack timeout 06/17/2009]

2009-06-17 Thread Kais Belgaied

On 06/15/09 20:38, Evan Yan wrote:
 Hi Kais,

 Thanks for the comments.

 Pcitool and the interrupt affinity interfaces use the same under-layer
 implementation to re-target interrupts to some cpu. Whatever read
 operation will reflect the current binding status and whatever write
 operation will override the former settings.
   

so, back to the example: let's say you use pcitool to bind the 
interrupts from a physical NIC nxge0 to cpu1, then use crossbow's 
dladm to set-linkprop cpus=2 vnic1 and cpus=3 vnic2 (where vnic1 and 
vnic2 are built over nxge0). Will pcitool show that nxge0's interrupts 
are bound to cpus 1, 2 and 3?


Kais

 Thanks,
 -Evan

 Kais Belgaied wrote:
   
 This case also includes the contract for Crossbow framework to use these
 interrupt affinity interfaces in place of existing PCITool ioctl 
 interfaces.
   
 
   
 If I look at this case in isolation from its expected consumers, and 
 with pcitool as the
 only consumer of the CPU affinity APIs, I have no trouble sending a +1.
 However, when considering the overall architecture  that includes both 
 this case's deliverables as well as the changes
 expected imminently  from its external consumers,  I am unclear on how 
 the system will behave when
 we  use both pcitool and those consumers' interrupt settings.

 I'll use the interaction with Crossbow as an example. The point is 
 similar for the interaction with other tools
 (intd(1m), etc). Say the system has a physical NIC nxge0, whose 
 interrupts are bound to CPUs 1,2,3,4
 using pcitool modified by this case. Later one creates vnic1 over nxge0 
 which
 used a couple of hardware rings out of nxge0's, and uses dladm 
 set-linkprop cpus=5,6 vnic1.
 With the changes imported from this case, the implementation of that 
 call of dladm will attempt
 to have the MSI/X interrupts  assigned to  the rings (thus the vnic1)  
 bound to CPUs 5 and 6.
 Will such setting of vnic1's  interrupts, fail because of a conflict 
 with a previous binding by pcitool?
 will it succeed silently? what will the call to pcitool querying about 
 the interrupt binding for
 nxge0 return then? CPUs 1,2,3,4 only, as set ? or will it surprisingly 
 show 1,2,3,4,5,6?
 how about the other way around? will dladm  get-linkprop vnic1 see the 
 settings that were previously done
 by pcitool?


 Kais.

Interrupt affinity interfaces and PCITool enhancements [PSARC/2009/340 FastTrack timeout 06/17/2009]

2009-06-15 Thread Kais Belgaied


 This case also includes the contract for Crossbow framework to use these
 interrupt affinity interfaces in place of existing PCITool ioctl 
 interfaces.
   
If I look at this case in isolation from its expected consumers, with
pcitool as the only consumer of the CPU affinity APIs, I have no trouble
sending a +1. However, when considering the overall architecture, which
includes both this case's deliverables and the changes expected
imminently from its external consumers, I am unclear on how the system
will behave when we use both pcitool and those consumers' interrupt
settings.

I'll use the interaction with Crossbow as an example. The point is
similar for the interaction with other tools (intrd(1M), etc.). Say the
system has a physical NIC nxge0 whose interrupts are bound to CPUs
1,2,3,4 using pcitool as modified by this case. Later, one creates vnic1
over nxge0, which uses a couple of nxge0's hardware rings, and runs
dladm set-linkprop cpus=5,6 vnic1.
With the changes imported from this case, the implementation of that
dladm call will attempt to have the MSI/X interrupts assigned to the
rings (and thus to vnic1) bound to CPUs 5 and 6.

Will such a setting of vnic1's interrupts fail because of a conflict
with the previous binding by pcitool? Will it succeed silently? What
will a pcitool query about the interrupt binding for nxge0 return then:
CPUs 1,2,3,4 only, as set, or, surprisingly, 1,2,3,4,5,6?
How about the other way around: will dladm get-linkprop vnic1 see the
settings that were previously made by pcitool?


Kais.

 Constraints:
 a) Set affinity limitations for certain interrupt types
Fixed or INTx interrupts could be either exclusive or sharable depending
on hardware. Because there is no good way to detect that, the current
implementation will refuse any set affinity requests for INTx interrupts.

On x86 platforms, multiple MSI interrupts of a single PCI function need
to be rerouted together since all MSI interrupts share the same MSI
address, which in turn includes the same CPU number. Hence the current x86
implementation will refuse any set affinity requests for MSI interrupts.
A future phase of this project may support MSI group retarget, similar
to the PCITool method.

 b) CPU offline considerations
CPUs may be online/offlined through administrative interfaces. When
a CPU is offlined, all of the interrupts targeting it are re-targeted.
The OS will pick any set of the surviving CPUs for re-targeting. The
OS is under no obligation to maintain drivers' interrupt affinity
preferences.

The first phase of this project will not provide any callback on CPU
online/offline events. Such callback events need to be defined in the
future. If a driver or framework is interested in maintaining optimal
CPU targeting, it should monitor its interrupt CPU bindings on a regular
basis using ddi_intr_get_affinity(9f), or register a callback to receive
various CPU specific events using register_cpu_setup_func(). Userland
entities, by contrast, should subscribe to CPU DR specific sysevents.

 4.5.2 PCITool Enhancements

 Current syntax:
   pcitool pci@unit-address -i ino=ino
 [ -r [ -c ] | -w cpu=CPU [ -g ] ] [ -v ] [ -q ]

 Proposed syntax:
   pcitool pci@unit-address -i ino# | all
   [ -r [ -c ] | -w cpu# [ -g ] ] [ -v ] [ -q ]
   
   pcitool pci@unit-address -m msi# | all
   [ -r [ -c ] | -w cpu# [ -g ] ] [ -v ] [ -q ]

 The PCItool is a low-level tool which provides a facility for getting and
 setting interrupt routing information. This project is making some minor
 syntax changes to PCITool since the current syntax is not compliant with
 existing userland guidelines.

 In addition, this project is adding a new -m option to retrieve and
 reroute the interrupt target CPU for MSI/Xs on SPARC platforms.
   
 On SPARC platforms, the INO is mapped to an interrupt mondo, whereas
 one or more MSI/Xs are mapped to an INO. So INOs and MSI/Xs are
 individually retargetable. Use the -i option to retrieve or reroute a
 given INO, and the -m option for MSI/Xs.

 On x86 platforms, both INOs and MSI/Xs are mapped to the same interrupt
 vectors. Use -i option to retrieve and reroute any interrupt vectors
 (both INO and MSI/Xs). So, -m option is not required on x86 platforms.
 Hence it is not supported.
   
 4.6 Interfaces

 4.6.1 Exported Interfaces

 Interface              Stability        Comments
 ---------------------+----------------+--------------------------
 ddi_intr_target_t      Project Private  Interrupt target CPU
 ddi_intr_get_affinity  Project          Get interrupt target CPU

sysbench [PSARC/2009/351 FastTrack timeout 06/18/2009]

2009-06-15 Thread Kais Belgaied

+1 ,
a quick question though: the release binding is minor, is there a 
dependency on
future minor release features that are not available yet as patches to 
the current minor release of Solaris?

  Kais

On 06/11/09 20:57, James Walker wrote:
 I'm sponsoring this familiarity case for Peter Rival. The requested
 release binding is minor. The man page has been posted in the
 materials directory. Tim Cook of PAE has accepted this benchmark
 for inclusion in Solaris and a pointer to filebench has been added
 to the man page.

 Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
 This information is Copyright 2009 Sun Microsystems
 1. Introduction
 1.1. Project/Component Working Name:
sysbench
 1.2. Name of Document Author/Supplier:
Author:  Peter Rival
 1.3  Date of This Document:
   11 June, 2009
 4. Technical Description
 Template Version: @(#)sac_nextcase %I% %G% SMI
 This information is Copyright 2009 Sun Microsystems
 1. Introduction
 1.1. Project/Component Working Name:
sysbench
 1.2. Name of Document Author/Supplier:
Author:  Frank Rival
 1.3  Date of This Document:
   04 May, 2009
 4. Technical Description
 Sysbench Check List
 1.0 Project Information
 1.1 Name of project/component
 sysbench

 1.2 Author of document
 Frank.Rival at Sun.COM

 2.0 Project Summary
   2.1 Project Description
 SysBench is a modular, cross-platform and multi-threaded benchmark 
 tool
 for evaluating OS parameters that are important for a system running a
 database under intensive load.

 The idea of this benchmark suite is to quickly get an impression about
 system performance without setting up complex database benchmarks or
 even without installing a database at all.
 Current features allow testing of the following system parameters:

 * file I/O performance
 * scheduler performance
 * memory allocation and transfer speed
 * POSIX threads implementation performance
 * database server performance (OLTP benchmark)


   2.2 Release binding
   What is the release binding?
   (see http://opensolaris.org/os/community/arc/policies/release-taxonomy/)
   [ ] Major
   [*] Minor
   [ ] Patch or Micro
   [ ] Unknown -- ARC review required

   2.3 Type of project
   Is this case a Linux Familiarity project?
   [ ] Yes
   [*] No

   2.4 Originating Community
 2.4.1 Community Name
   Sysbench[1]
 
 2.4.2 Community Involvement
   Indicate Sun's involvement in the community
   [ ] Maintainer
   [ ] Contributor
   [*] Monitoring
   
   Will the project team work with the upstream community to resolve
   architectural issues of interest to Sun?
   [*] Yes 
   [ ] No - briefly explain
   
   Will we or are we forking from the community?
   [ ] Yes - ARC review required prior to forking
   [*] No
   
 3.0 Technical Description
   3.1 Installation  Sharable
 3.1.1S Solaris Installation - section only required for Solaris Software
   (see 
 http://opensolaris.org/os/community/arc/policies/install-locations/ for 
 details)
   Does this project follow the Install Locations best practice?
   [*] Yes 
   [ ] No - ARC review required
   
   Does this project install into /usr under 
 [sbin|bin|lib|include|man|share]?
   [ ] Yes
   [*] No or N/A

   /usr/benchmarks/ - standard benchmark directory
   
   Does this project install into /opt?
   [ ] Yes - explain below
   [*] No or N/A
   
   Does this project install into a different directory structure?
   [*] Yes - ARC review required
   [ ] No or N/A
   
   Do any of the components of this project conflict with anything under 
 /usr?
   (see http://opensolaris.org/os/community/arc/caselog/2007/047/ for 
 details)
   [ ] Yes - explain below
   [*] No
   
   If conflicts exist then will this project install under /usr/gnu?
   [ ] Yes
   [ ] No - ARC review required
   [*] N/A
   
   Is this project installing into /usr/sfw?
   [ ] Yes - ARC review required
   [*] No
   
 3.1.1W Windows Installation - section only required for Windows Software
   (see http://sac.sfbay/WSARC/2002/494 for details)
   Does this project install software into a 
   system drive:\Program Files\Sun\product or system 
 drive:\Sun\product
   directory?
   [ ] Yes
   [ ] No - ARC review required
   
   Does the project use the Windows registry?
   [ ] Yes
   [ ] No - ARC review required
   
   Does the project use 
   HKEY_LOCAL_MACHINE\SOFTWARE\Sun Microsystems\product\version
   for the registry key?
   [ ] Yes
   [ ] No - ARC review required
   
   Is the project's stored location
   HKEY_LOCAL_MACHINE\SOFTWARE\Sun

Time Stamp Option for xxstat Commands Phase II [PSARC/2009/307 Self Review]

2009-05-17 Thread Kais Belgaied

On 05/15/09 16:21, Sherry Moore wrote:
 I am sponsoring this case for Chad Mynhier and closing it as approved
 automatic as it's simply a follow-on to PSARC/2009/105 to cover more
 commands.
   


Will there be more follow-ons to 2009/105 for the remaining xxstat
commands (netstat, dladm show-link -s -i, flowadm show-flow -s -i, etc.)?

Kais.

 Thanks,
 Sherry

bfe fast ethernet driver [PSARC/2009/242 FastTrack timeout 04/22/2009]

2009-04-16 Thread Kais Belgaied

+1

Kais

On 04/15/09 17:54, Garrett D'Amore - sun microsystems wrote:
 The following case is being submitted on behalf of Saurabh Mishra.  It 
 probably
 should qualify as automatic, but since it's a GLDv3 driver and there was some
 doubt about it, I'm submitting it as a fast track.

 Patch binding is appropriate, although a backport to Solaris 10 is probably
 unlikely.

 Note that as this driver is for a simple 10/100 part with only a single TX
 and a single RX ring, questions about Crossbow support, etc. are probably 
 not terribly relevant.

 I believe that Saurabh *is* planning on supporting the appropriate Brussels
 interfaces.

PSARC/2009/232 Berkeley Packet Filter for OpenSolaris

2009-04-10 Thread Kais Belgaied



Hi Darren,

While the architecture looks sound, it has so many pieces and
interactions with other subsystems (the pluggable sockets, MAC, etc.),
way beyond what's suitable for a fast-track.
This should be a full case.

Kais.

On 04/10/09 14:01, Darren Reed wrote:
 This is a self sponsored fast track, timeout set for 2 weeks...

 Abstract
 
 This case seeks to build on the Crossbow (PSARC/2006/357[7]) 
 infrastructure
 and provide a new (to OpenSolaris) mechanism for capturing packets: the
 use of the Berkeley Packet Filter (BPF). The goal of this project is to
 provide a method to capture packets that has higher performance than
 what we have to offer today on Solaris (DLPI based schemes.) It also has
 the added benefit of increasing our compatibility with other software
 that has been built to use BPF.

 Release Binding
 ---
 This case seeks to obtain approval for minor release binding.

 Background
 ==
 Packet capture on Solaris is currently built around the use of DLPI.
 Whilst the introduction of libdlpi (PSARC/2006/436[1]) has made it easier
 to program using DLPI and the IP Observability Project 
 (PSARC/2006/475[2])
 introduced the means by which packets that are local to the host could
 be intercepted, neither did anything to address the primary problem
 with DLPI: compared to other mechanisms, it is slow, the in-kernel
 filtering is either not used or very primitive and provides very
 little useful information about the packet capturing itself by way
 of statistics.

 Introduction
 

 The architecture of BPF lends itself to more efficient means of doing
 packet capture, where a single read can transfer large numbers of packets
 per call.  It also allows the sniffer to choose how much data from each
 packet they wish to copy, be it the entire packet or just the first 128
 bytes to capture headers.

 Internal Architecture
 ~
 Internally, the architecture of BPF is very simple: it has a lower
 half that receives packets from the NIC drivers, copying matching
 packets into a static buffer and an upper half that implements a
 character pseudo-device.

 Buffers
 ---
 The backing for the pseudo-device operating as a character device
 is a buffer allocated by the driver for storing packet data in.
 The buffersize used by the device for storing copied packet data
 in is set by the application. By default libpcap sets this size
 to the same size as the driver's default: 32k. The maximum this
 project allows is 16M.

 Two buffers of this size are allocated by the driver: an active
 buffer and a hold buffer. This supports applications doing
 sleeping reads, if they aren't using poll, and reading an entire
 buffer of data whilst the system continues to catch new packets.

 Applications can set the buffer size using libpcap or with the
 BIOCSBLEN ioctl (see man page.)
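
 As an illustration of the active/hold scheme described above, the
 following is a toy model only. It assumes every packet is smaller than
 the buffer, and the class and method names are hypothetical; it does not
 claim to mirror the driver's implementation or its blocking behaviour.

```python
class DoubleBuffer:
    """Toy model of an active/hold buffer pair (sizes in bytes)."""

    def __init__(self, size):
        self.size = size
        self.active = []       # packets currently being captured
        self.active_len = 0    # bytes used in the active buffer
        self.hold = None       # full buffer waiting for the reader, or None
        self.dropped = 0       # packets lost because both buffers were busy

    def _rotate(self):
        """Move the active buffer to hold and start a fresh active buffer."""
        self.hold = self.active
        self.active, self.active_len = [], 0

    def capture(self, pkt):
        """Called on the capture path for each matching packet."""
        if self.active_len + len(pkt) > self.size:
            if self.hold is not None:
                self.dropped += 1   # reader has not drained the hold buffer
                return
            self._rotate()
        self.active.append(pkt)
        self.active_len += len(pkt)

    def read(self):
        """Hand the reader a whole buffer of packets in one call."""
        if self.hold is None:
            self._rotate()          # nothing held: hand over the active buffer
        data, self.hold = self.hold, None
        return data


buf = DoubleBuffer(8)
for i in range(5):
    buf.capture(bytes([i]) * 4)     # five 4-byte packets into an 8-byte buffer
print(len(buf.read()), len(buf.read()), buf.dropped)
```

 In this model the reader drains one buffer while capture continues into
 the other, and packets are dropped only when both buffers are full.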

 List of Interfaces
 --
 BPF maintains an internal list of network interfaces that it supports
 capturing packets for. What distinguishes this list from that either
 in the mac or ip modules is that it uses the datalink type as a part
 of the key for determining what is an identical entry. Additionally,
 on OpenSolaris the device structure used inside of the ip module is
 different to the mac module, preventing either one being used as a
 master list by BPF. Answering queries such as returning the complete
 list of datalink types supported by a device (BIOCGDLTLIST), would
 be much more complicated without that internal list.

 Packet Capture
 --
 When BPF is called from the mac layer, it is handed the packet as
 it is received from the NIC driver as part of the promiscuous
 callback handling in the mac layer. It is the same mblk_t for the
 packet that will later be passed on through the stack and has
 neither the mblk_t's nor dblk_t's duplicated. Thus the capturing
 of the packet becomes part of the execution of the datapath for
 each packet.

 Interactions with existing technology in Solaris
 
 This section goes into detail about what impact this project has on
 other areas of Solaris or what impact they have on this project.

 Vanity Naming
 ~
 The Vanity Naming Project[6] introduced the means by which link names
 could be changed to be a different name than the underlying mac name.
 This project will only support packet capture on interfaces using the
 interface name allocated by the dls module that was delivered by the
 vanity naming project.

 IP Observability
 
 The IP observability project introduced the ability to capture packets
 from within IP, presenting them through devices files in /dev/ipnet for
 libdlpi to use. This project will update some of the interfaces 
 introduced
 by IP observability.

 Updating IPNET
 --
 Unfortunately the mechanism used to do this is bound up within IP.  To
 build upon the work done here, this project will change the

10G link properties [PSARC/2009/206 Self Review]

2009-04-07 Thread Kais Belgaied

I requested more time for this case (timing out tomorrow).
I'm fine with Paul and Garrett's answers.

+1

Kais,

libvirt 0.6 [PSARC/2009/212 FastTrack timeout 04/09/2009]

2009-04-03 Thread Kais Belgaied

On 04/02/09 10:45, Tim Marsland wrote:
 I'm sponsoring the following fast-track for John Levon.
 Updated manpages in materials directory.

+1


Kais.

2009/211 SMIT for OpenSolaris

2009-04-01 Thread Kais Belgaied

Does this have anything to do with today's date, April 1st?


Kais.

On 04/01/09 08:55, James Carlson wrote:
 I'm just tickled pink to sponsor this request for Dan McDonald.  The
 change looks entirely obvious to me, so I've marked it as closed
 approved automatic.



 OpenSolaris currently lacks a standard, interoperable system
 management tool.  Fortunately, we are able to discern both the
 requirements for such a tool and the overall design by just looking at
 artifacts on other operating systems, so the architecture and top
 level design needed are trivial.

 This project provides SMIT for OpenSolaris.  SMIT stands for System
 Modified by Invisible Things, but the project team isn't sure why.
 The user/administrator interfaces are:

   /usr/bin/smit [-C] [-D] [-m menu-entry] [-R alternate-root]
   /usr/bin/smitty [-D] [-m menu-entry] [-R alternate-root]
   /usr/bin/xsmit [-D] [-m menu-entry] [-R alternate-root]

 smitty is equivalent to smit -C.  xsmit is an enhancement
 designed by the project team, and is just a symlink to smit.  These
 commands all bring up configuration menus, allowing the user (with
 appropriate privileges) to modify system configuration by exec-ing
 commands that he could otherwise learn about via the system man pages.

 Menus to be delivered with the smit tool are not described in detail
 here, but will include:

   SMF FMRI management
   Networking Interfaces
   Dtrace
   Zones
   ZFS file systems and pools

 SMIT is a system management tool but it's located in /usr/bin on other
 systems, so we're placing it there on OpenSolaris as well for
 familiarity reasons.

 Other interfaces delivered by this project include:

   /etc/objrepos   - symlink to /etc/svc/

   smit.log,   - droppings left in current directory
   smit.script

 A desktop link for GNOME will be provided.  The icon will depict a
 stick figure frozen in mid-step.

 The release binding is Tight.  The interfaces described are all
 Difficult.

 Related OpenSolaris projects may include Visual Panels.  The SMIT
 project team is not in contact with that team, and doesn't expect
 their agreement with this project, but would like to proceed anyway.

call for email vote: 2008/772 Command Assistant

2009-03-19 Thread Kais Belgaied

I vote to approve.

Kais.

On 03/19/09 12:03, James Carlson wrote:
 The project team has placed updated materials in the
 'post-inception.materials' directory.  The new materials reflect the
 change of direction (to implement an applet).  From looking at them,
 it appears that we've gotten all that we need, so I'm calling for the
 vote by email.

 Members: please let me know if you're not ready to vote.  Otherwise,
 please reply with your vote.

 I'm voting approve.

bwm-ng [PSARC/2009/160 FastTrack timeout 03/13/2009]

2009-03-17 Thread Kais Belgaied

OK.

+1 (in case it is needed)

Kais.

On 03/12/09 19:42, caijian guo - Sun Microsystems - Beijing China wrote:
 Kais,

 You misunderstood me.
 I meant that
 (1) bwm-ng can show statistics of physical links
 (2) bwm-ng can also show statistics of vnics, aggregations,
 and vnics over aggregations.

 The outputs were not empty. The outputs were in curses format (like
 dladm show-link -S), so I could not copy the results into this email.

 You can try it too, my  bwm-ng x86 executable file is available at :
 /net/ns-x4200-22.sfbay/var/tmp/pkg/proto/root_i386/usr/bin/bwm-ng

 Caijian

bwm-ng [PSARC/2009/160 FastTrack timeout 03/13/2009]

2009-03-12 Thread Kais Belgaied


Caijian,

I guess you're saying that bwm-ng shows statistics about physical links
only. In that case, how come the output of bwm-ng -I e1000g3 came out empty?

Kais


On 03/12/09 00:05, caijian guo - Sun Microsystems - Beijing China wrote:
 Kais,

 I have tested that:
   Besides physical links, bwm-ng is able to display the statistics of
   vnics, aggregations, and vnics over aggregations.
   I used netperf to test; the throughput displayed by bwm-ng is the
   same as netperf's.
 I tested it like so:
 # dladm create-vnic -l e1000g3 vnic1
 # dladm create-aggr -l e1000g1 -l e1000g2 -L active aggr1
 # dladm create-vnic -l aggr1 vnic2
 # bwm-ng -I e1000g3
 # bwm-ng -I vnic1
 # bwm-ng -I vnic2
 # bwm-ng -I aggr1
 # bwm-ng


 Caijian

 On 2009-03-12 07:22, Kais Belgaied wrote:
 For networking devices, is the command expected to display the stats 
 of the physical links
 (the list reported by dladm show-phys) or the network links (the list 
 reported by dladm show-link) ?
 The lists may differ on Solaris when vnics or link aggregations are 
 present.

 Kais.

bwm-ng [PSARC/2009/160 FastTrack timeout 03/13/2009]

2009-03-11 Thread Kais Belgaied

For networking devices, is the command expected to display the stats of 
the physical links
(the list reported by dladm show-phys) or the network links (the list 
reported by dladm show-link) ?
The lists may differ on Solaris when vnics or link aggregations are present.

Kais.

On 03/08/09 22:56, caijian guo - Sun Microsystems - Beijing China wrote:
 Comments:
 Since we have kstat, there is no need to use libstatgrab. I have
 tested it and have done lots of experiments. It works well using only
 kstat.
  The application does not lack any of its normal functionality; the
 output is the same and the functionality is the same.


 Caijian


 On 2009-03-07 05:59, James Carlson wrote:
 Jim Walker writes:
   
 James Carlson wrote:
 
 bwm-ng is compiled such that it is not dependent on libstatgrab.
   
 Assuming that change mean that the application lacks any of its
 normal functionality, I'll give this +1.
 
 I meant doesn't lack, obviously.  :-/
   
 Right. It does fine using only kstat.
 

 Sounds good; thanks.

conflict [PSARC/2009/003 FastTrack timeout 01/12/2009]

2009-01-07 Thread Kais Belgaied

Nit: the case seems to import the PATH environment variable. That should
be listed among the imported interfaces.
Quick question about the -t type: is the expected output limited to
executables too?

Kais

2008/688 Sun Cluster TCP/IP Hooks Update

2008-12-10 Thread Kais Belgaied

On 12/03/08 08:06, James Carlson wrote:
 I'm restarting the timer on this fast-track for Huafeng Lu and the Sun
 Cluster team.  The changes from the last go-around include removing
 the version number string and adding a netstack ID and a flexible void
 * argument for future expansion, and the contract (contract-01) has
 been updated.  The timer is set to 12/10/2008.


   

what about the Sun Cluster hooks for SCTP ?

Kais.

2008/688 Sun Cluster TCP/IP Hooks Update

2008-12-10 Thread Kais Belgaied

On 12/10/08 09:32, James Carlson wrote:
 The specification sent out for review says this:

   talk to external servers. Note: this proposal only handles TCP and
   UDP; SCTP is beyond its scope.

 I assume that's a future project, if SCTP is to be supported within
 Sun Cluster at all.  (Just like IPv6, it'd likely require non-trivial
 changes to the Sun Cluster code to do it.)
   

OK.
+1

Kais

PSARC/2008/249 - Packet Interception for the MAC layer

2008-12-10 Thread Kais Belgaied

There is a rather long discussion about the design and some
architectural questions in parallel, outside this alias.
I was hoping that discussion would converge before the case times out.

Darren, this case needs to be put back in "waiting need spec", at least
until the variety of changes proposed over the last week settles.

Kais

Opinion for review: 2006/357 Crossbow - Network Virtualization and Resource Partitioning

2008-11-27 Thread Kais Belgaied

Thanks for review Garrett.
Fixed in the case directory. It now reads
One of the reviewers pointed out that the metering information
produced for flows and datalinks always follows the raw format
intended for gnuplot.

Kais.

On 11/27/08 07:53, Garrett D'Amore wrote:
 The first sentence in section 4.2 doesn't parse properly.

 Otherwise it looks OK to me.

-- Garrett

Opinion for review: 2006/357 Crossbow - Network Virtualization and Resource Partitioning

2008-11-26 Thread Kais Belgaied

Attached is the opinion of the Crossbow project, submitted for PSARC
review by December 2nd 2008.

Since the commitment review, the project team needed to make four minor
changes. The changes were motivated by internal and external feedback from
early adopters and Beta customers, and by the integration with other
consumers of the project interfaces.

The changes do not constitute any architectural departure from the original
specifications voted on; therefore, I'm including them in this opinion.
In the case directory, final.materials has the exact specifications,
taking into account the commitment review TCR and spec updates.
The revised.materials directory there includes the four changes below.
I can file a separate fasttrack to cover these changes if members believe it
is necessary.

- Phased delivery of some of the resource controls.
Initially, the maxbw (maximum bandwidth), priority, cpus and fanout
properties were approved for flows and datalinks.
Support for the cpus property for flows and for the fanout property for
flows and datalinks is now targeted after the first integration.
All other properties are still supported, and provide sufficient added
value for the first phase.

- Add a -H option to dladm show-phys and dladm create-vnic.
One of the project's added values is exposing a CLI to control the
assignment of some of the NIC's hardware resources to MAC clients
(i.e. VNICs). At commitment time, factory MAC addresses were the only
such hardware resource. The need for controlling the assignment of
Receive Rings became increasingly obvious during the implementation and
Beta testing, thus the -H option.

- The integration with new features of LDOMs required the ability to
allow an exclusive MAC client to set the interface's MTU,
thus a new Consolidation Private MAC client function: mac_set_mtu().

- A limitation in the multi-threadedness of a major NIC device driver
necessitated the addition of a flag to request the serialization
of transmit operations submitted to that driver.
A new flag needed to be added to the Consolidation Private MAC provider
interface (MAC_VIRT_SERIALIZE flag of the mac_register_t's m_v12n).

Kais.
-- next part --
An embedded and charset-unspecified text was scrubbed...
Name: opinion.ascii
URL:
http://mail.opensolaris.org/pipermail/opensolaris-arc/attachments/20081126/e0293c90/attachment.ksh

Welcome Sebastien Roy as a new PSARC member

2008-11-20 Thread Kais Belgaied

Please join me in welcoming Sebastien Roy as a new PSARC member.

Kais.

2007/272 Project Clearview: IPMP Rearchitecture (Commitment)

2008-11-15 Thread Kais Belgaied


* Section 4.12: To improve security, the IP filter interaction has been
  tweaked such that once an IP interface joins a group, it is subject
  to any filtering rules for the associated IPMP group interface.

And, conversely, when an IP interface leaves the group, it is no longer
subject to the group's filtering rules, right?

Kais.

Volo Interfaces Amendment [PSARC/2008/694 FastTrack timeout 11/18/2008]

2008-11-11 Thread Kais Belgaied


Template Version: @(#)sac_nextcase %I% %G% SMI
This information is Copyright 2008 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
 Volo Interfaces Amendment
1.2. Name of Document Author/Supplier:
 Author:  Rao Shoaib
1.3  Date of This Document:
11 November, 2008
4. Technical Description
I am sponsoring the following fast-track for Rao Shoaib and the Volo project
team. This case is a collection of minor changes to the interfaces introduced
by Project Volo PSARC/2007/587.
This amendment does not affect the original minor release binding.
All interfaces added by this amendment are Consolidation Private.

The updated  full Volo design document will be placed in the case directory.
Below is a summary of the changes covered by this fasttrack.

*) In the initial design a socket module writer was allowed to register its own
   sonodeops. This facility was deemed unnecessary and problematic. In the
   current design the module writer registers only one create function that is
   called by the socket framework after allocating an sonode.

*) Functions used in fallback are no longer part of the public upcalls and
   downcalls vector. Since fallback is supported only on native protocols
   it is handled via private function calls. There is no change in how
   fallback works.

*) To support 3rd party socket modules two new downcalls, sd_send_uio and
   sd_recv_uio, have been introduced. These interfaces allow the protocol writer
   to control how data is copied to and from the user buffer.

*) A new downcall, sd_poll, has been introduced. This downcall supports
   polling when the protocol is doing its own buffering.

*) To support evolution the interface is versioned.
   Current versions are obtained via the macros
SOCK_UC_VERSION (upcall interface)
SOCK_DC_VERSION (downcall interface)

*) /etc/sock2path now supports either a module name or a device name as the
   fourth member of the table.

*) Two new socket options have been added:
SO_SNDTIMEO
SO_RCVTIMEO
   Both take a pointer to a struct timeval and return EWOULDBLOCK if the timer
   expires.
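
   For illustration, a minimal sketch of how an application might use
   SO_RCVTIMEO through the standard sockets API. It assumes struct
   timeval is laid out as two native C longs (tv_sec, tv_usec), which
   holds on common 64-bit Unix systems but is not guaranteed everywhere.
   The helper names are hypothetical and not part of this case.

```python
import errno
import socket
import struct


def set_recv_timeout(sock, seconds, microseconds=0):
    """Set SO_RCVTIMEO from a (seconds, microseconds) pair."""
    # Assumption: struct timeval is two native C longs (tv_sec, tv_usec).
    tv = struct.pack("ll", seconds, microseconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, tv)


def recv_or_timeout(sock, nbytes):
    """Return received bytes, or None if the receive timer expired."""
    try:
        return sock.recv(nbytes)
    except OSError as e:
        # The expired timer surfaces as EWOULDBLOCK (same as EAGAIN on
        # many systems).
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise


a, b = socket.socketpair()
set_recv_timeout(a, 0, 200_000)      # 0.2 second receive timer
print(recv_or_timeout(a, 16))        # nothing has been sent yet
b.sendall(b"ping")
print(recv_or_timeout(a, 16))        # data is now available
a.close()
b.close()
```

   SO_SNDTIMEO is set the same way and applies to blocking sends.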


6. Resources and Schedule
6.4. Steering Committee requested information
6.4.1. Consolidation C-team Name:
on
6.5. ARC review type: FastTrack
6.6. ARC Exposure: open

Integrate gbm (gnu-dbm) into Solaris [PSARC/2008/645 FastTrack timeout 10/28/2008]

2008-10-29 Thread Kais Belgaied

On 10/29/08 06:23, Martina Tomisova wrote:
 Hi Rainer,

 I don't know about any other system which makes this special directory.

 I can remove the compatibility files from the package and place gdbm.h
 into /usr/include/ - that's no problem.
   

There's already a /usr/include/ndbm.h shipped with Solaris.
The exported interface table in this case includes
/usr/include/gdbm/ndbm.h.

It seems that a /usr/include/gdbm directory is unavoidable here.

Kais
 Could someone else please express his opinion of this topic?

 Thank you and have a nice day,
 Martina

IBTF IO Memory [PSARC/2008/630 FastTrack timeout 10/17/2008]

2008-10-20 Thread Kais Belgaied

On 10/15/08 12:07, Ted H. Kim wrote:
 Kais,


 Kais Belgaied wrote:
 - Case boundary question: since this marks a flag day for both TI and 
 CI, can you list the components
  that are affected by this flag day?

 Most of the IB modules in ON -
 framework: IBTL
 IB ULPs (TI): IPonIB, SDP, NFS/RDMA, uDAPL
 HCA Drivers (CI): Tavor, Hermon

at the risk of re-stating the obvious,  all changes in the above 3 sets 
of components are in-scope of this case,
right?



 - I am not clear on the consumer side of this new  interface: What 
 prompts a ULP to start using this interface?
  Is it expected to attempt ibt_alloc_io_mem() until it  exhausts all 
 resources?
  It would be easier to assess the completeness and the usefulness of 
 the TI if you either extended the case's scope
  to include at least the changes on one transport consumer or gave a 
 real example thereof.

 We are in the process of fixing bugs in certain IB ULPs to
 be good citizens in big SPARC platforms with memory DR where
 we have to be careful about what is in/out of the cage.
 The current plan for ULP usage (i.e. TI usage) is related
 to this motivation.


 - mi_ibt_version seems to be an enumeration of apparently mutually 
 exclusive values  IBTI_V{1,2,3}
  yet the definition suggests a combination of independent (discrete) 
 capabilities
  (FMR support, DMA wrapper support, etc.)

 The features are examples of what was included at each version change.
 More discussion of the relationship between ABI and features below ...


 . Is there any consumer of this interface that uses the DMA wrapper
 but not FMR?

 Well to be honest FMR in the current form turned out to be a failure.
 So no one uses FMR in ON right now.

 But I think more generally what is going to happen is that
 the features will be used independently of each other,
 since they are generally not related to each other.


   . For future evolution, is the mi_ibt_version always intended to
 express a monotonically increasing set
 of capabilities (capabs of V(n+1) include all capabs of V(n))?

 Yes, that is the intent, but it is not guaranteed. However,
 as you might imagine, it would involve a great deal of
 discussion/agreement to remove anything and the ARC would be
 in the loop.


   Basically I'm trying to see if information of a different nature is
 being encoded in the same field. Without slipping
   into a design discussion, you should consider whether two fields are more
 appropriate: one version (number or enum) and one capabs (bitmask).

 There are in fact capability bitmasks elsewhere.
 In IB there are a number of optional features. So the bitmasks
 generally are for saying you have these optional features.

 But the version number is more an ABI thing, and it is a mistake
 to conflate the two, though the reason we have to change the ABI
 is that new features demand more fields in the structs, etc.


So are you fixing the mistake of conflating the capabs and the version? The
updated material still looks unchanged
on that point.





Kais.

IBTF IO Memory [PSARC/2008/630 FastTrack timeout 10/17/2008]

2008-10-20 Thread Kais Belgaied

On 10/20/08 15:01, Ted H. Kim wrote:

 I am also not sure where this is leading.
 Are you suggesting some specific change to the case?

I'm not clear on the future compatibility expectations around the 
interface introduced by this case:

I asked whether versions are always incremental, and you answered yes
(capabs of V(n+1) include all capabs of V(n)).
I asked whether some capabs can be used independently, and you also said yes,
which suggests that
capabs of V(n+1) don't necessarily have to include all capabs of V(n);
capabs are independent.

The former means that an IBT client module written to V(n) is guaranteed 
to work unmodified on a framework+HCAs
that evolved to V(n+1) or later.

The latter means modules may break or may continue to work. No backward 
compatibility is guaranteed.

Choose one semantic for the interface and clearly document it in the case.
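
To make the two semantics concrete, here is a minimal sketch in Python. It is not the IBTF code; it only models the suggestion of keeping an ordered ABI version apart from a bitmask of independent capabilities. IBTI_V1..IBTI_V3 are the names used in the case; the capability bits, `Framework`, and `client_compatible` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from enum import IntEnum, IntFlag


class IbtVersion(IntEnum):
    """ABI version: totally ordered, bumped on each flag day."""
    IBTI_V1 = 1
    IBTI_V2 = 2
    IBTI_V3 = 3


class Capab(IntFlag):
    """Hypothetical capability bits: optional, independent features."""
    NONE = 0
    FMR = 1 << 0
    DMA_WRAPPER = 1 << 1
    IO_MEM = 1 << 2


@dataclass(frozen=True)
class Framework:
    version: IbtVersion
    capabs: Capab


def client_compatible(fw: Framework, built_for: IbtVersion, needs: Capab) -> bool:
    """True if the ABI is at least as new as the client's and every needed capab is offered."""
    return fw.version >= built_for and (fw.capabs & needs) == needs
```

With this split, a framework at IBTI_V3 could drop FMR without pretending that V3 is a superset of V2: a client built for V2 that needs only the DMA wrapper still passes the check, while one that needs FMR fails it explicitly instead of breaking silently.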

Kais.


 -ted

Derailing PSARC/2008/628 Interrupt Resource Management

2008-10-15 Thread Kais Belgaied

Roamer, since this is a full case now, the procedure is to add the
comments to the issues file (for internal contributors).

Kais.

On 10/10/08 19:14, Yunsong (Roamer) Lu wrote:
 A few more concerns about the IRM proposed interfaces.

 1. Where the material discusses the current interface limitation (4.1.2),
 why is it a problem to allow a driver to get more than *2* MSI-X vectors?
 Integrated device drivers should be prepared for not getting any
 MSI-X interrupt vector, and can try legacy INTX instead. So
 it should not be a problem even if all MSI-X vectors have been given to
 the drivers already attached. Late-attached drivers will just use legacy
 INTX interrupts. The justification for the current *hard-coded* limitation
 doesn't make sense.

 2. How does the IRM framework decide to decrease the number of interrupt
 vectors that have been given to a driver? 4.2.1 talks about how a driver
 participates in the IRM interfaces, but it's unclear how the framework can
 wisely move interrupt resources between drivers.

 3. How does the IRM framework make a *wise* decision about which driver can
 take more interrupt vectors than others? For example, say you have a
 10GbE NIC and a 1GbE NIC in the box, and both drivers ask for 16 vectors
 when you don't have enough vectors left. Giving the same number of
 interrupt vectors to the two driver instances is unreasonable. As part of
 the Crossbow project, hardware resources are allocated depending on the
 real link speed and bandwidth need. But as the low-level I/O
 framework, IRM doesn't have that information. How do
 you prove that your management is reasonable?

 4. What's the perimeter of IRM? In a virtualized environment,
 interrupts might have been bound to CPUs in an exclusive zone or a
 guest domain. When IRM asks for such interrupt vectors back from the
 driver, who will take care of the interrupt re-targeting? It's out of
 the driver's control, and I cannot find any relevant information in
 this document.

 Thanks,

 Roamer

IBTF IO Memory [PSARC/2008/630 FastTrack timeout 10/17/2008]

2008-10-15 Thread Kais Belgaied

- Case boundary question: since this marks a flag day for both TI and 
CI, can you list the components
  that are affected by this flag day?

- I am not clear on the consumer side of this new  interface: What 
prompts a ULP to start using this interface?
  Is it expected to attempt ibt_alloc_io_mem() until it  exhausts all 
resources?
  It would be easier to assess the completeness and the usefulness of 
the TI if you either extended the case's scope
  to include at least the changes on one transport consumer or gave a 
real example thereof.
 
- mi_ibt_version seems to be an enumeration of apparently mutually 
exclusive values  IBTI_V{1,2,3}
  yet the definition suggests a combination of independent (discrete) 
capabilities
  (FMR support, DMA wrapper support, etc.)
   . Is there any consumer of this interface that uses the DMA wrapper but
not FMR?
   . For future evolution, is the mi_ibt_version always intended to
express a monotonically increasing set
of capabilities (capabs of V(n+1) include all capabs of V(n))?
   Basically I'm trying to see if information of a different nature is
being encoded in the same field. Without slipping
  into a design discussion, you should consider whether two fields are more
appropriate: one version (number or enum) and one capabs (bitmask).

- Under what condition can the caller of
ibt_alloc_io_mem()/ibt_free_io_mem() expect the following error
  to be returned?
     IBT_MR_ACCESS_REQ_INVALID   Invalid Access Control Specified.
                                 Remote Write or Remote Atomic access is
                                 requested without specifying Local Write.

- In ibc_alloc_io_mem.9e,
  these two sections are in conflict:
     ibt_status_t prefix_ibc_alloc_io_mem(ibc_hca_hdl_t hca_hdl,
         size_t size, ibt_mr_flags_t mr_flag, caddr_t *kaddrp,
         ibc_mem_alloc_hdl_t *mem_alloc_hdl);
  and
     hca_hdl   IBTF channel Interface (TI) HCA Handle previously obtained
               by calling ibt_open_hca(9F).
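
For reference, the access rule quoted for IBT_MR_ACCESS_REQ_INVALID can be sketched as a predicate. This is a Python model of the man page wording only: the flag names below are hypothetical stand-ins for ibt_mr_flags_t bits, and the sketch does not say whether an allocation call can ever be made with such flags, which is the open question.

```python
from enum import IntFlag


class MrFlags(IntFlag):
    """Hypothetical stand-ins for ibt_mr_flags_t access bits."""
    LOCAL_WRITE = 1 << 0
    REMOTE_READ = 1 << 1
    REMOTE_WRITE = 1 << 2
    REMOTE_ATOMIC = 1 << 3


def access_request_valid(flags: MrFlags) -> bool:
    """False where the quoted text implies IBT_MR_ACCESS_REQ_INVALID."""
    # Remote Write or Remote Atomic both modify local memory remotely.
    wants_remote_modify = bool(flags & (MrFlags.REMOTE_WRITE | MrFlags.REMOTE_ATOMIC))
    has_local_write = bool(flags & MrFlags.LOCAL_WRITE)
    return has_local_write or not wants_remote_modify
```

Under this reading, the error is purely a property of the flags argument, so it would apply to ibt_alloc_io_mem() only if that call performs registration-style access checks, and it is hard to see how ibt_free_io_mem() could return it at all.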

Kais.

On 10/10/08 11:04, Ted Kim wrote:
 Template Version: @(#)sac_nextcase %I% %G% SMI
 This information is Copyright 2008 Sun Microsystems
 1. Introduction
 1.1. Project/Component Working Name:
IBTF IO Memory
 1.2. Name of Document Author/Supplier:
Author:  Lida HornI
 1.3  Date of This Document:
   10 October, 2008
 4. Technical Description

 A. Background

 The DDI distinguishes between different types of memory. Memory from
 ddi_dma_mem_alloc(9F) is usable for DMA and takes into account various
 factors such as alignment and other device attributes. Memory from
 kmem_(z)alloc is not guaranteed to be usable for DMA, though most of
 the time it does work for that purpose, because of the capability of
 modern platforms. Nevertheless, there is value in maintaining these
 DDI distinctions, especially when considering certain platform issues
 such as memory DR.

 In the context of InfiniBand, registered memory is the target of DMA
 operations. This case introduces new InfiniBand related interfaces
 analogous to the ddi_dma_mem_alloc family of functions to IBTF
 (InfiniBand Transport Framework, PSARC/2002/132 and follow-on
 cases). This addition will help the InfiniBand stack maintain the
 proper DDI memory distinctions important for certain types of
 platforms.


 B. Proposal

 The proposal is to make additions to the IBTF Channel and Transport
 interfaces. The functionality added to the Transport Interface (TI) is
 used by the ULPs to allocate memory suitable for DMA and IB memory
 registration. In turn, the framework uses new entry points in the
 Channel Interface (CI) to request memory allocation from the
 underlying HCA driver. These interfaces are basically a wrapper for
 DDI functions which on the one hand abstract away HCA device specific
 details at the ULP level, but at the same time allow for the HCA
 driver to adjust the memory attributes (alignment, etc.) as necessary
 for efficiency.

 These additions include an IBTF ABI change, so this case also marks an
 internal flag day, incrementing our interface version numbers for both
 the TI and CI as noted below.


 All interface additions and changes in this proposal have a
 micro/patch binding.

 Transport Interface (ON Consolidation Private):

   ibt_alloc_io_mem() - Allocates DMA memory (at the transport level)
   ibt_free_io_mem() -  Deallocates DMA memory 
   IBTI_V3 - TI version change
  
 Channel Interface (ON Consolidation Private):

   ibc_alloc_io_mem() - Allocates DMA memory (at the HCA driver level)
   ibc_free_io_mem() - Deallocates DMA memory 
   IBCI_V3 - CI version change


 C. Summary of Changes by man page 

 See materials directory for copies of man pages. Modified man pages
 have change bars in the left margin.

   ibci.9 - modified (new CI entry points added)

   ibc_alloc_io_mem.9e - new (alloc and free CI entry points)

   ibt_alloc_io_mem.9f - new (alloc and free TI functions)

Derailing PSARC/2008/628 Interrupt Resource Management

2008-10-09 Thread Kais Belgaied

I am derailing this case on the grounds that its architectural impact is
not obvious and that it may be incomplete. The discussion
has already uncovered that this is more than a minor amendment to
PSARC/2004/253 Advanced DDI Interrupt Functions.

To prepare for the full review, the architecture should address the
impact on device drivers and on the subsystems they are part of.
If the scope of the project is intended to remain generic enough, the
material needs to reflect that more than one class of
device drivers was considered in the architecture.

To elaborate (see Garrett's previous email), the interrupt handles that
a NIC driver acquires are actually exposed to the MAC layer (see
PSARC/2006/357 - Crossbow), for enabling/disabling the interrupts on demand.
The proposal should be clear on how the behavior of such drivers is
intended to change when they are ported to the IRM interfaces.
Should there be an extra notification event between MAC and the drivers
to invalidate the interrupt handles registered with MAC?
Are drivers supposed to insulate MAC from the real interrupt handles
instead, and internally map to real handles that can be
added/removed? Are they supposed to start faking the polling mode in
software on rx rings that lost their real interrupts, for
example?

Cryptographic accelerators are another class of I/O where an external 
framework (the Solaris crypto framework) relies
on driver notifications coming from job completion interrupts. See 
PSARC/2001/557.
What are such drivers supposed to do to handle
DDI_CB_INTR_REMOVE properly?
Should they block until the jobs drain and they get to call
crypto_provider_notification(READY), or should they immediately
notify an error for all pending crypto requests?

Kais.

No more monthly late meeting for PSARC

2008-10-08 Thread Kais Belgaied

Given the limited interest in the monthly late meeting of PSARC,
the PSARC members decided today to move back to the regular meeting time.

Kais.

PSARC 2008/514 Python interface to dlpi(7P)

2008-08-13 Thread Kais Belgaied

Cecilia Hu wrote:
 I am sponsoring this case for Max Zhen.  This project is to provide a
 wrapper for dlpi(7p) functions that enables sending/receiving layer2
 network packet directly from Python, and getting/setting link related
 configuration.  The requested release binding is patch.

 The interface and architecture are clear enough to be a self-review.
 Of course, if there is a different opinion, I would like to shift it to a
 regular fast-track.  Otherwise, case is closed approved automatically.
Not so fast. Please re-open and put a timer on this.
This is not an obvious case.

Thanks,
Kais




 Thanks,
 Cecilia

PSARC 2008/514 Python interface to dlpi(7P)

2008-08-13 Thread Kais Belgaied

Cecilia, I see that you already did.
Never mind.

Kais

Kais Belgaied wrote:
 Cecilia Hu wrote:
 I am sponsoring this case for Max Zhen.  This project is to provide a
 wrapper for dlpi(7p) functions that enables sending/receiving layer2
 network packet directly from Python, and getting/setting link related
 configuration.  The requested release binding is patch.

 The interface and architecture are clear enough to be a self-review.
 Of course, if there is a different opinion, I would like to shift it to a
 regular fast-track.  Otherwise, case is closed approved automatically.
 Not so fast. Please re-open and put a timer on this.
 This is not an obvious case.

 Thanks,
Kais




 Thanks,
 Cecilia

Unix Domain Sockets for X11 clients in Trusted Extensions [LSARC/2008/506 FastTrack timeout 08/14/2008]

2008-08-07 Thread Kais Belgaied

Nicolas Williams wrote:
 On Thu, Aug 07, 2008 at 02:14:52PM -0700, Alan Coopersmith wrote:
   
 Ric Aleshire wrote:
 
 Yes - currently in the kernel socket I/O code, there is a check that the
 AF_UNIX socket endpoint is in the same
 zone as the server peer.  The proposal for a) above means that this
 check will be modified, so that when TX is
 enabled and the socket zone and server zone do not match, then the
 server must be in the global zone.
   

Thanks for the answer Ric.

 Which raises the interesting question of whether that check should really
 be for TX, or if this should be something that can be set on for any machine
 with Zones, and which TX just happens to always set.   It would seem things
 like running X clients in Etude or BrandZ zones could also benefit from this.
 

This sounds tempting.
Anyway, the project team has the choice here whether to keep the scope
of this case as-is,
or extend it to permit privileged cross-zone communication through
AF_UNIX sockets beyond
TX.
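
The check Ric describes, and the broader variant Alan raises, can be sketched as follows. This is a Python model, not the sockfs code: `may_connect` and the `allow_cross_zone` knob are hypothetical names, and zone ID 0 is used here to denote the global zone.

```python
GLOBAL_ZONE = 0  # modeling assumption: zone ID 0 is the global zone


def may_connect(client_zone: int, server_zone: int, tx_enabled: bool,
                allow_cross_zone: bool = False) -> bool:
    """Model of the AF_UNIX peer-zone check discussed in this thread."""
    if client_zone == server_zone:
        return True                    # existing rule: same zone always allowed
    if server_zone != GLOBAL_ZONE:
        return False                   # never between two non-global zones
    # Proposed rule: cross-zone only toward a global-zone server, when TX is
    # enabled. allow_cross_zone models the hypothetical non-TX extension.
    return tx_enabled or allow_cross_zone
```

Keeping the scope as-is corresponds to leaving `allow_cross_zone` at its default; extending the case would mean defining who is allowed to set it on a non-TX system.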

Kais

 I agree, though being careful to use untrusted cookies, of course.

 The problem this case is trying to solve affects non-TX zones uses too.

Unix Domain Sockets for X11 clients in Trusted Extensions [LSARC/2008/506 FastTrack timeout 08/14/2008]

2008-08-06 Thread Kais Belgaied


 Solution

 a) Allow labeled zones to access global zone X11 server via UNIX domain 
 sockets

 If Trusted Extensions is enabled, the kernel will permit labeled zones
 to connect to global zone clients if the global zone UNIX domain
 rendezvous file is made available to the zone via a loopback mount.
   

When you do (b), (a) follows naturally without any extra change.
connect(3SOCKET)'ing to the AF_UNIX
socket named /var/tsol/doors/.X11-unix will succeed the moment that node
is visible to the zone.

Am I missing a change proposed in sockfs or another part of the Solaris
kernel as part of this case?

Kais.

 b) The X11 server will use a new rendezvous directory when TX is enabled.

 Normally, the UNIX domain rendezvous files are in the directory 
 /tmp/.X11-unix.
 To allow the rendezvous files to be exported to labeled zones, the directory
 pathname will be changed to:

 /var/tsol/doors/.X11-unix.

 This directory pathname is chosen because /var/tsol/doors is already
 loopback mounted into every labeled zone, to export the door rendezvous
 files for nscd and the label daemon.  To make this change transparent to
 clients, a symbolic link to /tmp/.X11-unix will be created in each zone,
 including the global zone.

 This solution will permit labeled zone X11 clients to use any of the
 various DISPLAY environment variables they have been using previously,
 and not require the use of TCP.

PSARC 2008/498 datalink sysevents

2008-08-05 Thread Kais Belgaied

completeness question: who's the intended consumer for this event?

Kais

Sebastien Roy wrote:
 I'm submitting this case for Cathy Zhou.  It is being filed as closed
 approved automatic.

 Datalink sysevents
 --

 release binding: patch

 Summary
 ---

 This case proposes to introduce a new EC_DATALINK sysevent class to
 report data-link related sysevents.  For now, only one subclass
 (ESC_DATALINK_PHYS_ADD) will be introduced.  It will be generated
 when a new physical data-link shows up on the system.  In the future,
 the EC_DATALINK sysevent class can be extended to report other
 data-link sysevents, such as a data-link renaming event.

 Since we are still experimenting with the new sysevent class, the format
 of the ESC_DATALINK_PHYS_ADD sysevent will be classified as Project
 Private.

 Interface Table
 ---

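  ----------------------------------------------------------------
  Interface               Commitment Level        Comments
  ----------------------------------------------------------------
  EC_DATALINK             Consolidation Private   Event class
  ESC_DATALINK_PHYS_ADD   Project Private         Event subclass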

PSARC 2008/498 datalink sysevents

2008-08-05 Thread Kais Belgaied

Garrett D'Amore wrote:
 Thanks for the clarification.

ditto.

Kais


 (I wasn't aware that RCM was being used this way.  I recall that once 
 upon a time there was a separate sysevent architecture where 
 insertion events were handled without RCM interposing.  The point, at 
 the time, of having a separate RCM from sysevent was that RCM could 
 interpose, and ultimately refuse certain operations, based on 
 consuming nodes.  Of course, this goes back to the Solaris 8 
 timeframes and the design discussions I had surrounding RCM.  Ancient 
 history, now.  Anyway, your suggested usage seems sane to me.)

-- Garrett


 Cathy Zhou wrote:
 RCM is also used to restore all the configuration when the device is 
 plugged back in and that is when this sysevent will be used.

 - Cathy

 RCM is normally (historically, anyway) used for device *removal*, 
 rather than addition.

 What do you intend the RCM module to do with this event?

-- Garrett

 Cathy Zhou wrote:
 This event will be consumed by a syseventd module which in turn 
 will generate a RCM event which will then be consumed by the RCM 
 modules.

 But the usage of the EC_DATALINK class would not be limited to this.

 - Cathy

 completeness question: who's the intended consumer for this event?

Kais

 Sebastien Roy wrote:
 I'm submitting this case for Cathy Zhou.  It is being filed as 
 closed
 approved automatic.

 Datalink sysevents
 --

 release binding: patch

 Summary
 ---

 This case proposes to introduce a new EC_DATALINK sysevent 
 class to
 report data-link related sysevents.  For now, only one subclass
 (ESC_DATALINK_PHYS_ADD) will be introduced.  It will be 
 generated
 when a new physical data-link shows up on the system.  In the 
 future,
 the EC_DATALINK sysevent class can be extended to report other
 data-link sysevents, such as a data-link renaming event.

 Since we are still experimenting with the new sysevent class, the
 format
 of the ESC_DATALINK_PHYS_ADD sysevent will be classified as 
 Project
 Private.

 Interface Table
 ---

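  ----------------------------------------------------------------
  Interface               Commitment Level        Comments
  ----------------------------------------------------------------
  EC_DATALINK             Consolidation Private   Event class
  ESC_DATALINK_PHYS_ADD   Project Private         Event subclass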

PSARC 2008/473 Fine-Grained Privileges for Datalink Administration

2008-07-28 Thread Kais Belgaied

Could you include a delta of the privileges(5) man page, the
out-of-the-box exec_attr(4), and dladm(1M)
as modified by this case?

Kais

libnet [PSARC/2008/409 FastTrack timeout 07/03/2008]

2008-07-09 Thread Kais Belgaied

The changes made for using /dev/net seem to be a good compromise for
benefiting from the UV features while
getting this library into OpenSolaris quickly enough, and opening the
door for adding further dependent apps and libs
to FOSS.
Architecturally, this library injects, captures, and parses packets at the
Ethernet frame level and at the IP and higher-level
protocols level, so I see value in a future project for porting libnet
to the PF_PACKET socket as soon as the latter is ready.

The PF_PACKET porting project to OpenSolaris is being implemented by the
gld-iteam. GLD-iteam:
it would be good to ARC it soon.

Kais.

Mark A. Carlson wrote:
 More time was requested at today's PSARC meeting so I have
 extended the timer on this case to 07/16/2008

 -- mark

 Mark A. Carlson wrote:
 The Project team has updated the FOSS checklist for libnet, with 
 changes in sections:
 2.3.2 (upstream support),
 3.4.7 (privileges),
 3.7 (code modifications - dlpi).

 -- mark



PSARC 2007/611 Intel 10GbE PCIE NIC Driver

2007-10-22 Thread Kais Belgaied

Cecilia Hu wrote:
 I am sponsoring this case for Samuel Tu.  This case is to provide
 a new NIC driver, ixge(7D), for Intel 10GbE PCI Express Adapter.
 The requested release binding is micro/patch.

 The I-team considers it better to archive this driver in PSARC. Since
 the architecture is straightforward and the interface is
 clear, I am marking it as closed approved automatic.


 -Cecilia



 Template Version: @(#)sac_nextcase 1.56 10/26/05 SMI
 This information is Sun Proprietary: Need-to-Know

 1. Introduction
 1.1. Project/Component Working Name:
  Intel 10GbE PCIE NIC Driver
 1.2. Name of Document Author/Supplier:
  Author: Samuel Tu
 1.3  Date of This Document:
  22 October, 2007

 4. Technical Description
 This case adds support for Intel 10GbE PCI Express Adapter Driver
 into ON.




 The architecture of the Intel 10GbE PCI Express Adapter differs
 significantly from the Intel 82597EX based PCIX Adapter, which is
 supported by ixgb. An important new feature of this adapter is I/OAT
 (I/O Acceleration Technology) from Intel which will be helpful for
 performance improvement. So we introduce a new driver to support 
 them.

Let's see. I/OAT is a collection of new capabilities that may involve
the NIC, the chipset and/or
the CPU.
Could you say more about which of these capabilities this case will
be using/supporting?
Which are expected to actually be present on SPARC, Intel, and AMD systems?
Will the driver invoke any new kernel interface to query whether the
platform-specific
features are present or not? Is this case introducing these interfaces,
or are they covered elsewhere?

Also, looking at Intel's docs (and previous presentations), there is a
hint of a need for an optimized
TCP/IP stack in order to benefit from I/OAT (see for instance
http://download.intel.com/technology/comms/perfnet/download/98856.pdf).
Is this case introducing changes for the OpenSolaris TCP/IP stack to be
able to use I/OAT? Are any new
interfaces needed to negotiate such capabs?

Last, one comment about the asynchronous low-cost data copy (a.k.a.
Intel's QuickData component of I/OAT):
this seems to be a generic enough functionality, with benefits beyond
networking.
My suggestion is to consider exposing the interfaces that use it.

 Kais.

 Intel has a software license agreement with Sun that allows Sun to integrate
 this driver and distribute the software in both source and binary
 object code forms. This SLA also grants Sun the right to make
 modifications to the source code and distribute the modified driver
 in open source and binary forms.

 The driver supports x86/x64 and SPARC platforms.

 The vendor ID and device ID of the chips supported are:
 pci8086,10C6
 pci8086,10C7

[PSARC/2007/599 FastTrack timeout 10/23/2007]

2007-10-18 Thread Kais Belgaied

David Marx wrote:

 quick question: When the RAM is shared with multiple OS instances 
 (virtual machines),
 is 1/4 of all available memory a reasonable limit?
 Should this be 1/4 of RAM available to the domain (host or guest) ?

 Like the resources project.max-crypto-memory and
 project.max-shm-memory, project.max-device-locked-memory
 is based on the kernel variable availrmem_initial.
 Therefore, I suspect that this is 1/4 of the memory available
 to the guest.

What about dom0? Assuming all RAM is seen as available to dom0,
locking 1/4 of it
is probably excessive.
I'm not sure much can be done for such a situation, other than a word of
caution in the man page
documenting project.max-device-locked-memory, for the case of xVM.
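
A quick worked example of the concern, as a sketch only: the function name is hypothetical, and treating availrmem_initial as a page count with 4 KiB pages is an assumption made for the arithmetic.

```python
def default_locked_memory_limit(availrmem_initial_pages: int,
                                page_size: int = 4096) -> int:
    """Bytes allowed by a limit of 1/4 of availrmem_initial (assumed to be in pages)."""
    return (availrmem_initial_pages * page_size) // 4


GIB = 1 << 30

# A guest that sees 2 GiB gets a 512 MiB default limit.
guest_limit = default_locked_memory_limit(2 * GIB // 4096)

# A dom0 that sees all 16 GiB of the host gets a 4 GiB default limit,
# even though most of that RAM may later be handed to guests.
dom0_limit = default_locked_memory_limit(16 * GIB // 4096)
```

The formula is the same in both cases; only the input differs, which is why a caution in the man page may be the most that can be done for dom0.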

Kais

[PSARC/2007/599 FastTrack timeout 10/23/2007]

2007-10-17 Thread Kais Belgaied


 3. Proposed Solution

To solve this, we propose
increasing this to 1/4 of available memory, which is also the
limit that agpgart imposes.

 4. Risks;

There is the risk that increasing this resource may allow
the system to allocate too much memory, which may cause
the Solaris kernel to run out.  The kernel is probably
not graceful when it runs out of memory.

If increasing this resource is not acceptable, and having
the user manually increase the resource is not
acceptable, then either Sun or the Xorg community need to
change the Xorg Intel graphics drivers to use less memory
for Sun to incorporate these drivers into the Solaris
product.

Increasing this resource affects both x86 and sparc,
although it is only currently needed on x86. 


quick question: When the RAM is shared with multiple OS instances 
(virtual machines),
is 1/4 of all available memory a reasonable limit?
Should this be 1/4 of RAM available to the domain (host or guest) ?

Kais.

[clearview-discuss] 2007/527 Addendum for Clearview Vanity Naming and Nemo Unification

2007-09-21 Thread Kais Belgaied

Cathy Zhou wrote:
 John Plocher wrote:
 Kais Belgaied wrote:
 Ah! In this case, it seems to be just an internal interface between Nemo
 and itself. If that's true, then it's an implementation choice and
 shouldn't be
 exposed to device driver writers as part of the MAC_CAPAB* interface,
 which is maturing and will soon become committed, and the case can just be
 withdrawn as it turns out to be below the radar for an ARC review.


 Probably better to change it to closed approved automatic - the project
 isn't being withdrawn, it /is/ going into the product, it just doesn't
 need the ARCs to do anything formal along the way :-)

 Please note that, other than the MAC_CAPAB_NO_NATIVEVLAN interface,
 this case also proposes other interfaces (DLIOCMARGININFO ioctl,
 m_margin etc.) that would be exposed to the device drivers.

Noted. The DLIOCMARGININFO ioctl and the gldm_margin field in gld_mac_info_t
(taken from an existing reserved field)
seemed non-controversial to me.

Kais


 Thanks
 - Cathy

[clearview-discuss] 2007/527 Addendum for Clearview Vanity Naming and Nemo Unification

2007-09-19 Thread Kais Belgaied

Sebastien Roy wrote:
 Kais Belgaied wrote:

 This sounds a little upside-down.
 A driver has to advertise a negative capability, essentially saying
 "Hey, I can't handle this feature",
 as opposed to the more intuitive approach: drivers that can handle
 native VLAN expose a capab (MAC_CAPAB_NATIVEVLAN),
 and those that can't do not.
 Any reason for this choice?

 This discussion veered off a little bit, and I want to bring it back 
 on-topic and make sure that progress is being made on this case.  
 Kais, was your original question answered?
well, not really.
I'm not sure I understand the following argument:
 It sounds really odd-ball to me, too.  Plus, it would require
  touching all the existing GLDv3 drivers.

 No. We introduce MAC_CAPAB_NO_NATIVEVLAN exactly because
 we do not want to touch most of the GLDv3 drivers.
 MAC_CAPAB_NO_NATIVEVLAN means that this driver cannot handle VLAN
 PPA access itself (therefore, it might also imply that this driver
 does not handle the hardware checksum for VLAN packets). Existing
 GLDv3 drivers should *not* advertise this capability, except the
 aggr driver, which might, depending on the underlying aggregated drivers.


Does it mean that by default *all* GLDv3 drivers do, or are assumed to,
support native VLAN, with the
exception of only a few?

Two other unclear points brought up by the discussion are the
interaction of the proposed
capab with the HW Checksum capability, and with
MAC_CAPAB_PERSTREAM:

 We handle MAC_CAPAB_NO_NATIVEVLAN differently in two places:

 a. If mac_open() is for a VLAN PPA accessed stream, and the underlying 
 MAC supports MAC_CAPAB_PERSTREAM, but *not* MAC_CAPAB_NO_NATIVEVLAN, 
 we can open the underlying driver directly using its native VLAN PPA 
 access.

 b. If the MAC is MAC_CAPAB_NO_NATIVEVLAN, then do not advertise its
 HW_CKSUM capability on VLAN streams, even if the MAC claims it is capable
 of doing HW CKSUM.
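
Rules (a) and (b) above can be sketched as two predicates. This is a Python model for discussion, not GLDv3 code: representing a MAC's capabilities as a bitmask and the two function names are assumptions made for the illustration; only the capability names come from the thread.

```python
from enum import Flag


class MacCapab(Flag):
    """Capabilities named in the thread, modeled here as a bitmask."""
    NONE = 0
    PERSTREAM = 1 << 0
    NO_NATIVEVLAN = 1 << 1
    HW_CKSUM = 1 << 2


def open_native_vlan_ppa(capabs: MacCapab) -> bool:
    """Rule (a): open the driver's own VLAN PPA only if it is PERSTREAM
    and has not opted out with NO_NATIVEVLAN."""
    return bool(capabs & MacCapab.PERSTREAM) and not (capabs & MacCapab.NO_NATIVEVLAN)


def advertise_hw_cksum_on_vlan(capabs: MacCapab) -> bool:
    """Rule (b): hide HW_CKSUM on VLAN streams when NO_NATIVEVLAN is set."""
    return bool(capabs & MacCapab.HW_CKSUM) and not (capabs & MacCapab.NO_NATIVEVLAN)
```

Written this way, the negative capability acts as an opt-out: a MAC that sets nothing new keeps its existing behavior, and only a MAC that sets NO_NATIVEVLAN changes both answers.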



 To summarize, the only driver expected to implement this is the
 softmac driver introduced by UV (PSARC/2006/499).  The capability's
 semantics were defined in such a way that no other driver
 has to care about its existence.


Ah! In this case, it seems to be just an internal interface between Nemo and
itself. If that's true, then it's an implementation choice and shouldn't be
exposed to device driver writers as part of the MAC_CAPAB* interface,
which is maturing and will soon become committed, and the case can just be withdrawn as
it turns out to be below the radar for an ARC review.

Kais.


 -Seb

[clearview-discuss] 2007/527 Addendum for Clearview Vanity Naming and Nemo Unification

2007-09-14 Thread Kais Belgaied


  ** MAC_CAPAB_NO_NATIVEVLAN

 A MAC_CAPAB_NO_NATIVEVLAN MAC capability will be added to the
 GLDv3 framework to indicate that a specific MAC cannot support
 VLAN PPA access by itself.
   

This sounds a little upside-down.
A driver has to advertise a negative capability, essentially saying
"Hey, I can't handle this feature",
as opposed to the more intuitive approach: drivers that can handle
native VLAN expose a capab (MAC_CAPAB_NATIVEVLAN),
and those that can't do not.
Any reason for this choice?

Kais.

2007/271 HME/QFE updates

2007-05-30 Thread Kais Belgaied

Garrett,

The transition plan for a customer that deployed a trunk of qfe's is
not clear to me.

Say someone used the Sun Trunking software to build a 4-port trunk, with
qfe2 as the head of the trunk,
defined some load balancing policy, and used the name 'qfe2' in various
config places
(hostname.qfe0, IPFilter config files, third party firewalls, etc.).
What happens when they upgrade to the new version? Who will convert the
trunking configurations
to create aggrs, and replace the 'qfe2' names with aggr1 everywhere?

BTW, there was a precedent for this kind of renaming, with the transition
from ipge to e1000g
(the e1000g transition patch, 123334-01).

I don't believe documentation is sufficient here.

Kais


 Sun Trunking Impact
 ---

 Using the Nemo interfaces means that the owners of QFE will have to use
 the nemo link aggregation commands with dladm(1M).   While this is to be
 viewed as a good thing, it does represent change that will need to be 
 noted
 in release notes, and such.

 The other Sun NIC drivers which are supported by Sun Trunking are the
 GEM (ge) and Cassini (ce) drivers.  We hope to move both of those to
 Nemo as well, in the near future, and follow up with an EOF of the
 Sun Trunking product altogether.  However, this is out of scope for this
 particular case.

GLDv3 link status logging [PSARC/2007/298 Self Review]

2007-05-29 Thread Kais Belgaied


Problem
---

Various network drivers are inconsistent in their handling of logging of
link messages.  One of the more annoying things that some drivers do is
flood the logs with link down messages (usually once every 10sec or so) when
trying to transmit packets out the link.
  

The root cause of the problem as you describe it seems to be the fact that
the stack above kept
submitting packets to a link that is known to be down, causing the
flood of syslog messages.
Somehow the link-down event was not generated, was lost during
notification, or was mishandled.
That is a bug to be fixed between the stack and the specific drivers on which you
observed the misbehavior.
The bug is probably below the radar screen for ARC.

Now, back to the symptoms (the scope of this case): each futile submission
of a packet to be
transmitted on a down link indicates a problem worth paying attention
to. It could be
uncovering a bug such as the above, or it could be a transient race. I
don't believe it
is bad practice for driver writers to adopt a defensive approach and log
an error on every occurrence of the offense.

Kais

Further, the detailed contents for link status changes are not consistent
from one driver to another.

Notably, the WIFI drivers generally do not do this.
