Re: [ewg] is qlgc_vnic supposed to work with RHEL6[.2]?

2012-02-16 Thread Brian J. Murrell
On 12-02-16 02:11 PM, Tom Elken wrote:
> 
> Hi Brian,

Hi Tom,

> For now we do not yet have a RHEL6 port of qlgc_vnic.  Our final answer to 
> this question is dependent on some activities which will occur after the 
> close of the Intel acquisition of the QLogic IB program*.  We recommend you 
> leave things in their present state and omit qlgc_vnic from your builds for 
> RHEL6, until such time as we can have a more definitive solution.

OK.  Thanks for the update.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] is qlgc_vnic supposed to work with RHEL6[.2]?

2012-02-16 Thread Brian J. Murrell
On 12-02-09 02:59 PM, Tom Elken wrote:
> 
> Brian,

Hi Tom,

> QLogic contributed this code to OFED, so it's on us to comment.
> We're still working on an answer to your question/issue.  Sorry for the delay.

Any news to report yet?

Cheers,
b.





Re: [ewg] is qlgc_vnic supposed to work with RHEL6[.2]?

2012-02-09 Thread Brian J. Murrell
On 12-02-09 02:59 PM, Tom Elken wrote:
> 
> Brian,

Hi Tom,

> QLogic contributed this code to OFED, so it's on us to comment.

Ahhh.  Great.

> We're still working on an answer to your question/issue.  Sorry for the delay.

OK.  No worries.  When there is no answer for some time it leaves one
wondering if one is talking in a vacuum.  Clearly that's not the case so
I will sit patiently.  :-)

Thanks for looking into it.

Cheers,
b.




Re: [ewg] is qlgc_vnic supposed to work with RHEL6[.2]?

2012-02-09 Thread Brian J. Murrell
On 12-02-01 04:26 PM, Brian J. Murrell wrote:
> We've been having some problems building the qlgc_vnic with RHEL 6.2:
> 
> In file included from 
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_config.h:41,
>  from 
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.h:42,
>  from 
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:45:
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_control.h:76: 
> error: expected identifier before '(' token
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c: In 
> function 'vnic_setup':
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1063: 
> error: 'struct net_device' has no member named 'get_stats'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1064: 
> error: 'struct net_device' has no member named 'open'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1065: 
> error: 'struct net_device' has no member named 'stop'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1066: 
> error: 'struct net_device' has no member named 'hard_start_xmit'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1067: 
> error: 'struct net_device' has no member named 'tx_timeout'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1068: 
> error: 'struct net_device' has no member named 'set_multicast_list'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1069: 
> error: 'struct net_device' has no member named 'set_mac_address'
> /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1070: 
> error: 'struct net_device' has no member named 'change_mtu'
> make[4]: *** 
> [/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.o] Error 1
> make[3]: *** [/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic] Error 
> 2
> make[2]: *** [/BUILD/ofa_kernel-1.5.4/drivers/infiniband] Error 2
> 
> I notice that the daily builds don't try to build the qlgc_vnic
> though.  Is this supposed to work with RHEL 6[.2], or any post 2.6.31
> for that matter?

Is there really nobody here who can comment on this?

b.





[ewg] is qlgc_vnic supposed to work with RHEL6[.2]?

2012-02-01 Thread Brian J. Murrell
We've been having some problems building the qlgc_vnic with RHEL 6.2:

In file included from /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_config.h:41,
                 from /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.h:42,
                 from /BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:45:
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_control.h:76: error: expected identifier before '(' token
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c: In function 'vnic_setup':
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1063: error: 'struct net_device' has no member named 'get_stats'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1064: error: 'struct net_device' has no member named 'open'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1065: error: 'struct net_device' has no member named 'stop'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1066: error: 'struct net_device' has no member named 'hard_start_xmit'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1067: error: 'struct net_device' has no member named 'tx_timeout'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1068: error: 'struct net_device' has no member named 'set_multicast_list'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1069: error: 'struct net_device' has no member named 'set_mac_address'
/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.c:1070: error: 'struct net_device' has no member named 'change_mtu'
make[4]: *** [/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic/vnic_main.o] Error 1
make[3]: *** [/BUILD/ofa_kernel-1.5.4/drivers/infiniband/ulp/qlgc_vnic] Error 2
make[2]: *** [/BUILD/ofa_kernel-1.5.4/drivers/infiniband] Error 2

I notice that the daily builds don't try to build the qlgc_vnic
though.  Is this supposed to work with RHEL 6[.2], or any post 2.6.31
for that matter?
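
For reference, the "has no member named" errors look consistent with the
net_device_ops conversion: the per-device callbacks moved out of struct
net_device into a separate ops table, and the old members went away
around 2.6.31.  Below is a minimal user-space sketch of that pattern.
The two structs are cut-down stand-ins, not the real kernel definitions,
and the demo_* names and the "up" flag are made up for illustration.  It
is not the actual qlgc_vnic fix.

```c
/* User-space stand-ins for illustration only; the real definitions
 * live in <linux/netdevice.h> and carry many more members. */
struct net_device;

struct net_device_ops {
	int (*ndo_open)(struct net_device *dev);
	int (*ndo_stop)(struct net_device *dev);
	int (*ndo_change_mtu)(struct net_device *dev, int new_mtu);
};

struct net_device {
	int mtu;
	int up;		/* hypothetical flag, not a real kernel member */
	const struct net_device_ops *netdev_ops;
};

/* Hypothetical driver callbacks. */
static int demo_open(struct net_device *dev)
{
	dev->up = 1;
	return 0;
}

static int demo_stop(struct net_device *dev)
{
	dev->up = 0;
	return 0;
}

static int demo_change_mtu(struct net_device *dev, int new_mtu)
{
	if (new_mtu < 68)	/* arbitrary lower bound for the demo */
		return -1;
	dev->mtu = new_mtu;
	return 0;
}

/* One shared, constant table replaces the per-device assignments. */
static const struct net_device_ops demo_netdev_ops = {
	.ndo_open	= demo_open,
	.ndo_stop	= demo_stop,
	.ndo_change_mtu	= demo_change_mtu,
};

/* Old style was: dev->open = demo_open; dev->stop = demo_stop; ...
 * New style hangs a single ops pointer on the device instead. */
static void demo_setup(struct net_device *dev)
{
	dev->mtu = 1500;
	dev->up = 0;
	dev->netdev_ops = &demo_netdev_ops;
}
```

A driver that has to build on both sides of the change would presumably
keep both forms behind a kernel-version or backport check.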

Cheers,
b.




Re: [ewg] EWG/OFED meeting minutes for Sep 27, 2010

2010-10-01 Thread Brian J. Murrell
On Tue, 2010-09-28 at 16:03 +0200, Tziporet Koren wrote: 
> 
> 2. OFED 1.5.2 - release was provided

This is great news!

> 3. How to handle the SDP issue on 32 bits systems and other bugs?
>    It was decided not to have 1.5.2-1 release.

I wonder why then, we have "1.5.2-1" daily releases (and no 1.5.3 daily
releases).

And why "-1"?  Wouldn't ".1" be a more consistent nomenclature?

b.




Re: [ewg] Delaying OFED 1.5.2 release

2010-07-14 Thread Brian J. Murrell
On Wed, 2010-07-14 at 18:44 +0300, Tziporet Koren wrote: 
> Hi All,
> 
> I wish to delay the OFED 1.5.2 release to Aug.
> 
> New Schedule will be:
> 
> - RC3  - Jul 29
> - RC4  - Aug  8
> - GA   - Aug 23

Will this allow time for bug 2053 to be addressed?

b.




[ewg] ofed 1.5.1 and rhel4u7: ipath_driver.c:150: error: unknown field `groups' specified in initializer

2010-06-08 Thread Brian J. Murrell
Hi All,

In trying to build OFED 1.5.1 with RHEL4U7's linux-2.6.9-78.0.24 kernel,
I am getting the following errors:

  gcc 
-Wp,-MD,/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/.ipath_driver.o.d
 -nostdinc -iwithprefix include -D__KERNEL__ -D__OFED_BUILD__  -include 
include/linux/autoconf.h  -include 
/cache/build/BUILD/ofa_kernel-1.5.1/include/linux/autoconf.h  
-I/cache/build/BUILD/ofa_kernel-1.5.1/kernel_addons/backport/2.6.9_U7/include/  
   -I/cache/build/BUILD/ofa_kernel-1.5.1/include  
-I/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/debug  
-I/usr/local/include/scst  
-I/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/ulp/srpt  
-I/cache/build/BUILD/ofa_kernel-1.5.1/drivers/net/cxgb3  -Iinclude
-I/cache/build/BUILD/lustre-kernel-2.6.9/lustre-1.8.3.53/linux-2.6.9-78.0.24/arch//include
   -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common 
-Os -fomit-frame-pointer -g -Wdeclaration-after-statement  -mno-red-zone 
-mcmodel=kernel -pipe -fno-reorder-blocks  -Wno-sign-compare -funit-at-a-time 
-DIPATH_IDSTR=QLogic kernel.org driver -DIPATH_KERN_TYPE=0  -DMODULE 
-DKBUILD_BASENAME=ipath_driver -DKBUILD_MODNAME=ib_ipath -c -o 
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/.tmp_ipath_driver.o
 /cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:150:
 error: unknown field `groups' specified in initializer
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:150:
 warning: initialization from incompatible pointer type
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c: 
In function `ipath_verify_pioperf':
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:366:
 warning: implicit declaration of function `__iowrite32_copy'
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c: 
In function `ipath_init_one':
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:620:
 warning: passing arg 1 of `iounmap' discards qualifiers from pointer target 
type
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c: 
In function `ipath_remove_one':
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:789:
 warning: passing arg 1 of `iounmap' discards qualifiers from pointer target 
type
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c: 
In function `ipath_reset_device':
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:2598:
 warning: implicit declaration of function `pid_nr'
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c: 
In function `ipath_signal_procs':
/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.c:2656:
 warning: implicit declaration of function `kill_pid'
make[4]: *** 
[/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath/ipath_driver.o]
 Error 1
make[3]: *** [/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband/hw/ipath] 
Error 2
make[2]: *** [/cache/build/BUILD/ofa_kernel-1.5.1/drivers/infiniband] Error 2
make[1]: *** [_module_/cache/build/BUILD/ofa_kernel-1.5.1] Error 2
make[1]: Leaving directory 
`/cache/build/BUILD/lustre-kernel-2.6.9/lustre-1.8.3.53/linux-2.6.9-78.0.24'
make: *** [kernel] Error 2

I know these sorts of things usually wind up being operator error but
I've checked the release notes and they specify as supported:

- RedHat EL4 up7          2.6.9-78.ELsmp

Does that really preclude the 2.6.9-78.0.24 variant of that kernel?

I have checked into the nature of the problem and found that the struct
being initialized is:

static struct pci_driver ipath_driver = {
	.name = IPATH_DRV_NAME,
	.probe = ipath_init_one,
	.remove = __devexit_p(ipath_remove_one),
	.id_table = ipath_pci_tbl,
	.driver = {
		.groups = ipath_driver_attr_groups,
	},
};

struct pci_driver has the following definition:

struct pci_driver {
	struct list_head node;
	char *name;
	const struct pci_device_id *id_table;	/* must be non-NULL for probe to be called */
	int  (*probe)  (struct pci_dev *dev, const struct pci_device_id *id);	/* New device inserted */
	void (*remove) (struct pci_dev *dev);	/* Device removed (NULL if not a hot-plug capable driver) */
	int  (*suspend) (struct pci_dev *dev, u32 state);	/* Device suspended */
	int  (*resume) (struct pci_dev *dev);	/* Device woken up */
	int  (*enable_wake) (struct pci_dev *dev, u32 state, int enable);	/* Enable wake event */

	struct device_driver	driver;
	struct pci_dynids dynids;
};

and struct device_driver the following definition:

struct device_driver {
char* name;
struct bus_type   

Re: [ewg] [Fwd: a forum?]

2010-05-06 Thread Brian J. Murrell
On Thu, 2010-05-06 at 14:46 -0400, David Dillow wrote: 
> 
> Maybe, but I think even that is a bad way to go. From my experience,
> users of forums are less likely to quote the context for their messages
> -- they're used to the message they are replying to being right above
> them on the web page. Gating the messages to the mailing list would just
> give us words in a vacuum.

Yeah.  Such a gating implementation would probably need to
quote the message that the forum posting is in reply to.

Note, that I am not proposing that a gatewayed forum would be good (I'd
just as soon see no forums implemented -- I have a real dislike for
them), but just that if enough pressure did mount to provide them,
having them gated to the list is better than stand-alone forums to
prevent fragmenting the community into two different groups
communicating amongst themselves.

b.




Re: [ewg] [Fwd: a forum?]

2010-05-06 Thread Brian J. Murrell
On Thu, 2010-05-06 at 09:05 -0700, Jeff Becker wrote: 
> FYI.

Unless it was only a gateway to/from a/the mailing list(s), I would
vehemently vote against forums.  Creating un-gatewayed forums in
addition to keeping the list(s) would, IMHO, just serve to fragment the
community.

b.

> -------- Original Message --------
> Subject:  a forum?
> Date: Thu, 6 May 2010 09:41:05 -0500
> From: Tushar Kapila 
> To:   webmas...@openfabrics.org 
> 
> 
> 
> it would be nice to have a forum on your site
> any out of the box forum software ... or a few more read mes ... how a
> regular windows/ java user can use your SDP sockets 
> Regards,
> Tushar Kapila
> 
> ___
> ewg mailing list
> ewg@lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg




Re: [ewg] Version of OFED in RHEL6?

2010-04-22 Thread Brian J. Murrell
On Thu, 2010-04-22 at 16:57 +0200, Peter Kjellstrom wrote: 
> 
> I interpret Doug's comment as: no version of OFED. The infiniband stack will
> be a different set of infiniband packages than any specific OFED release.

OK.  That's fair enough.  I'm not sure if the OP was looking for this in
particular, but my interest is more about what's in the kernel itself
rather than the userspace packages.  Is it fair to try to put a single
version on the portion of the stack that lives in the kernel, or is that
just as difficult, since it too could come from many places?

b.




Re: [ewg] Version of OFED in RHEL6?

2010-04-22 Thread Brian J. Murrell
On Wed, 2010-04-21 at 15:50 -0700, Andy Grover wrote: 
> Jon Forrest wrote:
> > What version of OFED should we expect to
> > find in RHEL6?
> 
> It looks like RHEL6 will not include all of OFED, but pick things from 
> OFED's upstream. see https://bugzilla.redhat.com/show_bug.cgi?id=463454
> 
> "All of the SGI package requests are met as of the Beta. However, please 
> note that the packages are pulled from upstream, and not from OFED. 
> Going forward in RHEL6 we will be pulling all InfiniBand related code 
> from upstream (from the upstream Linus kernel for kernel code and from 
> the various upstream package maintainers for individual packages). So if 
> people have specific requests for a certain feature in a certain 
> package, then they need to request that specific package be updated."

So what is the answer to the OP's original question though?  Probably
this is not really a question for this list, but surely, somebody here
knows...  Given the above statement about getting OFED from the upstream
kernel rather than an OFA release, what does that mean in terms of which
actual version of OFED will be in RHEL6?

An additional note on the above mentioned bug notes:

According to the latest builds, the following package versions
are in 6.0:

infiniband-diags-1.5.3-1.el6
perftest-1.2.3-1.el6
qperf-0.4.6-1.el6

That first one kinda indicates to me that it might be OFED (core) 1.5.3.
Would that be an unreasonable conclusion?

b.




Re: [ewg] EWG/OFED meeting agenda for today (Jan 11, 2010)

2010-01-11 Thread Brian J. Murrell
On Mon, 2010-01-11 at 18:26 +0200, Tziporet Koren wrote: 
> This is the agenda for the meeting today:
> 
> 1. OFED 1.5.1 status:
... 
> Other changes that should be done:
... 
> - Other - let's discuss in the meeting today

How about the ISER/iSCSI on kernels < 2.6.30 situation?

I'm not sure I can make the call, but regardless, it would be nice to
have it discussed and a status included in the meeting minutes.

b.




[ewg] vendor supplied iser CANNOT be used with OFED 1.5

2010-01-05 Thread Brian J. Murrell
Hi,

There is a bug (1764) which has been closed WONTFIX as well as a
discussion in the thread at
http://lists.openfabrics.org/pipermail/ewg/2009-October/013958.html
about iser and OFED 1.5.

The gist of all of it is that the ISER in OFED 1.5 is supported only by
kernels 2.6.30+ (there is no backport for kernels < 2.6.30).  The thread
meandered into the territory of what a, say, RHEL5 user is supposed to
do and it was recommended (in
http://lists.openfabrics.org/pipermail/ewg/2009-October/013980.html)
that he should use the RedHat supplied iSCSI stack.  It was also
reported in the same message by Or, that (unsurprisingly) the vendor
supplied iSCSI stack doesn't load with OFED 1.5 supplied as an external
(set of) module(s) as it produces symbol mismatch errors.  This is my
current experience:

ib_iser: disagrees about version of symbol ib_fmr_pool_unmap
ib_iser: Unknown symbol ib_fmr_pool_unmap
ib_iser: disagrees about version of symbol ib_create_cq
ib_iser: Unknown symbol ib_create_cq
ib_iser: disagrees about version of symbol rdma_resolve_addr
ib_iser: Unknown symbol rdma_resolve_addr

To be fair, in the same message (above) it was reported that modprobe
with -f will load the iser module.  modprobe does not by default use the
(-f) force switch so during automatic module loading, iser still fails
to load and requires user intervention to force it to load.

Also, in that same above message, Or reported that he would do further
investigation on this "and think a bit".

Nothing more was done.  Now we are left with this "hacky" solution of
having to force load the iser module, with who knows what repercussions.
Was the use of this force loading for older kernels such as RHEL5 tested
in the QA process for OFED 1.5 and certified to be non-harmful to the
combination of the vendor iSCSI stack with OFED 1.5?  How about other
distros with kernels < 2.6.30?

b.



RE: [ewg] ofa_1_5_kernel 20091211-0200 daily build status

2009-12-11 Thread Brian J. Murrell
On Fri, 2009-12-11 at 07:36 -0700, Tung, Chien Tin wrote: 
> 
> The website seems to be okay right now.

Indeed.  Same from here.

> But the latest in daily is 12-8 @ 6:08.
> Did I screw up again?  :-)

I see today's daily snapshot just fine.  Our daily testing of it has
just launched in fact.

b.




Re: [ewg] ofa_1_5_kernel 20091211-0200 daily build status

2009-12-11 Thread Brian J. Murrell
All built without error, but the website is timing out.  It pings.  It
just doesn't return GET requests.

Can anyone take a look?

b.




[ewg] missing daily build for 09/12/09?

2009-12-09 Thread Brian J. Murrell
Hi all,

I don't see a daily snapshot for Dec. 9, 2009 on the website:

[DIR]Parent Directory - 
[TXT]latest.txt  08-Dec-2009 06:08   27   
[ ]latest.tgz  08-Dec-2009 06:08   65M  
[ ]OFED-1.5-20091208-06..> 08-Dec-2009 06:08   65M  
[ ]OFED-1.5-20091207-07..> 07-Dec-2009 07:34   65M  
[ ]OFED-1.5-20091207-06..> 07-Dec-2009 06:49   65M  
...

Is this a known condition?

b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-20 Thread Brian J. Murrell
On Fri, 2009-11-20 at 11:07 -0600, Jon Mason wrote: 
> 
> I can provide assistance, if you wish.

My use for a git clone here is probably next to nil and I just have too
many pokers in the fire to take it on right now.

> Hop on IRC channel #ofed on OFTC
> or e-mail me with questions.

I do appreciate the offer though.


> You can send Vlad a normal patch and he'll include it.  It's no biggie.
> I have a patch queued, so its no additional work for me.

Sure, feel free to just add mine to yours if you like the solution.
Easier than me figuring out the process for the few times I am likely to
need it.

b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-20 Thread Brian J. Murrell
On Fri, 2009-11-20 at 10:58 -0600, Jon Mason wrote: 
> 
> It looks like a valid solution to the problem.  I don't see any reason
> why he would reject it.  Have Brian push the patch

I don't have a git tree (or knowledge of how to use it, exactly -- a
TODO) so I don't really have the ability to push.

> (or I can commit it
> to my tree and push it).

Feel free if you wish.

b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-20 Thread Brian J. Murrell
On Fri, 2009-11-20 at 08:49 -0800, Jeff Becker wrote: 
>
> Actually - that's OK, since the problem arises from the fact that the
> backport file contains the definition because older versions of
> SLES10SP2 didn't.

Right.

> You're basically replacing the kernel header
> definition (in newer SLES10SP2 kernels) with the identical definition.

Right.  For as long as it remains identical (i.e. unless it's changed in
the SLES source -- for whatever, ever so remote, reason).

The purist in me would rather the solution worked the other way and
preferred the version supplied by the kernel if it's present and fell
back to the OFED definition if it's not.  I cannot figure out how to
engineer that though.  :-/

In the absence of that "more pure" solution, my patch seems to do the
job with that very slight risk of de-synchronization.
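
One way the "prefer the kernel's version, fall back to OFED's" ordering
could be sketched is with a configure-time compile test that defines a
feature macro when the kernel headers already provide the function.  In
the sketch below HAVE_IPV6_ADDR_LOOPBACK is a hypothetical macro name
(nothing in OFED is known to define it), and struct demo_in6_addr is a
user-space stand-in for struct in6_addr so that the example compiles
outside the kernel.

```c
#include <stdint.h>
#include <arpa/inet.h>	/* htonl() */

/* Stand-in for struct in6_addr: four 32-bit words in network order. */
struct demo_in6_addr {
	uint32_t addr32[4];
};

/*
 * HAVE_IPV6_ADDR_LOOPBACK is hypothetical.  The idea is that a
 * configure-time probe tries to compile a call to ipv6_addr_loopback()
 * against the target kernel headers and passes
 * -DHAVE_IPV6_ADDR_LOOPBACK when that succeeds.  Only when the probe
 * fails does the backport header supply its own copy.
 */
#ifndef HAVE_IPV6_ADDR_LOOPBACK
static inline int ipv6_addr_loopback(const struct demo_in6_addr *a)
{
	/* ::1 is all zero except for a final word equal to 1. */
	return ((a->addr32[0] | a->addr32[1] |
		 a->addr32[2] | (a->addr32[3] ^ htonl(1))) == 0);
}
#endif
```

The cost is an extra configure check to maintain, which may well be
more machinery than a single function deserves.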

Cheers,
b.




Re: [ewg] ofed 1.5 daily builds

2009-11-20 Thread Brian J. Murrell
On Fri, 2009-11-20 at 10:28 -0600, Jon Mason wrote: 
> On Fri, Nov 20, 2009 at 07:55:46AM -0500, Brian J. Murrell wrote:
> > I'm wondering why there are 3 daily builds for Nov. 19:
> > 
> > http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091119-0915.tgz
> > http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091119-0715.tgz
> > http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091119-0600.tgz
> > 
> > vs. the normal one daily build per day.
> 
> I believe those are related to the build breaks yesterday, with the first one 
> being the only one that will build for all distros.
> 
> Today's is out:
> http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091120-0600.tgz

Yes, indeed.  And along with the patch I posted in that other thread, it
seems to make me happy -- from an "it builds!" perspective at least.  I
am working to get an OFED 1.5-daily-snapshot testing cycle into our
routine daily testing so that we can identify regressions quickly
enough.

Cheers!

b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-20 Thread Brian J. Murrell
On Fri, 2009-11-20 at 10:14 -0500, Brian J. Murrell wrote: 
> 
> But it does mean usurping the possible definition of
> ipv6_addr_loopback() in the O/S for the one in OFED, for whatever that's
> worth.

And just to further reply to myself, this patch appears to do the job,
although I have to admit not being able to do any testing with ipv6 at
all:

--- kernel_addons/backport/2.6.16_sles10_sp2/include/net/ipv6.h.old 2009-11-03 14:17:26.000000000 -0500
+++ kernel_addons/backport/2.6.16_sles10_sp2/include/net/ipv6.h 2009-11-20 10:23:12.000000000 -0500
@@ -18,10 +18,12 @@
 		 (a->s6_addr32[2] ^ htonl(0x0000ffff))) == 0);
 }
 
-static inline int ipv6_addr_loopback(const struct in6_addr *a)
+static inline int __ipv6_addr_loopback(const struct in6_addr *a)
 {
 	return ((a->s6_addr32[0] | a->s6_addr32[1] |
 		 a->s6_addr32[2] | (a->s6_addr32[3] ^ htonl(1))) == 0);
 }
 
+#define ipv6_addr_loopback(a) __ipv6_addr_loopback(a)
+
 #endif




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-20 Thread Brian J. Murrell
On Fri, 2009-11-20 at 10:06 -0500, Brian J. Murrell wrote: 
> 
> Hrm.  Does cpp's "#if[n]def" construct work with C defined functions?

> My test says no:

But something like this will work:

#include <stdio.h>

static inline int my_foo()
{
	fprintf(stderr, "this is my_foo()\n");
	return 0;
}

static inline int foo()
{
	fprintf(stderr, "this is foo()\n");
	return 0;
}

#define foo() my_foo()

int main() {

	foo();
	return 0;
}
$ gcc -o /tmp/a{,.c}
$ /tmp/a
this is my_foo()

But it does mean usurping the possible definition of
ipv6_addr_loopback() in the O/S for the one in OFED, for whatever that's
worth.

b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-20 Thread Brian J. Murrell
On Thu, 2009-11-19 at 09:26 -0800, Jeff Becker wrote: 
> Hi.

Hi Jeff,

> Jon Mason suggested adding a "#ifndef ipv6_addr_loopback" around the
> function definition in
> kernel_addons/backport/2.6.16_sles10_sp2/include/net/ipv6.h . I'll look
> into this today.

Hrm.  Does cpp's "#if[n]def" construct work with C defined functions?
i.e. does:

static inline int ipv6_addr_loopback(const struct in6_addr *a)
{
	return ((a->s6_addr32[0] | a->s6_addr32[1] |
		 a->s6_addr32[2] | (a->s6_addr32[3] ^ htonl(1))) == 0);
}

provide cpp with a "ipv6_addr_loopback" definition such that cpp can
test for it?

My test says no:

$ cat /tmp/a.c
static inline int foo()
{
	return 0;
}

#ifndef foo
static inline int foo()
{
	return 0;
}
#endif

main() {
}
$ gcc -o /tmp/a{,.c}
/tmp/a.c:8: error: redefinition of ‘foo’
/tmp/a.c:3: error: previous definition of ‘foo’ was here

Maybe I am misunderstanding the proposed solution?
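
For what it's worth, "#ifndef" can only ever see preprocessor macros,
which is why the test above fails.  A guard like that works only if the
header that defines the function also defines a macro of the same name.
The sketch below shows that self-referential macro idiom in user space.
"Header A" and "Header B" are stand-ins squeezed into one file, and the
return values 1 and 2 exist only to show which definition wins.  It
relies on the first header cooperating, which the SLES kernel header in
question presumably does not.

```c
/* "Header A": stand-in for a newer kernel header that already
 * provides foo(). */
static inline int foo(void)
{
	return 1;
}
/* Self-referential marker: expands to itself, so calls still reach the
 * function above, but cpp now knows the name "foo" exists. */
#define foo foo

/* "Header B": stand-in for the backport header.  Its copy is skipped
 * because the marker above is defined. */
#ifndef foo
static inline int foo(void)
{
	return 2;
}
#endif
```

Without the "#define foo foo" line the second definition is compiled as
well and gcc reports the same redefinition error as in the test above.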

b.




[ewg] ofed 1.5 daily builds

2009-11-20 Thread Brian J. Murrell
I'm wondering why there are 3 daily builds for Nov. 19:

http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091119-0915.tgz
http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091119-0715.tgz
http://www.openfabrics.org/downloads/OFED/ofed-1.5-daily/OFED-1.5-20091119-0600.tgz

vs. the normal one daily build per day.

I'm also wondering where the daily build for Nov. 20 is.  Maybe I am
just being impatient.

Perhaps somebody can tell me what timezone the timestamp in the filename
is representing.

Cheers,
b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-19 Thread Brian J. Murrell
On Thu, 2009-11-19 at 16:28 +0200, Vladimir Sokolovsky wrote: 
> 

You recall that this was a change you suggested in response to the
previous failure:

http://www.mail-archive.com/ewg@lists.openfabrics.org/msg07773.html

which Jeff Becker is also reporting here:

http://www.mail-archive.com/ewg@lists.openfabrics.org/msg07854.html

Jeff includes more detail about the commit that actually broke this:

committer Jack Morgenstein  
Thu, 11 Jun 2009 13:17:33 +0000 (16:17 +0300)
commit 1f462241bd18d9b5727ddea90459e7763b69e11c
backports: 2.6.16_sles10_sp2: patches and add-ons based on kernel 2.6.18 backport

> Then this kernel requires another backport directory based on 
> 2.6.16_sles10_sp2 under
> kernel_patches/backport/ and kernel_addons/backport/ with corresponding 
> change in ofed_scripts/get_backport_dir.sh
> (E.g. 2.6.16_sles10_sp2_lustre).
  ^^
This breakage has got nothing to do with Lustre, per Jeff's report.  In
any case, creating an entire new and mostly duplicate backport for a
single change that is simply not portable seems like a sledgehammer of a
solution, no?

> Please prepare backports for this kernel and I'll apply it to the OFED's 1.5 
> kernel git tree.

I don't think I am going to have the time to prepare an entire backport
(nor do I believe it's the correct solution) for this problem.  It
seems it must be fixed before GA though, per Jeff's independent report
of the same failure, or anyone using this newer SLES10_SP2 kernel will
run into this problem.

Cheers,
b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-18 Thread Brian J. Murrell
On Thu, 2009-11-12 at 10:37 +0200, Vladimir Sokolovsky wrote: 
> 
> Please try:
> 
> diff --git a/ofed_scripts/get_backport_dir.sh 
> b/ofed_scripts/get_backport_dir.sh
> index ed0c091..0da5c17 100755
> --- a/ofed_scripts/get_backport_dir.sh
> +++ b/ofed_scripts/get_backport_dir.sh
> @@ -51,7 +51,7 @@ get_backport_dir()
>   echo 2.6.16_sles10_sp1
>   else
>   subminor=$(echo $KVERSION | cut -d "-" -f 2 | cut -d"." -f2)
> -if [ $subminor -lt 49 ]; then
> +if [ $subminor -lt 42 ]; then
>   echo 2.6.16_sles10_sp2
>   else
>   echo 2.6.16_sles10_sp3
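
For reference, the service-pack selection that the quoted patch adjusts
can be sketched as a standalone function.  This is a minimal sketch,
assuming only the branch shown in the diff: sles10_backport_dir is a
made-up name, and the earlier SP1 branch of the real get_backport_dir()
is omitted.

```shell
#!/bin/sh
# Sketch of the SP2/SP3 threshold from the quoted diff.
# sles10_backport_dir is a hypothetical helper, not the real get_backport_dir().
sles10_backport_dir() {
    kversion=$1
    # "2.6.16.60-0.42.4_lustre..." -> release field "0.42.4_lustre..." -> "42"
    subminor=$(echo "$kversion" | cut -d "-" -f 2 | cut -d "." -f 2)
    if [ "$subminor" -lt 42 ]; then
        echo 2.6.16_sles10_sp2
    else
        echo 2.6.16_sles10_sp3
    fi
}

sles10_backport_dir 2.6.16.60-0.42.4_lustre.1.8.1.54.20091118002918-bigsmp
sles10_backport_dir 2.6.16.60-0.21-smp
```

With the threshold at 42, the 0.42.4 kernel is classified as SP3 and the
older 0.21 kernel as SP2.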

The patch is working because it identifies the kernel as SP3, however
the result ends up looking bad for a different reason:

...
Created config.mk:
BACKPORT_INCLUDES=-I${CWD}/kernel_addons/backport/2.6.16_sles10_sp3/include/
Created configure.mk.kernel:
# Current working directory
CWD=/cache/build/BUILD/ofa_kernel-1.5

# Kernel level
KVERSION=2.6.16.60-0.42.4_lustre.1.8.1.54.20091118002918-bigsmp
ARCH=i686
MODULES_DIR=/lib/modules/2.6.16.60-0.42.4_lustre.1.8.1.54.20091118002918-bigsmp/updates
KSRC=/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4_lustre.1.8.1.54.20091118002918-obj/i386/bigsmp

AUTOCONF_H=/cache/build/BUILD/ofa_kernel-1.5/include/linux/autoconf.h

WITH_MAKE_PARAMS=

CONFIG_MEMTRACK=
CONFIG_DEBUG_INFO=y
CONFIG_INFINIBAND=m
CONFIG_INFINIBAND_IPOIB=m
CONFIG_INFINIBAND_IPOIB_CM=y
CONFIG_INFINIBAND_SDP=m
CONFIG_INFINIBAND_SRP=
CONFIG_INFINIBAND_SRPT=

CONFIG_INFINIBAND_USER_MAD=m
CONFIG_INFINIBAND_USER_ACCESS=m
CONFIG_INFINIBAND_ADDR_TRANS=y
CONFIG_INFINIBAND_USER_MEM=y
CONFIG_INFINIBAND_MTHCA=m

CONFIG_MLX4_CORE=m
CONFIG_MLX4_EN=m
CONFIG_MLX4_INFINIBAND=m
CONFIG_MLX4_DEBUG=y

CONFIG_INFINIBAND_IPOIB_DEBUG=y
CONFIG_INFINIBAND_ISER=
CONFIG_SCSI_ISCSI_ATTRS=
CONFIG_ISCSI_TCP=
CONFIG_INFINIBAND_EHCA=
CONFIG_INFINIBAND_EHCA_SCALING=
CONFIG_RDS=m
CONFIG_RDS_RDMA=m
CONFIG_RDS_TCP=m
CONFIG_RDS_DEBUG=
CONFIG_INFINIBAND_MADEYE=m
CONFIG_INFINIBAND_QLGC_VNIC=m
CONFIG_INFINIBAND_CXGB3=m
CONFIG_CHELSIO_T3=m
CONFIG_INFINIBAND_NES=m

CONFIG_SUNRPC_XPRT_RDMA=
CONFIG_SUNRPC=
CONFIG_SUNRPC_GSS=
CONFIG_RPCSEC_GSS_KRB5=
CONFIG_RPCSEC_GSS_SPKM3=

CONFIG_NFS_FS=
CONFIG_NFS_V3=
CONFIG_NFS_V3_ACL=
CONFIG_NFS_V4=
CONFIG_NFS_ACL_SUPPORT=
CONFIG_NFS_DIRECTIO=
CONFIG_EXPORTFS=
CONFIG_LOCKD=
CONFIG_LOCKD_V4=
CONFIG_NFSD=
CONFIG_NFSD_V2_ACL=
CONFIG_NFSD_V3=
CONFIG_NFSD_V3_ACL=
CONFIG_NFSD_V4=
CONFIG_NFSD_RDMA=

CONFIG_INFINIBAND_IPOIB_DEBUG_DATA=
CONFIG_INFINIBAND_SDP_SEND_ZCOPY=
CONFIG_INFINIBAND_SDP_RECV_ZCOPY=
CONFIG_INFINIBAND_SDP_DEBUG=y
CONFIG_INFINIBAND_SDP_DEBUG_DATA=
CONFIG_INFINIBAND_IPATH=
CONFIG_INFINIBAND_QIB=
CONFIG_INFINIBAND_MTHCA_DEBUG=y
CONFIG_INFINIBAND_QLGC_VNIC_STATS=
CONFIG_INFINIBAND_CXGB3_DEBUG=
CONFIG_INFINIBAND_NES_DEBUG=
CONFIG_INFINIBAND_AMSO1100=

Created /cache/build/BUILD/ofa_kernel-1.5/include/linux/autoconf.h:
#ifndef __OFED_BUILD__
#include_next 
#else
#undef CONFIG_MEMTRACK
#undef CONFIG_DEBUG_INFO
#undef CONFIG_INFINIBAND
#undef CONFIG_INFINIBAND_IPOIB
#undef CONFIG_INFINIBAND_IPOIB_CM
#undef CONFIG_INFINIBAND_SDP
#undef CONFIG_INFINIBAND_SRP
#undef CONFIG_INFINIBAND_SRPT

#undef CONFIG_INFINIBAND_USER_MAD
#undef CONFIG_INFINIBAND_USER_ACCESS
#undef CONFIG_INFINIBAND_ADDR_TRANS
#undef CONFIG_INFINIBAND_USER_MEM
#undef CONFIG_INFINIBAND_MTHCA

#undef CONFIG_MLX4_CORE
#undef CONFIG_MLX4_DEBUG
#undef CONFIG_MLX4_EN
#undef CONFIG_MLX4_INFINIBAND

#undef CONFIG_INFINIBAND_IPOIB_DEBUG
#undef CONFIG_INFINIBAND_ISER
#undef CONFIG_INFINIBAND_EHCA
#undef CONFIG_INFINIBAND_EHCA_SCALING
#undef CONFIG_RDS
#undef CONFIG_RDS_RDMA
#undef CONFIG_RDS_TCP
#undef CONFIG_RDS_DEBUG
#undef CONFIG_INFINIBAND_MADEYE
#undef CONFIG_INFINIBAND_QLGC_VNIC
#undef CONFIG_INFINIBAND_QLGC_VNIC_STATS
#undef CONFIG_INFINIBAND_CXGB3
#undef CONFIG_INFINIBAND_CXGB3_DEBUG
#undef CONFIG_CHELSIO_T3
#undef CONFIG_INFINIBAND_NES
#undef CONFIG_INFINIBAND_NES_DEBUG

#undef CONFIG_SUNRPC_XPRT_RDMA
#undef CONFIG_SUNRPC
#undef CONFIG_SUNRPC_GSS
#undef CONFIG_RPCSEC_GSS_KRB5
#undef CONFIG_RPCSEC_GSS_SPKM3
#undef CONFIG_NFS_FS
#undef CONFIG_NFS_V3
#undef CONFIG_NFS_V3_ACL
#undef CONFIG_NFS_V4
#undef CONFIG_NFS_ACL_SUPPORT
#undef CONFIG_NFS_DIRECTIO
#undef CONFIG_EXPORTFS
#undef CONFIG_LOCKD
#undef CONFIG_LOCKD_V4
#undef CONFIG_NFSD
#undef CONFIG_NFSD_V2_ACL
#undef CONFIG_NFSD_V3
#undef CONFIG_NFSD_V3_ACL
#undef CONFIG_NFSD_V4
#undef CONFIG_NFSD_RDMA

#undef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA
#undef CONFIG_INFINIBAND_SDP_SEND_ZCOPY
#undef CONFIG_INFINIBAND_SDP_RECV_ZCOPY
#undef CONFIG_INFINIBAND_SDP_DEBUG
#undef CONFIG_INFINIBAND_SDP_DEBUG_DATA
#undef CONFIG_INFINIBAND_IPATH
#undef CONFIG_INFINIBAND_QIB
#undef CONFIG_INFINIBAND_MTHCA_DEBUG
#undef CONFIG_INFINIBAND_AMSO1100
#endif

#undef CONFIG_INFINIBAND
#define CONFIG_INFINIBAND 1
#undef CONFIG_INFINIBAND_IPOIB
#define CONFIG_INFINIBAND_IPOIB 1

Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-11 Thread Brian J. Murrell
On Wed, 2009-11-04 at 11:52 -0500, Brian J. Murrell wrote: 
> On Wed, 2009-11-04 at 17:52 +0200, Vladimir Sokolovsky wrote: 
> > 
> > Hi Brian,
> 
> Hi Vlad,

Any further comment on this situation?

> > Is it SLES10 SP2 or SP3?
> 
> According to this webpage:
> 
> http://support.novell.com/security/cve/CVE-2009-1758.html it looks like
> it's SP2.
> 
> > As I see from the log that sp2 backports are applied.
> 
> Indeed.  I also see the difference between 
> 
> kernel_addons/backport/2.6.16_sles10_sp{2,3}/include/net/ipv6.h
> 
> That is driving this query.
> 
> > The reason for other failures is that Ipath driver is not included in 
> > the daily build.
> > Here is the reason:
> > 
> > 
> > commit 7bc2fad6c49a1fb58a3f59e71258095f5c5b1be0
> > Author: Ralph Campbell 
> > Date:   Thu Jul 2 09:49:12 2009 -0700
> > 
> >  Take ipath out of the build
> > 
> >  commit 076a0993420716eed9fbbd6315793ee00c0c2a09
> >  Author: Ralph Campbell 
> >  Date:   Thu Jul 2 09:46:11 2009 -0700
> > 
> >  OFED: Take ipath out of the build
> > 
> >  The ipath driver is being replaced by the qib driver in this 
> > release.
> >  This patch removes ipath from the build, a subsequent patch 
> > will add
> >  qib.
> > 
> >  Signed-off-by: Ralph Campbell 
> 
> OK.  I will remove it from my build as well.
> 
> Thanx for the pointer.
> 
> b.
> 
> ___
> ewg mailing list
> ewg@lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-04 Thread Brian J. Murrell
On Wed, 2009-11-04 at 17:52 +0200, Vladimir Sokolovsky wrote: 
> 
> Hi Brian,

Hi Vlad,

> Is it SLES10 SP2 or SP3?

According to this webpage:

http://support.novell.com/security/cve/CVE-2009-1758.html it looks like
it's SP2.

> As I see from the log that sp2 backports are applied.

Indeed.  I also see the difference between 

kernel_addons/backport/2.6.16_sles10_sp{2,3}/include/net/ipv6.h

That is driving this query.

> The reason for other failures is that Ipath driver is not included in 
> the daily build.
> Here is the reason:
> 
> 
> commit 7bc2fad6c49a1fb58a3f59e71258095f5c5b1be0
> Author: Ralph Campbell 
> Date:   Thu Jul 2 09:49:12 2009 -0700
> 
>  Take ipath out of the build
> 
>  commit 076a0993420716eed9fbbd6315793ee00c0c2a09
>  Author: Ralph Campbell 
>  Date:   Thu Jul 2 09:46:11 2009 -0700
> 
>  OFED: Take ipath out of the build
> 
>  The ipath driver is being replaced by the qib driver in this 
> release.
>  This patch removes ipath from the build, a subsequent patch 
> will add
>  qib.
> 
>  Signed-off-by: Ralph Campbell 

OK.  I will remove it from my build as well.

Thanx for the pointer.

b.




Re: [ewg] ofa_1_5_kernel 20091104-0200 daily build status

2009-11-04 Thread Brian J. Murrell
On Wed, 2009-11-04 at 03:26 -0800, Vladimir Sokolovsky (Mellanox)
wrote: 
> Passed on x86_64 with linux-2.6.16.60-0.54.5-smp

2.6.16.60-0.42.4 on both x86_64 and i686 yield:

gcc -m32 
-Wp,-MD,/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core/.addr.o.d  
-nostdinc -isystem /usr/lib/gcc/i586-suse-linux/4.1.2/include -D__KERNEL__ 
-D__OFED_BUILD__  -include include/linux/autoconf.h  -include 
/cache/build/BUILD/ofa_kernel-1.5/include/linux/autoconf.h  
-I/cache/build/BUILD/ofa_kernel-1.5/kernel_addons/backport/2.6.16_sles10_sp2/include/
-I/cache/build/BUILD/ofa_kernel-1.5/include  
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/debug  
-I/usr/local/include/scst  
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/srpt  
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/net/cxgb3  -Iinclude  -Iinclude2 
-I/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4/include  
-I/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4/arch//include   
-Iinclude/asm-i386/mach-generic -Iinclude/asm-i386/mach-default 
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core  -Wall -Wundef 
-Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration 
-fno-strict-aliasing -fno-common -ffreestanding -fno-delete-null-pointer-checks 
-fwrapv -Os -pipe -msoft-float -mpreferred-stack-boundary=2 -march=i586 
-mtune=generic -mregparm=3 -Iinclude/asm-i386/mach-generic 
-I/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4/include/asm-i386/mach-generic
 -Iinclude/asm-i386/mach-default 
-I/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4/include/asm-i386/mach-default
 -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement 
-Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(addr)"  -D"KBUILD_MODNAME=KBUILD_STR(ib_addr)" -c 
-o /cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core/.tmp_addr.o 
/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core/addr.c
In file included from 
/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4/include/net/addrconf.h:51,
 from 
/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core/addr.c:43:
/cache/build/BUILD/ofa_kernel-1.5/kernel_addons/backport/2.6.16_sles10_sp2/include/net/ipv6.h:22:
 error: redefinition of ‘ipv6_addr_loopback’
/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4/include/net/ipv6.h:361: 
error: previous definition of ‘ipv6_addr_loopback’ was here
make[6]: *** [/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core/addr.o] 
Error 1
make[5]: *** [/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/core] Error 2
make[4]: *** [/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband] Error 2
make[3]: *** [_module_/cache/build/BUILD/ofa_kernel-1.5] Error 2
make[2]: *** [modules] Error 2
make[1]: *** [modules] Error 2
make[1]: Leaving directory 
`/cache/build/reused/usr/src/linux-2.6.16.60-0.42.4-obj/i386/bigsmp'
make: *** [kernel] Error 2
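
A redefinition like the ipv6_addr_loopback error above could in principle
be caught at configure time by probing whether the kernel's own header
already provides the function.  A rough sketch follows, under some loud
assumptions: header_defines is a hypothetical helper, the temporary file
stands in for the kernel's include/net/ipv6.h, and a plain grep is only a
heuristic (it cannot tell a definition from a call).

```shell
#!/bin/sh
# Hypothetical configure-time probe: does a header mention SYMBOL followed by
# an opening parenthesis?  Returns 0 if so, non-zero otherwise.
header_defines() {
    symbol=$1
    header=$2
    [ -f "$header" ] && grep -Eq "(^|[^A-Za-z0-9_])${symbol}[[:space:]]*\(" "$header"
}

# Stand-in for the kernel's include/net/ipv6.h so the sketch runs anywhere.
tmpdir=$(mktemp -d)
cat > "$tmpdir/ipv6.h" <<'EOF'
static inline int ipv6_addr_loopback(const struct in6_addr *a)
{
	return 0; /* body elided in this stand-in */
}
EOF

if header_defines ipv6_addr_loopback "$tmpdir/ipv6.h"; then
    echo "kernel provides ipv6_addr_loopback: skip the backport copy"
else
    echo "kernel lacks ipv6_addr_loopback: keep the backport copy"
fi
rm -rf "$tmpdir"
```

The result of such a probe could then decide whether the backport header
emits its own copy, rather than tying that decision to the service-pack
directory name.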

> Passed on x86_64 with linux-2.6.16.60-0.21-smp

> Passed on x86_64 with linux-2.6.18-128.el5

When I try to build with 2.6.18-128.7.1.el5 on x86_64 I get:

gcc 
-Wp,-MD,/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/hw/ipath/.ipath_driver.o.d
  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include 
-D__KERNEL__ \
-D__OFED_BUILD__ \
-include include/linux/autoconf.h \
-include /cache/build/BUILD/ofa_kernel-1.5/include/linux/autoconf.h \
-I/cache/build/BUILD/ofa_kernel-1.5/kernel_addons/backport/2.6.18-EL5.3/include/
 \
 \
 \
-I/cache/build/BUILD/ofa_kernel-1.5/include \
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/debug \
-I/usr/local/include/scst \
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/srpt \
-I/cache/build/BUILD/ofa_kernel-1.5/drivers/net/cxgb3 \
-Iinclude \
 \
-I/cache/build/reused/usr/src/kernels/2.6.18-128.7.1.el5-x86_64/arch//include \
  -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration 
-fwrapv -Os  -mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe 
-fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables 
-funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g 
 -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign 
-DIPATH_IDSTR='"QLogic kernel.org driver"' -DIPATH_KERN_TYPE=0  -DMODULE 
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipath_driver)"  
-D"KBUILD_MODNAME=KBUILD_STR(ib_ipath)" -c -o 
/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/hw/ipath/.tmp_ipath_driver.o
 /cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/hw/ipath/ipath_driver.c
/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/hw/ipath/ipath_driver.c:155:
 error: unknown field 'groups' specified in initializer
/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/hw/ipath/ipath_driver.c:155:
 warning: initialization from incompatible pointer type
/cache/build/BUILD/ofa_kernel-1.5/drivers/infiniband/hw/ipath/ipath_driver.c: 
In function 'ipath_reset_device':

Re: [ewg] ofa_1_5_kernel 20091007-0200 daily build status

2009-10-16 Thread Brian J. Murrell
On Wed, 2009-10-14 at 17:12 +0200, Or Gerlitz wrote:
> 
> Hi Vlad, Brian

Hi Or,

> We're checking this. Basically, I'd like to see people using their 
> distro iSCSI stack.

OK.  I'll bite and try to go forward with this for our Lustre release
that will include OFED 1.5 and see how it works out.

As I asked in bug 1764, can you confirm which ofa_kernel RPM build
options I should stop using in order to stop using the OFED supplied
iSCSI and start using the kernel supplied stack?

Obviously "--with-iser-mod", but any others?  Currently I build
ofa_kernel with:

--with-user_mad-mod
--with-user_access-mod
--with-addr_trans-mod
--with-srp-target-mod
--with-core-mod
--with-mthca-mod
--with-mlx4-mod
--with-mlx4_en-mod
--with-cxgb3-mod
--with-nes-mod
--with-sdp-mod
--with-srp-mod
--with-rds-mod
--with-iser-mod
--with-qlgc_vnic-mod
--with-madeye-mod

Thanx,
b.




Re: [ewg] Re: [Fwd: FW: OFED installation question re-AltLinux]

2009-10-15 Thread Brian J. Murrell
On Thu, 2009-10-15 at 09:54 -0700, Jeff Becker wrote:
> 
> This makes sense. Although it seems like we should have a more
> descriptive statement about ewg on the mail lists web page, so folks
> know this is where to ask questions about OFED usage. Any takers? Thanks.

I must agree with this.  It is not at all obvious what "ewg" is for.

How about "ofed-users"?  If we were still hosting the general list I'd
probably suggest ofed-devel for it.

b.




Re: [ewg] ofa_1_5_kernel 20091007-0200 daily build status

2009-10-14 Thread Brian J. Murrell
On Wed, 2009-10-14 at 17:15 +0200, Or Gerlitz wrote:
> 
> Brian,

Hi Or,

> You (Sun) may basically be on the other side of the same problem: 
> suppose you have Lustre built for distro X and then someone replaces the 
> IB stack with that of OFED Y. Lustre uses the IB stack (verbs/rdma-cm) 
> directly and hence should have the same symbol version issue. How do you 
> solve that? Rebuild Lustre? Load it with modprobe -f?

We provide a complete package of kernel + kernel-ib + lustre-modules.
If somebody wants an OFED implementation other than the kernel-ib that
we provide, they are obligated to rebuild lustre against whatever OFED
stack they've decided to use.

b.




Re: [ewg] ofa_1_5_kernel 20091007-0200 daily build status

2009-10-08 Thread Brian J. Murrell
On Thu, 2009-10-08 at 09:51 +0200, Vladimir Sokolovsky wrote:
> 
> Hi Brian,

Hi Vlad,

> ISER in OFED-1.5 is supported on kernel 2.6.30 only (there are no backport 
> patches for kernels < 2.6.30).

Ahhh.  This is a new requirement then with OFED 1.5?  Previous OFED
releases (i.e. 1.4) supported ISER on most (all?) supported (i.e.
RHEL/SLES vendor) kernels, didn't they?

> Daily build script and OFED's install 
> script prevent ISER compilation on kernels < 2.6.30.

I see.

> As I understand, you passed '--with-iser-mod' parameter manually to the 
> rpmbuild command. Therefore you got this failure.

Right.

> So, if you want to use ISER on kernel < 2.6.30 you can't use OFED-1.5 or 
> you need to prepare the backport patches.

Is this something that's expected to be remedied before the GA release,
or has a decision been made to officially not support ISER for $kernel <
2.6.30 with OFED 1.5?
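
If ISER really is limited to kernels >= 2.6.30, a build wrapper could add
--with-iser-mod only when the target kernel qualifies.  A minimal sketch,
assuming plain numeric x.y.z versions: kernel_at_least and
configure_options_for are hypothetical helpers, and the option list is
shortened from the one used in this thread.

```shell
#!/bin/sh
# kernel_at_least KVERSION MAJOR MINOR PATCH   (hypothetical helper)
# Succeeds when KVERSION's x.y.z part is >= MAJOR.MINOR.PATCH.
kernel_at_least() {
    base=${1%%-*}                 # drop the distro suffix, e.g. "-128.el5"
    major=$(echo "$base" | cut -d . -f 1)
    minor=$(echo "$base" | cut -d . -f 2)
    patch=$(echo "$base" | cut -d . -f 3)
    patch=${patch:-0}             # treat a missing third field as 0
    [ "$major" -gt "$2" ] && return 0
    [ "$major" -lt "$2" ] && return 1
    [ "$minor" -gt "$3" ] && return 0
    [ "$minor" -lt "$3" ] && return 1
    [ "$patch" -ge "$4" ]
}

# Build the configure option string, adding ISER only where it can compile.
configure_options_for() {
    opts="--with-core-mod --with-ipoib-mod --with-mlx4-mod"
    if kernel_at_least "$1" 2 6 30; then
        opts="$opts --with-iser-mod"
    fi
    echo "$opts"
}

configure_options_for 2.6.18-128.el5
configure_options_for 2.6.30
```

Only the second call includes --with-iser-mod; the RHEL5 kernel falls
below the 2.6.30 cutoff.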

b.




Re: [ewg] ofa_1_5_kernel 20091007-0200 daily build status

2009-10-07 Thread Brian J. Murrell
On Wed, 2009-10-07 at 03:10 -0700, Vladimir Sokolovsky (Mellanox) wrote:
> This email was generated automatically, please do not reply
> 
> 
> git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
> git_branch: ofed_kernel_1_5
> 
> Common build parameters: 
> 
> Passed:
> Passed on i686 with linux-2.6.18
> Passed on i686 with linux-2.6.19
> Passed on i686 with linux-2.6.21.1
> Passed on i686 with linux-2.6.26
> Passed on i686 with linux-2.6.24
> Passed on i686 with linux-2.6.22
> Passed on i686 with linux-2.6.27
> Passed on x86_64 with linux-2.6.16.60-0.21-smp
> Passed on x86_64 with linux-2.6.18
> Passed on x86_64 with linux-2.6.18-164.el5
> Passed on x86_64 with linux-2.6.18-128.el5

Any idea how this is building successfully in the daily builds when it
fails for me (bug 1764) in beta1 and yesterday's daily snapshot with:

/home/brian/rpm/BUILD/ofa_kernel-1.5/drivers/infiniband/ulp/iser/iscsi_iser.c:601:
 error: unknown field ‘eh_target_reset_handler’ specified in initializer

FWIW, I am building ofa_kernel with:

rpmbuild --rebuild --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1'
 --define '_topdir /home/brian/rpm' --target i686
 --define 'KVERSION 2.6.18-128.el5'
 --define 'K_SRC /usr/src/linux-2.6.18-128'
 --define 'LIB_MOD_DIR /lib/modules/2.6.18-prep/updates'
 --define '_release ofed1.5.1.5.beta1'
 --define 'configure_options --without-quilt --with-core-mod
 --with-user_mad-mod --with-user_access-mod
 --with-addr_trans-mod --with-srp-target-mod
 --with-core-mod --with-mthca-mod
 --with-mlx4-mod --with-mlx4_en-mod
 --with-cxgb3-mod --with-nes-mod
 --with-ipoib-mod --with-sdp-mod
 --with-srp-mod --with-rds-mod
 --with-iser-mod --with-qlgc_vnic-mod
 --with-madeye-mod '
 ../SRPMS//ofa_kernel-1.5-ofed1.5.beta1.src.rpm

Cheers,
b.




Re: [ewg] Re: [ofa-general] help install ofed 1.4 on Centos 5.2

2009-09-28 Thread Brian J. Murrell
On Mon, 2009-09-28 at 10:45 -0500, Jon Mason wrote:
> 
> The issue is having customers breaking when installing OFED due to them using 
> a kernel that was not in existence when that version of OFED shipped (and 
> thus should not be supported).  This should be prevented from occurring (and 
> should be fairly easy to do).

Agreed.  It should be obvious, very early that the desired kernel is not
(yet) supported.

> There will always be a window between the latest kernel coming out (of 
> whatever flavor) and OFED's support of it.

Indeed.  This is even true of Lustre.  However that window should not be
open until the next "scheduled" release given that the release cycle is
6 months or more.  We cut new releases that contain just the new kernel
support when new kernels warrant it.  I guess that's all I'm saying.

> Unless OFED is changed to a release model where its releases coincide with 
> distro releases.

That doesn't work either.  There will always be unscheduled errata
releases made to deal with urgent issues.

I guess what I am promoting here is "bugfix" OFED releases that include
newer kernel backport support on an as-needed basis.

b.




[ewg] Re: [ofa-general] help install ofed 1.4 on Centos 5.2

2009-09-28 Thread Brian J. Murrell
On Mon, 2009-09-28 at 10:07 -0500, Jon Mason wrote:
> 
> We should add an upper limit for the kernels supported in the install
> script.  So that when new kernels come out, we could very cleanly say
> that it is not supported.

If I'm understanding you, you are addressing the issue of OFED not
building with a newer kernel than it was originally written to support.
I suppose to short-circuit the eventual build failure?

If so, that doesn't deal at all with the real-world issue of leaving
people between the rock and hard place of having to either run
vulnerable kernels or alpha/beta quality OFED software.
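
The early short-circuit being discussed might look something like the
following.  This is only a sketch: kernel_supported and check_kernel are
hypothetical names, and the patterns are examples taken from kernels
mentioned on this list, not OFED's real support matrix.

```shell
#!/bin/sh
# Hypothetical early check for an install script: refuse kernels that are not
# on a known-supported list instead of failing later in the build.
kernel_supported() {
    case $1 in
        2.6.18-128.el5|2.6.18-164.el5) return 0 ;;
        2.6.16.60-0.21-*)              return 0 ;;
        *)                             return 1 ;;
    esac
}

# Report the verdict; non-zero status lets the caller stop before building.
check_kernel() {
    if kernel_supported "$1"; then
        echo "kernel $1: supported"
    else
        echo "kernel $1: not (yet) supported by this release" >&2
        return 1
    fi
}

check_kernel 2.6.18-164.el5
check_kernel 2.6.31 || true   # unsupported: message goes to stderr
```

A check like this only makes the failure earlier and clearer; it does not
by itself add support for the newer kernel.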

b.




Re: [ewg] i need to avoid the NFSRDMA headers in kernel-ib-devel

2009-07-08 Thread Brian J. Murrell
On Wed, 2009-07-08 at 15:22 -0500, Jon Mason wrote:
> 
> I can look into this more and see if it does what you want.

Great.

> Am I correct in assuming that you want this for 1.4.2 as well?

Ideally.  :-)  Definitely fixing in trunk is desired though.

b.




Re: [ewg] i need to avoid the NFSRDMA headers in kernel-ib-devel

2009-07-08 Thread Brian J. Murrell
On Wed, 2009-07-08 at 14:42 -0500, Jon Mason wrote:
> 
> Just tell them that NFSRDMA is NOT optional ;-)

Oh, I'm sure not many would have a problem with that.  It's our own risk
management that prevents us from doing that.

> Creating a separate package can be quite hairy and will eat a large
> chunk of time.

Fair enough.

> I think it would be much easier to have some compile
> time checks, for example #ifdef CONFIG_NFS_FS around all the NFS
> specific functions/headers.  Does that sound plausible/reasonable?

This sounds like one of the options I suggested in the bugzilla bug, but
CONFIG_NFS_FS is the generic kernel definition to enable NFS isn't it?
That would mean that if the vendor's default .config enabled NFS, it
would still not guard the NFSRDMA headers, if I understand you
correctly.

I think you need to invent your own macro (maybe CONFIG_OFA_NFS_FS)
which you define when the user chooses the OFED NFSRDMA and that
definition guards all of the declarations (heck, in most cases, probably
just entire include files, with #include_next <> as the #else case of
the guard) in the OFED NFSRDMA headers.
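
To make the idea concrete, here is a rough sketch of a script that wraps
an existing header body in such a guard.  CONFIG_OFA_NFS_FS is just the
name floated above, wrap_header is a hypothetical helper, and
#include_next is a GCC extension.

```shell
#!/bin/sh
# wrap_header NAME   (hypothetical helper)
# Reads an OFED header body on stdin and emits it guarded by
# CONFIG_OFA_NFS_FS, falling through to the kernel's own NAME otherwise.
wrap_header() {
    name=$1
    echo "#ifdef CONFIG_OFA_NFS_FS"
    cat
    echo "#else"
    echo "#include_next <$name>"
    echo "#endif"
}

# Illustrative one-line body; a real header would be piped in from a file.
printf 'struct ofed_example;\n' | wrap_header linux/nfs_fs.h
```

With the macro undefined, a driver built against kernel-ib-devel would
fall through to the kernel's own header of the same name.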

b.




Re: [ewg] i need to avoid the NFSRDMA headers in kernel-ib-devel

2009-07-08 Thread Brian J. Murrell
On Wed, 2009-07-08 at 14:16 -0500, Jon Mason wrote:
> 
> The OFED NFS implementation

*optionally*

> replaces the in-kernel version.

If one desires that.

> So it will
> be the version being used.

Per above, only if you selected to build/install it.

> Why would you not want this to be the
> case?

Because we are trying to minimize our responsibility for kernel
deviation (and hence bugs) from the vendor's distributed kernel.

OFED is an integral component of our software, so we are willing to take
on the responsibility for that deviation.  NFS[RDMA] is not integral to
our software, so we'd just as soon let the customer use what they would
be using with the vendor's stock kernel if they want to use NFS.

b.




Re: [ewg] i need to avoid the NFSRDMA headers in kernel-ib-devel

2009-07-06 Thread Brian J. Murrell
On Tue, 2009-06-30 at 15:36 -0400, Brian J. Murrell wrote:
> I think I have asked about this before, or at least tangentially in bug
> 1523.

Does nobody have any opinion or ideas on this?  Maybe I should just go
open a bug about it.

OK.  I have.  Bug 1671.

Cheers,
b.




[ewg] i need to avoid the NFSRDMA headers in kernel-ib-devel

2009-06-30 Thread Brian J. Murrell
I think I have asked about this before, or at least tangentially in bug
1523.

The problem is, when I am building software against the OFED
kernel-ib-devel package, because the NFSRDMA headers are in the same
include tree as the OFED headers, I cannot get to the OFED headers and
avoid the NFSRDMA headers.

When is this a problem?  When I have built my OFED stack *without*
NFSRDMA support.

So my expectation is that the driver that I am building with the OFED
kernel-ib-devel will use the NFS and RPC implementation in the linux
kernel, not in /usr/src/ofa-kernel, however because the NFSRDMA
implementation is populated in the same include tree as the core OFED
implementation, I cannot avoid it.

Thots?
b.




Re: [ewg] Please send me subjects for EWG meeting on Monday

2009-06-25 Thread Brian J. Murrell
On Thu, 2009-06-25 at 18:32 +0300, Tziporet Koren wrote:
> 
> 2. OFED 1.4.2 - there is a request from Sun to have bug fixes only 
> release since there is a critical bug

Two now, including the backports conflict I filed a bug for and posted
about here.

> that prevents Lustre from running over 
> OFED 1.4 and 1.4.1

Cheers,
b.




[ewg] RHEL4 U[4-7] backports conflict with vendor headers

2009-06-23 Thread Brian J. Murrell
Hi All,

I have just filed bug 1655 about another backports header conflict with
the vendor kernel headers.

assert_spin_locked is defined in include/linux/jbd.h in the vendor's
sources and is also defined in
ofa_kernel/kernel_addons/backport/2.6.9_U{4,5,6,7}/include/linux/spinlock.h in 
OFED 1.4.1.

This of course yields a:

include/linux/jbd.h:1204:1: "assert_spin_locked" redefined

for any source file that includes both linux/spinlock.h and linux/jbd.h.

Thots?

b.




Re: [ewg] Re: ofed-1.4.1 rc5 and sles10sp2

2009-05-11 Thread Brian J. Murrell
On Mon, 2009-05-11 at 15:52 -0500, Steve Wise wrote:
> I see the same behavior with RC5 on RH5.2/i386.
> 
> I did some reverting, and it appears this commit is the culprit:
> 
> --
> commit 9e1a4dd600347472829859849168fd3a6f7c597c
> Author: Jeff Becker 
> Date:   Sun May 10 20:21:12 2009 -0700
> 
> NFSRDMA: remove dependence of backport ib_core on nfs: fixes Bug 1596
> Signed-off-by: Jeff Becker 
> 

So is this just a regression when trying to use NFSRDMA?  If I don't
build NFSRDMA, will RC5 still be a problem for me or will the core I/B
stack be unaffected?

b.




Re: [ewg] RHEL5 backports and crypto functions

2009-05-05 Thread Brian J. Murrell
On Mon, 2009-05-04 at 15:19 -0500, Jon Mason wrote:
> 
> Yes, those can be quite tricky.  Let me know if there is anything I can
> do to help.

I think I got it all figured out.

> I'll wait for your ok before sending the patch out for inclusion.  I
> would like for it to be reviewed and tested heavily.

I have yet to run the result through our test framework, but your
proposed patch certainly seems to fix the compilation errors I was
getting.

b.




Re: [ewg] RHEL5 backports and crypto functions

2009-05-04 Thread Brian J. Murrell
On Mon, 2009-05-04 at 14:46 -0500, Jon Mason wrote: 
> On Fri, May 01, 2009 at 02:15:39PM -0500, Jon Mason wrote:
> > Hey Brian,
> > I've attached a patch for RHEL5.3 which should enable you to compile on
> > Solaris.  I've done a quick sniff test on my systems, and I do not see
> > any issues.
> > 
> > Please confirm that it works for you, and I'll make the necessary
> > changes for RHEL5.1 and RHEL 5.2.
> 
> Did this do the trick for you?

Sorry for not getting back to you Jon.  Unfortunately, I've had some
other work pre-empt this for today.  I hope to get cracking on it first
thing tomorrow though.

I do know that where I left off, I found that I was running into
crypto vs. ncrypto funniness.  It's not just the header that has a
different name.  Some of the data objects in the header are subject to
s/crypto/ncrypto/ substitutions as well, and I am still trying to figure
out how to deal with this in a portable manner.

b.





Re: [ewg] RHEL5 backports and crypto functions

2009-05-01 Thread Brian J. Murrell
On Fri, 2009-05-01 at 11:13 -0500, Jon Mason wrote:
> 
> It's the same definition as the distro kernel has.

As in like, RHEL5.3?  The definition I see there is:

struct hash_desc {
struct crypto_hash *tfm;
u32 flags;
};

in ncrypto.h.

> So the struct has
> changed since 2.6.18.

RH's 2.6.18 or vanilla?  I don't even see the struct in vanilla in
anything earlier than 2.6.19.

> However, it should not be necessary to ship these
> files, and they are simply redefining structs and funcs found in
> ncrypto.h (in the distro kernel).

Yeah.  I saw that.  What is the difference between crypto.h and
ncrypto.h in RH's kernel?

> I am in the process of trying to rip those out and use the existing
> definitions in ncrypto.h, but it is a bit of a tangled mess.

Hopefully this isn't too big a can of worms.  How's that process going?

> I assume
> that not having it defined in the backport header files would work
> around the issue you are experiencing, correct?

Hrm.  Yeah, I think so, because then we'd use our own "backport"
definitions.  I can't really be sure.  I guess I could try commenting it
out and building and see how far it gets.

b.




[ewg] RHEL5 backports and crypto functions

2009-05-01 Thread Brian J. Murrell
Hi,

I've just filed bug 1620 against 1.4.1.  It was suggested I also post a
message here in case there is any discussion needed.

I was looking at another aspect of the RHEL5 backports and found another
area that looks like a problem.  I am hoping we can act quickly to try
to resolve this before 1.4.1 GA as this is another problem that will
prevent us from being able to build our driver with OFED 1.4.1.

What seems to be at issue is the backport definition of struct hash_desc
in

ofa_kernel/kernel_addons/backport/2.6.18-EL5.3/include/linux/crypto.h 

which is:

struct hash_desc
{
struct crypto_tfm *tfm;
u32 flags;
};

I'm far from even slightly knowledgeable on the Linux kernel's crypto
facilities but I'm just trying to see how this meets up (or fails to)
with our code's use of struct hash_desc.

For reference I went to the vanilla kernel's implementation of this
struct and it seems to have been introduced in 2.6.19 as:

struct hash_desc {
struct crypto_hash *tfm;
u32 flags;
};

and has not changed all the way up to and including 2.6.29.

The problem is that we use struct hash_desc in our code and we expect it
to have the same definition as the kernel, specifically, we expect tfm
to be of type "crypto_hash *".

I'm wondering why the OFED backports have a different definition than
the kernel has?
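
To make the mismatch concrete, here is a minimal user-space sketch. The
struct names backport_hash_desc and kernel_hash_desc, the TFM_KIND macro
and the two helper functions are made up for illustration, and u32 is
replaced by unsigned int so the file builds outside the kernel; only the
member layouts come from the two definitions quoted above. It assumes a
C11 compiler for _Generic.

```c
/* Opaque stand-ins for the kernel types; only their names matter here. */
struct crypto_tfm;
struct crypto_hash;

/* Member layout as quoted from the OFED 2.6.18-EL5.3 backport header. */
struct backport_hash_desc {
    struct crypto_tfm *tfm;
    unsigned int flags;
};

/* Member layout as quoted from the vanilla kernel (2.6.19 onwards). */
struct kernel_hash_desc {
    struct crypto_hash *tfm;
    unsigned int flags;
};

/* Report, at compile time, which pointer type a descriptor's tfm has. */
#define TFM_KIND(desc) _Generic((desc).tfm, \
    struct crypto_hash *: "crypto_hash",    \
    struct crypto_tfm *: "crypto_tfm")

const char *backport_tfm_kind(void)
{
    struct backport_hash_desc d = { 0, 0 };
    return TFM_KIND(d);
}

const char *kernel_tfm_kind(void)
{
    struct kernel_hash_desc d = { 0, 0 };
    return TFM_KIND(d);
}
```

Code that stores a "struct crypto_hash *" in tfm matches the second
layout but not the first, which is where the incompatible pointer type
diagnostics would come from.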

As we are getting close to a release, I am hoping we can resolve this
quickly.

Thanx,
b.




Re: [ewg] website really slow

2009-04-15 Thread Brian J. Murrell
FWIW, this did finally complete and download rate was good but it sure
took a while to get it started.

b.




[ewg] website really slow

2009-04-15 Thread Brian J. Murrell
Hi,

I'm trying to fetch
http://www.openfabrics.org/downloads/OFED/ofed-1.4.1/OFED-1.4.1-rc3.tgz
and while the connection is made very quickly, trying to get the file is
yielding nothing:

 1. $ telnet www.openfabrics.org 80
 2. Trying 69.55.231.195...
 3. Connected to www.openfabrics.org (69.55.231.195).
 4. Escape character is '^]'.
 5. GET /downloads/OFED/ofed-1.4.1/OFED-1.4.1-rc3.tgz HTTP/1.0
 6. 

Lines 1-4 complete very quickly.  I am left at line 6 for a long time
after sending the request on line 5.

Can somebody give things a boot over there?

Thanx,
b.




[ewg] Re: [ofa-general] ofed autoconf.h

2009-04-08 Thread Brian J. Murrell
On Wed, 2009-04-08 at 12:39 -0500, Steve Wise wrote:
> 
> Ok we'll do this for 1538 and push it into ofed-1.4.1.

Great.  I will test as soon as it's pushed in to make sure it works.

> > Yeah.  I was considering that as well WRT bug 1578 and not wholesale
> > "#undef"ing all macros leading to a mixture of kernel provided and OFED
> > provided RDMA options.
> >
> > I wonder if this is something that is appropriate to do at (OFED)
> > configure time, and simply bail if a mismatch is found with a "you can't
> > do that.  either change your ofed selections or disable FOO in your
> > kernel configuration" type error message.
> >   
> 
> Gimme an example of what you mean?

I don't know enough about the OFED stack to give you a specific example,
but, if you know of an API change that is happening in a given release,
you write an autoconf macro to test which API is available and if the
wrong one is, bail out of configure with an (informative) error.
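
A rough sketch of what such a probe could compile, assuming a C11
compiler. Everything named here (example_dev, example_post_new,
example_post_old, expected_post_fn, HAS_EXPECTED_API) is hypothetical
and not a real OFED or kernel interface; the point is only that a tiny
test program can tell which signature the installed headers declare.

```c
/* Hypothetical API as the installed headers might declare it.  Neither
 * function is real; only their signatures matter to the probe. */
struct example_dev;
int example_post_new(struct example_dev *dev, int flags);
int example_post_old(struct example_dev *dev);

/* The signature the external module was written against. */
typedef int (*expected_post_fn)(struct example_dev *, int);

/* 1 if fn has exactly the expected signature, 0 otherwise (C11 _Generic). */
#define HAS_EXPECTED_API(fn) _Generic(&(fn), expected_post_fn: 1, default: 0)

/* A configure probe would make a mismatch fatal.  Asserting on
 * example_post_old instead would stop the compile right here, and
 * configure could then bail out with an informative message. */
_Static_assert(HAS_EXPECTED_API(example_post_new), "API mismatch");

int new_api_matches(void) { return HAS_EXPECTED_API(example_post_new); }
int old_api_matches(void) { return HAS_EXPECTED_API(example_post_old); }
```

An autoconf macro would wrap a program like this in a compile test and
turn a failed compile into the "you can't do that" error.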

> > I don't think this particular problem is something we need to address
> > for 1.4.1 though.
> >
> >   
> 
> So 1578 can be deferred?

No, I mean the problem of detecting API changes in configure.  I'd still
like to see 1578 addressed for 1.4.1 as the fix in 1538 is pretty
useless without it as it will still mean needing to create a third,
temporary version of the autoconf.h file.

Cheers,
b.




[ewg] Re: [ofa-general] ofed autoconf.h

2009-04-08 Thread Brian J. Murrell
On Wed, 2009-04-08 at 09:27 -0500, Steve Wise wrote:
> 
> Maybe.  I thought include_next included it after the existing file was 
> processed.

Hrm.  You could be right about that.  My interpretation was always that
it included it immediately, as if the contents of the #include_next
files were actually in the caller's file right where the #include_next
is.

> Maybe I'm all wet though.

Or maybe it's me who is all wet.  :-)

> I'll run experiments.

I just did.  It seems to work as I thought.  You do get macro
redefinition warnings though for something defined in both files:

bar/a.h:1:1: warning: "FOO" redefined

which will be an error if you build with -Werror.  :-(

If including the kernel's autoconf.h *before* doing all of the OFED
macros is the right solution (which I think it is) the warnings can be
fixed by doing:

#include_next <linux/autoconf.h>

#undef FOO
#define FOO 1

But relative to bug 1578, I'd only want to see macros which are to be
set to something "#undef"ed first and not have every macro "#undef"ed
wholesale.
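
A single-file sketch of that pattern. The two commented sections stand
in for the backing kernel's autoconf.h and the OFED one (a real build
would pull the first in via #include_next rather than inline it). FOO is
the macro from the warning above; BAR, its value, and the two helper
functions are invented for illustration.

```c
/* --- stand-in for the backing kernel's autoconf.h --- */
#define FOO 0
#define BAR 7

/* --- stand-in for the OFED autoconf.h, processed afterwards --- */
/* Only macros OFED actually sets are #undef'ed first, so no
 * "redefined" warning is issued and BAR keeps the kernel's value. */
#undef FOO
#define FOO 1

int foo_value(void) { return FOO; }
int bar_value(void) { return BAR; }
```

Here foo_value() sees the OFED setting while bar_value() still sees the
kernel's, and the file builds cleanly even with -Werror.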

> If you're 
> right, then my original patch to the configure script will handle 1538.

Yes, indeed.

> You can always work around these issues, yes?

Yeah.  It's ugly though.  It essentially winds up being your "cat"
solution (with some "#undef" removals to address bug 1578), to create a
third autoconf.h in a temporary directory (which is basically a union of
the two files) and put that temporary directory in the include path
before the OFED include and backing kernel include paths.

This is a hack that will need to be undone in a future Lustre release
when this issue is resolved.

> You have to rebuild/reinstall ofed if you change the backing kernel.

Hrm.  Even if I change something completely unrelated to OFED or
networking at all, like say just changing CONFIG_SERIAL_8250 from m to
y?

> If the include_next solution works, then we're all set...

Indeed.  I think so too.

> This does expose an issue, however.  If an ofed release changes the 
> kernel verbs or cm APIs, then it can break  any rdma kernel modules that 
> do not get rebuilt against the ofed headers.  But this issue has always 
> been there I guess. 

Yeah.  I was considering that as well WRT bug 1578 and not wholesale
"#undef"ing all macros leading to a mixture of kernel provided and OFED
provided RDMA options.

I wonder if this is something that is appropriate to do at (OFED)
configure time, and simply bail if a mismatch is found with a "you can't
do that.  either change your ofed selections or disable FOO in your
kernel configuration" type error message.

I don't think this particular problem is something we need to address
for 1.4.1 though.

b.




[ewg] Re: [ofa-general] ofed autoconf.h

2009-04-08 Thread Brian J. Murrell
Hi Steve,

Thanx for looking into these two items for us.

On Tue, 2009-04-07 at 16:24 -0500, Steve Wise wrote:
> I've been charged to resolve OFED bugs 1538 and 1578 dealing with 
> autoconf.h.  I seek input from the OFA lists.

For the record of discussion, I opened these two bugs.

> It was suggested that we could add a "#include_next 
> <linux/autoconf.h>" to the ofed file and thus include both, but that 
> doesn't work because we need the backing kernel autoconf.h included 
> first, and then the ofed one.

Isn't that just a function of where you put the #include_next in the
ofed autoconf.h?  If you put it at the top, the ofed autoconf.h will
override anything in the backing kernel autoconf.h but if you put it at
the bottom, the backing kernel's autoconf.h overrides values set in the
ofed autoconf.h, no?

> I think the idea originally was 
> that the backing kernel didn't have any of these ofed modules.

That was my suspicion as well -- that this grew out of something that
was acceptable to do before the OFED stack started providing more of
what the backing kernel could be providing.

> 1) do we think these should be resolved in ofed-1.4.1?

I would really like to see them resolved in the 1.4.1 release as we have
already missed the 1.4.0 release due to other external-module build
problems.

> Here are my proposed solutions (dunno if they break anything)
> 
> 1538:  change the ofed configure script to create a fully-populated 
> autoconf.h that basically is a cat of the back kernel tree autoconf.h 
> and the current ofed autoconf.h.  That way, modules will get everything 
> when they include the ofed autoconf.h.

What happens then if the user changes something in their kernel
configuration (i.e. after having built kernel-ib{,-devel} and installing
kernel-ib-devel) which is completely independent of OFED, like, say
enabling the serial module?

I'm definitely no module versioning expert, but I think such a change
would be allowable and not invalidate the modules in kernel-ib.  If
anyone knows better than I, please do correct me here.

But in such a case, "#include <linux/autoconf.h>" will not reflect that
kernel configuration change.  This is not to say that such an operation
will be common, but I'm just trying to find the solution that covers the
most use-cases and still achieves our goal.

Is this solution to merge the backing kernel's autoconf.h into the ofed
one because of the perceived inability to use #include_next to
effectively concat the files at (external module) compile time?  Does my
suggestion as to positioning of the #include_next <linux/autoconf.h> in
the ofed autoconf.h change your thoughts on the best solution to this
problem?

> 1578: I propose we don't #undef any modules that are not configured into 
> the ofed build.

Yes.  This was my proposal as well.

> Thus if you don't build in NFSRDMA for ofed, the status 
> of the NFS CONFIG* defines will be based on the backing kernel tree 
> autoconf.h instead of always being turned off.

This is the solution I like.  The use-case this covers is that I want to
use infiniband in my network (for something other than NFS) and don't
want to use the NFSRDMA supplied by the OFED stack and instead want to
use the kernel's own provided NFS stack.  Not replacing the vendor
kernel supplied NFS stack reduces our risk and maintenance efforts.
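
As a single-file sketch of that behaviour: the first section stands in
for the backing kernel's autoconf.h and the second for the OFED one.
OFED_WITH_NFSRDMA is a hypothetical switch meaning "NFSRDMA was selected
when OFED was configured", and the two helper functions are invented for
illustration; the CONFIG_* names are only examples of NFS-related
options.

```c
/* --- stand-in for the backing kernel's autoconf.h --- */
#define CONFIG_NFS_FS 1          /* kernel was built with its own NFS */

/* --- stand-in for the OFED autoconf.h --- */
/* OFED_WITH_NFSRDMA is hypothetical and deliberately left undefined
 * here, modelling an OFED build without NFSRDMA. */
#ifdef OFED_WITH_NFSRDMA
#undef CONFIG_NFS_FS
#define CONFIG_NFS_FS 1
#undef CONFIG_SUNRPC_XPRT_RDMA
#define CONFIG_SUNRPC_XPRT_RDMA 1
#endif
/* No #else branch: when NFSRDMA is not selected, nothing is #undef'ed. */

int kernel_nfs_visible(void)
{
#ifdef CONFIG_NFS_FS
    return 1;
#else
    return 0;
#endif
}

int ofed_nfsrdma_visible(void)
{
#ifdef CONFIG_SUNRPC_XPRT_RDMA
    return 1;
#else
    return 0;
#endif
}
```

With NFSRDMA left out, an external module still sees the kernel's own
CONFIG_NFS_FS setting and nothing from the OFED NFS options.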

b.





Re: [ewg] compile failure for RHEL 5.3 on IA64

2009-04-01 Thread Brian J. Murrell
On Wed, 2009-04-01 at 12:56 -0500, Jon Mason wrote:
> 
> I fixed this issue recently.

Awesome news Jon!

> If you pull down last night's tarball, it
> should be fixed.

OK.  I will let you know if it's otherwise.

Thanx!
b.




[ewg] compile failure for RHEL 5.3 on IA64

2009-04-01 Thread Brian J. Murrell
Running the OFED 1.4.1-rc release through our build system was mostly
successful with the exception of RHEL 5.3 on IA64.  It failed with:

make -f scripts/Makefile.build 
obj=/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core
  gcc 
-Wp,-MD,/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core/.addr.o.d  
-nostdinc -isystem /usr/lib/gcc/ia64-redhat-linux/4.1.2/include -D__KERNEL__ \
-include include/linux/autoconf.h \
-include /cache/build/BUILD/ofa_kernel-1.4.1/include/linux/autoconf.h \
-I/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/
 \
 \
 \
-I/cache/build/BUILD/ofa_kernel-1.4.1/include \
-I/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/debug \
-I/usr/local/include/scst \
-I/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/ulp/srpt \
-I/cache/build/BUILD/ofa_kernel-1.4.1/drivers/net/cxgb3 \
-Iinclude \
 \
-I/cache/build/BUILD/lustre-kernel-2.6.18/lustre-1.8.0.50/linux-2.6.18-128.1.1/arch/ia64/include
 \
 -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE 
-DHAVE_SERIALIZE_DIRECTIVE  -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
-fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef 
-Werror-implicit-function-declaration -Os -pipe  -ffixed-r13 
-mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers 
-fno-optimize-sibling-calls -fomit-frame-pointer -g  -fno-stack-protector 
-Wdeclaration-after-statement -Wno-pointer-sign   -DMODULE -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(addr)"  -D"KBUILD_MODNAME=KBUILD_STR(ib_addr)" -c 
-o /cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core/.tmp_addr.o 
/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core/addr.c
In file included from include/asm/cacheflush.h:9,
 from include/asm/pgtable.h:154,
 from include/linux/mm.h:39,
 from 
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/mm.h:4,
 from include/linux/skbuff.h:25,
 from 
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/skbuff.h:4,
 from include/linux/if_ether.h:111,
 from 
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/if_ether.h:4,
 from include/linux/netdevice.h:29,
 from 
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/netdevice.h:4,
 from include/linux/inetdevice.h:7,
 from 
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/inetdevice.h:4,
 from 
/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core/addr.c:37:
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/page-flags.h:
 In function 'cancel_dirty_page':
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/page-flags.h:14:
 error: dereferencing pointer to incomplete type
/cache/build/BUILD/ofa_kernel-1.4.1/kernel_addons/backport/2.6.18-EL5.3/include/linux/page-flags.h:15:
 error: dereferencing pointer to incomplete type
make[4]: *** 
[/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband/core] 
Error 2
make[2]: *** [/cache/build/BUILD/ofa_kernel-1.4.1/drivers/infiniband] Error 2
make[1]: *** [_module_/cache/build/BUILD/ofa_kernel-1.4.1] Error 2
make[1]: Leaving directory 
`/cache/build/BUILD/lustre-kernel-2.6.18/lustre-1.8.0.50/linux-2.6.18-128.1.1'
make: *** [kernel] Error 2

RHEL 5.3 was successful on both our i686 and x86_64 builds.

Thots?

b.




Re: [ewg] error with RHEL 5.3 backports in 1.4.1-rc2

2009-03-31 Thread Brian J. Murrell
On Mon, 2009-03-30 at 15:20 -0500, Jon Mason wrote:
> 

Hi Jon,

> I believe the attached patch will fix the issue.  Please download the
> latest ofed kernel, configure, apply the patch, build and load the new
> modules.

Yes, indeed it does fix the problem.  I based my  tests on 1.4.1-rc2
FWIW.

> Let me know if it fixes your problem, and I'll make similar changes to
> RHEL5.1 and RHEL5.2.

Great!  Thanx again!

b.




Re: [ewg] error with RHEL 5.3 backports in 1.4.1-rc2

2009-03-30 Thread Brian J. Murrell
On Sun, 2009-03-29 at 14:23 -0500, Jon Mason wrote:
> 

Hi Jon,

> If you have
> writeback.h and mpage.h included in lustre_compat25.h (or one of the
> other files), then it could be causing this.

Indeed, we do include both:

#include 
#include /* for generic_writepages */

in lustre_compat25.h.

> Feel free to file a bug.

I can and will if you wish.  If you would prefer to just deal with this
here, that is fine with me too.  I just want to work within whatever
processes work best for y'all.

> One way to resolve it would be to move
> kernel_addons/backport/2.6.18-EL5.3/include/linux/writeback.h
> to 
> kernel_addons/backport/2.6.18-EL5.3/include/linux/mpage.h
> 
> I can send you a quick patch to try out, if you are interested.

Sure.  I'm interested in whatever suggestions you might have.

b.




[ewg] error with RHEL 5.3 backports in 1.4.1-rc2

2009-03-27 Thread Brian J. Murrell
I'm trying to build Lustre with OFED 1.4.1-rc2 and RHEL5.3's
2.6.18-128.1.1.el5 kernel and getting the following error:

  gcc
-Wp,-MD,/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/llite/.lloop.o.d
  -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.3.2/include -D__KERNEL__ 
-I/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/foo 
-I/usr/src/ofa_kernel/kernel_addons/backport/2.6.18-EL5.3/include/ 
-I/usr/src/ofa_kernel/include  -Iinclude  -include include/linux/autoconf.h  
-Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration 
-Os -pipe -msoft-float -fno-builtin-sprintf -fno-builtin-log2 -fno-builtin-puts 
 -mpreferred-stack-boundary=2  -march=i686 -mtune=generic -mtune=generic 
-mregparm=3 -ffreestanding -Iinclude/asm-i386/mach-generic 
-Iinclude/asm-i386/mach-default -fomit-frame-pointer -g  -fno-stack-protector 
-Wdeclaration-after-statement -Wno-pointer-sign  -include 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/config.h  -g 
-I/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lnet/include 
-I/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lnet/include 
-I/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include  -g -O2 
-I/opt/mpich/include   -DMODULE -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(lloop)"  
-D"KBUILD_MODNAME=KBUILD_STR(llite_lloop)" -c -o 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/llite/.tmp_lloop.o 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/llite/lloop.c
In file included
from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include/linux/lustre_compat25.h:337,

from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include/linux/lvfs.h:49,

from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include/lvfs.h:48,

from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include/obd_support.h:41,

from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include/lustre_cfg.h:211,

from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/include/lustre_lib.h:47,

from 
/mnt/lustre/brian/lustre/OFED-1.4.1-rc2/build/b1_8/lustre/llite/lloop.c:111:
include/linux/mpage.h:45: error: conflicting types for
‘backport_write_cache_pages’
/usr/src/ofa_kernel/kernel_addons/backport/2.6.18-EL5.3/include/linux/writeback.h:10:
 error: previous declaration of ‘backport_write_cache_pages’ was here

Looks like there is some conflict in the re-definition of
write_cache_pages() in the backport?

My kernel's mpage.h is defining it as:

int
write_cache_pages(struct address_space *mapping, int range_cont,
struct writeback_control *wbc, writepage_data_t writepage,
void *data);

Shall I file a bug on this?  Any thoughts on how to resolve?

b.




Re: [ewg] Please update OFED 1.4.1 bugs status on bugzilla

2009-03-09 Thread Brian J. Murrell
On Mon, 2009-03-09 at 09:13 -0700, Jeff Becker wrote:
> Hi Brian. I was wondering if you tried the RHEL5.2 NFSRDMA backport and
> if so, did it solve your exportfs.h issue?

In fact I tried the RHEL5.3 backports with 1.4.1-rc1 and it seems to
have fixed the problems -- in building at least.  We still have to put
the resulting modules through a real QA run.

> Can we close/modify the bug?

Sure.  As you wish.  I can certainly reopen it if we run into some
problem(s), or you can leave it as open as you wish to keep the issue on
your radar if you like/need.

Thanx again,
b.




Re: [ewg] Please update OFED 1.4.1 bugs status on bugzilla

2009-03-08 Thread Brian J. Murrell
Ooops.  I filed bug 1538 but forgot to make it block 1.4.1.  I would
like it considered for the 1.4.1 release please.

Thanx!

b.




[ewg] how to deal with /usr/src/ofa_kernel/include/linux/autoconf.h?

2009-03-04 Thread Brian J. Murrell
I'm wondering what the general wisdom is here to deal with the fact that
given a kernel-ib-devel installation, /usr/src/ofa_kernel/include/linux
has an autoconf.h in it with just the OFED kernel definitions in it.

Given that when I build my kernel module I do:

. /usr/src/ofa_kernel/config.mk
gcc ... $BACKPORT_INCLUDES -I/usr/src/ofa_kernel/include -I ...

and my kernel module does a:

#include <linux/autoconf.h>

my module obviously ends up getting the very abbreviated OFED autoconf.h
rather than the kernel's own, much bigger and more complete version.

Should I be jiggering my gcc options to somehow include the kernel one
(i.e. -include $kernel_path/include/linux/autoconf.h)?  I dislike that
solution somewhat as it includes the autoconf.h in source files that
don't specifically need it.  Or is there some other solution?

Should the autoconf.h that's
in /usr/src/ofa_kernel/include/linux perhaps employ an "#include_next
<linux/autoconf.h>" to get the kernel's own version of that file merged
with the OFED one?

Thanx,
b.



Re: [ewg] question about sles 11 backport

2009-03-03 Thread Brian J. Murrell
On Tue, 2009-03-03 at 11:48 -0800, Jeff Becker wrote:
> Hi Brian

Hi Jeff,

> >  2. does the lack of a kernel_addons/backport directory for sles 11
> > mean that there are no kernel "include" changes needed for the
> > sles 11 kernel to build the OFED stack?
> >   
> 
> Right, since OFED 1.4.1 and SLES11 are both based on kernel 2.6.27.

As I figured then.  Excellent.

> Note
> however, that sometimes the includes are patched by the backport patch.

Meaning the ofa_kernel/include includes?

> This is typically done in cases where the include file is actually
> checked out by the ofed_checkout.sh script - see
> ofed_scripts/checkout_files for a list of these.

OK.  Will do.

b.




[ewg] question about sles 11 backport

2009-03-03 Thread Brian J. Murrell
I'm looking at the latest tree for the 1.4.1 release, and while I see a

kernel_patches/backport/2.6.27_sles11/to_sles11.patch

I don't see any subdir in kernel_addons/backport for sles 11.

That brings me to 2 questions...

 1. are the contents of the kernel_patches/backport directory patches
that get applied to the OFA source directly in order to work
with a given kernel?
 2. does the lack of a kernel_addons/backport directory for sles 11
mean that there are no kernel "include" changes needed for the
sles 11 kernel to build the OFED stack?

TIA for any insight you can provide.

Cheers,
b.





Re: [ewg] Re: [PATCH] NFS-RDMA backport for RHEL 5.2

2009-03-03 Thread Brian J. Murrell
On Sat, 2009-02-28 at 13:01 -0600, Steve Wise wrote:
> Jon Mason wrote:
> > Hey Vlad,
> >
> > I wanted to get you the NFS-RDMA backport patches I have queued up prior
> > to rc1 being built.  I have this patch (RHEL5.2), as well as 2.6.22, and
> > 2.6.25 (which I will be sending in separate e-mails).
> >   
> 
> 
> Thanks Jon and Tom for doing this work!

Indeed!  Given that we have been a squeaky wheel about this backport, I
want to give you my/our thanks as well.

b.



[ewg] /usr/src/ofa_kernel/include/linux/exportfs.h:145: error: redefinition of 'struct export_operations' on RHEL5

2009-02-11 Thread Brian J. Murrell
Hi,

I'm trying to build a third party module with the OFED 1.4.0
kernel-ib-devel /usr/src/ofa_kernel tree (as it uses the OFED stack).
In my compiler flags I include:

-I/usr/src/ofa_kernel/include
-I/usr/src/ofa_kernel/kernel_addons/backport/2.6.18-EL5.2/include/linux/

prior to any of the regular kernel module build includes.  The result is
not good though:

In file included from 
/cache/build/BUILD/lustre-1.8.0/lustre/llite/llite_nfs.c:45:
/cache/build/BUILD/lustre-kernel-2.6.18/lustre/kernel-ib-devel/usr/src/ofa_kernel/include/linux/exportfs.h:145:
 error: redefinition of 'struct export_operations'
/cache/build/BUILD/lustre-1.8.0/lustre/llite/llite_nfs.c:243: error: unknown 
field 'get_dentry' specified in initializer
/cache/build/BUILD/lustre-1.8.0/lustre/llite/llite_nfs.c:243: warning: excess 
elements in struct initializer
/cache/build/BUILD/lustre-1.8.0/lustre/llite/llite_nfs.c:243: warning: (near 
initialization for 'lustre_export_operations')

The problem comes from the fact that llite_nfs.c does a:

#ifdef HAVE_LINUX_EXPORTFS_H
#include <linux/exportfs.h>
#endif

because it needs "struct export_operations" which in later kernels, as
you know, got moved to that header -- and had its members changed.
There is no get_dentry member in the exportfs.h version of it.

The problem arises from the inclusion of both
/include/linux/fs.h
and /usr/src/ofa_kernel/include/linux/exportfs.h

which both define struct export_operations.

How can I resolve this conflict?  Am I specifying the various includes
(ofed generic, ofed backport, kernel regular) in the wrong order?  I
would have thought surely, you would want /usr/src/ofa_kernel/include/
to be preferred so that its rdma/ and scsi/ paths override what's in
the kernel, but of course, that brings with it everything added to the
include/ subdir.

How can I resolve this conflict?

Thanx,
b.




RE: [ewg] OFED 1.4's autoconf.h conflicting with kernel

2009-01-29 Thread Brian J. Murrell
On Thu, 2009-01-29 at 04:59 -0800, Poornima Kamath (Contractor) wrote:
> Hi Brian,
> 
> You can  create another autoconf.h file and define all the variables that are 
> being overridden by OFED and which you want defined in that file.

I'm not sure I'm following that.  If I am, then it seems
less-than-straightforward that I, as a package maintainer need to start
cobbling up my own autoconf.h files to be a union of the two I need to
use.  But on to your example...

> Consider you keep this autoconf.h in /usr/src/test/linux directory.
> So then you can specifically include it in the following way.
>   -include include/linux/autoconf.h -include 
> /usr/src/ofa_kernel/include/linux/autoconf.h -include 
> /usr/src/test/linux/autoconf.h
> This will take care of the variables being defined correctly. 
> Now you also will have to include the following
>   -I  /usr/src/test  -I/usr/src/ofa_kernel/include/
>This is so that if you have any files with <linux/autoconf.h>, then 
> autoconf.h will be taken from /usr/src/test/ 
> Thus variables in OFED autoconf.h will be overridden by the ones in your 
> autoconf.h

Yeah.  I'm understanding it alright.  I just don't like it.  The need
for me to have to manually build "unionized" autoconf.h files for every
combination of kernel/OFED configuration that I will want to build is
just not optimal.

IMHO, if the OFED development tree is going to provide an autoconf.h
that is going to get found and preferred to the kernel's own, it needs
to do it in a way that _augments_ the kernel's file, not completely
overrides it.

Thots?

If we use the example on hand, where the nfsrdma option is not selected
in the OFED build, then maybe all of the #undefs for the NFS related
options should simply be omitted from the OFED version of the autoconf.h
file rather than included as #undef and then the OFED autoconf.h should
#include_next the kernel's autoconf.h

So maybe something like the following in the OFED autoconf.h:


#include_next <linux/autoconf.h>

[ core IB defines which should override the kernel ]

[ nfsrdma IB defines, if nfsrdma is configured -- no #undefs if not ]
---

And then there are specific items like CONFIG_DEBUG_INFO.  I'm not even
sure why the OFED stack is doing anything with them.  I don't see a
single source file in the whole tree that uses it.

CONFIG_SYSCTL does get used by the core NFS code in the OFED stack that
is built when nfsrdma is selected, but I'm not convinced that we should
not just be following what the kernel does with this.  There's really no
point to building the sysctl stuff into the modules if the kernel the
modules are being inserted into doesn't have it enabled, yes?

b.




[ewg] how to use /usr/src/ofa_kernel to build with the OFED stack

2009-01-26 Thread Brian J. Murrell
Is there a document that explains exactly what I should do in my kernel
module to have it build with the OFED stack, like in terms of compiler
flags, linker, etc.?

Currently my compiler line looks like this:

gcc ... -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.3.2/include ... \
-I/usr/src/ofa_kernel/kernel_addons/backport/2.6.18-EL5.2/include/ \
-I/usr/src/ofa_kernel/include \
-I/usr/src/ofa_kernel/kernel_addons/backport/2.6.18-EL5.2/include/ \
-Iinclude -include include/linux/autoconf.h ... \
-I/usr/src/ofa_kernel/kernel_addons/backport/2.6.18-EL5.2/include/ \
-I/usr/src/ofa_kernel/include \
-include //build/HEAD/config.h -c \
-o //build/HEAD/foo.o //build/HEAD/foo.c

(Not sure why/how the kernel_addons/backport/2.6.18-EL5.2/include/ got
listed three times but it should not make a difference in this case)

I want to be sure I am building my compiler command correctly.

This is in regard to my previous posting, "OFED 1.4's autoconf.h
conflicting with kernel".  I just want to ensure that my basic
understanding is correct.

Thanx,
b.




[ewg] OFED 1.4's autoconf.h conflicting with kernel

2009-01-26 Thread Brian J. Murrell
I was asked to bring this discussion over here... and there has been
some initial response from Jeff Becker about it.  You can catch up on
the thread here:
http://lists.openfabrics.org/pipermail/general/2009-January/056633.html

Anyway... On to the posting...

So, I noticed that OFED 1.4 is bringing a whole lot more into
its /usr/src/ofa_kernel/include/linux/autoconf.h than 1.3.1 did.  I
find these additional defines with 1.4:

#undef CONFIG_SUNRPC_XPRT_RDMA
#undef CONFIG_SUNRPC
#undef CONFIG_SUNRPC_GSS
#undef CONFIG_RPCSEC_GSS_KRB5
#undef CONFIG_RPCSEC_GSS_SPKM3
#undef CONFIG_NFS_FS
#undef CONFIG_NFS_V3
#undef CONFIG_NFS_V3_ACL
#undef CONFIG_NFS_V4
#undef CONFIG_NFS_ACL_SUPPORT
#undef CONFIG_NFS_DIRECTIO
#undef CONFIG_SYSCTL
#undef CONFIG_EXPORTFS
#undef CONFIG_LOCKD
#undef CONFIG_LOCKD_V4
#undef CONFIG_NFSD
#undef CONFIG_NFSD_V2_ACL
#undef CONFIG_NFSD_V3
#undef CONFIG_NFSD_V3_ACL
#undef CONFIG_NFSD_V4
#undef CONFIG_NFSD_RDMA

amongst a few others that I am far less concerned with.

The problem is, that these are in direct conflict with what I have
chosen for my kernel build.

Some research has led me to a message
(http://www.mail-archive.com/gene...@lists.openfabrics.org/msg18161.html)
from Jeff Becker back on Thu, 10 Jul 2008 15:58:53 -0700 in which he
submitted a patch to integrate NFSRDMA into OFED 1.4 which is what appears
to have brought these changes into OFED 1.4.

So, I guess I am wondering how can I build OFED 1.4, leaving out the
NFSRDMA stuff and yet not override my kernel's config settings for all
of the above variables?  IOW, I don't want any of those #undefs in the
OFED autoconf.h overriding the ones in my kernel's autoconf.h

It seems with OFED the only two options for NFS are NFSRDMA or no NFS at
all.  There is no "use the kernel's NFS, unmodified" option.

Beyond all of the NFSRDMA stuff, why is CONFIG_SYSCTL being drawn into
all of this?

Any insights would be appreciated.

b.




Re: [ewg] Re: [ofa-general] resolve conflict between OFED 1.3 and 2.6.18 with ISCSI

2008-04-15 Thread Brian J. Murrell
On Tue, 2008-04-15 at 09:25 +0300, Erez Zilber wrote:
> Voltaire will not be able to add qla4xxx support to open-iscsi in OFED
> 1.4. I understand that this may be important for some people, so if you
> (or anyone else) wants to add it, we can help with some info about
> open-iscsi and its backports & scripts in OFED (but we can't do the
> backports and testing ourselves).

Thanks for the update, Erez.  I understand.

b.




[ewg] Re: [ofa-general] resolve conflict between OFED 1.3 and 2.6.18 with ISCSI

2008-04-14 Thread Brian J. Murrell
On Mon, 2008-04-14 at 16:50 +0300, Erez Zilber wrote:
> 
> I'm not sure if there's a real demand for this transport for OFED users,
> is there?

Maybe I'm not seeing the bigger picture, but it seems pretty orthogonal
to me.  Does using OFED 1.3 preclude using a qla4xxx host adapter?  IOW,
is there anything inherent in using OFED 1.3 as the networking fabric on
(say) a storage server that rules out using a QLogic ISP4XXX adapter to
access its storage?

> Adding qla4xxx will require backport patches for all supported
> distros, and we don't have the HW to test it.

Yeah, the old conundrum.

> Therefore, unless it's
> really important for enough OFED users, I don't think that we should add it.

Well, given that the alternative is that it's completely unbuildable in
the kernel when you choose OFED's iscsi options, is including qla4xxx in
the OFED distribution, even untested, so bad?

> BTW - I don't mind if other people add the required code to OFED 1.4 for
> qla4xxx support.

~sigh~  Yeah.

I wonder how many (if any) of our userbase we are going to upset if we
cease providing the qla4xxx driver in our kernels.  On the other hand, I
wonder how many we'd upset by not providing iSER and the newer
open-iscsi modules.

b.




[ewg] Re: [ofa-general] resolve conflict between OFED 1.3 and 2.6.18 with ISCSI

2008-04-14 Thread Brian J. Murrell
On Mon, 2008-04-14 at 16:05 +0300, Erez Zilber wrote:
> 
> General comment - in the future, I suggest that you send OFED related
> e-mails also to the EWG list and to me (I maintain iSER in OFED &
> kernel.org).

I will probably need to subscribe first.  :-(

> OFED 1.3 provides open-iscsi 2.0-865.15 (userspace & kernel). This
> version is newer than the version that is shipped with RHEL5. It also
> has full iSER support.

Yeah, I had a feeling that what I really wanted was to use the
ofa_kernel one.

> Yeah, it is an open-iscsi transport, so you must have open-iscsi in
> order to use this driver. With OFED 1.3, qla4xxx is not included. We
> only included the TCP & iSER transports.

Indeed.

> Of course. You can't have open-iscsi modules twice.

Exactly, which is why I want to disable them in the kernel if I can.

> OFED is shipped with its own version of open-iscsi because I don't want
> to support multiple versions of open-iscsi (each distro has its own
> version of open-iscsi).

That's certainly fair enough.

> Also, having a newer version of open-iscsi
> (which we have in OFED) fixes many bugs and adds new features (which is
> good).

Indeed.  All the more reason to use the OFED supplied one.  However...

> Is qla4xxx the only problem that you have with open-iscsi in OFED?

Looking through the kernel Kconfig files, it does appear that
SCSI_QLA_ISCSI is the only driver needing SCSI_ISCSI_ATTRS that isn't in
the OFED 1.3 release.
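
A rough sketch of that kind of Kconfig scan, for anyone wanting to repeat
it.  The function name is hypothetical and the sample Kconfig text is
illustrative only, not copied from a kernel tree.  It only looks at
"select" and "depends on" lines and does not understand help text or
"if" blocks.

```python
import re


def configs_needing(kconfig_text, symbol):
    """Return config entries whose select/depends-on lines name symbol."""
    current = None
    found = []
    for raw in kconfig_text.splitlines():
        line = raw.strip()
        words = line.split()
        if len(words) == 2 and words[0] in ("config", "menuconfig"):
            current = words[1]
        elif current and line.startswith(("select ", "depends on ")):
            # Split on non-identifier characters so FOO does not match FOO_BAR.
            tokens = re.split(r"[^A-Za-z0-9_]+", line)
            if symbol in tokens and current not in found:
                found.append(current)
    return found


# Illustrative sample only; real entries live in the kernel's Kconfig files.
sample = """\
config SCSI_ISCSI_ATTRS
    tristate "iSCSI Transport Attributes"
    depends on SCSI && NET

config ISCSI_TCP
    tristate "iSCSI Initiator over TCP/IP"
    depends on SCSI && INET
    select SCSI_ISCSI_ATTRS

config SCSI_QLA_ISCSI
    tristate "QLogic ISP4XXX host adapter family support"
    depends on PCI && SCSI && NET
    select SCSI_ISCSI_ATTRS
"""

if __name__ == "__main__":
    for name in configs_needing(sample, "SCSI_ISCSI_ATTRS"):
        print(name)
```

Comparing the resulting list against the modules OFED actually ships shows
which drivers would be left without a transport class to build against.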

b.


