I'm about to send some patches for libibverbs and Roland's infiniband kernel
git tree. The patches fit into two general categories:
1. Add enums for Cisco's Ethernet Virtual NIC (it's not an RNIC and therefore
doesn't fit the RNIC/IWARP enums). Also add enums for 1500 and 9000 MTUs.
2. Minor
On Apr 3, 2013, at 10:49 AM, "Hefty, Sean" wrote:
> Can we get a better patch description?
>
> Maybe mention something about the NIC? Does it support all verbs? Is it for
> kernel users or just user space? Does this simply export a raw ethernet
> interface?
Sure. For a little background,
On Apr 3, 2013, at 2:45 PM, Or Gerlitz wrote:
> Jeff, I agree with Sean, there's not much point to review/discuss
> these general/pre-step patches without seeing some actual device
> specific kernel (if there are such or user space code if there aren't
> any kernel ones) code. e.g., you can submit
On Apr 3, 2013, at 12:52 PM, Roland Dreier wrote:
> I don't think we can blithely do this... I think the IB enum values
> are defined to match the values used in the IB spec (PathRecord etc).
Gotcha. I inserted the enums in their proper numerical order to make the range
comparisons simpler in
type good.
On Apr 4, 2013, at 5:27 PM, "Or Gerlitz" wrote:
> Jeff Squyres (jsquyres) wrote:
>
>> Sure. For a little background, the 2nd-generation Cisco VIC has been
>> available
>> since last year (IIRC): http://www.cisco.com/en/US/products/ps10277
>>
Per my previous email, forgive my top reply...
RDMA_NODE_VENDOR would be great, actually. Should I work up a patch for that?
Sent from my phone. No type good.
On Apr 4, 2013, at 10:32 AM, "Hefty, Sean" wrote:
>> The reason we're asking for these IBV_*_USNIC enums now -- before we've
>> submit
On Apr 5, 2013, at 4:40 PM, Roland Dreier wrote:
> I think the idea is that without context, it's hard to know if adding
> these enums makes sense or not. And I'm sorry but I'm not that
> sympathetic to "my code isn't ready but you have to take this
> out-of-context patch so I can meet Red Hat's
On Apr 4, 2013, at 1:57 PM, "Weiny, Ira" wrote:
>> In hindsight, the user space API never should have exposed the mtu as an
>> enum...
>>
>> Since an enum is an int, and we're never going to have anything with an mtu
>> <= 5 bytes, couldn't we just store all new mtu values directly as their byte
Roland --
If there are no objections, can this patch (and patch 4 of this set:
https://patchwork.kernel.org/patch/2387321/) be committed? Neither should
have any real impact other than the modernization of the libibverbs build
system.
On Apr 3, 2013, at 9:06 AM, Jeff Squyres wrote:
> T
On Apr 8, 2013, at 6:16 PM, "Hefty, Sean" wrote:
> Why can't IB_MTU_1500 = 1500?
It certainly could. Additionally, since Roland was a little concerned about
the "IB" prefix (since 1500 and 9000 are not IBTA-sanctioned MTUs), they could
have a different prefix -- perhaps RDMA_MTU_1500.
Alt
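To make the idea in this exchange concrete, here is a minimal sketch of an enum that keeps the IBTA codes (1 through 5) as they are and stores the non-IBTA MTUs directly as byte counts, plus a helper that accepts either encoding. All names (`rdma_mtu_sketch`, `SKETCH_MTU_*`, `sketch_mtu_to_bytes`) are hypothetical and are not taken from libibverbs or the kernel; the sketch only assumes, as noted elsewhere in the thread, that no real MTU will ever be 5 bytes or smaller.

```c
#include <assert.h>

/* Hypothetical enum for illustration only: the IBTA codes keep their
 * spec-defined values, and non-IBTA MTUs are stored as byte counts. */
enum rdma_mtu_sketch {
    SKETCH_MTU_256  = 1,
    SKETCH_MTU_512  = 2,
    SKETCH_MTU_1024 = 3,
    SKETCH_MTU_2048 = 4,
    SKETCH_MTU_4096 = 5,
    SKETCH_MTU_1500 = 1500,
    SKETCH_MTU_9000 = 9000
};

/* Convert either encoding to a byte count.
 * Values 1..5 are treated as IBTA codes; anything larger is assumed
 * to already be a byte count. Returns -1 for values below 1. */
int sketch_mtu_to_bytes(int mtu)
{
    if (mtu >= SKETCH_MTU_256 && mtu <= SKETCH_MTU_4096)
        return 128 << mtu;      /* code 1 -> 256 bytes, code 5 -> 4096 bytes */
    if (mtu > SKETCH_MTU_4096)
        return mtu;             /* already a byte count */
    return -1;
}
```

With this layout, both `sketch_mtu_to_bytes(3)` and `sketch_mtu_to_bytes(1024)` describe a 1024-byte MTU, so callers that still pass the IBTA codes keep working.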
On Apr 9, 2013, at 4:10 PM, "Weiny, Ira" wrote:
>> Just to re-state: our issue is that there does not seem to be any other way
>> to
>> get the max UD message size without knowing the actual MTU (are we
>> incorrect about that?). Hence, using the IB-defined values is not really
>> sufficient.
>
On Apr 9, 2013, at 10:44 PM, "Weiny, Ira" wrote:
> As an aside I like the use of RDMA_MTU_* for these values. Again to
> distinguish them from the IBTA values. But I know that is poor form.
So what's the right way to move forward on this? Is it this:
enum ib_mtu {
IB_MTU_256 = 1,
On Apr 12, 2013, at 11:40 AM, Jeff Squyres (jsquyres)
wrote:
>> As an aside I like the use of RDMA_MTU_* for these values. Again to
>> distinguish them from the IBTA values. But I know that is poor form.
>
> So what's the right way to move forward on this? Is it
Bump.
Any thoughts on these two patches? They're pretty trivial, enable use with
modern versions of Autotools, and now feature the proper Signed-off-by line.
On Apr 13, 2013, at 8:15 AM, Jeff Squyres wrote:
> The old sequence of Autotools commands listed in autogen.sh is no
> longer correct.
On Apr 19, 2013, at 8:19 PM, "Hefty, Sean" wrote:
> It may help if you identify the library this patch is against. :)
3rd time sending will be the charm... :-)
--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
On Apr 22, 2013, at 1:30 PM, Doug Ledford wrote:
> However, for some reason I had it
> in my mind when I was reading the patch that it was against libibverbs.
> That's what I get for staying up late and reviewing when I'm tired :-/
There were other patches against libibverbs that were submitted
Bump.
On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote:
> The old sequence of Autotools commands listed in autogen.sh is no
> longer correct. Instead, just use the single "autoreconf" command,
> which will invoke all the Right Autotools commands in the correct
> order.
>
> Signed-off-by: Jeff S
Bump bump. :-)
On Apr 25, 2013, at 11:38 AM, Jeff Squyres (jsquyres)
wrote:
> Bump.
>
> On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote:
>
>> The old sequence of Autotools commands listed in autogen.sh is no
>> longer correct. Instead, just use the single "
Because I have no control over that. :-)
>> On Apr 25, 2013, at 11:38 AM, Jeff Squyres (jsquyres)
>> wrote:
>>
>>> Bump.
>>>
>>> On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote:
>>>
>>>> The old sequence of Autotools commands liste
On Apr 22, 2013, at 4:00 PM, Doug Ledford wrote:
>> 2. Change all instances of ib_mtu/ibv_mtu to an int. Code such as
>> "switch(mtu) case IBV_MTU_1024: ..." will need to be updated to
>> "switch(mtu) case 1024: ...".
>
> I was actually thinking that an ibverbs API version 2.0 might be an
> i
Bump.
FYI: Automake just released a new beta version, which included this in the
release notes (http://lwn.net/Articles/531373/):
- Automake 2.0 will drop support for the long-deprecated 'configure.in'
name for the Autoconf input file. You are advised to start using the
recommended name
Roland --
I see a ummunot branch on your kernel tree at git.kernel.org
(https://git.kernel.org/cgit/linux/kernel/git/roland/infiniband.git/log/?h=ummunot).
Just curious -- what's the status of this tree? I ask because, as an MPI guy,
I would *love* to see this stuff integrated into the kernel
On May 28, 2013, at 1:52 PM, Roland Dreier wrote:
> Haven't touched it in quite a while except to keep it building. Needs
> work to finish up.
What kinds of things still need to be done? (I don't know if we could work on
this or not; just asking to scope out what would need to be done at this
On May 29, 2013, at 4:53 AM, Or Gerlitz wrote:
> Have you looked at ODP? See
> https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html
Is this upstream?
Has this been run by the MPI implementor community?
The limi
On May 30, 2013, at 1:09 AM, Or Gerlitz wrote:
>> Has this been run by the MPI implementor community?
>
> The team that works on this here isn't ready for submission, so community
> runs were not made yet
If this is a solution to an MPI problem, it would seem like a good idea to run
the speci
On May 29, 2013, at 1:53 AM, Or Gerlitz wrote:
> Have you looked at ODP? See
> https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html
Is the idea behind ODP that, at the beginning of time, you register the entire
On Jun 4, 2013, at 2:54 AM, Haggai Eran wrote:
> We wish to get there eventually. In our current implementation you still
> have to register an on-demand memory region explicitly. The difference
> from a regular memory region is that the pages in the region aren't
> pinned.
Does this mean tha
On Jun 4, 2013, at 4:50 AM, Haggai Eran wrote:
>> Does this mean that an MPI implementation still has to register memory upon
>> usage, and maintain its own registered memory cache?
> Yes. However, since registration doesn't pin memory, you can leave
> registered memory regions in the cache for
On Jun 5, 2013, at 12:14 AM, Haggai Eran wrote:
>> Hmm; I'm confused. How does this fix the
>> MPI-needs-to-intercept-freed-memory problem?
> Well, there is no problem if an application frees registered memory (in
> an on-demand paging memory region) and that memory is returned to the
> OS. The
On Jun 5, 2013, at 6:39 AM, Haggai Eran wrote:
> Perhaps I'm missing something, but I believe ODP deals with the first
> two problems in the list (slide 8), even if it doesn't solve them
> completely.
Unfortunately, it does not. If we could register(0 ... 2^64) and never have to
worry about re
On Jun 5, 2013, at 9:46 AM, Jason Gunthorpe
wrote:
> No, this is too big of an ABI break, and silent at that.
>
> The IBA values have to continue to be accepted and exported in all
> cases so the ABI stays the same, which is what I thought was agreed
> on??
Can this go to a libibverbs 2.0, wher
On Jun 5, 2013, at 10:19 AM, Jason Gunthorpe
wrote:
> The concept of a libibverbs 2.0 has been NAK'd by pretty much everyone
> involved. This is why we are suffering with the complex extension
> mechanism.
Are you saying that libibverbs must always always always be backwards
compatible, and th
On Jun 5, 2013, at 10:14 AM, Jason Gunthorpe
wrote:
>> a = malloc(x);    // a gets (va=0x100, pa=0x12345) back from malloc
>> MPI_Send(a, ...); // MPI registers 0x100 for len=x, and saves (0x100,x) in
>>                   // reg cache
>> free(a);
>> a = malloc(x);    // a gets (va=0x100, pa=0x98765) back from mallo
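The hazard in the quoted sequence can be modelled with a toy registration cache. This is only a sketch: `reg_entry`, `reg_cache_hit`, and `reg_cache_stale` are hypothetical names, the single-entry "cache" stands in for whatever an MPI implementation actually keeps, and `pa` is a stand-in for the physical mapping the HCA captured when the buffer was registered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: one cache entry keyed by virtual address and length. */
struct reg_entry {
    uintptr_t va;    /* virtual address that was registered */
    size_t    len;   /* registered length */
    uintptr_t pa;    /* physical address seen at registration time */
    int       valid; /* nonzero if the entry is in use */
};

/* Return 1 on a cache hit. The lookup key is (va, len) only, which is
 * all the MPI layer can see without intercepting free(). */
int reg_cache_hit(const struct reg_entry *e, uintptr_t va, size_t len)
{
    assert(e != NULL);
    return e->valid && e->va == va && e->len == len;
}

/* Return 1 if the lookup hits but the cached mapping no longer matches
 * the buffer's current physical address, i.e. the entry is stale. */
int reg_cache_stale(const struct reg_entry *e, uintptr_t va, size_t len,
                    uintptr_t current_pa)
{
    return reg_cache_hit(e, va, len) && e->pa != current_pa;
}
```

Using the addresses from the example, an entry for (0x100, x) registered at pa=0x12345 still hits after the second malloc(), even though the buffer now lives at pa=0x98765. A cache keyed only on virtual address cannot detect that on its own.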
On Jun 5, 2013, at 11:11 AM, Jason Gunthorpe
wrote:
> I won't say never, but this is what people want. Bumping the soname is
> seen as too difficult now.
Gotcha.
Ok, so my patch is a non-starter.
>>> Thoughts:
>>> - 1024 and 3 both mean 1024, the library must accept both values,
>>> it sho
On Jun 5, 2013, at 11:18 AM, Jason Gunthorpe
wrote:
>> Are you saying that the 2nd malloc will magically be registered
>> (with the new physical address)?
>
> Yes, that is the whole point.
Interesting.
> ODP fundamentally fixes the *bug* where the HCA's view of process
> memory can become inc
On Jun 5, 2013, at 12:05 PM, Jason Gunthorpe
wrote:
>> It does seem quite odd, abstractly speaking, that a registration
>> would survive a free/re-malloc (which is arguably a "different"
>> buffer).
>
> Not at all: the purpose of the registration is to allow access via
> RDMA to a portion of th
On Jun 5, 2013, at 10:52 PM, Haggai Eran wrote:
>> Haggai: A verb to resize a registration would probably be a helpful
>> step. MPI could maintain one registration that covers the sbrk
>> region and one registration that covers the heap, much easier than
>> searching tables and things.
>
> That'
On Jun 6, 2013, at 4:33 PM, Jeff Squyres (jsquyres) wrote:
> I don't think this covers other memory regions, like those added via mmap,
> right?
We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes
any MPI problems.
1. MPI still has to have a mem
On Jun 7, 2013, at 4:57 PM, Jason Gunthorpe
wrote:
>> We talked about this at the MPI Forum this week; it doesn't seem
>> like ODP fixes any MPI problems.
>
> ODP without 'register all address space' changes the nature of the
> problem, and fixes only one problem.
I agree that pushing all regi
On Jun 10, 2013, at 11:56 AM, Liran Liss wrote:
>> "Register all address space" is the moral equivalent of not having userspace
>> registration, so let's talk about it in those terms. Specifically, there's
>> a subtle
>> difference between:
>>
>> a) telling verbs to register (0...2^64)
>> --
On Jun 10, 2013, at 1:26 PM, Jason Gunthorpe
wrote:
>> I agree that pushing all registration issues out of the application
>> and (somewhere) into the verbs stack would be a nice solution.
>
> Well, it creates a mess in another sense, because now you've lost
> context. When your MPI goes to do
On Jun 12, 2013, at 5:17 PM, Jason Gunthorpe
wrote:
> Yes, it can, via MAP_FIXED. There are lots of fun tricks you can play
> using that.
You're missing the point.
Normal users (i.e., MPI users) don't do that. They call malloc() and they get
what they get.
The whole point of upper-layer AP
On Jun 12, 2013, at 5:47 PM, Jason Gunthorpe
wrote:
> Someone has to finish the ummunotify rewrite Roland
> started. Realistically MPI is going to be the only user, can someone
> from the MPI world do this?
1. I tried to ask what needed to be done at the beginning of this thread and
didn't get
On May 9, 2015, at 8:04 AM, Yann Droneaud wrote:
>
> On Friday, May 8, 2015 at 11:21 -0700, Jeff Squyres wrote:
>> Signed-off-by: Jeff Squyres
>
> This is a little short for an explanation: what was the issue with the
> error messages?
Cisco has stopped shipping its libibverbs usnic driver
On May 20, 2015, at 1:11 PM, Doug Ledford wrote:
>
> The location of the upstream sources and tarballs would not change.
> Neither the git repo nor the tarball repo were like the kernel. The
> upstream kernel.org git repo Roland had, had his name in the repo. So
> it had to change. But the lib
On May 22, 2015, at 9:44 AM, Doug Ledford wrote:
>
>> Did that happen yet?
>
> I don't think so. I didn't file a specific ticket for it at k.o yet
> (the k.o tickets take a while to process, so I didn't want to file it
> until after the comment period here on list).
Ping.
This is just a perio
Ping.
This is just a periodic query to see if there has been any progress on
accepting this patch into libibverbs.
> On Jun 3, 2015, at 12:50 PM, Doug Ledford wrote:
>
> On Mon, 2015-06-01 at 22:02 +0000, Jeff Squyres (jsquyres) wrote:
>> On May 22, 2015, at 9:44 AM, Doug
On Jun 17, 2015, at 10:25 AM, Doug Ledford wrote:
>
> The patch is accepted, I just haven’t pushed it out yet.
Is there a timeline for when this patch will be available in the upstream git
repo and released in a new version of libibverbs?
I ask because we'd like to see this patch get into upst
On Jun 18, 2013, at 2:49 PM, Jason Gunthorpe
wrote:
>> +int num_to_ibv_mtu(int num);
>
> Probably should be ibv_num_to_mtu() to keep with the naming pattern..
New patch coming momentarily, but I wanted to comment on this one:
I used the name "num_to_ibv_mtu" because it is in the spirit of t
On Jun 20, 2013, at 1:09 PM, "Hefty, Sean" wrote:
>> int ibv_rate_to_mult(enum ibv_rate rate);
>> enum ibv_rate mult_to_ibv_rate(int mult);
>>
>> int ibv_rate_to_mbps(enum ibv_rate rate);
>> enum ibv_rate mbps_to_ibv_rate(int mbps);
>
> libibverbs uses the "ibv_" prefix for pretty much everythi
On Jun 20, 2013, at 4:40 PM, Doug Ledford wrote:
>> {
>> static char str[16];
>> snprintf(str, sizeof(str), "%d", ibv_mtu_to_num(max_mtu));
>> return str;
>> }
>
> That is not, however, multi-thread safe nor advisable unless you clearly
> indicate in the man page to the function
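One common way around the static-buffer problem raised here is to have the caller supply the storage. The following is a minimal sketch of that pattern, not the function from the patch: `sketch_mtu_str` is a hypothetical name, and it takes the MTU as a plain byte count rather than going through `ibv_mtu_to_num()`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format an MTU byte count into a caller-provided
 * buffer. No static storage is used, so concurrent callers with their
 * own buffers do not interfere with each other.
 * Returns buf on success, or NULL if the buffer is too small. */
const char *sketch_mtu_str(int mtu_bytes, char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "%d", mtu_bytes);
    if (n < 0 || (size_t)n >= buflen)
        return NULL;    /* encoding error or truncated output */
    return buf;
}
```

A caller would declare something like `char str[16];` on its own stack and pass `str, sizeof(str)`, which keeps the 16-byte size from the quoted code but moves ownership of the buffer to the caller.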
On Jun 21, 2013, at 5:20 PM, Jason Gunthorpe
wrote:
> Jeff: If you are still reading -
I am still reading, just didn't have much to contribute until now. :-)
> one concrete suggestion, I think, is
> to ensure compile-time failure when the new-format MTU variable is
> touched. This is trivial
Bump.
On Jul 2, 2013, at 8:31 AM, Jeff Squyres wrote:
> (Previous patch did not include updates for the man pages)
>
> Keep IBV_MTU_* enums values as they are, but pass MTU values around as
> a struct containing a single int.
>
> Per lengthy discussion on the linux-rdma list, this patch intro
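As a rough illustration of why wrapping the value in a struct forces source changes, here is a minimal sketch. The names (`sketch_mtu`, `sketch_mtu_from_bytes`, and so on) are hypothetical, and storing a byte count in the struct is an assumption made for this example; the actual layout and accessors in the patch may differ.

```c
#include <assert.h>

/* Hypothetical wrapper: an MTU carried as a struct holding a single int.
 * Because it is a struct, code such as `switch (m)` or `m == 3` no
 * longer compiles, so every use of the value has to be revisited. */
struct sketch_mtu {
    int bytes;
};

/* Build a wrapped MTU from a byte count. */
struct sketch_mtu sketch_mtu_from_bytes(int bytes)
{
    struct sketch_mtu m = { bytes };
    return m;
}

/* Read the byte count back out. */
int sketch_mtu_bytes(struct sketch_mtu m)
{
    return m.bytes;
}

/* Structs cannot be compared with ==, so provide an explicit helper. */
int sketch_mtu_equal(struct sketch_mtu a, struct sketch_mtu b)
{
    return a.bytes == b.bytes;
}
```

The compile-time breakage is the point of the design: an application that silently kept treating the value as an IBTA enum code would misread any new MTU.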
On Jul 5, 2013, at 3:11 PM, Roland Dreier wrote:
> So what happens if I have an old application binary, and I run against
> a new libibverbs without recompiling?
>
> Also it seems that I'm forced to change my source code to be able to
> compile against new libibverbs?
I previously sent an ABI-
On Jul 8, 2013, at 1:26 PM, Jason Gunthorpe
wrote:
> Jeff's patch doesn't break old binaries; old binaries running with
> normal IB MTUs work fine. The structure layouts all stay the same,
> etc.
FWIW, I did a simple test to confirm this. I installed a stock git HEAD
libibverbs into $HOME/l
Bump.
On Jul 10, 2013, at 8:14 AM, Jeff Squyres (jsquyres) wrote:
> On Jul 8, 2013, at 1:26 PM, Jason Gunthorpe
> wrote:
>
>> Jeff's patch doesn't break old binaries; old binaries running with
>> normal IB MTUs work fine. The structure layouts all stay the sa
Bump.
On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote:
> If the send size is less than the cap.max_inline_data reported by the
> qp, use the IBV_SEND_INLINE flag. This not only shows the example of
> using ibv_query_qp(); it also reduces the latency time shown by the
> pingpong programs when th
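The decision described in this patch summary can be sketched as a small helper. To stay compilable without `<infiniband/verbs.h>`, the sketch uses a stand-in constant `SKETCH_SEND_INLINE` instead of the real `IBV_SEND_INLINE`, and `sketch_send_flags` is a hypothetical name. In a real program the limit would presumably be read from `cap.max_inline_data` after calling `ibv_query_qp()`, as the description says. The strict "less than" comparison mirrors the wording above and may not match the patch exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for IBV_SEND_INLINE so this compiles without libibverbs
 * headers; real code would use the libibverbs constant instead. */
#define SKETCH_SEND_INLINE (1u << 3)

/* Return the send flags to use for a message of msg_size bytes, given
 * the max_inline_data limit reported for the queue pair. The inline
 * flag is added only when the message is smaller than that limit. */
unsigned int sketch_send_flags(uint32_t msg_size, uint32_t max_inline_data,
                               unsigned int base_flags)
{
    if (msg_size < max_inline_data)
        return base_flags | SKETCH_SEND_INLINE;
    return base_flags;
}
```

Small messages then carry the inline flag, which is what the description credits for the lower latency reported by the pingpong examples.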
On Jul 16, 2013, at 10:47 AM, Jason Gunthorpe
wrote:
> A source change is completely unavoidable. Supporting the new MTU
> values requires updated source.
I don't really care one way or the other; I'll submit whatever patch people
want. :-)
But FWIW, I tend to believe the Doug/Jason positio
On Jul 17, 2013, at 12:06 AM, "Hefty, Sean" wrote:
> I don't remember. Is it known how the mtu is communicated with the kernel?
I hadn't looked at the kernel side yet; I was waiting for the userspace side to
sort itself out first.
> Looking at kern-abi.h, the mtu fields are:
>
> struct ibv
On Jul 17, 2013, at 5:44 PM, Steve Wise wrote:
> The iwarp drivers just report the nearest mtu enum. Apps don't need it for
> iwarp like they do for ib.
For RC, it doesn't matter much. So the fact that RoCE and iWARP lie about
their MTU isn't a huge deal. It's wrong, but it doesn't matter
Bump bump.
On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote:
> If the send size is less than the cap.max_inline_data reported by the
> qp, use the IBV_SEND_INLINE flag. This not only shows the example of
> using ibv_query_qp(); it also reduces the latency time shown by the
> pingpong programs wh
Bump bump bump.
I know this isn't a huge / important patch, but it is a small thing that does
decrease the latency reported by these example programs.
On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote:
> If the send size is less than the cap.max_inline_data reported by the
> qp, use the IBV_SEN
On Jul 18, 2013, at 12:50 PM, Jason Gunthorpe
wrote:
>> We need it for UD for our upcoming device, however, because the MTU
>> is the only way to get the max message size.
>
> .. and UD is the least abstracted transport, so existing apps won't
> support Jeff's new NIC anyhow, MTU is the least o
4th bump...
On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote:
> If the send size is less than the cap.max_inline_data reported by the
> qp, use the IBV_SEND_INLINE flag. This not only shows the example of
> using ibv_query_qp(); it also reduces the latency time shown by the
> pingpong programs w
On Jul 23, 2013, at 9:26 AM, Jeff Squyres (jsquyres) wrote:
>> .. and UD is the least abstracted transport, so existing apps won't
>> support Jeff's new NIC anyhow, MTU is the least of their problems.
>>
>> Existing apps with existing transports see the
On Jul 30, 2013, at 12:44 PM, Christoph Lameter wrote:
> What in the world does that mean? I am an oldtimer I guess. Seems that
> this is something that can be done in the newfangled forum? How does this
> affect mailing lists?
I'm not sure what you're asking me; please see the prior posts on t
On Aug 19, 2013, at 4:19 PM, Jason Gunthorpe
wrote:
> What about doing query port in this case and returning that value,
> decoded to an enum? Otherwise apps have to include that logic anyhow.
>
> I'm assuming the kernel will do basically the same?
>
> Basically, the only failure for this call
On Aug 19, 2013, at 5:18 PM, "Hefty, Sean" wrote:
>> Bumped the ABI version to 7 (the new verb will return -ENOSYS if
>> abi_verb is < 7).
>
> How does this break the ABI?
It doesn't *break* the ABI, but it does add a new downcall into the kernel.
That requires bumping the ABI version to 7,
On Aug 19, 2013, at 6:07 PM, "Hefty, Sean" wrote:
>> It doesn't *break* the ABI, but it does add a new downcall into the kernel.
>> That requires bumping the ABI version to 7, no?
>
> No - adding a new command is fine. Older kernels will return ENOSYS if that
> command is not supported. In th
On Aug 19, 2013, at 6:36 PM, "Hefty, Sean" wrote:
> This breaks the libibverbs ABI. You can't modify ibv_context_ops because it
> changes struct ibv_context.
Any suggestions on how one adds a new driver call without breaking ABI?
On Aug 19, 2013, at 8:59 PM, "Hefty, Sean" wrote:
>> Any suggestions on how one adds a new driver call without breaking ABI?
>
> It could be built on the verbs extension mechanism.
Where is the documentation for this? Multiple people have referred to it, but
I don't see any mention of it in l
Bump. This is V2 of the patch, which removes the ABI issue: libibverbs
directly calls the command in the kernel (without going through the provider
plugin).
On Aug 21, 2013, at 5:22 PM, Jeff Squyres wrote:
> Per lengthy discussion on the linux-rdma list, add a new verb to get
> max datagram s
Any further comments on this?
Doug -- does it look ok to you?
> On Dec 7, 2015, at 5:27 AM, Haggai Eran wrote:
>
> On 12/04/2015 01:09 AM, Jeff Squyres wrote:
>> The default value of 8 is too small to read
>> /sys/class/infiniband/usnic_x/node_type, which contains "6: usNIC
>> UDP". Per a7a73