RE: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-09 Thread Veeraiyan, Ayyappan
From: Neil Horman [mailto:[EMAIL PROTECTED]]
 Replying to myself...
 I've looked through the driver pretty thoroughly with regards to my above
 concern, and it appears the driver is reasonably free of netpoll issues at
 the moment, at least as far as what we found in e1000 was concerned.  I do

Thanks for reviewing the code..

 however, see a concern in the use of the in_netpoll flag within the driver.
 Given that the primary registered net_device, and all the dummy net_devices
 in the rx_ring point to the same ixgbe_adapter structure, there can be some
 level of confusion over whether a given rx queue is in netpoll mode or not.

The revised driver I am going to post today will not have fake
netdevs...

 When the primary adapter performs a netpoll, all the individual rx queues
 will follow the in_netpoll path in the receive path (assuming MSI-X
 interrupts are used).  The result I think is the potential for a large
 amount of packet reordering during a netpoll operation.  Perhaps not a
 serious problem, but likely worth looking into further.

Multiple Rx queues are used in non-NAPI mode only, and all Rx queues use
one netdev (which is associated with the adapter struct). Also, the RSS
(receive side scaling or rx packet steering) feature is used in multiple
rx queues mode. In this mode, HW will always select the same Rx queue
(for a flow) and this should prevent any packet reordering issue.


Neil

Ayyappan
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-09 Thread Neil Horman
On Mon, Jul 09, 2007 at 07:21:24AM -0700, Veeraiyan, Ayyappan wrote:
 From: Neil Horman [mailto:[EMAIL PROTECTED]]
 Replying to myself...
  I've looked through the driver pretty thoroughly with regards to my above
  concern, and it appears the driver is reasonably free of netpoll issues at
  the moment, at least as far as what we found in e1000 was concerned.  I do
 
 Thanks for reviewing the code..
 
  however, see a concern in the use of the in_netpoll flag within the driver.
  Given that the primary registered net_device, and all the dummy net_devices
  in the rx_ring point to the same ixgbe_adapter structure, there can be some
  level of confusion over whether a given rx queue is in netpoll mode or not.
 
 The revised driver I am going to post today will not have fake
 netdevs...
 
  When the primary adapter performs a netpoll, all the individual rx queues
  will follow the in_netpoll path in the receive path (assuming MSI-X
  interrupts are used).  The result I think is the potential for a large
  amount of packet reordering during a netpoll operation.  Perhaps not a
  serious problem, but likely worth looking into further.
 
 Multiple Rx queues are used in non-NAPI mode only, and all Rx queues use
 one netdev (which is associated with the adapter struct). Also, the RSS
 (receive side scaling or rx packet steering) feature is used in multiple
 rx queues mode. In this mode, HW will always select the same Rx queue
 (for a flow) and this should prevent any packet reordering issue.
 
 
 Neil
 
 Ayyappan

Thank you, I think that satisfies all my concerns.

Regards
Neil


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-06 Thread Ingo Oeser
Hi Auke,

Kok, Auke schrieb:
 tg3.c:
 
   if ((tp->tg3_flags & TG3_FLAG_PCIX_TARGET_HWBUG) ||
       (tp->tg3_flags2 & TG3_FLG2_ICH_WORKAROUND))
 
 is obviously _not_ easier to read than
 
  if (tp->tg3_flags.pcix_target_hwbug || tp->tg3_flags.ich_workaround)
 

Yes, but what about:

static bool tg3_has_pcix_target_hwbug(const struct tg3 *tp)
{
	return (tp->tg3_flags & TG3_FLAG_PCIX_TARGET_HWBUG) != 0;
}

static bool tg3_has_ich_workaround(const struct tg3 *tp)
{
	return (tp->tg3_flags2 & TG3_FLG2_ICH_WORKAROUND) != 0;
}

if (tg3_has_pcix_target_hwbug(tp) || tg3_has_ich_workaround(tp)) {
	/* do foo */
}

That is just as nice as bitfields and makes that stuff more readable.
This is also a migration path while going from bitfields to flags incrementally.

If you find out that two of these tests are always done together you could even 
test two of them in one.

 I would say that this method is definitely worse for readability. 

Yes, open coding this bit-masking mess DOES make the code hard to read!

 I would much rather then stick with 'bool' instead.

If you can afford the space, that is just as good. If you are undecided,
try the accessors. You can even write code, which generates them once. 

Maybe we could get such nice script for generating 
a set of bitmasks and accessors in the kernel :-)

The source would then be a struct definition with bitfields.

All points raised by the people here are actually valid and I consider
bitfields nice and clear for specification and example code, 
but avoid them while doing real coding.


Best Regards

Ingo Oeser


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-05 Thread Neil Horman
On Tue, Jul 03, 2007 at 08:53:46AM -0400, Neil Horman wrote:
 On Mon, Jul 02, 2007 at 12:00:29PM -0700, Veeraiyan, Ayyappan wrote:
  On 7/2/07, Jeff Garzik [EMAIL PROTECTED] wrote:
   Ayyappan Veeraiyan wrote:
+#define IXGBE_TX_FLAGS_VLAN_MASK	0xffff0000
+#define IXGBE_TX_FLAGS_VLAN_SHIFT	16
   
    defining bits using the form (1 << n) is preferred.  Makes it easier
    to read, by eliminating the requirement of the human brain to decode
    hex into bit numbers.
   
   
  
  Ok.
  
+ struct net_device netdev;
+ };
   
   Embedded a struct net_device into your ring?  How can I put this?
   
   Wrong, wrong.  Wrong, wrong, wrong.  Wrong.
   
  
  Agreed. 
  Fake netdevs are needed for doing the multiple Rx queues in NAPI mode.
  We are thinking to solve in 2 ways. Having netdev pointer in ring
  structure or having an array of netdev pointers in ixgbe_adapter struct
  which would be visible to all rings.
  
 If thats what you're using the netdev pointers in the ring structure for, I'd
 like to remind you that you tried using Fake netdevs in e1000 last year, and 
 the
 results were crashes and potential data corruption.
 
 http://marc.info/?l=linux-netdevm=114954878711789w=2
 
 The consensus there at the
 end of the above thread was to revert out the e1000 multiple rx queue code 
 until
 such time as the netpoll code could be updated to better interoperate with
 multiple rx queues (Andy Grover I believe was the person who volunteered for 
 the
 job).  Given that I've not seen any netpoll updates regarding netpoll and
 multi-rx since then, I'm guessing the use of fake netdevs is still a problem.
 I've not checked this driver for all of the potential problems we found, but I
 will, and let you know if I see any.
 
 Regards
 Neil
  
Replying to myself...
I've looked through the driver pretty thoroughly with regards to my above
concern, and it appears the driver is reasonably free of netpoll issues at the
moment, at least as far as what we found in e1000 was concerned.  I do however,
see a concern in the use of the in_netpoll flag within the driver.  Given that
the primary registered net_device, and all the dummy net_devices in the rx_ring
point to the same ixgbe_adapter structure, there can be some level of confusion
over whether a given rx queue is in netpoll mode or not.  When the primary
adapter performs a netpoll, all the individual rx queues will follow the
in_netpoll path in the receive path (assuming MSI-X interrupts are used).  The
result I think is the potential for a large amount of packet reordering during a
netpoll operation.  Perhaps not a serious problem, but likely worth looking into
further.

Regards
Neil


-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***/


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-05 Thread Jeff Garzik

Inaky Perez-Gonzalez wrote:

On Tuesday 03 July 2007, Jeff Garzik wrote:

Inaky Perez-Gonzalez wrote:
Accesses to bitfields are not atomic within the machine int in which they 
are stored... you need to unpack the values stored in bitfields, even 
if they are single-bit bitfields.


Which we do manually when we don't use bitfields. Again, conceptually,
there is no difference.


Practically speaking -- there are differences.

The manual method hides nothing from the programmer, while use of 
bitfields hides the lack of atomicity.


When you have programmers who make mistakes -- i.e. real humans -- these 
things matter.


But overall, it is not any one detail that discourages use of bitfields; 
it is the sum of all the reasons.  Practical experience, compiler 
technology, mistakes made (and not made), all point to avoiding bitfields.


Jeff




Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-03 Thread Neil Horman
On Mon, Jul 02, 2007 at 12:00:29PM -0700, Veeraiyan, Ayyappan wrote:
 On 7/2/07, Jeff Garzik [EMAIL PROTECTED] wrote:
  Ayyappan Veeraiyan wrote:
   +#define IXGBE_TX_FLAGS_VLAN_MASK	0xffff0000
   +#define IXGBE_TX_FLAGS_VLAN_SHIFT	16
  
  defining bits using the form (1 << n) is preferred.  Makes it easier
  to read, by eliminating the requirement of the human brain to decode
  hex into bit numbers.
  
  
 
 Ok.
 
   + struct net_device netdev;
   + };
  
  Embedded a struct net_device into your ring?  How can I put this?
  
  Wrong, wrong.  Wrong, wrong, wrong.  Wrong.
  
 
 Agreed. 
 Fake netdevs are needed for doing the multiple Rx queues in NAPI mode.
 We are thinking to solve in 2 ways. Having netdev pointer in ring
 structure or having an array of netdev pointers in ixgbe_adapter struct
 which would be visible to all rings.
 
If thats what you're using the netdev pointers in the ring structure for, I'd
like to remind you that you tried using Fake netdevs in e1000 last year, and the
results were crashes and potential data corruption.

http://marc.info/?l=linux-netdevm=114954878711789w=2

The consensus there at the
end of the above thread was to revert out the e1000 multiple rx queue code until
such time as the netpoll code could be updated to better interoperate with
multiple rx queues (Andy Grover I believe was the person who volunteered for the
job).  Given that I've not seen any netpoll updates regarding netpoll and
multi-rx since then, I'm guessing the use of fake netdevs is still a problem.
I've not checked this driver for all of the potential problems we found, but I
will, and let you know if I see any.

Regards
Neil
 

-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***/


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-03 Thread Jeff Garzik

Inaky Perez-Gonzalez wrote:

I don't think bitfields are broken. Maybe it's the compiler what should be 
fixed (*)


Then you do not understand bitfields.  It is -axiomatic- that bitfields 
are more difficult for compilers to implement.


Accesses to bitfields are not atomic within the machine int in which they 
are stored... you need to unpack the values stored in bitfields, even 
if they are single-bit bitfields.


You cannot set multiple bitfields at one time, without even more complex 
data structures.  You cannot compare and test multiple bitfields at one 
 time.


Humans have proven in kernel-land to screw up bitfields repeatedly.

The evidence is plain to see:


union {
  struct {
u32 reserved1:15;
u32 val:2;
   } __attribute__((packed))
   u32 data;
} value;


Using u32 flags, and nothing else, is just so much more simple and 
obvious.


Finally, this is -nothing new-.  I've been telling other driver writers 
not to use bitfields in their drivers.  Google for 'garzik' and 'bitfield'.


Jeff





Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-03 Thread Inaky Perez-Gonzalez
On Tuesday 03 July 2007, Jeff Garzik wrote:
 Inaky Perez-Gonzalez wrote:
  I don't think bitfields are broken. Maybe it's the compiler what should be 
  fixed (*)
 
 Then you do not understand bitfields.  It is -axiomatic- that bitfields 
 are more difficult for compilers to implement.
 
 Accesses to bitfields are not atomic within the machine int in which they 
 are stored... you need to unpack the values stored in bitfields, even 
 if they are single-bit bitfields.

Which we do manually when we don't use bitfields. Again, conceptually,
there is no difference.

 You cannot set multiple bitfields at one time, without even more complex 
 data structures.  You cannot compare and test multiple bitfields at one 
   time.

But I am not saying that you only use bitfields. There are cases where
they make the code much more readable (for me, at least), and cases
(like the one you mention) where they do not [this is why I use unions,
btw]..

 ...
  
 Finally, this is -nothing new-.  I've been telling other driver writers 
 not to use bitfields in their drivers.  Google for 'garzik' and 'bitfield'.

I know, I am just disagreeing. This is one of those things each person is
entrenched on and is impossible to change (like emacs vs vi).


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Ayyappan Veeraiyan wrote:

+#define IXGBE_TX_FLAGS_CSUM		0x0001
+#define IXGBE_TX_FLAGS_VLAN		0x0002
+#define IXGBE_TX_FLAGS_TSO		0x0004
+#define IXGBE_TX_FLAGS_IPV4		0x0008
+#define IXGBE_TX_FLAGS_VLAN_MASK	0xffff0000
+#define IXGBE_TX_FLAGS_VLAN_SHIFT	16


defining bits using the form (1 << n) is preferred.  Makes it easier 
to read, by eliminating the requirement of the human brain to decode hex 
into bit numbers.




+   union {
+   /* To protect race between sender and clean_tx_irq */
+   spinlock_t tx_lock;
+   struct net_device netdev;
+   };


Embedded a struct net_device into your ring?  How can I put this?

Wrong, wrong.  Wrong, wrong, wrong.  Wrong.




+   struct ixgbe_queue_stats stats;
+
+   u32 eims_value;
+   u32 itr_val;
+   u16 itr_range;
+   u16 itr_register;
+
+   char name[IFNAMSIZ + 5];


The interface name should not be stored by your ring structure




+#define IXGBE_DESC_UNUSED(R) \
+	((((R)->next_to_clean > (R)->next_to_use) ? 0 : (R)->count) + \
+	(R)->next_to_clean - (R)->next_to_use - 1)
+
+#define IXGBE_RX_DESC_ADV(R, i)	\
+	(&(((union ixgbe_adv_rx_desc *)((R).desc))[i]))
+#define IXGBE_TX_DESC_ADV(R, i)	\
+	(&(((union ixgbe_adv_tx_desc *)((R).desc))[i]))
+#define IXGBE_TX_CTXTDESC_ADV(R, i)	\
+	(&(((struct ixgbe_adv_tx_context_desc *)((R).desc))[i]))
+
+#define IXGBE_MAX_JUMBO_FRAME_SIZE 16128
+
+/* board specific private data structure */
+struct ixgbe_adapter {
+   struct timer_list watchdog_timer;
+   struct vlan_group *vlgrp;
+   u16 bd_number;
+   u16 rx_buf_len;
+   u32 part_num;
+   u32 link_speed;
+   unsigned long io_base;


Kill io_base and stop setting netdev->base_addr



+   atomic_t irq_sem;
+   struct work_struct reset_task;
+
+   /* TX */
+   struct ixgbe_ring *tx_ring; /* One per active queue */
+   u64 restart_queue;
+   u64 lsc_int;
+   u32 txd_cmd;
+   u64 hw_tso_ctxt;
+   u64 hw_tso6_ctxt;
+   u32 tx_timeout_count;
+   boolean_t detect_tx_hung;
+
+   /* RX */
+   struct ixgbe_ring *rx_ring; /* One per active queue */
+   u64 hw_csum_tx_good;
+   u64 hw_csum_rx_error;
+   u64 hw_csum_rx_good;
+   u64 non_eop_descs;
+   int num_tx_queues;
+   int num_rx_queues;
+   struct msix_entry *msix_entries;
+
+   u64 rx_hdr_split;
+   u32 alloc_rx_page_failed;
+   u32 alloc_rx_buff_failed;
+	struct {
+		unsigned int rx_csum_enabled:1;
+		unsigned int msi_capable:1;
+		unsigned int msi_enabled:1;
+		unsigned int msix_capable:1;
+		unsigned int msix_enabled:1;
+		unsigned int imir_enabled:1;
+		unsigned int in_netpoll:1;
+	} flags;


always avoid bitfields.  They generate horrible code, and endian 
problems abound (though no endian problems are apparent here).




+   u32 max_frame_size; /* Maximum frame size supported */
+   u32 eitr;   /* Interrupt Throttle Rate */
+
+   /* OS defined structs */
+   struct net_device *netdev;
+   struct pci_dev *pdev;
+   struct net_device_stats net_stats;
+
+   /* structs defined in ixgbe_hw.h */
+   struct ixgbe_hw hw;
+   u16 msg_enable;
+   struct ixgbe_hw_stats stats;
+   char lsc_name[IFNAMSIZ + 5];


delete lsc_name and use netdev name directly in request_irq()



Will review more after you fix these problems.



Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Arjan van de Ven

 +u32 alloc_rx_buff_failed;

+struct {
+unsigned int rx_csum_enabled:1;
+unsigned int msi_capable:1;
+unsigned int msi_enabled:1;
+unsigned int msix_capable:1;
+unsigned int msix_enabled:1;
+unsigned int imir_enabled:1;
+unsigned int in_netpoll:1;
+} flags;


always avoid bitfields.  They generate horrible code, and endian 
problems abound (though no endian problems are apparent here).


they generate no worse code than open coding the checks for these 
feature flags...



Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Arjan van de Ven wrote:

  +u32 alloc_rx_buff_failed;

+struct {
+unsigned int rx_csum_enabled:1;
+unsigned int msi_capable:1;
+unsigned int msi_enabled:1;
+unsigned int msix_capable:1;
+unsigned int msix_enabled:1;
+unsigned int imir_enabled:1;
+unsigned int in_netpoll:1;
+} flags;


always avoid bitfields.  They generate horrible code, and endian 
problems abound (though no endian problems are apparent here).


they generate no worse code than open coding the checks for these 
feature flags...


That would be the logical assumption, but reality does not bear that 
logic out to be true.


I'm not sure whether gcc is confused about ABI alignment or such, on 
various platforms, but every time I look at the asm generated it is 
/not/ equivalent to open-coded feature flags and bitwise logic.  Often 
it is demonstrably worse.


The same is true for icc too :)  Bitfields are just bad juju for 
compilers, I guess.


Jeff





Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Arjan van de Ven

Jeff Garzik wrote:
always avoid bitfields.  They generate horrible code, and endian 
problems abound (though no endian problems are apparent here).


they generate no worse code than open coding the checks for these 
feature flags...


That would be the logical assumption, but reality does not bear that 
logic out to be true.




I just checked a small example and gcc just generates a testb with an 
immediate value, which isn't all that bad code.


Do you remember which gcc you tested with?


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Arjan van de Ven

Jeff Garzik wrote:
I'm not sure whether gcc is confused about ABI alignment or such, on 
various platforms, but every time I look at the asm generated it is 
/not/ equivalent to open-coded feature flags and bitwise logic.  Often 
it is demonstrably worse.


(I can imagine this being different if you forcefully pack your 
structure to align with hw masks, but that is obviously not the case here)



Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Arjan van de Ven wrote:

Jeff Garzik wrote:
always avoid bitfields.  They generate horrible code, and endian 
problems abound (though no endian problems are apparent here).


they generate no worse code than open coding the checks for these 
feature flags...


That would be the logical assumption, but reality does not bear that 
logic out to be true.




I just checked a small example and gcc just generates a testb with an 
immediate value, which isn't all that bad code.


Do you remember which gcc you tested with?


gcc 2.95, gcc 3.x, gcc 4.x, ...  on multiple architectures, not just ia32.

It's simple logic:  using machine integers is the easiest way for the 
compiler to Do The Right Thing, the easiest way to eliminate endian 
problems, and the easiest way for programmers to read and understand struct 
alignment.


Just say no to bitfields.

Jeff





Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Arjan van de Ven

Jeff Garzik wrote:
It's simple logic:  using machine integers is the easiest way for the 
compiler to Do The Right Thing, the easiest way to eliminate endian 
problems, and the easiest way for programmers to read and understand struct 
alignment.


using pure integers is obviously natural..
but using integers and then manually doing bit masking yourself... 
that's not fundamentally better than what the compiler can do.


yes bitfields are hard for not-1-bit cases and for cases where you 
mimic hardware layouts. neither is the case in this code.
The code gets actually harder to read for the feature flags if you 
don't use bitfields... so unless the code is really worse (and so 
far I've not seen that, but I'll investigate more when I get to work), 
I think it's fair to use single-bit, non-packed bitfields for them...



Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Kok, Auke

Jeff Garzik wrote:

Arjan van de Ven wrote:

Jeff Garzik wrote:
always avoid bitfields.  They generate horrible code, and endian 
problems abound (though no endian problems are apparent here).
they generate no worse code than open coding the checks for these 
feature flags...
That would be the logical assumption, but reality does not bear that 
logic out to be true.


I just checked a small example and gcc just generates a testb with an 
immediate value, which isn't all that bad code.


Do you remember which gcc you tested with?


gcc 2.95, gcc 3.x, gcc 4.x, ...  on multiple architectures, not just ia32.

It's simple logic:  using machine integers is the easiest way for the 
compiler to Do The Right Thing, the easiest way to eliminate endian 
problems, and the easiest way for programmers to read and understand struct 
alignment.


I really disagree with you here, I much rather prefer using code style like:

	if (adapter->flags.msi_enabled) ..

than

	if (adapter->flags & E1000_FLAG_MSI_ENABLED) ...

not only does it read easier, it's also shorter and not prone to & / && confusion 
typo's, takes away parentheses when the test has multiple parts etc...


Maybe this is not most important for ixgbe, where we only have 8 or so flags, 
but the new e1000 driver that I posted this weekend currently has 63 (you wanted 
flags ;)) of them. Do you want me to use 63 integers or just 2 ?


And as Arjan said, we're not passing any of these to hardware, so there should 
not be any endian issues.


I think acme would agree with me that pahole currently is the easiest way to 
show struct alignment ...



Auke


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Kok, Auke wrote:
Maybe this is not most important for ixgbe, where we only have 8 or so 
flags, but the new e1000 driver that I posted this weekend currently has 
63 (you wanted flags ;)) of them. Do you want me to use 63 integers or 
just 2 ?



Don't be silly.  We are talking about single-bit feature flags 
implemented using machine ints (a la tg3 with 32 flags per int), versus 
bitfields.


Bitfields are to be avoided for many reasons:
* more difficult, in general, for a compiler to generate optimal code
* in particular, known to generate worse code on various architectures
* often causes endian problems
* often enhances programmer confusion, when trying to review structs and 
determine optimal layout and alignment
* programmers have proven they will screw up bitfields in e.g. cases 
with 1-bit fields and signedness.


I can probably think up more reasons to avoid bitfields if given another 
5 minutes :)


Jeff




Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Andrew Morton wrote:

On Mon, 02 Jul 2007 11:32:41 -0400 Jeff Garzik [EMAIL PROTECTED] wrote:


Bitfields are to be avoided for many reasons:
* more difficult, in general, for a compiler to generate optimal code
* in particular, known to generate worse code on various architectures
* often causes endian problems
* often enhances programmer confusion, when trying to review structs and 
determine optimal layout and alignment
* programmers have proven they will screw up bitfields in e.g. cases 
with 1-bit and signedness.


I can probably think up more reasons to avoid bitfields if given another 
5 minutes :)


A significant problem is that modifications to nearby bitfields need
locking: concurrent modifications to two bitfields can result in concurrent
modifications to the same word.

And that's OK, but it's pretty unobvious that these stores are nonatomic
from the source code and people could easily forget to do it.


Indeed.

Overall, it isn't any one specific thing that makes me reject bitfields 
in new code, it's the sum of all these reasons.


Kernel and compiler history proves that humans and compilers screw up 
bitfields early and often :)


Another reason that I just thought of:  bitfields cannot be 
conglomerated into easy-to-assign-and-test masks, making


	foo = (1 << 0),
	bar = (1 << 4),
	baz = (1 << 9),
	default_feature_flags = foo | bar | baz,

	foo->flags = default_feature_flags;

impossible with bitfields.

You also cannot test multiple bits at one time, with bitfields.



That being said, they _are_ attractive from the nice-to-read POV...


My personal style disagrees, but that's a personal taste.  I can see how 
other people might think that, though, agreed.


Jeff





RE: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Veeraiyan, Ayyappan
On 7/2/07, Jeff Garzik [EMAIL PROTECTED] wrote:
 Ayyappan Veeraiyan wrote:
  +#define IXGBE_TX_FLAGS_VLAN_MASK 0x
  +#define IXGBE_TX_FLAGS_VLAN_SHIFT16
 
 defining bits using the form (1  n) is preferred.  Makes it easier
 to read, by eliminating the requirement of the human brain to decode
hex
 into bit numbers.
 
 

Ok.

  + struct net_device netdev;
  + };
 
 Embedded a struct net_device into your ring?  How can I put this?
 
 Wrong, wrong.  Wrong, wrong, wrong.  Wrong.
 

Agreed. 
Fake netdevs are needed for doing the multiple Rx queues in NAPI mode.
We are thinking to solve in 2 ways. Having netdev pointer in ring
structure or having an array of netdev pointers in ixgbe_adatper struct
which would be visible to all rings.

  +
  + char name[IFNAMSIZ + 5];
 
 The interface name should not be stored by your ring structure


Yes, I agree and also (pointed out by someone before) this would break
if the user changes the interface name..
But, having the interrupt cause in the MSI-X vector name really helps in
debugging and helps the user also.

I think the below output is much better

[EMAIL PROTECTED] src]# cat /proc/interrupts | grep eth0
214:  0  0   PCI-MSI-edge  eth0-lsc
215:  11763  4   PCI-MSI-edge  eth0-rx7
216:  0  0   PCI-MSI-edge  eth0-rx6
217:  77324  0   PCI-MSI-edge  eth0-rx5
218:  0  0   PCI-MSI-edge  eth0-rx4
219:  52911  0   PCI-MSI-edge  eth0-rx3
220:  80271  0   PCI-MSI-edge  eth0-rx2
221:  80244  6   PCI-MSI-edge  eth0-rx1
222: 12  0   PCI-MSI-edge  eth0-rx0
223: 124870  28543   PCI-MSI-edge  eth0-tx0

Compared to 

[EMAIL PROTECTED] src]# cat /proc/interrupts | grep eth0
214:  0  0   PCI-MSI-edge  eth0
215:  11763  4   PCI-MSI-edge  eth0
216:  0  0   PCI-MSI-edge  eth0
217:  77324  0   PCI-MSI-edge  eth0
218:  0  0   PCI-MSI-edge  eth0
219:  52911  0   PCI-MSI-edge  eth0
220:  80271  0   PCI-MSI-edge  eth0
221:  80244  6   PCI-MSI-edge  eth0
222: 12  0   PCI-MSI-edge  eth0
223: 124900  28543   PCI-MSI-edge  eth0

Since we wanted to distinguish the various MSI-X vectors in
/proc/interrupts and since request_irq expects memory for the name to be
allocated somewhere, I added this to the ring struct.

 
 Kill io_base and stop setting netdev->base_addr

In my latest internal version, I already removed the io_base member (and
a couple more from ixgbe_adapter) but am still setting netdev->base_addr.
I will remove that also..

  + struct ixgbe_hw_stats stats;
  + char lsc_name[IFNAMSIZ + 5];
 
 delete lsc_name and use netdev name directly in request_irq()
 

Please see the response for the name member of ring structure.
 
 
 Will review more after you fix these problems.
 

Thanks for the feedback. I will post another version shortly (except the
feature flags change and the ring struct name members) which fixes my
previous TODO list and also most of Francois' comments..

Ayyappan
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Ayyappan Veeraiyan

On 7/2/07, Veeraiyan, Ayyappan [EMAIL PROTECTED] wrote:


Thanks for the feedback. I will post another version shortly (except the
feature flags change and the ring struct name members) which fixes my


Just to clarify this, I will wait some more time for the feature flags
discussion and the ring structure name feedback..

Ayyappan


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Kok, Auke

Christoph Hellwig wrote:

On Mon, Jul 02, 2007 at 02:09:58PM -0700, Stephen Hemminger wrote:

The patch is close to ready for 2.6.24 when this driver will need to show up.


If Intel manages to fix up the remaining issues I'd rather see it appear
in 2.6.23..


Since I know Intel will be forced to backport this to older distros, you
would be best off having a single receive queue version when you have to
make it work on the older code.


But that'll require the single receive queue version I guess.  The
netdevice abuse is the only really major issue I see, although I'd of
course really like to see the driver getting rid of the bitfield abuse
as well.


well, FWIW when I started looking at adding these flags I looked in various 
subsystems in the kernel and picked an implementation that suited. Guess what 
pci.h has? ...:


unsigned int msi_enabled:1;
unsigned int msix_enabled:1;

this is literally where I copied the example from

I suppose I can fix those, but I really don't understand what all the fuss is 
about here. We're only conserving memory and staying far away from the real 
risks of bitmasks, so forgive me if I don't grasp the problem.


Honestly, if this is really considered Bad coding (TM) then we need to fix 
these prominent abuses of it too.


I count about 60 or so of these bitfields in drivers/net... (and countless more 
in other parts) !


Auke


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Kok, Auke wrote:
I suppose I can fix those, but I really don't understand what all the 
fuss is about here. We're only conserving memory and staying far away 
from the real risks of bitmasks, so forgive me if I don't grasp the 
problem.


Be it machine ints or bitfields, you're not conserving memory.  Think 
about struct padding.


As to the overall question, I already posted a long list of problems 
with bitfields.  Shall I repeat?


Jeff




Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Michael Buesch
On Tuesday 03 July 2007 00:02:57 Kok, Auke wrote:
 well, FWIW when I started looking at adding these flags I looked in various 
 subsystems in the kernel and picked an implementation that suited. Guess what 
 pci.h has? ...:
 
   unsigned int msi_enabled:1;
   unsigned int msix_enabled:1;
 
 this is literally where I copied the example from
 
 I suppose I can fix those, but I really don't understand what all the fuss is 
 about here. We're only conserving memory and staying far away from the real 

I'm not sure these bitfields actually _do_ conserve memory.
Generated code gets bigger (it needs bitwise masks and stuff).
Code also needs memory. It probably only conserves memory if the
structure is instantiated a lot.

-- 
Greetings Michael.


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Michael Buesch wrote:

On Tuesday 03 July 2007 00:02:57 Kok, Auke wrote:
well, FWIW when I started looking at adding these flags I looked in various 
subsystems in the kernel and picked an implementation that suited. Guess what 
pci.h has? ...:


unsigned int msi_enabled:1;
unsigned int msix_enabled:1;

this is literally where I copied the example from

I suppose I can fix those, but I really don't understand what all the fuss is 
about here. We're only conserving memory and staying far away from the real 


I'm not sure these bitfields actually _do_ conserve memory.
Generated code gets bigger (it needs bitwise masks and stuff).
Code also needs memory. It probably only conserves memory if the
structure is instantiated a lot.


Actually, that's a good point.  On several RISC architectures it 
certainly generates bigger code.


Jeff





RE: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Veeraiyan, Ayyappan
On 7/2/07, Stephen Hemminger [EMAIL PROTECTED] wrote:
  Fake netdevs are needed for doing multiple Rx queues in NAPI mode.
  We are thinking of solving this in two ways: having a netdev pointer in
  the ring structure, or having an array of netdev pointers in the
  ixgbe_adapter struct which would be visible to all rings.
 
 Wait until DaveM & I separate NAPI from the network device.
 The patch is close to ready for 2.6.24, when this driver will need to
 show up.
 
 Since I know Intel will be forced to backport this to older distros, you
 would be best off having a single receive queue version when you have to
 make it work on the older code.

So far all our testing indicates we need multiple Rx queues to get
better CPU utilization numbers at 10Gig line rate. So I am thinking of
adding non-NAPI support to the driver (like other 10Gig drivers) and
restricting it to a single Rx queue in the NAPI case. I already have the
non-NAPI version coded up and it went through internal testing. I will
add this in the next submission. We will add multiple Rx queue support in
NAPI mode once the separation of NAPI from the network device is done.
Does this sound ok?

 
 You only need to store the name for when you are doing request_irq, so
 it can just be a stack temporary.

request_irq() expects allocated memory, not just a stack temporary. I
glanced through the kernel source.. There are precedents for the way we
did it.

linux-2.6/source/drivers/usb/core/hcd.c

        /* enable irqs just before we start the controller */
        if (hcd->driver->irq) {
                snprintf(hcd->irq_descr, sizeof(hcd->irq_descr),
                                "%s:usb%d",
                                hcd->driver->description, hcd->self.busnum);
                if ((retval = request_irq(irqnum, usb_hcd_irq, irqflags,
                                hcd->irq_descr, hcd)) != 0) {
                        dev_err(hcd->self.controller,
                                        "request interrupt %d failed\n",
                                        irqnum);
                        goto err_request_irq;
                }

 
 Stephen Hemminger [EMAIL PROTECTED]

I appreciate the feedback.

Thanks,
Ayyappa


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Kok, Auke

Jeff Garzik wrote:

Michael Buesch wrote:

On Tuesday 03 July 2007 00:02:57 Kok, Auke wrote:
well, FWIW when I started looking at adding these flags I looked in various 
subsystems in the kernel and picked an implementation that suited. Guess what 
pci.h has? ...:


unsigned int msi_enabled:1;
unsigned int msix_enabled:1;

this is literally where I copied the example from

I suppose I can fix those, but I really don't understand what all the fuzz is 
about here. We're only conserving memory and staying far away from the real 

I'm not sure if these bitfields actually _do_ conserve memory.
Generated code gets bigger (need bitwise masks and stuff).
Code also needs memory. It probably only conserves memory, if the
structure is instanciated a lot.


Actually, that's a good point.  On several RISC architectures it 
certainly generates bigger code.


so, even on i386 it does (17 bytes and 6 instructions to test vs. 10 bytes and 
3 instructions if using a bool):


unsigned int:1:
 8048365:   0f b6 45 f8             movzbl 0xfff8(%ebp),%eax
 8048369:   0c 01                   or     $0x1,%al
 804836b:   88 45 f8                mov    %al,0xfff8(%ebp)
 804836e:   0f b6 45 f8             movzbl 0xfff8(%ebp),%eax
 8048372:   24 01                   and    $0x1,%al
 8048374:   84 c0                   test   %al,%al

bool:
 8048365:   c6 45 fb 01             movb   $0x1,0xfffb(%ebp)
 8048369:   0f b6 45 fb             movzbl 0xfffb(%ebp),%eax
 804836d:   84 c0                   test   %al,%al

unsigned int:
 8048365:   c7 45 f8 01 00 00 00    movl   $0x1,0xfff8(%ebp)
 804836c:   8b 45 f8                mov    0xfff8(%ebp),%eax
 804836f:   85 c0                   test   %eax,%eax

using var & flag:
 8048365:   c7 45 f8 01 00 00 00    movl   $0x1,0xfff8(%ebp)
 804836c:   8b 45 f8                mov    0xfff8(%ebp),%eax
 804836f:   25 00 04 00 00          and    $0x400,%eax
 8048374:   85 c0                   test   %eax,%eax

note that using var & flag is slightly larger here... 1 extra instruction and 
7 more bytes.


interesting...


but let's stay constructive here:

~/git/linux-2.6 $ grep -r 'unsigned int.*:1;' * | wc -l
748

Is anyone going to fix those? If we don't, someone will certainly submit 
patches adding more of these bitfields again; after all, some very prominent 
parts of the kernel still use them. Only recently, for instance, mac80211 
merged about 30 of these, and drivers/net, include, etc. certainly have a 
lot of them.


Cheers,


Auke


RE: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Veeraiyan, Ayyappan
On 7/2/07, Christoph Hellwig [EMAIL PROTECTED] wrote:
 
 But that'll require the single receive queue version I guess.  The
 netdevice abuse is the only really major issue I see, although I'd of
 course really like to see the driver getting rid of the bitfield abuse
 as well.

The submitted driver code supports a single queue version in case of MSI-X
allocation failures... As I said in the other mail, I feel restricting to a
single Rx queue in NAPI mode is the better approach till Stephen's and
DaveM's work of separating NAPI from the netdevice is done..

 Lots of drivers where the interface name is assigned after request_irq
 just use an internal name, e.g. ixgbeX in the case of this driver.
 

This sounds ok to me. 

With this change, this is the output..

[EMAIL PROTECTED] src]# ip link
1: lo: LOOPBACK,UP,1 mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: sit0: NOARP mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0
3: eth6: BROADCAST,MULTICAST,UP,1 mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:50:8b:05:5f:95 brd ff:ff:ff:ff:ff:ff
29: eth0: BROADCAST,MULTICAST,UP,1 mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:1b:21:01:e4:93 brd ff:ff:ff:ff:ff:ff
30: eth1: BROADCAST,MULTICAST mtu 1500 qdisc noop qlen 1000
    link/ether 00:1b:21:01:e4:92 brd ff:ff:ff:ff:ff:ff

[EMAIL PROTECTED] src]# cat /proc/interrupts | grep 29
214:  0  0   PCI-MSI-edge  ixgbe29-lsc
215:  11764  80213   PCI-MSI-edge  ixgbe29-rx7
216:  80257  0   PCI-MSI-edge  ixgbe29-rx6
217:  77331  0   PCI-MSI-edge  ixgbe29-rx5
218:  24201  0   PCI-MSI-edge  ixgbe29-rx4
219:  52911  0   PCI-MSI-edge  ixgbe29-rx3
220: 104591  0   PCI-MSI-edge  ixgbe29-rx2
221:  80249  8   PCI-MSI-edge  ixgbe29-rx1
222: 14  0   PCI-MSI-edge  ixgbe29-rx0
223: 194023 118220   PCI-MSI-edge  ixgbe29-tx0 

Ayyappan


Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Jeff Garzik

Kok, Auke wrote:

but let's stay constructive here:

~/git/linux-2.6 $ grep -r 'unsigned int.*:1;' * | wc -l
748

Is anyone going to fix those? If we don't, someone will certainly submit 
patches adding more of these bitfields again; after all, some very 
prominent parts of the kernel still use them. Only recently, for instance, 
mac80211 merged about 30 of these, and drivers/net, include, etc. 
certainly have a lot of them.



There is a lot of crap in the kernel.  Just look at all the grotty SCSI 
or char drivers, for example.  The first task is to avoid adding more 
crap :)


Jeff




Re: [PATCH] ixgbe: Introduce new 10GbE driver for Intel 82598 based PCI Express adapters...

2007-07-02 Thread Inaky Perez-Gonzalez
On Monday 02 July 2007, Kok, Auke wrote:
 Jeff Garzik wrote:
  Michael Buesch wrote:
  On Tuesday 03 July 2007 00:02:57 Kok, Auke wrote:
  well, FWIW when I started looking at adding these flags I looked in 
  various 
  subsystems in the kernel and picked an implementation that suited. Guess 
  what 
  pci.h has? ...:
 
unsigned int msi_enabled:1;
unsigned int msix_enabled:1;
 
  this is literally where I copied the example from
 
   I suppose I can fix those, but I really don't understand what all the 
   fuss is 
   about here. We're only conserving memory and staying far away from the 
   real 
   I'm not sure these bitfields actually _do_ conserve memory.
   Generated code gets bigger (it needs bitwise masks and stuff).
   Code also needs memory. It probably only conserves memory if the
   structure is instantiated a lot.
  
  Actually, that's a good point.  On several RISC architectures it 
  certainly generates bigger code.
 
...

 but let's stay constructive here:
 
 ~/git/linux-2.6 $ grep -r 'unsigned int.*:1;' * | wc -l
 748
 
 Is anyone going to fix those? If we don't, someone will certainly again 
 submit 
 patches to add more of these bitfields, after all, some very prominent parts 
 of 
 the kernel still use them. Only recently for instance mac80211 merged like 30 
 of 
 these and drivers/net, include etc.. certainly has a lot of these.

I don't think bitfields are broken. Maybe it's the compiler that should be 
fixed (*)

...bitfields are there to save the coder having to type masks by hand. There 
should be no difference in the generated code between doing

u32 value = readl(fromsomewhere);
printf("bits 16 & 15 0x%08x\n", (value & 0x00018000) >> 15);

and

union {
        struct {
                u32 reserved1:15;
                u32 val:2;
        } __attribute__((packed));
        u32 data;
} value;
value.data = readl(fromsomewhere);
printf("bits 16 & 15 0x%08x\n", value.val);

Granted, that looks big, but once you nail it down at your struct definitions, 
it makes maintenance much easier when checking against the bitmasks in the 
specs. Masks and shifts tend to suck in the long term as they accumulate.

-- Inaky


(*) not to mention the endianness mess