RE: Memory allocation modifications in ibm_newemac driver AND sil24 driver

2010-09-02 Thread Jonathan Haws
 Can anyone explain to me why I would be getting this error in the
 first place?  Why is it failing to allocate a page when there are
 pages available?  That does not make any sense to me.

 order:1

 It's failing to allocate -two- pages.

Good point.  However, why is it failing?  According to the dump, there are 28 
8k pages available.

From doing more testing yesterday, I found that this error is also coming up 
from the SATA driver (sil24).  With those errors, it fails allocating a page 
of order 0 when there are a few hundred of those available.  What is up with 
the VMM?  Why are pages failing to allocate when they are available?

Thanks,

Jonathan


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


RE: Memory allocation modifications in ibm_newemac driver

2010-09-01 Thread Jonathan Haws
I found out what was causing the crash, but still am not there and could use 
some direction:

What was happening was that I was not allocating a new SKB to replace the one 
in the ring that was being passed up the stack.  I have remedied that and am 
now having another issue:
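
The fix described above (put a fresh buffer into the ring before the filled one is passed up the stack) could look roughly like the following user-space sketch.  The names rx_ring and rx_refill_and_take, the buffer size, and the use of malloc() in place of the kernel's SKB allocator are all assumptions for illustration; this is not code from the ibm_newemac driver.

```c
#include <stddef.h>
#include <stdlib.h>

#define RX_RING_SIZE 64   /* ring size mentioned in the message */
#define RX_BUF_BYTES 2048 /* hypothetical buffer size */

/* Hypothetical stand-in for the driver's ring of receive buffers. */
struct rx_ring {
    void *buf[RX_RING_SIZE];
};

/*
 * Take the filled buffer out of `slot` and leave a fresh one in its place.
 * Returns the filled buffer (the caller now owns it), or NULL if no
 * replacement could be allocated. In that case the old buffer stays in the
 * ring so the slot is never left empty, and the packet is dropped.
 */
static void *rx_refill_and_take(struct rx_ring *ring, unsigned int slot)
{
    void *fresh = malloc(RX_BUF_BYTES);
    void *filled;

    if (fresh == NULL)
        return NULL;            /* recycle: ring slot is left untouched */

    filled = ring->buf[slot];
    ring->buf[slot] = fresh;    /* replacement goes in before hand-off */
    return filled;
}
```

Allocating first and swapping second means an allocation failure costs one dropped packet rather than a hole in the ring.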

Once the ring index rolls over (it does so at 64) I start to lose packets 
because they are not being handled correctly (or do not contain the correct 
headers or something of that sort).  Here is a simple ping test showing what is 
happening:

64 bytes from 172.31.22.21: seq=29 ttl=128 time=10.826 ms
emac/plb/opb/ether...@ef600900: PACKET: 54 0X1800 110 0XC04E6B80
emac/plb/opb/ether...@ef600900: PACKET: 55 0X1800 110 0XC04E6EC0
emac/plb/opb/ether...@ef600900: PACKET: 56 0X1800 98 0XC04E70C0
64 bytes from 172.31.22.21: seq=30 ttl=128 time=10.839 ms
emac/plb/opb/ether...@ef600900: PACKET: 57 0X1800 110 0XC04E6580
emac/plb/opb/ether...@ef600900: PACKET: 58 0X1800 98 0XC04E5B80
64 bytes from 172.31.22.21: seq=31 ttl=128 time=10.832 ms
emac/plb/opb/ether...@ef600900: PACKET: 59 0X1800 219 0XC04E5740
emac/plb/opb/ether...@ef600900: PACKET: 60 0X1800 249 0XC04E5000
emac/plb/opb/ether...@ef600900: PACKET: 61 0X1800 92 0XC04E5340
emac/plb/opb/ether...@ef600900: PACKET: 62 0X1800 98 0XC04E4A00
64 bytes from 172.31.22.21: seq=32 ttl=128 time=10.825 ms
emac/plb/opb/ether...@ef600900: PACKET: 63 0X5800 92 0XC04E4E00
emac/plb/opb/ether...@ef600900: PACKET: 0 0X1800 98 0XC04EF520
emac/plb/opb/ether...@ef600900: PACKET: 1 0X1800 92 0XC04D8340
emac/plb/opb/ether...@ef600900: PACKET: 2 0X1800 98 0XC04D8260
emac/plb/opb/ether...@ef600900: PACKET: 3 0X1800 60 0XC04E7AA0
emac/plb/opb/ether...@ef600900: PACKET: 4 0X1800 92 0XC04E74A0
emac/plb/opb/ether...@ef600900: PACKET: 5 0X1800 92 0XC04E6BC0
emac/plb/opb/ether...@ef600900: PACKET: 6 0X1800 98 0XC04E6F80
emac/plb/opb/ether...@ef600900: PACKET: 7 0X1800 92 0XC04E64E0
emac/plb/opb/ether...@ef600900: PACKET: 8 0X1800 98 0XC04E5BC0

The first number in my debug print statement is what the driver calls the slot 
number (the ring index).  When it rolls over I start losing the ping replies.  
What have I done wrong to cause that?  The data is coming in and is there.  
Have any other network device developers seen similar behavior?
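
In the log above, slot 63 is the only entry whose status reads 0X5800 rather than 0X1800.  That suggests, though nothing in this thread confirms it, that bit 0x4000 marks the last descriptor in the ring.  If so, one thing worth checking is whether that bit is put back when slot 63 is re-armed.  A minimal sketch of the idea follows; the bit values and function names are assumptions chosen to match the log, not definitions taken from the driver.

```c
#include <stdint.h>

#define RX_RING_SIZE  64u

/* Assumed descriptor bits, picked to match 0X1800 vs 0X5800 in the log. */
#define RX_CTRL_EMPTY 0x8000u  /* descriptor handed back to the hardware */
#define RX_CTRL_WRAP  0x4000u  /* last descriptor: hardware wraps to slot 0 */

/* Control word to write when giving a slot back to the hardware. */
static uint16_t rx_rearm_ctrl(unsigned int slot)
{
    uint16_t ctrl = RX_CTRL_EMPTY;

    if (slot == RX_RING_SIZE - 1)
        ctrl |= RX_CTRL_WRAP;   /* keep the wrap marker on the final slot */
    return ctrl;
}

/* Software-side index advance, wrapping at the ring size. */
static unsigned int rx_next_slot(unsigned int slot)
{
    return (slot + 1) % RX_RING_SIZE;
}
```

If the wrap marker were dropped on the first pass, the hardware and the driver would disagree about where the ring ends from the second pass onward.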

Thanks,

Jonathan



RE: Memory allocation modifications in ibm_newemac driver

2010-09-01 Thread Jonathan Haws
Okay, I think I have all the issues worked out and can now send and receive any 
size packet without a hiccup.  I have tested this in our system setup as well 
with data being sent out to disk and did not see any problems there either 
(since it only ever allocates a single page, never more).

Is this something that may be wanted in the mainline?  I have not run full 
benchmarks, but I anticipate that my modified driver is slightly slower than 
the mainline driver because we keep track of an SKB ring, as well as a ring of 
pages and allocate both on each packet received.

Thanks,

Jonathan



RE: Memory allocation modifications in ibm_newemac driver

2010-09-01 Thread Jonathan Haws
Apparently I spoke too soon - sorry about that.  I am still getting the error 
when I try to write to disk and receive on the network at the same time.  Here 
is the output:

blastee: page allocation failure. order:1, mode:0x4020
Call Trace:
[ccea9a40] [c0006ef0] show_stack+0x44/0x16c (unreliable)
[ccea9a80] [c006f9f0] __alloc_pages_nodemask+0x38c/0x4f8
[ccea9b00] [c0095008] __slab_alloc+0x594/0x5e0
[ccea9b40] [c0095a08] __kmalloc_track_caller+0xe8/0xf0
[ccea9b60] [c01c848c] __alloc_skb+0x60/0x140
[ccea9b80] [c01a7df8] emac_poll_rx+0x568/0x768
[ccea9bc0] [c01a28e4] mal_poll+0xa8/0x1ec
[ccea9bf0] [c01d3eec] net_rx_action+0x9c/0x1b4
[ccea9c20] [c003b3c0] __do_softirq+0xc4/0x148
[ccea9c60] [c0004d18] do_softirq+0x78/0x80
[ccea9c70] [c003b67c] local_bh_enable+0xc0/0xd8
[ccea9c80] [c01c29bc] lock_sock_nested+0xc0/0xdc
[ccea9cc0] [c0212cb4] udp_recvmsg+0x318/0x3a4
[ccea9d10] [c01c2334] sock_common_recvmsg+0x3c/0x60
[ccea9d30] [c01c06c4] sock_recvmsg+0xb8/0xf0
[ccea9e20] [c01c09b0] sys_recvfrom+0x8c/0xfc
[ccea9f00] [c01c18d4] sys_socketcall+0x128/0x1f8
[ccea9f40] [c000f434] ret_from_syscall+0x0/0x3c
Mem-Info:
DMA per-cpu:
CPU0: hi:   90, btch:  15 usd:  48
Active_anon:28 active_file:807 inactive_anon:85
 inactive_file:171 unevictable:0 dirty:0 writeback:0 unstable:0
 free:506 slab:53530 mapped:362 pagetables:19 bounce:0
DMA free:2024kB min:2036kB low:2544kB high:3052kB active_anon:112kB 
inactive_anon:340kB active_file:3228kB inactive_file:684kB unevictable:0kB 
present:260096kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 410*4kB 28*8kB 4*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 
0*2048kB 0*4096kB = 2024kB
978 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 0kB
Total swap = 0kB
65536 pages RAM
1400 pages reserved
1001 pages shared
62828 pages non-shared
SLUB: Unable to allocate memory on node -1 (gfp=0x20)
  cache: kmalloc-8192, object size: 8192, buffer size: 8192, default order: 3, 
min order: 1
  node 0: slabs: 7140, objs: 25809, free: 0

Can anyone explain to me why I would be getting this error in the first place?  
Why is it failing to allocate a page when there are pages available?  That does 
not make any sense to me.
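
The numbers in the dump above already hint at part of the answer: the zone reports free:2024kB against min:2036kB, so it is below its minimum watermark, and 410 of the 506 free pages are single 4kB pages that cannot satisfy an order-1 request.  The following is a rough model of that arithmetic, loosely following the way kernels of that era checked zone watermarks.  The relaxation applied for atomic requests and all function names are assumptions, so treat it as a sketch rather than the allocator's actual code.

```c
#include <stddef.h>

#define MAX_ORDERS 11

/* Free-list snapshot from the dump: number of free blocks at each order
 * (410*4kB 28*8kB 4*16kB 1*32kB 1*64kB, then zeros). */
static const unsigned long nr_free[MAX_ORDERS] = {
    410, 28, 4, 1, 1, 0, 0, 0, 0, 0, 0
};

/* Total free pages is the sum of (block count << order). */
static unsigned long total_free_pages(void)
{
    unsigned long pages = 0;

    for (int o = 0; o < MAX_ORDERS; o++)
        pages += nr_free[o] << o;
    return pages;
}

/*
 * Simplified watermark test for a request of the given order.
 * min_pages is the zone's "min" watermark in pages; `atomic` relaxes it
 * (assumed here: halve it, then take off another quarter).
 * Returns 1 if the request may proceed, 0 if it must fail.
 */
static int watermark_ok(unsigned int order, long min_pages, int atomic)
{
    long free_pages = (long)total_free_pages() - (1L << order) + 1;
    long min = min_pages;

    if (atomic) {
        min -= min / 2;
        min -= min / 4;
    }
    if (free_pages <= min)
        return 0;

    for (unsigned int o = 0; o < order; o++) {
        /* Blocks smaller than the request cannot be used for it. */
        free_pages -= (long)(nr_free[o] << o);
        /* Fewer free pages are required at each higher order. */
        min >>= 1;
        if (free_pages <= min)
            return 0;
    }
    return 1;
}
```

With the dump's figures (506 free pages, a min watermark of 2036kB / 4kB = 509 pages) this model lets an atomic order-0 request through but rejects an atomic order-1 request, because once the 410 single pages are set aside too few pages remain.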

Thanks,

Jonathan


RE: Disable Caching for mmap() address

2009-11-12 Thread Jonathan Haws
 On Mon, 2009-11-09 at 16:21 -0700, Jonathan Haws wrote:
  All,
 
  I would like to disable caching for an address that was returned
 from a call to mmap().  I am using this address for DMA operations
 in user space and want to make sure that the data cache is turned
 off for that buffer.
 
  The way this works is the driver simply takes an address I provide
 and begins a DMA operation to that location in RAM (I have ensured
 that this is a physical address I am passing already).  When the DMA
 is complete, an interrupt fires and the ISR gives a semaphore that
 the user space application is pending on (RT_SEM from Xenomai).  I
 have tried simply calling a cache invalidate routine in the ISR
 before I give the semaphore, but the kernel crashes when I try to
 call that routine - my guess is because the kernel does not have
 direct access to that location in memory (only my application does,
 according to the MMU).
 
  Anyway, all I want to do is make sure that the buffer is never
 stored in the cache and that I always fetch it from RAM.  How can I
 specify that using mmap() on the /dev/mem device, or is there a
 better way to accomplish this?
 
 There is no proper way to do this, in large part because it's not
 always legal to map memory non-cached for various reasons I don't
 have time to explain right now...
 
 You may be able to get it working though by using a specific driver
 with an mmap function that tweaks the attributes or using mmap
 of /dev/mem after opening it with O_SYNC (off the top of my mind)
 
 But it's a bit fishy as the kernel has a cacheable mapping of most
 of memory and so you may end up with cache aliases...


Thanks for the response, Ben.  I am hoping that by passing a mem= argument to 
the kernel at boot time, the memory that I am setting aside for my DMA will be 
kept hidden from the kernel and the MMU.  I am then mapping that memory in user 
space with mmap() on /dev/mem and that descriptor is being opened with the 
O_SYNC flag.  I just wanted to make sure I was covering all my bases.

Thanks,

Jonathan





Disable Caching for mmap() address

2009-11-09 Thread Jonathan Haws
All,

I would like to disable caching for an address that was returned from a call to 
mmap().  I am using this address for DMA operations in user space and want to 
make sure that the data cache is turned off for that buffer.

The way this works is the driver simply takes an address I provide and begins a 
DMA operation to that location in RAM (I have ensured that this is a physical 
address I am passing already).  When the DMA is complete, an interrupt fires 
and the ISR gives a semaphore that the user space application is pending on 
(RT_SEM from Xenomai).  I have tried simply calling a cache invalidate routine 
in the ISR before I give the semaphore, but the kernel crashes when I try to 
call that routine - my guess is because the kernel does not have direct access 
to that location in memory (only my application does, according to the MMU).

Anyway, all I want to do is make sure that the buffer is never stored in the 
cache and that I always fetch it from RAM.  How can I specify that using mmap() 
on the /dev/mem device, or is there a better way to accomplish this?

Thanks,

Jonathan





Invalidate Data Cache from User Space

2009-11-09 Thread Jonathan Haws
All,

I have a routine to invalidate the data cache from user space (since I do not 
believe there is a standard routine I can use outside of kernel space??).

Here is the code:

    .text;
    .globl cacheInvalidate405;
cacheInvalidate405:

    /*
     *   r3 = Data cache
     *   r4 = address
     *   r5 = number of bytes
     */

    cmpwi   r5,0            /* make sure number of bytes is > 0 */
    beq     invalDone
    add     r6,r4,r5
    addi    r6,r6,31
    rlwinm  r6,r6,0,0,26    /* end addr to start of next cache line */
    rlwinm  r7,r4,0,0,26    /* start address back to start of line  */
    sub     r6,r6,r7
    srawi   r6,r6,5         /* divide by 32 to get number of lines  */
    mtctr   r6
invalLoop:
    dcbi    r0,r4           /* THIS INSTRUCTION FAILS! */
    addi    r4,r4,32
    bdnz    invalLoop
    sync
invalDone:
    blr
    .size   cacheInvalidate405, . - cacheInvalidate405

What is happening is the dcbi instruction will fail.  I get an Illegal 
Instruction message on the console and my program exits.

Is there a reason I cannot call dcbi from a user space application, or is there 
something wrong in my code?  Even better, is there a working and tested 
function that I can call from user space to invalidate a portion of the data 
cache?
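
The line-count arithmetic in the routine above (round the end address up and the start address down to a 32-byte boundary, then divide by 32) can be cross-checked in plain C.  This sketch only mirrors the address math; the function name is hypothetical and it does not touch the cache.  Separately, as far as I know dcbi is a supervisor-level instruction on PowerPC, which would be consistent with the trap seen from user space.

```c
#include <stdint.h>

#define CACHE_LINE 32u  /* line size used by the assembly routine */

/*
 * Number of cache lines touched by the byte range [addr, addr + nbytes).
 * Mirrors the add/addi/rlwinm/sub/srawi sequence in the assembly.
 */
static uint32_t cache_lines_spanned(uint32_t addr, uint32_t nbytes)
{
    uint32_t start;
    uint32_t end;

    if (nbytes == 0)
        return 0;

    /* End address rounded up to the start of the next line. */
    end = (addr + nbytes + (CACHE_LINE - 1)) & ~(CACHE_LINE - 1);
    /* Start address rounded down to the start of its line. */
    start = addr & ~(CACHE_LINE - 1);

    return (end - start) / CACHE_LINE;
}
```

A range that straddles a line boundary, such as 2 bytes starting at offset 31, counts as two lines, which is what the loop counter in the assembly relies on.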

Thanks!

Jonathan




Interrupts not Firing on PPC405EX

2009-11-05 Thread Jonathan Haws
All,

I am having some troubles getting interrupts to fire from my kernel module.  I 
have connected the ISR with a call to request_irq() and have configured my 
device to generate interrupts.   However, my ISR is called once when I connect 
the interrupt for the first time.  After that it is never called again.  It 
seems like the interrupt is getting stuck disabled, but I cannot see why that 
would happen.

The device is on the PCIE0 bus and works just fine in another OS (namely 
VxWorks - that is the driver I am working on porting to Linux).

Here is how I am connecting the ISR and the ISR itself.  Am I doing something 
stupid?

Thanks for the help!

Jonathan

PS - Our hardware is a custom spun PPC405EX board based on the AMCC Kilauea 
board and uses the kilauea.dts with no modifications.

A quick note - I realize that I am not checking if I was the one to interrupt 
the CPU.  I am not worried about that right now - especially since I know there 
is nothing else that will interrupt the CPU on this IRQ right now anyway - it 
never fires.


int fpga_open(struct inode *inode, struct file *filp)
{
    int err = 0;

    /* Make sure we have successfully probed the device */
    if (NULL == fpga_drv.pcidev)
    {
        return -ENODEV;
    }

    /* Only one process at a time can have access to the FPGA */
    if (0 != atomic_read(&fpga_drv.openCount))
    {
        atomic_inc(&fpga_drv.openCount);
        printk(KERN_WARNING "FPGA: Could not open device: already open.\n");
        return -EBUSY;
    }

    /* If not already in use, state that we are */
    atomic_inc(&fpga_drv.openCount);

    /* Store a pointer to the PCI device structure */
    filp->private_data = fpga_drv.pcidev;

    /* Attach ISR to IRQ */
    if (request_irq(fpga_drv.pcidev->irq, fpga_isr, IRQF_SHARED,
                    FPGA_MODULE_NAME, fpga_drv.pcidev))
    {
        printk(KERN_ERR "FPGA: Unable to connect FPGA ISR (%d)!\n",
               fpga_drv.pcidev->irq);
        return -EPERM;
    }

    return 0;
}

/* Interrupt Service Routine */
irqreturn_t fpga_isr(int irq, void *dev_id)
{
    uint32_t status = 0;

    status = fpga_drv.cfg_ptr[FPGA_INTER_STATUS];
    printk(KERN_NOTICE "FPGA: Interrupt fired! (%#08x)\n", status);
    if (status & FPGA_INTERRUPT_SPI)
    {
        rt_sem_v(&fpga_drv.sarSem);
    }

    /* Return HANDLED */
    return (IRQ_HANDLED);
}
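
Unrelated to the interrupt problem, the "only one process at a time" check in fpga_open() reads the counter and then increments it as two separate steps, and it also increments on the busy path.  A single compare-and-swap avoids both issues.  The sketch below uses C11 atomics in user space as a stand-in for the kernel's atomic_t helpers (in the kernel the analogous call would be atomic_cmpxchg()); the names open_count, exclusive_open and exclusive_release are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>

/* 0 = device free, 1 = device open. */
static atomic_int open_count = 0;

/* Claim the device; returns 0 on success or -EBUSY if already claimed. */
static int exclusive_open(void)
{
    int expected = 0;

    /* Atomically change 0 -> 1; fails without side effects otherwise. */
    if (!atomic_compare_exchange_strong(&open_count, &expected, 1))
        return -EBUSY;
    return 0;
}

/* Give the device back so the next open can succeed. */
static void exclusive_release(void)
{
    atomic_store(&open_count, 0);
}
```

Any later failure in the open path (for example a failed IRQ request) would also need to call the release function so the counter does not stay claimed.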


RE: DMA to User-Space

2009-11-04 Thread Jonathan Haws
 Jonathan Haws wrote:
  All,
 
  I have what may be an unconventional question:
 
  Our application consists of data being captured by an FPGA,
 processed, and transferred to SDRAM.  I simply give the FPGA an
 address of where I want it stored in SDRAM and it simply DMAs the
 data over and interrupts me when finished.  I then take that data
 and store it to disk.
 
  I have code in user space that handles all of the writing to disk
 nicely and fast enough for my application (I am capturing data at
 about 35-40 Mbytes/sec).
 
  My question is this:  is it possible to give a user-space pointer
 to the FPGA to DMA to?  It seems like I would have problems with
 alignment, address manipulation, and a whole slew of other issues.
 
  What would be the best way to accomplish something like that?  I
 want to handle all the disk access in user-space, but I do not want
 to have to copy 40 MB/s from kernel space to user-space either.
 
 You can maintain a DMA buffer in kernel, then mmap to user space.  And
 maybe you need some handshake between FPGA and the apps to balance input
 data with data to disk.
  I can maintain an allocated, DMA-safe buffer in kernel space if
 needed.  Can I simply get a user-space pointer to that buffer?  What
 calls are needed to translate addresses?
 
 Use remap_pfn_range()  in your kernel DMA buffer manipulation driver
 .mmap() handler to export DMA  buffer address to user space.
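
One detail behind this advice: remap_pfn_range() takes a page frame number, that is a physical address shifted right by the page shift, not a kernel virtual pointer, and the mapped length is a whole number of pages.  The arithmetic can be sketched in user space as follows, assuming 4 KiB pages; the helper names and constants are hypothetical and stand in for the kernel's PAGE_SHIFT and PAGE_ALIGN().

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12u                    /* assumed 4 KiB pages */
#define PAGE_SIZE_4K  (1ul << PAGE_SHIFT_4K)

/* Page frame number for a physical address. */
static unsigned long phys_to_pfn(uint64_t phys)
{
    return (unsigned long)(phys >> PAGE_SHIFT_4K);
}

/* Round a length up to a whole number of pages. */
static unsigned long page_align_up(unsigned long len)
{
    return (len + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
}

/* A buffer can only be mapped cleanly if it starts on a page boundary. */
static int is_page_aligned(uint64_t phys)
{
    return (phys & (PAGE_SIZE_4K - 1)) == 0;
}
```

In a driver, a buffer obtained from the kernel allocator would first need converting to a physical address before this shift is applied.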
 

Can you provide an example for how to do that?  I have an mmap routine to map 
BARs that the FPGA maintains and I can access those, however when I try to map 
the DMA buffer and access what is in it, the system crashes.  Here is the mmap 
function I have right now:

/* fpga_mmap()
 *
 *  Description:
 *   The purpose of this function is to serve as a 'file_operation'
 *   which maps different PCI resources into the calling processes
 *   memory space.
 *
 *   NOTE:  The file offset are in page size; i.e.:
 *   offset 0 in process's mmap syscall - BAR0
 *   offset 4096 in process's mmap syscall - BAR1
 *   offset 8192 in process's mmap syscall - BAR2
 * offset 12288 - Streaming DMA buffer
 *
 *  Arguments:
 *   struct file *filp -- struct representing an open file
 *   struct vm_area_struct   *vma  -- struct representing memory 'segment'
 *
 *  Returns:
 *   int -- indication of success or failure
 *
 */
int fpga_mmap(struct file *filp, struct vm_area_struct *vma)
{
    struct pci_dev *dev;
    unsigned long addressToMap;
    uint8_t mapType = 0; /* 0 = IO, 1 = memory */

    /* Get the PCI device */
    dev = (struct pci_dev*)(filp->private_data);

    /* Map in the appropriate BAR based on page offset */
    if (vma->vm_pgoff == FPGA_CONFIG_SPACE)
    {
        /* Map BAR1 (the CONFIG area) */
        printk(KERN_ALERT "FPGA: Mapping BAR1 (CONFIG BAR).\n");
        addressToMap = pci_resource_start(dev, FPGA_CONFIG_SPACE);
        printk(KERN_ALERT "FPGA: PCI BAR1 (CONFIG BAR) Size - %#08x.\n",
               pci_resource_len(dev, FPGA_CONFIG_SPACE));
        mapType = 0;
    }
    else if(vma->vm_pgoff == FPGA_TEST_SPACE)
    {
        /* Map BAR2 (the TEST area) */
        printk(KERN_ALERT "FPGA: Mapping BAR2 (TEST BAR).\n");
        addressToMap = (pci_resource_start(dev, FPGA_TEST_SPACE) +
                        pci_resource_len(dev, FPGA_TEST_SPACE)) -
                        FPGA_TEST_LENGTH;
        printk(KERN_ALERT "FPGA: PCI BAR2 (TEST BAR) Size - %#08x.\n",
               pci_resource_len(dev, FPGA_TEST_SPACE));
        mapType = 0;
    }
    else if(vma->vm_pgoff == 3)
    {

        addressToMap = (unsigned long)fpga_drv.strmData[0];
        mapType = 1;
    }
    else
    {
        printk(KERN_ALERT "FPGA: Invalid BAR mapping specified.\n");
        return ERROR;
    }

    /* Execute the mapping */
    vma->vm_flags |= VM_IO;
    vma->vm_flags |= VM_RESERVED;
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    printk(KERN_ALERT "FPGA: vmSize - 0x%x.\n",
           (unsigned int)(vma->vm_end - vma->vm_start));

    if( mapType == 0 )
    {
        if(io_remap_pfn_range(vma, vma->vm_start,
                              addressToMap >> PAGE_SHIFT,
                              vma->vm_end - vma->vm_start,
                              vma->vm_page_prot) != 0)
        {
            printk(KERN_ALERT "FPGA: Failed to map BAR PCI space to user space.\n");
            return ERROR;
        }
    }
    else
    {
        printk(KERN_NOTICE "FPGA: Mapping stream ptr (%#08x) to user space\n",
               (uint32_t)fpga_drv.strmData[0]);
        printk(KERN_NOTICE "FPGA: Setting strmData[0][0] to 0x37\n");
        fpga_drv.strmData[0][0] = 0x37;
        if(remap_pfn_range(vma, vma->vm_start, addressToMap >> PAGE_SHIFT

RE: DMA to User-Space

2009-11-04 Thread Jonathan Haws
 1. I open /dev/mem and get a file descriptor
 2. I use mmap to reserve some physical addresses for my buffers in
 user space.
 3. I give that address to the FPGA for DMA use.
 4. When I get the FPGA interrupt, I invalidate the data cache and
 write the data to disk
 
 Does that sound like it would work?  Would the address I receive
 from mmap() and pass to the FPGA be the actual physical address, or
 would I need to send the physical address to the FPGA and use the
 mmap() address to access and write to disk?

One more question about this approach: does the mmap() call prevent the kernel 
from using this memory for other purposes?  Will the kernel be able to move 
this memory elsewhere?  I guess what I am asking is if this memory is locked 
for all other purposes?


DMA to User-Space

2009-11-03 Thread Jonathan Haws
All,

I have what may be an unconventional question:

Our application consists of data being captured by an FPGA, processed, and 
transferred to SDRAM.  I simply give the FPGA an address of where I want it 
stored in SDRAM and it simply DMAs the data over and interrupts me when 
finished.  I then take that data and store it to disk.

I have code in user space that handles all of the writing to disk nicely and 
fast enough for my application (I am capturing data at about 35-40 Mbytes/sec).

My question is this:  is it possible to give a user-space pointer to the FPGA 
to DMA to?  It seems like I would have problems with alignment, address 
manipulation, and a whole slew of other issues.

What would be the best way to accomplish something like that?  I want to handle 
all the disk access in user-space, but I do not want to have to copy 40 MB/s 
from kernel space to user-space either.

I can maintain an allocated, DMA-safe buffer in kernel space if needed.  Can I 
simply get a user-space pointer to that buffer?  What calls are needed to 
translate addresses?

Thanks for the help!  I am still a newbie when it comes to kernel programming, 
so I really appreciate the help!

Jonathan





RE: Accessing flash directly from User Space [SOLVED]

2009-10-30 Thread Jonathan Haws
 On Friday 30 October 2009 15:50:22 Jonathan Haws wrote:
   I suspect that the msync() was merely serving as a very
 heavyweight
   memory barrier.
 
  I did try hacking the mb() calls from the kernel source to use
 them in user space, but they had no effect.  I still had to include
 the calls to msync().
 
 What did the resulting mb() that you used look like?

asm("eieio; sync");


RE: Accessing flash directly from User Space [SOLVED]

2009-10-30 Thread Jonathan Haws
 On Friday 30 October 2009 16:08:55 Alessandro Rubini wrote:
   asm("eieio; sync");
 
  Hmm...
  : : : "memory"
 
  And, doesn't ; start a comment in assembly? (no, not on powerpc
 it seems)
 
 Yes, I think the barrier is wrong.
 Please try with
 
 #define mb()  __asm__ __volatile__("eieio\n sync\n" : : : "memory")

That definition worked great.  I must have missed the : : : memory bit when I 
was digging through code.

Thanks, that gives me about a 2x speedup over the msync() calls.
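
The reason the "memory" clobber matters is that it tells the compiler the asm statement may read or write memory, so it must not cache values in registers or move loads and stores across it; without the clobber only the CPU is ordered, not the compiler.  A minimal sketch of a write helper built on such a barrier follows.  The fallback to __sync_synchronize() on non-PowerPC targets and the name write_reg16 are assumptions added so the example builds anywhere; they are not from this thread.

```c
#include <stdint.h>

#if defined(__powerpc__) || defined(__PPC__)
/* Barrier as suggested in the thread: CPU ordering plus compiler clobber. */
#define mb() __asm__ __volatile__("eieio\n\tsync\n" : : : "memory")
#else
/* Assumption: elsewhere, use the GCC/Clang full-barrier builtin instead. */
#define mb() __sync_synchronize()
#endif

/*
 * Store a 16-bit value to a device-style location and make sure the store
 * has been issued before any later access.
 */
static void write_reg16(volatile uint16_t *reg, uint16_t value)
{
    *reg = value;
    mb();
}
```

Each command word of a flash sequence would go through a helper like this so that consecutive writes reach the device in program order.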

Thanks for the help!


RE: Accessing flash directly from User Space

2009-10-29 Thread Jonathan Haws
 On Tue, Oct 27, 2009 at 04:52:40PM -0600, Jonathan Haws wrote:
   Will the device respond to 0x1234 being written at offset zero?  You
   generally have to poke these things pretty specifically in order to
   get them to go into command mode.
 
  It should because that is the first data location in flash.
 
 I don't follow.  Even if you have an Intel command set flash (and thus
 don't need unlock writes), 0x1234 isn't a valid command that I know of.
 The flash doesn't behave as a register that you can read back; it just
 responds in a certain way based on what you write to it.
 
  Also, just to be sure I am telling the truth, I tried writing to one of
  the registers to setup an erase and got the same results - the value
  did not get written.
 
 Following the exact sequence that the driver uses?  What did you write,
 what did you expect (you're generally not going to get the same thing
 back that you wrote), and what did you get?  What kind of command set,
 bus width, and interleaving do you have?

I used the erase pattern, then write pattern for my flash device.  When I tried 
to read back the value that should have been stored, it was what it was 
previously.

 
 If you manually do the same exact accesses from a firmware prompt,
 external debugger, etc. does it work?
 
The driver works perfectly in VxWorks,
 
 On this exact hardware?

Yes.

   Including the 0x1234 thing?
 
  Actually, I have not tried that - I have not had to since the
 driver worked.
 
 What happens without the 0x1234?

Have not bothered to try it.  My guess, now that I know what the problem is, 
is that it would not read back 0x1234.  In the test I performed, I intended to 
erase the sector, prep it for write, then write out 0x1234 to the first two 
bytes in flash.  However, I failed to include the code to erase and prep the 
sector for writing in my rush to find out what the heck was going on.

As I mentioned previously, my problem was that the sequence of operations 
needed to erase the sector was not taking place correctly: when I set up the 
sector for erasure, what I assigned to flash was not committed immediately.

I hope that makes sense.

Thanks,

Jonathan




RE: Accessing flash directly from User Space [SOLVED]

2009-10-29 Thread Jonathan Haws
 Does O_DIRECT help? (you may need to define _GNU_SOURCE before
 #include)


Nope, O_DIRECT did not help - in fact it caused the application to crash.  Why 
that is I am not sure, but it crashed.


RE: Accessing flash directly from User Space [SOLVED]

2009-10-29 Thread Jonathan Haws
   Anyway, to make a long story short, I inserted an msync() after each
   assignment to the flash.  This resolved my problem and I can now
   program my flash.
 
  Ouch, this was news to me too. Calling msync() after every write kills
  performance.  We use mmap(/dev/mem) to access HW and haven't seen any
  issues yet.  Is this perhaps a new behaviour for mmap(/dev/mem) and is
  there a way to avoid calling msync()?
 
 The address range should be outside the dram and thus uncached. Any
 write to any address in the range mmaped should go directly to the NOR
 flash. Any other behavior is a bug. It's not mapping an actual file
 here.

That is what I was thinking.  But I have a working driver and the extra delay 
of writing to flash caused by the msync() calls I can deal with.  I only ever 
write data to flash when I need to update one of my boot images.

Thanks,

Jonathan


RE: Accessing flash directly from User Space

2009-10-29 Thread Jonathan Haws
 Looking through our notes and talking with the engineer 
 who was performing the tests, it was exactly that - MTD was waiting
 for a signal that was produced differently than the hardware 
 ready signal.  By simply polling the flash until the hardware
 ready signal toggled we were able to get a much faster read and write speed.
 Granted, most of our signals are being sent through a CPLD,
 so that may be why MTD did not work as well.

even if your problem is solved I'd like to understand this performance issue.
I had a look into the datasheet of the S29GL Mirrorbit flash by Spansion as an 
example. They provide a dedicated pin RY/BY#, which signals the end of an 
embedded algorithm (erase or programming). While figure 11.9 shows no timing 
advance of RY/BY# against Dout on the data line, figure 11.12 has one of 
unspecified length between RY/BY# and the end of data toggling.

If you had a 10-fold slowdown with MTD, either the CPLD really slows down the 
read access to the flash or maybe your custom driver uses some acceleration 
(write buffer programming, unlock bypass, accelerated program with 12V on the 
WP#/ACC pin) while MTD does not.

Which kernel version and flash device did you use in this comparison?

We were using VxWorks when we did the comparison, so there may be a problem in 
their driver.  We are using unlock bypass by the way.  Our flash chip is from 
Spansion.  I do not have the datasheet right with me, so I do not have the part 
number.


RE: Accessing flash directly from User Space

2009-10-28 Thread Jonathan Haws
 On Tue, Oct 27, 2009 at 04:24:53PM -0600, Jonathan Haws wrote:
   How can I get that pointer?  Unfortunately I cannot simply use the
   address of the flash.  Is there some magical function call that
   gives me access to that portion of the memory space?
  
   $ man 2 mmap
  
   You want MAP_SHARED and O_SYNC.
  
   To use that I need to have a file descriptor to a device, do I
  not?  However, I do not have a base flash driver to give me that
  file descriptor.  Am I missing something with that call?
  
 
  /dev/mem
 
 Okay, I now have access to the flash memory, however when I write
 to it the writes do not take.  I have tried calling msync() on the
 mapping to no avail.  I have opened the fd with O_SYNC, but cannot
 get things to work right.
 
 Here are the calls:
 
  int fd = open("/dev/mem", O_SYNC | O_RDWR);
  uint16_t * flash = (uint16_t *)mmap(NULL, NOR_FLASH_SIZE,
  (PROT_READ | PROT_WRITE), MAP_PRIVATE, fd,
  NOR_FLASH_BASE_ADRS);
 
 What board and CPU are you using?  Is your flash really at
 0xFC80, or is
 that the virtual address that VxWorks puts it at?

I am using a custom board based on the AMCC Kilauea development board.  It uses 
a 405EX CPU.  Yes, the flash is really at 0xFC00.


RE: Accessing flash directly from User Space

2009-10-28 Thread Jonathan Haws
 On Tue, 2009-10-27 at 16:52 -0600, Jonathan Haws wrote:
   Jonathan Haws wrote:
    I had thought about using MTD, but decided against it because with
    previous benchmarking that we did with MTD and our custom driver, we
    found that our custom driver was about 10x faster.
  
   Ouch.  Any idea where the slowdown is coming from?
 
  From what I remember (I would have to dig up notes to make sure)
 it is something to do with MTD looking for a signal to go high that
 is processed a bunch before MTD even sees it.  Our flash produces a
 hardware ready signal that we are triggering off of to move on.  MTD
 took much longer to report to us that the hardware was ready.
 
  Thanks
 
 
 It would be interesting to know in more detail what it was. If we have a
 10x performance increase hiding from us I would be very interested in
 knowing where it is.
 
 Are you using some custom command to the flash that the generic chip
 drivers in Linux do not yet support?

Looking through our notes and talking with the engineer who was performing the 
tests, it was exactly that - MTD was waiting for a signal that was produced 
differently than the hardware ready signal.  By simply polling the flash until 
the hardware ready signal toggled we were able to get a much faster read and 
write speed.  Granted, most of our signals are being sent through a CPLD, so 
that may be why MTD did not work as well.




RE: Accessing flash directly from User Space [SOLVED]

2009-10-28 Thread Jonathan Haws
  On Tue, Oct 27, 2009 at 04:24:53PM -0600, Jonathan Haws wrote:
    How can I get that pointer?  Unfortunately I cannot simply use the
    address of the flash.  Is there some magical function call that
    gives me access to that portion of the memory space?
   
    $ man 2 mmap
   
    You want MAP_SHARED and O_SYNC.
   
    To use that I need to have a file descriptor to a device, do I
   not?  However, I do not have a base flash driver to give me that
   file descriptor.  Am I missing something with that call?
   
  
   /dev/mem
  
  Okay, I now have access to the flash memory, however when I write
  to it the writes do not take.  I have tried calling msync() on the
  mapping to no avail.  I have opened the fd with O_SYNC, but cannot
  get things to work right.
  
  Here are the calls:
  
 int fd = open("/dev/mem", O_SYNC | O_RDWR);
 uint16_t * flash = (uint16_t *)mmap(NULL, NOR_FLASH_SIZE,
 (PROT_READ | PROT_WRITE), MAP_PRIVATE, fd,
 NOR_FLASH_BASE_ADRS);
 
  What board and CPU are you using?  Is your flash really at
  0xFC80, or is
  that the virtual address that VxWorks puts it at?
 
 I am using a custom board based on the AMCC Kilauea development
 board.  It uses a 405EX CPU.  Yes, the flash is really at
 0xFC00.

I have found the problem.  It occurred to me in the shower (okay not really, 
but most good ideas happen there).

What was happening is that I was in fact able to write to the correct 
registers.  However, I would try and write to them in a batch.  But the way 
mmap works (at least according to the man page) with MAP_SHARED is that the 
file may not be updated until msync() is called.  Now, I thought that O_SYNC 
would take care of that when I open /dev/mem, but that was not the case.

Anyway, to make a long story short, I inserted an msync() after each assignment 
to the flash.  This resolved my problem and I can now program my flash.
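
For reference, the approach described above can be sketched roughly as follows.  This is only a minimal sketch: the helper names map_window() and write_word_synced() are made up for illustration, the path and base address are parameters rather than the real /dev/mem and NOR_FLASH_BASE_ADRS values, and whether msync() is strictly required for a /dev/mem mapping is taken from the report above, not verified here.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of `path` starting at the
 * page-aligned offset `base`.  For the flash case `path` would be
 * "/dev/mem" and `base` the physical flash address.
 * Returns NULL on failure. */
static volatile uint16_t *map_window(const char *path, off_t base, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd);                      /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : (volatile uint16_t *)p;
}

/* Hypothetical helper: store one 16-bit word, then msync() the page that
 * contains it, i.e. "an msync() after each assignment".
 * Returns 0 on success, -1 on error (as msync() does). */
static int write_word_synced(volatile uint16_t *win, size_t index,
                             uint16_t value)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t addr = (uintptr_t)&win[index];
    uintptr_t page_start = addr & ~(page - 1);

    win[index] = value;
    return msync((void *)page_start, (size_t)page, MS_SYNC);
}
```

The pointer is declared volatile so that the compiler does not merge or reorder the individual command writes.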

Thanks everyone for your help!

Jonathan




RE: Network Stack SKB Reallocation

2009-10-27 Thread Jonathan Haws
 try to reuse it later, the kernel would panic because that was not a 
valid SKB.

So, moral of the story is keep your MTU at 4000 or lower.  This hammers your 
throughput, but it seems to be the best we can do given the way the stack works.

If anyone has any other solutions, that would be GREAT!  I would love to be 
able to use a 9000 byte MTU without getting out of memory errors simply due to 
fragmentation.

HTH,

Jonathan


 
 -Original Message-
 From: linuxppc-dev-bounces+john.p.price=l-3com@lists.ozlabs.org
 [mailto:linuxppc-dev-bounces+john.p.price=l-
 3com@lists.ozlabs.org]
 On Behalf Of Jonathan Haws
 Sent: Monday, October 26, 2009 2:43 PM
 To: linuxppc-dev@lists.ozlabs.org
 Subject: Network Stack SKB Reallocation
 
 Quick question about the network stack in general:
 
 Does the stack itself release an SKB allocated by the device driver
 back
 to the heap upstream, or does it require that the device driver
 handle
 that?
 
 Thanks!
 
 Jonathan
 
 


Accessing flash directly from User Space

2009-10-27 Thread Jonathan Haws
I know this is probably a really dumb question, but a wise man once said that 
the only stupid question is the one that is not asked.

So, I have written a flash driver in VxWorks that simply addresses the flash 
directly and handles all the hardware accesses just fine.  I am porting that to 
Linux and need it to run in user space (mainly to simplify the interface with 
the user - I want to keep it the same as in VxWorks).  Here is a snippet of 
what my question is:

static uint8_t bflashEraseSector(int sa, int verbose)
{
uint16_t * flash = (uint16_t *) NOR_FLASH_BASE_ADRS;
uint32_t offset;

...

/* We divide by 2 here to adjust for the 16-bit offset into the address 
*/
offset = sa * NOR_FLASH_SECTOR_SIZE / 2;
flash[BFLASH_SECTOR_ERASE_ADDR1] = BFLASH_SECTOR_ERASE_BYTE1;

...

}
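
The divide-by-two in the snippet above converts a byte offset into an index for a uint16_t pointer.  A small sketch of that arithmetic follows, assuming a 16-bit wide flash; sector_word_index() is a hypothetical helper, and sector_bytes stands in for NOR_FLASH_SECTOR_SIZE, whose real value is not given in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: turn a sector number into an index that can be
 * used with a uint16_t pointer to the flash.  Indexing a uint16_t pointer
 * already advances two bytes per step, so the byte offset is halved. */
static uint32_t sector_word_index(uint32_t sector, uint32_t sector_bytes)
{
    uint32_t byte_offset = sector * sector_bytes;

    return byte_offset / (uint32_t)sizeof(uint16_t);
}
```

For example, with an assumed 128 KiB sector size, sector 2 starts at byte offset 0x40000, which is word index 0x20000.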

I am trying to get a pointer to NOR_FLASH_BASE_ADRS which is defined to be 
0xFC00.  I then dereference that directly to write to the flash.

How can I get that pointer?  Unfortunately I cannot simply use the address of 
the flash.  Is there some magical function call that gives me access to that 
portion of the memory space?

Thanks for the help!

Jonathan

PS - I know that I could simply use the MTD driver provided by the kernel, but 
I need to be able to keep the interface the same so we can use previously 
written code.





RE: Accessing flash directly from User Space

2009-10-27 Thread Jonathan Haws
  How can I get that pointer?  Unfortunately I cannot simply use the
 address of the flash.  Is there some magical function call that
 gives me access to that portion of the memory space?
 
 
 $ man 2 mmap
 
 You want MAP_SHARED and O_SYNC.


To use that I need to have a file descriptor to a device, do I not?  However, I 
do not have a base flash driver to give me that file descriptor.  Am I missing 
something with that call?

Thanks!


RE: Accessing flash directly from User Space

2009-10-27 Thread Jonathan Haws
 Jonathan Haws wrote:
  How can I get that pointer?  Unfortunately I cannot simply use the
  address of the flash.  Is there some magical function call that
  gives me access to that portion of the memory space?
 
  $ man 2 mmap
 
  You want MAP_SHARED and O_SYNC.
 
 
 
  To use that I need to have a file descriptor to a device, do I
 not?  However, I do not have a base flash driver to give me that
 file descriptor.  Am I missing something with that call?
 
 
 /dev/mem


Ah, yes.  I told you this was going to be a dumb question.

Thanks, Bill.

Jonathan




RE: Accessing flash directly from User Space

2009-10-27 Thread Jonathan Haws
  How can I get that pointer?  Unfortunately I cannot simply use the
  address of the flash.  Is there some magical function call that
  gives me access to that portion of the memory space?
 
  $ man 2 mmap
 
  You want MAP_SHARED and O_SYNC.
 
 
 
  To use that I need to have a file descriptor to a device, do I
 not?  However, I do not have a base flash driver to give me that
 file descriptor.  Am I missing something with that call?
 
 
 /dev/mem
 
Okay, I now have access to the flash memory, however when I write to it the 
writes do not take.  I have tried calling msync() on the mapping to no avail.  
I have opened the fd with O_SYNC, but cannot get things to work right.

Here are the calls:

int fd = open("/dev/mem", O_SYNC | O_RDWR);
uint16_t * flash = (uint16_t *)mmap(NULL, NOR_FLASH_SIZE,
(PROT_READ | PROT_WRITE), MAP_PRIVATE, fd,
NOR_FLASH_BASE_ADRS);

When I do flash[0] = 0x1234, and then check the value, they do not match.

flash[0] = 0x1234;
msync(flash, NOR_FLASH_SIZE, MS_SYNC | MS_INVALIDATE);
printf("flash[0] = %#04x\n", flash[0]);

That prints flash[0] = 0x7f45.  I have verified that I am reading the correct 
values.  I can display the flash contents in U-Boot and 7f45 is what is in the 
first 16 bits of flash.

Why can I not write to flash?  What am I doing wrong?

Thanks!

Jonathan





RE: Accessing flash directly from User Space

2009-10-27 Thread Jonathan Haws
  Okay, I now have access to the flash memory, however when I write
 to
  it the writes do not take.
 
 
  (PROT_READ | PROT_WRITE), MAP_PRIVATE, fd,
 
 MAP_SHARED. Bill told you.  With MAP_PRIVATE you write to a local
 in-ram copy of the data, not to the original one.

I apologize, that MAP_PRIVATE was leftover from me trying to get it to work.  
With MAP_SHARED I am having the problem.

Sorry for the confusion.
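
The MAP_PRIVATE versus MAP_SHARED difference mentioned above can be seen with an ordinary file.  The following is a small sketch, not the flash code itself; poke_and_reread() is a hypothetical helper that writes one byte through a mapping and then reads the same byte back from the file with read().

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: write `value` to byte 0 of the file through a
 * mapping created with `map_flags` (MAP_PRIVATE or MAP_SHARED), then
 * return what the file itself holds at byte 0, or -1 on error.
 * The file must be at least one byte long. */
static int poke_and_reread(int fd, int map_flags, uint8_t value)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    uint8_t *p = mmap(NULL, page, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    uint8_t in_file = 0;

    if (p == MAP_FAILED)
        return -1;

    p[0] = value;                   /* MAP_PRIVATE: lands in a private copy */
    if (map_flags == MAP_SHARED)
        msync(p, page, MS_SYNC);    /* MAP_SHARED: push it to the file */
    munmap(p, page);

    if (lseek(fd, 0, SEEK_SET) < 0 || read(fd, &in_file, 1) != 1)
        return -1;
    return in_file;
}
```

On a zero-filled file, the MAP_PRIVATE call leaves byte 0 of the file unchanged, while the MAP_SHARED call makes the new value visible in the file.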


RE: Accessing flash directly from User Space

2009-10-27 Thread Jonathan Haws
 Jonathan Haws wrote:
flash[0] = 0x1234;
msync(flash, NOR_FLASH_SIZE, MS_SYNC | MS_INVALIDATE);
printf("flash[0] = %#04x\n", flash[0]);
 
  That prints flash[0] = 0x7f45.  I have verified that I am
 reading
  the correct values.  I can display the flash contents in U-Boot
 and
  7f45 is what is in the first 16 bits of flash.
  Why can I not write to flash?  What am I doing wrong?
  Flash does not work that way -- you must send it commands to
 erase a
  block, and then further commands to program new data.
 
  I realize that.  I have a driver written that does exactly that.
  However, I need to be able to write to certain registers to setup
 the
  erasure.
 
 Will the device respond to 0x1234 being written at offset zero?  You
 generally have to poke these things pretty specifically in order to
 get
 them to go into command mode.
 

It should because that is the first data location in flash.  Also, just to be 
sure I am telling the truth, I tried writing to one of the registers to setup 
an erase and got the same results - the value did not get written.

  The driver works perfectly in VxWorks,
 
 Including the 0x1234 thing?

Actually, I have not tried that - I have not had to since the driver worked.

  It sounds like what you really want is the /dev/mtd or
 /dev/mtdblock
  interface, not raw access to the flash chip.
 
  As mentioned in my initial post, I need to use my custom driver to
 maintain the interface to the application that uses the flash for
 data storage.
 
  I had thought about using MTD, but decided against it because with
  previous benchmarking that we did with MTD and our custom driver,
 we
  found that our custom driver was about 10x faster.
 
 Ouch.  Any idea where the slowdown is coming from?

From what I remember (I would have to dig up notes to make sure) it is 
something to do with MTD looking for a signal to go high that is processed a 
bunch before MTD even sees it.  Our flash produces a hardware ready signal 
that we are triggering off of to move on.  MTD took much longer to report to 
us that the hardware was ready.

Thanks




Jumbo Frame bug in ibm_newemac driver (was Jumbo Frames, sil24 SATA driver, and kswapd0 page allocation failures)

2009-10-26 Thread Jonathan Haws
Okay, I need to revisit this issue.  I have had my time taken away for other 
things the past couple of months, but I am now back at this network issue.

Here is what I have done:

1. I modified the ibm_newemac driver to follow scatter-gather chains on the RX 
path.  The idea was to setup the driver to only ever deal with single pages.  
The MAL in the PPC only supports data transfers of up to 4080 bytes (less than 
a single page), so it appears that the hardware should support single page 
chains.  I set this up just like the e1000 driver.  For whatever reason, this 
did not work.  It is probably because I do not fully understand the Linux 
network stack yet (as is apparent in the next iteration).

2. I reverted to the original driver and found that, contrary to what I had 
thought earlier, the driver does allocate a ring of skbs for use in the driver. 
 However, when a jumbo packet is received (larger than 4080 bytes) it uses the 
skb that was pre-allocated for the jumbo packet and allocates a new skb to 
replace the one in the ring.  This is where the problem is - in that new 
allocation to replace the one in the ring.  So, to remedy this, I 
pre-allocated the same number of jumbo skbs for the sole purpose of being used 
as new skbs for the rx ring.  Here is some code that shows the idea:

static int emac_open(struct net_device *ndev)
{
	...

	/* Allocate RX ring */
	for (i = 0; i < NUM_RX_BUFF; ++i)
	{
		if (emac_alloc_rx_skb(dev, i, GFP_KERNEL)) {
			printk(KERN_ERR "%s: failed to allocate RX ring\n",
			       ndev->name);
			goto oom;
		}

	}

	...
}

static inline int emac_alloc_rx_skb2(struct emac_instance *dev, int slot,
				     gfp_t flags)
{
	/* Take the replacement skb from the pre-allocated pool */
	struct sk_buff *skb = dev->rx_skb_pool[slot];
	if (unlikely(!skb))
		return -ENOMEM;

	if (skb_recycle_check(skb, emac_rx_skb_size(dev->rx_skb_size)))
	{
		dev->rx_skb[slot] = skb;
		dev->rx_desc[slot].data_len = 0;

		skb_reserve(skb, EMAC_RX_SKB_HEADROOM + 2);
		dev->rx_desc[slot].data_ptr =
			dma_map_single(&dev->ofdev->dev, skb->data - 2,
				       dev->rx_sync_size,
				       DMA_FROM_DEVICE) + 2;
		wmb();
		dev->rx_desc[slot].ctrl = MAL_RX_CTRL_EMPTY |
			(slot == (NUM_RX_BUFF - 1) ? MAL_RX_CTRL_WRAP : 0);

		return 0;
	}
	else
	{
		printk(KERN_NOTICE "EMAC: SKB not recycleable\n");
		return -ENOMEM;
	}
}

static int emac_poll_rx(void *param, int budget)
{
	...
	sg:
		if (ctrl & MAL_RX_CTRL_FIRST) {
			BUG_ON(dev->rx_sg_skb);
			if (unlikely(emac_alloc_rx_skb2(dev, slot,
							GFP_ATOMIC))) {
				DBG(dev, "rx OOM %d" NL, slot);
				++dev->estats.rx_dropped_oom;
				emac_recycle_rx_skb(dev, slot, 0);
			} else {
				dev->rx_sg_skb = skb;
				emac_recycle_rx_skb(dev, slot, len);
				skb_put(skb, len);
			}
		} else if (!emac_rx_sg_append(dev, slot) &&
			   (ctrl & MAL_RX_CTRL_LAST)) {

			skb = dev->rx_sg_skb;
			dev->rx_sg_skb = NULL;

			ctrl &= EMAC_BAD_RX_MASK;
			if (unlikely(ctrl && ctrl != EMAC_RX_TAH_BAD_CSUM)) {
				emac_parse_rx_error(dev, ctrl);
				++dev->estats.rx_dropped_error;
				dev_kfree_skb(skb);
				len = 0;
			} else {
				/*  printk(KERN_NOTICE "EMAC: pushing sg packet\n");*/
				goto push_packet;
			}
		}
		goto skip;
	...
}

The changes are the allocation of the rx_skb_pool in emac_open(), the call to 
emac_alloc_rx_skb2() in emac_poll_rx(), and the modifications to 
emac_alloc_rx_skb() to create emac_alloc_rx_skb2().  Corresponding allocations 
for rx_skb_pool are also made in emac_resize_rx_ring() for when we need to 
resize the pool.

Now the problem that I am having is this - the first time through the ring, 
things work just fine.  But the second time through the loop, the buffers are 
not cleaned out - they still think they contain data.  I have tried calling 
skb_recycle_check() to restore the skb to a new state, however that call fails 
because apparently the skb cannot be reused for receive.  Why is that the case? 
 What am I missing?  It seems like I am missing something that allows the skb 
to be reused?

I will admit, I am not a Linux network driver expert, though I am learning.  If 

Network Stack SKB Reallocation

2009-10-26 Thread Jonathan Haws
Quick question about the network stack in general:

Does the stack itself release an SKB allocated by the device driver back to the 
heap upstream, or does it require that the device driver handle that?

Thanks!

Jonathan




RE: Network Stack SKB Reallocation

2009-10-26 Thread Jonathan Haws
So, in my case, I allocate a bunch of skb's that I want to be able to reuse 
during network operation (256 in fact).  When I pass it up the stack, the stack 
will free that skb back to the system making any further use of it invalid 
until I call alloc_skb() again?

Thanks.

 On Monday 26 October 2009 19:43:00 Jonathan Haws wrote:
  Quick question about the network stack in general:
 
  Does the stack itself release an SKB allocated by the device
 driver back to the heap upstream, or does it require that the device
 driver handle that?
 
 There's the concept of passing responsibilities for the frames
 between
 the networking layers. So the driver passes the frame and all
 responsibilities
 to the networking stack. So if the networking stack accepts the
 packet in the first place,
 it needs to free it (or pass it to somebody else to take care of).
 
 --
 Greetings, Michael.
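
The hand-off described in the quoted reply can be modelled in user space.  The sketch below is only an illustration with made-up names (rx_ring, ring_deliver(), stack_receive()) and plain malloc()/free() standing in for skb allocation; it is not driver code.  The point it shows is that once a buffer is passed up, the driver must put a fresh buffer in the ring slot and never touch the old one again.

```c
#include <stdlib.h>

#define RING_SLOTS 4   /* hypothetical size; the message above uses 256 */

struct buf { unsigned char data[64]; };

struct rx_ring {
    struct buf *slot[RING_SLOTS];
    unsigned stack_frees;           /* buffers released by the "stack" */
};

/* Stand-in for the network stack: it accepts the buffer and frees it. */
static void stack_receive(struct rx_ring *r, struct buf *b)
{
    free(b);
    r->stack_frees++;
}

/* Release every buffer the ring still owns. */
static void ring_destroy(struct rx_ring *r)
{
    for (int i = 0; i < RING_SLOTS; i++) {
        free(r->slot[i]);
        r->slot[i] = NULL;
    }
}

/* Fill every slot; returns 0 on success, -1 if an allocation fails. */
static int ring_init(struct rx_ring *r)
{
    r->stack_frees = 0;
    for (int i = 0; i < RING_SLOTS; i++)
        r->slot[i] = NULL;
    for (int i = 0; i < RING_SLOTS; i++) {
        r->slot[i] = malloc(sizeof *r->slot[i]);
        if (!r->slot[i]) {
            ring_destroy(r);
            return -1;
        }
    }
    return 0;
}

/* Pass slot i up the stack and refill the slot with a fresh buffer.
 * If no replacement can be allocated, keep the old buffer in the ring
 * (the packet is dropped) and return -1. */
static int ring_deliver(struct rx_ring *r, int i)
{
    struct buf *fresh = malloc(sizeof *fresh);
    if (!fresh)
        return -1;

    struct buf *full = r->slot[i];
    r->slot[i] = fresh;             /* the driver no longer owns `full` */
    stack_receive(r, full);
    return 0;
}
```

Allocating the replacement before giving up the old buffer means an allocation failure costs one packet rather than a ring slot.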


RE: Page map BUG on program exit

2009-10-23 Thread Jonathan Haws
Jake,

That is exactly what I needed.  Patch 34113 worked like a charm.

Thanks for the help!

Jonathan



Here ya go Jonathan,
http://patchwork.ozlabs.org/patch/34047/
http://patchwork.ozlabs.org/patch/34113/

Both patches work for my situation, but I went with the second set as a final 
patch(34113).

- Jake Magee
On Thu, Oct 22, 2009 at 3:57 PM, Jonathan Haws jonathan.h...@sdl.usu.edu wrote:
All,

I am using a 405EX CPU on a custom board.  The layout and hardware is very 
similar to the AMCC Kilauea board.  Here is the output of uname -a:

Linux (none) 2.6.30.3-wolverine-dirty #3 PREEMPT Thu Sep 10 11:41:37 MDT 2009 
ppc unknown

I am getting the following BUG output when my program exits:

BUG: Bad page map in process main  pte:980005d7 pmd:0d840400
addr:4800 vm_flags:400844fb anon_vma:(null) mapping:cd8454f8 index:98000
vma->vm_file->f_op->mmap: fpga_mmap+0x0/0x178 [fpgaDriver]
Call Trace:
[cd84dc40] [c0006f0c] show_stack+0x44/0x16c (unreliable)
[cd84dc80] [c00ba314] print_bad_pte+0x140/0x1d0
[cd84dcb0] [c00ba3ec] vm_normal_page+0x48/0x50
[cd84dcc0] [c00bb2ec] unmap_vmas+0x214/0x614
[cd84dd40] [c00bffe0] exit_mmap+0xd0/0x1b4
[cd84dd70] [c0031e40] mmput+0x50/0x134
[cd84dd80] [c0036470] exit_mm+0x114/0x13c
[cd84ddb0] [c0037d80] do_exit+0xc0/0x68c
[cd84de00] [c0038390] do_group_exit+0x44/0xd8
[cd84de10] [c0044468] get_signal_to_deliver+0x1f8/0x430
[cd84de70] [c0008224] do_signal+0x54/0x29c
[cd84df40] [c0010d5c] do_user_signal+0x74/0xc4

I have an FPGA on the PCIe bus that I am mapping BAR0 to user space with a call 
to mmap().  The mapping works just fine and I can access all the registers in 
the BAR without a problem.  However, on exit this comes up.

A Google search showed tons of people with similar problems in standard 
distributions (Ubuntu primarily), but no resolutions.

Has anyone seen this crop up before and know what the issue is?  I can include 
any source code, if that is required.

Thanks!

Jonathan




Page map BUG on program exit

2009-10-22 Thread Jonathan Haws
All,

I am using a 405EX CPU on a custom board.  The layout and hardware is very 
similar to the AMCC Kilauea board.  Here is the output of uname -a:

Linux (none) 2.6.30.3-wolverine-dirty #3 PREEMPT Thu Sep 10 11:41:37 MDT 2009 
ppc unknown

I am getting the following BUG output when my program exits:

BUG: Bad page map in process main  pte:980005d7 pmd:0d840400
addr:4800 vm_flags:400844fb anon_vma:(null) mapping:cd8454f8 index:98000
vma->vm_file->f_op->mmap: fpga_mmap+0x0/0x178 [fpgaDriver]
Call Trace:
[cd84dc40] [c0006f0c] show_stack+0x44/0x16c (unreliable)
[cd84dc80] [c00ba314] print_bad_pte+0x140/0x1d0
[cd84dcb0] [c00ba3ec] vm_normal_page+0x48/0x50
[cd84dcc0] [c00bb2ec] unmap_vmas+0x214/0x614
[cd84dd40] [c00bffe0] exit_mmap+0xd0/0x1b4
[cd84dd70] [c0031e40] mmput+0x50/0x134
[cd84dd80] [c0036470] exit_mm+0x114/0x13c
[cd84ddb0] [c0037d80] do_exit+0xc0/0x68c
[cd84de00] [c0038390] do_group_exit+0x44/0xd8
[cd84de10] [c0044468] get_signal_to_deliver+0x1f8/0x430
[cd84de70] [c0008224] do_signal+0x54/0x29c
[cd84df40] [c0010d5c] do_user_signal+0x74/0xc4

I have an FPGA on the PCIe bus that I am mapping BAR0 to user space with a call 
to mmap().  The mapping works just fine and I can access all the registers in 
the BAR without a problem.  However, on exit this comes up.

A Google search showed tons of people with similar problems in standard 
distributions (Ubuntu primarily), but no resolutions.

Has anyone seen this crop up before and know what the issue is?  I can include 
any source code, if that is required.

Thanks!

Jonathan





RE: Jumbo Frames, sil24 SATA driver, and kswapd0 page allocation failures

2009-08-18 Thread Jonathan Haws
 If the hardware supports it, the best way to deal with it is to set
 up
 the driver so that it only ever deals in single pages.  

I am working on fixing the driver to support NETIF_F_SG and have changed how it 
receives packets to follow how the e1000 driver does it.

Here is where I am at:

When I get the first part of the frame, I allocate an skb for the packet.  I 
call dev->page = alloc_page(GFP_ATOMIC) to allocate a page for the 4080 bytes 
coming from the MAL.

I then setup a DMA mapping for that page to get the data out of the MAL (the 
original code simply used dma_map_single, but I need a page).

Once the DMA map has been setup and data transferred, I call 
skb_fill_page_desc() to put the data into the skb.  I then wrote a function 
called emac_consume_page, which unmaps the DMA mapping, frees the page, and 
updates the lengths in the skb.
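
One possible reading of the sequence above, offered only as a guess from the description: alloc_page() returns a page holding a single reference, and skb_fill_page_desc() attaches the page to the skb without taking another one.  If emac_consume_page() then frees the page, the later release of the skb drops the same reference a second time, which would be consistent with a negative page count.  The toy model below uses made-up names and a plain integer in place of the real page reference count.

```c
/* Toy stand-in for a page's reference count; not kernel code. */
struct page_model {
    int count;
};

/* Models alloc_page(): the caller receives the only reference. */
static struct page_model model_alloc_page(void)
{
    struct page_model pg = { 1 };
    return pg;
}

/* Models dropping one reference (the page is freed when it reaches 0). */
static void model_put_page(struct page_model *pg)
{
    pg->count--;
}

/* Walk through the sequence and return the final count.
 * driver_frees_after_attach != 0 models a driver that frees the page
 * after it has already been attached to the skb. */
static int final_count(int driver_frees_after_attach)
{
    struct page_model pg = model_alloc_page();

    /* Attaching to the skb transfers the one reference; no increment. */
    if (driver_frees_after_attach)
        model_put_page(&pg);        /* driver's own free: count reaches 0 */

    model_put_page(&pg);            /* skb release drops the frag's page */
    return pg.count;
}
```

In this model the count ends at 0 when the driver leaves the page to the skb, and goes negative when the driver also frees it.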

The relevant source code is at the end of this email.

My problem is this:

When I run this code, it appears to create the fragmented packet just fine, but 
when it passes it up the stack, the kernel spits out these bugs, one after 
another:

BUG: Bad page state in process swapper  pfn:0ee9b
page:c051f360 flags:(null) count:-3 mapcount:0 mapping:(null) index:766
Call Trace:
[c032bc30] [c0006ef0] show_stack+0x44/0x16c (unreliable)
[c032bc70] [c006c438] bad_page+0x94/0x130
[c032bc90] [c006d4a0] get_page_from_freelist+0x458/0x4d4
[c032bd20] [c006d5f4] __alloc_pages_nodemask+0xd8/0x4f8
[c032bda0] [c01a1174] emac_poll_rx+0x300/0x9c8
[c032bdf0] [c019cb64] mal_poll+0xa8/0x1ec
[c032be20] [c01cf218] net_rx_action+0x9c/0x1b4
[c032be50] [c0039678] __do_softirq+0xc4/0x148
[c032be90] [c0004d18] do_softirq+0x78/0x80
[c032bea0] [c0039264] irq_exit+0x64/0x7c
[c032beb0] [c0005210] do_IRQ+0x9c/0xb4
[c032bed0] [c000fa7c] ret_from_except+0x0/0x18
[c032bf90] [c000808c] cpu_idle+0xdc/0xec
[c032bfb0] [c00028fc] rest_init+0x70/0x84
[c032bfc0] [c02e0864] start_kernel+0x240/0x2c4
[c032bff0] [c0002254] start_here+0x44/0xb0
BUG: Bad page state in process swapper  pfn:0ee8c
page:c051f180 flags:(null) count:-3 mapcount:0 mapping:(null) index:757
Call Trace:
[c032bc30] [c0006ef0] show_stack+0x44/0x16c (unreliable)
[c032bc70] [c006c438] bad_page+0x94/0x130
[c032bc90] [c006d4a0] get_page_from_freelist+0x458/0x4d4
[c032bd20] [c006d5f4] __alloc_pages_nodemask+0xd8/0x4f8
[c032bda0] [c01a1174] emac_poll_rx+0x300/0x9c8
[c032bdf0] [c019cb64] mal_poll+0xa8/0x1ec
[c032be20] [c01cf218] net_rx_action+0x9c/0x1b4
[c032be50] [c0039678] __do_softirq+0xc4/0x148
[c032be90] [c0004d18] do_softirq+0x78/0x80
[c032bea0] [c0039264] irq_exit+0x64/0x7c
[c032beb0] [c0005210] do_IRQ+0x9c/0xb4
[c032bed0] [c000fa7c] ret_from_except+0x0/0x18
[c032bf90] [c000808c] cpu_idle+0xdc/0xec
[c032bfb0] [c00028fc] rest_init+0x70/0x84
[c032bfc0] [c02e0864] start_kernel+0x240/0x2c4
[c032bff0] [c0002254] start_here+0x44/0xb0

I know that I am missing something when it comes to allocating the pages for 
the fragments, but when I compare my methodology to the e1000 driver, they 
appear to be functionally the same?

Any ideas?  I can send the entire source file for the driver if needs be.

Thanks!

Jonathan


Here is the source:

static int emac_poll_rx(void *param, int budget)
{

	... /* Other code is here */

	push_packet:
		skb->dev = dev->ndev;
		skb->protocol = eth_type_trans(skb, dev->ndev);
		emac_rx_csum(dev, skb, ctrl);

		if (unlikely(netif_receive_skb(skb) == NET_RX_DROP))
			++dev->estats.rx_dropped_stack;
	next:
		++dev->stats.rx_packets;
	skip:
		dev->stats.rx_bytes += len;
		slot = (slot + 1) % NUM_RX_BUFF;
		--budget;
		++received;
		continue;
	sg:
		if (ctrl & MAL_RX_CTRL_FIRST) {
			BUG_ON(dev->rx_sg_skb);
			if (unlikely(emac_alloc_rx_skb2(dev, slot, GFP_ATOMIC))) {
				DBG(dev, "rx OOM %d (%d) (%d)" NL, slot,
				    dev->rx_skb_size, len);
				++dev->estats.rx_dropped_oom;
				emac_recycle_rx_skb(dev, slot, 0);
			} else {
				dev->rx_sg_skb = skb;
				skb_fill_page_desc(dev->rx_sg_skb, 0, dev->page, 0, len);
				emac_consume_page(dev, len, slot);
				dev->rx_sg_skb->len += ETH_HLEN;
			}
		} else if (!emac_rx_sg_append(dev, slot) &&
			   (ctrl & MAL_RX_CTRL_LAST)) {
			skb = dev->rx_sg_skb;
			dev->rx_sg_skb = NULL;

			ctrl &= EMAC_BAD_RX_MASK;
			if (unlikely(ctrl && ctrl != EMAC_RX_TAH_BAD_CSUM)) {
				emac_parse_rx_error(dev, ctrl);
				++dev->estats.rx_dropped_error;
				dev_kfree_skb(skb);
				len = 0;
			} else
				goto push_packet;
		}

	... /* Other code is here */
} /* end of emac_poll_rx */

static inline int emac_alloc_rx_skb2(struct emac_instance *dev, int slot,
				     gfp_t flags)
{
	struct sk_buff *skb = alloc_skb(242, flags);
	if (unlikely(!skb))
		return -ENOMEM;

	dev->rx_skb[slot] = skb;
	dev->rx_desc[slot].data_len = 0;


Jumbo Frames, sil24 SATA driver, and kswapd0 page allocation failures

2009-08-12 Thread Jonathan Haws
All,

I am having some issues with my target and was hoping that someone could lend a 
hand.  I am using an AMCC 405EX (Kilauea) board running Linux kernel 2.6.31.

Here is the problem.  I have some code that receives jumbo frames via the EMAC, 
sticks the data in a buffer, and writes the data out to a solid-state SATA disk 
(using a Silicon Image 3531 controller).

What is happening is that I appear to be running out of memory and I cannot 
figure out why.  The closest thing I can tell is that the sil24 driver for the 
SATA controller does not seem to be releasing memory back to the kernel for 
some reason.  After some time of capturing data and logging it to disk, I get 
the following kernel dump:

kswapd0: page allocation failure. order:2, mode:0x4020
Call Trace:
[cfaa19a0] [c0006ef0] show_stack+0x44/0x16c (unreliable)
[cfaa19e0] [c006f5e4] __alloc_pages_nodemask+0x38c/0x4f8
[cfaa1a60] [c006f770] __get_free_pages+0x20/0x50
[cfaa1a70] [c00955d4] __kmalloc_track_caller+0xcc/0xf0
[cfaa1a90] [c01c437c] __alloc_skb+0x60/0x140
[cfaa1ab0] [c01a319c] emac_poll_rx+0x46c/0x7e4
[cfaa1af0] [c019e85c] mal_poll+0xa8/0x1ec
[cfaa1b20] [c01cfddc] net_rx_action+0x9c/0x1b4
[cfaa1b50] [c003b3a8] __do_softirq+0xc4/0x148
[cfaa1b90] [c0004d18] do_softirq+0x78/0x80
[cfaa1ba0] [c003af94] irq_exit+0x64/0x7c
[cfaa1bb0] [c0005210] do_IRQ+0x9c/0xb4
[cfaa1bd0] [c000fa7c] ret_from_except+0x0/0x18
[cfaa1c90] [c0094dc4] kmem_cache_free+0x74/0xcc
[cfaa1cb0] [c00c0570] free_buffer_head+0x38/0x84
[cfaa1cc0] [c00c0b8c] try_to_free_buffers+0x94/0xe0
[cfaa1cf0] [c0067e70] try_to_release_page+0x6c/0x84
[cfaa1d00] [c0075f58] shrink_page_list+0x648/0x818
[cfaa1de0] [c0076620] shrink_zone+0x4f8/0xac4
[cfaa1f00] [c0077294] kswapd+0x4a0/0x4bc
[cfaa1fc0] [c004d6d8] kthread+0x70/0x74
[cfaa1ff0] [c000f220] kernel_thread+0x4c/0x68
Mem-Info:
DMA per-cpu:
CPU0: hi:   90, btch:  15 usd:  54
Active_anon:5155 active_file:626 inactive_anon:5216
 inactive_file:42474 unevictable:0 dirty:176 writeback:0 unstable:0
 free:631 slab:6416 mapped:324 pagetables:32 bounce:0
DMA free:2524kB min:2036kB low:2544kB high:3052kB active_anon:20620kB inactive_anon:20864kB active_file:2504kB inactive_file:169896kB unevictable:0kB present:260096kB pages_scanned:64 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 345*4kB 119*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2524kB
43129 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 0kB
Total swap = 0kB
65536 pages RAM
1397 pages reserved
43434 pages shared
20347 pages non-shared

I am not sure what is causing this.  It only happens when I run both the 
network and the SATA disk at the same time.  If I only capture data on the 
EMAC, things work just fine (I ran the system overnight, capturing data at 
36Mbytes/s without even a hiccup).  If I only write data to disk, things seem 
to work fine.  But when I combine the two, then things go crazy.
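
As an aside on reading the dump above: the allocator does not only check whether a free block of the requested size exists, it also checks the zone's free-page watermark, discounting the pages held in smaller blocks.  The sketch below is a user-space model of that check as I recall it from kernels of this era (zone_watermark_ok()); the function name watermark_ok(), the flag values, and the exact rule are assumptions for illustration and should be compared against the actual kernel source before drawing conclusions.

```c
#include <stddef.h>

#define ALLOC_HARDER 0x1   /* flag values are local to this sketch */
#define ALLOC_HIGH   0x2

/* nr_free[o] is the number of free blocks of 2^o pages, as printed in the
 * "DMA: 345*4kB 119*8kB ..." line.  free_pages_total and mark are in
 * pages.  Returns 1 if the request would pass the check, 0 otherwise. */
static int watermark_ok(const unsigned long *nr_free, int order,
                        long free_pages_total, long mark, int alloc_flags)
{
    long min = mark;
    long free_pages = free_pages_total - (1L << order) + 1;

    if (alloc_flags & ALLOC_HIGH)
        min -= min / 2;
    if (alloc_flags & ALLOC_HARDER)
        min -= min / 4;

    if (free_pages <= min)          /* lowmem_reserve is 0 in the dump */
        return 0;

    for (int o = 0; o < order; o++) {
        /* Blocks smaller than the request cannot satisfy it. */
        free_pages -= (long)(nr_free[o] << o);

        /* Require fewer free pages at each higher order. */
        min >>= 1;

        if (free_pages <= min)
            return 0;
    }
    return 1;
}
```

Fed with the numbers from the dump (631 free pages, a min watermark of 2036kB, i.e. 509 pages, and the per-order counts 345, 119, 0, 0, 1, 1), this model accepts order-0 and order-1 requests but rejects an order-2 request, because once the 4kB and 8kB blocks are discounted only 45 pages remain.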

Here is the loop:

for(;;)
{
    /* Flush the buffer once another jumbo frame might not fit */
    if( dataLength + 9000 > 16*1024*1024 )
    {
        write(fd, (char*)&rxBuf[count][0], dataLength);
        fsync(fd);
        wrBytes += dataLength;
        dataLength = 0;

        count = (count+1)%RXCNT;
    }

    bytes = recvfrom(sock.socket,(char*)&rxBuf[count][dataLength],
                     MTUSIZE, (int)NULL, NULL, NULL);

    rxBytes += bytes;
    dataLength += bytes;

    sched_yield();

} /* for(;;) */

A pretty simple loop to receive the data, place it into a buffer, and write it 
to disk when ready.
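
Note that the loop as posted does not check the return values of write() or recvfrom(), so a short write or a failed receive would go unnoticed.  A minimal sketch of a helper that retries until the whole buffer has been written follows; write_all() is a hypothetical name, not something from the original program.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly `len` bytes from `buf` to `fd`,
 * retrying on short writes and EINTR.  Returns 0 on success, -1 on
 * error (errno is left as set by write()). */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, try again */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

In the loop above it would replace the bare write() call, with the result checked before wrBytes is updated.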

What is it about the write call that would not release memory?

Any ideas?  Has anyone seen this type of behavior before?

Thanks!

Jonathan

