No 100 HZ timer !

2001-04-09 Thread schwidefsky




Hi,
seems like my first try with the complete patch hasn't made it through
to the mailing list. This is the second try with only the common part of
the patch. Here we go (again):
---

I have a suggestion that might seem unusual at first but it is important
for Linux on S/390. We are facing the problem that we want to start many
(> 1000) Linux images on a big S/390 machine. Every image has its own
100 HZ timer on every processor the image uses (normally 1). On a single
image system the processor use of the 100 HZ timer is not a big deal but
with > 1000 images you need a lot of processing power just to execute the
100 HZ timers. You quickly end up with 100% CPU only for the timer
interrupts of otherwise idle images. Therefore I had a go at the timer
stuff and now I have a system running without the 100 HZ timer.
Unfortunately I need to make changes to common code and I want your
opinion on it.

The first problem was how to get rid of the jiffies. The solution is
simple. I simply defined a macro that calculates the jiffies value from
the TOD clock:
  #define jiffies ({ \
          uint64_t __ticks; \
          asm ("STCK %0" : "=m" (__ticks) ); \
          /* TOD clock value >> 12 yields microseconds */ \
          __ticks = (__ticks - init_timer_cc) >> 12; \
          do_div(__ticks, (1000000/HZ)); \
          ((unsigned long) __ticks); \
  })
With this define you are independent of the jiffies variable, which is no
longer needed, so I ifdef'ed its definition. There are some places where a
local variable is named jiffies. The macro must not be expanded there, so I
renamed those variables to _jiffies. A kernel compiled with only this change
works as always.
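
The conversion behind the macro can be tried outside the kernel. What follows
is a minimal sketch in plain C, assuming that a TOD clock value shifted right
by 12 counts microseconds and that HZ is 100, so one jiffy is 1000000/HZ
microseconds. The helper name tod_to_jiffies and its parameters are made up
for this illustration and are not part of the patch.

```c
#include <stdint.h>

#define HZ 100  /* assumed tick rate for this sketch */

/* Convert a raw TOD clock value into jiffies elapsed since init_tod.
 * Hypothetical helper: the kernel macro reads the clock with STCK instead
 * of taking it as a parameter. */
static unsigned long tod_to_jiffies(uint64_t tod, uint64_t init_tod)
{
    uint64_t usecs = (tod - init_tod) >> 12;   /* TOD units -> microseconds */

    return (unsigned long)(usecs / (1000000 / HZ));
}
```

Passing the clock value in as an argument keeps the arithmetic testable on
any machine; only the STCK read itself is S/390 specific.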

The second problem is that you need to be able to find out when the next
timer event is due to happen. You'll find a new function "next_timer_event"
in the patch which traverses tv1-tv5 and returns the timer_list of the next
timer event. It is used in timer_bh to indicate to the backend when the
next interrupt should happen. This leads us to the notifier functions.
Each time a new timer is added, a timer is modified, or a timer expires
the architecture backend needs to reset its timeout value. That is what the
"timer_notify" callback is used for. The implementation on S/390 uses the
clock comparator and looks like this:
  static void s390_timer_notify(unsigned long expires)
  {
          S390_lowcore.timer_event =
                  ((__u64) expires*CLK_TICKS_PER_JIFFY) + init_timer_cc;
          asm volatile ("SCKC %0" : : "m" (S390_lowcore.timer_event));
  }
This causes an interrupt on the cpu which executed s390_timer_notify after
"expires" has passed. That means that timer events are spread over the cpus
in the system. Modified or deleted timer events do not cause a deletion
notification. A cpu might be erroneously interrupted too early because of a
timer event that has been modified or deleted. But that doesn't do any
harm, it is just unnecessary work.
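
The two steps described above (find the soonest pending timer, then turn its
expiry into an absolute clock value) can be sketched in portable C. This is
only an illustration: a flat array stands in for the tv1-tv5 timer vectors,
the value chosen for CLK_TICKS_PER_JIFFY assumes HZ = 100 with the clock
counting 4096 units per microsecond, and jiffies_to_comparator is a made-up
helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value: 10000 microseconds per jiffy (HZ = 100), 4096 clock units
 * per microsecond. The patch uses the constant but does not show it. */
#define CLK_TICKS_PER_JIFFY ((uint64_t)10000 << 12)

/* Find the earliest expiry among n pending timers.
 * Returns 1 and stores it in *next, or returns 0 if there is no timer. */
static int next_timer_event(const unsigned long *expires, size_t n,
                            unsigned long *next)
{
    size_t i;

    if (n == 0)
        return 0;
    *next = expires[0];
    for (i = 1; i < n; i++)
        if (expires[i] < *next)
            *next = expires[i];
    return 1;
}

/* Hypothetical helper: absolute clock value at which jiffy `expires` begins,
 * i.e. the value s390_timer_notify would load into the clock comparator. */
static uint64_t jiffies_to_comparator(unsigned long expires, uint64_t init_tod)
{
    return (uint64_t)expires * CLK_TICKS_PER_JIFFY + init_tod;
}
```

The plain `<` comparison ignores jiffies wraparound, which a real
implementation would have to handle.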

There is a second callback "itimer_notify" that is used to get the per
process timers right. We use the cpu timer for this purpose:
  void set_cpu_timer(void)
  {
          unsigned long min_ticks;
          __u64 time_slice;
          if (current->pid != 0 && current->need_resched == 0) {
                  min_ticks = current->counter;
                  if (current->it_prof_value != 0 &&
                      current->it_prof_value < min_ticks)
                          min_ticks = current->it_prof_value;
                  if (current->it_virt_value != 0 &&
                      current->it_virt_value < min_ticks)
                          min_ticks = current->it_virt_value;
                  time_slice = (__u64) min_ticks*CLK_TICKS_PER_JIFFY;
                  asm volatile ("spt %0" : : "m" (time_slice));
          }
  }
The cpu timer is a one shot timer that interrupts after the specified
amount of time has passed. It is not 100% accurate because VM can schedule
the virtual processor before the "spt" has been done, but it is good enough
for per process timers.
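
The selection logic in set_cpu_timer is independent of the hardware and can
be isolated as a small function. A sketch, with the three task fields passed
in as plain parameters (the function name min_ticks_until_event is invented
here):

```c
/* Number of ticks until the next per-process event: the remaining time
 * slice, shortened by any armed (non-zero) interval timer that is due
 * earlier. A value of 0 for an interval timer means "not armed". */
static unsigned long min_ticks_until_event(unsigned long counter,
                                           unsigned long it_prof_value,
                                           unsigned long it_virt_value)
{
    unsigned long ticks = counter;

    if (it_prof_value != 0 && it_prof_value < ticks)
        ticks = it_prof_value;
    if (it_virt_value != 0 && it_virt_value < ticks)
        ticks = it_virt_value;
    return ticks;
}
```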

The remaining changes to common code parts deal with the problem that many
ticks may be accounted at once. For example without the 100 HZ timer it is
possible that a process runs for half a second in user space. With the next
interrupt all the ticks between the last update and the interrupt have to
be added to the tick counters. This is why update_wall_time and do_it_prof
have changed and update_process_times2 has been introduced.
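
To illustrate what "accounting many ticks at once" means for an interval
timer, here is a simplified sketch. It is not the code from the patch: the
struct and the function itimer_account are hypothetical, and the sketch
reports at most one expiry per call even if several reload intervals have
passed.

```c
struct itimer {
    unsigned long value;  /* ticks until expiry, 0 = not armed */
    unsigned long incr;   /* reload value after expiry, 0 = one shot */
};

/* Account `ticks` elapsed ticks in one step instead of one per interrupt.
 * Returns 1 if the timer expired (the caller would send the signal). */
static int itimer_account(struct itimer *t, unsigned long ticks)
{
    if (t->value == 0)
        return 0;              /* timer not armed */
    if (ticks < t->value) {
        t->value -= ticks;     /* not yet due */
        return 0;
    }
    t->value = t->incr;        /* expired: reload (or disarm if incr == 0) */
    return 1;
}
```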

That leaves three problems: 1) you need to check on every system entry if
a tick or more has passed and do the update if necessary, 2) you need to
keep track of the elapsed time in user space and in kernel space and 3) you
need to check tq_timer every time the system is left and set up a timer
event for the next timer tick if there is work to do on the timer queue.
These three problems are related and have to be solved in architecture
dependent code. A nice thing we get for free is that the user/kernel
elapsed time measurement gets much more accurate.
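
Problem 2) amounts to taking a time stamp at every transition between user
and kernel mode. A minimal user-space model of that bookkeeping follows; the
struct and function names are invented for this sketch, clock values are
passed in rather than read from hardware, and nested interrupts taken while
already in the kernel are ignored.

```c
#include <stdint.h>

struct cpu_times {
    uint64_t utime;      /* clock units spent in user space */
    uint64_t stime;      /* clock units spent in the kernel */
    uint64_t last_mark;  /* clock value at the last entry or exit */
};

/* System entry: the time since the last mark was spent in user space. */
static void account_enter(struct cpu_times *t, uint64_t now)
{
    t->utime += now - t->last_mark;
    t->last_mark = now;
}

/* System exit: the time since the last mark was spent in the kernel. */
static void account_leave(struct cpu_times *t, uint64_t now)
{
    t->stime += now - t->last_mark;
    t->last_mark = now;
}
```

Because every transition is stamped, the split between user and system time
no longer depends on where a periodic tick happens to land.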

The number of interrupts in an idle system due to timer activit

Re: No 100 HZ timer !

2001-04-10 Thread schwidefsky



>Just how would you do kernel/user CPU time accounting then ?  It's
>currently done on every timer tick, and doing it less often would make
>it useless.
This part is architecture dependent. For S/390 I chose to do a "STCK" on
every system entry/exit. Dunno if this can be done on other architectures
too, on S/390 this is reasonably cheap (one STCK costs 15 cycles). That
means the kernel/user CPU time accounting is MUCH better now.

blue skies,
   Martin

Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: [EMAIL PROTECTED]


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: No 100 HZ timer !

2001-04-10 Thread schwidefsky



>Its worth doing even on the ancient x86 boards with the PIT. It does
>require some driver changes since
>
>
>while(time_before(jiffies, we_explode))
> poll_things();
>
>no longer works
On S/390 we have a big advantage here. Driver code of this kind does not
exist. That makes it a lot easier for us compared to other architectures.
As I said in the original posting, the patch I have is working fine for
S/390.
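
For readers unfamiliar with the quoted idiom: the loop compares a
free-running tick counter against a deadline in a way that tolerates counter
wraparound. A minimal sketch of that comparison, written as a function with
the made-up name tick_before (the kernel provides time_before as a macro):

```c
/* Wraparound-safe "a is before b" for free-running tick counters.
 * The unsigned difference is reinterpreted as signed, so the result stays
 * correct as long as a and b are less than half the counter range apart. */
static int tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}
```

Such a loop only terminates if the counter keeps advancing while the code
spins, which is why this kind of driver code is sensitive to how jiffies is
updated.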

blue skies,
   Martin




Re: No 100 HZ timer !

2001-04-10 Thread schwidefsky



>> Just how would you do kernel/user CPU time accounting then ?  It's
>> currently done on every timer tick, and doing it less often would make
>> it useless.
>
>On the contrary doing it less often but at the right time massively
>improves its accuracy. You do it on reschedule. An rdtsc instruction is
>cheap and all of a sudden you have nearly cycle accurate accounting
If you do the accounting on reschedule, how do you find out how much time
has been spent in user versus kernel mode? Or do the Intel chips have two
counters, one for user space execution and one for the kernel?

blue skies,
   Martin




Re: No 100 HZ timer !

2001-04-10 Thread schwidefsky



>Does not sound very attractive all at all on non virtual machines (I see
>the point on UML/VM):
>making system entry/context switch/interrupts slower, making add_timer
>slower, just to process a few less timer interrupts. That's like robbing
>the fast paths for a slow path.
The system entry/exit/context switch is slower. The add_timer/mod_timer is
only a little bit slower in the case that a new soonest timer event has
been created. I think you can forget the additional overhead for
add_timer/mod_timer, it's the additional path length on the system
entry/exit that might be problematic.

blue skies,
   Martin




Re: No 100 HZ timer !

2001-04-10 Thread schwidefsky



>BTW. Why we need to redesign timers at all? The cost of timer interrupt
>each 1/100 second is nearly zero (1000 instances on S/390 VM is not common
>case - it is not reasonable to degradate performance of timers because of
>this).
The cost of the timer interrupts on a single image system is negligible,
true. As I already pointed out in the original proposal we are looking
for a solution that will allow us to minimize the costs of the timer
interrupts when we run many images. For us this case is not unusual and
it is reasonable to degrade performance of a running system by a very
small amount to get rid of the HZ timer. This proposal was never meant
to be the perfect solution for every platform, that is why it is
configurable with the CONFIG_NO_HZ_TIMER option.

blue skies,
   Martin




Re: No 100 HZ timer !

2001-04-11 Thread schwidefsky



>f) As noted, the account timers (task user/system times) would be much
>more accurate with the tick less approach.  The cost is added code in
>both the system call and the schedule path.
>
>Tentative conclusions:
>
>Currently we feel that the tick less approach is not acceptable due to
>(f).  We felt that this added code would NOT be welcome AND would, in a
>reasonably active system, have much higher overhead than any savings in
>not having a tick.  Also (d) implies a list organization that will, at
>the very least, be harder to understand.  (We have some thoughts here,
>but abandoned the effort because of (f).)  We are, of course, open to
>discussion on this issue and all others related to the project
>objectives.
f) might be true on Intel based systems. At least for s/390 the situation
is a little bit different. Here is an extract from the s/390 part of the
timer patch:

   .macro  UPDATE_ENTER_TIME reload
   la  %r14,thread+_TSS_UTIME(%r9) # pointer to utime
   tm  SP_PSW+1(%r15),0x01  # interrupting from user ?
   jno 0f   # yes -> add to user time
   la  %r14,8(%r14) # no -> add to system time
0: lm  %r0,%r1,0(%r14)  # load user/system time
   sl  %r1,__LC_LAST_MARK+4 # subtract last time mark
   bc  3,BASED(1f)  # borrow ?
   sl  %r0,BASED(.Lc1)
1: sl  %r0,__LC_LAST_MARK
   stck __LC_LAST_MARK  # make a new mark
   al  %r1,__LC_LAST_MARK+4 # add new mark -> added delta
   bc  12,BASED(2f) # carry ?
   al  %r0,BASED(.Lc1)
2: al  %r0,__LC_LAST_MARK
   stm %r0,%r1,0(%r14)  # store updated user/system time
   clc __LC_LAST_MARK(8),__LC_JIFFY_TIMER # check if enough time
   jl  3f   # passed for a jiffy update
   l   %r1,BASED(.Ltime_warp)
   basr %r14,%r1
   .if \reload  # reload != 0 for system call
   lm  %r2,%r6,SP_R2(%r15)  # reload clobbered parameters
   .endif
3:
   .endm

   .macro  UPDATE_LEAVE_TIME
   l   %r1,BASED(.Ltq_timer) # test if tq_timer list is empty
   x   %r1,0(%r1)   # tq_timer->next != tq_timer ?
   jz  0f
   l   %r1,BASED(.Ltq_timer_active)
   icm %r0,15,0(%r1) # timer event already added ?
   jnz 0f
   l   %r1,BASED(.Ltq_pending)
   basr %r14,%r1
0: lm  %r0,%r1,thread+_TSS_STIME(%r9) # load system time
   sl  %r1,__LC_LAST_MARK+4 # subtract last time mark
   bc  3,BASED(1f)  # borrow ?
   sl  %r0,BASED(.Lc1)
1: sl  %r0,__LC_LAST_MARK
   stck __LC_LAST_MARK  # make new mark
   al  %r1,__LC_LAST_MARK+4 # add new mark -> added delta
   bc  12,BASED(2f) # carry ?
   al  %r0,BASED(.Lc1)
2: al  %r0,__LC_LAST_MARK
   stm %r0,%r1,thread+_TSS_STIME(%r9) # store system time
   .endm

The two macros UPDATE_ENTER_TIME and UPDATE_LEAVE_TIME are executed
on every system entry/exit. In the case that no special work has to
be done fewer than 31 instructions are executed in addition to the
normal system entry/exit code. Special work has to be done if more
time than 1/HZ has passed (call time_warp), or if tq_timer contains
an element (call tq_pending).
The accuracy of the timer events has not changed. It still is 1/HZ.
The only thing this patch does is to avoid unneeded interruptions.
I'd be happy if this could be combined with a new, more accurate
timing method.
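
The sl/bc/sl and al/bc/al sequences in the macros do 64-bit arithmetic in
pairs of 32-bit registers: the accumulated time has the old mark subtracted
and the new mark added, with the borrow or carry propagated by hand. The same
technique in C, as a sketch with invented names (u64_pair, sub64, add64):

```c
#include <stdint.h>

struct u64_pair {
    uint32_t hi;  /* upper 32 bits */
    uint32_t lo;  /* lower 32 bits */
};

/* acc -= val using only 32-bit operations. */
static void sub64(struct u64_pair *acc, struct u64_pair val)
{
    uint32_t old_lo = acc->lo;

    acc->lo -= val.lo;
    if (acc->lo > old_lo)      /* low word wrapped around: borrow */
        acc->hi -= 1;
    acc->hi -= val.hi;
}

/* acc += val using only 32-bit operations. */
static void add64(struct u64_pair *acc, struct u64_pair val)
{
    acc->lo += val.lo;
    if (acc->lo < val.lo)      /* low word wrapped around: carry */
        acc->hi += 1;
    acc->hi += val.hi;
}
```

Calling sub64 with the old mark and then add64 with the new mark adds the
elapsed time to the accumulator, which is what each macro does once per
entry or exit.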

blue skies,
   Martin




bug in generic math-emu

2001-02-19 Thread schwidefsky



Hi,
I found a bug in the generic floating point emulation code. The glibc
math testcases showed errors for the nearbyint(0.5) test. The problem
is in the _FP_FRAC_SRS_2 macro. The expression to find out if some 1
bits have been shifted out is wrong. Try the _FP_FRAC_SRS_2 macro
with the following parameters: X_f1 = 0x00800000, X_f0 = 0x00000000,
N = 53, sz = 56. It comes back with 0x00000005 instead of
0x00000004. Now you can do the math ;-)
With the following patch test-float and test-double (both from glibc)
now run without errors on an S/390 without an IEEE FPU.

diff -u -r1.1.1.1 op-2.h
--- include/math-emu/op-2.h   2000/02/04 13:53:47  1.1.1.1
+++ include/math-emu/op-2.h   2001/02/19 18:53:41
@@ -79,7 +79,7 @@
 else \
   {   \
 X##_f0 = (X##_f1 >> ((N) - _FP_W_TYPE_SIZE) |\
-   (((X##_f1 << (sz - (N))) | X##_f0) != 0));   \
+   (((X##_f1 << (2*_FP_W_TYPE_SIZE - (N))) | X##_f0) != 0)); \
 X##_f1 = 0;   \
   }   \
   } while (0)
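
The effect of the one-line change can be reproduced in user space. The sketch
below assumes 32-bit words (W stands in for _FP_W_TYPE_SIZE) and only models
the branch touched by the patch, i.e. a shift count n with W < n < 2*W. The
function names are invented, and the inputs in the usage note (X_f1 =
0x00800000, X_f0 = 0, N = 53, sz = 56) are chosen for this illustration.

```c
#include <stdint.h>

#define W 32  /* assumed word size, standing in for _FP_W_TYPE_SIZE */

/* Low result word with the original sticky-bit expression.
 * Requires W < n < 2*W and 0 <= sz - n < W. */
static uint32_t srs_low_orig(uint32_t f1, uint32_t f0, int n, int sz)
{
    return (f1 >> (n - W)) | (((uint32_t)(f1 << (sz - n)) | f0) != 0);
}

/* Low result word with the patched expression: shifting left by 2*W - n
 * keeps exactly the n - W low bits of f1 that the right shift discards. */
static uint32_t srs_low_fixed(uint32_t f1, uint32_t f0, int n)
{
    return (f1 >> (n - W)) | (((uint32_t)(f1 << (2 * W - n)) | f0) != 0);
}
```

With the inputs named above, srs_low_orig yields 5 because the bit that
survives the shift is also counted as a lost bit, while srs_low_fixed yields
4.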


blue skies,
   Martin




Memory management bug

2000-11-15 Thread schwidefsky



I think I spotted a problem in the memory management of some (all?)
architectures in 2.4.0-test10.  At the moment I am fighting with the 64bit
backend for the new S/390 machines. I experienced infinite loops in
do_check_pgt_cache because pgtable_cache_size indicated that a lot of pages
were in the quicklists but the pgd/pmd/pte quicklists were empty (NULL
pointers). After some trickery with a special hardware feature (storage
keys) I found out that empty_bad_pmd_table and empty_bad_pte_table had
been put on the page table quicklists multiple(!) times. It is already a
bug that these two arrays are inserted into the quicklists at all, but the
second insertion destroys the quicklists. I solved this problem by
inserting checks for the special entries in the free_xxx_fast routines,
here is a sample for the i386 free_pte_fast:

diff -u -r1.5 pgalloc.h
--- include/asm-i386/pgalloc.h  2000/11/02 10:14:51 1.5
+++ include/asm-i386/pgalloc.h  2000/11/15 12:27:58
@@ -80,8 +80,11 @@
        return (pte_t *)ret;
 }

+extern pte_t empty_bad_pte_table[];
 extern __inline__ void free_pte_fast(pte_t *pte)
 {
+       if (pte == empty_bad_pte_table)
+               return;
        *(unsigned long *)pte = (unsigned long) pte_quicklist;
        pte_quicklist = (unsigned long *) pte;
        pgtable_cache_size++;

I still get the "__alloc_pages: 2-order allocation failed." error messages
but at least the machine doesn't go into infinite loops anymore. Could
someone with more experience with the other architectures verify that my
observation is true?
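
The quicklist is a singly linked free list that stores the next pointer in
the first word of each cached table. A simplified user-space model with the
guard from the patch is sketched below; the names free_fast, alloc_fast and
bad_table are stand-ins chosen for this illustration. In such a model,
pushing the same table twice in a row makes its next pointer refer to itself
while the counter is incremented twice, so the counter and the list no longer
agree.

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t *quicklist;      /* head of the free list */
static unsigned long cache_size;  /* number of tables on the list */
static uintptr_t bad_table[4];    /* stand-in for empty_bad_pte_table */

/* Push a table onto the list; its first word stores the next pointer. */
static void free_fast(uintptr_t *table)
{
    if (table == bad_table)       /* the guard added by the patch */
        return;
    table[0] = (uintptr_t)quicklist;
    quicklist = table;
    cache_size++;
}

/* Pop a table from the list, or return NULL if the list is empty. */
static uintptr_t *alloc_fast(void)
{
    uintptr_t *ret = quicklist;

    if (ret) {
        quicklist = (uintptr_t *)ret[0];
        ret[0] = 0;
        cache_size--;
    }
    return ret;
}
```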

blue skies,
   Martin




Re: Memory management bug

2000-11-15 Thread schwidefsky



>> +extern pte_t empty_bad_pte_table[];
>>  extern __inline__ void free_pte_fast(pte_t *pte)
>>  {
>> +   if (pte == empty_bad_pte_table)
>> +   return;
>
>I guess that should be BUG() instead of return, so that the callers can be
>fixed.
Not really. pte_free and pmd_free are called from the common mm code but
the concept of empty_bad_{pte,pmd}_table is architecture dependent. The
trouble starts in arch/???/mm/init.c where these special arrays are
inserted into the paging tables. So the solution to the problem should be
in architecture dependent files too.

blue skies,
   Martin




establish_pte and ipte

2000-11-15 Thread schwidefsky



After the rework of the page management in 2.4.0-test10 I checked again
whether we could make use of the "invalidate page table entry" (ipte)
instruction in the s390 backend. Thanks to the great work of you guys we are
almost there. To be able to do that we would like to make establish_pte an
architecture dependent function. This way s390 could implement it in its own
special way (note: no flush_tlb_page because ipte already does it):

static inline void ptep_invalidate_and_flush(pte_t *ptep, unsigned long addr)
{
        if (pte_val(*ptep) & _PAGE_INVALID)
                return;
        __asm__ __volatile__ ("ipte %0,%1" : : "a" (ptep), "a" (addr) );
        pte_clear(ptep);
}

static inline void establish_pte(struct vm_area_struct * vma,
                                 unsigned long address,
                                 pte_t *page_table, pte_t entry)
{
        /* Note that there is a race condition if the new PTE is
         * also valid: Another CPU might try to access the page in
         * between the IPTE (setting the PTE invalid) and the store
         * (setting the PTE valid again).
         *
         * This doesn't matter, however, as the caller must hold
         * the mm semaphore; thus, if another CPU does in fact get
         * a page fault, the handler will spin until we have the
         * PTE consistent again;  then, the page access will
         * simply be retried.
         */
        ptep_invalidate_and_flush(page_table, address);
        set_pte(page_table, entry);
}

Besides moving establish_pte we only need one more fix:

--- vmscan.c    2000/11/02 10:14:52 1.19
+++ vmscan.c    2000/11/13 18:41:25 1.20
@@ -195,8 +195,7 @@

        /* Put the swap entry into the pte after the page is in swapcache */
        mm->rss--;
-       set_pte(page_table, swp_entry_to_pte(entry));
-       flush_tlb_page(vma, address);
+       establish_pte(vma, address, page_table, swp_entry_to_pte(entry));
        spin_unlock(&mm->page_table_lock);

        /* OK, do a physical asynchronous write to swap.  */


In addition we are trying to get rid of flush_tlb_page altogether. I did a
little experiment: I made flush_tlb_page a nop and checked whether all users
of flush_tlb_page already do the flush in a prior mm operation. At the moment
all users do so on s390. Therefore s390 could define flush_tlb_page as a nop,
but that would be an ugly hack. As soon as some new use of flush_tlb_page
is invented the s390 backend might break. A far better solution would be to
integrate the flush operation into the prior mm operation (again this would
move code from the common code to the arch folders). An example from
filemap_sync_pte:

        if (!ptep_test_and_clear_dirty(ptep))
                goto out;
        flush_page_to_ram(pte_page(pte));
        flush_cache_page(vma, address);
        flush_tlb_page(vma, address);

The flush_tlb_page is only used to make it known on all processors that the
dirty bit has been cleared. On s390 we will use the "set storage key
extended" (sske) instruction for ptep_test_and_clear_dirty which does
everything necessary. We do not need a flush_tlb_page on s390 in this case.
Therefore I would like to suggest that ptep_test_and_clear_dirty should do
the flushing. Same thing for ptep_get_and_clear.
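
The shape of that suggestion can be modelled in user space with C11 atomics.
This is a generic illustration and not s390 code: the bit position PTE_DIRTY,
the function names and the counter standing in for a TLB flush are all
assumptions made for the sketch. The point is that the flush becomes an
internal detail of the test-and-clear primitive, so an architecture that does
not need it can leave the hook empty.

```c
#include <stdatomic.h>

#define PTE_DIRTY 0x2UL           /* hypothetical dirty bit position */

static unsigned long flush_count; /* counts simulated TLB flushes */

/* Architecture hook: a real backend would flush the TLB entry here,
 * or do nothing if clearing the bit already made the change visible. */
static void arch_flush_after_clean(void)
{
    flush_count++;
}

/* Atomically clear the dirty bit; flush only if the bit was set.
 * Returns 1 if the entry was dirty, 0 otherwise. */
static int test_and_clear_dirty_flush(_Atomic unsigned long *pte)
{
    unsigned long old = atomic_fetch_and(pte, ~PTE_DIRTY);

    if (!(old & PTE_DIRTY))
        return 0;
    arch_flush_after_clean();
    return 1;
}
```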

Comments ?

blue skies,
   Martin




Re: Memory management bug

2000-11-16 Thread schwidefsky



>What happens if you just replace all places that would use a bad page
>table with a BUG()? (Ie do _not_ add the bug to the place where you
>added the test: by that time it's too late.  I'm talking about the
>places where the bad page tables are used, like in the error cases of
>"get_pte_kernel_slow()" etc.

Ok, the BUG() hit in get_pmd_slow:

pmd_t *
get_pmd_slow(pgd_t *pgd, unsigned long offset)
{
        pmd_t *pmd;
        int i;

        pmd = (pmd_t *) __get_free_pages(GFP_KERNEL,2);
        if (pgd_none(*pgd)) {
                if (pmd) {
                        for (i = 0; i < PTRS_PER_PMD; i++)
                                pmd_clear(pmd+i);
                        pgd_set(pgd, pmd);
                        return pmd + offset;
                }
                BUG();  /* <--- this one hit */
                pmd = (pmd_t *) get_bad_pmd_table();
                pgd_set(pgd, pmd);
                return NULL;
        }
        free_pages((unsigned long)pmd,2);
        if (pgd_bad(*pgd))
                BUG();
        return (pmd_t *) pgd_page(*pgd) + offset;
}

The allocation of 4 consecutive pages for the page middle directory failed.
This caused empty_bad_pmd_table to be used and clear_page_tables inserted
it into the pmd quicklist. The important question is: why did
__get_free_pages fail?

blue skies,
   Martin




Re: Memory management bug

2000-11-17 Thread schwidefsky



>>
>> If they absolutely needs 4 pages for pmd pagetables due hardware
>> constraints I'd recommend to use _four_ hardware pages for each
>> softpage, not two.
>
>Yes.
>
>However, it definitely is an issue of making trade-offs. Most 64-bit MMU
>models tend to have some flexibility in how you set up the page tables,
>and it may be possible to just move bits around too (ie making both the
>pmd and the pgd twice as large, and getting the expansion of 4 by doing
>two expand-by-two's, for example, if the hardware has support for doing
>things like that).

Unfortunately we don't have any flexibility. The segment index (pmd) has 11
bits and pointers are 8 bytes. That makes a 16K segment table. I have
understood that this is a problem if the system is really low on memory.
But low on memory does mean low on real memory + swap space, doesn't it ?
The system has enough swap space but it isn't using any of it when the BUG
hits. I think the "if (!order)" statements before the "goto try_again" in
__alloc_pages have something to do with it. To test this assumption I
removed the ifs and I didn't see any "__alloc_pages: %lu-order allocation
failed." message before I hit yet another BUG in swap_state.c:60.
What's the reasoning behind these ifs ?
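
The arithmetic behind the 16K table and the order-2 allocation seen in
get_pmd_slow can be checked with a few lines of C. The helper names are made
up, and the 4096-byte page size in the usage note is an assumption of this
sketch.

```c
#include <stddef.h>

/* Bytes needed for a table that is indexed by index_bits bits and holds
 * entries of entry_size bytes each. */
static size_t table_bytes(unsigned index_bits, size_t entry_size)
{
    return ((size_t)1 << index_bits) * entry_size;
}

/* Smallest allocation order such that 2^order pages cover `bytes`. */
static unsigned order_for(size_t bytes, size_t page_size)
{
    unsigned order = 0;

    while ((page_size << order) < bytes)
        order++;
    return order;
}
```

table_bytes(11, 8) is 16384, and order_for(16384, 4096) is 2, i.e. four
physically consecutive pages, which is exactly the kind of allocation that
becomes hard to satisfy once memory is fragmented.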

blue skies,
   Martin




Re: Memory management bug

2000-11-17 Thread schwidefsky



>> before I hit yet another BUG in swap_state.c:60.
>
>The bug in swap_state:60 shows a kernel bug in the VM or random memory
>corruption. Make sure you can reproduce on x86 to be sure it's not a s390
>that is randomly corrupting memory. If you read the oops after the BUG
>message with asm at hand you will see in the registers the value of
>page->mapping and you can guess if it's random memory corruption or bug in
>VM this way (for example if `reg & 3 != 0' it's memory corruption for sure,
>you should also if it's pointing to a suitable kernel-heap address).
I did a little closer investigation. The BUG was triggered by a page with
page->mapping pointing to an address space of a mapped ext2 file
(page->mapping->a_ops == &ext2_aops). The page had PG_locked, PG_uptodate,
PG_active and PG_swap_cache set. The stack backtrace showed that kswapd
called do_try_to_free_pages, refill_inactive, swap_out, swap_out_mm,
swap_out_vma, try_to_swap_out and add_to_swap_cache where BUG hit.
The registers look good, the struct page looks good. I don't think that
this was a random memory corruption.

>> Whats the reasoning behind these ifs ?
>
>To catch memory corruption or things running out of control in the kernel.
I was referring to the "if (!order) goto try_again" ifs in alloc_pages, not
the "if (something) BUG()" ifs.

blue skies,
   Martin




Re: Memory management bug

2000-11-21 Thread schwidefsky



>Agreed, that's almost sure _not_ random memory corruption of the page
>structure. It looks like a VM bug (if you can reproduce trivially I'd give
>a try to test8 too since test8 is rock solid for me while test10 lockups in
>VM core at the second bonnie if using emulated highmem).
I was lucky. Somehow I managed to f**k up my disk in a way that the
filesystem check triggers the bug in a reproducible way and always with the
same page! I set up a "trace store into" on the page structure and logged
who is changing the "struct page". Here is the log starting after
page->mapping was set:

address changed   function
5c13a   mapping   add_to_page_cache_unique
 count=2, flags=PG_locked, age=2
5b14a   next_hash __add_page_to_hash_queue
5b178   buffers   __add_page_to_hash_queue
68440   flags lru_cache_add
 flags=PG_active|PG_locked
6846a   lru   lru_cache_add
68470   lru   lru_cache_add
78fc6   virtual   create_empty_buffers
78fda   count create_empty_buffers
 count=3
6d9ce   count __free_pages
 count=2
5c122   list  __add_page_to_hash_queue
68464   lru   lru_cache_add
77b16   flags end_buffer_io_async
 flags=PG_active|PG_uptodate|PG_locked
77b52   flags end_buffer_io_async
 flags=PG_active|PG_uptodate|PG_locked
77bc4   flags end_buffer_io_async
 flags=PG_active|PG_uptodate
67792   age   age_page_up
 age=5
5c88c   count __find_get_page
 count=3
559be   count copy_page_range
 count=4
559be   count copy_page_range
 count=5
6d9ce   count __free_pages
 count=4
6b55e   lru   refill_inactive_scan
6b4ac   flags refill_inactive_scan
 flags=PG_active|PG_uptodate
6770c   age   age_page_down_ageonly
 age=2
6b570   lru   refill_inactive_scan
6b576   lru   refill_inactive_scan
6b56a   lru   refill_inactive_scan
6b55e   lru   refill_inactive_scan
6b4ac   flags refill_inactive_scan
 flags=PG_active|PG_uptodate
6770c   age   age_page_down_ageonly
 age=1
6b570   lru   refill_inactive_scan
6b576   lru   refill_inactive_scan
6b56a   lru   refill_inactive_scan
6b55e   lru   refill_inactive_scan
6b4ac   flags refill_inactive_scan
 flags=PG_active|PG_uptodate
6770c   age   age_page_down_ageonly
 age=0
6b570   lru   refill_inactive_scan
6b576   lru   refill_inactive_scan
6b56a   lru   refill_inactive_scan

program check at 6e1e0 because of BUG() in line 60 of swap_state.c.
Stack backtrace from there:
6e1e0 add_to_swap_cache
6900a try_to_swap_out
69408 swap_out_vma
69578 swap_out_mm
69838 swap_out
6b90a refill_inactive
6bab4 do_try_to_free_pages
6bbba kswapd

age_page_down_ageonly was always called from refill_inactive_scan. So
refill_inactive_scan lowers the age of the page but does not deactivate
the page when it reaches age==0 (page->count too big). try_to_swap_out
doesn't check for page->mapping and tries to swap out the page because the
age is 0. Bang!

blue skies,
   Martin

P.S. by the way this test was done on linux-2.4.0-test11




plug problem in linux-2.4.0-test11

2000-11-29 Thread schwidefsky



Hi,
I experienced disk hangs with linux-2.4.0-test11 on S/390 and after
some debugging I found the cause. It is the new method of unplugging
block devices that doesn't go along with the S/390 disk driver:

/*
 * remove the plug and let it rip..
 */
static inline void __generic_unplug_device(request_queue_t *q)
{
        if (!list_empty(&q->queue_head)) {
                q->plugged = 0;
                q->request_fn(q);
        }
}

The story goes like this: at the start the request queue was empty but
the disk driver was still working on an older, already dequeued
request. Someone plugged the device (q->plugged = 1 and the queue was
put on tq_disk). Then a new request arrived and the unplugging was
started. But before __generic_unplug_device was reached the
outstanding request finished. The bottom half of the S/390 disk
driver always checks for queued requests after an interrupt,
starts the first and dequeues some of the requests on the
request queue to put them on its internal queue. You could argue
that it shouldn't dequeue requests if q->plugged == 1. On the other
hand, why not, since otherwise the disk has nothing to do. Anyway the
result was that when the unplug routine was finally reached list_empty
was true. In that case q->plugged will not be cleared! The device
stays plugged. Forever.

The following implementation works:

/*
 * remove the plug and let it rip..
 */
static inline void __generic_unplug_device(request_queue_t *q)
{
        q->plugged = 0;
        if (!list_empty(&q->queue_head))
                q->request_fn(q);
}
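
The difference between the two versions shows up in a tiny user-space model.
The struct below is a stand-in invented for this sketch: `pending` replaces
the list_empty test and `started` counts calls to the request function.

```c
struct queue {
    int plugged;  /* 1 while the device is plugged */
    int pending;  /* requests still on the queue head */
    int started;  /* how often the request function was called */
};

/* Old behaviour: the plug is only removed if requests are queued. */
static void unplug_old(struct queue *q)
{
    if (q->pending > 0) {
        q->plugged = 0;
        q->started++;
    }
}

/* New behaviour: the plug is always removed. */
static void unplug_new(struct queue *q)
{
    q->plugged = 0;
    if (q->pending > 0)
        q->started++;
}
```

If the driver has already taken every request off the queue before the
unplug runs, unplug_old leaves plugged at 1, while unplug_new clears it
without calling the request function.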

blue skies,
   Martin




Re: 2.4.0-test11 ext2 fs corruption

2000-11-29 Thread schwidefsky



>--- drivers/block/ll_rw_blk.c~  Wed Nov 29 01:30:22 2000
>+++ drivers/block/ll_rw_blk.c   Wed Nov 29 01:33:00 2000
>@@ -684,7 +684,7 @@
>int max_segments = MAX_SEGMENTS;
>struct request * req = NULL, *freereq = NULL;
>int rw_ahead, max_sectors, el_ret;
>-   struct list_head *head = &q->queue_head;
>+   struct list_head *head;
>int latency;
>elevator_t *elevator = &q->elevator;

head = &q->queue_head is a simple offset calculation in the request
queue structure. Moving this into the spinlock won't change anything,
since q->queue_head isn't a pointer that can change.
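
That the address of an embedded member is just the structure address plus a
constant can be shown with offsetof. The structures below are cut-down
stand-ins for illustration only, not the kernel definitions:

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Cut-down stand-in for request_queue_t; only the layout matters here. */
struct request_queue {
    int plugged;
    struct list_head queue_head;
};

/* Returns 1 if &q->queue_head equals q plus a compile-time constant.
 * No field of *q is read, so no lock is needed to compute the address. */
static int head_is_fixed_offset(struct request_queue *q)
{
    return (char *)&q->queue_head ==
           (char *)q + offsetof(struct request_queue, queue_head);
}
```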

Independent of that I can second the observation that test11 can corrupt
ext2 in memory. I think that this is related to the memory management
problems I see but I can't prove it yet.

blue skies,
   Martin

Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: [EMAIL PROTECTED]





64 bit s390 and 2.4.0-test11 (was Memory management bug)

2000-12-07 Thread schwidefsky




Hi,
good news (at least for us): linux on the 64 bit S/390 (aka zServer)
is now running pretty stable. Our implementation of ptep_get_and_clear
didn't clear the pte if the invalid bit was already set. But a swapped
page has the invalid bit set too and in that case we didn't clear the
pte. That did cause the BUG() in swap_state.c:60.
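
A user-space model of that bug, as a sketch only: the bit value
TOY_PAGE_INVALID and both function names are hypothetical, and the buggy
variant is reconstructed from the description above rather than taken from
the s390 source.

```c
#include <assert.h>

typedef unsigned long toy_pte_t;

#define TOY_PAGE_INVALID 0x400UL  /* hypothetical position of the invalid bit */

/* Assumed shape of the bug: an already invalid pte is left untouched. */
static toy_pte_t get_and_clear_buggy(toy_pte_t *ptep)
{
    toy_pte_t old = *ptep;

    if (!(old & TOY_PAGE_INVALID))
        *ptep = TOY_PAGE_INVALID;  /* "cleared" means empty and invalid here */
    return old;
}

/* Fixed variant: the entry is always reset, whatever its old state was. */
static toy_pte_t get_and_clear_fixed(toy_pte_t *ptep)
{
    toy_pte_t old = *ptep;

    *ptep = TOY_PAGE_INVALID;
    return old;
}
```

A swapped-out page is modelled as the invalid bit plus a swap entry in the
remaining bits; the buggy variant returns it but leaves the swap entry in
the page table.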

With this new backend in mind I'd like to suggest two small changes
for the common code.
1) move establish_pte to the architecture dependent folders. Our
implementation looks like this:

static inline void ptep_invalidate_and_flush(pte_t *ptep, unsigned long addr)
{
        if (!(pte_val(*ptep) & _PAGE_INVALID))
                __asm__ __volatile__ ("ipte %0,%1" : : "a" (ptep), "a" (addr));
}

static inline void establish_pte(struct vm_area_struct * vma,
                                 unsigned long address,
                                 pte_t *page_table, pte_t entry)
{
        ptep_invalidate_and_flush(page_table, address);
        set_pte(page_table, entry);
}

The question I face at the moment is: where is the right place in
include/asm for establish_pte. I added it to include/asm-generic/pgtable.h
and include/asm-i386/pgtable.h but I now face the problem that the
default implementation has a call to flush_tlb_page and that is defined
in pgalloc.h. I added an #include  but I fear that this could
cause compile errors. For details see establish_pte.diff.

2) add a check for EI_CLASS in the binfmt_elf loader. We will use the
same EM_S390 for 31 bit and 64 bit binaries. The distinction is done
by means of the EI_CLASS byte in the ELF header. See binfmt_elf.diff
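
A sketch of such a check, assuming nothing about the actual binfmt_elf.diff:
the helper name toy_elf_class_ok is hypothetical, and the numeric values are
the ones the ELF specification assigns to EI_CLASS, ELFCLASS32 and
ELFCLASS64. Any compatibility path for running 31 bit binaries on a 64 bit
kernel is ignored here.

```c
#include <assert.h>

/* Values from the ELF specification (named EI_CLASS etc. in <elf.h>). */
enum {
    TOY_EI_CLASS   = 4,  /* index of the class byte in e_ident[] */
    TOY_ELFCLASS32 = 1,
    TOY_ELFCLASS64 = 2
};

/*
 * Hypothetical helper: returns 1 if the class byte of the binary matches
 * the word size of the running kernel, 0 otherwise.
 */
static int toy_elf_class_ok(const unsigned char *e_ident, int kernel_is_64bit)
{
    unsigned char want = kernel_is_64bit ? TOY_ELFCLASS64 : TOY_ELFCLASS32;

    return e_ident[TOY_EI_CLASS] == want;
}
```

Because both flavours share EM_S390, the machine type alone cannot tell
them apart; the class byte is the only field that differs.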

blue skies,
   Martin

Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: [EMAIL PROTECTED]

(See attached file: establish_pte.diff)(See attached file: binfmt_elf.diff)



Re: Linux-2.4.0-test9-pre2

2000-09-19 Thread schwidefsky



>>  Linus,
>
>> Where do architecture maintainers stand when they don't submit their
>> problems to linux-kernel or the great Ted Bug List(tm)?
>
>Up against the wall so we can shoot them?
>
>:)

So I am one of the guys who will be shot ... I have wanted to do an update for
the s/390 architecture for weeks but there was always something more
important. I finally cut some hours out of my ribs and made a patch against
linux-2.4.0-test8. The diff for files in arch/s390, include/asm-s390 and
drivers/s390 is pretty big, about 1 MB. The diffs for non-s/390 files are
smaller, only 35 KB.
The question now is: do you want to have the patch or do we wait until 2.4.1?

blue skies,
   Martin

Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: [EMAIL PROTECTED]





[patch 3/3] arch_rebalance_pgtables call

2007-11-12 Thread schwidefsky
From: Martin Schwidefsky <[EMAIL PROTECTED]>

In order to change the layout of the page tables after an mmap has
crossed the address space limit of the current page table layout, an
architecture hook in get_unmapped_area is needed. The arguments
are the address of the new mapping and its length.

Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 mm/mmap.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/mmap.c
===
--- linux-2.6.orig/mm/mmap.c
+++ linux-2.6/mm/mmap.c
@@ -36,6 +36,10 @@
 #define arch_mmap_check(addr, len, flags)  (0)
 #endif
 
+#ifndef arch_rebalance_pgtables
+#define arch_rebalance_pgtables(addr, len) (addr)
+#endif
+
 static void unmap_region(struct mm_struct *mm,
struct vm_area_struct *vma, struct vm_area_struct *prev,
unsigned long start, unsigned long end);
@@ -1436,7 +1440,7 @@ get_unmapped_area(struct file *file, uns
if (addr & ~PAGE_MASK)
return -EINVAL;
 
-   return addr;
+   return arch_rebalance_pgtables(addr, len);
 }
 
 EXPORT_SYMBOL(get_unmapped_area);
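
The "#ifndef ... #define" default used by the patch can be tried outside the
kernel. The following is a sketch under stated assumptions:
toy_finish_get_unmapped_area and the TOY_ constants are hypothetical
stand-ins, and only the fallback macro itself is taken from the diff.

```c
#include <assert.h>

/*
 * An architecture that needs the hook would define arch_rebalance_pgtables
 * in its own headers before this point. Everybody else gets the default,
 * which hands the address back unchanged.
 */
#ifndef arch_rebalance_pgtables
#define arch_rebalance_pgtables(addr, len) (addr)
#endif

#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_MASK (~(TOY_PAGE_SIZE - 1))
#define TOY_EINVAL    22  /* stand-in for EINVAL */

/* Hypothetical tail of get_unmapped_area(): validate, then call the hook. */
static unsigned long toy_finish_get_unmapped_area(unsigned long addr,
                                                  unsigned long len)
{
    if (addr & ~TOY_PAGE_MASK)
        return (unsigned long)-TOY_EINVAL;
    return arch_rebalance_pgtables(addr, len);
}
```

With the default in place the hook costs nothing on architectures that do
not define it, since the macro expands to its first argument.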

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 0/3] page table changes

2007-11-12 Thread schwidefsky
Hi Andrew,
more than 2 weeks have passed since I posted my six page table patches
to linux-kernel/linux-arch. Nobody complained so far (keeping fingers
crossed..) and Ben has a use for the first two patches on powerpc as
well. Next logical step would be to add the patches that affect common
code to the -mm tree, no?
The three patches in this patchset do apply against your snapshot
broken-out-2007-11-06-02-32.tar.gz. Please add.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[patch 2/3] CONFIG_HIGHPTE vs. sub-page page tables.

2007-11-12 Thread schwidefsky
From: Martin Schwidefsky <[EMAIL PROTECTED]>

Background: I've implemented 1K/2K page tables for s390. These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM. The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table
entries (pgste). The pgstes are only required if the process is using
the SIE instruction. The pgstes are updated by the hardware and by the
hypervisor for a number of reasons, one of them is dirty and reference
bit tracking. To avoid wasting memory the standard pte table allocation
should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

Problem: Page size on s390 is 4K, page table size is 1K or 2K. That
means the s390 version for pte_alloc_one cannot return a pointer to
a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86
pte_alloc_one cannot return a pointer to a pte either, since that would
require more than 32 bit for the return value of pte_alloc_one (and the
pte * would not be accessible since it's not kmapped).

Solution: The only solution I found to this dilemma is a new typedef:
a pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced
with a later patch. For everybody else it will be a (struct page *).
The additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor
and a destructor pgtable_page_dtor. The page table allocation and free
functions need to call these two whenever a page table page is allocated
or freed. pmd_populate will get a pgtable_t instead of a struct page
pointer. To get the pgtable_t back from a pmd entry that has been
installed with pmd_populate a new function pmd_pgtable is added. It
replaces the pmd_page call in free_pte_range and apply_to_pte_range.
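
The pairing of the typedef with a constructor and destructor can be modelled
in user space. This is a sketch, not the patch: every toy_ name is
hypothetical, the page is a plain heap object, and the counter merely stands
in for the NR_PAGETABLE accounting mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_page {
    int is_pagetable;
};

/* For most architectures the handle would be a struct page pointer. */
typedef struct toy_page *toy_pgtable_t;

static long toy_nr_pagetable;  /* stands in for NR_PAGETABLE */

/* Called whenever a page starts to be used as a page table. */
static void toy_pgtable_page_ctor(struct toy_page *page)
{
    page->is_pagetable = 1;
    toy_nr_pagetable++;
}

/* Called whenever a page table page is about to be freed. */
static void toy_pgtable_page_dtor(struct toy_page *page)
{
    page->is_pagetable = 0;
    toy_nr_pagetable--;
}

static toy_pgtable_t toy_pte_alloc_one(void)
{
    struct toy_page *page = calloc(1, sizeof(*page));

    if (page)
        toy_pgtable_page_ctor(page);
    return page;
}

static void toy_pte_free(toy_pgtable_t pte)
{
    toy_pgtable_page_dtor(pte);
    free(pte);
}
```

Callers only ever see toy_pgtable_t, so an architecture could change what
the handle points to without touching them, as long as its allocation and
free functions keep calling the constructor and destructor.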

Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/frv/mm/pgalloc.c   |8 +---
 arch/powerpc/mm/pgtable_32.c|   14 --
 arch/ppc/mm/pgtable.c   |9 ++---
 arch/s390/mm/pgtable.c  |2 ++
 arch/sparc/mm/srmmu.c   |   10 +++---
 arch/sparc/mm/sun4c.c   |   14 ++
 arch/um/kernel/mem.c|4 +++-
 arch/x86/mm/pgtable_32.c|4 +++-
 include/asm-alpha/page.h|2 ++
 include/asm-alpha/pgalloc.h |   22 ++
 include/asm-arm/page.h  |2 ++
 include/asm-arm/pgalloc.h   |9 ++---
 include/asm-avr32/page.h|1 +
 include/asm-avr32/pgalloc.h |   16 
 include/asm-cris/page.h |1 +
 include/asm-cris/pgalloc.h  |   14 ++
 include/asm-frv/page.h  |1 +
 include/asm-frv/pgalloc.h   |   12 +---
 include/asm-ia64/page.h |2 ++
 include/asm-ia64/pgalloc.h  |   20 ++--
 include/asm-m32r/page.h |1 +
 include/asm-m32r/pgalloc.h  |   10 ++
 include/asm-m68k/motorola_pgalloc.h |   14 --
 include/asm-m68k/page.h |1 +
 include/asm-m68k/sun3_pgalloc.h |   17 -
 include/asm-mips/page.h |1 +
 include/asm-mips/pgalloc.h  |5 +++--
 include/asm-parisc/page.h   |1 +
 include/asm-parisc/pgalloc.h|   11 +--
 include/asm-powerpc/page.h  |2 ++
 include/asm-powerpc/pgalloc-32.h|6 --
 include/asm-powerpc/pgalloc-64.h|   26 +++---
 include/asm-ppc/pgalloc.h   |6 --
 include/asm-s390/page.h |2 ++
 include/asm-s390/pgalloc.h  |3 ++-
 include/asm-s390/tlb.h  |2 +-
 include/asm-sh/page.h   |2 ++
 include/asm-sh/pgalloc.h|   27 ---
 include/asm-sh64/page.h |2 ++
 include/asm-sh64/pgalloc.h  |   27 ---
 include/asm-sparc/page.h|2 ++
 include/asm-sparc/pgalloc.h |5 +++--
 include/asm-sparc64/page.h  |2 ++
 include/asm-sparc64/pgalloc.h   |   19 ++-
 include/asm-um/page.h   |2 ++
 include/asm-um/pgalloc.h|   12 +---
 include/asm-x86/page_32.h   |2 ++
 include/asm-x86/page_64.h   |2 ++
 include/asm-x86/pgalloc_32.h|7 +--
 include/asm-x86/pgalloc_64.h|   22 +-
 include/asm-xtensa/page.h   |1 +
 include/asm-xtensa/pgalloc.h|   17 -
 include/linux/mm.h  |   14 +-
 mm/memory.c |   32 +++-
 mm/vmalloc.c|2 +-
 55 files changed, 338 insertions(+), 136 deletions(-)

Index

[patch 1/3] add mm argument to pte/pmd/pud/pgd_free.

2007-11-12 Thread schwidefsky
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
From: Martin Schwidefsky <[EMAIL PROTECTED]>

The pgd/pud/pmd/pte page table allocation functions get a mm_struct
pointer as first argument. The free functions do not get the mm_struct
argument. This is 1) asymmetrical and 2) to do mm related page table
allocations the mm argument is needed on the free function as well.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/arm/kernel/smp.c   |2 +-
 arch/arm/mm/ioremap.c   |2 +-
 arch/arm/mm/pgd.c   |8 
 arch/frv/mm/pgalloc.c   |2 +-
 arch/powerpc/mm/pgtable_32.c|6 +++---
 arch/ppc/mm/pgtable.c   |6 +++---
 arch/um/kernel/mem.c|2 +-
 arch/um/kernel/skas/mmu.c   |8 
 arch/x86/mm/pgtable_32.c|2 +-
 include/asm-alpha/pgalloc.h |8 
 include/asm-alpha/tlb.h |4 ++--
 include/asm-arm/pgalloc.h   |   10 +-
 include/asm-arm/tlb.h   |4 ++--
 include/asm-avr32/pgalloc.h |6 +++---
 include/asm-cris/pgalloc.h  |6 +++---
 include/asm-frv/pgalloc.h   |8 
 include/asm-frv/pgtable.h   |2 +-
 include/asm-generic/4level-fixup.h  |2 +-
 include/asm-generic/pgtable-nopmd.h |2 +-
 include/asm-generic/pgtable-nopud.h |2 +-
 include/asm-ia64/pgalloc.h  |   16 
 include/asm-m32r/pgalloc.h  |   10 +-
 include/asm-m68k/motorola_pgalloc.h |   10 +-
 include/asm-m68k/sun3_pgalloc.h |8 
 include/asm-mips/pgalloc.h  |   12 ++--
 include/asm-parisc/pgalloc.h|   10 +-
 include/asm-parisc/tlb.h|4 ++--
 include/asm-powerpc/pgalloc-32.h|   10 +-
 include/asm-powerpc/pgalloc-64.h|   10 +-
 include/asm-ppc/pgalloc.h   |   10 +-
 include/asm-s390/pgalloc.h  |   14 +++---
 include/asm-s390/tlb.h  |8 
 include/asm-sh/pgalloc.h|8 
 include/asm-sh64/pgalloc.h  |   12 ++--
 include/asm-sparc/pgalloc.h |   12 ++--
 include/asm-sparc64/pgalloc.h   |8 
 include/asm-sparc64/tlb.h   |4 ++--
 include/asm-um/pgalloc.h|8 
 include/asm-x86/pgalloc_32.h|8 
 include/asm-x86/pgalloc_64.h|   10 +-
 include/asm-xtensa/pgalloc.h|6 +++---
 include/asm-xtensa/tlb.h|2 +-
 kernel/fork.c   |2 +-
 mm/memory.c |   10 +-
 44 files changed, 152 insertions(+), 152 deletions(-)

Index: linux-2.6/arch/arm/kernel/smp.c
===
--- linux-2.6.orig/arch/arm/kernel/smp.c
+++ linux-2.6/arch/arm/kernel/smp.c
@@ -150,7 +150,7 @@ int __cpuinit __cpu_up(unsigned int cpu)
secondary_data.pgdir = 0;
 
*pmd_offset(pgd, PHYS_OFFSET) = __pmd(0);
-   pgd_free(pgd);
+   pgd_free(&init_mm, pgd);
 
if (ret) {
printk(KERN_CRIT "CPU%u: processor failed to boot\n", cpu);
Index: linux-2.6/arch/arm/mm/ioremap.c
===
--- linux-2.6.orig/arch/arm/mm/ioremap.c
+++ linux-2.6/arch/arm/mm/ioremap.c
@@ -162,7 +162,7 @@ static void unmap_area_sections(unsigned
 * Free the page table, if there was one.
 */
if ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_TABLE)
-   pte_free_kernel(pmd_page_vaddr(pmd));
+   pte_free_kernel(&init_mm, pmd_page_vaddr(pmd));
}
 
addr += PGDIR_SIZE;
Index: linux-2.6/arch/arm/mm/pgd.c
===
--- linux-2.6.orig/arch/arm/mm/pgd.c
+++ linux-2.6/arch/arm/mm/pgd.c
@@ -65,14 +65,14 @@ pgd_t *get_pgd_slow(struct mm_struct *mm
return new_pgd;
 
 no_pte:
-   pmd_free(new_pmd);
+   pmd_free(mm, new_pmd);
 no_pmd:
free_pages((unsigned long)new_pgd, 2);
 no_pgd:
return NULL;
 }
 
-void free_pgd_slow(pgd_t *pgd)
+void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd)
 {
pmd_t *pmd;
struct page *pte;
@@ -94,8 +94,8 @@ void free_pgd_slow(pgd_t *pgd)
pmd_clear(pmd);
dec_zone_page_state(virt_to_page((unsigned long *)pgd), NR_PAGETABLE);
pte_lock_deinit(pte);
-   pte_free(pte);
-   pmd_free(pmd);
+   pte_free(mm, pte);
+   pmd_free(mm, pmd);
 free:
free_pages((unsigned long) pgd, 2);
 }
Index: linux-2.6/arch/frv/mm/pgalloc.c
===
--- linux-2.6.orig/a

[GIT PULL] s390 patches for the 3.9-rc4

2013-03-18 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
A couple of bug fixes, the most hairy one is the flush_tlb_kernel_range
fix. Another case of "how could this ever have worked?".

Heiko Carstens (3):
  s390/mm: fix vmemmap size calculation
  s390/mm: fix flush_tlb_kernel_range()
  drivers/i2c: remove !S390 dependency, add missing GENERIC_HARDIRQS dependencies

Martin Schwidefsky (1):
  s390: critical section cleanup vs. machine checks

Michael Holzheu (1):
  s390/kdump: Do not add standby memory for kdump

Sebastian Ott (4):
  s390/scm_blk: fix request number accounting
  s390/scm_drv: extend notify callback
  s390/scm_blk: suspend writes
  s390/scm: process availability

 arch/s390/include/asm/eadm.h |6 +++-
 arch/s390/include/asm/tlbflush.h |2 --
 arch/s390/kernel/entry.S |3 +-
 arch/s390/kernel/entry64.S   |5 +--
 arch/s390/kernel/setup.c |2 ++
 drivers/i2c/Kconfig  |2 +-
 drivers/i2c/busses/Kconfig   |6 ++--
 drivers/s390/block/scm_blk.c |   69 ++
 drivers/s390/block/scm_blk.h |2 ++
 drivers/s390/block/scm_drv.c |   23 +
 drivers/s390/char/sclp_cmd.c |2 ++
 drivers/s390/cio/chsc.c  |   17 ++
 drivers/s390/cio/chsc.h  |2 ++
 drivers/s390/cio/scm.c   |   18 +-
 14 files changed, 136 insertions(+), 23 deletions(-)

diff --git a/arch/s390/include/asm/eadm.h b/arch/s390/include/asm/eadm.h
index 8d48471..dc9200c 100644
--- a/arch/s390/include/asm/eadm.h
+++ b/arch/s390/include/asm/eadm.h
@@ -34,6 +34,8 @@ struct arsb {
u32 reserved[4];
 } __packed;
 
+#define EQC_WR_PROHIBIT 22
+
 struct msb {
u8 fmt:4;
u8 oc:4;
@@ -96,11 +98,13 @@ struct scm_device {
 #define OP_STATE_TEMP_ERR  2
 #define OP_STATE_PERM_ERR  3
 
+enum scm_event {SCM_CHANGE, SCM_AVAIL};
+
 struct scm_driver {
struct device_driver drv;
int (*probe) (struct scm_device *scmdev);
int (*remove) (struct scm_device *scmdev);
-   void (*notify) (struct scm_device *scmdev);
+   void (*notify) (struct scm_device *scmdev, enum scm_event event);
void (*handler) (struct scm_device *scmdev, void *data, int error);
 };
 
diff --git a/arch/s390/include/asm/tlbflush.h b/arch/s390/include/asm/tlbflush.h
index 1d8fe2b..6b32af3 100644
--- a/arch/s390/include/asm/tlbflush.h
+++ b/arch/s390/include/asm/tlbflush.h
@@ -74,8 +74,6 @@ static inline void __tlb_flush_idte(unsigned long asce)
 
 static inline void __tlb_flush_mm(struct mm_struct * mm)
 {
-   if (unlikely(cpumask_empty(mm_cpumask(mm))))
-   return;
/*
 * If the machine has IDTE we prefer to do a per mm flush
 * on all cpus instead of doing a local flush if the mm
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 5502285..94feff7 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -636,7 +636,8 @@ ENTRY(mcck_int_handler)
UPDATE_VTIME %r14,%r15,__LC_MCCK_ENTER_TIMER
 mcck_skip:
SWITCH_ASYNC __LC_GPREGS_SAVE_AREA+32,__LC_PANIC_STACK,PAGE_SHIFT
-   mvc __PT_R0(64,%r11),__LC_GPREGS_SAVE_AREA
+   stm %r0,%r7,__PT_R0(%r11)
+   mvc __PT_R8(32,%r11),__LC_GPREGS_SAVE_AREA+32
stm %r8,%r9,__PT_PSW(%r11)
xc  __SF_BACKCHAIN(4,%r15),__SF_BACKCHAIN(%r15)
l   %r1,BASED(.Ldo_machine_check)
diff --git a/arch/s390/kernel/entry64.S b/arch/s390/kernel/entry64.S
index 9c837c1..2e6d60c 100644
--- a/arch/s390/kernel/entry64.S
+++ b/arch/s390/kernel/entry64.S
@@ -678,8 +678,9 @@ ENTRY(mcck_int_handler)
UPDATE_VTIME %r14,__LC_MCCK_ENTER_TIMER
LAST_BREAK %r14
 mcck_skip:
-   lghi %r14,__LC_GPREGS_SAVE_AREA
-   mvc __PT_R0(128,%r11),0(%r14)
+   lghi %r14,__LC_GPREGS_SAVE_AREA+64
+   stmg %r0,%r7,__PT_R0(%r11)
+   mvc __PT_R8(64,%r11),0(%r14)
stmg %r8,%r9,__PT_PSW(%r11)
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
lgr %r2,%r11 # pass pointer to pt_regs
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index a5360de..2926885 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -571,6 +571,8 @@ static void __init setup_memory_end(void)
 
/* Split remaining virtual space between 1:1 mapping & vmemmap array */
tmp = VMALLOC_START / (PAGE_SIZE + sizeof(struct page));
+   /* vmemmap contains a multiple of PAGES_PER_SECTION struct pages */
+   tmp = SECTION_ALIGN_UP(tmp);
tmp = VMALLOC_START - tmp * sizeof(struct page);
tmp &= ~((vmax >> 11) - 1); /* align to page table level */
tmp = min(tmp, 1UL << MAX_PHYSMEM_BITS);
diff --git a/drivers/i2c/Kconfig b/dr

[GIT PULL] s390 regression patch for 3.8-rc8

2013-02-11 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
The recent fix for the s390 sched_clock() function uncovered yet
another bug in s390_next_ktime which causes an endless loop in KVM.
This regression should be fixed before v3.8.

I keep the fingers crossed that this is the last one for v3.8.

Heiko Carstens (1):
  s390/timer: avoid overflow when programming clock comparator

 arch/s390/kernel/time.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index a5f4f5a..0aa98db 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -120,6 +120,9 @@ static int s390_next_ktime(ktime_t expires,
nsecs = ktime_to_ns(ktime_add(timespec_to_ktime(ts), expires));
do_div(nsecs, 125);
S390_lowcore.clock_comparator = sched_clock_base_cc + (nsecs << 9);
+   /* Program the maximum value if we have an overflow (== year 2042) */
+   if (unlikely(S390_lowcore.clock_comparator < sched_clock_base_cc))
+   S390_lowcore.clock_comparator = -1ULL;
set_clock_comparator(S390_lowcore.clock_comparator);
return 0;
 }
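
The arithmetic behind the fix can be checked in isolation. This is a sketch
with hypothetical helper names (toy_ns_to_tod, toy_clock_comparator); it
assumes the TOD clock format in which bit 51 ticks once per microsecond, so
one microsecond equals 4096 clock units and nanoseconds convert with a
factor of 512/125.

```c
#include <assert.h>
#include <stdint.h>

/* ns -> TOD units: divide by 125, then multiply by 512 (shift by 9). */
static uint64_t toy_ns_to_tod(uint64_t nsecs)
{
    return (nsecs / 125) << 9;
}

/* Hypothetical helper mirroring the overflow check from the patch. */
static uint64_t toy_clock_comparator(uint64_t base_cc, uint64_t nsecs)
{
    uint64_t cc = base_cc + toy_ns_to_tod(nsecs);

    if (cc < base_cc)      /* the unsigned addition wrapped around */
        cc = UINT64_MAX;   /* program the maximum value instead */
    return cc;
}
```

Unsigned addition wraps modulo 2^64, so a sum smaller than the base is a
reliable sign of overflow. Note that this check only covers the addition;
a wrap inside the shift itself is not detected by it.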



[GIT PULL] s390 patches for the 3.9-rc6

2013-04-03 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates: Just a bunch of bugfixes.

Heiko Carstens (4):
  drivers/Kconfig: add several missing GENERIC_HARDIRQS dependencies
  s390/uaccess: fix clear_user_pt()
  s390/uaccess: fix page table walk
  s390/mm: provide emtpy check_pgt_cache() function

Martin Schwidefsky (1):
  s390/3270: fix minor_start issue

Sebastian Ott (1):
  s390/scm_block: fix printk format string

Wei Yongjun (1):
  s390/scm_blk: fix error return code in scm_blk_init()

 arch/s390/include/asm/pgtable.h |4 +-
 arch/s390/lib/uaccess_pt.c  |   83 ++-
 drivers/dma/Kconfig |1 +
 drivers/media/platform/Kconfig  |2 +-
 drivers/s390/block/scm_blk.c|   11 --
 drivers/s390/block/scm_drv.c|2 +-
 drivers/s390/char/tty3270.c |   16 
 drivers/spi/Kconfig |3 +-
 8 files changed, 79 insertions(+), 43 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 4a29308..4a54431 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -344,6 +344,7 @@ extern unsigned long MODULES_END;
 #define _REGION3_ENTRY_CO  0x100   /* change-recording override*/
 
 /* Bits in the segment table entry */
+#define _SEGMENT_ENTRY_ORIGIN_LARGE ~0xfUL /* large page address   */
 #define _SEGMENT_ENTRY_ORIGIN  ~0x7ffUL/* segment table origin */
 #define _SEGMENT_ENTRY_RO  0x200   /* page protection bit  */
 #define _SEGMENT_ENTRY_INV 0x20/* invalid segment table entry  */
@@ -1531,7 +1532,8 @@ extern int s390_enable_sie(void);
 /*
  * No page table caches to initialise
  */
-#define pgtable_cache_init()   do { } while (0)
+static inline void pgtable_cache_init(void) { }
+static inline void check_pgt_cache(void) { }
 
 #include 
 
diff --git a/arch/s390/lib/uaccess_pt.c b/arch/s390/lib/uaccess_pt.c
index dff631d..466fb33 100644
--- a/arch/s390/lib/uaccess_pt.c
+++ b/arch/s390/lib/uaccess_pt.c
@@ -77,42 +77,69 @@ static size_t copy_in_kernel(size_t count, void __user *to,
  * >= -4095 (IS_ERR_VALUE(x) returns true), a fault has occured and the address
  * contains the (negative) exception code.
  */
-static __always_inline unsigned long follow_table(struct mm_struct *mm,
- unsigned long addr, int write)
+#ifdef CONFIG_64BIT
+static unsigned long follow_table(struct mm_struct *mm,
+ unsigned long address, int write)
 {
-   pgd_t *pgd;
-   pud_t *pud;
-   pmd_t *pmd;
-   pte_t *ptep;
+   unsigned long *table = (unsigned long *)__pa(mm->pgd);
+
+   switch (mm->context.asce_bits & _ASCE_TYPE_MASK) {
+   case _ASCE_TYPE_REGION1:
+   table = table + ((address >> 53) & 0x7ff);
+   if (unlikely(*table & _REGION_ENTRY_INV))
+   return -0x39UL;
+   table = (unsigned long *)(*table & _REGION_ENTRY_ORIGIN);
+   case _ASCE_TYPE_REGION2:
+   table = table + ((address >> 42) & 0x7ff);
+   if (unlikely(*table & _REGION_ENTRY_INV))
+   return -0x3aUL;
+   table = (unsigned long *)(*table & _REGION_ENTRY_ORIGIN);
+   case _ASCE_TYPE_REGION3:
+   table = table + ((address >> 31) & 0x7ff);
+   if (unlikely(*table & _REGION_ENTRY_INV))
+   return -0x3bUL;
+   table = (unsigned long *)(*table & _REGION_ENTRY_ORIGIN);
+   case _ASCE_TYPE_SEGMENT:
+   table = table + ((address >> 20) & 0x7ff);
+   if (unlikely(*table & _SEGMENT_ENTRY_INV))
+   return -0x10UL;
+   if (unlikely(*table & _SEGMENT_ENTRY_LARGE)) {
+   if (write && (*table & _SEGMENT_ENTRY_RO))
+   return -0x04UL;
+   return (*table & _SEGMENT_ENTRY_ORIGIN_LARGE) +
+   (address & ~_SEGMENT_ENTRY_ORIGIN_LARGE);
+   }
+   table = (unsigned long *)(*table & _SEGMENT_ENTRY_ORIGIN);
+   }
+   table = table + ((address >> 12) & 0xff);
+   if (unlikely(*table & _PAGE_INVALID))
+   return -0x11UL;
+   if (write && (*table & _PAGE_RO))
+   return -0x04UL;
+   return (*table & PAGE_MASK) + (address & ~PAGE_MASK);
+}
 
-   pgd = pgd_offset(mm, addr);
-   if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
-   return -0x3aUL;
+#else /* CONFIG_64BIT */
 
-   pud = pud_offset(pgd, addr);
-   if (pud_none(*pud) || unlikely(pud_bad(*pud)))
-   return 
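
The shifts and masks used by the new follow_table() split a 64 bit virtual
address into per-level table indices. A sketch of that decomposition, using
only the constants visible in the diff; struct toy_va_index and
toy_split_address are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Index into each translation table level plus the byte offset. */
struct toy_va_index {
    unsigned int region1;  /* bits used with _ASCE_TYPE_REGION1 */
    unsigned int region2;
    unsigned int region3;
    unsigned int segment;
    unsigned int page;
    unsigned int offset;   /* byte within the 4K page */
};

static struct toy_va_index toy_split_address(uint64_t address)
{
    struct toy_va_index ix;

    ix.region1 = (address >> 53) & 0x7ff;  /* 11 bits */
    ix.region2 = (address >> 42) & 0x7ff;  /* 11 bits */
    ix.region3 = (address >> 31) & 0x7ff;  /* 11 bits */
    ix.segment = (address >> 20) & 0x7ff;  /* 11 bits */
    ix.page    = (address >> 12) & 0xff;   /*  8 bits */
    ix.offset  = address & 0xfff;          /* 12 bits */
    return ix;
}
```

The field widths add up to 11 + 11 + 11 + 11 + 8 + 12 = 64 bits, so every
bit of the address is consumed by exactly one level.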

Re: [PATCH 3/7] dump_stack: consolidate dump_stack() implementations and unify their behaviors

2013-04-04 Thread Martin Schwidefsky
On Wed,  3 Apr 2013 12:14:53 -0700
Tejun Heo  wrote:

> v2: CPU number added to the generic debug info as requested by s390
> folks and dropped the s390 specific dump_stack().  This loses %ksp
> from the debug message which the maintainers think isn't important
> enough to keep the s390-specific dump_stack() implementation.
> 
> dump_stack_print_info() is moved to kernel/printk.c from
> lib/dump_stack.c.  Because linkage is per object file,
> dump_stack_print_info() living in the same lib file as generic
> dump_stack() means that archs which implement custom dump_stack()
> - at this point, only blackfin - can't use dump_stack_print_info()
> as that will bring in the generic version of dump_stack() too.
> The v1 patch broke build on blackfin due to this issue.  The build
> breakage was reported by Fengguang Wu.

For the s390 changes:
Acked-by: Martin Schwidefsky 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] one s390 patch for 3.8-rc6

2013-01-29 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following update:
Another transparent huge page fix, we need to define a s390 variant
for pmdp_set_wrprotect to flush the TLB for the huge page correctly.

Gerald Schaefer (1):
  s390/thp: implement pmdp_set_wrprotect()

 arch/s390/include/asm/pgtable.h |   12 
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index c1d7930..098adbb 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1365,6 +1365,18 @@ static inline void pmdp_invalidate(struct vm_area_struct *vma,
__pmd_idte(address, pmdp);
 }
 
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT
+static inline void pmdp_set_wrprotect(struct mm_struct *mm,
+ unsigned long address, pmd_t *pmdp)
+{
+   pmd_t pmd = *pmdp;
+
+   if (pmd_write(pmd)) {
+   __pmd_idte(address, pmdp);
+   set_pmd_at(mm, address, pmdp, pmd_wrprotect(pmd));
+   }
+}
+
 static inline pmd_t mk_pmd_phys(unsigned long physpage, pgprot_t pgprot)
 {
pmd_t __pmd;



Re: [GIT PULL] KVM updates for the 3.9 merge window

2013-02-25 Thread Martin Schwidefsky
On Sun, 24 Feb 2013 16:05:31 -0800
Linus Torvalds  wrote:

> On Wed, Feb 20, 2013 at 5:17 PM, Marcelo Tosatti  wrote:
> >
> > Please pull from
> >
> > git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/kvm-3.9-1
> >
> > to receive the KVM updates for the 3.9 merge window [..]
> 
> Ok, particularly the s390 people should check my resolution of the
> conflicts, since they include the renaming of IOINT_VIR to IRQIO_VIR.
> But the uapi header file move should be double-checked by people who
> use this too.

The IRQIO_VIR is now at the end of the IRQIO_xxx entries while we had
it between the IRQIO_CSC and the IRQIO_PCI entry. No worry, we just
change our code and use the upstream version. Just completed the
double-check and pushed my branches to the linux-s390 tree. The code
is in sync.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for the 3.9 merge window

2013-02-21 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
The most prominent change in this patch set is the software dirty bit
patch for s390. It removes __HAVE_ARCH_PAGE_TEST_AND_CLEAR_DIRTY and
the page_test_and_clear_dirty primitive which makes the common memory
management code a bit less obscure. Heiko fixed most of the PCI related
fallout, more often than not missing GENERIC_HARDIRQS dependencies.
Notable is one of the 3270 patches which adds an export to tty_io to
be able to resize a tty. The rest is the usual bunch of cleanups and
bug fixes.

There is a merge conflict in arch/s390/Kconfig between the current
upstream and the s390 branch. The cause is Heiko's Kconfig sorting
vs the removal of HAVE_IRQ_WORK. The correct merge is the sorted list
without the HAVE_IRQ_WORK select.

Heiko Carstens (18):
  asm-generic/io.h: convert readX defines to functions
  s390/time: rename tod clock access functions
  s390/barrier: convert mb() to define again
  s390/dma: provide dma_cache_sync() function
  s390/dma: remove dma_is_consistent() declaration
  s390/pci: rename pci_probe to s390_pci_probe
  ata: disable ATA for s390
  parport: disable PC-style parallel port support for s390
  s390/mm: provide PAGE_SHARED define
  uio: remove !S390 dependency from Kconfig
  phylib: remove !S390 dependeny from Kconfig
  s390/Kconfig: sort list of arch selected config options
  drivers/net,AT91RM9200: add missing GENERIC_HARDIRQS dependency
  s390/bpf,jit: add vlan tag support
  drivers/media: add missing GENERIC_HARDIRQS dependency
  s390/linker skript: discard exit.data at runtime
  drivers/input: add couple of missing GENERIC_HARDIRQS dependencies
  drivers/gpio: add missing GENERIC_HARDIRQ dependency

Hendrik Brueckner (5):
  s390/perf: cpum_cf: fallback to software sampling events
  iucv: fix kernel panic at reboot
  s390/mm: Fix crst upgrade of mmap with MAP_FIXED
  s390/cleanup: rename SPP to LPP
  s390/module: Add missing R_390_NONE relocation type

Ingo Tuchscherer (1):
  maintainer for s390 zcrypt component changed

Martin Schwidefsky (6):
  s390/3270: readd tty3270_open
  s390/3270: fix initialization order in tty3270_alloc_view
  s390/3270: introduce device notifier
  s390/3270: asynchronous size sensing
  s390/modules: add relocation overflow checking
  s390/mm: implement software dirty bits

Michael Holzheu (2):
  s390/ipl: Implement diag308 loop for zfcpdump
  s390/zcore: Add hsa file

Sebastian Ott (9):
  s390/chsc: cleanup SEI helper functions
  s390/cio: dont abort verification after missing irq
  s390/cio: skip broken paths
  s390/cio: export vpm via sysfs
  s390/cio: handle unknown pgroup state
  s390/scm: use inline dummy functions
  s390/pci: cleanup clp inline assembly
  s390/pci: cleanup clp page allocation
  s390/pci: fix hotplug module init

Stefan Weinhuber (1):
  dasd: fix sysfs cleanup in dasd_generic_remove

 MAINTAINERS   |2 +-
 arch/s390/Kconfig |  115 ---
 arch/s390/appldata/appldata_mem.c |2 +-
 arch/s390/appldata/appldata_net_sum.c |2 +-
 arch/s390/appldata/appldata_os.c  |2 +-
 arch/s390/hypfs/hypfs_vm.c|2 +-
 arch/s390/include/asm/barrier.h   |9 +-
 arch/s390/include/asm/clp.h   |2 +-
 arch/s390/include/asm/cpu_mf.h|4 +-
 arch/s390/include/asm/dma-mapping.h   |8 +-
 arch/s390/include/asm/mman.h  |4 +-
 arch/s390/include/asm/page.h  |   22 --
 arch/s390/include/asm/pci.h   |   11 +-
 arch/s390/include/asm/pgtable.h   |  132 ---
 arch/s390/include/asm/sclp.h  |1 -
 arch/s390/include/asm/setup.h |   22 +-
 arch/s390/include/asm/timex.h |   18 +-
 arch/s390/kernel/debug.c  |2 +-
 arch/s390/kernel/dis.c|1 -
 arch/s390/kernel/early.c  |8 +-
 arch/s390/kernel/entry64.S|   10 +-
 arch/s390/kernel/ipl.c|   16 +-
 arch/s390/kernel/module.c |  143 +---
 arch/s390/kernel/nmi.c|2 +-
 arch/s390/kernel/perf_cpum_cf.c   |   13 +-
 arch/s390/kernel/smp.c|   10 +-
 arch/s390/kernel/time.c   |   26 +-
 arch/s390/kernel/vmlinux.lds.S|4 +
 arch/s390/kernel/vtime.c  |2 +-
 arch/s390/kvm/interrupt.c |6 +-
 arch/s390/kvm/kvm-s390.c  |2 +-
 arch/s390/lib/delay.c |   16 +-
 arch/s390/lib/uaccess_pt.c|2 +-
 arch/s390/mm/mmap.c   |9 +-
 arch/s390/mm/pageattr.c   |2 +-
 arch/s390/mm/vmem.c   |   24 +-
 arch/s390/net/bpf_jit_comp.c  |   21 ++
 arc

Re: [RFC v2 PATCH 0/7] thp: transparent hugepages on s390

2012-08-31 Thread Martin Schwidefsky
On Thu, 30 Aug 2012 12:54:44 -0700
Andrew Morton  wrote:

> On Wed, 29 Aug 2012 17:32:57 +0200
> Gerald Schaefer  wrote:
> 
> > This patch series adds support for transparent hugepages on System z.
> > Small changes to common code are necessary with regard to a different
> > pgtable_t, tlb flushing and kvm behaviour on s390, see patches 1 to 3.
> 
> "RFC" always worries me.  I read it as "Really Flakey Code" ;) Is it
> still appropriate to this patchset?

The code quality is IMHO already good. We do change common mm code though
and we want to get some feedback on that.

> I grabbed them all.  Patches 1-3 look sane to me and I cheerfully
> didn't read the s390 changes at all.  Hopefully Andrea will be able to
> review at least patches 1-3 for us.
> 
> If that all goes well, how do we play this?  I'd prefer to merge 1-3
> myself, as they do interact with ongoing MM development.  I can also
> merge 4-7 if appropriate s390 maintainer acks are seen.  Or I can drop
> them and the s390 parts can be merged via the s390 tree at a later
> date?

I would really appreciate it if Andrea could have a look at the code. I've
read the patches and I am fine with them, but it is very easy to miss some
important bit.

As far as upstreaming is concerned: I can deal with the pure s390 parts
via the s390 tree if that helps you. If you prefer to carry all of them,
that is fine with me as well.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v2 PATCH 1/7] thp: remove assumptions on pgtable_t type

2012-08-31 Thread Martin Schwidefsky
On Fri, 31 Aug 2012 10:59:38 +0530
"Aneesh Kumar K.V"  wrote:

> Gerald Schaefer  writes:
> 
> > The thp page table pre-allocation code currently assumes that pgtable_t
> > is of type "struct page *". This may not be true for all architectures,
> > so this patch removes that assumption by replacing the functions
> > prepare_pmd_huge_pte() and get_pmd_huge_pte() with two new functions
> > that can be defined architecture-specific.
> >
> > It also removes two VM_BUG_ON checks for page_count() and page_mapcount()
> > operating on a pgtable_t. Apart from the VM_BUG_ON removal, there will
> > be no functional change introduced by this patch.
> 
> Why is that VM_BUG_ON not needed any more ? What is that changed which break
> that requirement ?

Because pgtable_t for s390 is not a page and there simply is no page_count or
page_mapcount.
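
To illustrate the point, here is a minimal user-space sketch of pre-allocation code that treats the page table handle as opaque. All names (demo_pte_t, demo_pgtable_t, demo_deposit_pgtable, demo_withdraw_pgtable) are hypothetical and are not the functions from the patch; the assumption is only that the handle is a pointer to a page table rather than a "struct page *", so nothing may ask it for a page count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
typedef struct { unsigned long val; } demo_pte_t;

/* Assumption: the handle points at a page table, not at a struct page. */
typedef demo_pte_t *demo_pgtable_t;

#define DEMO_DEPOSIT_MAX 4

struct demo_deposit {
	demo_pgtable_t slot[DEMO_DEPOSIT_MAX];
	size_t used;
};

/* Store a pre-allocated page table; returns 0 on success, -1 if full. */
static int demo_deposit_pgtable(struct demo_deposit *d, demo_pgtable_t pgtable)
{
	if (d->used == DEMO_DEPOSIT_MAX)
		return -1;
	d->slot[d->used++] = pgtable;
	return 0;
}

/* Hand back the most recently stored page table, or NULL if none is left. */
static demo_pgtable_t demo_withdraw_pgtable(struct demo_deposit *d)
{
	return d->used ? d->slot[--d->used] : NULL;
}
```

The two helpers only store and return the handle. They never look inside it, which is why checks built on page_count() or page_mapcount() have no place in code of this shape.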

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for the 3.6-rc4

2012-08-31 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive a couple of s390 bug fixes for 3.6-rc4:

Geert Uytterhoeven (1):
  s390: Always use "long" for ssize_t to match size_t

Heiko Carstens (3):
  s390/dasd: fix ioctl return value
  s390/smp: add missing smp_store_status() for !SMP
  s390/32: Don't clobber personality flags on exec

 arch/s390/include/asm/elf.h |3 ++-
 arch/s390/include/asm/posix_types.h |3 +--
 arch/s390/include/asm/smp.h |1 +
 drivers/s390/block/dasd_eckd.c  |2 +-
 drivers/s390/block/dasd_ioctl.c |7 ++-
 5 files changed, 7 insertions(+), 9 deletions(-)
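
As a rough illustration of the personality fix in this pull (SET_PERSONALITY now uses PER_LINUX | (current->personality & ~PER_MASK) instead of plain PER_LINUX), the following user-space sketch compares the two behaviours. The DEMO_ constants mirror the values in the kernel's personality header as far as I know, and the two helper functions are hypothetical names for this example only.

```c
#include <assert.h>

/* Local copies of the personality constants, for illustration only. */
#define DEMO_PER_LINUX          0x0000u
#define DEMO_PER_MASK           0x00ffu     /* low byte: execution domain */
#define DEMO_ADDR_NO_RANDOMIZE  0x0040000u  /* one of the flag bits */

/* Old behaviour: every flag bit is reset on exec. */
static unsigned int demo_set_personality_old(unsigned int current_personality)
{
	(void)current_personality;
	return DEMO_PER_LINUX;
}

/* New behaviour: only the execution-domain byte is replaced. */
static unsigned int demo_set_personality_new(unsigned int current_personality)
{
	return DEMO_PER_LINUX | (current_personality & ~DEMO_PER_MASK);
}
```

With a task that has DEMO_ADDR_NO_RANDOMIZE set, the old helper returns 0 while the new one keeps the flag and only resets the low byte.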

diff --git a/arch/s390/include/asm/elf.h b/arch/s390/include/asm/elf.h
index 32e8449..9b94a16 100644
--- a/arch/s390/include/asm/elf.h
+++ b/arch/s390/include/asm/elf.h
@@ -180,7 +180,8 @@ extern char elf_platform[];
 #define ELF_PLATFORM (elf_platform)
 
 #ifndef CONFIG_64BIT
-#define SET_PERSONALITY(ex) set_personality(PER_LINUX)
+#define SET_PERSONALITY(ex) \
+   set_personality(PER_LINUX | (current->personality & (~PER_MASK)))
 #else /* CONFIG_64BIT */
 #define SET_PERSONALITY(ex)\
 do {   \
diff --git a/arch/s390/include/asm/posix_types.h b/arch/s390/include/asm/posix_types.h
index 7bcc14e..bf2a2ad 100644
--- a/arch/s390/include/asm/posix_types.h
+++ b/arch/s390/include/asm/posix_types.h
@@ -13,6 +13,7 @@
  */
 
 typedef unsigned long   __kernel_size_t;
+typedef long__kernel_ssize_t;
 #define __kernel_size_t __kernel_size_t
 
 typedef unsigned short __kernel_old_dev_t;
@@ -25,7 +26,6 @@ typedef unsigned short  __kernel_mode_t;
 typedef unsigned short  __kernel_ipc_pid_t;
 typedef unsigned short  __kernel_uid_t;
 typedef unsigned short  __kernel_gid_t;
-typedef int __kernel_ssize_t;
 typedef int __kernel_ptrdiff_t;
 
 #else /* __s390x__ */
@@ -35,7 +35,6 @@ typedef unsigned int__kernel_mode_t;
 typedef int __kernel_ipc_pid_t;
 typedef unsigned int__kernel_uid_t;
 typedef unsigned int__kernel_gid_t;
-typedef long__kernel_ssize_t;
 typedef long__kernel_ptrdiff_t;
 typedef unsigned long   __kernel_sigset_t;  /* at least 32 bits */
 
diff --git a/arch/s390/include/asm/smp.h b/arch/s390/include/asm/smp.h
index a0a8340..ce26ac3 100644
--- a/arch/s390/include/asm/smp.h
+++ b/arch/s390/include/asm/smp.h
@@ -44,6 +44,7 @@ static inline void smp_call_online_cpu(void (*func)(void *), void *data)
 }
 
 static inline int smp_find_processor_id(int address) { return 0; }
+static inline int smp_store_status(int cpu) { return 0; }
 static inline int smp_vcpu_scheduled(int cpu) { return 1; }
 static inline void smp_yield_cpu(int cpu) { }
 static inline void smp_yield(void) { }
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 40a826a..2fb2b9e 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -3804,7 +3804,7 @@ dasd_eckd_ioctl(struct dasd_block *block, unsigned int cmd, void __user *argp)
case BIODASDSYMMIO:
return dasd_symm_io(device, argp);
default:
-   return -ENOIOCTLCMD;
+   return -ENOTTY;
}
 }
 
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index cceae70..654c692 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -498,12 +498,9 @@ int dasd_ioctl(struct block_device *bdev, fmode_t mode,
break;
default:
/* if the discipline has an ioctl method try it. */
-   if (base->discipline->ioctl) {
+   rc = -ENOTTY;
+   if (base->discipline->ioctl)
rc = base->discipline->ioctl(block, cmd, argp);
-   if (rc == -ENOIOCTLCMD)
-   rc = -EINVAL;
-   } else
-   rc = -EINVAL;
}
dasd_put_device(base);
return rc;
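
The control flow of the dasd ioctl change can be sketched in user space as follows. This is only an approximation of the default branch after the patch: the result starts out as -ENOTTY and is replaced by whatever the discipline's handler returns, if there is one. The function names and the command number 1 are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical type for a discipline-specific ioctl handler. */
typedef int (*demo_discipline_ioctl_fn)(unsigned int cmd);

/* Example handler: knows command 1, reports -ENOTTY for everything else. */
static int demo_discipline_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case 1:
		return 0;
	default:
		return -ENOTTY;
	}
}

/* Default branch of the generic ioctl path, in the shape used after the patch. */
static int demo_dasd_ioctl_default(demo_discipline_ioctl_fn handler,
				   unsigned int cmd)
{
	int rc = -ENOTTY;

	if (handler)
		rc = handler(cmd);
	return rc;
}
```

An unknown command now yields -ENOTTY whether or not a discipline handler exists, where the old code converted both cases to -EINVAL.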



Re: [PATCH 6/6] s390: Use generic percpu linux-2.6.git

2008-01-31 Thread Martin Schwidefsky
On Wed, 2008-01-30 at 22:53 +0100, Ingo Molnar wrote:
> * [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
> > Change s390 percpu.h to use asm-generic/percpu.h
> 
> do the s390 maintainer agree with this change (Acks please), and has it 
> been tested on s390?

Now I'm confused. The patch was acked a few weeks ago and the last
5+ versions of the patch had the acked line. The latest version dropped
it for a reason I don't know. What is more, the patch is already upstream
with the (correct) acked line, see git commit
f034347470e486835ccdcd7a5bb2ceb417be11c4.
So, what is the problem?

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




Re: x86 arch updates also broke s390

2008-01-31 Thread Martin Schwidefsky
On Thu, 2008-01-31 at 02:33 +0200, Adrian Bunk wrote:
> <--  snip  -->
> 
> ...
>   CC  arch/s390/kernel/asm-offsets.s
> In file included from 
> /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/s390/kernel/asm-offsets.c:7:
> /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/sched.h: In
> function 'spin_needbreak':
> /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/sched.h:1931:
> error: implicit declaration of function '__raw_spin_is_contended'
> make[2]: *** [arch/s390/kernel/asm-offsets.s] Error 1
> 
> <--  snip  -->

Defining GENERIC_LOCKBREAK in arch/s390/Kconfig takes care of it. I'll
cook up a patch and queue it in git390.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




Re: x86 arch updates also broke s390

2008-02-01 Thread Martin Schwidefsky
On Fri, 2008-02-01 at 10:48 +0100, Ingo Molnar wrote:
> > Defining GENERIC_LOCKBREAK in arch/s390/Kconfig takes care of it. I'll
> > cook up a patch and queue it in git390.
> 
> the one below should do the trick.

Thanks but I already queued a different one (see below). The other
architectures that define GENERIC_LOCKBREAK have the "depends on SMP &&
PREEMPT" line as well. The line does make sense if you look at how
spin_is_contended is used, no?

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.

---
Subject: [PATCH] Define GENERIC_LOCKBREAK.

From: Martin Schwidefsky <[EMAIL PROTECTED]>

Fix compile error:

  CC  arch/s390/kernel/asm-offsets.s
In file included from arch/s390/kernel/asm-offsets.c:7:
include/linux/sched.h: In function 'spin_needbreak':
include/linux/sched.h:1931: error: implicit declaration of function '__raw_spin_is_contended'
make[2]: *** [arch/s390/kernel/asm-offsets.s] Error 1

Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/Kconfig |5 +
 1 file changed, 5 insertions(+)

diff -urpN linux-2.6/arch/s390/Kconfig linux-2.6-patched/arch/s390/Kconfig
--- linux-2.6/arch/s390/Kconfig 2008-01-31 13:57:33.0 +0100
+++ linux-2.6-patched/arch/s390/Kconfig 2008-01-31 13:57:42.0 +0100
@@ -47,6 +47,11 @@ config NO_IOMEM
 config NO_DMA
def_bool y
 
+config GENERIC_LOCKBREAK
+   bool
+   default y
+   depends on SMP && PREEMPT
+
 mainmenu "Linux Kernel Configuration"
 
 config S390




Re: [patch 2/3] CONFIG_HIGHPTE vs. sub-page page tables.

2008-02-04 Thread Martin Schwidefsky
On Sat, 2008-02-02 at 21:53 -0800, Andrew Morton wrote:
> On Sun, 03 Feb 2008 16:37:00 +1100 Benjamin Herrenschmidt <[EMAIL PROTECTED]> 
> wrote:
> 
> > Why dropping add-mm-argument-to-pte-pmd-pud-pgd_free.patch though ?
> 
> I dropped the whole series.

Sniff .. my patches .. ;-)

> > It's a sane patch and a helps going further, and a total pain to re-do
> > later on. Besides, I may have some use for it on powerpc at some point
> > too...
> 
> OK, I'll try to reestablish it.

Fine. I've got the patch-merge message, so the first of the series is
done. 

> Look: I can't fix *everyone's* stuff.  This was a consequence of ongoing
> unbounded churn in the x86 tree.  If we can find a way of preventing those
> guys (and everyone else) from trashing everyone else's stuff then we'd have
> much smoother sailing.

Understood. That is where I jump in and regenerate my patches on the
latest available level. That the patches have held up for some months in
-mm now without really breaking anything is an indication that we can
push them upstream now, isn't it? That would make the patch problem go
away and I could queue my s390 specific page table rework. Our KVM
people keep asking about it.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




Re: [patch 2/3] CONFIG_HIGHPTE vs. sub-page page tables.

2008-02-05 Thread Martin Schwidefsky
On Mon, 2008-02-04 at 02:51 -0800, Andrew Morton wrote:
> > > Look: I can't fix *everyone's* stuff.  This was a consequence of ongoing
> > > unbounded churn in the x86 tree.  If we can find a way of preventing those
> > > guys (and everyone else) from trashing everyone else's stuff then we'd
> > > have much smoother sailing.
> > 
> > Understood. That is where I jump in and regenerate my patches on the
> > latest available level. That the patches did hold up for some months in
> > -mm now without really breaking anything is an indication that we can
> > push them upstream now, isn't ? That would make the patch problem go
> > away and I could queue my s390 specific page table rework. Our KVM
> > people keep asking about it.
> 
> yes, against 2.6.24-mm1 would be good, thanks.  I really don't know what
> went wrong in i386 but I ended up getting all grumpy at the macro mess
> we've made in all the pagetable handling.  Please do take a look at
> improving that.

I'm trying to replace the __pte_free_tlb macros my patch touches for the
different architectures. Not much luck yet, there is a reason why
__pte_free_tlb is a macro in the first place: welcome to #include hell.
I'm starting to get grumpy as well..

Just an example for x86-64:
* asm-x86/tlb.h includes asm-generic/tlb.h
* asm-generic/tlb.h includes asm-x86/pgalloc.h
* asm-x86/pgalloc.h includes asm-x86/pgalloc_64.h
* asm-x86/pgalloc_64.h includes asm-x86/tlb.h
* since asm-x86/tlb.h started this #include chain it expands to nothing
* asm-x86/pgalloc_64.h calls tlb_remove_page which is defined in
  asm-x86/tlb.h but the compiler hasn't seen the definition yet
* you lose..

I got x86-64 compiled by removing the #include <asm/pgalloc.h> from
asm-generic/tlb.h. But who knows what will break if the include is
missing .. I'll cross compile some of the other architectures next.
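
The reason a macro survives this include order while an inline function does not can be shown with a small single-file sketch. The three parts below stand in for headers seen in sequence; every demo_ name is hypothetical and has nothing to do with the real tlb or pgalloc code.

```c
#include <assert.h>

/*
 * Part 1: plays the role of the header the compiler sees first.
 * A macro is only expanded at its point of use, so it may name a
 * function that has not been declared yet.
 */
#define demo_pte_free_tlb(counter, n) demo_tlb_remove_page((counter), (n))

/*
 * An inline function at this point would trip over the missing
 * declaration, because its body is compiled right here:
 *
 *   static inline void demo_pte_free_tlb_fn(int *c, int n)
 *   { demo_tlb_remove_page(c, n); }
 */

/* Part 2: plays the role of the header that is seen later. */
static void demo_tlb_remove_page(int *counter, int n)
{
	*counter += n;
}

/* Part 3: a user that comes after both parts. */
static int demo_free_three(void)
{
	int freed = 0;

	demo_pte_free_tlb(&freed, 3);	/* expands after the definition */
	return freed;
}
```

Turning the macro into a function therefore means the declaration order has to be fixed first, which is exactly what the circular includes prevent.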

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




[patch 00/18] s390 bug fix patches.

2008-02-05 Thread Martin Schwidefsky
The shortlog says it all:

Christian Borntraeger (1):
  [S390] sclp_tty/sclp_vt220: Fix scheduling while atomic

Cornelia Huck (3):
  [S390] cio: Clean up chsc response code handling.
  [S390] cio: Update documentation.
  [S390] cio: Add shutdown callback for ccwgroup.

Heiko Carstens (8):
  [S390] DEBUG_PAGEALLOC support for s390.
  [S390] Fix linker script.
  [S390] Fix smp_call_function_mask semantics.
  [S390] Fix couple of section mismatches.
  [S390] Implement ext2_find_next_bit.
  [S390] latencytop s390 support.
  [S390] Remove BUILD_BUG_ON() in vmem code.
  [S390] dcss: Initialize workqueue before using it.

Martin Schwidefsky (2):
  [S390] Define GENERIC_LOCKBREAK.
  [S390] Cleanup & optimize bitops.

Peter Oberparleiter (2):
  [S390] cio: make sense id procedure work with partial hardware response
  [S390] console: allow vt220 console to be the only console

Stefan Haberland (1):
  [S390] dasd: add ifcc handling

Stefan Weinhuber (1):
  [S390] dasd: fix panic caused by alias device offline


The only notable changes are the DEBUG_PAGEALLOC and the latencytop
support. Have fun.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[patch 01/18] cio: make sense id procedure work with partial hardware response

2008-02-05 Thread Martin Schwidefsky
From: Peter Oberparleiter <[EMAIL PROTECTED]>

In some cases the current sense id procedure trips over incomplete
hardware responses. In these cases, checking against the preset value
of 0x is not enough. More critically, the VM DIAG call will always be
considered to have provided data after such an incident, even if it was not
successful at all.

The solution is to always initialize the control unit data before doing a
sense id call. Check the condition code before considering the control unit
data. And initialize again, before evaluating the VM data.

Signed-off-by: Peter Oberparleiter <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/cio/device_id.c |  107 +--
 1 file changed, 64 insertions(+), 43 deletions(-)

Index: quilt-2.6/drivers/s390/cio/device_id.c
===
--- quilt-2.6.orig/drivers/s390/cio/device_id.c
+++ quilt-2.6/drivers/s390/cio/device_id.c
@@ -26,17 +26,18 @@
 #include "ioasm.h"
 #include "io_sch.h"
 
-/*
- * Input :
- *   devno - device number
- *   ps   - pointer to sense ID data area
- * Output : none
+/**
+ * vm_vdev_to_cu_type - Convert vm virtual device into control unit type
+ * for certain devices.
+ * @class: virtual device class
+ * @type: virtual device type
+ *
+ * Returns control unit type if a match was made or %0x otherwise.
  */
-static void
-VM_virtual_device_info (__u16 devno, struct senseid *ps)
+static int vm_vdev_to_cu_type(int class, int type)
 {
static struct {
-   int vrdcvcla, vrdcvtyp, cu_type;
+   int class, type, cu_type;
} vm_devices[] = {
{ 0x08, 0x01, 0x3480 },
{ 0x08, 0x02, 0x3430 },
@@ -68,8 +69,26 @@ VM_virtual_device_info (__u16 devno, str
{ 0x40, 0xc0, 0x5080 },
{ 0x80, 0x00, 0x3215 },
};
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(vm_devices); i++)
+   if (class == vm_devices[i].class && type == vm_devices[i].type)
+   return vm_devices[i].cu_type;
+
+   return 0x;
+}
+
+/**
+ * diag_get_dev_info - retrieve device information via DIAG X'210'
+ * @devno: device number
+ * @ps: pointer to sense ID data area
+ *
+ * Returns zero on success, non-zero otherwise.
+ */
+static int diag_get_dev_info(u16 devno, struct senseid *ps)
+{
struct diag210 diag_data;
-   int ccode, i;
+   int ccode;
 
CIO_TRACE_EVENT (4, "VMvdinf");
 
@@ -79,21 +98,21 @@ VM_virtual_device_info (__u16 devno, str
};
 
ccode = diag210 (&diag_data);
-   ps->reserved = 0xff;
+   if ((ccode == 0) || (ccode == 2)) {
+   ps->reserved = 0xff;
 
-   /* Special case for bloody osa devices. */
-   if (diag_data.vrdcvcla == 0x02 &&
-   diag_data.vrdcvtyp == 0x20) {
-   ps->cu_type = 0x3088;
-   ps->cu_model = 0x60;
-   return;
-   }
-   for (i = 0; i < ARRAY_SIZE(vm_devices); i++)
-   if (diag_data.vrdcvcla == vm_devices[i].vrdcvcla &&
-   diag_data.vrdcvtyp == vm_devices[i].vrdcvtyp) {
-   ps->cu_type = vm_devices[i].cu_type;
-   return;
+   /* Special case for osa devices. */
+   if (diag_data.vrdcvcla == 0x02 && diag_data.vrdcvtyp == 0x20) {
+   ps->cu_type = 0x3088;
+   ps->cu_model = 0x60;
+   return 0;
}
+   ps->cu_type = vm_vdev_to_cu_type(diag_data.vrdcvcla,
+   diag_data.vrdcvtyp);
+   if (ps->cu_type != 0x)
+   return 0;
+   }
+
CIO_MSG_EVENT(0, "DIAG X'210' for device %04X returned (cc = %d):"
  "vdev class : %02X, vdev type : %04X \n ...  "
  "rdev class : %02X, rdev type : %04X, "
@@ -102,6 +121,8 @@ VM_virtual_device_info (__u16 devno, str
  diag_data.vrdcvcla, diag_data.vrdcvtyp,
  diag_data.vrdcrccl, diag_data.vrdccrty,
  diag_data.vrdccrmd);
+
+   return -ENODEV;
 }
 
 /*
@@ -130,6 +151,7 @@ __ccw_device_sense_id_start(struct ccw_d
/* Try on every path. */
ret = -ENODEV;
while (cdev->private->imask != 0) {
+   cdev->private->senseid.cu_type = 0x;
if ((sch->opm & cdev->private->imask) != 0 &&
cdev->private->iretry > 0) {
cdev->private->iretry--;
@@ -153,7 +175,6 @@ ccw_device_sense_id_start(struct ccw_dev
int ret;
 
memset (&c

[patch 02/18] cio: Clean up chsc response code handling.

2008-02-05 Thread Martin Schwidefsky
From: Cornelia Huck <[EMAIL PROTECTED]>

This provides unified return codes for common response codes and
also makes the debug feature messages more similar and informational.

Signed-off-by: Cornelia Huck <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/cio/chsc.c |  147 +---
 1 file changed, 55 insertions(+), 92 deletions(-)

Index: quilt-2.6/drivers/s390/cio/chsc.c
===
--- quilt-2.6.orig/drivers/s390/cio/chsc.c
+++ quilt-2.6/drivers/s390/cio/chsc.c
@@ -26,6 +26,25 @@
 
 static void *sei_page;
 
+static int chsc_error_from_response(int response)
+{
+   switch (response) {
+   case 0x0001:
+   return 0;
+   case 0x0002:
+   case 0x0003:
+   case 0x0006:
+   case 0x0007:
+   case 0x0008:
+   case 0x000a:
+   return -EINVAL;
+   case 0x0004:
+   return -EOPNOTSUPP;
+   default:
+   return -EIO;
+   }
+}
+
 struct chsc_ssd_area {
struct chsc_header request;
u16 :10;
@@ -75,11 +94,11 @@ int chsc_get_ssd_info(struct subchannel_
ret = (ccode == 3) ? -ENODEV : -EBUSY;
goto out_free;
}
-   if (ssd_area->response.code != 0x0001) {
+   ret = chsc_error_from_response(ssd_area->response.code);
+   if (ret != 0) {
CIO_MSG_EVENT(2, "chsc: ssd failed for 0.%x.%04x (rc=%04x)\n",
  schid.ssid, schid.sch_no,
  ssd_area->response.code);
-   ret = -EIO;
goto out_free;
}
if (!ssd_area->sch_valid) {
@@ -717,36 +736,15 @@ __chsc_do_secm(struct channel_subsystem 
return (ccode == 3) ? -ENODEV : -EBUSY;
 
switch (secm_area->response.code) {
-   case 0x0001: /* Success. */
-   ret = 0;
-   break;
-   case 0x0003: /* Invalid block. */
-   case 0x0007: /* Invalid format. */
-   case 0x0008: /* Other invalid block. */
-   CIO_CRW_EVENT(2, "Error in chsc request block!\n");
-   ret = -EINVAL;
-   break;
-   case 0x0004: /* Command not provided in model. */
-   CIO_CRW_EVENT(2, "Model does not provide secm\n");
-   ret = -EOPNOTSUPP;
-   break;
-   case 0x0102: /* cub adresses incorrect */
-   CIO_CRW_EVENT(2, "Invalid addresses in chsc request block\n");
-   ret = -EINVAL;
-   break;
-   case 0x0103: /* key error */
-   CIO_CRW_EVENT(2, "Access key error in secm\n");
+   case 0x0102:
+   case 0x0103:
ret = -EINVAL;
-   break;
-   case 0x0105: /* error while starting */
-   CIO_CRW_EVENT(2, "Error while starting channel measurement\n");
-   ret = -EIO;
-   break;
default:
-   CIO_CRW_EVENT(2, "Unknown CHSC response %d\n",
- secm_area->response.code);
-   ret = -EIO;
+   ret = chsc_error_from_response(secm_area->response.code);
}
+   if (ret != 0)
+   CIO_CRW_EVENT(2, "chsc: secm failed (rc=%04x)\n",
+ secm_area->response.code);
return ret;
 }
 
@@ -827,27 +825,14 @@ int chsc_determine_channel_path_descript
goto out;
}
 
-   switch (scpd_area->response.code) {
-   case 0x0001: /* Success. */
+   ret = chsc_error_from_response(scpd_area->response.code);
+   if (ret == 0)
+   /* Success. */
memcpy(desc, &scpd_area->desc,
   sizeof(struct channel_path_desc));
-   ret = 0;
-   break;
-   case 0x0003: /* Invalid block. */
-   case 0x0007: /* Invalid format. */
-   case 0x0008: /* Other invalid block. */
-   CIO_CRW_EVENT(2, "Error in chsc request block!\n");
-   ret = -EINVAL;
-   break;
-   case 0x0004: /* Command not provided in model. */
-   CIO_CRW_EVENT(2, "Model does not provide scpd\n");
-   ret = -EOPNOTSUPP;
-   break;
-   default:
-   CIO_CRW_EVENT(2, "Unknown CHSC response %d\n",
+   else
+   CIO_CRW_EVENT(2, "chsc: scpd failed (rc=%04x)\n",
  scpd_area->response.code);
-   ret = -EIO;
-   }
 out:
free_page((unsigned long)scpd_area);
return ret;
@@ -923,8 +908,9 @@ int chsc_get_channel_measurement_chars(s
goto out;
}
 
-   switch (scmc_area->response.code) {
-   case 0x0001: /* Success. */
+   ret = chsc_error_from
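
The response-code mapping added by this patch can be tried out stand-alone. The sketch below copies the switch from chsc_error_from_response() into a user-space function and adds a caller in the style of the secm hunk, which handles its two command-specific codes itself and defers everything else to the common helper. The demo_ names are chosen for this example.

```c
#include <assert.h>
#include <errno.h>

/* User-space copy of the common mapping from the patch. */
static int demo_chsc_error_from_response(int response)
{
	switch (response) {
	case 0x0001:
		return 0;
	case 0x0002:
	case 0x0003:
	case 0x0006:
	case 0x0007:
	case 0x0008:
	case 0x000a:
		return -EINVAL;
	case 0x0004:
		return -EOPNOTSUPP;
	default:
		return -EIO;
	}
}

/* Caller with command-specific codes, falling back to the common mapping. */
static int demo_secm_result(int response)
{
	switch (response) {
	case 0x0102:
	case 0x0103:
		return -EINVAL;
	default:
		return demo_chsc_error_from_response(response);
	}
}
```

Codes the caller does not know about, such as 0x0105, fall through to the common default and come back as -EIO.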

[patch 05/18] DEBUG_PAGEALLOC support for s390.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/Kconfig.debug   |8 
 arch/s390/kernel/traps.c  |5 -
 arch/s390/mm/init.c   |   27 +++
 include/asm-s390/cacheflush.h |4 
 4 files changed, 43 insertions(+), 1 deletion(-)

Index: quilt-2.6/arch/s390/Kconfig.debug
===
--- quilt-2.6.orig/arch/s390/Kconfig.debug
+++ quilt-2.6/arch/s390/Kconfig.debug
@@ -6,4 +6,12 @@ config TRACE_IRQFLAGS_SUPPORT
 
 source "lib/Kconfig.debug"
 
+config DEBUG_PAGEALLOC
+   bool "Debug page memory allocations"
+   depends on DEBUG_KERNEL
+   help
+ Unmap pages from the kernel linear mapping after free_pages().
+ This results in a slowdown, but helps to find certain types of
+ memory corruptions.
+
 endmenu
Index: quilt-2.6/arch/s390/kernel/traps.c
===
--- quilt-2.6.orig/arch/s390/kernel/traps.c
+++ quilt-2.6/arch/s390/kernel/traps.c
@@ -271,7 +271,10 @@ void die(const char * str, struct pt_reg
printk("PREEMPT ");
 #endif
 #ifdef CONFIG_SMP
-   printk("SMP");
+   printk("SMP ");
+#endif
+#ifdef CONFIG_DEBUG_PAGEALLOC
+   printk("DEBUG_PAGEALLOC");
 #endif
printk("\n");
notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, SIGSEGV);
Index: quilt-2.6/arch/s390/mm/init.c
===
--- quilt-2.6.orig/arch/s390/mm/init.c
+++ quilt-2.6/arch/s390/mm/init.c
@@ -167,6 +167,33 @@ void __init mem_init(void)
   PFN_ALIGN((unsigned long)&_eshared) - 1);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
+void kernel_map_pages(struct page *page, int numpages, int enable)
+{
+   pgd_t *pgd;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+   unsigned long address;
+   int i;
+
+   for (i = 0; i < numpages; i++) {
+   address = page_to_phys(page + i);
+   pgd = pgd_offset_k(address);
+   pud = pud_offset(pgd, address);
+   pmd = pmd_offset(pud, address);
+   pte = pte_offset_kernel(pmd, address);
+   if (!enable) {
+   ptep_invalidate(address, pte);
+   continue;
+   }
+   *pte = mk_pte_phys(address, __pgprot(_PAGE_TYPE_RW));
+   /* Flush cpu write queue. */
+   mb();
+   }
+}
+#endif
+
 void free_initmem(void)
 {
 unsigned long addr;
Index: quilt-2.6/include/asm-s390/cacheflush.h
===
--- quilt-2.6.orig/include/asm-s390/cacheflush.h
+++ quilt-2.6/include/asm-s390/cacheflush.h
@@ -24,4 +24,8 @@
 #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
memcpy(dst, src, len)
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
+void kernel_map_pages(struct page *page, int numpages, int enable);
+#endif
+
 #endif /* _S390_CACHEFLUSH_H */

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[patch 07/18] Fix smp_call_function_mask semantics.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Make sure func isn't called on the local cpu just like on all other
architectures that implement this function.

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/kernel/smp.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

Index: quilt-2.6/arch/s390/kernel/smp.c
===
--- quilt-2.6.orig/arch/s390/kernel/smp.c
+++ quilt-2.6/arch/s390/kernel/smp.c
@@ -225,12 +225,11 @@ EXPORT_SYMBOL(smp_call_function_single);
  * You must not call this function with disabled interrupts or from a
  * hardware interrupt handler or from a bottom half handler.
  */
-int
-smp_call_function_mask(cpumask_t mask,
-   void (*func)(void *), void *info,
-   int wait)
+int smp_call_function_mask(cpumask_t mask, void (*func)(void *), void *info,
+  int wait)
 {
preempt_disable();
+   cpu_clear(smp_processor_id(), mask);
__smp_call_function_map(func, info, 0, wait, mask);
preempt_enable();
return 0;
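
The effect of the added cpu_clear() call can be pictured with a plain bitmask. This is a sketch under the assumption of one bit per CPU in an unsigned long; demo_cpumask_t and demo_remote_targets are hypothetical names, not kernel interfaces.

```c
#include <assert.h>

/* Hypothetical stand-in for cpumask_t: one bit per CPU. */
typedef unsigned long demo_cpumask_t;

/* Drop the calling CPU from the set of CPUs that should run func. */
static demo_cpumask_t demo_remote_targets(demo_cpumask_t mask, unsigned int self)
{
	return mask & ~(1UL << self);
}
```

If the caller runs on CPU 2 and asks for CPUs 0 to 3, only CPUs 0, 1 and 3 remain as targets. A mask that never contained the local CPU is left unchanged.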

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[patch 03/18] cio: Update documentation.

2008-02-05 Thread Martin Schwidefsky
From: Cornelia Huck <[EMAIL PROTECTED]>

Signed-off-by: Cornelia Huck <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 Documentation/DocBook/s390-drivers.tmpl |   21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

Index: quilt-2.6/Documentation/DocBook/s390-drivers.tmpl
===
--- quilt-2.6.orig/Documentation/DocBook/s390-drivers.tmpl
+++ quilt-2.6/Documentation/DocBook/s390-drivers.tmpl
@@ -59,7 +59,7 @@
Introduction
   
 This document describes the interfaces available for device drivers that
-drive s390 based channel attached devices. This includes interfaces for
+drive s390 based channel attached I/O devices. This includes interfaces for
 interaction with the hardware and interfaces for interacting with the
 common driver core. Those interfaces are provided by the s390 common I/O
 layer.
@@ -86,9 +86,10 @@
The ccw bus typically contains the majority of devices available to
a s390 system. Named after the channel command word (ccw), the basic
command structure used to address its devices, the ccw bus contains
-   so-called channel attached devices. They are addressed via subchannels,
-   visible on the css bus. A device driver, however, will never interact
-   with the subchannel directly, but only via the device on the ccw bus,
+   so-called channel attached devices. They are addressed via I/O
+   subchannels, visible on the css bus. A device driver for
+   channel-attached devices, however, will never interact  with the
+   subchannel directly, but only via the I/O device on the ccw bus,
the ccw device.
   
 
@@ -116,7 +117,6 @@
 !Iinclude/asm-s390/ccwdev.h
 !Edrivers/s390/cio/device.c
 !Edrivers/s390/cio/device_ops.c
-!Edrivers/s390/cio/airq.c
 
 
  The channel-measurement facility
@@ -147,4 +147,15 @@

   
 
+  
+   Generic interfaces
+  
+   Some interfaces are available to other drivers that do not necessarily
+   have anything to do with the busses described above, but still are
+   indirectly using basic infrastructure in the common I/O layer.
+   One example is the support for adapter interrupts.
+  
+!Edrivers/s390/cio/airq.c
+  
+
 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[patch 10/18] Define GENERIC_LOCKBREAK.

2008-02-05 Thread Martin Schwidefsky
From: Martin Schwidefsky <[EMAIL PROTECTED]>

Fix compile error:

  CC  arch/s390/kernel/asm-offsets.s
In file included from arch/s390/kernel/asm-offsets.c:7:
include/linux/sched.h: In function 'spin_needbreak':
include/linux/sched.h:1931: error: implicit declaration of function '__raw_spin_is_contended'
make[2]: *** [arch/s390/kernel/asm-offsets.s] Error 1

Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/Kconfig |5 +
 1 file changed, 5 insertions(+)

Index: quilt-2.6/arch/s390/Kconfig
===
--- quilt-2.6.orig/arch/s390/Kconfig
+++ quilt-2.6/arch/s390/Kconfig
@@ -47,6 +47,11 @@ config NO_IOMEM
 config NO_DMA
def_bool y
 
+config GENERIC_LOCKBREAK
+   bool
+   default y
+   depends on SMP && PREEMPT
+
 mainmenu "Linux Kernel Configuration"
 
 config S390

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[patch 12/18] Implement ext2_find_next_bit.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Fixes this compile error:

fs/ext4/mballoc.c: In function 'ext4_mb_generate_buddy':
fs/ext4/mballoc.c:954: error: implicit declaration of function 'generic_find_next_le_bit'
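
For readers unfamiliar with the ext2 bit numbering, here is a minimal
portable sketch of what a little-endian "find next set bit" is expected to
return. The name le_find_next_bit is hypothetical, and the bit-at-a-time
loop is an assumption chosen for clarity; it is not the word-wise s390
implementation in the diff below.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the first set bit at or after
 * `offset` in a little-endian bitmap of `size` bits, or `size` if there
 * is none. Bit n is stored in byte n / 8 at bit position n % 8, whatever
 * the byte order of the host.
 */
static unsigned long le_find_next_bit(const void *vaddr, unsigned long size,
                                      unsigned long offset)
{
    const unsigned char *map = vaddr;
    unsigned long n;

    for (n = offset; n < size; n++) {
        if (map[n / 8] & (1u << (n % 8)))
            return n;
    }
    return size;
}
```

As with the function added by the patch, a return value equal to size
means that no set bit was found.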

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 include/asm-s390/bitops.h |   43 +--
 1 file changed, 41 insertions(+), 2 deletions(-)

Index: quilt-2.6/include/asm-s390/bitops.h
===
--- quilt-2.6.orig/include/asm-s390/bitops.h
+++ quilt-2.6/include/asm-s390/bitops.h
@@ -790,8 +790,6 @@ static inline int sched_find_first_bit(u
test_and_clear_bit((nr)^(__BITOPS_WORDSIZE - 8), (unsigned long *)addr)
 #define ext2_test_bit(nr, addr)  \
test_bit((nr)^(__BITOPS_WORDSIZE - 8), (unsigned long *)addr)
-#define ext2_find_next_bit(addr, size, off) \
-   generic_find_next_le_bit((unsigned long *)(addr), (size), (off))
 
 static inline int ext2_find_first_zero_bit(void *vaddr, unsigned int size)
 {
@@ -833,6 +831,47 @@ static inline int ext2_find_next_zero_bi
return offset + ext2_find_first_zero_bit(p, size);
 }
 
+static inline unsigned long ext2_find_first_bit(void *vaddr,
+   unsigned long size)
+{
+   unsigned long bytes, bits;
+
+   if (!size)
+   return 0;
+   bytes = __ffs_word_loop(vaddr, size);
+   bits = __ffs_word(bytes*8, __load_ulong_le(vaddr, bytes));
+   return (bits < size) ? bits : size;
+}
+
+static inline int ext2_find_next_bit(void *vaddr, unsigned long size,
+unsigned long offset)
+{
+   unsigned long *addr = vaddr, *p;
+   unsigned long bit, set;
+
+   if (offset >= size)
+   return size;
+   bit = offset & (__BITOPS_WORDSIZE - 1);
+   offset -= bit;
+   size -= offset;
+   p = addr + offset / __BITOPS_WORDSIZE;
+   if (bit) {
+   /*
+* s390 version of ffz returns __BITOPS_WORDSIZE
+* if no zero bit is present in the word.
+*/
+   set = ffs(__load_ulong_le(p, 0) >> bit) + bit;
+   if (set >= size)
+   return size + offset;
+   if (set < __BITOPS_WORDSIZE)
+   return set + offset;
+   offset += __BITOPS_WORDSIZE;
+   size -= __BITOPS_WORDSIZE;
+   p++;
+   }
+   return offset + ext2_find_first_bit(p, size);
+}
+
 #include 
 
 #endif /* __KERNEL__ */

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 11/18] Cleanup & optimize bitops.

2008-02-05 Thread Martin Schwidefsky
From: Martin Schwidefsky <[EMAIL PROTECTED]>

The bitops header is now a bit shorter and easier to understand since
it uses less inline assembly. It requires some tricks to persuade the
compiler to generate decent code. The ffz/ffs functions now use the
_zb_findmap/_sb_findmap table as well.
With this cleanup the new bitops for ext4 can be implemented with a
few lines, instead of another large inline assembly.
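
The table-based approach can be sketched in portable C as follows. This is
only an illustration: zb_findmap and init_zb_findmap are hypothetical
stand-ins for the kernel's precomputed _zb_findmap table, and the word is
fixed at 64 bits here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for _zb_findmap: for every byte value, the index
 * of its first zero bit (8 if the byte is 0xff).
 */
static unsigned char zb_findmap[256];

static void init_zb_findmap(void)
{
    for (int v = 0; v < 256; v++) {
        int bit = 0;

        while (bit < 8 && (v & (1 << bit)))
            bit++;
        zb_findmap[v] = (unsigned char)bit;
    }
}

/*
 * Add the number of the first zero bit in `word` to `nr`. The word is
 * narrowed by halves until one byte is left, then the table is consulted.
 */
static unsigned long ffz_word(unsigned long nr, uint64_t word)
{
    if ((word & 0xffffffffULL) == 0xffffffffULL) {
        word >>= 32;
        nr += 32;
    }
    if ((word & 0xffff) == 0xffff) {
        word >>= 16;
        nr += 16;
    }
    if ((word & 0xff) == 0xff) {
        word >>= 8;
        nr += 8;
    }
    return nr + zb_findmap[(unsigned char)word];
}
```

With this table a word without any zero bit yields nr + 64, so callers
can compare the result against the bitmap size instead of testing the word
against ~0UL beforehand.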

Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 include/asm-s390/bitops.h |  515 +++---
 1 file changed, 219 insertions(+), 296 deletions(-)

Index: quilt-2.6/include/asm-s390/bitops.h
===
--- quilt-2.6.orig/include/asm-s390/bitops.h
+++ quilt-2.6/include/asm-s390/bitops.h
@@ -440,242 +440,256 @@ __constant_test_bit(unsigned long nr, co
  __test_bit((nr),(addr)) )
 
 /*
- * ffz = Find First Zero in word. Undefined if no zero exists,
- * so code should check against ~0UL first..
+ * Optimized find bit helper functions.
  */
-static inline unsigned long ffz(unsigned long word)
+
+/**
+ * __ffz_word_loop - find byte offset of first long != -1UL
+ * @addr: pointer to array of unsigned long
+ * @size: size of the array in bits
+ */
+static inline unsigned long __ffz_word_loop(const unsigned long *addr,
+   unsigned long size)
+{
+   typedef struct { long _[__BITOPS_WORDS(size)]; } addrtype;
+   unsigned long bytes = 0;
+
+   asm volatile(
+#ifndef __s390x__
+   "   ahi %1,31\n"
+   "   srl %1,5\n"
+   "0: c   %2,0(%0,%3)\n"
+   "   jne 1f\n"
+   "   la  %0,4(%0)\n"
+   "   brct%1,0b\n"
+   "1:\n"
+#else
+   "   aghi%1,63\n"
+   "   srlg%1,%1,6\n"
+   "0: cg  %2,0(%0,%3)\n"
+   "   jne 1f\n"
+   "   la  %0,8(%0)\n"
+   "   brct%1,0b\n"
+   "1:\n"
+#endif
+   : "+a" (bytes), "+d" (size)
+   : "d" (-1UL), "a" (addr), "m" (*(addrtype *) addr)
+   : "cc" );
+   return bytes;
+}
+
+/**
+ * __ffs_word_loop - find byte offset of first long != 0UL
+ * @addr: pointer to array of unsigned long
+ * @size: size of the array in bits
+ */
+static inline unsigned long __ffs_word_loop(const unsigned long *addr,
+   unsigned long size)
 {
-unsigned long bit = 0;
+   typedef struct { long _[__BITOPS_WORDS(size)]; } addrtype;
+   unsigned long bytes = 0;
+
+   asm volatile(
+#ifndef __s390x__
+   "   ahi %1,31\n"
+   "   srl %1,5\n"
+   "0: c   %2,0(%0,%3)\n"
+   "   jne 1f\n"
+   "   la  %0,4(%0)\n"
+   "   brct%1,0b\n"
+   "1:\n"
+#else
+   "   aghi%1,63\n"
+   "   srlg%1,%1,6\n"
+   "0: cg  %2,0(%0,%3)\n"
+   "   jne 1f\n"
+   "   la  %0,8(%0)\n"
+   "   brct%1,0b\n"
+   "1:\n"
+#endif
+   : "+a" (bytes), "+a" (size)
+   : "d" (0UL), "a" (addr), "m" (*(addrtype *) addr)
+   : "cc" );
+   return bytes;
+}
 
+/**
+ * __ffz_word - add number of the first unset bit
+ * @nr: base value the bit number is added to
+ * @word: the word that is searched for unset bits
+ */
+static inline unsigned long __ffz_word(unsigned long nr, unsigned long word)
+{
 #ifdef __s390x__
if (likely((word & 0xffffffff) == 0xffffffff)) {
word >>= 32;
-   bit += 32;
+   nr += 32;
}
 #endif
if (likely((word & 0xffff) == 0xffff)) {
word >>= 16;
-   bit += 16;
+   nr += 16;
}
if (likely((word & 0xff) == 0xff)) {
word >>= 8;
-   bit += 8;
+   nr += 8;
}
-   return bit + _zb_findmap[word & 0xff];
+   return nr + _zb_findmap[(unsigned char) word];
 }
 
-/*
- * __ffs = find first bit in word. Undefined if no bit exists,
- * so code should check against 0UL first..
+/**
+ * __ffs_word - add number of the first set bit
+ * @nr: base value the bit number is added to
+ * @word: the word that is searched for set bits
  */
-static 

[patch 06/18] Fix linker script.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Fixes this warning:
vmlinux: warning: allocated section `.text' not in segment

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/kernel/vmlinux.lds.S |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: quilt-2.6/arch/s390/kernel/vmlinux.lds.S
===
--- quilt-2.6.orig/arch/s390/kernel/vmlinux.lds.S
+++ quilt-2.6/arch/s390/kernel/vmlinux.lds.S
@@ -35,7 +35,7 @@ SECTIONS
KPROBES_TEXT
*(.fixup)
*(.gnu.warning)
-   } = 0x0700
+   } :text = 0x0700
 
_etext = .; /* End of text section */
 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 08/18] Fix couple of section mismatches.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Fix a couple of section mismatches. Since we touch the code anyway,
change the IPL code to use C99 designated initializers.
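
As a small illustration of the initializer change, compare the positional
and the designated form on a reduced structure. struct action and the
demo_* functions are hypothetical; only the member names mirror the ones
used in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a shutdown action: a name plus two callbacks. */
struct action {
    const char *name;
    void (*fn)(void);
    int (*init)(void);
};

static void demo_run(void)
{
}

static int demo_init(void)
{
    return 0;
}

/* Positional form: correct only as long as the member order never changes. */
static struct action positional = { "demo", demo_run, demo_init };

/* Designated form: members are named, so their order no longer matters. */
static struct action designated = {
    .name = "demo",
    .init = demo_init,
    .fn   = demo_run,
};
```

Both objects end up with identical contents; the designated form merely
keeps working if members are reordered or added later.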

Cc: Michael Holzheu <[EMAIL PROTECTED]>
Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/kernel/entry.S   |7 ++-
 arch/s390/kernel/entry64.S |7 ++-
 arch/s390/kernel/ipl.c |   27 ++-
 arch/s390/kernel/setup.c   |2 +-
 arch/s390/kernel/smp.c |6 +++---
 arch/s390/mm/vmem.c|2 +-
 6 files changed, 27 insertions(+), 24 deletions(-)

Index: quilt-2.6/arch/s390/kernel/entry64.S
===
--- quilt-2.6.orig/arch/s390/kernel/entry64.S
+++ quilt-2.6/arch/s390/kernel/entry64.S
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -801,9 +802,7 @@ mcck_return:
  * Restart interruption handler, kick starter for additional CPUs
  */
 #ifdef CONFIG_SMP
-#ifndef CONFIG_HOTPLUG_CPU
-   .section .init.text,"ax"
-#endif
+   __CPUINIT
.globl restart_int_handler
 restart_int_handler:
lg  %r15,__LC_SAVE_AREA+120 # load ksp
@@ -814,9 +813,7 @@ restart_int_handler:
lmg %r6,%r15,__SF_GPRS(%r15) # load registers from clone
stosm   __SF_EMPTY(%r15),0x04   # now we can turn dat on
jg  start_secondary
-#ifndef CONFIG_HOTPLUG_CPU
.previous
-#endif
 #else
 /*
  * If we do not run with SMP enabled, let the new CPU crash ...
Index: quilt-2.6/arch/s390/kernel/entry.S
===
--- quilt-2.6.orig/arch/s390/kernel/entry.S
+++ quilt-2.6/arch/s390/kernel/entry.S
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -830,9 +831,7 @@ mcck_return:
  * Restart interruption handler, kick starter for additional CPUs
  */
 #ifdef CONFIG_SMP
-#ifndef CONFIG_HOTPLUG_CPU
-   .section .init.text,"ax"
-#endif
+   __CPUINIT
.globl restart_int_handler
 restart_int_handler:
l   %r15,__LC_SAVE_AREA+60  # load ksp
@@ -845,9 +844,7 @@ restart_int_handler:
br  %r14# branch to start_secondary
 restart_addr:
.long   start_secondary
-#ifndef CONFIG_HOTPLUG_CPU
.previous
-#endif
 #else
 /*
  * If we do not run with SMP enabled, let the new CPU crash ...
Index: quilt-2.6/arch/s390/kernel/ipl.c
===
--- quilt-2.6.orig/arch/s390/kernel/ipl.c
+++ quilt-2.6/arch/s390/kernel/ipl.c
@@ -439,7 +439,7 @@ static void ipl_run(struct shutdown_trig
reipl_ccw_dev(&ipl_info.data.ccw.dev_id);
 }
 
-static int ipl_init(void)
+static int __init ipl_init(void)
 {
int rc;
 
@@ -471,8 +471,11 @@ out:
return 0;
 }
 
-static struct shutdown_action ipl_action = {SHUTDOWN_ACTION_IPL_STR, ipl_run,
-   ipl_init};
+static struct shutdown_action __refdata ipl_action = {
+   .name   = SHUTDOWN_ACTION_IPL_STR,
+   .fn = ipl_run,
+   .init   = ipl_init,
+};
 
 /*
  * reipl shutdown action: Reboot Linux on shutdown.
@@ -792,7 +795,7 @@ static int __init reipl_fcp_init(void)
return 0;
 }
 
-static int reipl_init(void)
+static int __init reipl_init(void)
 {
int rc;
 
@@ -819,8 +822,11 @@ static int reipl_init(void)
return 0;
 }
 
-static struct shutdown_action reipl_action = {SHUTDOWN_ACTION_REIPL_STR,
- reipl_run, reipl_init};
+static struct shutdown_action __refdata reipl_action = {
+   .name   = SHUTDOWN_ACTION_REIPL_STR,
+   .fn = reipl_run,
+   .init   = reipl_init,
+};
 
 /*
  * dump shutdown action: Dump Linux on shutdown.
@@ -998,7 +1004,7 @@ static int __init dump_fcp_init(void)
return 0;
 }
 
-static int dump_init(void)
+static int __init dump_init(void)
 {
int rc;
 
@@ -1020,8 +1026,11 @@ static int dump_init(void)
return 0;
 }
 
-static struct shutdown_action dump_action = {SHUTDOWN_ACTION_DUMP_STR,
-dump_run, dump_init};
+static struct shutdown_action __refdata dump_action = {
+   .name   = SHUTDOWN_ACTION_DUMP_STR,
+   .fn = dump_run,
+   .init   = dump_init,
+};
 
 /*
  * vmcmd shutdown action: Trigger vm command on shutdown.
Index: quilt-2.6/arch/s390/kernel/setup.c
===
--- quilt-2.6.orig/arch/s390/kernel/setup.c
+++ quilt-2.6/arch/s390/kernel/setup.c
@@ -77,7 +77,7 @@ unsigned long machine_flags = 0;
 unsigned long elf_hwcap = 0;
 char elf_platform[ELF_PLATFORM_SIZE];
 
-struct mem_chunk __initdata memory_chunk[MEMORY_CHUNKS];
+struct mem_chunk __meminitdata memory_chunk[MEMORY_CHUNKS];
 volatile int __cpu_logical_map[NR_CPUS]; /* logical cpu to

[patch 13/18] latencytop s390 support.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Cc: Holger Wolf <[EMAIL PROTECTED]>
Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/Kconfig |3 +++
 arch/s390/kernel/stacktrace.c |   31 +++
 2 files changed, 26 insertions(+), 8 deletions(-)

Index: quilt-2.6/arch/s390/Kconfig
===
--- quilt-2.6.orig/arch/s390/Kconfig
+++ quilt-2.6/arch/s390/Kconfig
@@ -16,6 +16,9 @@ config LOCKDEP_SUPPORT
 config STACKTRACE_SUPPORT
def_bool y
 
+config HAVE_LATENCYTOP_SUPPORT
+   def_bool y
+
 config RWSEM_GENERIC_SPINLOCK
bool
 
Index: quilt-2.6/arch/s390/kernel/stacktrace.c
===
--- quilt-2.6.orig/arch/s390/kernel/stacktrace.c
+++ quilt-2.6/arch/s390/kernel/stacktrace.c
@@ -14,7 +14,8 @@
 static unsigned long save_context_stack(struct stack_trace *trace,
unsigned long sp,
unsigned long low,
-   unsigned long high)
+   unsigned long high,
+   int savesched)
 {
struct stack_frame *sf;
struct pt_regs *regs;
@@ -47,10 +48,12 @@ static unsigned long save_context_stack(
return sp;
regs = (struct pt_regs *)sp;
addr = regs->psw.addr & PSW_ADDR_INSN;
-   if (!trace->skip)
-   trace->entries[trace->nr_entries++] = addr;
-   else
-   trace->skip--;
+   if (savesched || !in_sched_functions(addr)) {
+   if (!trace->skip)
+   trace->entries[trace->nr_entries++] = addr;
+   else
+   trace->skip--;
+   }
if (trace->nr_entries >= trace->max_entries)
return sp;
low = sp;
@@ -66,15 +69,27 @@ void save_stack_trace(struct stack_trace
orig_sp = sp & PSW_ADDR_INSN;
new_sp = save_context_stack(trace, orig_sp,
S390_lowcore.panic_stack - PAGE_SIZE,
-   S390_lowcore.panic_stack);
+   S390_lowcore.panic_stack, 1);
if (new_sp != orig_sp)
return;
new_sp = save_context_stack(trace, new_sp,
S390_lowcore.async_stack - ASYNC_SIZE,
-   S390_lowcore.async_stack);
+   S390_lowcore.async_stack, 1);
if (new_sp != orig_sp)
return;
save_context_stack(trace, new_sp,
   S390_lowcore.thread_info,
-  S390_lowcore.thread_info + THREAD_SIZE);
+  S390_lowcore.thread_info + THREAD_SIZE, 1);
+}
+
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+   unsigned long sp, low, high;
+
+   sp = tsk->thread.ksp & PSW_ADDR_INSN;
+   low = (unsigned long) task_stack_page(tsk);
+   high = (unsigned long) task_pt_regs(tsk);
+   save_context_stack(trace, sp, low, high, 0);
+   if (trace->nr_entries < trace->max_entries)
+   trace->entries[trace->nr_entries++] = ULONG_MAX;
 }

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 15/18] dasd: fix panic caused by alias device offline

2008-02-05 Thread Martin Schwidefsky
From: Stefan Weinhuber <[EMAIL PROTECTED]>

When an alias device is set offline while it is in use this may
result in a panic in the cleanup part of the dasd_block_tasklet.
The problem here is that there may exist some ccw requests that were
originally created for the alias device and transferred to the base
device when the alias was set offline. When these requests are
cleaned up later, the discipline pointer in the alias device may no
longer be valid. To fix this, use the base device discipline to find
the cleanup function.
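
The idea can be sketched with heavily reduced structures. All types and
names below are hypothetical simplifications of the driver's data
structures, and the invalid discipline pointer of the offline alias is
modeled as NULL, which is an assumption.

```c
#include <assert.h>
#include <stddef.h>

struct ccw_req;

/* Hypothetical, heavily reduced versions of the driver structures. */
struct discipline {
    int (*free_cp)(struct ccw_req *cqr);
};

struct device {
    struct discipline *discipline; /* modeled as NULL once offline */
};

struct block {
    struct device *base;           /* base device that owns the block */
};

struct ccw_req {
    struct device *memdev;         /* device the request was built for */
    struct block *block;           /* block device the request belongs to */
};

static int demo_free_cp(struct ccw_req *cqr)
{
    (void)cqr;
    return 1;
}

static struct discipline demo_discipline = { .free_cp = demo_free_cp };

/* Find the cleanup function through the base device, never the alias. */
static int cleanup_request(struct ccw_req *cqr)
{
    return cqr->block->base->discipline->free_cp(cqr);
}
```

A request built for an alias whose discipline is gone can still be
cleaned up this way, because the base device outlives its aliases in this
model.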

Signed-off-by: Stefan Weinhuber <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/block/dasd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: quilt-2.6/drivers/s390/block/dasd.c
===
--- quilt-2.6.orig/drivers/s390/block/dasd.c
+++ quilt-2.6/drivers/s390/block/dasd.c
@@ -1706,7 +1706,7 @@ static void __dasd_cleanup_cqr(struct da
 
req = (struct request *) cqr->callback_data;
dasd_profile_end(cqr->block, cqr, req);
-   status = cqr->memdev->discipline->free_cp(cqr, req);
+   status = cqr->block->base->discipline->free_cp(cqr, req);
if (status <= 0)
error = status ? status : -EIO;
dasd_end_request(req, error);

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 14/18] dasd: add ifcc handling

2008-02-05 Thread Martin Schwidefsky
From: Stefan Haberland <[EMAIL PROTECTED]>

Add interface control check (ifcc) handling to error recovery.
First retry up to 255 times; if all retries fail, try an alternate
path if possible.
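
The recovery order described above can be sketched as a plain retry loop.
Everything here is hypothetical: try_io_fn, recover and path1_only are
illustration names, and a single retry budget is used for every path,
whereas the diff sets erp->retries = 10 after switching to an alternate
path.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RETRIES 255 /* retry budget named in the description */

/* Hypothetical I/O hook: returns true if the request succeeds on `path`. */
typedef bool (*try_io_fn)(int path);

/*
 * Retry on the current path up to MAX_RETRIES times; once the budget is
 * used up, move on to the next path. Returns the path that worked, or -1
 * if no path is left.
 */
static int recover(try_io_fn try_io, int npaths)
{
    for (int path = 0; path < npaths; path++) {
        for (int retry = 0; retry < MAX_RETRIES; retry++) {
            if (try_io(path))
                return path;
        }
    }
    return -1;
}

/* Demo hook: path 0 always fails, path 1 always works. */
static int attempts_seen;

static bool path1_only(int path)
{
    attempts_seen++;
    return path == 1;
}
```

With the demo hook and two paths, recover() exhausts all 255 attempts on
path 0 before the first attempt on path 1 succeeds.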

Signed-off-by: Stefan Haberland <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/block/dasd.c  |   17 +++---
 drivers/s390/block/dasd_3990_erp.c |   62 +++--
 2 files changed, 52 insertions(+), 27 deletions(-)

Index: quilt-2.6/drivers/s390/block/dasd_3990_erp.c
===
--- quilt-2.6.orig/drivers/s390/block/dasd_3990_erp.c
+++ quilt-2.6/drivers/s390/block/dasd_3990_erp.c
@@ -164,7 +164,7 @@ dasd_3990_erp_alternate_path(struct dasd
 
/* reset status to submit the request again... */
erp->status = DASD_CQR_FILLED;
-   erp->retries = 1;
+   erp->retries = 10;
} else {
DEV_MESSAGE(KERN_ERR, device,
"No alternate channel path left (lpum=%x / "
@@ -301,8 +301,7 @@ dasd_3990_erp_action_4(struct dasd_ccw_r
erp->function = dasd_3990_erp_action_4;
 
} else {
-
-   if (sense[25] == 0x1D) {/* state change pending */
+   if (sense && (sense[25] == 0x1D)) { /* state change pending */
 
DEV_MESSAGE(KERN_INFO, device,
"waiting for state change pending "
@@ -311,7 +310,7 @@ dasd_3990_erp_action_4(struct dasd_ccw_r
 
dasd_3990_erp_block_queue(erp, 30*HZ);
 
-} else if (sense[25] == 0x1E) {/* busy */
+   } else if (sense && (sense[25] == 0x1E)) {  /* busy */
DEV_MESSAGE(KERN_INFO, device,
"busy - redriving request later, "
"%d retries left",
@@ -2120,6 +2119,34 @@ dasd_3990_erp_inspect_32(struct dasd_ccw
  */
 
 /*
+ * DASD_3990_ERP_CONTROL_CHECK
+ *
+ * DESCRIPTION
+ *   Does a generic inspection if a control check occured and sets up
+ *   the related error recovery procedure
+ *
+ * PARAMETER
+ *   erp   pointer to the currently created default ERP
+ *
+ * RETURN VALUES
+ *   erp_filledpointer to the erp
+ */
+
+static struct dasd_ccw_req *
+dasd_3990_erp_control_check(struct dasd_ccw_req *erp)
+{
+   struct dasd_device *device = erp->startdev;
+
+   if (erp->refers->irb.scsw.cstat & (SCHN_STAT_INTF_CTRL_CHK
+  | SCHN_STAT_CHN_CTRL_CHK)) {
+   DEV_MESSAGE(KERN_DEBUG, device, "%s",
+   "channel or interface control check");
+   erp = dasd_3990_erp_action_4(erp, NULL);
+   }
+   return erp;
+}
+
+/*
  * DASD_3990_ERP_INSPECT
  *
  * DESCRIPTION
@@ -2145,8 +2172,11 @@ dasd_3990_erp_inspect(struct dasd_ccw_re
if (erp_new)
return erp_new;
 
+   /* check if no concurrent sens is available */
+   if (!erp->refers->irb.esw.esw0.erw.cons)
+   erp_new = dasd_3990_erp_control_check(erp);
/* distinguish between 24 and 32 byte sense data */
-   if (sense[27] & DASD_SENSE_BIT_0) {
+   else if (sense[27] & DASD_SENSE_BIT_0) {
 
/* inspect the 24 byte sense data */
erp_new = dasd_3990_erp_inspect_24(erp, sense);
@@ -2285,6 +2315,17 @@ dasd_3990_erp_error_match(struct dasd_cc
//  return 0;   /* CCW doesn't match */
}
 
+   if (cqr1->irb.esw.esw0.erw.cons != cqr2->irb.esw.esw0.erw.cons)
+   return 0;
+
+   if ((cqr1->irb.esw.esw0.erw.cons == 0) &&
+   (cqr2->irb.esw.esw0.erw.cons == 0)) {
+   if ((cqr1->irb.scsw.cstat & (SCHN_STAT_INTF_CTRL_CHK |
+SCHN_STAT_CHN_CTRL_CHK)) ==
+   (cqr2->irb.scsw.cstat & (SCHN_STAT_INTF_CTRL_CHK |
+SCHN_STAT_CHN_CTRL_CHK)))
+   return 1; /* match with ifcc*/
+   }
/* check sense data; byte 0-2,25,27 */
if (!((memcmp (cqr1->irb.ecw, cqr2->irb.ecw, 3) == 0) &&
  (cqr1->irb.ecw[27] == cqr2->irb.ecw[27]) &&
@@ -2560,17 +2601,6 @@ dasd_3990_erp_action(struct dasd_ccw_req
 
return cqr;
}
-   /* check if sense data are available */
-   if (!cqr->irb.ecw) {
-   DEV_MESSAGE(KERN_DEBUG, device,
-   "ERP called witout sense data avail ..."
-   "request %p - NO ERP possible", cqr)

[patch 18/18] dcss: Initialize workqueue before using it.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

If a dcss segment cannot be loaded, blk_cleanup_queue is called
before blk_queue_make_request, so the struct work unplug_work of the
request queue is used while it is still uninitialized.
This also leads to the lockdep message below.
To avoid that, call blk_queue_make_request right after the
request_queue has been allocated.
This makes sure that the struct work is always initialized
before it is used.
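
The ordering problem can be modeled in a few lines of user-space C. The
structures and functions are hypothetical stand-ins for the block layer
calls named above; the init_early flag selects between the old and the new
ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the request queue and its work item. */
struct work {
    bool initialized;
};

struct queue {
    struct work unplug_work;
};

static struct queue *queue_alloc(void)
{
    return calloc(1, sizeof(struct queue));
}

/* Models blk_queue_make_request(): this is where the work item is set up. */
static void queue_make_request(struct queue *q)
{
    q->unplug_work.initialized = true;
}

/* Models blk_cleanup_queue(): reports whether the work item was valid. */
static bool queue_cleanup(struct queue *q)
{
    bool ok = q->unplug_work.initialized;

    free(q);
    return ok;
}

/* Returns true if cleanup never saw an uninitialized work item. */
static bool add_device(bool load_fails, bool init_early)
{
    struct queue *q = queue_alloc();

    if (!q)
        return false;
    if (init_early)
        queue_make_request(q);  /* new order: right after allocation */
    if (load_fails)
        return queue_cleanup(q); /* error path */
    if (!init_early)
        queue_make_request(q);  /* old order: only after a successful load */
    return queue_cleanup(q);
}
```

Only the combination of a failing load and late initialization reaches
cleanup with an uninitialized work item, which is the case the patch
removes.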

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 2 Not tainted 2.6.24 #6
Process swapper (pid: 1, task: 0f854038, ksp: 0f85f980)
04000f85f860 0f85f880 0002  
   0f85f920 0f85f898 0f85f898 0001622e 
    0f85f980   
   0f85f880 000c 0f85f880 0f85f8f0 
   00342908 0001622e 0f85f880 0f85f8d0 
Call Trace:
([<0001619e>] show_trace+0xda/0x104)
 [<00016288>] show_stack+0xc0/0xf8
 [<000163d0>] dump_stack+0xb0/0xc0
 [<0006e4ea>] __lock_acquire+0x47e/0x1160
 [<0006f27c>] lock_acquire+0xb0/0xd8
 [<0005a522>] __cancel_work_timer+0x9e/0x240
 [<0005a72e>] cancel_work_sync+0x2a/0x3c
 [<00165c46>] kblockd_flush_work+0x26/0x34
 [<00169034>] blk_sync_queue+0x38/0x48
 [<00169080>] blk_release_queue+0x3c/0xa8
 [<0017bce8>] kobject_cleanup+0x58/0xac
 [<0017bd66>] kobject_release+0x2a/0x38
 [<0017d28e>] kref_put+0x6e/0x94
 [<0017bc80>] kobject_put+0x38/0x48
 [<001653be>] blk_put_queue+0x2a/0x38
 [<00168fee>] blk_cleanup_queue+0x82/0x90
 [<00213e7e>] dcssblk_add_store+0x34e/0x700
 [<005243b8>] dcssblk_init+0x1a0/0x308
 [<0050a3c2>] kernel_init+0x1b2/0x3a4
 [<0001ac82>] kernel_thread_starter+0x6/0xc
 [<0001ac7c>] kernel_thread_starter+0x0/0xc

INFO: lockdep is turned off.

Cc: Gerald Schaefer <[EMAIL PROTECTED]>
Cc: Carsten Otte <[EMAIL PROTECTED]>
Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/block/dcssblk.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Index: quilt-2.6/drivers/s390/block/dcssblk.c
===
--- quilt-2.6.orig/drivers/s390/block/dcssblk.c
+++ quilt-2.6/drivers/s390/block/dcssblk.c
@@ -415,6 +415,8 @@ dcssblk_add_store(struct device *dev, st
dev_info->gd->queue = dev_info->dcssblk_queue;
dev_info->gd->private_data = dev_info;
dev_info->gd->driverfs_dev = &dev_info->dev;
+   blk_queue_make_request(dev_info->dcssblk_queue, dcssblk_make_request);
+   blk_queue_hardsect_size(dev_info->dcssblk_queue, 4096);
/*
 * load the segment
 */
@@ -472,9 +474,6 @@ dcssblk_add_store(struct device *dev, st
if (rc)
goto unregister_dev;
 
-   blk_queue_make_request(dev_info->dcssblk_queue, dcssblk_make_request);
-   blk_queue_hardsect_size(dev_info->dcssblk_queue, 4096);
-
add_disk(dev_info->gd);
 
switch (dev_info->segment_type) {

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 17/18] Remove BUILD_BUG_ON() in vmem code.

2008-02-05 Thread Martin Schwidefsky
From: Heiko Carstens <[EMAIL PROTECTED]>

Remove BUILD_BUG_ON() in vmem code since it causes build failures if
the size of struct page increases. Instead, calculate at compile time
the highest physical address that can be added to the 1:1 mapping.
This is supposed to fix a build failure with the page owner tracking leak
detector patches as reported by akpm.
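
The calculation can be followed with a small worked example. The function
name and the numbers in the usage note (a 32 MiB area for the struct page
array, 4 KiB pages, a vmalloc area starting at 0x78000000) are assumptions
for illustration only; the formula follows the VMEM_MAX_PHYS definition in
the diff below.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Highest physical address that can be covered by a struct page array of
 * `map_bytes` bytes, rounded down to a 16 MiB boundary and capped at the
 * start of the vmalloc area.
 */
static uint64_t vmem_max_phys(uint64_t vmalloc_start, uint64_t map_bytes,
                              uint64_t page_struct_size, uint64_t page_size)
{
    /* number of struct page entries that fit, times memory per entry */
    uint64_t covered = map_bytes / page_struct_size * page_size;

    covered &= ~((UINT64_C(16) << 20) - 1); /* round down to 16 MiB */
    return covered < vmalloc_start ? covered : vmalloc_start;
}
```

With a 32 byte struct page the 32 MiB area covers more memory than lies
below the vmalloc area, so the result is capped at 0x78000000. If struct
page grows to 72 bytes, the same area only covers memory up to 0x71000000,
and the limit shrinks instead of breaking the build.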

page-owner-tracking-leak-detector-broken-on-s390.patch can be removed
from -mm again when this is merged.

Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/kernel/setup.c   |2 +-
 arch/s390/mm/vmem.c|3 +--
 include/asm-s390/pgtable.h |   12 +---
 3 files changed, 11 insertions(+), 6 deletions(-)

Index: quilt-2.6/arch/s390/kernel/setup.c
===
--- quilt-2.6.orig/arch/s390/kernel/setup.c
+++ quilt-2.6/arch/s390/kernel/setup.c
@@ -528,7 +528,7 @@ static void __init setup_memory_end(void
memory_size = 0;
memory_end &= PAGE_MASK;
 
-   max_mem = memory_end ? min(VMALLOC_START, memory_end) : VMALLOC_START;
+   max_mem = memory_end ? min(VMEM_MAX_PHYS, memory_end) : VMEM_MAX_PHYS;
memory_end = min(max_mem, memory_end);
 
/*
Index: quilt-2.6/arch/s390/mm/vmem.c
===
--- quilt-2.6.orig/arch/s390/mm/vmem.c
+++ quilt-2.6/arch/s390/mm/vmem.c
@@ -250,7 +250,7 @@ static int insert_memory_segment(struct 
 {
struct memory_segment *tmp;
 
-   if (seg->start + seg->size >= VMALLOC_START ||
+   if (seg->start + seg->size >= VMEM_MAX_PHYS ||
seg->start + seg->size < seg->start)
return -ERANGE;
 
@@ -360,7 +360,6 @@ void __init vmem_map_init(void)
 {
int i;
 
-   BUILD_BUG_ON((unsigned long)VMEM_MAP + VMEM_MAP_SIZE > VMEM_MAP_MAX);
NODE_DATA(0)->node_mem_map = VMEM_MAP;
for (i = 0; i < MEMORY_CHUNKS && memory_chunk[i].size > 0; i++)
vmem_add_mem(memory_chunk[i].addr, memory_chunk[i].size);
Index: quilt-2.6/include/asm-s390/pgtable.h
===
--- quilt-2.6.orig/include/asm-s390/pgtable.h
+++ quilt-2.6/include/asm-s390/pgtable.h
@@ -115,15 +115,21 @@ extern char empty_zero_page[PAGE_SIZE];
 #ifndef __s390x__
 #define VMALLOC_START  0x78000000UL
 #define VMALLOC_END    0x7e000000UL
-#define VMEM_MAP_MAX   0x80000000UL
+#define VMEM_MAP_END   0x80000000UL
 #else /* __s390x__ */
 #define VMALLOC_START  0x3e000000000UL
 #define VMALLOC_END    0x3e040000000UL
-#define VMEM_MAP_MAX   0x40000000000UL
+#define VMEM_MAP_END   0x40000000000UL
 #endif /* __s390x__ */
 
+/*
+ * VMEM_MAX_PHYS is the highest physical address that can be added to the 1:1
+ * mapping. This needs to be calculated at compile time since the size of the
+ * VMEM_MAP is static but the size of struct page can change.
+ */
+#define VMEM_MAX_PHYS  min(VMALLOC_START, ((VMEM_MAP_END - VMALLOC_END) / \
+ sizeof(struct page) * PAGE_SIZE) & ~((16 << 20) - 1))
 #define VMEM_MAP   ((struct page *) VMALLOC_END)
-#define VMEM_MAP_SIZE  ((VMALLOC_START / PAGE_SIZE) * sizeof(struct page))
 
 /*
  * A 31 bit pagetable entry of S390 has following format:

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 16/18] sclp_tty/sclp_vt220: Fix scheduling while atomic

2008-02-05 Thread Martin Schwidefsky
From: Christian Borntraeger <[EMAIL PROTECTED]>

Under load the following bug message appeared while using sysrq-t:

BUG: scheduling while atomic: bash/3662/0x0004
00105b74 3ba17740 0002 
   3ba177e0 3ba17758 3ba17758 00105bfe
   00817ba8 3f2a5350  
   3ba17740 000c 3ba17740 3ba177b0
   00568630 00105bfe 3ba17740 3ba17790
Call Trace:
([<00105b74>] show_trace+0x13c/0x158)
 [<00105c58>] show_stack+0xc8/0xfc
 [<00105cbc>] dump_stack+0x30/0x40
 [<0012a0c8>] __schedule_bug+0x84/0x94
 [<0056234e>] schedule+0x5ea/0x970
 [<00477cd2>] __sclp_vt220_write+0x1f6/0x3ec
 [<00477f00>] sclp_vt220_con_write+0x38/0x48
 [<00130b4a>] __call_console_drivers+0xbe/0xd8
 [<00130bf0>] _call_console_drivers+0x8c/0xd0
 [<00130eea>] release_console_sem+0x1a6/0x2fc
 [<00131786>] vprintk+0x262/0x480
 [<001319fa>] printk+0x56/0x68
 [<00125aaa>] print_cfs_rq+0x45e/0x4a4
 [<0012614e>] sched_debug_show+0x65e/0xee8
 [<0012a8fc>] show_state_filter+0x1cc/0x1f0
 [<0044d39c>] sysrq_handle_showstate+0x2c/0x3c
 [<0044d1fe>] __handle_sysrq+0xae/0x18c
 [<002001f2>] write_sysrq_trigger+0x8a/0x90
 [<001f7862>] proc_reg_write+0x9a/0xc4
 [<001a83d4>] vfs_write+0xb8/0x174
 [<001a8b88>] sys_write+0x58/0x8c
 [<00112e7c>] sysc_noemu+0x10/0x16
 [<02116f68>] 0x2116f68

The problem seems to be that, with a full console buffer, release_console_sem
disables interrupts with spin_lock_irqsave and then calls the console function
without enabling interrupts. __sclp_vt220_write checks in_interrupt to
decide whether it can schedule. It should check in_atomic instead.

The same is true for sclp_tty.c.
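
To illustrate the difference between the two checks, here is a simplified
user-space model. The bit layout and the model_* names are assumptions
made for this sketch, not the kernel's definitions. The point is only that
in_interrupt() looks at the interrupt nesting part of the preemption
counter, while in_atomic() is true whenever the counter is non-zero, for
example because a spinlock is held on a preemptible kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout of the model counter (not the kernel's definitions). */
#define PREEMPT_MASK 0x000000ffu /* preempt-disable nesting, e.g. spinlocks */
#define SOFTIRQ_MASK 0x0000ff00u /* softirq nesting */
#define HARDIRQ_MASK 0x0fff0000u /* hardirq nesting */

/* True only while a hard or soft interrupt is being serviced. */
static bool model_in_interrupt(unsigned int count)
{
    return (count & (HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* True whenever sleeping is not allowed according to the counter. */
static bool model_in_atomic(unsigned int count)
{
    return count != 0;
}
```

A counter value of 1 models process context with one lock held: the
interrupt check says "not in interrupt", so the old code would try to
sleep, while the atomic check correctly forbids it. On kernels built
without preemption the counter may not reflect held spinlocks, so this
model is an illustration rather than a guarantee.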

Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/char/sclp_tty.c   |2 +-
 drivers/s390/char/sclp_vt220.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Index: quilt-2.6/drivers/s390/char/sclp_tty.c
===
--- quilt-2.6.orig/drivers/s390/char/sclp_tty.c
+++ quilt-2.6/drivers/s390/char/sclp_tty.c
@@ -332,7 +332,7 @@ sclp_tty_write_string(const unsigned cha
if (sclp_ttybuf == NULL) {
while (list_empty(&sclp_tty_pages)) {
spin_unlock_irqrestore(&sclp_tty_lock, flags);
-   if (in_interrupt())
+   if (in_atomic())
sclp_sync_wait();
else
wait_event(sclp_tty_waitq,
Index: quilt-2.6/drivers/s390/char/sclp_vt220.c
===
--- quilt-2.6.orig/drivers/s390/char/sclp_vt220.c
+++ quilt-2.6/drivers/s390/char/sclp_vt220.c
@@ -400,7 +400,7 @@ __sclp_vt220_write(const unsigned char *
while (list_empty(&sclp_vt220_empty)) {
spin_unlock_irqrestore(&sclp_vt220_lock,
   flags);
-   if (in_interrupt())
+   if (in_atomic())
sclp_sync_wait();
else
wait_event(sclp_vt220_waitq,

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


[patch 04/18] cio: Add shutdown callback for ccwgroup.

2008-02-05 Thread Martin Schwidefsky
From: Cornelia Huck <[EMAIL PROTECTED]>

This is intended to make proper shutdown of qeth devices easier.

Signed-off-by: Cornelia Huck <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 drivers/s390/cio/ccwgroup.c |   12 
 include/asm-s390/ccwgroup.h |2 ++
 2 files changed, 14 insertions(+)

Index: quilt-2.6/drivers/s390/cio/ccwgroup.c
===
--- quilt-2.6.orig/drivers/s390/cio/ccwgroup.c
+++ quilt-2.6/drivers/s390/cio/ccwgroup.c
@@ -391,12 +391,24 @@ ccwgroup_remove (struct device *dev)
return 0;
 }
 
+static void ccwgroup_shutdown(struct device *dev)
+{
+   struct ccwgroup_device *gdev;
+   struct ccwgroup_driver *gdrv;
+
+   gdev = to_ccwgroupdev(dev);
+   gdrv = to_ccwgroupdrv(dev->driver);
+   if (gdrv && gdrv->shutdown)
+   gdrv->shutdown(gdev);
+}
+
 static struct bus_type ccwgroup_bus_type = {
.name   = "ccwgroup",
.match  = ccwgroup_bus_match,
.uevent = ccwgroup_uevent,
.probe  = ccwgroup_probe,
.remove = ccwgroup_remove,
+   .shutdown = ccwgroup_shutdown,
 };
 
 /**
Index: quilt-2.6/include/asm-s390/ccwgroup.h
===
--- quilt-2.6.orig/include/asm-s390/ccwgroup.h
+++ quilt-2.6/include/asm-s390/ccwgroup.h
@@ -37,6 +37,7 @@ struct ccwgroup_device {
  * @remove: function called on remove
  * @set_online: function called when device is set online
  * @set_offline: function called when device is set offline
+ * @shutdown: function called when device is shut down
  * @driver: embedded driver structure
  */
 struct ccwgroup_driver {
@@ -49,6 +50,7 @@ struct ccwgroup_driver {
void (*remove) (struct ccwgroup_device *);
int (*set_online) (struct ccwgroup_device *);
int (*set_offline) (struct ccwgroup_device *);
+   void (*shutdown)(struct ccwgroup_device *);
 
struct device_driver driver;
 };

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


Please pull git390 'for-linus' branch

2008-02-05 Thread Martin Schwidefsky
Please pull from 'for-linus' branch of

git://git390.osdl.marist.edu/pub/scm/linux-2.6.git for-linus

to receive the following updates:

 Documentation/DocBook/s390-drivers.tmpl |   21 +-
 arch/s390/Kconfig                       |    8 +
 arch/s390/Kconfig.debug                 |    8 +
 arch/s390/kernel/entry.S                |    7 +-
 arch/s390/kernel/entry64.S              |    7 +-
 arch/s390/kernel/ipl.c                  |   27 +-
 arch/s390/kernel/setup.c                |   14 +-
 arch/s390/kernel/smp.c                  |   13 +-
 arch/s390/kernel/stacktrace.c           |   31 ++-
 arch/s390/kernel/traps.c                |    5 +-
 arch/s390/kernel/vmlinux.lds.S          |    2 +-
 arch/s390/mm/init.c                     |   27 ++
 arch/s390/mm/vmem.c                     |    5 +-
 drivers/s390/block/dasd.c               |   19 +-
 drivers/s390/block/dasd_3990_erp.c      |   62 +++-
 drivers/s390/block/dcssblk.c            |    5 +-
 drivers/s390/char/sclp_tty.c            |    2 +-
 drivers/s390/char/sclp_vt220.c          |    2 +-
 drivers/s390/cio/ccwgroup.c             |   12 +
 drivers/s390/cio/chsc.c                 |  147 +++-
 drivers/s390/cio/device_id.c            |  107 ---
 include/asm-s390/bitops.h               |  558 ++
 include/asm-s390/cacheflush.h           |    4 +
 include/asm-s390/ccwgroup.h             |    2 +
 include/asm-s390/pgtable.h              |   12 +-
 25 files changed, 587 insertions(+), 520 deletions(-)

Christian Borntraeger (1):
  [S390] sclp_tty/sclp_vt220: Fix scheduling while atomic

Cornelia Huck (3):
  [S390] cio: Clean up chsc response code handling.
  [S390] cio: Update documentation.
  [S390] cio: Add shutdown callback for ccwgroup.

Heiko Carstens (8):
  [S390] DEBUG_PAGEALLOC support for s390.
  [S390] Fix linker script.
  [S390] Fix smp_call_function_mask semantics.
  [S390] Fix couple of section mismatches.
  [S390] Implement ext2_find_next_bit.
  [S390] latencytop s390 support.
  [S390] Remove BUILD_BUG_ON() in vmem code.
  [S390] dcss: Initialize workqueue before using it.

Martin Schwidefsky (2):
  [S390] Define GENERIC_LOCKBREAK.
  [S390] Cleanup & optimize bitops.

Peter Oberparleiter (2):
  [S390] cio: make sense id procedure work with partial hardware response
  [S390] console: allow vt220 console to be the only console

Stefan Haberland (1):
  [S390] dasd: add ifcc handling

Stefan Weinhuber (1):
  [S390] dasd: fix panic caused by alias device offline

diff --git a/Documentation/DocBook/s390-drivers.tmpl b/Documentation/DocBook/s390-drivers.tmpl
index 3d2f31b..4acc732 100644
--- a/Documentation/DocBook/s390-drivers.tmpl
+++ b/Documentation/DocBook/s390-drivers.tmpl
@@ -59,7 +59,7 @@
Introduction
   
 This document describes the interfaces available for device drivers that
-drive s390 based channel attached devices. This includes interfaces for
+drive s390 based channel attached I/O devices. This includes interfaces for
 interaction with the hardware and interfaces for interacting with the
 common driver core. Those interfaces are provided by the s390 common I/O
 layer.
@@ -86,9 +86,10 @@
The ccw bus typically contains the majority of devices available to
a s390 system. Named after the channel command word (ccw), the basic
command structure used to address its devices, the ccw bus contains
-   so-called channel attached devices. They are addressed via subchannels,
-   visible on the css bus. A device driver, however, will never interact
-   with the subchannel directly, but only via the device on the ccw bus,
+   so-called channel attached devices. They are addressed via I/O
+   subchannels, visible on the css bus. A device driver for
+   channel-attached devices, however, will never interact  with the
+   subchannel directly, but only via the I/O device on the ccw bus,
the ccw device.
   
 
@@ -116,7 +117,6 @@
 !Iinclude/asm-s390/ccwdev.h
 !Edrivers/s390/cio/device.c
 !Edrivers/s390/cio/device_ops.c
-!Edrivers/s390/cio/airq.c
 
 
  The channel-measurement facility
@@ -147,4 +147,15 @@

   
 
+  
+   Generic interfaces
+  
+   Some interfaces are available to other drivers that do not necessarily
+   have anything to do with the busses described above, but still are
+   indirectly using basic infrastructure in the common I/O layer.
+   One example is the support for adapter interrupts.
+  
+!Edrivers/s390/cio/airq.c
+  
+
 
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 82cbffd..92a4f7b 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -16,6 +16,9 @@ config LOCKDEP_SUPPORT
 config STACKTRACE_SUPPORT
def_bool y
 
+config HAVE_LATENCYTOP_SUPPORT
+   def_bool y
+
 config RWSEM_GENERIC_SPINLOCK
bool
 
@@ -47,6 +50,11 @@ config NO_IOMEM
 config NO_DMA

[patch 09/18] console: allow vt220 console to be the only console

2008-02-05 Thread Martin Schwidefsky
From: Peter Oberparleiter <[EMAIL PROTECTED]>

Fix console detection logic to support configurations in which the
vt220 console is the only available Linux console.

Signed-off-by: Peter Oberparleiter <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 arch/s390/kernel/setup.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: quilt-2.6/arch/s390/kernel/setup.c
===
--- quilt-2.6.orig/arch/s390/kernel/setup.c
+++ quilt-2.6/arch/s390/kernel/setup.c
@@ -145,7 +145,7 @@ __setup("condev=", condev_setup);
 
 static int __init conmode_setup(char *str)
 {
-#if defined(CONFIG_SCLP_CONSOLE)
+#if defined(CONFIG_SCLP_CONSOLE) || defined(CONFIG_SCLP_VT220_CONSOLE)
if (strncmp(str, "hwc", 4) == 0 || strncmp(str, "sclp", 5) == 0)
 SET_CONSOLE_SCLP;
 #endif
@@ -183,7 +183,7 @@ static void __init conmode_default(void)
 */
cpcmd("TERM CONMODE 3215", NULL, 0, NULL);
if (ptr == NULL) {
-#if defined(CONFIG_SCLP_CONSOLE)
+#if defined(CONFIG_SCLP_CONSOLE) || defined(CONFIG_SCLP_VT220_CONSOLE)
SET_CONSOLE_SCLP;
 #endif
return;
@@ -193,7 +193,7 @@ static void __init conmode_default(void)
SET_CONSOLE_3270;
 #elif defined(CONFIG_TN3215_CONSOLE)
SET_CONSOLE_3215;
-#elif defined(CONFIG_SCLP_CONSOLE)
+#elif defined(CONFIG_SCLP_CONSOLE) || defined(CONFIG_SCLP_VT220_CONSOLE)
SET_CONSOLE_SCLP;
 #endif
} else if (strncmp(ptr + 8, "3215", 4) == 0) {
@@ -201,7 +201,7 @@ static void __init conmode_default(void)
SET_CONSOLE_3215;
 #elif defined(CONFIG_TN3270_CONSOLE)
SET_CONSOLE_3270;
-#elif defined(CONFIG_SCLP_CONSOLE)
+#elif defined(CONFIG_SCLP_CONSOLE) || defined(CONFIG_SCLP_VT220_CONSOLE)
SET_CONSOLE_SCLP;
 #endif
}
@@ -212,7 +212,7 @@ static void __init conmode_default(void)
SET_CONSOLE_3270;
 #endif
} else {
-#if defined(CONFIG_SCLP_CONSOLE)
+#if defined(CONFIG_SCLP_CONSOLE) || defined(CONFIG_SCLP_VT220_CONSOLE)
SET_CONSOLE_SCLP;
 #endif
}

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for the 3.7-rc3

2012-10-23 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
Among the usual minor bug fixes the more interesting patches are the perf
counters for the latest machine, the missing select to enable transparent
huge pages and a build fix for the UAPI rework.

David Howells (1):
  s390,uapi: do not use uapi/asm-generic/kvm_para.h

Gerald Schaefer (1):
  s390/thp: select HAVE_ARCH_TRANSPARENT_HUGEPAGE

Heiko Carstens (2):
  s390: fix linker script for 31 bit builds
  s390/cache: fix data/instruction cache output

Hendrik Brueckner (1):
  perf_cpum_cf: Add support for counters available with IBM zEC12

Michael Holzheu (1):
  s390/kdump: Use 64 bit mode for 0x1 entry point

Sebastian Ott (3):
  s390/chpid: make headers usable (again)
  s390/cio: use generic bitmap functions
  s390/css: stop stsch loop after cc 3

 arch/s390/Kconfig                       |    1 +
 arch/s390/boot/compressed/vmlinux.lds.S |    2 +-
 arch/s390/include/asm/perf_event.h      |    2 +-
 arch/s390/include/uapi/asm/Kbuild       |    2 --
 arch/s390/include/uapi/asm/chpid.h      |   10 +-
 arch/s390/include/uapi/asm/kvm_para.h   |   11 +++
 arch/s390/kernel/cache.c                |    9 ++---
 arch/s390/kernel/head_kdump.S           |   10 ++
 arch/s390/kernel/perf_cpum_cf.c         |    6 +-
 arch/s390/kernel/vmlinux.lds.S          |    2 +-
 drivers/s390/cio/css.c                  |    7 ++-
 drivers/s390/cio/idset.c                |   26 ++
 drivers/s390/cio/idset.h                |    3 ++-
 13 files changed, 55 insertions(+), 36 deletions(-)
 create mode 100644 arch/s390/include/uapi/asm/kvm_para.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 3f3d9ca..5dba755 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -130,6 +130,7 @@ config S390
select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
select HAVE_UID16 if 32BIT
select ARCH_WANT_IPC_PARSE_VERSION
+   select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT
select GENERIC_SMP_IDLE_THREAD
select GENERIC_TIME_VSYSCALL_OLD
select GENERIC_CLOCKEVENTS
diff --git a/arch/s390/boot/compressed/vmlinux.lds.S b/arch/s390/boot/compressed/vmlinux.lds.S
index d80f79d..8e1fb82 100644
--- a/arch/s390/boot/compressed/vmlinux.lds.S
+++ b/arch/s390/boot/compressed/vmlinux.lds.S
@@ -5,7 +5,7 @@ OUTPUT_FORMAT("elf64-s390", "elf64-s390", "elf64-s390")
 OUTPUT_ARCH(s390:64-bit)
 #else
 OUTPUT_FORMAT("elf32-s390", "elf32-s390", "elf32-s390")
-OUTPUT_ARCH(s390)
+OUTPUT_ARCH(s390:31-bit)
 #endif
 
 ENTRY(startup)
diff --git a/arch/s390/include/asm/perf_event.h b/arch/s390/include/asm/perf_event.h
index 7941968..5f0173a 100644
--- a/arch/s390/include/asm/perf_event.h
+++ b/arch/s390/include/asm/perf_event.h
@@ -9,7 +9,7 @@
 #include 
 
 /* CPU-measurement counter facility */
-#define PERF_CPUM_CF_MAX_CTR   160
+#define PERF_CPUM_CF_MAX_CTR   256
 
 /* Per-CPU flags for PMU states */
 #define PMU_F_RESERVED 0x1000
diff --git a/arch/s390/include/uapi/asm/Kbuild b/arch/s390/include/uapi/asm/Kbuild
index 59b67ed..7bf68ff 100644
--- a/arch/s390/include/uapi/asm/Kbuild
+++ b/arch/s390/include/uapi/asm/Kbuild
@@ -1,8 +1,6 @@
 # UAPI Header export list
 include include/uapi/asm-generic/Kbuild.asm
 
-generic-y += kvm_para.h
-
 header-y += auxvec.h
 header-y += bitsperlong.h
 header-y += byteorder.h
diff --git a/arch/s390/include/uapi/asm/chpid.h b/arch/s390/include/uapi/asm/chpid.h
index 581992d..6b4fb29 100644
--- a/arch/s390/include/uapi/asm/chpid.h
+++ b/arch/s390/include/uapi/asm/chpid.h
@@ -1,5 +1,5 @@
 /*
- *Copyright IBM Corp. 2007
+ *Copyright IBM Corp. 2007, 2012
  *Author(s): Peter Oberparleiter 
  */
 
@@ -12,10 +12,10 @@
 #define __MAX_CHPID 255
 
 struct chp_id {
-   u8 reserved1;
-   u8 cssid;
-   u8 reserved2;
-   u8 id;
+   __u8 reserved1;
+   __u8 cssid;
+   __u8 reserved2;
+   __u8 id;
 } __attribute__((packed));
 
 
diff --git a/arch/s390/include/uapi/asm/kvm_para.h b/arch/s390/include/uapi/asm/kvm_para.h
new file mode 100644
index 000..ff1f4e7
--- /dev/null
+++ b/arch/s390/include/uapi/asm/kvm_para.h
@@ -0,0 +1,11 @@
+/*
+ * User API definitions for paravirtual devices on s390
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License (version 2 only)
+ * as published by the Free Software Foundation.
+ *
+ *Author(s): Christian Borntraeger 
+ */
diff --git a/arch/s390/kernel/cache.c b/arch/s390/kernel/cache.c
index 8df8d8a1..64b2465 100644
--- a/arch/s390/kernel/cache.c
+++ b/arch/s390/kernel/cache.c
@@ -59,8 +59,8 @@ enum {
 
 enum {
CACHE_TI_UNIFIED = 0,
-   CACHE_TI_INSTRUCTION = 0,
-   CACHE_TI_DATA,
+   CACHE_TI_DATA = 0,
+

Re: linux-next: first tree

2008-02-14 Thread Martin Schwidefsky
On Fri, 2008-02-15 at 02:00 +1100, Stephen Rothwell wrote:
> I would prefer that the trees be added by the subsystem maintainers and
> they can tell me which branch represents their expectations for the next
> kernel release (in this case 2.6.26). But thanks, hopefully you will have
> prodded them along. :-)

For the s390 architecture please use the "features" branch of git390:

git://git390.osdl.marist.edu/pub/scm/linux-2.6.git features

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




[GIT PULL] s390 patches for the 3.6 merge window #2

2012-07-31 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the second batch of s390 patches for the 3.6 merge window.
Included is enablement for two common code changes, killable page faults and
sorted exception tables, along with the regular set of cleanup and bug fix
patches.
The shortlog:

Heiko Carstens (8):
  s390/debug: remove module_exit function / move EXPORT_SYMBOLs
  s390/exceptions: sort exception table at build time
  s390/linker script: use RO_DATA_SECTION
  s390: update defconfig
  s390/mm: make page faults killable
  s390/mm: fix fault handling for page table walk case
  s390/mm: rename user_mode variable to addressing_mode
  s390: make use of user_mode() macro where possible

Martin Schwidefsky (1):
  s390/mm: downgrade page table after fork of a 31 bit process

Michael Holzheu (1):
  s390/ipl: Use diagnose 8 command separation

 arch/s390/Kconfig                   |    1 +
 arch/s390/defconfig                 |    5 ++-
 arch/s390/include/asm/mmu_context.h |   16 +++-
 arch/s390/include/asm/processor.h   |    2 +
 arch/s390/include/asm/setup.h       |    2 +-
 arch/s390/kernel/debug.c            |   70 ---
 arch/s390/kernel/dis.c              |    4 +-
 arch/s390/kernel/early.c            |    1 -
 arch/s390/kernel/ipl.c              |   12 +-
 arch/s390/kernel/setup.c            |   12 +++---
 arch/s390/kernel/traps.c            |   16
 arch/s390/kernel/vdso.c             |    9 +++--
 arch/s390/kernel/vmlinux.lds.S      |    2 +-
 arch/s390/mm/fault.c                |   35 --
 arch/s390/mm/mmap.c                 |   12 +-
 arch/s390/mm/pgtable.c              |    7 +---
 arch/s390/oprofile/backtrace.c      |    2 +-
 scripts/sortextable.c               |    1 +
 18 files changed, 104 insertions(+), 105 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index a39b469..d610859 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -89,6 +89,7 @@ config S390
select HAVE_MEMBLOCK_NODE_MAP
select HAVE_CMPXCHG_LOCAL
select ARCH_DISCARD_MEMBLOCK
+   select BUILDTIME_EXTABLE_SORT
select ARCH_INLINE_SPIN_TRYLOCK
select ARCH_INLINE_SPIN_TRYLOCK_BH
select ARCH_INLINE_SPIN_LOCK
diff --git a/arch/s390/defconfig b/arch/s390/defconfig
index 37d2bf2..967923d 100644
--- a/arch/s390/defconfig
+++ b/arch/s390/defconfig
@@ -7,6 +7,9 @@ CONFIG_TASK_DELAY_ACCT=y
 CONFIG_TASK_XACCT=y
 CONFIG_TASK_IO_ACCOUNTING=y
 CONFIG_AUDIT=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_RCU_FAST_NO_HZ=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_CGROUPS=y
@@ -35,8 +38,6 @@ CONFIG_MODVERSIONS=y
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_IBM_PARTITION=y
 CONFIG_DEFAULT_DEADLINE=y
-CONFIG_NO_HZ=y
-CONFIG_HIGH_RES_TIMERS=y
 CONFIG_PREEMPT=y
 CONFIG_MEMORY_HOTPLUG=y
 CONFIG_MEMORY_HOTREMOVE=y
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 5c63615..b749c57 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -11,7 +11,6 @@
 #include 
 #include 
 #include 
-#include 
 
 static inline int init_new_context(struct task_struct *tsk,
   struct mm_struct *mm)
@@ -58,7 +57,7 @@ static inline void update_mm(struct mm_struct *mm, struct task_struct *tsk)
pgd_t *pgd = mm->pgd;
 
S390_lowcore.user_asce = mm->context.asce_bits | __pa(pgd);
-   if (user_mode != HOME_SPACE_MODE) {
+   if (addressing_mode != HOME_SPACE_MODE) {
/* Load primary space page table origin. */
asm volatile(LCTL_OPCODE" 1,1,%0\n"
 : : "m" (S390_lowcore.user_asce) );
@@ -91,4 +90,17 @@ static inline void activate_mm(struct mm_struct *prev,
 switch_mm(prev, next, current);
 }
 
+static inline void arch_dup_mmap(struct mm_struct *oldmm,
+struct mm_struct *mm)
+{
+#ifdef CONFIG_64BIT
+   if (oldmm->context.asce_limit < mm->context.asce_limit)
+   crst_table_downgrade(mm, oldmm->context.asce_limit);
+#endif
+}
+
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+}
+
 #endif /* __S390_MMU_CONTEXT_H */
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index c40fa91..11e4e32 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -120,7 +120,9 @@ struct stack_frame {
regs->psw.mask  = psw_user_bits | PSW_MASK_BA;  \
regs->psw.addr  = new_psw | PSW_ADDR_AMODE; \
regs->gprs[15]  = new_stackp;   \
+   __tlb_flush_mm(current->mm);\
crst_table_downgrade(current->mm, 1UL <

Re: [PATCH] s390: Add pmd_mknotpresent()

2012-10-29 Thread Martin Schwidefsky
On Sun, 28 Oct 2012 14:10:14 +0100
Ingo Molnar  wrote:

> 
> There's a related problem on s390: other THP implementations 
> have pmd_mknotpresent() while s390 not, resulting in:
> 
>   mm/huge_memory.c:1543:2: error: implicit declaration of function 
> 'pmd_mknotpresent'
> 
> The (untested!) patch below adds the s390 version of this 
> method.
> 
> Gerald, Martin, did I get the S390 details right?

This won't work, I'm afraid. S390 uses invalid bits which need to be set to
make an entry not present. Just setting the _SEGMENT_ENTRY_INV bit is not
good enough either; there is also _HPAGE_TYPE_NONE to consider. The patch to
fix this for pmd_none & pmd_present just got added to the s390 tree on kernel.org:
https://git.kernel.org/?p=linux/kernel/git/s390/linux.git;a=shortlog;h=refs/heads/fixes

Now, if pmd_mknotpresent is supposed to make the entry invalid so that
pmd_present will return false the function needs to do two things,
1) set the _SEGMENT_ENTRY_INV bit, and 2) clear the _SEGMENT_ENTRY_RO bit.
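
To make the two steps concrete, here is a small user-space sketch. It is an
illustration under assumptions, not the kernel implementation: the DEMO_*
bit values and the demo_* helpers are hypothetical, and the sketch assumes
that an entry with both the INV and the RO bit set is the _HPAGE_TYPE_NONE
encoding, which still has to count as present.

```c
#include <assert.h>

/*
 * Illustrative bit values only (assumption); the real definitions live in
 * arch/s390/include/asm/pgtable.h.
 */
#define DEMO_SEGMENT_ENTRY_RO	0x200UL
#define DEMO_SEGMENT_ENTRY_INV	0x020UL
#define DEMO_HPAGE_TYPE_NONE	(DEMO_SEGMENT_ENTRY_INV | DEMO_SEGMENT_ENTRY_RO)

typedef struct { unsigned long val; } demo_pmd_t;

/* Present if valid, or if invalid but carrying the "type none" encoding. */
static int demo_pmd_present(demo_pmd_t pmd)
{
	unsigned long mask = DEMO_SEGMENT_ENTRY_INV | DEMO_SEGMENT_ENTRY_RO;

	return (pmd.val & mask) == DEMO_HPAGE_TYPE_NONE ||
	       !(pmd.val & DEMO_SEGMENT_ENTRY_INV);
}

/* Insufficient variant: only sets the invalid bit. */
static demo_pmd_t demo_pmd_set_inv_only(demo_pmd_t pmd)
{
	pmd.val |= DEMO_SEGMENT_ENTRY_INV;
	return pmd;
}

/* Both steps: set the invalid bit and clear the read-only bit. */
static demo_pmd_t demo_pmd_mknotpresent(demo_pmd_t pmd)
{
	pmd.val |= DEMO_SEGMENT_ENTRY_INV;
	pmd.val &= ~DEMO_SEGMENT_ENTRY_RO;
	assert(!demo_pmd_present(pmd));	/* post-condition of the two steps */
	return pmd;
}
```

Under these assumptions a read-only entry that only gets the INV bit set
ends up looking like the "type none" encoding and would still be reported as
present, which is why the RO bit has to be cleared as well.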

> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 098fc5a..b820ff1 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -1310,6 +1310,12 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
>   return pmd;
>  }
> 
> +static inline pmd_t pmd_mknotpresent(pmd_t pmd)
> +{
> + pmd_val(pmd) &= ~_SEGMENT_ENTRY_ORIGIN;
> + return pmd;
> +}
> +
>  #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
>  static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>   unsigned long address, pmd_t *pmdp)
> 


-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [update] Re: new execve/kernel_thread design

2012-10-29 Thread Martin Schwidefsky
On Fri, 26 Oct 2012 19:31:07 +0100
Al Viro  wrote:

>   The situation got much better by now.  More than a half of
> architectures are done - alpha arm arm64 c6x hexagon ia64 m68k mips openrisc
> parisc sparc tile um unicore32 and x86.
> 
>   Two more avait ACKs from maintainers - powerpc and s390.  Should work,
> AFAICS.

Oops, sorry. I tested this weeks ago but it seems I never wrote a mail to
indicate success. The current git kernel works just fine.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH] s390: Add pmd_mknotpresent()

2012-10-29 Thread Martin Schwidefsky
On Mon, 29 Oct 2012 12:05:19 +0100
Ingo Molnar  wrote:

> 
> * Martin Schwidefsky  wrote:
> 
> > On Sun, 28 Oct 2012 14:10:14 +0100
> > Ingo Molnar  wrote:
> > 
> > > 
> > > There's a related problem on s390: other THP implementations 
> > > have pmd_mknotpresent() while s390 not, resulting in:
> > > 
> > >   mm/huge_memory.c:1543:2: error: implicit declaration of function 
> > > 'pmd_mknotpresent'
> > > 
> > > The (untested!) patch below adds the s390 version of this 
> > > method.
> > > 
> > > Gerald, Martin, did I get the S390 details right?
> > 
> > This won't work I'm afraid. S390 uses invalid bits which need 
> > to be set to make an entry not present. Just setting the 
> > _SEGMENT_ENTRY_INV bit is not good enough either, there is 
> > _HPAGE_TYPE_NONE to consider. The patch to fix this for 
> > pmd_none & pmd_present just got added to the s390 tree on 
> > kernel.org:
> >
> > https://git.kernel.org/?p=linux/kernel/git/s390/linux.git;a=shortlog;h=refs/heads/fixes
> > 
> > Now, if pmd_mknotpresent is supposed to make the entry invalid 
> > so that pmd_present will return false the function needs to do 
> > two things, 1) set the _SEGMENT_ENTRY_INV bit, and 2) clear 
> > the _SEGMENT_ENTRY_RO bit.
> 
> Would be nice if you could send me your suggested 
> pmd_mknotpresent().
> 
> (Writing it into the email would be enough, I can turn it into a 
> patch - but a patch would be welcome as well.)

This would look like the following; the patch should apply to all recent
kernel versions.
--
Subject: [PATCH] s390,mm: add pmd_mknotpresent

Fix the following build problem in huge_memory:

mm/huge_memory.c:1543:2: error: implicit declaration of function 'pmd_mknotpresent'

Signed-off-by: Martin Schwidefsky 
---
 arch/s390/include/asm/pgtable.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 335b601..4a84431 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1326,6 +1326,13 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
return pmd;
 }
 
+static inline pmd_t pmd_mknotpresent(pmd_t pmd)
+{
+   pmd_val(pmd) |= _SEGMENT_ENTRY_INV;
+   pmd_val(pmd) &= ~_SEGMENT_ENTRY_RO;
+   return pmd;
+}
+
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmdp)
-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [update] Re: new execve/kernel_thread design

2012-10-29 Thread Martin Schwidefsky
On Mon, 29 Oct 2012 13:25:21 +
Al Viro  wrote:

> On Mon, Oct 29, 2012 at 08:53:39AM +0100, Martin Schwidefsky wrote:
> 
> > Oops, sorry. I tested this weeks ago but it seems I never wrote a mail to
> > indicate success. The current git kernel works just fine.
> 
> "Current git" being what?  Linus' tree?  linux-next?  signal.git#arch-s390?
> FWIW, the relevant diff against mainline is below, linux-next already
> contains it.

git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal arch-s390

against Linux 3.7-rc3

# uname -a
Linux r3545014 3.7.0-rc3-2-g95a96a7 #1 SMP Mon Oct 29 15:32:18 CET 2012 s390x s390x s390x GNU/Linux

works as it is. Feel free to add my Acked-By.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH RFC 03/77] PCI/MSI/s390: Fix single MSI only check

2013-10-04 Thread Martin Schwidefsky
On Wed,  2 Oct 2013 12:48:19 +0200
Alexander Gordeev  wrote:

> Multiple MSIs have never been supported on s390 architecture,
> but the platform code fails to report single MSI only.
> 
> Signed-off-by: Alexander Gordeev 
> ---
>  arch/s390/pci/pci.c |2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index f17a834..c79c6e4 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -427,6 +427,8 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, 
> int type)
>   pr_debug("%s: requesting %d MSI-X interrupts...", __func__, nvec);
>   if (type != PCI_CAP_ID_MSIX && type != PCI_CAP_ID_MSI)
>   return -EINVAL;
> + if (type == PCI_CAP_ID_MSI && nvec > 1)
> + return 1;
>   msi_vecs = min(nvec, ZPCI_MSI_VEC_MAX);
>   msi_vecs = min_t(unsigned int, msi_vecs, CONFIG_PCI_NR_MSI);
> 

Acked-by: Martin Schwidefsky 
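
For readers unfamiliar with the return convention the quoted check relies
on, here is a rough user-space sketch. It assumes the convention of the MSI
core at the time: a negative value is an error, 0 is success, and a positive
value tells the caller how many vectors it could ask for instead. All demo_*
names are hypothetical and the retry loop is only one possible caller.

```c
#include <assert.h>

enum demo_cap { DEMO_CAP_MSI, DEMO_CAP_MSIX };

/*
 * Hypothetical arch hook: <0 error, 0 success, >0 number of vectors the
 * caller may retry with. Only a single MSI vector is supported.
 */
static int demo_setup_msi_irqs(int nvec, enum demo_cap type)
{
	assert(nvec > 0);
	if (type == DEMO_CAP_MSI && nvec > 1)
		return 1;	/* report "single MSI only" */
	return 0;
}

/*
 * Hypothetical caller: retry with the suggested count until the hook
 * succeeds or fails. Returns the number of vectors granted or an error.
 */
static int demo_enable_msi(int nvec)
{
	int rc;

	for (;;) {
		rc = demo_setup_msi_irqs(nvec, DEMO_CAP_MSI);
		if (rc < 0)
			return rc;
		if (rc == 0)
			return nvec;
		nvec = rc;
	}
}
```

Without the added check the hook would return 0 for a multi-vector MSI
request, and a caller following this convention would wrongly believe it
got all the vectors it asked for.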

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH RFC 04/77] PCI/MSI/s390: Remove superfluous check of MSI type

2013-10-04 Thread Martin Schwidefsky
On Wed,  2 Oct 2013 12:48:20 +0200
Alexander Gordeev  wrote:

> arch_setup_msi_irqs() hook can only be called from the generic
> MSI code which ensures correct MSI type parameter.
> 
> Signed-off-by: Alexander Gordeev 
> ---
>  arch/s390/pci/pci.c |2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index c79c6e4..61a3c2c 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -425,8 +425,6 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, 
> int type)
>   int rc;
> 
>   pr_debug("%s: requesting %d MSI-X interrupts...", __func__, nvec);
> - if (type != PCI_CAP_ID_MSIX && type != PCI_CAP_ID_MSI)
> - return -EINVAL;
>   if (type == PCI_CAP_ID_MSI && nvec > 1)
>   return 1;
>   msi_vecs = min(nvec, ZPCI_MSI_VEC_MAX);

Acked-by: Martin Schwidefsky 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for 3.12-rc5

2013-10-07 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
A couple of bug fixes; notable are the regression with ptrace vs. restarting
system calls and the patch for kdump to be able to copy from virtual memory.

Christian Borntraeger (1):
  s390/sclp: properly detect line mode console

Heiko Carstens (1):
  s390/kprobes: add exrl to list of prohibited opcodes

Martin Schwidefsky (1):
  s390: fix system call restart after inferior call

Michael Holzheu (1):
  s390: Allow vmalloc target buffers for copy_from_oldmem()

Wei Yongjun (1):
  s390/3270: fix return value check in tty3270_resize_work()

 arch/s390/kernel/crash_dump.c |   42 -
 arch/s390/kernel/entry.S      |    1 +
 arch/s390/kernel/entry64.S    |    1 +
 arch/s390/kernel/kprobes.c    |    6 +-
 drivers/s390/char/sclp_cmd.c  |    8 +---
 drivers/s390/char/tty3270.c   |    2 +-
 6 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index c84f33d..7dd2172 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -40,28 +40,26 @@ static inline void *load_real_addr(void *addr)
 }
 
 /*
- * Copy up to one page to vmalloc or real memory
+ * Copy real to virtual or real memory
  */
-static ssize_t copy_page_real(void *buf, void *src, size_t csize)
+static int copy_from_realmem(void *dest, void *src, size_t count)
 {
-   size_t size;
+   unsigned long size;
+   int rc;
 
-   if (is_vmalloc_addr(buf)) {
-   BUG_ON(csize >= PAGE_SIZE);
-   /* If buf is not page aligned, copy first part */
-   size = min(roundup(__pa(buf), PAGE_SIZE) - __pa(buf), csize);
-   if (size) {
-   if (memcpy_real(load_real_addr(buf), src, size))
-   return -EFAULT;
-   buf += size;
-   src += size;
-   }
-   /* Copy second part */
-   size = csize - size;
-   return (size) ? memcpy_real(load_real_addr(buf), src, size) : 0;
-   } else {
-   return memcpy_real(buf, src, csize);
-   }
+   if (!count)
+   return 0;
+   if (!is_vmalloc_or_module_addr(dest))
+   return memcpy_real(dest, src, count);
+   do {
+   size = min(count, PAGE_SIZE - (__pa(dest) & ~PAGE_MASK));
+   if (memcpy_real(load_real_addr(dest), src, size))
+   return -EFAULT;
+   count -= size;
+   dest += size;
+   src += size;
+   } while (count);
+   return 0;
 }
 
 /*
@@ -114,7 +112,7 @@ static ssize_t copy_oldmem_page_kdump(char *buf, size_t csize,
rc = copy_to_user_real((void __force __user *) buf,
   (void *) src, csize);
else
-   rc = copy_page_real(buf, (void *) src, csize);
+   rc = copy_from_realmem(buf, (void *) src, csize);
return (rc == 0) ? rc : csize;
 }
 
@@ -210,7 +208,7 @@ int copy_from_oldmem(void *dest, void *src, size_t count)
if (OLDMEM_BASE) {
if ((unsigned long) src < OLDMEM_SIZE) {
copied = min(count, OLDMEM_SIZE - (unsigned long) src);
-   rc = memcpy_real(dest, src + OLDMEM_BASE, copied);
+   rc = copy_from_realmem(dest, src + OLDMEM_BASE, copied);
if (rc)
return rc;
}
@@ -223,7 +221,7 @@ int copy_from_oldmem(void *dest, void *src, size_t count)
return rc;
}
}
-   return memcpy_real(dest + copied, src + copied, count - copied);
+   return copy_from_realmem(dest + copied, src + copied, count - copied);
 }
 
 /*
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index cc30d1f..0dc2b6d 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -266,6 +266,7 @@ sysc_sigpending:
tm  __TI_flags+3(%r12),_TIF_SYSCALL
jno sysc_return
lm  %r2,%r7,__PT_R2(%r11)   # load svc arguments
+   l   %r10,__TI_sysc_table(%r12)  # 31 bit system call table
xr  %r8,%r8 # svc 0 returns -ENOSYS
clc __PT_INT_CODE+2(2,%r11),BASED(.Lnr_syscalls+2)
jnl sysc_nr_ok  # invalid svc number -> do svc 0
diff --git a/arch/s390/kernel/entry64.S b/arch/s390/kernel/entry64.S
index 2b2188b..e5b43c9 100644
--- a/arch/s390/kernel/entry64.S
+++ b/arch/s390/kernel/entry64.S
@@ -297,6 +297,7 @@ sysc_sigpending:
tm  __TI_flags+7(%r12),_TIF_SYSCALL
jno sysc_return
lmg %r2,%r7,__PT_R2(%r11)   # load svc arguments
+   lg   
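
The page-wise loop in copy_from_realmem() in the crash_dump.c hunk above
splits a copy so that no chunk crosses a page boundary of the destination,
since a vmalloc buffer is only contiguous within a page. A minimal
user-space sketch of just that splitting logic follows. It assumes a 4 KB
page size, all demo_* names are hypothetical, and the actual address
translation and copy are left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumed page size */

/* Number of bytes from addr up to the end of the page containing it. */
static size_t demo_bytes_to_page_end(unsigned long addr)
{
	return DEMO_PAGE_SIZE - (addr & (DEMO_PAGE_SIZE - 1));
}

/*
 * Walk [addr, addr + count) in page-bounded chunks and return how many
 * chunks were needed. A real implementation would translate the address
 * of each chunk and copy 'size' bytes where the comment indicates.
 */
static size_t demo_count_chunks(unsigned long addr, size_t count)
{
	size_t chunks = 0;

	while (count) {
		size_t size = demo_bytes_to_page_end(addr);

		if (size > count)
			size = count;
		assert(size > 0);
		/* translate addr and copy size bytes here */
		count -= size;
		addr += size;
		chunks++;
	}
	return chunks;
}
```

The first chunk absorbs whatever misalignment the destination has, after
which every chunk starts on a page boundary.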

Re: [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390

2012-10-09 Thread Martin Schwidefsky
On Mon, 8 Oct 2012 21:24:40 -0700 (PDT)
Hugh Dickins  wrote:

> On Mon, 1 Oct 2012, Jan Kara wrote:
> 
> > On s390 any write to a page (even from kernel itself) sets architecture
> > specific page dirty bit. Thus when a page is written to via standard write,
> > HW dirty bit gets set and when we later map and unmap the page,
> > page_remove_rmap() finds the dirty bit and calls set_page_dirty().
> > 
> > Dirtying of a page which shouldn't be dirty can cause all sorts of problems
> > to filesystems. The bug we observed in practice is that buffers from the
> > page get freed, so when the page gets later marked as dirty and writeback
> > writes it, XFS crashes due to an assertion BUG_ON(!PagePrivate(page)) in
> > page_buffers() called from xfs_count_page_state().
> 
> What changed recently?  Was XFS hardly used on s390 until now?

One thing that changed is that the zero_user_segment for the remaining bytes
between i_size and the end of the page has been moved to
block_write_full_page_endio, see git commit eebd2aa355692afa. That changed
the timing of the race window in regard to map/unmap of the page by user
space. And yes XFS is in use on s390.
 
> > 
> > Similar problem can also happen when zero_user_segment() call from
> > xfs_vm_writepage() (or block_write_full_page() for that matter) set the
> > hardware dirty bit during writeback, later buffers get freed, and then page
> > unmapped.
> > 
> > Fix the issue by ignoring s390 HW dirty bit for page cache pages in
> > page_mkclean() and page_remove_rmap(). This is safe because when a page gets
> > marked as writeable in PTE it is also marked dirty in do_wp_page() or
> > do_page_fault(). When the dirty bit is cleared by clear_page_dirty_for_io(),
> > the page gets writeprotected in page_mkclean(). So pagecache page is
> > writeable if and only if it is dirty.
> 
> Very interesting patch...

Yes, it is an interesting idea. I really like the part that we'll use less
storage key operations, as these are freaking expensive.

> > 
> > CC: Martin Schwidefsky 
> 
> which I'd very much like Martin's opinion on...

Until you pointed out the shortcomings of the patch I really liked it...

> > ---
> >  mm/rmap.c |   16 ++--
> >  1 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 0f3b7cd..6ce8ddb 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -973,7 +973,15 @@ int page_mkclean(struct page *page)
> > struct address_space *mapping = page_mapping(page);
> > if (mapping) {
> > ret = page_mkclean_file(mapping, page);
> > -   if (page_test_and_clear_dirty(page_to_pfn(page), 1))
> > +   /*
> > +* We ignore dirty bit for pagecache pages. It is safe
> > +* as page is marked dirty iff it is writeable (page is
> > +* marked as dirty when it is made writeable and
> > +* clear_page_dirty_for_io() writeprotects the page
> > +* again).
> > +*/
> > +   if (PageSwapCache(page) &&
> > +   page_test_and_clear_dirty(page_to_pfn(page), 1))
> > ret = 1;
> 
> This part you could cut out: page_mkclean() is not used on SwapCache pages.
> I believe you are safe to remove the page_test_and_clear_dirty() from here.

Hmm, who guarantees that page_mkclean won't be used for SwapCache in the
future? At least we should add a comment there.

> > }
> > }
> > @@ -1183,8 +1191,12 @@ void page_remove_rmap(struct page *page)
> >  * this if the page is anon, so about to be freed; but perhaps
> >  * not if it's in swapcache - there might be another pte slot
> >  * containing the swap entry, but page not yet written to swap.
> > +* For pagecache pages, we don't care about dirty bit in storage
> > +* key because the page is writeable iff it is dirty (page is marked
> > +* as dirty when it is made writeable and clear_page_dirty_for_io()
> > +* writeprotects the page again).
> >  */
> > -   if ((!anon || PageSwapCache(page)) &&
> > +   if (PageSwapCache(page) &&
> > page_test_and_clear_dirty(page_to_pfn(page), 1))
> > set_page_dirty(page);
> 
> But here's where I think the problem is.  You're assuming that all
> filesystems go the same mapping_cap_account_writeback_di

Re: [GIT PULL] Disintegrate UAPI for s390 [ver #2]

2012-10-09 Thread Martin Schwidefsky
On Tue, 09 Oct 2012 10:15:52 +0100
David Howells  wrote:

> Can you merge the following branch into the s390 tree please.
> 
> This is to complete part of the UAPI disintegration for which the preparatory
> patches were pulled recently.
> 
> Now that the fixups and the asm-generic chunk have been merged, I've
> regenerated the patches to get rid of those dependencies and to take account of
> any changes made so far in the merge window.  If you have already pulled the
> older version of the branch aimed at you, then please feel free to ignore this
> request.
> 
> The following changes since commit 9e2d8656f5e8aa214e66b462680cf86b210b74a8:
> 
>   Merge branch 'akpm' (Andrew's patch-bomb) (2012-10-09 16:23:15 +0900)
> 
> are available in the git repository at:
> 
> 
>   git://git.infradead.org/users/dhowells/linux-headers.git tags/disintegrate-s390-20121009
> 
> for you to fetch changes up to 9807f75955ea7f1877981056755284481873115c:
> 
>   UAPI: (Scripted) Disintegrate arch/s390/include/asm (2012-10-09 09:47:31 +0100)

Ok, rebased the s390 tree to get akpm's patch-bomb out of the way and merged the UAPI
patchset. Result can be gawked at here:

https://git.kernel.org/?p=linux/kernel/git/s390/linux.git;a=shortlog;h=refs/heads/features

I will send Linus a please pull soon.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL] s390 patches for the 3.7 merge window #2

2012-10-10 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
The big thing in this pull request is the UAPI patch from David,
and worth mentioning is the page table dumper. The rest are small
improvements and bug fixes.

David Howells (1):
  UAPI: (Scripted) Disintegrate arch/s390/include/asm

Heiko Carstens (10):
  s390/facilities: cleanup PFMF and HPAGE machine facility detection
  s390/mm: use pfmf instruction to initialize storage keys
  s390/mm: fix pmd_huge() usage for kernel mapping
  s390/mm,pageattr: add more page table walk sanity checks
  s390/mm,pageattr: remove superfluous EXPORT_SYMBOLs
  s390/mm: add page table dumper
  s390/mm: fix mapping of read-only kernel text section
  s390/mm: let kernel text section always begin at 1MB
  s390/vmalloc: have separate modules area
  s390/mm,vmem: fix vmem_add_mem()/vmem_remove_range()

Martin Schwidefsky (2):
  s390: add support to start the kernel in 64 bit mode.
  s390/entry: fix svc number for TIF_SYSCALL system call restart

Sebastian Ott (3):
  s390/dcssblk: cleanup device attribute usage
  s390/chsc: make headers usable
  s390/css_chars: remove superfluous ifdef

Wei Yongjun (1):
  s390/zcrypt: remove duplicated include from zcrypt_pcixcc.c

 arch/s390/Kconfig.debug|   12 +
 arch/s390/include/asm/Kbuild   |   14 -
 arch/s390/include/asm/chpid.h  |   19 +-
 arch/s390/include/asm/cmb.h|   51 +--
 arch/s390/include/asm/css_chars.h  |3 -
 arch/s390/include/asm/debug.h  |   28 +-
 arch/s390/include/asm/kvm_para.h   |   14 +-
 arch/s390/include/asm/mman.h   |6 +-
 arch/s390/include/asm/page.h   |   14 +-
 arch/s390/include/asm/pgtable.h|   30 +-
 arch/s390/include/asm/ptrace.h |  462 +--
 arch/s390/include/asm/schid.h  |   15 +-
 arch/s390/include/asm/setup.h  |   21 +-
 arch/s390/include/asm/signal.h |  128 +--
 arch/s390/include/asm/termios.h|   42 +--
 arch/s390/include/asm/types.h  |   15 +-
 arch/s390/include/asm/unistd.h |  367 +-
 arch/s390/include/uapi/asm/Kbuild  |   45 +++
 arch/s390/include/{ => uapi}/asm/auxvec.h  |0
 arch/s390/include/{ => uapi}/asm/bitsperlong.h |0
 arch/s390/include/{ => uapi}/asm/byteorder.h   |0
 arch/s390/include/uapi/asm/chpid.h |   22 ++
 arch/s390/include/{ => uapi}/asm/chsc.h|   10 +-
 arch/s390/include/uapi/asm/cmb.h   |   53 +++
 arch/s390/include/{ => uapi}/asm/dasd.h|0
 arch/s390/include/uapi/asm/debug.h |   34 ++
 arch/s390/include/{ => uapi}/asm/errno.h   |0
 arch/s390/include/{ => uapi}/asm/fcntl.h   |0
 arch/s390/include/{ => uapi}/asm/ioctl.h   |0
 arch/s390/include/{ => uapi}/asm/ioctls.h  |0
 arch/s390/include/{ => uapi}/asm/ipcbuf.h  |0
 arch/s390/include/{ => uapi}/asm/kvm.h |0
 arch/s390/include/{ => uapi}/asm/kvm_virtio.h  |0
 arch/s390/include/uapi/asm/mman.h  |6 +
 arch/s390/include/{ => uapi}/asm/monwriter.h   |0
 arch/s390/include/{ => uapi}/asm/msgbuf.h  |0
 arch/s390/include/{ => uapi}/asm/param.h   |0
 arch/s390/include/{ => uapi}/asm/poll.h|0
 arch/s390/include/{ => uapi}/asm/posix_types.h |0
 arch/s390/include/uapi/asm/ptrace.h|  472 
 arch/s390/include/{ => uapi}/asm/qeth.h|0
 arch/s390/include/{ => uapi}/asm/resource.h|0
 arch/s390/include/uapi/asm/schid.h |   16 +
 arch/s390/include/{ => uapi}/asm/sembuf.h  |0
 arch/s390/include/uapi/asm/setup.h |   13 +
 arch/s390/include/{ => uapi}/asm/shmbuf.h  |0
 arch/s390/include/{ => uapi}/asm/sigcontext.h  |0
 arch/s390/include/{ => uapi}/asm/siginfo.h |0
 arch/s390/include/uapi/asm/signal.h|  135 +++
 arch/s390/include/{ => uapi}/asm/socket.h  |0
 arch/s390/include/{ => uapi}/asm/sockios.h |0
 arch/s390/include/{ => uapi}/asm/stat.h|0
 arch/s390/include/{ => uapi}/asm/statfs.h  |0
 arch/s390/include/{ => uapi}/asm/swab.h|0
 arch/s390/include/{ => uapi}/asm/tape390.h |0
 arch/s390/include/{ => uapi}/asm/termbits.h|0
 arch/s390/include/uapi/asm/termios.h   |   49 +++
 arch/s390/include/uapi/asm/types.h |   22 ++
 arch/s390/include/{ => uapi}/asm/ucontext.h|0
 arch/s390/include/uapi/asm/unistd.h|  374 +++
 arch/s390/include/{ =&g

Re: [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390

2012-10-19 Thread Martin Schwidefsky
On Tue, 9 Oct 2012 16:21:24 -0700 (PDT)
Hugh Dickins  wrote:

> > 
> > I am seriously tempted to switch to pure software dirty bits by using
> > page protection for writable but clean pages. The worry is the number of
> > additional protection faults we would get. But as we do software dirty
> > bit tracking for the most part anyway this might not be as bad as it
> > used to be.  
> 
> That's exactly the same reason why tmpfs opts out of dirty tracking, fear
> of unnecessary extra faults.  Anomalous as s390 is here, tmpfs is being
> anomalous too, and I'd be a hypocrite to push for you to make that change.

I tested the waters with the software dirty bit idea. Using kernel compile
as test case I got these numbers:

disk backing, swdirty: 10,023,870 minor-faults 18 major-faults
disk backing, hwdirty: 10,023,829 minor-faults 21 major-faults  


tmpfs backing, swdirty: 10,019,552 minor-faults 49 major-faults
tmpfs backing, hwdirty: 10,032,909 minor-faults 81 major-faults

That does not look bad at all. One test I found that shows an effect is
lat_mmap from LMBench:

disk backing, hwdirty: 30,894 minor-faults 0 major-faults
disk backing, swdirty: 30,894 minor-faults 0 major-faults

tmpfs backing, hwdirty: 22,574 minor-faults 0 major-faults
tmpfs backing, swdirty: 36,652 minor-faults 0 major-faults 

The runtimes of the hwdirty and swdirty setups are very similar, which is
encouraging enough for me to ask our performance team to run a larger test.
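
To put the fault counts above into proportion, the relative change from
hwdirty to swdirty can be computed as below. This is a small helper sketch;
relative_change_percent() is a hypothetical name, and the inputs are the
minor-fault numbers quoted in this mail.

```c
#include <assert.h>

/* Relative change from the hwdirty count to the swdirty count, in percent. */
double relative_change_percent(unsigned long hwdirty, unsigned long swdirty)
{
    assert(hwdirty != 0);
    return ((double)swdirty - (double)hwdirty) * 100.0 / (double)hwdirty;
}
```

For lat_mmap on tmpfs (22,574 vs. 36,652 minor faults) this gives roughly
+62%, while the kernel compile on tmpfs (10,032,909 vs. 10,019,552) changes
by about -0.13%.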

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for the 3.7-rc5

2012-11-08 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive a couple of bug fixes. I keep my fingers crossed that we now
have transparent huge pages ready for prime time.

Cornelia Huck (1):
  s390: Move css limits from drivers/s390/cio/ to include/asm/.

Gerald Schaefer (2):
  s390/mm: use pmd_large() instead of pmd_huge()
  s390/thp: respect page protection in pmd_none() and pmd_present()

Heiko Carstens (1):
  s390/sclp: fix addressing mode clobber

Sebastian Ott (2):
  s390/cio: suppress 2nd path verification during resume
  s390/cio: fix length calculation in idset.c

 arch/s390/include/asm/cio.h |2 ++
 arch/s390/include/asm/pgtable.h |   35 ++-
 arch/s390/kernel/sclp.S |8 +++-
 arch/s390/lib/uaccess_pt.c  |2 +-
 arch/s390/mm/gup.c  |2 +-
 drivers/s390/cio/css.h  |3 ---
 drivers/s390/cio/device.c   |8 +---
 drivers/s390/cio/idset.c|3 +--
 8 files changed, 35 insertions(+), 28 deletions(-)

diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index 55bde60..ad2b924 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -9,6 +9,8 @@
 
 #define LPM_ANYPATH 0xff
 #define __MAX_CSSID 0
+#define __MAX_SUBCHANNEL 65535
+#define __MAX_SSID 3
 
 #include 
 
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index dd647c9..2d3b7cb 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -506,12 +506,15 @@ static inline int pud_bad(pud_t pud)
 
 static inline int pmd_present(pmd_t pmd)
 {
-   return (pmd_val(pmd) & _SEGMENT_ENTRY_ORIGIN) != 0UL;
+   unsigned long mask = _SEGMENT_ENTRY_INV | _SEGMENT_ENTRY_RO;
+   return (pmd_val(pmd) & mask) == _HPAGE_TYPE_NONE ||
+  !(pmd_val(pmd) & _SEGMENT_ENTRY_INV);
 }
 
 static inline int pmd_none(pmd_t pmd)
 {
-   return (pmd_val(pmd) & _SEGMENT_ENTRY_INV) != 0UL;
+   return (pmd_val(pmd) & _SEGMENT_ENTRY_INV) &&
+  !(pmd_val(pmd) & _SEGMENT_ENTRY_RO);
 }
 
 static inline int pmd_large(pmd_t pmd)
@@ -1223,6 +1226,11 @@ static inline void __pmd_idte(unsigned long address, pmd_t *pmdp)
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+
+#define SEGMENT_NONE   __pgprot(_HPAGE_TYPE_NONE)
+#define SEGMENT_RO __pgprot(_HPAGE_TYPE_RO)
+#define SEGMENT_RW __pgprot(_HPAGE_TYPE_RW)
+
 #define __HAVE_ARCH_PGTABLE_DEPOSIT
 extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pgtable_t pgtable);
 
@@ -1242,16 +1250,15 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 
 static inline unsigned long massage_pgprot_pmd(pgprot_t pgprot)
 {
-   unsigned long pgprot_pmd = 0;
-
-   if (pgprot_val(pgprot) & _PAGE_INVALID) {
-   if (pgprot_val(pgprot) & _PAGE_SWT)
-   pgprot_pmd |= _HPAGE_TYPE_NONE;
-   pgprot_pmd |= _SEGMENT_ENTRY_INV;
-   }
-   if (pgprot_val(pgprot) & _PAGE_RO)
-   pgprot_pmd |= _SEGMENT_ENTRY_RO;
-   return pgprot_pmd;
+   /*
+* pgprot is PAGE_NONE, PAGE_RO, or PAGE_RW (see __Pxxx / __Sxxx)
+* Convert to segment table entry format.
+*/
+   if (pgprot_val(pgprot) == pgprot_val(PAGE_NONE))
+   return pgprot_val(SEGMENT_NONE);
+   if (pgprot_val(pgprot) == pgprot_val(PAGE_RO))
+   return pgprot_val(SEGMENT_RO);
+   return pgprot_val(SEGMENT_RW);
 }
 
 static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
@@ -1269,7 +1276,9 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 
 static inline pmd_t pmd_mkwrite(pmd_t pmd)
 {
-   pmd_val(pmd) &= ~_SEGMENT_ENTRY_RO;
+   /* Do not clobber _HPAGE_TYPE_NONE pages! */
+   if (!(pmd_val(pmd) & _SEGMENT_ENTRY_INV))
+   pmd_val(pmd) &= ~_SEGMENT_ENTRY_RO;
return pmd;
 }
 
diff --git a/arch/s390/kernel/sclp.S b/arch/s390/kernel/sclp.S
index bf05389..b6506ee 100644
--- a/arch/s390/kernel/sclp.S
+++ b/arch/s390/kernel/sclp.S
@@ -44,6 +44,12 @@ _sclp_wait_int:
 #endif
mvc .LoldpswS1-.LbaseS1(16,%r13),0(%r8)
mvc 0(16,%r8),0(%r9)
+#ifdef CONFIG_64BIT
+   epsw%r6,%r7 # set current addressing mode
+   nill%r6,0x1 # in new psw (31 or 64 bit mode)
+   nilh%r7,0x8000
+   stm %r6,%r7,0(%r8)
+#endif
lhi %r6,0x0200  # cr mask for ext int (cr0.54)
ltr %r2,%r2
jz  .LsetctS1
@@ -87,7 +93,7 @@ _sclp_wait_int:
.long   0x0008, 0x8000+.LwaitS1 # PSW to handle ext int
 #ifdef CONFIG_64BIT
 .LextpswS1_64:
-   .quad   0x00018000, .LwaitS1# PSW to handle ext int, 64 bit
+   .quad   0, .LwaitS1 # PSW to handle ext int, 64 bit
 #endif
 .LwaitpswS1:
.long   0x010a, 0x+.
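
The pmd_present(), pmd_none() and pmd_mkwrite() hunks above can be tried
out in user space with a small model. This is a sketch under assumptions:
the MODEL_* bit values are placeholders chosen for illustration (the real
definitions live in pgtable.h and are not shown in this mail), and the
model assumes that the "none" type is encoded as invalid plus read-only.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values, assumed for illustration only. */
#define MODEL_SEG_INV   0x20UL   /* stands in for _SEGMENT_ENTRY_INV */
#define MODEL_SEG_RO    0x200UL  /* stands in for _SEGMENT_ENTRY_RO */
#define MODEL_TYPE_NONE (MODEL_SEG_INV | MODEL_SEG_RO) /* assumed encoding */

/* Present: either a PROT_NONE huge page or any valid entry. */
bool model_pmd_present(unsigned long val)
{
    unsigned long mask = MODEL_SEG_INV | MODEL_SEG_RO;

    return (val & mask) == MODEL_TYPE_NONE || !(val & MODEL_SEG_INV);
}

/* None: invalid and not read-only, i.e. a truly empty entry. */
bool model_pmd_none(unsigned long val)
{
    bool none = (val & MODEL_SEG_INV) && !(val & MODEL_SEG_RO);

    /* An entry is never both "none" and "present". */
    assert(!(none && model_pmd_present(val)));
    return none;
}

/* Only valid entries lose the read-only bit; PROT_NONE stays untouched. */
unsigned long model_pmd_mkwrite(unsigned long val)
{
    if (!(val & MODEL_SEG_INV))
        val &= ~MODEL_SEG_RO;
    return val;
}
```

With this encoding an invalid-only entry is "none", an invalid plus
read-only entry is a present PROT_NONE mapping, and model_pmd_mkwrite()
cannot turn the latter into an empty entry by clearing the read-only bit.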

Re: linux-next: manual merge of the kvm tree with the s390 tree

2013-01-02 Thread Martin Schwidefsky
Hi Stephen,

On Thu, 3 Jan 2013 12:06:50 +1100
Stephen Rothwell  wrote:

> Today's linux-next merge of the kvm tree got conflicts in
> arch/s390/include/asm/irq.h and arch/s390/kernel/irq.c between commit
> bfb048f594d5 ("s390/irq: remove split irq fields from /proc/stat") from
> the s390 tree and commit 7e64e0597fd6 ("KVM: s390: Add a channel I/O
> based virtio transport driver") from the kvm tree.
> 
> I fixed it up (I think - see below) including the following merge fix
> patch and can carry the fix as necessary (more action may be required).
> 
> From: Stephen Rothwell 
> Date: Thu, 3 Jan 2013 12:04:39 +1100
> Subject: [PATCH] KVM: s390: fix for IOINT_VIR name change
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  drivers/s390/kvm/virtio_ccw.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
> index 1a5aff3..f94a0b1 100644
> --- a/drivers/s390/kvm/virtio_ccw.c
> +++ b/drivers/s390/kvm/virtio_ccw.c
> @@ -745,7 +745,7 @@ static struct ccw_driver virtio_ccw_driver = {
>   .set_offline = virtio_ccw_offline,
>   .set_online = virtio_ccw_online,
>   .notify = virtio_ccw_cio_notify,
> - .int_class = IOINT_VIR,
> + .int_class = IRQIO_VIR,
>  };
>  
>  static int __init pure_hex(char **cp, unsigned int *val, int min_digit,

It surprises me a bit that there is only one hunk in the cleanup patch.
I expected three, the above for virtio_ccw.c, one for irq.c and another
for irq.h. I checked the resulting tree which is correct! The merge diff
has a '++' line for irq.c/irq.h:

irq.h:
 +  IRQIO_PCI,
 +  IRQIO_MSI,
++  IRQIO_VIR,
NMI_NMI,
 -  NR_IRQS,
 +  CPU_RST,

irq.c:
 +  [IRQIO_PCI]  = {.name = "PCI", .desc = "[I/O] PCI Interrupt" },
 +  [IRQIO_MSI]  = {.name = "MSI", .desc = "[I/O] MSI Interrupt" },
++  [IRQIO_VIR]  = {.name = "VIR", .desc = "[I/O] Virtual I/O Devices"},
[NMI_NMI]= {.name = "NMI", .desc = "[NMI] Machine Check"},
 +  [CPU_RST]= {.name = "RST", .desc = "[CPU] CPU Restart"},

Magic ?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for the 3.8 merge window #2

2012-12-18 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
The main patch is the function measurement blocks extension for PCI to do
performance statistics and help with debugging. The other patch is a small
cleanup in ccwdev.h.

Cornelia Huck (1):
  s390/ccwdev: Include asm/schid.h.

Jan Glauber (1):
  s390/pci: performance statistics and debug infrastructure

 arch/s390/include/asm/ccwdev.h|4 +-
 arch/s390/include/asm/pci.h   |   39 
 arch/s390/include/asm/pci_debug.h |   36 +++
 arch/s390/pci/Makefile|2 +-
 arch/s390/pci/pci.c   |   73 +-
 arch/s390/pci/pci_clp.c   |1 +
 arch/s390/pci/pci_debug.c |  193 +
 arch/s390/pci/pci_dma.c   |8 +-
 arch/s390/pci/pci_event.c |2 +
 9 files changed, 350 insertions(+), 8 deletions(-)
 create mode 100644 arch/s390/include/asm/pci_debug.h
 create mode 100644 arch/s390/pci/pci_debug.c

diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
index 6d1f357..e606161 100644
--- a/arch/s390/include/asm/ccwdev.h
+++ b/arch/s390/include/asm/ccwdev.h
@@ -12,15 +12,13 @@
 #include 
 #include 
 #include 
+#include 
 
 /* structs from asm/cio.h */
 struct irb;
 struct ccw1;
 struct ccw_dev_id;
 
-/* from asm/schid.h */
-struct subchannel_id;
-
 /* simplified initializers for struct ccw_device:
  * CCW_DEVICE and CCW_DEVICE_DEVTYPE initialize one
  * entry in your MODULE_DEVICE_TABLE and set the match_flag correctly */
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index a6175ad..b1fa93c 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define PCIBIOS_MIN_IO 0x1000
 #define PCIBIOS_MIN_MEM0x1000
@@ -33,6 +34,25 @@ int pci_proc_domain(struct pci_bus *);
 #define ZPCI_FC_BLOCKED0x20
 #define ZPCI_FC_DMA_ENABLED0x10
 
+struct zpci_fmb {
+   u32 format  :  8;
+   u32 dma_valid   :  1;
+   u32 : 23;
+   u32 samples;
+   u64 last_update;
+   /* hardware counters */
+   u64 ld_ops;
+   u64 st_ops;
+   u64 stb_ops;
+   u64 rpcit_ops;
+   u64 dma_rbytes;
+   u64 dma_wbytes;
+   /* software counters */
+   atomic64_t allocated_pages;
+   atomic64_t mapped_pages;
+   atomic64_t unmapped_pages;
+} __packed __aligned(16);
+
 struct msi_map {
unsigned long irq;
struct msi_desc *msi;
@@ -92,7 +112,15 @@ struct zpci_dev {
u64 end_dma;/* End of available DMA addresses */
u64 dma_mask;   /* DMA address space mask */
 
+   /* Function measurement block */
+   struct zpci_fmb *fmb;
+   u16 fmb_update; /* update interval */
+
enum pci_bus_speed max_bus_speed;
+
+   struct dentry   *debugfs_dev;
+   struct dentry   *debugfs_perf;
+   struct dentry   *debugfs_debug;
 };
 
 struct pci_hp_callback_ops {
@@ -155,4 +183,15 @@ extern struct list_head zpci_list;
 extern struct pci_hp_callback_ops hotplug_ops;
 extern unsigned int pci_probe;
 
+/* FMB */
+int zpci_fmb_enable_device(struct zpci_dev *);
+int zpci_fmb_disable_device(struct zpci_dev *);
+
+/* Debug */
+int zpci_debug_init(void);
+void zpci_debug_exit(void);
+void zpci_debug_init_device(struct zpci_dev *);
+void zpci_debug_exit_device(struct zpci_dev *);
+void zpci_debug_info(struct zpci_dev *, struct seq_file *);
+
 #endif
diff --git a/arch/s390/include/asm/pci_debug.h b/arch/s390/include/asm/pci_debug.h
new file mode 100644
index 000..6bbec42
--- /dev/null
+++ b/arch/s390/include/asm/pci_debug.h
@@ -0,0 +1,36 @@
+#ifndef _S390_ASM_PCI_DEBUG_H
+#define _S390_ASM_PCI_DEBUG_H
+
+#include 
+
+extern debug_info_t *pci_debug_msg_id;
+extern debug_info_t *pci_debug_err_id;
+
+#ifdef CONFIG_PCI_DEBUG
+#define zpci_dbg(fmt, args...) \
+   do { \
+   if (pci_debug_msg_id->level >= 2) \
+   debug_sprintf_event(pci_debug_msg_id, 2, fmt , ## args); \
+   } while (0)
+
+#else /* !CONFIG_PCI_DEBUG */
+#define zpci_dbg(fmt, args...) do { } while (0)
+#endif
+
+#define zpci_err(text...) \
+   do { \
+   char debug_buffer[16]; \
+   snprintf(debug_buffer, 16, text); \
+   debug_text_event(pci_debug_err_id, 0, debug_buffer); \
+   } while (0)
+
+static inline void zpci_err_hex(void *addr
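
One consequence of the zpci_err() macro above is worth spelling out: with a
16 byte buffer, snprintf() keeps at most 15 characters plus the terminating
NUL, so longer error texts are silently cut. The following user-space
sketch shows that behaviour; model_format_err() and MODEL_ERR_BUF_LEN are
hypothetical names, and the format string is only an example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_ERR_BUF_LEN 16 /* same size as debug_buffer in zpci_err() */

/* Format "<msg> <code>" into a 16 byte buffer; returns the stored length. */
size_t model_format_err(char out[MODEL_ERR_BUF_LEN], const char *msg, int code)
{
    assert(out != NULL && msg != NULL);
    /* snprintf() truncates and always NUL-terminates for a non-zero size. */
    snprintf(out, MODEL_ERR_BUF_LEN, "%s %d", msg, code);
    return strlen(out);
}
```

Callers that need the complete text have to keep messages within 15
characters or use a different logging path.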

[GIT PULL] s390 patches for 3.7-rc6

2012-11-16 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
Some more bug fixes and a config change. The signal bug is nasty: if
the clock_gettime vdso function is interrupted by a signal while in
access-register mode, we end up in an endless signal loop until the
signal stack is full. The config change enables aligned struct pages,
which gives us an 8% improvement with hackbench.

Heiko Carstens (5):
  s390/topology: fix core id vs physical package id mix-up
  s390/gup: add missing TASK_SIZE check to get_user_pages_fast()
  s390/gup: fix access_ok() usage in __get_user_pages_fast()
  s390/mm: have 16 byte aligned struct pages
  s390/3215: fix tty close handling

Martin Schwidefsky (1):
  s390/signal: set correct address space control

 arch/s390/Kconfig   |1 +
 arch/s390/include/asm/compat.h  |2 +-
 arch/s390/include/asm/topology.h|3 +++
 arch/s390/include/uapi/asm/ptrace.h |4 ++--
 arch/s390/kernel/compat_signal.c|   14 --
 arch/s390/kernel/signal.c   |   14 --
 arch/s390/kernel/topology.c |6 --
 arch/s390/mm/gup.c  |5 ++---
 drivers/s390/char/con3215.c |   12 +---
 9 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 5dba755..d385f39 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -96,6 +96,7 @@ config S390
select HAVE_MEMBLOCK_NODE_MAP
select HAVE_CMPXCHG_LOCAL
select HAVE_CMPXCHG_DOUBLE
+   select HAVE_ALIGNED_STRUCT_PAGE if SLUB
select HAVE_VIRT_CPU_ACCOUNTING
select VIRT_CPU_ACCOUNTING
select ARCH_DISCARD_MEMBLOCK
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index a34a9d6..18cd6b5 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -20,7 +20,7 @@
 #define PSW32_MASK_CC  0x3000UL
 #define PSW32_MASK_PM  0x0f00UL
 
-#define PSW32_MASK_USER0x3F00UL
+#define PSW32_MASK_USER0xFF00UL
 
 #define PSW32_ADDR_AMODE   0x8000UL
 #define PSW32_ADDR_INSN0x7FFFUL
diff --git a/arch/s390/include/asm/topology.h b/arch/s390/include/asm/topology.h
index 9ca3053..9935cbd 100644
--- a/arch/s390/include/asm/topology.h
+++ b/arch/s390/include/asm/topology.h
@@ -8,6 +8,9 @@ struct cpu;
 
 #ifdef CONFIG_SCHED_BOOK
 
+extern unsigned char cpu_socket_id[NR_CPUS];
+#define topology_physical_package_id(cpu) (cpu_socket_id[cpu])
+
 extern unsigned char cpu_core_id[NR_CPUS];
 extern cpumask_t cpu_core_map[NR_CPUS];
 
diff --git a/arch/s390/include/uapi/asm/ptrace.h b/arch/s390/include/uapi/asm/ptrace.h
index 705588a..a5ca214 100644
--- a/arch/s390/include/uapi/asm/ptrace.h
+++ b/arch/s390/include/uapi/asm/ptrace.h
@@ -239,7 +239,7 @@ typedef struct
 #define PSW_MASK_EA0xUL
 #define PSW_MASK_BA0xUL
 
-#define PSW_MASK_USER  0x3F00UL
+#define PSW_MASK_USER  0xFF00UL
 
 #define PSW_ADDR_AMODE 0x8000UL
 #define PSW_ADDR_INSN  0x7FFFUL
@@ -269,7 +269,7 @@ typedef struct
 #define PSW_MASK_EA0x0001UL
 #define PSW_MASK_BA0x8000UL
 
-#define PSW_MASK_USER  0x3F818000UL
+#define PSW_MASK_USER  0xFF818000UL
 
 #define PSW_ADDR_AMODE 0xUL
 #define PSW_ADDR_INSN  0xUL
diff --git a/arch/s390/kernel/compat_signal.c b/arch/s390/kernel/compat_signal.c
index a1e8a86..593fcc9 100644
--- a/arch/s390/kernel/compat_signal.c
+++ b/arch/s390/kernel/compat_signal.c
@@ -309,6 +309,10 @@ static int restore_sigregs32(struct pt_regs *regs,_sigregs32 __user *sregs)
regs->psw.mask = (regs->psw.mask & ~PSW_MASK_USER) |
(__u64)(regs32.psw.mask & PSW32_MASK_USER) << 32 |
(__u64)(regs32.psw.addr & PSW32_ADDR_AMODE);
+   /* Check for invalid user address space control. */
+   if ((regs->psw.mask & PSW_MASK_ASC) >= (psw_kernel_bits & PSW_MASK_ASC))
+   regs->psw.mask = (psw_user_bits & PSW_MASK_ASC) |
+   (regs->psw.mask & ~PSW_MASK_ASC);
regs->psw.addr = (__u64)(regs32.psw.addr & PSW32_ADDR_INSN);
for (i = 0; i < NUM_GPRS; i++)
regs->gprs[i] = (__u64) regs32.gprs[i];
@@ -481,7 +485,10 @@ static int setup_frame32(int sig, struct k_sigaction *ka,
 
/* Set up registers for signal handler */
regs->gprs[15] = (__force __u64) frame;
-   regs->psw.mask |= PSW_MASK_BA;  /* force amode 31 */
+   /* Force 31 bit amode and default user address space control. */
+   regs->psw.mask = PSW_MASK_BA |
+   
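
The address space control check added to restore_sigregs32() above can be
modelled in user space as follows. This is a sketch under assumptions: the
MODEL_* constants mirror the 64 bit PSW layout as I understand it, and the
choice of which address space the kernel and user space run in is passed in
as parameters because the values of psw_kernel_bits and psw_user_bits are
not shown in this mail. model_sanitize_asc() is a hypothetical name.

```c
#include <assert.h>

/* Assumed 64 bit PSW address space control encoding (illustrative). */
#define MODEL_PSW_MASK_ASC      0x0000C00000000000ULL
#define MODEL_PSW_ASC_PRIMARY   0x0000000000000000ULL
#define MODEL_PSW_ASC_ACCREG    0x0000400000000000ULL
#define MODEL_PSW_ASC_SECONDARY 0x0000800000000000ULL
#define MODEL_PSW_ASC_HOME      0x0000C00000000000ULL

/*
 * If the restored mask selects the kernel's address space control (or a
 * higher encoding), replace it with the default user one. All other PSW
 * bits are preserved.
 */
unsigned long long model_sanitize_asc(unsigned long long mask,
                                      unsigned long long kernel_bits,
                                      unsigned long long user_bits)
{
    /* The comparison only makes sense if user space ranks below the kernel. */
    assert((user_bits & MODEL_PSW_MASK_ASC) <
           (kernel_bits & MODEL_PSW_MASK_ASC));

    if ((mask & MODEL_PSW_MASK_ASC) >= (kernel_bits & MODEL_PSW_MASK_ASC))
        mask = (user_bits & MODEL_PSW_MASK_ASC) |
               (mask & ~MODEL_PSW_MASK_ASC);
    return mask;
}
```

In this model, with the kernel assumed to run in home space and user space
in primary space, a signal frame that asks for home space is forced back to
primary, while access-register and secondary mode pass through unchanged.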

[GIT PULL] s390 patches for the 3.10-rc6

2013-06-13 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:

Three kvm related memory management fixes, a fix for show_trace,
a fix for early console output and a patch from Ben to help prevent
compile errors in regard to irq functions (or our lack thereof).

Ben Hutchings (1):
  s390/pci: Implement IRQ functions if !PCI

Christian Borntraeger (3):
  s390/pgtable: Fix guest overindication for change bit
  s390/pgtable: Save pgste during modify_prot_start/commit
  s390/pgtable: make pgste lock an explicit barrier

Martin Schwidefsky (1):
  s390/dumpstack: fix address ranges for asynchronous and panic stack

Peter Oberparleiter (1):
  s390/sclp: fix new line detection

 arch/s390/include/asm/pgtable.h |   32 ++--
 arch/s390/kernel/dumpstack.c|   12 +---
 arch/s390/kernel/irq.c  |   64 +++
 arch/s390/kernel/sclp.S |2 +-
 arch/s390/pci/pci.c |   33 
 5 files changed, 95 insertions(+), 48 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index ac01463..e8b6e5b 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -623,7 +623,7 @@ static inline pgste_t pgste_get_lock(pte_t *ptep)
"   csg %0,%1,%2\n"
"   jl  0b\n"
: "=&d" (old), "=&d" (new), "=Q" (ptep[PTRS_PER_PTE])
-   : "Q" (ptep[PTRS_PER_PTE]) : "cc");
+   : "Q" (ptep[PTRS_PER_PTE]) : "cc", "memory");
 #endif
return __pgste(new);
 }
@@ -635,11 +635,19 @@ static inline void pgste_set_unlock(pte_t *ptep, pgste_t pgste)
"   nihh%1,0xff7f\n"/* clear RCP_PCL_BIT */
"   stg %1,%0\n"
: "=Q" (ptep[PTRS_PER_PTE])
-   : "d" (pgste_val(pgste)), "Q" (ptep[PTRS_PER_PTE]) : "cc");
+   : "d" (pgste_val(pgste)), "Q" (ptep[PTRS_PER_PTE])
+   : "cc", "memory");
preempt_enable();
 #endif
 }
 
+static inline void pgste_set(pte_t *ptep, pgste_t pgste)
+{
+#ifdef CONFIG_PGSTE
+   *(pgste_t *)(ptep + PTRS_PER_PTE) = pgste;
+#endif
+}
+
 static inline pgste_t pgste_update_all(pte_t *ptep, pgste_t pgste)
 {
 #ifdef CONFIG_PGSTE
@@ -704,17 +712,19 @@ static inline void pgste_set_key(pte_t *ptep, pgste_t pgste, pte_t entry)
 {
 #ifdef CONFIG_PGSTE
unsigned long address;
-   unsigned long okey, nkey;
+   unsigned long nkey;
 
if (pte_val(entry) & _PAGE_INVALID)
return;
+   VM_BUG_ON(!(pte_val(*ptep) & _PAGE_INVALID));
address = pte_val(entry) & PAGE_MASK;
-   okey = nkey = page_get_storage_key(address);
-   nkey &= ~(_PAGE_ACC_BITS | _PAGE_FP_BIT);
-   /* Set page access key and fetch protection bit from pgste */
-   nkey |= (pgste_val(pgste) & (RCP_ACC_BITS | RCP_FP_BIT)) >> 56;
-   if (okey != nkey)
-   page_set_storage_key(address, nkey, 0);
+   /*
+* Set page access key and fetch protection bit from pgste.
+* The guest C/R information is still in the PGSTE, set real
+* key C/R to 0.
+*/
+   nkey = (pgste_val(pgste) & (RCP_ACC_BITS | RCP_FP_BIT)) >> 56;
+   page_set_storage_key(address, nkey, 0);
 #endif
 }
 
@@ -1099,8 +1109,10 @@ static inline pte_t ptep_modify_prot_start(struct mm_struct *mm,
if (!mm_exclusive(mm))
__ptep_ipte(address, ptep);
 
-   if (mm_has_pgste(mm))
+   if (mm_has_pgste(mm)) {
pgste = pgste_update_all(&pte, pgste);
+   pgste_set(ptep, pgste);
+   }
return pte;
 }
 
diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c
index 2982974..87acc38 100644
--- a/arch/s390/kernel/dumpstack.c
+++ b/arch/s390/kernel/dumpstack.c
@@ -74,6 +74,8 @@ __show_trace(unsigned long sp, unsigned long low, unsigned long high)
 
 static void show_trace(struct task_struct *task, unsigned long *stack)
 {
+   const unsigned long frame_size =
+   STACK_FRAME_OVERHEAD + sizeof(struct pt_regs);
register unsigned long __r15 asm ("15");
unsigned long sp;
 
@@ -82,11 +84,13 @@ static void show_trace(struct task_struct *task, unsigned long *stack)
sp = task ? task->thread.ksp : __r15;
printk("Call Trace:\n");
 #ifdef CONFIG_CHECK_STACK
-   sp = __show_trace(sp, S390_lowcore.panic_stack - 4096,
- S390_lowcore.panic_stack);
+   sp = __show_trace(sp,
+  
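
The new pgste_set_key() in the pgtable.h hunk above builds the real storage
key only from the access key and fetch protection bits kept in the PGSTE,
leaving the real change and reference bits at 0. The bit shuffling can be
sketched in user space as below. The MODEL_* masks are assumptions about
the PGSTE layout (the definitions are not part of this mail), and
model_key_from_pgste() is a hypothetical name.

```c
#include <assert.h>

/* Assumed PGSTE layout: ACC and FP bits live in the topmost byte. */
#define MODEL_RCP_ACC_BITS 0xf000000000000000ULL
#define MODEL_RCP_FP_BIT   0x0800000000000000ULL

/*
 * Extract the access key and fetch protection bit and move them into
 * storage key position. The change and reference bits of the result are
 * always 0, matching the comment in the patch.
 */
unsigned long long model_key_from_pgste(unsigned long long pgste)
{
    unsigned long long key;

    key = (pgste & (MODEL_RCP_ACC_BITS | MODEL_RCP_FP_BIT)) >> 56;
    /* Only the ACC (0xf0) and FP (0x08) positions can be set. */
    assert((key & ~0xf8ULL) == 0);
    return key;
}
```

Because the key no longer depends on the old storage key, the patched
function has no need to read the key back first, which is why the okey
variable disappears in the hunk.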

Re: [PATCH 0/9] dasd: implement block timeout

2013-06-17 Thread Martin Schwidefsky
On Mon,  3 Jun 2013 17:03:13 +0200
Martin Schwidefsky  wrote:

> This is a re-send of a patch series Hannes sent last January. Stefan
> looked at the patches and our tests went well, so I guess this is ready
> for upstream integration.
> 
> The changes to block/blk-core.c and block/blk-timeout.c look good
> to me, but I would like to request an acked-by from Jens for the
> block layer parts in patch #6 and #7 of the series (pretty please :-)
> 
> The original patch description from Hannes:
> 
> This patch series implements a block timeout handler for
> DASDs. The main impetus was to allow for a fixed upper
> timeout value after which a request is aborted.
> This is required eg when implementing a host-based
> mirroring system where otherwise the entire mirror
> would stall under certain circumstances.
> 
> Changes since v1:
> - Fixed lock inversion in dasd_times_out()
> - Checked for 'device->block' when writing to 'timeout' attribute
> - Check against 'UINT_MAX' when verifying the 'timeout' value
> 
> Once I got the required acked-by I can carry the patch set in the
> linux-s390 tree for the next merge window.
> 
> Hannes Reinecke (9):
>   dasd: Clarify comment
>   dasd: make number of retries configurable
>   dasd: process all requests in the device tasklet
>   dasd: Implement block timeout handling
>   dasd: Reduce amount of messages for specific errors
>   block,dasd: detailed I/O errors
>   block: check for timeout function in blk_rq_timed_out()
>   dasd: Add 'timeout' attribute
>   dasd: Fail all requests when DASD_FLAG_ABORTIO is set
> 
>  arch/s390/include/uapi/asm/dasd.h |4 ++
>  block/blk-core.c  |3 +
>  block/blk-timeout.c   |5 +-
>  drivers/s390/block/dasd.c |  115 +
>  drivers/s390/block/dasd_devmap.c  |   97 +++
>  drivers/s390/block/dasd_diag.c|8 ++-
>  drivers/s390/block/dasd_eckd.c|   15 +++--
>  drivers/s390/block/dasd_erp.c |8 +++
>  drivers/s390/block/dasd_fba.c |   10 +++-
>  drivers/s390/block/dasd_int.h |   10 
>  drivers/s390/block/dasd_ioctl.c   |   59 +++
>  11 files changed, 313 insertions(+), 21 deletions(-)
> 

Ping. Jens, could you please look at patch #6 and #7? I don't want
to do a please pull with block layer parts without your consent.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [ 13/48] s390/pci: Implement IRQ functions if !PCI

2013-06-19 Thread Martin Schwidefsky
On Tue, 18 Jun 2013 18:35:40 +0100
Ben Hutchings  wrote:

> On Tue, Jun 18, 2013 at 09:17:39AM -0700, Greg Kroah-Hartman wrote:
> > From: Greg Kroah-Hartman 
> > 
> > 3.9-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Ben Hutchings 
> > 
> > commit c46b54f7406780ec4cf9c9124d1cfb777674dc70 upstream.
> > 
> > All architectures must implement IRQ functions.  Since various
> > dependencies on !S390 were removed, there are various drivers that can
> > be selected but will fail to link.  Provide a dummy implementation of
> > these functions for the !PCI case.
> [...]
> 
> This breaks !SMP builds, so it's probably best to defer this until the
> following fix is in mainline.

I guess that all of the relevant kernels that are built for s390 are SMP
enabled. The patch fixes fallout in a number of PCI drivers and should
go in as is. The patch to fix the !SMP build will go on top.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[PATCH 5/9] dasd: Reduce amount of messages for specific errors

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

Whenever a request has been aborted internally by the driver
there is no sense data to be had. And printing lots of messages
stalls the system, so better to print out a short one-liner.

Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 drivers/s390/block/dasd_erp.c |8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/s390/block/dasd_erp.c b/drivers/s390/block/dasd_erp.c
index 3250cb4..8d11f77 100644
--- a/drivers/s390/block/dasd_erp.c
+++ b/drivers/s390/block/dasd_erp.c
@@ -159,6 +159,14 @@ dasd_log_sense(struct dasd_ccw_req *cqr, struct irb *irb)
struct dasd_device *device;
 
device = cqr->startdev;
+   if (cqr->intrc == -ETIMEDOUT) {
+   dev_err(&device->cdev->dev, "cqr %p timeout error", cqr);
+   return;
+   }
+   if (cqr->intrc == -ENOLINK) {
+   dev_err(&device->cdev->dev, "cqr %p transport error", cqr);
+   return;
+   }
/* dump sense data */
if (device->discipline && device->discipline->dump_sense)
device->discipline->dump_sense(device, cqr, irb);
-- 
1.7.9.5



[PATCH 1/9] dasd: Clarify comment

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

dasd_cancel_req will never return 1, only 0.

Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 drivers/s390/block/dasd.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index d72a9216e..4985489 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -2402,8 +2402,7 @@ int dasd_sleep_on_immediatly(struct dasd_ccw_req *cqr)
  * Cancels a request that was started with dasd_sleep_on_req.
  * This is useful to timeout requests. The request will be
  * terminated if it is currently in i/o.
- * Returns 1 if the request has been terminated.
- *0 if there was no need to terminate the request (not started yet)
+ * Returns 0 if request termination was successful
  *negative error code if termination failed
  * Cancellation of a request is an asynchronous operation! The calling
  * function has to wait until the request is properly returned via callback.
@@ -2440,7 +2439,6 @@ int dasd_cancel_req(struct dasd_ccw_req *cqr)
return rc;
 }
 
-
 /*
  * SECTION: Operations of the dasd_block layer.
  */
-- 
1.7.9.5



[PATCH 9/9] dasd: Fail all requests when DASD_FLAG_ABORTIO is set

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

Whenever a DASD request encounters a timeout we might
need to abort all outstanding requests on this or
even other devices.

This is especially useful if one wants to fail all
devices on one side of a RAID10 configuration, even
though only one device exhibited an error.

To handle this I've introduced a new device flag
DASD_FLAG_ABORTIO.
This flag is evaluated in __dasd_process_request_queue()
and will invoke blk_abort_request() for all
outstanding requests with DASD_CQR_FLAGS_FAILFAST set.
This will cause any of these requests to be aborted
immediately if the blk_timeout function is activated.

The DASD_FLAG_ABORTIO is also evaluated in
__dasd_process_request_queue to abort all
new requests which would have the
DASD_CQR_FLAGS_FAILFAST bit set.

The flag can be set with the new ioctls 'BIODASDABORTIO'
and removed with 'BIODASDALLOWIO'.

Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 arch/s390/include/uapi/asm/dasd.h |4 +++
 drivers/s390/block/dasd.c |   13 ++--
 drivers/s390/block/dasd_int.h |3 ++
 drivers/s390/block/dasd_ioctl.c   |   59 +
 4 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/arch/s390/include/uapi/asm/dasd.h b/arch/s390/include/uapi/asm/dasd.h
index 38eca3b..5812a3b 100644
--- a/arch/s390/include/uapi/asm/dasd.h
+++ b/arch/s390/include/uapi/asm/dasd.h
@@ -261,6 +261,10 @@ struct dasd_snid_ioctl_data {
 #define BIODASDQUIESCE _IO(DASD_IOCTL_LETTER,6) 
 /* Resume IO on device */
 #define BIODASDRESUME  _IO(DASD_IOCTL_LETTER,7) 
+/* Abort all I/O on a device */
+#define BIODASDABORTIO _IO(DASD_IOCTL_LETTER, 240)
+/* Allow I/O on a device */
+#define BIODASDALLOWIO _IO(DASD_IOCTL_LETTER, 241)
 
 
 /* retrieve API version number */
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 54f4bb8..17150a7 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -38,9 +38,6 @@
  */
 #define DASD_CHANQ_MAX_SIZE 4
 
-#define DASD_SLEEPON_START_TAG (void *) 1
-#define DASD_SLEEPON_END_TAG   (void *) 2
-
 /*
  * SECTION: exported variables of dasd.c
  */
@@ -2535,6 +2532,16 @@ static void __dasd_process_request_queue(struct dasd_block *block)
__blk_end_request_all(req, -EIO);
continue;
}
+   if (test_bit(DASD_FLAG_ABORTALL, &basedev->flags) &&
+   (basedev->features & DASD_FEATURE_FAILFAST ||
+blk_noretry_request(req))) {
+   DBF_DEV_EVENT(DBF_ERR, basedev,
+ "Rejecting failfast request %p",
+ req);
+   blk_start_request(req);
+   __blk_end_request_all(req, -ETIMEDOUT);
+   continue;
+   }
cqr = basedev->discipline->build_cp(basedev, block, req);
if (IS_ERR(cqr)) {
if (PTR_ERR(cqr) == -EBUSY)
diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
index 2bd03f4..690001a 100644
--- a/drivers/s390/block/dasd_int.h
+++ b/drivers/s390/block/dasd_int.h
@@ -524,7 +524,10 @@ struct dasd_block {
 #define DASD_FLAG_SUSPENDED9   /* The device was suspended */
 #define DASD_FLAG_SAFE_OFFLINE 10  /* safe offline processing requested*/
 #define DASD_FLAG_SAFE_OFFLINE_RUNNING 11  /* safe offline running */
+#define DASD_FLAG_ABORTALL 12  /* Abort all noretry requests */
 
+#define DASD_SLEEPON_START_TAG ((void *) 1)
+#define DASD_SLEEPON_END_TAG   ((void *) 2)
 
 void dasd_put_device_wake(struct dasd_device *);
 
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 8be1b51..25a0f2f 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -141,6 +141,59 @@ static int dasd_ioctl_resume(struct dasd_block *block)
 }
 
 /*
+ * Abort all failfast I/O on a device.
+ */
+static int dasd_ioctl_abortio(struct dasd_block *block)
+{
+   unsigned long flags;
+   struct dasd_device *base;
+   struct dasd_ccw_req *cqr, *n;
+
+   base = block->base;
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+
+   if (test_and_set_bit(DASD_FLAG_ABORTALL, &base->flags))
+   return 0;
+   DBF_DEV_EVENT(DBF_NOTICE, base, "%s", "abortall flag set");
+
+   spin_lock_irqsave(&block->request_queue_lock, flags);
+   spin_lock(&block->queue_lock);
+   list_for_each_entry_safe(cqr, n, &block->ccw_queue, blocklist) {
+   if (test_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags) &&
+   cqr->callback_data &&
+   cqr->callback_data != DASD_SLEEPON_START_TAG &&
+

[PATCH 0/9] dasd: implement block timeout

2013-06-03 Thread Martin Schwidefsky
This is a re-send of a patch series Hannes sent last January. Stefan
looked at the patches and our tests went well, so I guess this is ready
for upstream integration.

The changes to block/blk-core.c and block/blk-timeout.c look good
to me, but I would like to request an acked-by from Jens for the
block layer parts in patch #6 and #7 of the series (pretty please :-)

The original patch description from Hannes:

This patch series implements a block timeout handler for
DASDs. The main impetus was to allow for a fixed upper
timeout value after which a request is aborted.
This is required e.g. when implementing a host-based
mirroring system where otherwise the entire mirror
would stall under certain circumstances.

Changes since v1:
- Fixed lock inversion in dasd_times_out()
- Checked for 'device->block' when writing to 'timeout' attribute
- Check against 'UINT_MAX' when verifying the 'timeout' value

Once I have the required acked-by I can carry the patch set in the
linux-s390 tree for the next merge window.

Hannes Reinecke (9):
  dasd: Clarify comment
  dasd: make number of retries configurable
  dasd: process all requests in the device tasklet
  dasd: Implement block timeout handling
  dasd: Reduce amount of messages for specific errors
  block,dasd: detailed I/O errors
  block: check for timeout function in blk_rq_timed_out()
  dasd: Add 'timeout' attribute
  dasd: Fail all requests when DASD_FLAG_ABORTIO is set

 arch/s390/include/uapi/asm/dasd.h |4 ++
 block/blk-core.c  |3 +
 block/blk-timeout.c   |5 +-
 drivers/s390/block/dasd.c |  115 +
 drivers/s390/block/dasd_devmap.c  |   97 +++
 drivers/s390/block/dasd_diag.c|8 ++-
 drivers/s390/block/dasd_eckd.c|   15 +++--
 drivers/s390/block/dasd_erp.c |8 +++
 drivers/s390/block/dasd_fba.c |   10 +++-
 drivers/s390/block/dasd_int.h |   10 
 drivers/s390/block/dasd_ioctl.c   |   59 +++
 11 files changed, 313 insertions(+), 21 deletions(-)

-- 
1.7.9.5



[PATCH 6/9] block,dasd: detailed I/O errors

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

The DASD driver is using FAILFAST as an equivalent to the
transport errors in SCSI. And the 'steal lock' function maps
roughly to a reservation error. So we should be returning the
appropriate error codes when completing a request.

Cc: Jens Axboe 
Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 block/blk-core.c  |3 +++
 drivers/s390/block/dasd.c |   16 +---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 33c33bc..ef19bf2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2315,6 +2315,9 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
case -EBADE:
error_type = "critical nexus";
break;
+   case -ETIMEDOUT:
+   error_type = "timeout";
+   break;
case -EIO:
default:
error_type = "I/O";
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 87478be..b97624b 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -2183,7 +2183,7 @@ static int _dasd_sleep_on(struct dasd_ccw_req *maincqr, int interruptible)
test_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags) &&
(!dasd_eer_enabled(device))) {
cqr->status = DASD_CQR_FAILED;
-   cqr->intrc = -EAGAIN;
+   cqr->intrc = -ENOLINK;
continue;
}
/* Don't try to start requests if device is stopped */
@@ -2590,8 +2590,17 @@ static void __dasd_cleanup_cqr(struct dasd_ccw_req *cqr)
req = (struct request *) cqr->callback_data;
dasd_profile_end(cqr->block, cqr, req);
status = cqr->block->base->discipline->free_cp(cqr, req);
-   if (status <= 0)
-   error = status ? status : -EIO;
+   if (status < 0)
+   error = status;
+   else if (status == 0) {
+   if (cqr->intrc == -EPERM)
+   error = -EBADE;
+   else if (cqr->intrc == -ENOLINK ||
+cqr->intrc == -ETIMEDOUT)
+   error = cqr->intrc;
+   else
+   error = -EIO;
+   }
__blk_end_request_all(req, error);
 }
 
@@ -2692,6 +2701,7 @@ static void __dasd_block_start_head(struct dasd_block *block)
test_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags) &&
(!dasd_eer_enabled(block->base))) {
cqr->status = DASD_CQR_FAILED;
+   cqr->intrc = -ENOLINK;
dasd_schedule_block_bh(block);
continue;
}
-- 
1.7.9.5
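
The completion-status mapping in the __dasd_cleanup_cqr() hunk above can be
read as a small pure function. The following is a user-space sketch, not the
driver code: map_cqr_error() is a hypothetical helper name, and it assumes
that the error value starts at 0 and that a positive free_cp() status means
success, neither of which is visible in the hunk itself.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the mapping in the hunk above.
 * status: return value of the discipline's free_cp()
 * intrc:  internal return code recorded on the request
 * Returns 0 on success or a negative errno for the block layer.
 */
int map_cqr_error(int status, int intrc)
{
    if (status < 0)
        return status;      /* free_cp() reported its own error */
    if (status > 0)
        return 0;           /* assumed: positive status means success */
    if (intrc == -EPERM)
        return -EBADE;      /* logged as "critical nexus" by the block layer */
    if (intrc == -ENOLINK || intrc == -ETIMEDOUT)
        return intrc;       /* transport error or timeout, passed through */
    return -EIO;            /* everything else stays a generic I/O error */
}
```

With this mapping an upper layer can tell a timeout or transport failure
apart from a plain I/O error by looking at the errno alone.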



[PATCH 7/9] block: check for timeout function in blk_rq_timed_out()

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

rq_timed_out_fn might have been unset while the request
was in flight, so we need to check for it in blk_rq_timed_out().

Cc: Jens Axboe 
Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 block/blk-timeout.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 6e4744c..65f1035 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -82,9 +82,10 @@ void blk_delete_timer(struct request *req)
 static void blk_rq_timed_out(struct request *req)
 {
struct request_queue *q = req->q;
-   enum blk_eh_timer_return ret;
+   enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
 
-   ret = q->rq_timed_out_fn(req);
+   if (q->rq_timed_out_fn)
+   ret = q->rq_timed_out_fn(req);
switch (ret) {
case BLK_EH_HANDLED:
__blk_complete_request(req);
-- 
1.7.9.5
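
The pattern in the hunk above, a default result plus a guarded call through
an optional callback, can be sketched outside the kernel as follows. All
types and names here (eh_return, queue, rq_timed_out, always_handled) are
simplified stand-ins invented for the illustration; they are not the block
layer definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the block layer types; illustration only. */
enum eh_return { EH_HANDLED, EH_RESET_TIMER, EH_NOT_HANDLED };

struct request;
typedef enum eh_return (*timed_out_fn)(struct request *);

struct queue { timed_out_fn rq_timed_out_fn; };
struct request { struct queue *q; };

/*
 * Decide what to do with an expired request. If the driver has cleared
 * its handler while the request was in flight, fall back to re-arming
 * the timer instead of calling through a NULL pointer.
 */
enum eh_return rq_timed_out(struct request *req)
{
    enum eh_return ret = EH_RESET_TIMER;
    timed_out_fn fn = req->q->rq_timed_out_fn;

    if (fn)
        ret = fn(req);
    return ret;
}

/* Example handler that claims every expired request. */
enum eh_return always_handled(struct request *req)
{
    (void)req;
    return EH_HANDLED;
}
```

Reading the function pointer once into a local, as the sketch does, also
avoids testing one value and calling another; whether that matters in the
kernel depends on the locking around rq_timed_out_fn, which this sketch
does not model.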



[PATCH 4/9] dasd: Implement block timeout handling

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

This patch implements generic block layer timeout handling
callbacks for DASDs. When the timeout expires the respective
cqr is aborted.

With this timeout handler time-critical request abort
is guaranteed as the abort does not depend on the internal
state of the various DASD driver queues.

Signed-off-by: Hannes Reinecke 
Acked-by: Stefan Weinhuber 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 drivers/s390/block/dasd.c  |   76 
 drivers/s390/block/dasd_diag.c |5 ++-
 drivers/s390/block/dasd_eckd.c |4 +++
 drivers/s390/block/dasd_fba.c  |5 ++-
 4 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 000e514..87478be 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -2573,8 +2573,10 @@ static void __dasd_process_request_queue(struct dasd_block *block)
 */
cqr->callback_data = (void *) req;
cqr->status = DASD_CQR_FILLED;
+   req->completion_data = cqr;
blk_start_request(req);
list_add_tail(&cqr->blocklist, &block->ccw_queue);
+   INIT_LIST_HEAD(&cqr->devlist);
dasd_profile_start(block, cqr, req);
}
 }
@@ -2862,6 +2864,80 @@ static void do_dasd_request(struct request_queue *queue)
 }
 
 /*
+ * Block timeout callback, called from the block layer
+ *
+ * request_queue lock is held on entry.
+ *
+ * Return values:
+ * BLK_EH_RESET_TIMER if the request should be left running
+ * BLK_EH_NOT_HANDLED if the request is handled or terminated
+ *   by the driver.
+ */
+enum blk_eh_timer_return dasd_times_out(struct request *req)
+{
+   struct dasd_ccw_req *cqr = req->completion_data;
+   struct dasd_block *block = req->q->queuedata;
+   struct dasd_device *device;
+   int rc = 0;
+
+   if (!cqr)
+   return BLK_EH_NOT_HANDLED;
+
+   device = cqr->startdev ? cqr->startdev : block->base;
+   DBF_DEV_EVENT(DBF_WARNING, device,
+ " dasd_times_out cqr %p status %x",
+ cqr, cqr->status);
+
+   spin_lock(&block->queue_lock);
+   spin_lock(get_ccwdev_lock(device->cdev));
+   cqr->retries = -1;
+   cqr->intrc = -ETIMEDOUT;
+   if (cqr->status >= DASD_CQR_QUEUED) {
+   spin_unlock(get_ccwdev_lock(device->cdev));
+   rc = dasd_cancel_req(cqr);
+   } else if (cqr->status == DASD_CQR_FILLED ||
+  cqr->status == DASD_CQR_NEED_ERP) {
+   cqr->status = DASD_CQR_TERMINATED;
+   spin_unlock(get_ccwdev_lock(device->cdev));
+   } else if (cqr->status == DASD_CQR_IN_ERP) {
+   struct dasd_ccw_req *searchcqr, *nextcqr, *tmpcqr;
+
+   list_for_each_entry_safe(searchcqr, nextcqr,
+&block->ccw_queue, blocklist) {
+   tmpcqr = searchcqr;
+   while (tmpcqr->refers)
+   tmpcqr = tmpcqr->refers;
+   if (tmpcqr != cqr)
+   continue;
+   /* searchcqr is an ERP request for cqr */
+   searchcqr->retries = -1;
+   searchcqr->intrc = -ETIMEDOUT;
+   if (searchcqr->status >= DASD_CQR_QUEUED) {
+   spin_unlock(get_ccwdev_lock(device->cdev));
+   rc = dasd_cancel_req(searchcqr);
+   spin_lock(get_ccwdev_lock(device->cdev));
+   } else if ((searchcqr->status == DASD_CQR_FILLED) ||
+  (searchcqr->status == DASD_CQR_NEED_ERP)) {
+   searchcqr->status = DASD_CQR_TERMINATED;
+   rc = 0;
+   } else if (searchcqr->status == DASD_CQR_IN_ERP) {
+   /*
+* Shouldn't happen; most recent ERP
+* request is at the front of queue
+*/
+   continue;
+   }
+   break;
+   }
+   spin_unlock(get_ccwdev_lock(device->cdev));
+   }
+   dasd_schedule_block_bh(block);
+   spin_unlock(&block->queue_lock);
+
+   return rc ? BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
+}
+
+/*
  * Allocate and initialize request queue and default I/O scheduler.
  */
 static int dasd_alloc_queue(struct dasd_block *block)
diff --git a/drivers/s390/block/dasd_diag.c b/drivers/s390/block/dasd_diag.c
index 1548422..feca317 100644
--- a/drivers/s3

[PATCH 8/9] dasd: Add 'timeout' attribute

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

This patch adds a 'timeout' attribute to the DASD driver.
When set to non-zero, the blk_timeout function will
be enabled with the timeout specified in the attribute.
Setting 'timeout' to '0' will disable block timeouts.

Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 drivers/s390/block/dasd.c|2 ++
 drivers/s390/block/dasd_devmap.c |   56 ++
 drivers/s390/block/dasd_int.h|4 +++
 3 files changed, 62 insertions(+)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index b97624b..54f4bb8 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -2894,6 +2894,8 @@ enum blk_eh_timer_return dasd_times_out(struct request *req)
return BLK_EH_NOT_HANDLED;
 
device = cqr->startdev ? cqr->startdev : block->base;
+   if (!device->blk_timeout)
+   return BLK_EH_RESET_TIMER;
DBF_DEV_EVENT(DBF_WARNING, device,
  " dasd_times_out cqr %p status %x",
  cqr, cqr->status);
diff --git a/drivers/s390/block/dasd_devmap.c b/drivers/s390/block/dasd_devmap.c
index bc3e7af..58bc6eb 100644
--- a/drivers/s390/block/dasd_devmap.c
+++ b/drivers/s390/block/dasd_devmap.c
@@ -1280,6 +1280,61 @@ dasd_retries_store(struct device *dev, struct device_attribute *attr,
 
 static DEVICE_ATTR(retries, 0644, dasd_retries_show, dasd_retries_store);
 
+static ssize_t
+dasd_timeout_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+   struct dasd_device *device;
+   int len;
+
+   device = dasd_device_from_cdev(to_ccwdev(dev));
+   if (IS_ERR(device))
+   return -ENODEV;
+   len = snprintf(buf, PAGE_SIZE, "%lu\n", device->blk_timeout);
+   dasd_put_device(device);
+   return len;
+}
+
+static ssize_t
+dasd_timeout_store(struct device *dev, struct device_attribute *attr,
+  const char *buf, size_t count)
+{
+   struct dasd_device *device;
+   struct request_queue *q;
+   unsigned long val, flags;
+
+   device = dasd_device_from_cdev(to_ccwdev(dev));
+   if (IS_ERR(device) || !device->block)
+   return -ENODEV;
+
+   if ((strict_strtoul(buf, 10, &val) != 0) ||
+   val > UINT_MAX / HZ) {
+   dasd_put_device(device);
+   return -EINVAL;
+   }
+   q = device->block->request_queue;
+   if (!q) {
+   dasd_put_device(device);
+   return -ENODEV;
+   }
+   spin_lock_irqsave(&device->block->request_queue_lock, flags);
+   if (!val)
+   blk_queue_rq_timed_out(q, NULL);
+   else
+   blk_queue_rq_timed_out(q, dasd_times_out);
+
+   device->blk_timeout = val;
+
+   blk_queue_rq_timeout(q, device->blk_timeout * HZ);
+   spin_unlock_irqrestore(&device->block->request_queue_lock, flags);
+
+   dasd_put_device(device);
+   return count;
+}
+
+static DEVICE_ATTR(timeout, 0644,
+  dasd_timeout_show, dasd_timeout_store);
+
 static ssize_t dasd_reservation_policy_show(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1391,6 +1446,7 @@ static struct attribute * dasd_attrs[] = {
&dev_attr_failfast.attr,
&dev_attr_expires.attr,
&dev_attr_retries.attr,
+   &dev_attr_timeout.attr,
&dev_attr_reservation_policy.attr,
&dev_attr_last_known_reservation_state.attr,
&dev_attr_safe_offline.attr,
diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
index ad42075..2bd03f4 100644
--- a/drivers/s390/block/dasd_int.h
+++ b/drivers/s390/block/dasd_int.h
@@ -470,6 +470,8 @@ struct dasd_device {
unsigned long default_expires;
unsigned long default_retries;
 
+   unsigned long blk_timeout;
+
struct dentry *debugfs_dentry;
struct dasd_profile profile;
 };
@@ -663,6 +665,8 @@ void dasd_free_device(struct dasd_device *);
 struct dasd_block *dasd_alloc_block(void);
 void dasd_free_block(struct dasd_block *);
 
+enum blk_eh_timer_return dasd_times_out(struct request *req);
+
 void dasd_enable_device(struct dasd_device *);
 void dasd_set_target_state(struct dasd_device *, int);
 void dasd_kick_device(struct dasd_device *);
-- 
1.7.9.5
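
The range check in dasd_timeout_store() above ("val > UINT_MAX / HZ") exists
so that the later multiplication by HZ cannot overflow an unsigned int. A
user-space sketch of the same validation is shown below. It is only an
illustration: parse_timeout_ticks() and SKETCH_HZ are made-up names, the tick
rate of 100 is an assumption, and strtoul() stands in for the kernel's
string parsing helper.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define SKETCH_HZ 100U  /* assumed tick rate; the kernel's HZ is configurable */

/*
 * Parse a decimal timeout in seconds and convert it to ticks.
 * Returns 0 and stores the tick count, or -EINVAL if the input is not a
 * plain number or the multiplication would overflow an unsigned int.
 */
int parse_timeout_ticks(const char *buf, unsigned int *ticks)
{
    char *end;
    unsigned long val;

    errno = 0;
    val = strtoul(buf, &end, 10);
    if (errno != 0 || end == buf)
        return -EINVAL;
    if (*end == '\n')           /* sysfs writes usually end in a newline */
        end++;
    if (*end != '\0')
        return -EINVAL;
    if (val > UINT_MAX / SKETCH_HZ)
        return -EINVAL;         /* val * SKETCH_HZ would not fit */
    *ticks = (unsigned int)val * SKETCH_HZ;
    return 0;
}
```

Dividing the limit instead of multiplying the input keeps the check itself
free of overflow.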



[PATCH 2/9] dasd: make number of retries configurable

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

Instead of having the number of retries hard-coded in the various
functions we should be using a default retry value, which can
be modified via sysfs.

Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 drivers/s390/block/dasd_devmap.c |   41 ++
 drivers/s390/block/dasd_diag.c   |3 ++-
 drivers/s390/block/dasd_eckd.c   |   11 ++
 drivers/s390/block/dasd_fba.c|5 -
 drivers/s390/block/dasd_int.h|3 +++
 5 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/drivers/s390/block/dasd_devmap.c b/drivers/s390/block/dasd_devmap.c
index a71bb8a..bc3e7af 100644
--- a/drivers/s390/block/dasd_devmap.c
+++ b/drivers/s390/block/dasd_devmap.c
@@ -1240,6 +1240,46 @@ dasd_expires_store(struct device *dev, struct device_attribute *attr,
 
 static DEVICE_ATTR(expires, 0644, dasd_expires_show, dasd_expires_store);
 
+static ssize_t
+dasd_retries_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   struct dasd_device *device;
+   int len;
+
+   device = dasd_device_from_cdev(to_ccwdev(dev));
+   if (IS_ERR(device))
+   return -ENODEV;
+   len = snprintf(buf, PAGE_SIZE, "%lu\n", device->default_retries);
+   dasd_put_device(device);
+   return len;
+}
+
+static ssize_t
+dasd_retries_store(struct device *dev, struct device_attribute *attr,
+  const char *buf, size_t count)
+{
+   struct dasd_device *device;
+   unsigned long val;
+
+   device = dasd_device_from_cdev(to_ccwdev(dev));
+   if (IS_ERR(device))
+   return -ENODEV;
+
+   if ((strict_strtoul(buf, 10, &val) != 0) ||
+   (val > DASD_RETRIES_MAX)) {
+   dasd_put_device(device);
+   return -EINVAL;
+   }
+
+   if (val)
+   device->default_retries = val;
+
+   dasd_put_device(device);
+   return count;
+}
+
+static DEVICE_ATTR(retries, 0644, dasd_retries_show, dasd_retries_store);
+
 static ssize_t dasd_reservation_policy_show(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1350,6 +1390,7 @@ static struct attribute * dasd_attrs[] = {
&dev_attr_erplog.attr,
&dev_attr_failfast.attr,
&dev_attr_expires.attr,
+   &dev_attr_retries.attr,
&dev_attr_reservation_policy.attr,
&dev_attr_last_known_reservation_state.attr,
&dev_attr_safe_offline.attr,
diff --git a/drivers/s390/block/dasd_diag.c b/drivers/s390/block/dasd_diag.c
index cc06033..1548422 100644
--- a/drivers/s390/block/dasd_diag.c
+++ b/drivers/s390/block/dasd_diag.c
@@ -359,6 +359,7 @@ dasd_diag_check_device(struct dasd_device *device)
}
 
device->default_expires = DIAG_TIMEOUT;
+   device->default_retries = DIAG_MAX_RETRIES;
 
/* Figure out position of label block */
switch (private->rdc_data.vdev_class) {
@@ -555,7 +556,7 @@ static struct dasd_ccw_req *dasd_diag_build_cp(struct dasd_device *memdev,
recid++;
}
}
-   cqr->retries = DIAG_MAX_RETRIES;
+   cqr->retries = memdev->default_retries;
cqr->buildclk = get_tod_clock();
if (blk_noretry_request(req) ||
block->base->features & DASD_FEATURE_FAILFAST)
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 6a44b27..f4315dc 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -1682,6 +1682,9 @@ dasd_eckd_check_characteristics(struct dasd_device *device)
 
/* set default timeout */
device->default_expires = DASD_EXPIRES;
+   /* set default retry count */
+   device->default_retries = DASD_RETRIES;
+
if (private->gneq) {
value = 1;
for (i = 0; i < private->gneq->timeout.value; i++)
@@ -2659,7 +2662,7 @@ static struct dasd_ccw_req *dasd_eckd_build_cp_cmd_single(
cqr->block = block;
cqr->expires = startdev->default_expires * HZ;  /* default 5 minutes */
cqr->lpm = startdev->path_data.ppm;
-   cqr->retries = 256;
+   cqr->retries = startdev->default_retries;
cqr->buildclk = get_tod_clock();
cqr->status = DASD_CQR_FILLED;
return cqr;
@@ -2834,7 +2837,7 @@ static struct dasd_ccw_req *dasd_eckd_build_cp_cmd_track(
cqr->block = block;
cqr->expires = startdev->default_expires * HZ;  /* default 5 minutes */
cqr->lpm = startdev->path_data.ppm;
-   cqr->retries = 256;
+   cqr->retries = startdev->default_retries;
cqr->buildclk = get_tod_clock();
cqr->status = DASD_CQR_FILLED;
return cqr;
@@ -3127,7 +3130

[PATCH 3/9] dasd: process all requests in the device tasklet

2013-06-03 Thread Martin Schwidefsky
From: Hannes Reinecke 

Originally the DASD device tasklet would process the entries on
the ccw_queue until the first non-final request was found.
This was okay as long as all requests had the same retries and
expires parameters.
However, since both can now be modified, it is possible to
have requests _after_ the first request which have already expired.
So we need to check all requests in the device tasklet.

Signed-off-by: Hannes Reinecke 
Signed-off-by: Stefan Weinhuber 
Signed-off-by: Martin Schwidefsky 
---
 drivers/s390/block/dasd.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 4985489..000e514 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -1787,11 +1787,11 @@ static void __dasd_device_process_ccw_queue(struct dasd_device *device,
list_for_each_safe(l, n, &device->ccw_queue) {
cqr = list_entry(l, struct dasd_ccw_req, devlist);
 
-   /* Stop list processing at the first non-final request. */
+   /* Skip any non-final request. */
if (cqr->status == DASD_CQR_QUEUED ||
cqr->status == DASD_CQR_IN_IO ||
cqr->status == DASD_CQR_CLEAR_PENDING)
-   break;
+   continue;
if (cqr->status == DASD_CQR_ERROR) {
__dasd_device_recovery(device, cqr);
}
-- 
1.7.9.5
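
The effect of replacing "break" with "continue" in the hunk above can be
seen on a plain array. This is a sketch with invented names (status,
count_processed_break, count_processed_continue); it only counts which
entries a scan would reach and does not model the driver's queue handling.

```c
#include <stddef.h>

enum status { QUEUED, IN_IO, DONE, FAILED };

/* A request counts as final once it is no longer queued or in flight. */
static int is_final(enum status s)
{
    return s == DONE || s == FAILED;
}

/* Old behaviour: stop scanning at the first non-final request. */
size_t count_processed_break(const enum status *q, size_t n)
{
    size_t i, processed = 0;

    for (i = 0; i < n; i++) {
        if (!is_final(q[i]))
            break;
        processed++;
    }
    return processed;
}

/* New behaviour: skip non-final requests and keep scanning. */
size_t count_processed_continue(const enum status *q, size_t n)
{
    size_t i, processed = 0;

    for (i = 0; i < n; i++) {
        if (!is_final(q[i]))
            continue;
        processed++;
    }
    return processed;
}
```

For a queue holding DONE, IN_IO, FAILED, QUEUED, DONE the first variant
reaches one entry and the second reaches three, which is the case the
commit message describes: finished or expired requests sitting behind one
that is still in flight.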



Re: [Suggestion] arch: s390: mm: the warnings with allmodconfig and "EXTRA_CFLAGS=-W"

2013-06-24 Thread Martin Schwidefsky
On Tue, 25 Jun 2013 09:54:41 +0800
Chen Gang  wrote:

> Hello Maintainers:
> 
> When building allmodconfig for "IBM zSeries model z800 and z900",
> it reports the following warnings ("EXTRA_CFLAGS=-W"):
>   mm/slub.c:1875:1: warning: ‘deactivate_slab’ uses dynamic stack allocation [enabled by default]
>   mm/slub.c:1941:1: warning: ‘unfreeze_partials.isra.32’ uses dynamic stack allocation [enabled by default]
>   mm/slub.c:2575:1: warning: ‘__slab_free’ uses dynamic stack allocation [enabled by default]
>   mm/slub.c:1582:1: warning: ‘get_partial_node.isra.34’ uses dynamic stack allocation [enabled by default]
>   mm/slub.c:2311:1: warning: ‘__slab_alloc.constprop.42’ uses dynamic stack allocation [enabled by default]
> 
> Is it OK ?

Yes, these warnings should be ok. They are enabled by CONFIG_WARN_DYNAMIC_STACK;
the purpose is to find all functions with dynamic stack allocations. Checking
whether the allocations are truly ok needs to be done manually, as the compiler
cannot determine the maximum allocation size automatically.
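
As an illustration of what such a warning is about, here is a small sketch
of a function whose stack frame size depends on a run-time value. It is not
taken from mm/slub.c; sum_copy() is a made-up example and uses a C99
variable-length array, which is one common source of dynamic stack
allocation.

```c
#include <stddef.h>
#include <string.h>

/*
 * The array size depends on a run-time argument, so the frame size of
 * this function is not known at compile time. Whether the allocation is
 * safe depends on the largest n any caller can pass, which the compiler
 * cannot see from here.
 */
size_t sum_copy(const unsigned char *src, size_t n)
{
    unsigned char tmp[n ? n : 1];   /* variable-length array on the stack */
    size_t i, sum = 0;

    memcpy(tmp, src, n);
    for (i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}
```

A reviewer would have to look at every caller to bound n, which is the
manual check mentioned above.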

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [GIT PULL] s390 patches for the 3.9-rc6

2013-06-25 Thread Martin Schwidefsky
On Tue, 25 Jun 2013 13:09:51 +0100
Grant Likely  wrote:

> On Wed, Apr 3, 2013 at 4:25 PM, Martin Schwidefsky
>  wrote:
> > Hi Linus,
> >
> > please pull from the 'for-linus' branch of
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus
> >
> > to receive the following updates: Just a bunch of bugfixes.
> >
> > Heiko Carstens (4):
> >   drivers/Kconfig: add several missing GENERIC_HARDIRQS dependencies
> 
> Is anyone currently working on fixing this? s390 is the only
> architecture left that does not enable GENERIC_HARDIRQS. It's painful
> to keep adding dependencies on GENERIC_HARDIRQS to driver configs.

I am working on it. The hardest part is MSI irqs for PCI. Chances are
that I get it done for the merge window of 3.12.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [GIT PULL] s390 patches for the 3.9-rc6

2013-06-25 Thread Martin Schwidefsky
On Tue, 25 Jun 2013 13:42:23 +0100
Grant Likely  wrote:

> On Tue, Jun 25, 2013 at 1:15 PM, Martin Schwidefsky
>  wrote:
> > On Tue, 25 Jun 2013 13:09:51 +0100
> > Grant Likely  wrote:
> >
> >> On Wed, Apr 3, 2013 at 4:25 PM, Martin Schwidefsky
> >>  wrote:
> >> > Hi Linus,
> >> >
> >> > please pull from the 'for-linus' branch of
> >> >
> >> > git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus
> >> >
> >> > to receive the following updates: Just a bunch of bugfixes.
> >> >
> >> > Heiko Carstens (4):
> >> >   drivers/Kconfig: add several missing GENERIC_HARDIRQS dependencies
> >>
> >> Is anyone currently working on fixing this? s390 is the only
> >> architecture left that does not enable GENERIC_HARDIRQS. It's painful
> >> to keep adding dependencies on GENERIC_HARDIRQS to driver configs.
> >
> > I am working on it. The hardest part is MSI irqs for PCI. Chances are
> > that I get it done for the merge window of 3.12.
> 
> How are you handling the MSIs? I've just been looking at some code for
> irq_domain to handle MSI mapping. What's the part that is getting you
> hung up?

Basically a name-space thing. The current code allocates 64 interrupt numbers
for each PCI device, starting at 0. With GENERIC_HARDIRQS=y irq #0 is used
for external interrupts, irq #1 for I/O interrupts and irq #2 for adapter
interrupts. The adapter interrupt handler for PCI has to scan the interrupt
vectors and call generic_handle_irq for the MSI interrupts starting at irq #3.
As I don't want to create a huge irq_desc array, the number of allocatable
interrupts for MSI will be limited and I cannot simply assign 64 interrupt
numbers to each device anymore.

> I'd be happy to take a look if you want a hand.

Thanks for the offer, I might take you up on it if I hit a real problem.
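
To illustrate the numbering constraint described in this mail, here is a
minimal user-space sketch, not kernel code. The pool size NR_MSI_IRQS and
the helpers msi_irq_alloc() and msi_irq_free() are hypothetical names
made up for the example; only the reserved irqs #0 to #2 and the first
MSI irq #3 are taken from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* irq #0..#2 are taken by external, I/O and adapter interrupts. */
#define FIRST_MSI_IRQ	3
/* Hypothetical pool size; the mail only says the number is limited. */
#define NR_MSI_IRQS	8

static bool msi_irq_used[NR_MSI_IRQS];

/* Hand out the lowest free MSI irq number, or -1 if the pool is exhausted. */
static int msi_irq_alloc(void)
{
	for (int i = 0; i < NR_MSI_IRQS; i++) {
		if (!msi_irq_used[i]) {
			msi_irq_used[i] = true;
			return FIRST_MSI_IRQ + i;
		}
	}
	return -1;
}

/* Return an irq number to the pool; the number must lie inside the pool. */
static void msi_irq_free(int irq)
{
	assert(irq >= FIRST_MSI_IRQ && irq < FIRST_MSI_IRQ + NR_MSI_IRQS);
	msi_irq_used[irq - FIRST_MSI_IRQ] = false;
}
```

With a shared pool like this a device can no longer assume that it owns a
fixed block of 64 numbers; it gets whatever the allocator still has free.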

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [GIT PULL] s390 patches for the 3.9-rc6

2013-06-25 Thread Martin Schwidefsky
On Tue, 25 Jun 2013 14:30:20 +0100
Grant Likely  wrote:

> On Tue, Jun 25, 2013 at 2:11 PM, Martin Schwidefsky
>  wrote:
> > On Tue, 25 Jun 2013 13:42:23 +0100
> > Grant Likely  wrote:
> >
> >> On Tue, Jun 25, 2013 at 1:15 PM, Martin Schwidefsky
> >>  wrote:
> >> > On Tue, 25 Jun 2013 13:09:51 +0100
> >> > Grant Likely  wrote:
> >> >
> >> >> On Wed, Apr 3, 2013 at 4:25 PM, Martin Schwidefsky
> >> >>  wrote:
> >> >> > Hi Linus,
> >> >> >
> >> >> > please pull from the 'for-linus' branch of
> >> >> >
> >> >> > git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus
> >> >> >
> >> >> > to receive the following updates: Just a bunch of bugfixes.
> >> >> >
> >> >> > Heiko Carstens (4):
> >> >> >   drivers/Kconfig: add several missing GENERIC_HARDIRQS dependencies
> >> >>
> >> >> Is anyone currently working on fixing this? s390 is the only
> >> >> architecture left that does not enable GENERIC_HARDIRQS. It's painful
> >> >> to keep adding dependencies on GENERIC_HARDIRQS to driver configs.
> >> >
> >> > I am working on it. The hardest part is MSI irqs for PCI. Chances are
> >> > that I get it done for the merge window of 3.12.
> >>
> >> How are you handling the MSIs? I've just been looking at some code for
> >> irq_domain to handle MSI mapping. What's the part that is getting you
> >> hung up?
> >
> > Basically a name-space thing. The current code allocates 64 interrupt numbers
> > for each PCI device, starting at 0. With GENERIC_HARDIRQS=y irq #0 is used
> > for external interrupts, irq #1 for I/O interrupts and irq #2 for adapter
> > interrupts. The adapter interrupt handler for PCI has to scan the interrupt
> > vectors and call generic_handle_irq for the MSI interrupts starting at irq #3.
> > As I don't want to create a huge irq_desc array, the number of allocatable
> > interrupts for MSI will be limited and I cannot simply assign 64 interrupt
> > numbers to each device anymore.
> 
> Have you looked at irq_domain? It was created to solve that exact
> problem. irq_descs can get allocated dynamically as irqs are
> requested.

That is one option I am considering. The PCI support for System z can have
multiple PCI function groups, each with up to 2048 MSI interrupts. It is quite
a good match.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[GIT PULL] s390 patches for the 3.10-rc8

2013-06-25 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
A couple of last-minute fixes: a build regression for !SMP, a
recent memory detection patch caused kdump to break, a regression
in regard to sscanf vs. reboot from FCP, and two fixes in the DMA
mapping code for PCI.

Ben Hutchings (1):
  s390/irq: Only define synchronize_irq() on SMP

Heiko Carstens (1):
  s390/mem_detect: fix memory hole handling

Michael Holzheu (1):
  s390/ipl: Fix FCP WWPN and LUN format strings for read

Sebastian Ott (2):
  s390/dma: fix mapping_error detection
  s390/dma: support debug_dma_mapping_error

 arch/s390/include/asm/dma-mapping.h |    3 ++-
 arch/s390/kernel/ipl.c              |    8 ++++----
 arch/s390/kernel/irq.c              |    2 ++
 arch/s390/mm/mem_detect.c           |    3 ++-
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/s390/include/asm/dma-mapping.h b/arch/s390/include/asm/dma-mapping.h
index 886ac7d..2f8c1ab 100644
--- a/arch/s390/include/asm/dma-mapping.h
+++ b/arch/s390/include/asm/dma-mapping.h
@@ -50,9 +50,10 @@ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
 	struct dma_map_ops *dma_ops = get_dma_ops(dev);
 
+	debug_dma_mapping_error(dev, dma_addr);
 	if (dma_ops->mapping_error)
 		return dma_ops->mapping_error(dev, dma_addr);
-	return (dma_addr == 0UL);
+	return (dma_addr == DMA_ERROR_CODE);
 }
 
 static inline void *dma_alloc_coherent(struct device *dev, size_t size,
diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c
index d8a6a38..feb719d 100644
--- a/arch/s390/kernel/ipl.c
+++ b/arch/s390/kernel/ipl.c
@@ -754,9 +754,9 @@ static struct bin_attribute sys_reipl_fcp_scp_data_attr = {
.write = reipl_fcp_scpdata_write,
 };
 
-DEFINE_IPL_ATTR_RW(reipl_fcp, wwpn, "0x%016llx\n", "%016llx\n",
+DEFINE_IPL_ATTR_RW(reipl_fcp, wwpn, "0x%016llx\n", "%llx\n",
   reipl_block_fcp->ipl_info.fcp.wwpn);
-DEFINE_IPL_ATTR_RW(reipl_fcp, lun, "0x%016llx\n", "%016llx\n",
+DEFINE_IPL_ATTR_RW(reipl_fcp, lun, "0x%016llx\n", "%llx\n",
   reipl_block_fcp->ipl_info.fcp.lun);
 DEFINE_IPL_ATTR_RW(reipl_fcp, bootprog, "%lld\n", "%lld\n",
   reipl_block_fcp->ipl_info.fcp.bootprog);
@@ -1323,9 +1323,9 @@ static struct shutdown_action __refdata reipl_action = {
 
 /* FCP dump device attributes */
 
-DEFINE_IPL_ATTR_RW(dump_fcp, wwpn, "0x%016llx\n", "%016llx\n",
+DEFINE_IPL_ATTR_RW(dump_fcp, wwpn, "0x%016llx\n", "%llx\n",
   dump_block_fcp->ipl_info.fcp.wwpn);
-DEFINE_IPL_ATTR_RW(dump_fcp, lun, "0x%016llx\n", "%016llx\n",
+DEFINE_IPL_ATTR_RW(dump_fcp, lun, "0x%016llx\n", "%llx\n",
   dump_block_fcp->ipl_info.fcp.lun);
 DEFINE_IPL_ATTR_RW(dump_fcp, bootprog, "%lld\n", "%lld\n",
   dump_block_fcp->ipl_info.fcp.bootprog);
diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c
index 408e866..dd3c199 100644
--- a/arch/s390/kernel/irq.c
+++ b/arch/s390/kernel/irq.c
@@ -312,6 +312,7 @@ void measurement_alert_subclass_unregister(void)
 }
 EXPORT_SYMBOL(measurement_alert_subclass_unregister);
 
+#ifdef CONFIG_SMP
 void synchronize_irq(unsigned int irq)
 {
 	/*
@@ -320,6 +321,7 @@ void synchronize_irq(unsigned int irq)
 	 */
 }
 EXPORT_SYMBOL_GPL(synchronize_irq);
+#endif
 
 #ifndef CONFIG_PCI
 
diff --git a/arch/s390/mm/mem_detect.c b/arch/s390/mm/mem_detect.c
index 3cbd3b8..cca3882 100644
--- a/arch/s390/mm/mem_detect.c
+++ b/arch/s390/mm/mem_detect.c
@@ -123,7 +123,8 @@ void create_mem_hole(struct mem_chunk mem_chunk[], unsigned long addr,
 			continue;
 		} else if ((addr <= chunk->addr) &&
 			   (addr + size >= chunk->addr + chunk->size)) {
-			memset(chunk, 0 , sizeof(*chunk));
+			memmove(chunk, chunk + 1, (MEMORY_CHUNKS-i-1) * sizeof(*chunk));
+			memset(&mem_chunk[MEMORY_CHUNKS-1], 0, sizeof(*chunk));
 		} else if (addr + size < chunk->addr + chunk->size) {
 			chunk->size =  chunk->addr + chunk->size - addr - size;
 			chunk->addr = addr + size;



[GIT PULL] s390 patches for the 3.11 merge window #1

2013-07-03 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
This is the bulk of the s390 patches for the 3.11 merge window.
Notable enhancements are: the block timeout patches for dasd from Hannes,
and more work on the PCI support front. In addition some cleanup and
the usual bug fixing.

Christian Borntraeger (1):
  s390/kvm: Provide function for setting the guest storage key

Hannes Reinecke (9):
  s390/dasd: Clarify comment
  s390/dasd: make number of retries configurable
  s390/dasd: process all requests in the device tasklet
  s390/dasd: Implement block timeout handling
  s390/dasd: Reduce amount of messages for specific errors
  block/dasd: detailed I/O errors
  block: check for timeout function in blk_rq_timed_out()
  s390/dasd: Add 'timeout' attribute
  s390/dasd: Fail all requests when DASD_FLAG_ABORTIO is set

Heiko Carstens (1):
  s390/smp: get rid of generic_smp_call_function_interrupt

Hendrik Brueckner (1):
  s390/hwsampler: Updated misleading member names in hws_data_entry

Martin Schwidefsky (4):
  s390/sclp: add parameter to specify number of buffer pages
  s390/irq: store interrupt information in pt_regs
  s390/pci: remove per device debug attribute
  s390/airq: simplify adapter interrupt code

Michael Holzheu (5):
  s390/cio: Introduce generic synchronous CHSC IOCTL
  s390/cio: Make /dev/chsc a single-open device
  s390/cio: Introduce on-close CHSC IOCTLs
  s390/sclp: Add SCLP character device driver
  s390/chsc: Use snprintf instead of sprintf

Michael Mueller (1):
  s390/facility: decompose test_facility()

Sebastian Ott (14):
  s390/pci: use to_pci_dev
  s390/qdio: remove unused function
  s390: remove virt_to_phys implementation
  pci: add pcibios_release_device
  s390/pci: implement pcibios_release_device
  s390/pci: cleanup hotplug code
  s390/pci: remove pdev during unplug
  s390/pci: sysfs remove strlen
  s390/qdio: cleanup chsc SSQD usage
  s390/qdio: cleanup chsc SADC usage
  s390/dma: remove gratuitous brackets
  s390/vmwatchdog: do not use static data
  s390/appldata_mem: do not use static data
  s390/appldata_net_sum: do not use static data

Thomas Meyer (5):
  s390/ap_bus: Cocci spatch "ptr_ret.spatch"
  s390/dasd: Cocci spatch "ptr_ret.spatch"
  s390/net: Cocci spatch "ptr_ret.spatch"
  s390/hypfs: Cocci spatch "ptr_ret.spatch"
  s390/drivers: Cocci spatch "ptr_ret.spatch"

Wei Yongjun (1):
  s390/sclp: remove duplicated include from sclp_ctl.c

 Documentation/ioctl/ioctl-number.txt  |1 +
 arch/s390/appldata/appldata_mem.c |   18 +++-
 arch/s390/appldata/appldata_net_sum.c |   18 +++-
 arch/s390/hypfs/hypfs_diag.c  |8 +-
 arch/s390/include/asm/airq.h  |   15 ++-
 arch/s390/include/asm/dma-mapping.h   |2 +-
 arch/s390/include/asm/facility.h  |   17 ++--
 arch/s390/include/asm/io.h|   22 -
 arch/s390/include/asm/pci.h   |2 -
 arch/s390/include/asm/pgalloc.h   |3 +
 arch/s390/include/asm/ptrace.h|1 +
 arch/s390/include/uapi/asm/Kbuild |1 +
 arch/s390/include/uapi/asm/chsc.h |   13 +++
 arch/s390/include/uapi/asm/dasd.h |4 +
 arch/s390/include/uapi/asm/sclp_ctl.h |   24 +
 arch/s390/kernel/asm-offsets.c|1 +
 arch/s390/kernel/entry.S  |   12 ++-
 arch/s390/kernel/entry.h  |2 +-
 arch/s390/kernel/entry64.S|   16 +++-
 arch/s390/kernel/irq.c|8 +-
 arch/s390/kernel/smp.c|5 +-
 arch/s390/mm/pgtable.c|   48 ++
 arch/s390/oprofile/hwsampler.h|4 +-
 arch/s390/pci/pci.c   |   83 
 arch/s390/pci/pci_clp.c   |1 -
 arch/s390/pci/pci_debug.c |   29 --
 arch/s390/pci/pci_dma.c   |6 +-
 arch/s390/pci/pci_sysfs.c |   20 ++--
 block/blk-core.c  |3 +
 block/blk-timeout.c   |5 +-
 drivers/pci/hotplug/s390_pci_hpc.c|   60 
 drivers/pci/pci.c |   10 ++
 drivers/pci/probe.c   |1 +
 drivers/s390/block/dasd.c |  115 ---
 drivers/s390/block/dasd_devmap.c  |   97 +++
 drivers/s390/block/dasd_diag.c|8 +-
 drivers/s390/block/dasd_eckd.c|   17 +++-
 drivers/s390/block/dasd_erp.c |8 ++
 drivers/s390/block/dasd_fba.c |   10 +-
 drivers/s390/block/dasd_int.h |   10 ++
 drivers/s390/block/dasd_ioctl.c   |   59 
 drivers/s390/char/Makefile|2 +-
 drivers/s390/char/sclp.c  |   86 +++--
 drivers

[PATCH] tsacct: optimize acct_update_integrals

2013-07-03 Thread Martin Schwidefsky
The conversion of a cputime to microseconds can be done without the
detour via jiffies. This avoids unnecessary and costly calculations,
e.g. on s390 a 64-bit division and a multiplication can be replaced
with a simple shift.

Signed-off-by: Martin Schwidefsky 
---
 kernel/tsacct.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index a1dd9a1..bf09be1 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -126,16 +126,13 @@ static void __acct_update_integrals(struct task_struct *tsk,
 {
 	if (likely(tsk->mm)) {
 		cputime_t time, dtime;
-		struct timeval value;
 		unsigned long flags;
 		u64 delta;
 
 		local_irq_save(flags);
 		time = stime + utime;
 		dtime = time - tsk->acct_timexpd;
-		jiffies_to_timeval(cputime_to_jiffies(dtime), &value);
-		delta = value.tv_sec;
-		delta = delta * USEC_PER_SEC + value.tv_usec;
+		delta = cputime_to_usecs(dtime);
 
 		if (delta == 0)
 			goto out;
-- 
1.7.9.5
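
The difference between the two conversion paths in this patch can be
modelled in user space. This is a minimal sketch, not the kernel code:
it assumes HZ=100 and a cputime unit of 1/4096 microsecond (the
resolution of the s390 TOD clock), and the function names
usecs_via_jiffies() and usecs_direct() are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define HZ			100ULL
#define USEC_PER_SEC		1000000ULL
/* Assumption: one cputime unit is 1/4096 microsecond. */
#define CPUTIME_PER_USEC	4096ULL
#define CPUTIME_PER_SEC		(CPUTIME_PER_USEC * USEC_PER_SEC)

/* Old path: cputime -> jiffies -> (tv_sec, tv_usec) -> microseconds. */
static uint64_t usecs_via_jiffies(uint64_t cputime)
{
	uint64_t jiffies, tv_sec, tv_usec;

	assert(CPUTIME_PER_SEC % HZ == 0);
	jiffies = cputime / (CPUTIME_PER_SEC / HZ);	/* 64-bit division */
	tv_sec = jiffies / HZ;
	tv_usec = (jiffies % HZ) * (USEC_PER_SEC / HZ);
	return tv_sec * USEC_PER_SEC + tv_usec;		/* multiplication */
}

/* New path: a single shift, because 4096 == 1 << 12. */
static uint64_t usecs_direct(uint64_t cputime)
{
	return cputime >> 12;
}
```

In this model both paths agree on whole seconds, but the old path also
rounds down to whole jiffies, so a 5 ms delta converts to 0 on the old
path and to 5000 microseconds on the new one.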



[RFC][PATCH 0/2] s390/kvm: add kvm support for guest page hinting

2013-07-03 Thread Martin Schwidefsky
Linux on s390 as a guest under z/VM has been using the guest page
hinting interface (alias collaborative memory management) for a long
time. The full version with volatile states has been deemed to be too
complicated (see the old discussion about guest page hinting e.g. on
http://marc.info/?l=linux-mm&m=123816662017742&w=2).
What is currently implemented for the guest is the unused and stable
states to mark unallocated pages as freely available to the host.
This works just fine with z/VM as the host.

The two patches in this series implement the guest page hinting
interface for the unused and stable states in the KVM host.
Most of the code is specific to s390, but there is a common memory
management part as well; see patch #1.

The circus is back ;-)

Konstantin Weitz (2):
  mm: add support for discard of unused ptes
  s390/kvm: support collaborative memory management

 arch/s390/include/asm/kvm_host.h |8 +++-
 arch/s390/include/asm/pgtable.h  |   24 
 arch/s390/kvm/kvm-s390.c |   24 
 arch/s390/kvm/kvm-s390.h |2 +
 arch/s390/kvm/priv.c |   37 ++
 arch/s390/mm/pgtable.c   |   77 ++
 include/asm-generic/pgtable.h|   13 +++
 include/linux/rmap.h |1 +
 mm/rmap.c|   28 +-
 mm/vmscan.c  |3 ++
 10 files changed, 214 insertions(+), 3 deletions(-)

-- 
1.7.9.5



[PATCH 1/2] mm: add support for discard of unused ptes

2013-07-03 Thread Martin Schwidefsky
From: Konstantin Weitz 

In a virtualized environment and given an appropriate interface the guest
can mark pages as unused while they are free (for the s390 implementation
see git commit 45e576b1c3d00206 "guest page hinting light"). For the host
the unused state is a property of the pte.

This patch adds the primitive 'pte_unused' and code to the host swap out
handler so that pages marked as unused by all mappers are not swapped out
but discarded instead, thus saving one IO for swap out and potentially
another one for swap in.

[ Martin Schwidefsky: patch reordering and cleanup ]

Signed-off-by: Konstantin Weitz 
Signed-off-by: Martin Schwidefsky 
---
 include/asm-generic/pgtable.h |   13 +
 include/linux/rmap.h  |1 +
 mm/rmap.c |   28 +++-
 mm/vmscan.c   |3 +++
 4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index b183698..aae349a 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -192,6 +192,19 @@ static inline int pte_same(pte_t pte_a, pte_t pte_b)
 }
 #endif
 
+#ifndef __HAVE_ARCH_PTE_UNUSED
+/*
+ * Some architectures provide facilities to virtualization guests
+ * so that they can flag allocated pages as unused. This allows the
+ * host to transparently reclaim unused pages. This function returns
+ * whether the pte's page is unused.
+ */
+static inline int pte_unused(pte_t pte)
+{
+	return 0;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PMD_SAME
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..915e5c6 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -272,5 +272,6 @@ static inline int page_mkclean(struct page *page)
 #define SWAP_AGAIN 1
 #define SWAP_FAIL  2
 #define SWAP_MLOCK 3
+#define SWAP_FREE  4
 
 #endif /* _LINUX_RMAP_H */
diff --git a/mm/rmap.c b/mm/rmap.c
index 6280da8..be2788d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1233,6 +1233,10 @@ int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		}
 		set_pte_at(mm, address, pte,
 			   swp_entry_to_pte(make_hwpoison_entry(page)));
+	} else if (pte_unused(pteval) && PageAnon(page)) {
+		pte_clear(mm, address, pte);
+		dec_mm_counter(mm, MM_ANONPAGES);
+		ret = SWAP_FREE;
 	} else if (PageAnon(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
 
@@ -1455,6 +1459,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 	pgoff_t pgoff;
 	struct anon_vma_chain *avc;
 	int ret = SWAP_AGAIN;
+	int used = 0;
 
 	anon_vma = page_lock_anon_vma_read(page);
 	if (!anon_vma)
@@ -1479,10 +1484,31 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 
 		address = vma_address(page, vma);
 		ret = try_to_unmap_one(page, vma, address, flags);
+
+		/*
+		 * If SWAP_FREE was returned, we know that the page
+		 * is not used (as indicated by pte_unused()) by this
+		 * mapper. If only one of the mappers used the page,
+		 * it is considered used.
+		 */
+		if (ret == SWAP_FREE)
+			ret = SWAP_AGAIN;
+		else
+			used = 1;
+
 		if (ret != SWAP_AGAIN || !page_mapped(page))
 			break;
 	}
 
+	/*
+	 * If none of the mappers use the page, clear the dirty bit
+	 * so that the caller of try_to_unmap_anon() will free its mapping.
+	 */
+	if (!used && page_swapcount(page) == 0) {
+		ClearPageDirty(page);
+		ret = SWAP_FREE;
+	}
+
 	page_unlock_anon_vma_read(anon_vma);
 	return ret;
 }
@@ -1625,7 +1651,7 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
 		ret = try_to_unmap_anon(page, flags);
 	else
 		ret = try_to_unmap_file(page, flags);
-	if (ret != SWAP_MLOCK && !page_mapped(page))
+	if (ret != SWAP_FREE && ret != SWAP_MLOCK && !page_mapped(page))
 		ret = SWAP_SUCCESS;
 	return ret;
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fa6a853..093c1d7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -800,6 +800,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				goto keep_locked;
 			case SWAP_MLOCK:
 				goto cull_mlocked;
+			case SWAP_FREE:
+				if (PageSwapCache(page))
+					try_to_free_swap(page);
 			case SWAP_S
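
The rule that a page is discarded only if all of its mappers flag it as
unused can be modelled outside the kernel. The following is a minimal
user-space sketch under simplifying assumptions: struct mapper and
unmap_anon_model() are hypothetical names, locking and early loop exits
are left out, and only the SWAP_* numbering is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return codes as numbered in include/linux/rmap.h, plus the new SWAP_FREE. */
enum { SWAP_SUCCESS, SWAP_AGAIN, SWAP_FAIL, SWAP_MLOCK, SWAP_FREE };

/* Hypothetical stand-in for one mapping of the page. */
struct mapper {
	bool pte_unused;	/* what pte_unused() would report for this mapper */
};

/* Decide whether an anonymous page can be discarded instead of swapped out. */
static int unmap_anon_model(const struct mapper *mappers, size_t n,
			    unsigned int swapcount)
{
	bool used = false;

	assert(mappers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!mappers[i].pte_unused)
			used = true;	/* one user is enough to keep the page */
	}
	if (!used && swapcount == 0)
		return SWAP_FREE;	/* discard, no swap-out I/O needed */
	return SWAP_AGAIN;
}
```

A page with two mappers that both report it unused and with no swap
references yields SWAP_FREE in this model; a single mapper that still
uses the page, or a non-zero swap count, keeps it on the normal path.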

[PATCH 2/2] s390/kvm: support collaborative memory management

2013-07-03 Thread Martin Schwidefsky
From: Konstantin Weitz 

This patch enables Collaborative Memory Management (CMM) for kvm
on s390. CMM allows the guest to inform the host about page usage
(see arch/s390/mm/cmm.c). The host uses this information to avoid
swapping in unused pages in the page fault handler. Further, a CPU
provided list of unused invalid pages is processed to reclaim swap
space of not yet accessed unused pages.

[ Martin Schwidefsky: patch reordering and cleanup ]

Signed-off-by: Konstantin Weitz 
Signed-off-by: Martin Schwidefsky 
---
 arch/s390/include/asm/kvm_host.h |8 +++-
 arch/s390/include/asm/pgtable.h  |   24 
 arch/s390/kvm/kvm-s390.c |   24 
 arch/s390/kvm/kvm-s390.h |2 +
 arch/s390/kvm/priv.c |   37 ++
 arch/s390/mm/pgtable.c   |   77 ++
 mm/rmap.c|2 +-
 7 files changed, 171 insertions(+), 3 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 16bd5d1..8d1bcf4 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -90,7 +90,8 @@ struct kvm_s390_sie_block {
 	__u32	scaoh;			/* 0x005c */
 	__u8	reserved60;		/* 0x0060 */
 	__u8	ecb;			/* 0x0061 */
-	__u8	reserved62[2];		/* 0x0062 */
+	__u8	ecb2;			/* 0x0062 */
+	__u8	reserved63[1];		/* 0x0063 */
 	__u32	scaol;			/* 0x0064 */
 	__u8	reserved68[4];		/* 0x0068 */
 	__u32	todpr;			/* 0x006c */
@@ -105,7 +106,9 @@ struct kvm_s390_sie_block {
 	__u64	gbea;			/* 0x0180 */
 	__u8	reserved188[24];	/* 0x0188 */
 	__u32	fac;			/* 0x01a0 */
-	__u8	reserved1a4[92];	/* 0x01a4 */
+	__u8	reserved1a4[20];	/* 0x01a4 */
+	__u64	cbrlo;			/* 0x01b8 */
+	__u8	reserved1c0[64];	/* 0x01c0 */
 } __attribute__((packed));
 
 struct kvm_vcpu_stat {
@@ -140,6 +143,7 @@ struct kvm_vcpu_stat {
 	u32 instruction_stsi;
 	u32 instruction_stfl;
 	u32 instruction_tprot;
+	u32 instruction_essa;
 	u32 instruction_sigp_sense;
 	u32 instruction_sigp_sense_running;
 	u32 instruction_sigp_external_call;
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 9aefa3c..061e274 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -227,6 +227,7 @@ extern unsigned long MODULES_END;
 #define _PAGE_SWR  0x008   /* SW pte referenced bit */
 #define _PAGE_SWW  0x010   /* SW pte write bit */
 #define _PAGE_SPECIAL  0x020   /* SW associated with special page */
+#define _PAGE_UNUSED   0x040   /* SW bit for ptep_clear_flush() */
 #define __HAVE_ARCH_PTE_SPECIAL
 
 /* Set of bits not changed in pte_modify */
@@ -379,6 +380,12 @@ extern unsigned long MODULES_END;
 
 #endif /* CONFIG_64BIT */
 
+/* Guest Page State used for virtualization */
+#define _PGSTE_GPS_ZERO		0x0000000080000000UL
+#define _PGSTE_GPS_USAGE_MASK	0x0000000003000000UL
+#define _PGSTE_GPS_USAGE_STABLE 0x0000000000000000UL
+#define _PGSTE_GPS_USAGE_UNUSED 0x0000000001000000UL
+
 /*
  * A user page table pointer has the space-switch-event bit, the
  * private-space-control bit and the storage-alteration-event-control
@@ -594,6 +601,12 @@ static inline int pte_file(pte_t pte)
return (pte_val(pte) & mask) == _PAGE_TYPE_FILE;
 }
 
+static inline int pte_swap(pte_t pte)
+{
+   unsigned long mask = _PAGE_RO | _PAGE_INVALID | _PAGE_SWT | _PAGE_SWX;
+   return (pte_val(pte) & mask) == _PAGE_TYPE_SWAP;
+}
+
 static inline int pte_special(pte_t pte)
 {
return (pte_val(pte) & _PAGE_SPECIAL);
@@ -797,6 +810,7 @@ unsigned long gmap_translate(unsigned long address, struct gmap *);
 unsigned long __gmap_fault(unsigned long address, struct gmap *);
 unsigned long gmap_fault(unsigned long address, struct gmap *);
 void gmap_discard(unsigned long from, unsigned long to, struct gmap *);
+void __gmap_zap(unsigned long address, struct gmap *);
 
 void gmap_register_ipte_notifier(struct gmap_notifier *);
 void gmap_unregister_ipte_notifier(struct gmap_notifier *);
@@ -828,6 +842,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 
if (mm_has_pgste(mm)) {
pgste = pgste_get_lock(ptep);
+   pgste_val(pgste) &= ~_PGSTE_GPS_ZERO;
pgste_set_key(ptep, pgste, entry);
pgste_set_pte(ptep, entry);
pgste_set_unlock(ptep, pgste);
@@ -861,6 +876,12 @@ static inline int pte_young(pte_t pte)
return 0;
 }
 
+#define __HAVE_ARCH_PTE_UNUSED
+static inline int pte_unused(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_UNUSED;
+}
+

Re: [GIT PULL] s390 patches for the 3.9-rc6

2013-06-28 Thread Martin Schwidefsky
Hi Grant,

On Tue, 25 Jun 2013 15:18:21 +0100
Grant Likely  wrote:

> On Tue, Jun 25, 2013 at 3:12 PM, Martin Schwidefsky
>  wrote:
> > On Tue, 25 Jun 2013 14:30:20 +0100
> > Grant Likely  wrote:
> >
> >> Have you looked at irq_domain? It was created to solve that exact
> >> problem. irq_descs can get allocated dynamically as irqs are
> >> requested.
> >
> > That is one option I am considering. The PCI support for System z can have
> > multiple PCI function groups, each with up to 2048 MSI interrupts. It is
> > quite a good match.
> 
> :-) It was designed to support exactly that use-case on PowerPC.

I decided against irq_domains after all. The reason is that on s390 the same
MSI interrupt numbers are used for multiple devices but without actually
sharing them. The PCI root complex uses the adapter id and the MSI interrupt
number to set two indicators for the target device (adapter interrupt summary
bit and the MSI interrupt bit in the adapter interrupt vector). That would
give me two options for irq_domains: 1) allocate an irq_domain for each
device, or 2) encode the adapter id in the hwirq number and create a sparsely
populated irq_domain. It is simpler to just use the irq_alloc_desc and
irq_free_desc calls directly.

The preliminary result of this can be gawked at 
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git genirq

Test and performance analysis is to-be-done and will take some time.
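
Option 2 above, encoding the adapter id into the hwirq number, can be
sketched as follows. This is a user-space illustration, not code from
the genirq branch: the field width MSI_BITS and the helper names are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical width: room for 2048 MSI numbers per adapter. */
#define MSI_BITS	11
#define MSI_MASK	((1u << MSI_BITS) - 1)

/* Pack (adapter id, MSI number) into one hwirq number. */
static uint32_t hwirq_encode(uint32_t adapter_id, uint32_t msi)
{
	assert(msi <= MSI_MASK);
	assert(adapter_id <= (UINT32_MAX >> MSI_BITS));
	return (adapter_id << MSI_BITS) | msi;
}

/* Recover the adapter id from a hwirq number. */
static uint32_t hwirq_adapter(uint32_t hwirq)
{
	return hwirq >> MSI_BITS;
}

/* Recover the per-adapter MSI number from a hwirq number. */
static uint32_t hwirq_msi(uint32_t hwirq)
{
	return hwirq & MSI_MASK;
}
```

With this layout two devices that both use MSI number 5 get the distinct
hwirqs 5 and 2053, but a device that uses only 64 MSIs leaves the other
1984 numbers of its slot empty, which is the sparse population the mail
refers to.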

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


