Re: [RFC] IRQ handlers run with some high-priority interrupts (not NMI) enabled on some platforms

2021-02-18 Thread Michael Schmitz

On 19/02/21 12:19 am, Arnd Bergmann wrote:


drivers/net/ethernet/8390/apne.c
drivers/net/ethernet/8390/ax88796.c
drivers/net/ethernet/8390/hydra.c
drivers/net/ethernet/8390/mac8390.c
drivers/net/ethernet/8390/ne.c
drivers/net/ethernet/8390/zorro8390.c

[...]

Most of these are normal short-lived interrupts that only transfer
a few bytes or schedule deferred processing of some sort.
Most of the scsi and network drivers process the data in
a softirq, so those are generally fine here, but I do see that 8390
(ne2000) ethernet and the drivers/ide drivers do transfer their
data in hardirq context.

 Arnd


8390 ethernet drivers are widely used on m68k platforms (Amiga and 
Atari). At least on Amiga, the corresponding interrupt is a hardirq, so 
I'd advise caution. That said, the 8390 drivers might benefit from some 
refactoring: the way these drivers are compiled prevents e.g. the APNE 
driver from being used with two different variants of PCMCIA interfaces. 
I had begun some work on making the IO primitives runtime-selected two 
years ago, but that ended up looking too ugly ...
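
For illustration, the runtime-selected I/O idea mentioned above could look
roughly like the following sketch (all names are hypothetical; mainline 8390
fixes these accessors at compile time via macros instead):

#include <linux/netdevice.h>

/* Hypothetical sketch only: I/O primitives chosen at probe time, so one
 * module (e.g. apne) could drive two PCMCIA interface variants.
 */
struct ne2k_io_ops {
	u8   (*read_reg)(struct net_device *dev, int reg);
	void (*write_reg)(struct net_device *dev, int reg, u8 val);
	void (*block_input)(struct net_device *dev, int count,
			    struct sk_buff *skb, int ring_offset);
	void (*block_output)(struct net_device *dev, int count,
			     const unsigned char *buf, int start_page);
};

/* The probe routine would pick the ops table matching the detected
 * hardware and stash it in the device's private data for the hot paths.
 */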


Can't recall what IPL the 8390 interrupts operate at - Geert?

Regarding allowing preemption of hardirqs - at least in 2017, that was 
considered perfectly OK (see Linus' comment on 
https://lore.kernel.org/patchwork/patch/794954/). I concur with Finn on 
this one - we did look for potential issues with the way irqs_disabled() 
is defined on m68k (for me, in the context of the above bug that caused 
memory corruption on my system), and found none.


Cheers,

    Michael



Re: [PATCH] scsi/NCR5380: Remove in_interrupt() test

2020-12-01 Thread Michael Schmitz

Hi Finn,

works fine, thanks!

Tested-By: Michael Schmitz 


On 1/12/20 7:46 PM, Finn Thain wrote:

The in_interrupt() macro is deprecated. Also, its usage in
NCR5380_poll_politely2() has long been redundant.

Cc: Sebastian Andrzej Siewior 
Cc: Ahmed S. Darwish 
Cc: Thomas Gleixner 
Link: https://lore.kernel.org/r/20201126132952.2287996-1-bige...@linutronix.de
Signed-off-by: Finn Thain 
---
  drivers/scsi/NCR5380.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 462d911a89f2..6972e7ceb81a 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -223,7 +223,10 @@ static int NCR5380_poll_politely2(struct NCR5380_hostdata *hostdata,
 		cpu_relax();
 	} while (n--);
 
-	if (irqs_disabled() || in_interrupt())
+	/* Sleeping is not allowed when in atomic or interrupt contexts.
+	 * Callers in such contexts always disable local irqs.
+	 */
+	if (irqs_disabled())
 		return -ETIMEDOUT;
 
 	/* Repeatedly sleep for 1 ms until deadline */
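
For context, the pattern being simplified here, as a standalone sketch
(struct chip and chip_condition_met() are made-up stand-ins, not the actual
NCR5380 code): busy-wait briefly, then fall back to sleeping, which is only
legal when the caller has not disabled local interrupts:

#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/irqflags.h>
#include <linux/errno.h>

static int poll_politely(struct chip *hw, unsigned long timeout)
{
	unsigned long deadline = jiffies + timeout;
	int n = 500;			/* short busy-wait first */

	do {
		if (chip_condition_met(hw))
			return 0;
		cpu_relax();
	} while (n--);

	if (irqs_disabled())		/* atomic caller: may not sleep */
		return -ETIMEDOUT;

	while (time_is_after_jiffies(deadline)) {
		schedule_timeout_uninterruptible(1);	/* sleep ~1 jiffy */
		if (chip_condition_met(hw))
			return 0;
	}
	return -ETIMEDOUT;
}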


Re: [PATCH] scsi/atari_scsi: Fix race condition between .queuecommand and EH

2020-11-19 Thread Michael Schmitz

Hi Finn,

thanks for your patch!

Tested on Atari Falcon (with falconide, and pata_falcon modules).

Reviewed-by: Michael Schmitz 
Tested-by: Michael Schmitz 

Am 20.11.2020 um 17:39 schrieb Finn Thain:

It is possible that bus_reset_cleanup() or .eh_abort_handler could
be invoked during NCR5380_queuecommand(). If that takes place before
the new command is enqueued and after the ST-DMA "lock" has been
acquired, the ST-DMA "lock" will be released again. This will result
in a lost DMA interrupt and a command timeout. Fix this by excluding
EH and interrupt handlers while the new command is enqueued.

Signed-off-by: Finn Thain 
---
Michael, would you please send your Acked-by or Reviewed-and-tested-by?
These two patches taken together should be equivalent to the one you tested
recently. I've split it into two as that seemed to make more sense.
---
 drivers/scsi/NCR5380.c    |  9 ++++++---
 drivers/scsi/atari_scsi.c | 10 +++-------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index d654a6cc4162..ea4b5749e7da 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -580,11 +580,14 @@ static int NCR5380_queue_command(struct Scsi_Host *instance,
 
 	cmd->result = 0;
 
-	if (!NCR5380_acquire_dma_irq(instance))
-		return SCSI_MLQUEUE_HOST_BUSY;
-
 	spin_lock_irqsave(&hostdata->lock, flags);
 
+	if (!NCR5380_acquire_dma_irq(instance)) {
+		spin_unlock_irqrestore(&hostdata->lock, flags);
+
+		return SCSI_MLQUEUE_HOST_BUSY;
+	}
+
 	/*
 	 * Insert the cmd into the issue queue. Note that REQUEST SENSE
 	 * commands are added to the head of the queue since any command will
diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index a82b63a66635..95d7a3586083 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -376,15 +376,11 @@ static int falcon_get_lock(struct Scsi_Host *instance)
 	if (IS_A_TT())
 		return 1;
 
-	if (stdma_is_locked_by(scsi_falcon_intr) &&
-	    instance->hostt->can_queue > 1)
+	if (stdma_is_locked_by(scsi_falcon_intr))
 		return 1;
 
-	if (in_interrupt())
-		return stdma_try_lock(scsi_falcon_intr, instance);
-
-	stdma_lock(scsi_falcon_intr, instance);
-	return 1;
+	/* stdma_lock() may sleep which means it can't be used here */
+	return stdma_try_lock(scsi_falcon_intr, instance);
 }

 #ifndef MODULE
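
A brief aside on the ST-DMA locking contract this change relies on
(paraphrased from the stdma API as used above, so treat the details as a
sketch): stdma_lock() sleeps until the shared ST-DMA interrupt becomes free,
so it must not be called under hostdata->lock, while stdma_try_lock() merely
attempts the claim:

/* With .queuecommand now calling NCR5380_acquire_dma_irq() under
 * hostdata->lock (irqs off), only the non-sleeping variant is usable:
 */
if (!stdma_try_lock(scsi_falcon_intr, instance))
	return 0;	/* caller reports SCSI_MLQUEUE_HOST_BUSY; mid-layer retries */
return 1;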



Re: [PATCH] scsi/NCR5380: Reduce NCR5380_maybe_release_dma_irq() call sites

2020-11-19 Thread Michael Schmitz

Hi Finn,

thanks for your patch!

Tested on Atari Falcon (with falconide, and pata_falcon modules).

Reviewed-by: Michael Schmitz 
Tested-by: Michael Schmitz 

Am 20.11.2020 um 17:39 schrieb Finn Thain:

Refactor to avoid needless calls to NCR5380_maybe_release_dma_irq().
This makes the machine code smaller and the source more readable.

Signed-off-by: Finn Thain 
---
 drivers/scsi/NCR5380.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index ea4b5749e7da..d597d7493a62 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -725,7 +725,6 @@ static void NCR5380_main(struct work_struct *work)
 
 			if (!NCR5380_select(instance, cmd)) {
 				dsprintk(NDEBUG_MAIN, instance, "main: select complete\n");
-				maybe_release_dma_irq(instance);
 			} else {
 				dsprintk(NDEBUG_MAIN | NDEBUG_QUEUES, instance,
 					 "main: select failed, returning %p to queue\n", cmd);
@@ -737,8 +736,10 @@ static void NCR5380_main(struct work_struct *work)
 			NCR5380_information_transfer(instance);
 			done = 0;
 		}
-		if (!hostdata->connected)
+		if (!hostdata->connected) {
 			NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
+			maybe_release_dma_irq(instance);
+		}
 		spin_unlock_irq(&hostdata->lock);
 		if (!done)
 			cond_resched();
@@ -1844,7 +1845,6 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 				 */
 				NCR5380_write(TARGET_COMMAND_REG, 0);
 
-				maybe_release_dma_irq(instance);
 				return;
 			case MESSAGE_REJECT:
 				/* Accept message by clearing ACK */
@@ -1976,7 +1976,6 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 					hostdata->busy[scmd_id(cmd)] &= ~(1 << cmd->device->lun);
 					cmd->result = DID_ERROR << 16;
 					complete_cmd(instance, cmd);
-					maybe_release_dma_irq(instance);
 					return;
 				}
 				msgout = NOP;
@@ -2312,7 +2311,6 @@ static int NCR5380_abort(struct scsi_cmnd *cmd)
 	}
 
 	queue_work(hostdata->work_q, &hostdata->main_task);
-	maybe_release_dma_irq(instance);
 	spin_unlock_irqrestore(&hostdata->lock, flags);
 
 	return result;
@@ -2368,7 +2366,6 @@ static void bus_reset_cleanup(struct Scsi_Host *instance)
 	hostdata->dma_len = 0;
 
 	queue_work(hostdata->work_q, &hostdata->main_task);
-	maybe_release_dma_irq(instance);
 }

 /**



Re: [PATCH 11/13] m68k/mm: make node data and node setup depend on CONFIG_DISCONTIGMEM

2020-10-28 Thread Michael Schmitz

Hi Mike,

On 29/10/20 12:16 AM, Mike Rapoport wrote:

Hi Geert,

On Wed, Oct 28, 2020 at 10:25:49AM +0100, Geert Uytterhoeven wrote:

Hi Mike,

On Tue, Oct 27, 2020 at 12:31 PM Mike Rapoport  wrote:

From: Mike Rapoport 

The pg_data_t node structures and their initialization currently depend on
!CONFIG_SINGLE_MEMORY_CHUNK. Since they are required only for DISCONTIGMEM,
make this dependency explicit and replace usage of
CONFIG_SINGLE_MEMORY_CHUNK with CONFIG_DISCONTIGMEM where appropriate.

The CONFIG_SINGLE_MEMORY_CHUNK was implicitly disabled on the ColdFire MMU
variant, although it always presumed a single memory bank. As there is no
actual need for DISCONTIGMEM in this case, make sure that ColdFire MMU
systems set CONFIG_SINGLE_MEMORY_CHUNK to 'y'.

Signed-off-by: Mike Rapoport 

Thanks for your patch!


---
  arch/m68k/Kconfig.cpu   | 6 +++---
  arch/m68k/include/asm/page_mm.h | 2 +-
  arch/m68k/mm/init.c | 4 ++--
  3 files changed, 6 insertions(+), 6 deletions(-)

Is there any specific reason you didn't convert the checks for
CONFIG_SINGLE_MEMORY_CHUNK in arch/m68k/kernel/setup_mm.c

In arch/m68k/kernel/setup_mm.c the CONFIG_SINGLE_MEMORY_CHUNK is needed
for the case when a system has two banks, the kernel is loaded into the
second bank and so the first bank cannot be used as normal memory. It
does not matter what memory model will be used in this case.



That case used to be detected just fine at run time (by dint of the 
second memory chunk having an address below the first; the chunk the 
kernel resides in is always listed first), even without using 
CONFIG_SINGLE_MEMORY_CHUNK.


Unless you changed that behaviour (and I see nothing in your patch that 
would indicate that), this is still true.


Converting the check as Geert suggested, without also adding a test for 
out-of-order memory bank addresses, would implicitly treat DISCONTIGMEM 
as SINGLE_MEMORY_CHUNK, regardless of bank ordering. I don't think that 
is what we really want?
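
For illustration, the runtime detection described above could look roughly
like this (a sketch against the m68k_memory[] bookkeeping in arch/m68k;
the exact test is hypothetical):

/* The chunk the kernel resides in is always listed first, so a later
 * chunk with a lower address means the kernel was loaded into the second
 * bank and the first bank cannot be used as normal memory.
 */
static int kernel_in_upper_bank(void)
{
	int i;

	for (i = 1; i < m68k_num_memory; i++)
		if (m68k_memory[i].addr < m68k_memory[0].addr)
			return 1;
	return 0;
}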


Cheers,

    Michael





and arch/m68k/include/asm/virtconvert.h?
  
I remember I had build errors and troubles with include file

dependencies if I changed it there, but I might be mistaken. I'll
recheck again.


Gr{oetje,eeting}s,

 Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
 -- Linus Torvalds


Re: [PATCH] ide/falconide: Fix module unload

2020-09-24 Thread Michael Schmitz

Hi Finn,

thanks for catching this!

Reviewed-By: Michael Schmitz 

Am 25.09.2020 um 13:39 schrieb Finn Thain:

Unloading the falconide module results in a crash:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
Oops: 00000000
Modules linked in: falconide(-)
PC: [<002930b2>] ide_host_remove+0x2e/0x1d2
SR: 2000  SP: 00b49e28  a2: 009b0f90
d0: 00000000   d1: 009b0f90   d2: 00000000   d3: 00b48000
d4: 003cef32   d5: 00299188   a0: 0086d000   a1: 0086d000
Process rmmod (pid: 322, task=009b0f90)
Frame format=7 eff addr=00000000 ssw=0505 faddr=00000000
wb 1 stat/addr/data: 0000 00000000 00000000
wb 2 stat/addr/data: 0000 00000000 00000000
wb 3 stat/addr/data: 0000 00000000 00018da9
push data: 00000000 00000000 00000000 00000000
Stack from 00b49e90:
004c456a 0027f176 0027cb0a 0027cb9e  0086d00a 2187d3f0 0027f0e0
00b49ebc 2187d1f6  00b49ec8 002811e8 0086d000 00b49ef0 0028024c
0086d00a 002800d6 00279a1a 0001 0001 0086d00a 2187d3f0 00279a58
00b49f1c 002802e0 0086d00a 2187d3f0 004c456a 0086d00a ef96af74 
2187d3f0 002805d2 800de064 00b49f44 0027f088 2187d3f0 00ac1cf4 2187d3f0
004c43be 2187d3f0  2187d3f0 800b66a8 00b49f5c 00280776 2187d3f0
Call Trace: [<0027f176>] __device_driver_unlock+0x0/0x48
 [<0027cb0a>] device_links_busy+0x0/0x94
 [<0027cb9e>] device_links_unbind_consumers+0x0/0x130
 [<0027f0e0>] __device_driver_lock+0x0/0x5a
 [<2187d1f6>] falconide_remove+0x12/0x18 [falconide]
 [<002811e8>] platform_drv_remove+0x1c/0x28
 [<0028024c>] device_release_driver_internal+0x176/0x17c
 [<002800d6>] device_release_driver_internal+0x0/0x17c
 [<00279a1a>] get_device+0x0/0x22
 [<00279a58>] put_device+0x0/0x18
 [<002802e0>] driver_detach+0x56/0x82
 [<002805d2>] driver_remove_file+0x0/0x24
 [<0027f088>] bus_remove_driver+0x4c/0xa4
 [<00280776>] driver_unregister+0x28/0x5a
 [<00281a00>] platform_driver_unregister+0x12/0x18
 [<2187d2a0>] ide_falcon_driver_exit+0x10/0x16 [falconide]
 [<000764f0>] sys_delete_module+0x110/0x1f2
 [<000e83ea>] sys_rename+0x1a/0x1e
 [<2e0c>] syscall+0x8/0xc
 [<00188004>] ext4_multi_mount_protect+0x35a/0x3ce
Code: 0029 9188 4bf9 0027 aa1c 283c 003c ef32 <265c> 4a8b 6700 00b8 2043 2028 000c 0280 00ff ff00 6600 0176 40c0 7202 b2b9 004c
Disabling lock debugging due to kernel taint

This happens because the driver_data pointer is uninitialized.
Add the missing platform_set_drvdata() call. For clarity, use the
matching platform_get_drvdata() as well.

Cc: Michael Schmitz 
Cc: Bartlomiej Zolnierkiewicz 
Cc: linux-m...@lists.linux-m68k.org
Fixes: 5ed0794cde593 ("m68k/atari: Convert Falcon IDE drivers to platform drivers")
Signed-off-by: Finn Thain 
---
This patch was tested using Aranym.
---
 drivers/ide/falconide.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ide/falconide.c b/drivers/ide/falconide.c
index dbeb2605e5f6e..607c44bc50f1b 100644
--- a/drivers/ide/falconide.c
+++ b/drivers/ide/falconide.c
@@ -166,6 +166,7 @@ static int __init falconide_init(struct platform_device *pdev)
 	if (rc)
 		goto err_free;
 
+	platform_set_drvdata(pdev, host);
 	return 0;
 err_free:
 	ide_host_free(host);
@@ -176,7 +177,7 @@ static int __init falconide_init(struct platform_device *pdev)
 
 static int falconide_remove(struct platform_device *pdev)
 {
-	struct ide_host *host = dev_get_drvdata(&pdev->dev);
+	struct ide_host *host = platform_get_drvdata(pdev);
 
 	ide_host_remove(host);




Re: [PATCH v2] ide/macide: Convert Mac IDE driver to platform driver

2020-09-23 Thread Michael Schmitz

Hi Finn,

On 24/09/20 1:07 PM, Finn Thain wrote:

Looking further at the drivers using ide_host_register(), I see that
falconide.c is missing a set_drvdata() call, while tx4939ide.c calls
set_drvdata() after ide_host_register(). The latter example is not a bug.

The pattern I used, that is, calling set_drvdata() after ide_host_add(),
is actually more popular among IDE drivers than the pattern you suggested,
that is, set_drvdata() followed by ide_host_register(). Either way, I
don't see any bugs besides the one in falconide.c.

Regarding falconide.c, my inclination is to send a fix following the more
common pattern (among IDE drivers), as below. I guess that may prompt the
subsystem maintainers to make known their views on the style question.


Please do - that is clearly a bug. I must admit I never tried to boot my 
Falcon off a SCSI partition to test falconide module unload.


Cheers,

    Michael




diff --git a/drivers/ide/falconide.c b/drivers/ide/falconide.c
index dbeb2605e5f6e..607c44bc50f1b 100644
--- a/drivers/ide/falconide.c
+++ b/drivers/ide/falconide.c
@@ -166,6 +166,7 @@ static int __init falconide_init(struct platform_device *pdev)
 	if (rc)
 		goto err_free;
 
+	platform_set_drvdata(pdev, host);
 	return 0;
 err_free:
 	ide_host_free(host);
@@ -176,7 +177,7 @@ static int __init falconide_init(struct platform_device *pdev)
 
 static int falconide_remove(struct platform_device *pdev)
 {
-	struct ide_host *host = dev_get_drvdata(&pdev->dev);
+	struct ide_host *host = platform_get_drvdata(pdev);
 
 	ide_host_remove(host);
  


Re: [PATCH] ide/macide: Convert Mac IDE driver to platform driver

2020-09-09 Thread Michael Schmitz

Hi Finn,

Am 10.09.2020 um 12:23 schrieb Finn Thain:

+   return 0;
+
+release_mem:
+   release_mem_region(mem->start, resource_size(mem));


Not needed, as you used devm_*() for allocation.



OK, I'll remove this. I put it there after I looked at falconide.c and
wondered whether the automatic release would take place after both init
failure and exit (or just exit). I see now that pata_gayle.c does it
differently.


pata_gayle.c has probably seen more testing (in the platform 
environment), so I'd go with what's done there.
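
For reference, the devm pattern at issue, sketched (the resource lookup and
DRV_NAME are illustrative): a devm_*() request is torn down automatically
both on probe failure and on driver removal, so no explicit
release_mem_region() is needed in either path:

struct resource *mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);

if (!mem)
	return -ENODEV;

/* Managed request: released automatically on probe error and on remove */
if (!devm_request_mem_region(&pdev->dev, mem->start,
			     resource_size(mem), DRV_NAME))
	return -EBUSY;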


Cheers,

Michael



Re: [PATCH v2 2/7] scsi: NCR5380: Always re-enable reselection interrupt

2019-06-18 Thread Michael Schmitz

Martin,

On 19/06/19 12:47 PM, Martin K. Petersen wrote:

Michael,


No matter - patch applied cleanly to what I'm running on my Falcon,
and works just fine for now (the stress test will take a few hours to
complete). And that'll thoroughly exercise the reselection code path,
from what we've seen before.

How did it go?



Just fine - repeated with different settings for can_queue and 
cmd_per_lun, with not a single hitch. No regression at all.


Cheers,

    Michael







Re: [PATCH] NCR5380: Support chained sg lists

2019-06-11 Thread Michael Schmitz

On 11/06/19 3:25 PM, Finn Thain wrote:


My understanding is that support for chained scatterlists is to
become mandatory for LLDs.

Cc: Michael Schmitz 
Signed-off-by: Finn Thain 

Reviewed-by: Michael Schmitz 

---
  drivers/scsi/NCR5380.c | 41 ++++++++++++++++++-----------------------
  1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index d9fa9cf2fd8b..536426f25e86 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -149,12 +149,10 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)
 
 	if (scsi_bufflen(cmd)) {
 		cmd->SCp.buffer = scsi_sglist(cmd);
-		cmd->SCp.buffers_residual = scsi_sg_count(cmd) - 1;
 		cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
 		cmd->SCp.this_residual = cmd->SCp.buffer->length;
 	} else {
 		cmd->SCp.buffer = NULL;
-		cmd->SCp.buffers_residual = 0;
 		cmd->SCp.ptr = NULL;
 		cmd->SCp.this_residual = 0;
 	}
@@ -163,6 +161,17 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)
 	cmd->SCp.Message = 0;
 }
 
+static inline void advance_sg_buffer(struct scsi_cmnd *cmd)
+{
+	struct scatterlist *s = cmd->SCp.buffer;
+
+	if (!cmd->SCp.this_residual && s && !sg_is_last(s)) {
+		cmd->SCp.buffer = sg_next(s);
+		cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
+		cmd->SCp.this_residual = cmd->SCp.buffer->length;
+	}
+}
+
  /**
   * NCR5380_poll_politely2 - wait for two chip register values
   * @hostdata: host private data
@@ -1670,12 +1679,7 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 			    sun3_dma_setup_done != cmd) {
 				int count;
 
-				if (!cmd->SCp.this_residual && cmd->SCp.buffers_residual) {
-					++cmd->SCp.buffer;
-					--cmd->SCp.buffers_residual;
-					cmd->SCp.this_residual = cmd->SCp.buffer->length;
-					cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-				}
+				advance_sg_buffer(cmd);
 
 				count = sun3scsi_dma_xfer_len(hostdata, cmd);
  
@@ -1725,15 +1729,11 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 					 * scatter-gather list, move onto the next one.
 					 */
 
-				if (!cmd->SCp.this_residual && cmd->SCp.buffers_residual) {
-					++cmd->SCp.buffer;
-					--cmd->SCp.buffers_residual;
-					cmd->SCp.this_residual = cmd->SCp.buffer->length;
-					cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-					dsprintk(NDEBUG_INFORMATION, instance, "%d bytes and %d buffers left\n",
-						 cmd->SCp.this_residual,
-						 cmd->SCp.buffers_residual);
-				}
+				advance_sg_buffer(cmd);
+				dsprintk(NDEBUG_INFORMATION, instance,
+					 "this residual %d, sg ents %d\n",
+					 cmd->SCp.this_residual,
+					 sg_nents(cmd->SCp.buffer));
 
 				/*
 				 * The preferred transfer method is going to be
@@ -2126,12 +2126,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
 	if (sun3_dma_setup_done != tmp) {
 		int count;
 
-		if (!tmp->SCp.this_residual && tmp->SCp.buffers_residual) {
-			++tmp->SCp.buffer;
-			--tmp->SCp.buffers_residual;
-			tmp->SCp.this_residual = tmp->SCp.buffer->length;
-			tmp->SCp.ptr = sg_virt(tmp->SCp.buffer);
-		}
+		advance_sg_buffer(tmp);
 
 		count = sun3scsi_dma_xfer_len(hostdata, tmp);
  

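To make the chained-list difference concrete, here is a standalone sketch
of the iteration idiom the new advance_sg_buffer() helper relies on
(generic scatterlist API, not NCR5380 code): with chaining, the entries no
longer form one flat array, so sg_next() must replace pointer arithmetic
and the decrementing buffers_residual count:

#include <linux/scatterlist.h>

/* Walk a possibly-chained scatterlist. Incrementing the struct
 * scatterlist pointer (as the removed code did) would run off the end of
 * one chain segment instead of following the link to the next segment.
 */
static unsigned int total_sg_bytes(struct scatterlist *sgl)
{
	struct scatterlist *sg;
	unsigned int bytes = 0;

	for (sg = sgl; sg; sg = sg_next(sg))	/* sg_next() is NULL at the end */
		bytes += sg->length;

	return bytes;
}
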

Re: [PATCH v2 2/7] scsi: NCR5380: Always re-enable reselection interrupt

2019-06-11 Thread Michael Schmitz

Hi Finn,

On 11/06/19 9:33 PM, Finn Thain wrote:

On Tue, 11 Jun 2019, Michael Schmitz wrote:


Hi Finn,

IIRC I'd tested that change as well - didn't change broken target
behaviour but no regressions in other respects. Add my tested-by if
needed.


Unfortunately I can't confirm that this is the same patch as the one you
tested as I no longer have that commit. But Stan did test a wide variety
of targets and I'm confident that the reselection code path was covered.

No matter - patch applied cleanly to what I'm running on my Falcon, and 
works just fine for now (the stress test will take a few hours to complete). 
And that'll thoroughly exercise the reselection code path, from what 
we've seen before.


Cheers,

    Michael




Re: [PATCH v2 2/7] scsi: NCR5380: Always re-enable reselection interrupt

2019-06-10 Thread Michael Schmitz

Hi Finn,

IIRC I'd tested that change as well - didn't change broken target 
behaviour but no regressions in other respects. Add my tested-by if needed.


Cheers,

Michael


Am 09.06.2019 um 13:19 schrieb Finn Thain:

The reselection interrupt gets disabled during selection and must be
re-enabled when hostdata->connected becomes NULL. If it isn't re-enabled
a disconnected command may time-out or the target may wedge the bus while
trying to reselect the host. This can happen after a command is aborted.

Fix this by enabling the reselection interrupt in NCR5380_main() after
calls to NCR5380_select() and NCR5380_information_transfer() return.

Cc: Michael Schmitz 
Cc: sta...@vger.kernel.org # v4.9+
Fixes: 8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler")
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/scsi/NCR5380.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index fe0535affc14..08e3ea8159b3 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -709,6 +709,8 @@ static void NCR5380_main(struct work_struct *work)
 			NCR5380_information_transfer(instance);
 			done = 0;
 		}
+		if (!hostdata->connected)
+			NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
 		spin_unlock_irq(&hostdata->lock);
 		if (!done)
 			cond_resched();
@@ -1110,8 +1112,6 @@ static bool NCR5380_select(struct Scsi_Host *instance, struct scsi_cmnd *cmd)
 		spin_lock_irq(&hostdata->lock);
 		NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
 		NCR5380_reselect(instance);
-		if (!hostdata->connected)
-			NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
 		shost_printk(KERN_ERR, instance, "reselection after won arbitration?\n");
 		goto out;
 	}
@@ -1119,7 +1119,6 @@ static bool NCR5380_select(struct Scsi_Host *instance, struct scsi_cmnd *cmd)
 	if (err < 0) {
 		spin_lock_irq(&hostdata->lock);
 		NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
-		NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
 
 		/* Can't touch cmd if it has been reclaimed by the scsi ML */
 		if (!hostdata->selecting)
@@ -1157,7 +1156,6 @@ static bool NCR5380_select(struct Scsi_Host *instance, struct scsi_cmnd *cmd)
 	if (err < 0) {
 		shost_printk(KERN_ERR, instance, "select: REQ timeout\n");
 		NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
-		NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
 		goto out;
 	}
 	if (!hostdata->selecting) {
@@ -1826,9 +1824,6 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 				 */
 				NCR5380_write(TARGET_COMMAND_REG, 0);
 
-				/* Enable reselect interrupts */
-				NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
-
 				maybe_release_dma_irq(instance);
 				return;
 			case MESSAGE_REJECT:
@@ -1860,8 +1855,6 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 				 */
 				NCR5380_write(TARGET_COMMAND_REG, 0);
 
-				/* Enable reselect interrupts */
-				NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
 #ifdef SUN3_SCSI_VME
 				dregs->csr |= CSR_DMA_ENABLE;
 #endif
@@ -1964,7 +1957,6 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 					cmd->result = DID_ERROR << 16;
 					complete_cmd(instance, cmd);
 					maybe_release_dma_irq(instance);
-					NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
 					return;
 				}
 				msgout = NOP;



Re: [PATCH] NCR5380: Support chained sg lists

2019-06-10 Thread Michael Schmitz

Hi Finn,

Thanks - can't test this on my hardware but looks good to me.

Cheers,

Michael

Am 11.06.2019 um 15:25 schrieb Finn Thain:

My understanding is that support for chained scatterlists is to
become mandatory for LLDs.

Cc: Michael Schmitz 
Signed-off-by: Finn Thain 
---
 drivers/scsi/NCR5380.c | 41 ++++++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index d9fa9cf2fd8b..536426f25e86 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -149,12 +149,10 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)

if (scsi_bufflen(cmd)) {
cmd->SCp.buffer = scsi_sglist(cmd);
-   cmd->SCp.buffers_residual = scsi_sg_count(cmd) - 1;
cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
cmd->SCp.this_residual = cmd->SCp.buffer->length;
} else {
cmd->SCp.buffer = NULL;
-   cmd->SCp.buffers_residual = 0;
cmd->SCp.ptr = NULL;
cmd->SCp.this_residual = 0;
}
@@ -163,6 +161,17 @@ static inline void initialize_SCp(struct scsi_cmnd *cmd)
cmd->SCp.Message = 0;
 }

+static inline void advance_sg_buffer(struct scsi_cmnd *cmd)
+{
+   struct scatterlist *s = cmd->SCp.buffer;
+
+   if (!cmd->SCp.this_residual && s && !sg_is_last(s)) {
+   cmd->SCp.buffer = sg_next(s);
+   cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
+   cmd->SCp.this_residual = cmd->SCp.buffer->length;
+   }
+}
+
 /**
  * NCR5380_poll_politely2 - wait for two chip register values
  * @hostdata: host private data
@@ -1670,12 +1679,7 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 			    sun3_dma_setup_done != cmd) {
 				int count;
 
-				if (!cmd->SCp.this_residual && cmd->SCp.buffers_residual) {
-					++cmd->SCp.buffer;
-					--cmd->SCp.buffers_residual;
-					cmd->SCp.this_residual = cmd->SCp.buffer->length;
-					cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-				}
+				advance_sg_buffer(cmd);
 
 				count = sun3scsi_dma_xfer_len(hostdata, cmd);

@@ -1725,15 +1729,11 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 					 * scatter-gather list, move onto the next one.
 					 */
 
-				if (!cmd->SCp.this_residual && cmd->SCp.buffers_residual) {
-					++cmd->SCp.buffer;
-					--cmd->SCp.buffers_residual;
-					cmd->SCp.this_residual = cmd->SCp.buffer->length;
-					cmd->SCp.ptr = sg_virt(cmd->SCp.buffer);
-					dsprintk(NDEBUG_INFORMATION, instance, "%d bytes and %d buffers left\n",
-						 cmd->SCp.this_residual,
-						 cmd->SCp.buffers_residual);
-				}
+				advance_sg_buffer(cmd);
+				dsprintk(NDEBUG_INFORMATION, instance,
+					 "this residual %d, sg ents %d\n",
+					 cmd->SCp.this_residual,
+					 sg_nents(cmd->SCp.buffer));
 
 				/*
 				 * The preferred transfer method is going to be
@@ -2126,12 +2126,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
if (sun3_dma_setup_done != tmp) {
int count;

-   if (!tmp->SCp.this_residual && tmp->SCp.buffers_residual) {
-   ++tmp->SCp.buffer;
-   --tmp->SCp.buffers_residual;
-   tmp->SCp.this_residual = tmp->SCp.buffer->length;
-   tmp->SCp.ptr = sg_virt(tmp->SCp.buffer);
-   }
+   advance_sg_buffer(tmp);

count = sun3scsi_dma_xfer_len(hostdata, tmp);




Re: [PATCH 5/7] scsi: mac_scsi: Fix pseudo DMA implementation, take 2

2019-06-03 Thread Michael Schmitz

Hi Finn,

On 3/06/19 7:40 PM, Finn Thain wrote:



There are several other drivers that contain pieces of assembler code.


Does any driver contain assembler code for multiple architectures? I was
trying to avoid that -- though admittedly I don't yet have actual code for
the PDMA implementation for mac_scsi for Nubus PowerMacs.

I've seen that once, for one of the ESP drivers that were supported on 
both m68k and ppc (APUS, PPC upgrade to Amiga computers). But that 
driver was removed long ago (after 2.6?).


In that case, the assembly file did reside in drivers/scsi/. That still 
appears to be current practice (see drivers/scsi/arm/acornscsi-io.S).


Cheers,

    Michael




Re: linux-next: build failure after merge of the ecryptfs tree

2019-05-14 Thread Michael Schmitz

Hi,

Am 14.05.2019 um 13:22 schrieb Michael Schmitz:

Stephen,

I wasn't aware of the other asix module when submitting the phy driver.
The phy module gets autoloaded based on the PHY ID, so there's no reason
why it couldn't be renamed.

May I suggest ax88796b for the new module name?


I've got a patch series ready to go for this (compile tested).

I suppose this is meant to go through the net tree, Dave?

Cheers,

Michael




Cheers,

Michael



On 14/05/19 12:56 PM, Stephen Rothwell wrote:

Hi all,

[excessive quoting for new CC's]

On Tue, 14 May 2019 09:40:53 +0900 Masahiro Yamada
 wrote:

On Tue, May 14, 2019 at 9:16 AM Stephen Rothwell
 wrote:

I don't know why this suddenly appeared after merging the ecryptfs tree
since nothing has changed in that tree for some time (and nothing in that
tree seems relevant).

After merging the ecryptfs tree, today's linux-next build (x86_64
allmodconfig) failed like this:

scripts/Makefile.modpost:112: target '.tmp_versions/asix.mod'
doesn't match the target pattern
scripts/Makefile.modpost:113: warning: overriding recipe for target
'.tmp_versions/asix.mod'
scripts/Makefile.modpost:100: warning: ignoring old recipe for
target '.tmp_versions/asix.mod'
scripts/Makefile.modpost:127: target '.tmp_versions/asix.mod'
doesn't match the target pattern
scripts/Makefile.modpost:128: warning: overriding recipe for target
'.tmp_versions/asix.mod'
scripts/Makefile.modpost:113: warning: ignoring old recipe for
target '.tmp_versions/asix.mod'
make[2]: Circular .tmp_versions/asix.mod <- __modpost dependency
dropped.
Binary file .tmp_versions/asix.mod matches: No such file or directory
make[2]: *** [scripts/Makefile.modpost:91: __modpost] Error 1
make[1]: *** [Makefile:1290: modules] Error 2

The only clue I can see is that asix.o gets built in two separate
directories (drivers/net/{phy,usb}).

Module name should be unique.

If both are compiled as a module,
they have the same module names:

drivers/net/phy/asix.ko
drivers/net/usb/asix.ko

If you see .tmp_version directory,
you will see asix.mod

Perhaps, one overwrote the other,
or it already got broken somehow.

So, the latter of these drivers (drivers/net/phy/asix.c) was added in
v4.18-rc1 by commit

   31dd83b96641 ("net-next: phy: new Asix Electronics PHY driver")

If we can't have 2 modules with the same base name, is it too late to
change its name?

I am sort of surprised that no one else has hit this in the past year.


I have the following files in the object directory:

./.tmp_versions/asix.mod
./drivers/net/usb/asix.ko
./drivers/net/usb/asix.mod.o
./drivers/net/usb/asix.o
./drivers/net/usb/asix_common.o
./drivers/net/usb/asix_devices.o
./drivers/net/usb/.asix.ko.cmd
./drivers/net/usb/.asix.mod.o.cmd
./drivers/net/usb/.asix.o.cmd
./drivers/net/usb/asix.mod.c
./drivers/net/usb/.asix_common.o.cmd
./drivers/net/usb/.asix_devices.o.cmd
./drivers/net/phy/asix.ko
./drivers/net/phy/asix.o
./drivers/net/phy/.asix.ko.cmd
./drivers/net/phy/.asix.mod.o.cmd
./drivers/net/phy/asix.mod.o
./drivers/net/phy/asix.mod.c
./drivers/net/phy/.asix.o.cmd

./.tmp_versions/asix.mod

Looks like this:

--
drivers/net/phy/asix.ko
drivers/net/phy/asix.o


--

What you can't see above are the 256 NUL bytes at the end of the file
(followed by a NL).

This is from a -j 80 build.  Surely there is a race condition here
if the
file in .tmp_versions is only named for the base name of the module and
we have 2 modules with the same base name.

I removed that file and redid the build and it succeeded.

Mind you, I have no idea why this file was being rebuilt; the merge
only touched these files:

fs/ecryptfs/crypto.c
fs/ecryptfs/keystore.c

Puzzled ... :-(


Re: linux-next: build failure after merge of the ecryptfs tree

2019-05-13 Thread Michael Schmitz

Stephen,

I wasn't aware of the other asix module when submitting the phy driver. 
The phy module gets autoloaded based on the PHY ID, so there's no reason 
why it couldn't be renamed.


May I suggest ax88796b for the new module name?
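
The rename itself would only touch the build plumbing, roughly as below
(schematic sketch; the Kconfig symbol shown is the existing one, and the
final file name merged upstream is an assumption):

  # drivers/net/phy/: rename the source so the module becomes
  # ax88796b.ko instead of colliding with drivers/net/usb/asix.ko
  #   asix.c -> ax88796b.c
  # drivers/net/phy/Makefile:
  #   obj-$(CONFIG_ASIX_PHY) += ax88796b.o    (was: asix.o)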

Cheers,

    Michael



On 14/05/19 12:56 PM, Stephen Rothwell wrote:

Hi all,

[excessive quoting for new CC's]

On Tue, 14 May 2019 09:40:53 +0900 Masahiro Yamada 
 wrote:

On Tue, May 14, 2019 at 9:16 AM Stephen Rothwell  wrote:

I don't know why this suddenly appeared after merging the ecryptfs tree
since nothing has changed in that tree for some time (and nothing in that
tree seems relevant).

After merging the ecryptfs tree, today's linux-next build (x86_64
allmodconfig) failed like this:

scripts/Makefile.modpost:112: target '.tmp_versions/asix.mod' doesn't match the 
target pattern
scripts/Makefile.modpost:113: warning: overriding recipe for target 
'.tmp_versions/asix.mod'
scripts/Makefile.modpost:100: warning: ignoring old recipe for target 
'.tmp_versions/asix.mod'
scripts/Makefile.modpost:127: target '.tmp_versions/asix.mod' doesn't match the 
target pattern
scripts/Makefile.modpost:128: warning: overriding recipe for target 
'.tmp_versions/asix.mod'
scripts/Makefile.modpost:113: warning: ignoring old recipe for target 
'.tmp_versions/asix.mod'
make[2]: Circular .tmp_versions/asix.mod <- __modpost dependency dropped.
Binary file .tmp_versions/asix.mod matches: No such file or directory
make[2]: *** [scripts/Makefile.modpost:91: __modpost] Error 1
make[1]: *** [Makefile:1290: modules] Error 2

The only clue I can see is that asix.o gets built in two separate
directories (drivers/net/{phy,usb}).

Module name should be unique.

If both are compiled as a module,
they have the same module names:

drivers/net/phy/asix.ko
drivers/net/usb/asix.ko

If you see .tmp_version directory,
you will see asix.mod

Perhaps, one overwrote the other,
or it already got broken somehow.

So, the latter of these drivers (drivers/net/phy/asix.c) was added in
v4.18-rc1 by commit

   31dd83b96641 ("net-next: phy: new Asix Electronics PHY driver")

If we can't have 2 modules with the same base name, is it too late to
change its name?

I am sort of surprised that no one else has hit this in the past year.


I have the following files in the object directory:

./.tmp_versions/asix.mod
./drivers/net/usb/asix.ko
./drivers/net/usb/asix.mod.o
./drivers/net/usb/asix.o
./drivers/net/usb/asix_common.o
./drivers/net/usb/asix_devices.o
./drivers/net/usb/.asix.ko.cmd
./drivers/net/usb/.asix.mod.o.cmd
./drivers/net/usb/.asix.o.cmd
./drivers/net/usb/asix.mod.c
./drivers/net/usb/.asix_common.o.cmd
./drivers/net/usb/.asix_devices.o.cmd
./drivers/net/phy/asix.ko
./drivers/net/phy/asix.o
./drivers/net/phy/.asix.ko.cmd
./drivers/net/phy/.asix.mod.o.cmd
./drivers/net/phy/asix.mod.o
./drivers/net/phy/asix.mod.c
./drivers/net/phy/.asix.o.cmd

./.tmp_versions/asix.mod

Looks like this:

--
drivers/net/phy/asix.ko
drivers/net/phy/asix.o


--

What you can't see above are the 256 NUL bytes at the end of the file
(followed by a NL).

This is from a -j 80 build.  Surely there is a race condition here if the
file in .tmp_versions is only named for the base name of the module and
we have 2 modules with the same base name.

I removed that file and redid the build and it succeeded.

Mind you, I have no idea why this file was being rebuilt; the merge
only touched these files:

fs/ecryptfs/crypto.c
fs/ecryptfs/keystore.c

Puzzled ... :-(


Re: [PATCH v2] scsi: NCR5380: Mark expected switch fall-through

2019-02-28 Thread Michael Schmitz

Finn's version looks fine to me.

Cheers,

    Michael

On 1/03/19 2:16 PM, Finn Thain wrote:

On Thu, 28 Feb 2019, Gustavo A. R. Silva wrote:


In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.


This switch case is already marked. So I think the patch description
should state that this patch is actually a workaround for a gcc deficiency
which prevents it from locating the marker.


This patch fixes the following warning:

In file included from drivers/scsi/dmx3191d.c:48:
drivers/scsi/NCR5380.c: In function 'NCR5380_information_transfer':
drivers/scsi/NCR5380.c:1933:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (!hostdata->connected)
        ^
drivers/scsi/NCR5380.c:1937:5: note: here
   default:
   ^~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva 
---
Changes in v2:
  - Update commit log.
  - Move code comment after the default label and
retain reason for fall-through in comment as
requested by Michael Schmitz.

  drivers/scsi/NCR5380.c | 9 -
  1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 01c23d27f290..985d1c053578 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -1933,13 +1933,12 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 				if (!hostdata->connected)
 					return;
 
-				/* Fall through to reject message */
-
+				/* Fall through - to reject message */

This new hyphen is wrong and harms readability for humans.

I did confirm that gcc can be appeased by the use of a hyphen but not by
correct grammar such as "Fall through to reject message" or "Fall through.
Reject message."


+				default:
 					/*
-					 * If we get something weird that we aren't expecting,
-					 * reject it.
+					 * If we get something weird that we
+					 * aren't expecting, reject it.

This reformatting isn't relevant to this patch. The comments can be
improved however (see below).


 */
-   default:

Moving the 'default' keyword closer to the 'fall through' comment makes
sense to me -- I could understand if gcc had simple, unambiguous rules for
annotations.

Do compilers and static analysers agree as to what a correctly annotated
switch label should look like? If not, we would have to try to mangle code
and comments in such a way that might satisfy all of the failings in all
of the tools.


 					if (tmp == EXTENDED_MESSAGE)
 						scmd_printk(KERN_INFO, cmd,
 							    "rejecting unknown extended message code %02x, length %d\n",


Here's an alternative patch, which has the virtue that a simple heuristic
will work. This patch does not require that other static analysis tools
will follow gcc's weird rules about hyphens. (I assume they don't but I
didn't check.)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 7fed9bb72784..fe0535affc14 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -1932,13 +1932,13 @@ static void NCR5380_information_transfer(struct Scsi_Host *instance)
 				if (!hostdata->connected)
 					return;
 
-				/* Fall through to reject message */
-
+				/* Reject message */
+				/* Fall through */
+				default:
 					/*
 					 * If we get something weird that we aren't expecting,
-					 * reject it.
+					 * log it.
 					 */
-				default:
 					if (tmp == EXTENDED_MESSAGE)
 						scmd_printk(KERN_INFO, cmd,
 							    "rejecting unknown extended message code %02x, length %d\n",
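
As an aside: later kernels avoid the comment-parsing fragility entirely
with the fallthrough pseudo-keyword from <linux/compiler_attributes.h>,
which expands to __attribute__((__fallthrough__)) where supported. A
sketch, with a hypothetical case label:

switch (tmp) {
case SOME_MESSAGE:		/* hypothetical label */
	if (!hostdata->connected)
		return;
	fallthrough;		/* explicit, tool-friendly annotation */
default:
	if (tmp == EXTENDED_MESSAGE)
		scmd_printk(KERN_INFO, cmd, "rejecting message %02x\n", tmp);
	break;
}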



Re: [v4,1/9] net-next: phy: new Asix Electronics PHY driver

2019-01-21 Thread Michael Schmitz

Hi Andrew,

no objection for you to pick this up as part of a larger cleanup. I've 
tried to reconstruct how this happened (i.e. what other phy driver file 
I used as a 'template' for asix.c) - all I can say is that the 2.0+ 
boilerplate text was in my initial commit, and the incorrect SPDX tag 
was added in response to checkpatch complaints. So 2.0+ would be correct.
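
The agreed fix amounts to a one-line tag change, schematically (sketch
only; the final form depends on the cleanup series Andrew mentions below):

--- a/drivers/net/phy/asix.c
+++ b/drivers/net/phy/asix.c
-// SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0+

with the now-redundant "version 2 ... or later" boilerplate paragraph
dropped at the same time.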


Thomas: does that suit your purpose?

Cheers,

    Michael


On 21/01/19 6:43 AM, Andrew Lunn wrote:

On Fri, Jan 18, 2019 at 11:22:39AM +0100, Thomas Gleixner wrote:

Michael,

On Thu, 19 Apr 2018, Michael Schmitz wrote:


--- /dev/null
+++ b/drivers/net/phy/asix.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Driver for Asix PHYs
+ *
+ * Author: Michael Schmitz 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ */

This license information is broken. The SPDX license identifier and the
boilerplate text contradict each other. The SPDX id is GPL v2 only and the
boilerplate says v2 or later.

Hi Thomas

Please see:

https://www.spinics.net/lists/netdev/msg544312.html

The first two patches are simple SPDX converstions. Then it gets
interesting trying to sort out license inconsistencies.

Andrew


Re: [PATCH v8 01/25] scsi/atari_scsi: Don't select CONFIG_NVRAM

2018-12-29 Thread Michael Schmitz

Christophe,

Am 30.12.2018 um 05:55 schrieb LEROY Christophe:

Michael Schmitz  a écrit :


Hi Finn,

Am 29.12.2018 um 14:06 schrieb Finn Thain:

On Fri, 28 Dec 2018, LEROY Christophe wrote:

diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index 89f5154c40b6..99e5729d910d 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -755,9 +755,10 @@ static int __init atari_scsi_probe(struct platform_device *pdev)
if (ATARIHW_PRESENT(TT_SCSI) && setup_sg_tablesize >= 0)
atari_scsi_template.sg_tablesize = setup_sg_tablesize;

-if (setup_hostid >= 0) {
+if (setup_hostid >= 0)
atari_scsi_template.this_id = setup_hostid & 7;
-} else {
+#ifdef CONFIG_NVRAM
+else


Such ifdefs should be avoided in C files.
It would be better to use

} else if (IS_ENABLED(CONFIG_NVRAM)) {



I don't like #ifdefs either. However, as the maintainer of this file,
I am
okay with this one.

The old #ifdef CONFIG_NVRAM conditional compilation convention that gets
used here and under drivers/video/fbdev could probably be improved upon
but I consider this to be out-of-scope for this series, which is
complicated enough.

And as explained in the commit log, CONFIG_NVRAM=y and CONFIG_NVRAM=m are
treated differently by drivers. Therefore, IS_ENABLED would be incorrect.


IS_BUILTIN(CONFIG_NVRAM) is probably what Christophe really meant to
suggest.


Doesn't #ifdef mean either y or m? So the same as IS_ENABLED()?


#ifdef CONFIG_NVRAM is used if you want to match CONFIG_NVRAM=y. For 
CONFIG_NVRAM=m, you'd use #ifdef CONFIG_NVRAM_MODULE.
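
For reference, a compact summary of how the kconfig.h helpers map onto the
CONFIG_NVRAM settings (semantics per include/linux/kconfig.h):

/* CONFIG_NVRAM=y defines CONFIG_NVRAM;
 * CONFIG_NVRAM=m defines CONFIG_NVRAM_MODULE instead.
 *
 * IS_BUILTIN(CONFIG_NVRAM)  -> true only for =y
 * IS_MODULE(CONFIG_NVRAM)   -> true only for =m
 * IS_ENABLED(CONFIG_NVRAM)  -> true for =y or =m
 */
#ifdef CONFIG_NVRAM		/* matches only =y */
	/* built-in nvram_* symbols are directly reachable here */
#endif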


Cheers,

Michael




Christophe



Or (really going out on a limb here):

IS_BUILTIN(CONFIG_NVRAM) ||
( IS_MODULE(CONFIG_ATARI_SCSI) && IS_ENABLED(CONFIG_NVRAM) )

Not that I'd advocate that, for this series.

Cheers,

Michael





Re: [PATCH v8 01/25] scsi/atari_scsi: Don't select CONFIG_NVRAM

2018-12-28 Thread Michael Schmitz

Hi Finn,

Am 26.12.2018 um 13:37 schrieb Finn Thain:

On powerpc, setting CONFIG_NVRAM=n builds a kernel with no NVRAM support.
Setting CONFIG_NVRAM=m enables the /dev/nvram misc device module without
enabling NVRAM support in drivers. Setting CONFIG_NVRAM=y enables the
misc device (built-in) and also enables NVRAM support in drivers.

m68k shares the valkyriefb driver with powerpc, and since that driver uses
NVRAM, it is affected by CONFIG_ATARI_SCSI, because of the use of
"select NVRAM".

Adopt the powerpc convention on m68k to avoid surprises.

Signed-off-by: Finn Thain 
Tested-by: Christian T. Steigies 


Acked-by: Michael Schmitz 


---
This patch temporarily disables CONFIG_NVRAM on Atari, to prevent build
failures when bisecting the rest of this patch series. It gets enabled
again with the introduction of CONFIG_HAVE_ARCH_NVRAM_OPS, once the
nvram_* global functions have been moved to an ops struct.
---
 drivers/char/Kconfig      |  5 +----
 drivers/scsi/Kconfig      |  6 +++---
 drivers/scsi/atari_scsi.c |  7 ++++---
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 9d03b2ff5df6..5b54595dfe30 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -236,7 +236,7 @@ source "drivers/char/hw_random/Kconfig"

 config NVRAM
tristate "/dev/nvram support"
-   depends on ATARI || X86 || GENERIC_NVRAM
+   depends on X86 || GENERIC_NVRAM
---help---
  If you say Y here and create a character special file /dev/nvram
  with major number 10 and minor number 144 using mknod ("man mknod"),
@@ -254,9 +254,6 @@ config NVRAM
  should NEVER idly tamper with it. See Ralf Brown's interrupt list
  for a guide to the use of CMOS bytes by your BIOS.

- On Atari machines, /dev/nvram is always configured and does not need
- to be selected.
-
  To compile this driver as a module, choose M here: the
  module will be called nvram.

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 640cd1b31a18..924eb69e7fc4 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1381,14 +1381,14 @@ config ATARI_SCSI
tristate "Atari native SCSI support"
depends on ATARI && SCSI
select SCSI_SPI_ATTRS
-   select NVRAM
---help---
  If you have an Atari with built-in NCR5380 SCSI controller (TT,
  Falcon, ...) say Y to get it supported. Of course also, if you have
  a compatible SCSI controller (e.g. for Medusa).

- To compile this driver as a module, choose M here: the
- module will be called atari_scsi.
+ To compile this driver as a module, choose M here: the module will
+ be called atari_scsi. If you also enable NVRAM support, the SCSI
+ host's ID is taken from the setting in TT RTC NVRAM.

  This driver supports both styles of NCR integration into the
  system: the TT style (separate DMA), and the Falcon style (via
diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index 89f5154c40b6..99e5729d910d 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -755,9 +755,10 @@ static int __init atari_scsi_probe(struct platform_device *pdev)
if (ATARIHW_PRESENT(TT_SCSI) && setup_sg_tablesize >= 0)
atari_scsi_template.sg_tablesize = setup_sg_tablesize;

-   if (setup_hostid >= 0) {
+   if (setup_hostid >= 0)
atari_scsi_template.this_id = setup_hostid & 7;
-   } else {
+#ifdef CONFIG_NVRAM
+   else
/* Test if a host id is set in the NVRam */
if (ATARIHW_PRESENT(TT_CLK) && nvram_check_checksum()) {
unsigned char b = nvram_read_byte(16);
@@ -768,7 +769,7 @@ static int __init atari_scsi_probe(struct platform_device *pdev)
if (b & 0x80)
atari_scsi_template.this_id = b & 7;
}
-   }
+#endif

/* If running on a Falcon and if there's TT-Ram (i.e., more than one
 * memory block, since there's always ST-Ram in a Falcon), then



Re: [PATCH v8 01/25] scsi/atari_scsi: Don't select CONFIG_NVRAM

2018-12-28 Thread Michael Schmitz

Hi Finn,

Am 29.12.2018 um 15:34 schrieb Finn Thain:

On Sat, 29 Dec 2018, Michael Schmitz wrote:



IS_BUILTIN(CONFIG_NVRAM) is probably what Christophe really meant to suggest.

Or (really going out on a limb here):

IS_BUILTIN(CONFIG_NVRAM) ||
( IS_MODULE(CONFIG_ATARI_SCSI) && IS_ENABLED(CONFIG_NVRAM) )

Not that I'd advocate that, for this series.



Well, you are a maintainer for atari_scsi.c.

Are you saying that you want IS_BUILTIN(CONFIG_NVRAM) used here instead of
ifdef?


No, just pointing out that there would be a way to avoid the ifdef 
without messing up driver behaviour. I'm fine with the ifdef - not least 
because it clearly eliminates code that would be unreachable.


(On second thought - I don't want to speculate whether there are weird 
compiler options that could result in the nvram_check_checksum and 
nvram_read_bytes symbols still being referenced in the final link, even 
though IS_BUILTIN(CONFIG_NVRAM) always evaluates to false. Best leave 
this as-is.)



OTOH, if you approve of the existing patch, please send your acked-by.


Of course - I'd seen Geert's acked-by on some of the patches and forgot 
to check which still required acks.


Cheers,

Michael




Re: [PATCH v8 01/25] scsi/atari_scsi: Don't select CONFIG_NVRAM

2018-12-28 Thread Michael Schmitz

Hi Finn,

Am 29.12.2018 um 14:06 schrieb Finn Thain:

On Fri, 28 Dec 2018, LEROY Christophe wrote:

diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index 89f5154c40b6..99e5729d910d 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -755,9 +755,10 @@ static int __init atari_scsi_probe(struct platform_device *pdev)
if (ATARIHW_PRESENT(TT_SCSI) && setup_sg_tablesize >= 0)
atari_scsi_template.sg_tablesize = setup_sg_tablesize;

-   if (setup_hostid >= 0) {
+   if (setup_hostid >= 0)
atari_scsi_template.this_id = setup_hostid & 7;
-   } else {
+#ifdef CONFIG_NVRAM
+   else


Such ifdefs should be avoided in C files.
It would be better to use

} else if (IS_ENABLED(CONFIG_NVRAM)) {



I don't like #ifdefs either. However, as the maintainer of this file, I am
okay with this one.

The old #ifdef CONFIG_NVRAM conditional compilation convention that gets
used here and under drivers/video/fbdev could probably be improved upon
but I consider this to be out-of-scope for this series, which is
complicated enough.

And as explained in the commit log, CONFIG_NVRAM=y and CONFIG_NVRAM=m are
treated differently by drivers. Therefore, IS_ENABLED would be incorrect.


IS_BUILTIN(CONFIG_NVRAM) is probably what Christophe really meant to 
suggest.


Or (really going out on a limb here):

IS_BUILTIN(CONFIG_NVRAM) ||
( IS_MODULE(CONFIG_ATARI_SCSI) && IS_ENABLED(CONFIG_NVRAM) )

Not that I'd advocate that, for this series.

Cheers,

Michael





Re: [RFC PATCH v2 09/14] m68k: hp300: Remove hp300_gettimeoffset()

2018-11-24 Thread Michael Schmitz

Hi Finn,

Am 25.11.2018 um 14:15 schrieb Finn Thain:

Maybe the timer interrupt has a sufficiently high priority and latency is
low? Maybe cia_set_irq() is really expensive?

I don't know the platform well enough so I'm inclined to revert. We can
benchmark gettimeofday syscalls on elgar but is that hardware
representative of other relevant models?


I suppose the CIA is on the main board, so running with the slower clock 
speed that you'd see with a vanilla Amiga without 060 accelerator board. 
Ought to be representative enough?


Adrian has a few other Amigas with different hardware base, so we might 
be able to get test coverage on more than one model.


Cheers,

Michael



[1]
https://github.com/mamedev/mame/commit/e2ed0490520f538c346c8bdaa9084bcbc43427cb

[2]
http://vice-emu.sourceforge.net/vice_9.html

[3]
https://www.commodore.ca/manuals/funet/cbm/documents/chipdata/cia6526.zip



Re: [RFC PATCH v2 07/14] m68k: atari: Convert to clocksource API

2018-11-23 Thread Michael Schmitz



Am 20.11.2018 um 23:02 schrieb Andreas Schwab:

On Nov 20 2018, Linus Walleij  wrote:


Yes you already see the same as I see: this chip MK68901 has
no less than four timers. I bet the kernel is just using one of them,
out of habit.


Note that not all timers can be used freely.  Some of them are hardwired
to generate the clock for the serial interfaces.


Timer A is used by the DMA sound driver - no workaround possible there.

Timer B is used by the framebuffer driver, but it's used only once to 
reprogram the screen base address at driver init. This one could 
potentially be used after framebuffer init to improve the clocksource 
accuracy.


Timer D is already used to generate timer interrupts used to poll the 
ROM port network card / USB adapters. This timer is initialized early in 
the boot process, which prevents using the MFP UART as serial console 
(something that I hadn't properly considered before). I'll send a patch 
for that. I'll also consider using timer B or timer C interrupts instead 
to poll ROM port hardware.


There are no serial drivers anymore that could use the MFP UART 
hardware, so that point is somewhat moot at present.


Cheers,

Michael


Andreas.



Re: [RFC PATCH v2 07/14] m68k: atari: Convert to clocksource API

2018-11-19 Thread Michael Schmitz

Hi Finn,

this fixes the ssh timeout issues I reported for the earlier versions. 
Can't see a large degradation of resolution either.


The race in the previous versions may have been exacerbated in part by 
an incorrect assumption about the likelihood of counter wrap-around in 
the old atari_gettimeoffset() implementation. The comment in the code 
puts the probability of underflow at just 2%, and the interrupt-pending 
bit check is skipped unless the counter is within that 2% window. It 
does not follow that the counter never wraps outside that window. I 
have found the distribution of counter values observed with interrupt 
pending to be quite long-tailed, with a 50% threshold just beginning to 
catch the majority of wrap-arounds.


Your solution to stop the clock instead of allowing a jump backwards is 
much safer in this context.


Am 19.11.2018 um 14:10 schrieb Finn Thain:

Add a platform clocksource by adapting the existing arch_gettimeoffset
implementation.

Normally the MFP timer C interrupt flag would be used to check for
timer counter wrap-around. Unfortunately, that flag gets cleared by the
MFP itself (due to automatic EOI mode). This means that
mfp_timer_c_handler() and atari_read_clk() must race when accounting
for counter wrap-around.

That problem is avoided here by effectively stopping the clock when it
might otherwise jump backwards. This harms clock accuracy; the result
is not much better than the jiffies clocksource. Also, clock error is
no longer uniformly distributed.

Signed-off-by: Finn Thain 
Acked-by: Linus Walleij 


Tested-by: Michael Schmitz 

---
TODO: find a spare counter for the clocksource, rather than hanging
it off the HZ timer.

It would be simpler to adopt the 'jiffies' clocksource here
(c.f. patch for the hp300 platform in this series).

Changed since v1:
 - Moved clk_total access to within the irq lock.
 - Renamed mfp_timer_handler and mfptimer_handler.
 - Avoid accessing the timer interrupt flag in atari_read_clk(). To
get monotonicity, keep track of the previous timer counter value.
---
 arch/m68k/atari/time.c | 48 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/arch/m68k/atari/time.c b/arch/m68k/atari/time.c
index fafa20f75ab9..914832e55ec5 100644
--- a/arch/m68k/atari/time.c
+++ b/arch/m68k/atari/time.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include <linux/clocksource.h>
 #include 
 #include 

@@ -24,12 +25,27 @@
 DEFINE_SPINLOCK(rtc_lock);
 EXPORT_SYMBOL_GPL(rtc_lock);

+static u64 atari_read_clk(struct clocksource *cs);
+
+static struct clocksource atari_clk = {
+   .name   = "mfp",
+   .rating = 100,
+   .read   = atari_read_clk,
+   .mask   = CLOCKSOURCE_MASK(32),
+   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
+};
+
+static u32 clk_total;
+static u32 last_timer_count;
+
 static irqreturn_t mfp_timer_c_handler(int irq, void *dev_id)
 {
irq_handler_t timer_routine = dev_id;
unsigned long flags;

local_irq_save(flags);
+   last_timer_count = st_mfp.tim_dt_c;
+   clk_total += INT_TICKS;
timer_routine(0, NULL);
local_irq_restore(flags);

@@ -44,32 +60,30 @@ atari_sched_init(irq_handler_t timer_routine)
 /* start timer C, div = 1:100 */
 st_mfp.tim_ct_cd = (st_mfp.tim_ct_cd & 15) | 0x60;
 /* install interrupt service routine for MFP Timer C */
-if (request_irq(IRQ_MFP_TIMC, mfp_timer_c_handler, 0, "timer",
+if (request_irq(IRQ_MFP_TIMC, mfp_timer_c_handler, IRQF_TIMER, "timer",
 timer_routine))
pr_err("Couldn't register timer interrupt\n");
+
+    clocksource_register_hz(&atari_clk, INT_CLK);
 }

 /* ++andreas: gettimeoffset fixed to check for pending interrupt */

-#define TICK_SIZE 10000
-
-/* This is always executed with interrupts disabled.  */
-u32 atari_gettimeoffset(void)
+static u64 atari_read_clk(struct clocksource *cs)
 {
-  u32 ticks, offset = 0;
-
-  /* read MFP timer C current value */
-  ticks = st_mfp.tim_dt_c;
-  /* The probability of underflow is less than 2% */
-  if (ticks > INT_TICKS - INT_TICKS / 50)
-/* Check for pending timer interrupt */
-if (st_mfp.int_pn_b & (1 << 5))
-  offset = TICK_SIZE;
+   unsigned long flags;
+   u32 ticks;

-  ticks = INT_TICKS - ticks;
-  ticks = ticks * 10000L / INT_TICKS;
+   local_irq_save(flags);
+   ticks = st_mfp.tim_dt_c;
+   if (ticks > last_timer_count) /* timer wrapped since last interrupt */
+   ticks = last_timer_count;
+   last_timer_count = ticks;
+   ticks = INT_TICKS - ticks;
+   ticks += clk_total;
+   local_irq_restore(flags);

-  return (ticks + offset) * 1000;
+   return ticks;
 }





Re: [RFC PATCH 01/13] arm: Fix mutual exclusion in arch_gettimeoffset

2018-11-16 Thread Michael Schmitz

Hi Finn,

Am 17.11.2018 um 11:49 schrieb Finn Thain:

On Fri, 16 Nov 2018, Russell King - ARM Linux wrote:



The EBSA110 is probably in a similar boat - I don't remember whether it
had 16MB or 32MB as the maximal amount of memory, but memory was getting
tight with some kernels even running a minimalist userspace.

So, it's probably time to say goodbye to the kernel support for these
platforms.



Your call.

Note that removing code from mainline won't help users obtain older,
smaller, -stable kernel releases, free from the bug we were discussing.
(The bug appeared in Linux v2.6.32.)

BTW, if you did want to boot Linux on a 16 MB system, you do have some
options.

https://lwn.net/Articles/741494/
https://lwn.net/Articles/608945/
https://tiny.wiki.kernel.org/

Contributing to this kind of effort probably has value for IoT
deployments. I suspect it also cuts a small amount of bloat from a large
number of other Linux systems.


I boot 4.19 on a system with 14 MB RAM - 10 MB remain for user space 
once the kernel loads. After remote login, 4 MB of that remain free for 
buffers or user code (1.5 MB swapped).
That's with sysvinit - Christian tried to boot a systemd userland on his 
Falcon once, and ran out of memory before swap could be activated.


I shouldn't say 16 or 32 MB are hopeless. And the 2.6 kernels were a lot 
more sluggish to my recollection.


Cheers,

Michael


Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-14 Thread Michael Schmitz

Hi Finn

Am 15.11.2018 um 12:54 schrieb Michael Schmitz:

That one does appear to work - different versions of ARAnyM, and
different userland versions though. I'll test that again with the setup
that saw c606b5cf902 fail.


Still fails on that emulator / userland.


Must be a quirk of ARAnyM 1.0.2 (or powerpc). With 0.9.15 on x86_64,
it's fine.

I'm sufficiently convinced to try this on actual hardware now.


Well, it sort of works - I've seen one login timeout on the console 
under load (less than 10 seconds after typing in the password), but most 
attempts went OK. Couldn't log in through SSH without increasing the 
LoginGraceTime parameter though.


I usually get reliable login using ssh key files with the default 
setting of 120 seconds (takes around 90 to 100 seconds to complete). 
With your patch, even increasing this to 4800 doesn't result in reliable 
login.


The error I see in the logs is 'fatal: Timeout before authentication'.

Cheers,

Michael



Re: [RFC PATCH 01/13] arm: Fix mutual exclusion in arch_gettimeoffset

2018-11-14 Thread Michael Schmitz



On 14/11/18 8:58 PM, Russell King - ARM Linux wrote:



Are you saying that's not possible on arm, because the current timer rundown
counter can't be read while the timer is running?

If I were to run a second timer at higher rate for clocksource, but keeping
the 10 ms timer as clock event (could easily do that by using timer D on
Atari Falcon) - how would that improve my timekeeping? Clock events still
only happen 10 ms apart ...

Ah, I think you're talking about something else.

You seem to be talking about what happens when time keeping interrupts
happen.

That's what I understood your comment was about.

I'm talking about the resolution of gettimeofday() and the other
interfaces that are used (eg) for packet capture timestamping and
the like - the _user_ visible effects of the timekeeping system.

With the existing implementation, all these have better-than-jiffy
resolution - in the case of RPC, that's 500ns, rather than 10ms
which would be the case without gettimeoffset().  Removing
gettimeoffset() as Finn has proposed without preserving that
resolution is simply completely unacceptable.


I agree - but Finn had also offered to supply patches to arm that would 
have added a clocksource_read() function very much like for those m68k 
platforms that had used gettimeoffset(). I wondered what reason there 
was for these not to work on arm.


I now realize you'd never seen that offer.

The proposal to drop architectures still relying on arch_gettimeoffset() 
had been raising enough of a response on linux-m68k to make me conclude 
that same proposal had been kicked on to other arch MLs affected as 
well. I'm a bit naive that way.


Now your criticism of arch_gettimeoffset() (missing timer rollover 
because the timer interrupt has been cleared by the interrupt handler) 
still stands. I just can't find the offset() functions shown in the 
5cfc8ee0bb51 patch. Any hints?


Cheers,

    Michael




Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-14 Thread Michael Schmitz

Hi Finn,

On 14/11/18 3:58 PM, Michael Schmitz wrote:

Hi Finn,

Am 14.11.2018 um 14:08 schrieb Michael Schmitz:

Can you also test tree fbf8405cd982 please?


My tests were on c606b5cf902 in case it wasn't clear. I've now seen
fbf8405cd982, one moment please ...

That one does appear to work - different versions of ARAnyM, and
different userland versions though. I'll test that again with the setup
that saw c606b5cf902 fail.


Still fails on that emulator / userland.

Must be a quirk of ARAnyM 1.0.2 (or powerpc). With 0.9.15 on x86_64, 
it's fine.


I'm sufficiently convinced to try this on actual hardware now.

Cheers,

    Michael




Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-13 Thread Michael Schmitz

Hi Finn,

Am 14.11.2018 um 14:08 schrieb Michael Schmitz:

Can you also test tree fbf8405cd982 please?


My tests were on c606b5cf902 in case it wasn't clear. I've now seen
fbf8405cd982, one moment please ...

That one does appear to work - different versions of ARAnyM, and
different userland versions though. I'll test that again with the setup
that saw c606b5cf902 fail.


Still fails on that emulator / userland.

Cheers,

Michael





Re: [RFC PATCH 01/13] arm: Fix mutual exclusion in arch_gettimeoffset

2018-11-13 Thread Michael Schmitz



On 14/11/18 12:43 PM, Russell King - ARM Linux wrote:

On Wed, Nov 14, 2018 at 08:55:37AM +1100, Finn Thain wrote:

On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:


On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:

You could remove the old arch_gettimeoffset API without dropping any
platforms.

If no-one converts a given platform to the clocksource API it would mean
that the default 'jiffies' clocksource will get used on that platform.

Clock resolution and timer precision would be degraded, but that might not
matter.

Anyway, if someone who has this hardware is willing to test a clocksource
API conversion, they can let me know and I'll attempt that patch.

There's reasons why that's not appropriate - such as not having two
separate timers in order to supply a clocksource and separate clock
event.

Not all hardware is suited to the clocksource + clockevent idea.


Sorry, I don't follow.

AFAIK, clocksources and clock event devices are orthogonal concepts. There
are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and
every other combination).

A clocksource read method just provides a cycle count, and in this sense
arch_gettimeoffset() is equivalent to a clocksource.

A clocksource provides a cycle counter that monotonically changes and
does not wrap between clockevent events.

A clock event is responsible for providing events to the system when
some work is needing to be done, limited by the wrap interval of the
clocksource.

Each time the clock event triggers an interrupt, the clocksource is
read to determine how much time has passed, using:

count = (new_value - old_value) & available_bits
nanosecs = count * scale >> shift;

If you try to combine the clocksource and clockevent because you only
have a single counter, and the counter has the behaviour of:
- counting down towards zero
- when reaching zero, triggers an interrupt, and reloads with N

then this provides your clockevent, but you can't use this as a clock
source, because each time you receive an interrupt and try to read the
counter value, it will be approximately the same value.  This means
that the above calculation fails to register the correct number of
nanoseconds passing.  Hence, this does not work.


So we'd still have to use jiffies + interpolation from the current timer 
rundown counter as clocksource (since that will be monotonic and won't 
wrap)?


The way Finn did the clocksource for m68k, the clocksource counter does 
behave as it should (monotonic, doesn't wrap), at least so far as I've 
tested. Using the same timer for clocksource and clock events will 
degrade timekeeping based on timer events to jiffies resolution, as you 
point out. That would already have been the case before, so no gain in 
resolution there. Other timekeeping would have worked at higher resolution 
before (interpolating through arch_gettimeoffset) just the same as now 
(interpolating through clocksource reads). Finn's clocksource read code 
essentially is arch_gettimeoffset under a new name.


Are you saying that's not possible on arm, because the current timer 
rundown counter can't be read while the timer is running?


If I were to run a second timer at higher rate for clocksource, but 
keeping the 10 ms timer as clock event (could easily do that by using 
timer D on Atari Falcon) - how would that improve my timekeeping? Clock 
events still only happen 10 ms apart ...


Cheers,

    Michael



Also note where I said above that the clock event device must be able
to provide an interrupt _before_ the clocksource wraps - clearly with
such a timer, that is utterly impossible.

The simple solution is to not use such a counter as the clocksource,
which means you fall back to using the jiffies clocksource, and your
timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
want a 1kHz timer interval.  For most applications, that's simply way
too coarse, as was realised in the very early days of Linux.

If only there were a way to interpolate between timer interrupts...
which is exactly what arch_gettimeoffset() does, and is a historical
remnant of the pre-clocksource/clockevent days of the kernel - but
it is the only way to get better resolution from this sort of setup.
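
A minimal sketch of the delta arithmetic described above (illustrative 
names, not from any driver). With a single reload-to-N down-counter that 
is only read in its own interrupt handler, 'new' is always close to N, 
so the computed delta stays near zero and time appears to stand still:

/* Wrap-safe conversion of two clocksource readings to nanoseconds. */
static u64 cyc_to_ns(u64 new, u64 old, u64 mask, u32 mult, u32 shift)
{
	u64 delta = (new - old) & mask;	/* cycles elapsed, modulo wrap */

	return (delta * mult) >> shift;	/* scale cycles to nanoseconds */
}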



Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-13 Thread Michael Schmitz

Hi Finn,

On 14/11/18 11:11 AM, Finn Thain wrote:

On Tue, 13 Nov 2018, Michael Schmitz wrote:


Running a recent kernel under ARAnyM shows 40 ns resolution so the Atari
hardware emulation is a little more complete.


You mean, 40 us resolution, right?


Sorry, typo. Should have been us of course.




Using that for initial tests, I can confirm that timer resolution is
reduced to 10ms after patch 6, and gets restored to 40ns after applying
the full series

Thanks for testing!


(once clocksource_init runs, that is - the first part of the boot is at
10ms resolution only, a regression compared to with use of
arch_gettimeoffset).


Sounds like a theoretical regression (?)

Is there any need for more precise timers (I mean, better than 40 us)
before clocksource_init runs?


I don't think so, given that we can run just fine with 10 ms resolution.




Unfortunately, I can't log in at the console with all patches applied. I
immediately get the 'login timeout exceeded' message. Weird...


I didn't see that in my tests... Was this aranym or real hardware or both?


ARAnyM only so far.



Can you also test tree fbf8405cd982 please?

My tests were on c606b5cf902 in case it wasn't clear. I've now seen 
fbf8405cd982, one moment please ...


That one does appear to work - different versions of ARAnyM, and 
different userland versions though. I'll test that again with the setup 
that saw c606b5cf902 fail.


Cheers,

    Michael





Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-13 Thread Michael Schmitz

Hi Finn,

Am 13.11.2018 um 19:15 schrieb Finn Thain:

On Tue, 13 Nov 2018, Michael Schmitz wrote:




(It appears that a QEMU-emulated Mac does not benefit from having a
clocksource that's more accurate than the 'jiffies' clocksource, in
spite of "clocksource: Switched to clocksource via1".)


With the current code, kernel log timestamps have 10 ms resolution on
Atari. Time resolution of times reported by initcall_debug is roughly 40
us. I'd expect that changes with falling back to jiffies only. Might be
worth a try on QEMU Mac.


The initcall debug output shows the same precision as my earlier tests
with userland. The VIA timer only gets updated when QEMU wants to update
it, which seems to be about every 9765 us. Hence, using the 'jiffies'
clocksource would be effectively no regression for a virtual Mac.


Running a recent kernel under ARAnyM shows 40 ns resolution so the Atari 
hardware emulation is a little more complete.


Using that for initial tests, I can confirm that timer resolution is 
reduced to 10ms after patch 6, and gets restored to 40ns after applying 
the full series (once clocksource_init runs, that is - the first part of 
the boot is at 10ms resolution only, a regression compared to with use 
of arch_gettimeoffset).


Unfortunately, I can't log in at the console with all patches applied. I 
immediately get the 'login timeout exceeded' message. Weird...


Cheers,

Michael



Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-12 Thread Michael Schmitz

Hi Finn,

Am 13.11.2018 um 16:14 schrieb Finn Thain:

On Tue, 13 Nov 2018, Michael Schmitz wrote:


Hi Finn,

Am 12.11.2018 um 22:06 schrieb Finn Thain:

On Mon, 12 Nov 2018, Geert Uytterhoeven wrote:


Hi Finn,

Thanks for your patch!

On Mon, Nov 12, 2018 at 5:46 AM Finn Thain 
wrote:

The functions that implement arch_gettimeoffset are re-used by
new clocksource drivers in subsequent patches.


Disabling this first affects functionality during bisection, right?



It means that all platforms have to use the 'jiffies' clocksource.


So all that happens is timer granularity drops to 10ms, then gets restored by
the later patches?



Yes, that was the plan, but I can't confirm that it worked out as I don't
have any physical 68k hardware in front of me right now. If you can
confirm this on your Atari Falcon, that would be great.


Will do.


(It appears that a QEMU-emulated Mac does not benefit from having a
clocksource that's more accurate than the 'jiffies' clocksource, in spite
of "clocksource: Switched to clocksource via1".)


With the current code, kernel log timestamps have 10 ms resolution on 
Atari. Time resolution of times reported by initcall_debug is roughly 40 
us. I'd expect that changes with falling back to jiffies only. Might be 
worth a try on QEMU Mac.


Cheers,

Michael


The latest patches can be found at
https://github.com/fthain/linux/commits/mac68k-queue/



Re: [RFC PATCH 06/13] m68k: Drop ARCH_USES_GETTIMEOFFSET

2018-11-12 Thread Michael Schmitz

Hi Finn,

Am 12.11.2018 um 22:06 schrieb Finn Thain:

On Mon, 12 Nov 2018, Geert Uytterhoeven wrote:


Hi Finn,

Thanks for your patch!

On Mon, Nov 12, 2018 at 5:46 AM Finn Thain  wrote:

The functions that implement arch_gettimeoffset are re-used by
new clocksource drivers in subsequent patches.


Disabling this first affects functionality during bisection, right?



It means that all platforms have to use the 'jiffies' clocksource.


So all that happens is timer granularity drops to 10ms, then gets 
restored by the later patches?


I doubt that would be a large enough regression to matter for bisection, 
but the way you reuse the arch_gettimeoffset() code for the new 
read_clock() functions makes reordering of this patch to the end of the 
series impossible.


Best you can do, under the circumstances.

Cheers,

Michael



Re: [PATCH] ata: add Buddha PATA controller driver

2018-10-31 Thread Michael Schmitz

Hi Adrian,

my fix is evidently incomplete - I just crashed elgar trying to remove 
the pata_buddha module, sorry. Must've done something silly.


So no, can't post a patch to add module_exit just yet.

Cheers,

Michael


Am 31.10.2018 um 23:06 schrieb John Paul Adrian Glaubitz:

Hi!

On 10/25/18 9:38 PM, John Paul Adrian Glaubitz wrote:

Tested-by: John Paul Adrian Glaubitz 

Michael wants to make some changes to the driver though, I think. He has access
to the Amiga 4000, in any case, so he can also test them.


@Michael: Can you post an updated version of the driver, rebased for 4.19
  and with your patch?

Adrian
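
For reference, a rough sketch of the kind of unload path under 
discussion - assuming the probe code were converted to a registered 
zorro_driver; pata_buddha_probe and pata_buddha_remove are hypothetical 
names here, not taken from the posted patch:

/* Hypothetical remove path so the module can be unloaded cleanly. */
static void pata_buddha_remove(struct zorro_dev *z)
{
	struct ata_host *host = zorro_get_drvdata(z);

	ata_host_detach(host);	/* tears down the ports, frees the host */
}

static struct zorro_driver pata_buddha_driver = {
	.name	= "pata_buddha",
	.probe	= pata_buddha_probe,	/* assumed conversion of the init code */
	.remove	= pata_buddha_remove,
	/* .id_table omitted in this sketch */
};

module_driver(pata_buddha_driver, zorro_register_driver,
	      zorro_unregister_driver);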



Re: [PATCH] ata: add Buddha PATA controller driver

2018-10-18 Thread Michael Schmitz

Hi Bartlomiej,

On 19/10/18 01:29, Bartlomiej Zolnierkiewicz wrote:

Add Buddha PATA controller driver. It enables libata support for
the Buddha, Catweasel and X-Surf expansion boards on the Zorro
expansion bus.

Cc: John Paul Adrian Glaubitz 
Cc: Michael Schmitz 
Cc: Geert Uytterhoeven 
Signed-off-by: Bartlomiej Zolnierkiewicz 
---
John, please test if possible..

  drivers/ata/Kconfig   |   12 ++
  drivers/ata/Makefile  |1
  drivers/ata/pata_buddha.c |  257 ++
  3 files changed, 270 insertions(+)

Index: b/drivers/ata/Kconfig
===
--- a/drivers/ata/Kconfig   2018-10-18 14:18:02.766452406 +0200
+++ b/drivers/ata/Kconfig   2018-10-18 14:18:02.766452406 +0200
@@ -965,6 +965,18 @@ config PATA_GAYLE
  
  	  If unsure, say N.
  
+config PATA_BUDDHA

+   tristate "Buddha/Catweasel/X-Surf PATA support"
+   depends on ZORRO
+   help
+ This option enables support for the IDE interfaces
+ on the Buddha, Catweasel and X-Surf expansion boards
+ on the Zorro expansion bus. It supports up to two
+ interfaces on the Buddha, three on the Catweasel and
+ two on the X-Surf.
+
+ If unsure, say N.
+
  config PATA_ISAPNP
tristate "ISA Plug and Play PATA support"
depends on ISAPNP
Index: b/drivers/ata/Makefile
===
--- a/drivers/ata/Makefile  2018-10-18 14:18:02.766452406 +0200
+++ b/drivers/ata/Makefile  2018-10-18 14:18:02.766452406 +0200
@@ -98,6 +98,7 @@ obj-$(CONFIG_PATA_WINBOND)+= pata_sl82c
  obj-$(CONFIG_PATA_CMD640_PCI) += pata_cmd640.o
  obj-$(CONFIG_PATA_FALCON) += pata_falcon.o
  obj-$(CONFIG_PATA_GAYLE)  += pata_gayle.o
+obj-$(CONFIG_PATA_BUDDHA)  += pata_buddha.o
  obj-$(CONFIG_PATA_ISAPNP) += pata_isapnp.o
  obj-$(CONFIG_PATA_IXP4XX_CF)  += pata_ixp4xx_cf.o
  obj-$(CONFIG_PATA_MPIIX)  += pata_mpiix.o
Index: b/drivers/ata/pata_buddha.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ b/drivers/ata/pata_buddha.c 2018-10-18 14:23:41.054460925 +0200
@@ -0,0 +1,257 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Buddha, Catweasel and X-Surf PATA controller driver
+ *
+ * Copyright (c) 2018 Samsung Electronics Co., Ltd.
+ * http://www.samsung.com
+ *
+ * Based on buddha.c:
+ *
+ * Copyright (C) 1997, 2001 by Geert Uytterhoeven and others
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_NAME "pata_buddha"
+#define DRV_VERSION "0.1.0"
+
+#define BUDDHA_BASE1   0x800
+#define BUDDHA_BASE2   0xa00
+#define BUDDHA_BASE3   0xc00
+#define XSURF_BASE1    0xb000 /* 2.5" interface */
+#define XSURF_BASE2    0xd000 /* 3.5" interface */
+#define BUDDHA_CONTROL 0x11a
+#define BUDDHA_IRQ 0xf00
+#define XSURF_IRQ  0x7e
+#define BUDDHA_IRQ_MR  0xfc0   /* master interrupt enable */
+
+enum {
+   BOARD_BUDDHA = 0,
+   BOARD_CATWEASEL,
+   BOARD_XSURF
+};
+
+static unsigned int buddha_bases[3] __initdata = {
+   BUDDHA_BASE1, BUDDHA_BASE2, BUDDHA_BASE3
+};
+
+static unsigned int xsurf_bases[2] __initdata = {
+   XSURF_BASE1, XSURF_BASE2
+};
+
+static struct scsi_host_template pata_buddha_sht = {
+   ATA_PIO_SHT(DRV_NAME),
+};
+
+/* FIXME: is this needed? */
+static unsigned int pata_buddha_data_xfer(struct ata_queued_cmd *qc,
+unsigned char *buf,
+unsigned int buflen, int rw)
+{
+   struct ata_device *dev = qc->dev;
+   struct ata_port *ap = dev->link->ap;
+   void __iomem *data_addr = ap->ioaddr.data_addr;
+   unsigned int words = buflen >> 1;
+
+   /* Transfer multiple of 2 bytes */
+   if (rw == READ)
+   raw_insw((u16 *)data_addr, (u16 *)buf, words);
+   else
+   raw_outsw((u16 *)data_addr, (u16 *)buf, words);
+
+   /* Transfer trailing byte, if any. */
+   if (unlikely(buflen & 0x01)) {
+   unsigned char pad[2] = { };
+
+   /* Point buf to the tail of buffer */
+   buf += buflen - 1;
+
+   if (rw == READ) {
+   raw_insw((u16 *)data_addr, (u16 *)pad, 1);
+   *buf = pad[0];
+   } else {
+   pad[0] = *buf;
+   raw_outsw((u16 *)data_addr, (u16 *)pad, 1);
+   }
+   words++;
+   }
+
+   return words << 1;
+}
+
+/*
+ * Provide our own set_mode() as we don't want to change anything that has
+ * already been configured..
+ */
+static int pata_buddha_set_mode(struct ata_link *link,
+  

Re: [PATCH] ata: add Buddha PATA controller driver

2018-10-18 Thread Michael Schmitz

Hi Adrian,

module built and loaded fine (no need to build a new kernel for this). 
Can't unload the module however (-EBUSY).


You'll have to reboot elgar to reload the module, I'm afraid.

Cheers,

    Michael



On 19/10/18 01:32, John Paul Adrian Glaubitz wrote:

Hi!

On 10/18/18 2:29 PM, Bartlomiej Zolnierkiewicz wrote:

John, please test if possible..

Yes, will test ASAP. @Michael: Can you build an updated kernel from your
tree for elgar and copy it over. I'll make sure to test it and also
hook up an IDE drive for testing.

Adrian





Re: [PATCH AUTOSEL 4.18 24/58] Input: atakbd - fix Atari CapsLock behaviour

2018-10-10 Thread Michael Schmitz

Hi Geert,


On 10/10/18 19:59, Geert Uytterhoeven wrote:

Hi Michael,

On Wed, Oct 10, 2018 at 12:07 AM Michael Schmitz  wrote:

I agree the bug is neither subtle nor recent, not security relevant and
will affect only a handful of users at best.

If you're worried about weakening the rules around stable releases, by
all means go ahead and veto the inclusion of these patches in the next
stable release.

I believe the distro the issue was reported against (Debian ports) will not
get the fix until the issue is fixed in the upstream stable release?


That's correct, but there are ways around that. I could petition Ben to 
add it as a separate patch for now.


Boils down to the same question really - what's the impact of this bug? 
Will it prevent installation from distribution media f.e.?


Given that it's taken ten years for someone to even notice one aspect 
(missing key mappings), and then I just happened to see the discussion 
on the ML (which wasn't even about mapping so much as about keys stuck 
in autorepeat, which I haven't been able to reproduce), I think it's 
clear that this has minimal impact. m68k users tend to be reasonably 
computer savvy, and can install a custom built kernel until Debian 
catches up with current kernel versions.


Having reread Documentation/process/stable-kernel-rules.rst, it's hard 
for me to argue these patches belong in stable.


Cheers,

    Michael





On 09/10/18 08:20, Dmitry Torokhov wrote:

On Mon, Oct 8, 2018 at 12:09 PM Michael Schmitz  wrote:

someone on debian-68k reported the bug, which (to me) indicates that the
code is not just used by me.

Whether or not a functioning Capslock is essential to have? You be the
judge of that. If you are OK with applying the keymap patch, why not
this one?

I have exactly the same concerns about the keymap patch. This all has
not been working correctly for many many years (and it was not broken
in a subtle way as far as I understand, but rather quite obvious).
Thus I do not understand why this belongs to stable release. It is not
a [recent] regression, nor security bug, nor even enabling of new
hardware, that is why I myself did not mark it as stable.

I still maintain that we pick up for stable too many patches for no
clear benefit. This is similar to the patch for Atmel controllers that
was picked to stable and I asked why, as it is not clear how many
users might be affected (or if the problem the patch was solving was
purely theoretical, or only affecting hardware that is not in
circulation yet).


Debian will carry stable patches without explicit action on behalf of
the maintainer. Unstable patches are a little harder to get accepted.

On 09/10/18 06:11, Dmitry Torokhov wrote:

On Mon, Oct 8, 2018 at 8:25 AM Sasha Levin  wrote:

From: Michael Schmitz 

[ Upstream commit 52d2c7bf7c90217fbe875d2d76f310979c48eb83 ]

The CapsLock key on Atari keyboards is not a toggle, it does send the
normal make and break scancodes.

Drop the CapsLock toggle handling code, which did cause the CapsLock
key to merely act as a Shift key.

This has been broken for 10+ years. Does it really make sense to
promote it to stable?


Tested-by: Michael Schmitz 
Signed-off-by: Michael Schmitz 
Signed-off-by: Andreas Schwab 
Signed-off-by: Dmitry Torokhov 
Signed-off-by: Sasha Levin 
---
drivers/input/keyboard/atakbd.c | 10 ++
1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/input/keyboard/atakbd.c b/drivers/input/keyboard/atakbd.c
index 524a72bee55a..fdeda0b0fbd6 100644
--- a/drivers/input/keyboard/atakbd.c
+++ b/drivers/input/keyboard/atakbd.c
@@ -189,14 +189,8 @@ static void atakbd_interrupt(unsigned char scancode, char down)

    scancode = atakbd_keycode[scancode];

-   if (scancode == KEY_CAPSLOCK) { /* CapsLock is a toggle switch key on Amiga */
-       input_report_key(atakbd_dev, scancode, 1);
-       input_report_key(atakbd_dev, scancode, 0);
-       input_sync(atakbd_dev);
-   } else {
-       input_report_key(atakbd_dev, scancode, down);
-       input_sync(atakbd_dev);
-   }
+   input_report_key(atakbd_dev, scancode, down);
+   input_sync(atakbd_dev);
    } else  /* scancodes >= 0xf3 are mouse data, most likely */
        printk(KERN_INFO "atakbd: unhandled scancode %x\n", scancode);


Gr{oetje,eeting}s,

 Geert





Re: [PATCH AUTOSEL 4.18 24/58] Input: atakbd - fix Atari CapsLock behaviour

2018-10-10 Thread Michael Schmitz

Hi Geert,


On 10/10/18 19:59, Geert Uytterhoeven wrote:

Hi Michael,

On Wed, Oct 10, 2018 at 12:07 AM Michael Schmitz  wrote:

I agree the bug is neither subtle nor recent, not security relevant and
will affect only a handful of users at best.

If you're worried about weakening the rules around stable releases, by
all means go ahead and veto the inclusion of these patches in the next
stable release.

I believe the distro the issue was reported against (Debian ports) will not
get the fix until the issue is fixed in the upstream stable release?


That's correct. but there are ways around that. I could petition Ben to 
add it as a separate patch for now.


Boils down to the same question really - what's the impact of this bug? 
Will it prevent installation from distribution media f.e.?


Given that it's taken ten years for someone to even notice one aspect 
(missing key mappings), and then I just happened to see the discussion 
on the ML (which wasn't even about mapping so much as about keys stuck 
in autorepeat, which I haven't been able to reproduce), I think it's 
clear that this has minimal impact. m68k users tend to be reasonably 
computer savvy, and can install a custom built kernel until Debian 
catches up with current kernel versions.


Having reread Documentation/process/stable-kernel-rules.rst, it's hard 
for me to argue these patches belong in stable.


Cheers,

    Michael





On 09/10/18 08:20, Dmitry Torokhov wrote:

On Mon, Oct 8, 2018 at 12:09 PM Michael Schmitz  wrote:

someone on debian-68k reported the bug, which (to me) indicates that the
code is not just used by me.

Whether or not a functioning Capslock is essential to have? You be the
judge of that. If you are OK with applying the keymap patch, why not
this one?

I have exactly the same concerns about the keymap patch. This all has
not been working correctly for many many years (and it was not broken
in a subtly way as far as I understand, but rather quite obvious).
Thus I do not understand why this belongs to stable release. It is not
a [recent] regression, nor secutiry bug, nor even enabling of new
hardware, that is why I myself did not mark it as stable.

I still maintain that we pick up for stable too many patches for no
clear benefit. This is similar to the patch for Atmel controllers that
was picked to stable and I asked why, as it is not clear how many
users might be affected (or if the problem the patch was solving was
purely theoretical, or only affecting hardware that is not in
circulation yet).


Debian will carry stable patches without explicit action on behalf of
the maintainer. Unstable patches are a little harder to get accepted.

On 09/10/18 06:11, Dmitry Torokhov wrote:

On Mon, Oct 8, 2018 at 8:25 AM Sasha Levin  wrote:

From: Michael Schmitz 

[ Upstream commit 52d2c7bf7c90217fbe875d2d76f310979c48eb83 ]

The CapsLock key on Atari keyboards is not a toggle, it does send the
normal make and break scancodes.

Drop the CapsLock toggle handling code, which did cause the CapsLock
key to merely act as a Shift key.

This has been broken for 10+ years. Does it really make sense to
promote it to stable?


Tested-by: Michael Schmitz 
Signed-off-by: Michael Schmitz 
Signed-off-by: Andreas Schwab 
Signed-off-by: Dmitry Torokhov 
Signed-off-by: Sasha Levin 
---
drivers/input/keyboard/atakbd.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/input/keyboard/atakbd.c b/drivers/input/keyboard/atakbd.c
index 524a72bee55a..fdeda0b0fbd6 100644
--- a/drivers/input/keyboard/atakbd.c
+++ b/drivers/input/keyboard/atakbd.c
@@ -189,14 +189,8 @@ static void atakbd_interrupt(unsigned char scancode, char down)

 		scancode = atakbd_keycode[scancode];

-		if (scancode == KEY_CAPSLOCK) {	/* CapsLock is a toggle switch key on Amiga */
-			input_report_key(atakbd_dev, scancode, 1);
-			input_report_key(atakbd_dev, scancode, 0);
-			input_sync(atakbd_dev);
-		} else {
-			input_report_key(atakbd_dev, scancode, down);
-			input_sync(atakbd_dev);
-		}
+		input_report_key(atakbd_dev, scancode, down);
+		input_sync(atakbd_dev);
 	} else				/* scancodes >= 0xf3 are mouse data, most likely */
 		printk(KERN_INFO "atakbd: unhandled scancode %x\n", scancode);
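
For reference, a minimal sketch of what the fixed path does over one
press/release cycle (assuming a registered struct input_dev *atakbd_dev as
in the driver; everything else omitted):

	/* The Atari keyboard sends a make scancode on press and a break
	 * scancode on release, so the handler simply forwards the key
	 * state instead of synthesizing a toggle:
	 */
	input_report_key(atakbd_dev, KEY_CAPSLOCK, 1);	/* make: key down */
	input_sync(atakbd_dev);
	/* ... and later, when the break scancode arrives ... */
	input_report_key(atakbd_dev, KEY_CAPSLOCK, 0);	/* break: key up */
	input_sync(atakbd_dev);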


Gr{oetje,eeting}s,

 Geert





Re: [PATCH AUTOSEL 4.18 24/58] Input: atakbd - fix Atari CapsLock behaviour

2018-10-09 Thread Michael Schmitz

Hi Dmitry,

I agree the bug is neither subtle nor recent, not security relevant and 
will affect only a handful of users at best.


If you're worried about weakening the rules around stable releases, by 
all means go ahead and veto the inclusion of these patches in the next 
stable release.


Cheers,

    Michael


On 09/10/18 08:20, Dmitry Torokhov wrote:

Hi Michael,

On Mon, Oct 8, 2018 at 12:09 PM Michael Schmitz  wrote:

Dmitry,

someone on debian-68k reported the bug, which (to me) indicates that the
code is not just used by me.

Whether or not a functioning Capslock is essential to have? You be the
judge of that. If you are OK with applying the keymap patch, why not
this one?

I have exactly the same concerns about the keymap patch. This all has
not been working correctly for many many years (and it was not broken
in a subtle way as far as I understand, but rather quite obviously).
Thus I do not understand why this belongs in a stable release. It is not
a [recent] regression, nor a security bug, nor even enabling of new
hardware, which is why I myself did not mark it as stable.

I still maintain that we pick up for stable too many patches for no
clear benefit. This is similar to the patch for Atmel controllers that
was picked to stable and I asked why, as it is not clear how many
users might be affected (or if the problem the patch was solving was
purely theoretical, or only affecting hardware that is not in
circulation yet).


Debian will carry stable patches without explicit action on behalf of
the maintainer. Unstable patches are a little harder to get accepted.

Cheers,

  Michael



On 09/10/18 06:11, Dmitry Torokhov wrote:

On Mon, Oct 8, 2018 at 8:25 AM Sasha Levin  wrote:

From: Michael Schmitz 

[ Upstream commit 52d2c7bf7c90217fbe875d2d76f310979c48eb83 ]

The CapsLock key on Atari keyboards is not a toggle, it does send the
normal make and break scancodes.

Drop the CapsLock toggle handling code, which did cause the CapsLock
key to merely act as a Shift key.

This has been broken for 10+ years. Does it really make sense to
promote it to stable?


Tested-by: Michael Schmitz 
Signed-off-by: Michael Schmitz 
Signed-off-by: Andreas Schwab 
Signed-off-by: Dmitry Torokhov 
Signed-off-by: Sasha Levin 
---
   drivers/input/keyboard/atakbd.c | 10 ++--------
   1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/input/keyboard/atakbd.c b/drivers/input/keyboard/atakbd.c
index 524a72bee55a..fdeda0b0fbd6 100644
--- a/drivers/input/keyboard/atakbd.c
+++ b/drivers/input/keyboard/atakbd.c
@@ -189,14 +189,8 @@ static void atakbd_interrupt(unsigned char scancode, char down)

 		scancode = atakbd_keycode[scancode];

-		if (scancode == KEY_CAPSLOCK) {	/* CapsLock is a toggle switch key on Amiga */
-			input_report_key(atakbd_dev, scancode, 1);
-			input_report_key(atakbd_dev, scancode, 0);
-			input_sync(atakbd_dev);
-		} else {
-			input_report_key(atakbd_dev, scancode, down);
-			input_sync(atakbd_dev);
-		}
+		input_report_key(atakbd_dev, scancode, down);
+		input_sync(atakbd_dev);
 	} else				/* scancodes >= 0xf3 are mouse data, most likely */
 		printk(KERN_INFO "atakbd: unhandled scancode %x\n", scancode);

--
2.17.1


Thanks.





Re: [PATCH AUTOSEL 4.18 24/58] Input: atakbd - fix Atari CapsLock behaviour

2018-10-08 Thread Michael Schmitz

Dmitry,

someone on debian-68k reported the bug, which (to me) indicates that the 
code is not just used by me.


Whether or not a functioning Capslock is essential to have? You be the 
judge of that. If you are OK with applying the keymap patch, why not 
this one?


Debian will carry stable patches without explicit action on behalf of 
the maintainer. Unstable patches are a little harder to get accepted.


Cheers,

    Michael



On 09/10/18 06:11, Dmitry Torokhov wrote:

On Mon, Oct 8, 2018 at 8:25 AM Sasha Levin  wrote:

From: Michael Schmitz 

[ Upstream commit 52d2c7bf7c90217fbe875d2d76f310979c48eb83 ]

The CapsLock key on Atari keyboards is not a toggle, it does send the
normal make and break scancodes.

Drop the CapsLock toggle handling code, which did cause the CapsLock
key to merely act as a Shift key.

This has been broken for 10+ years. Does it really make sense to
promote it to stable?


Tested-by: Michael Schmitz 
Signed-off-by: Michael Schmitz 
Signed-off-by: Andreas Schwab 
Signed-off-by: Dmitry Torokhov 
Signed-off-by: Sasha Levin 
---
  drivers/input/keyboard/atakbd.c | 10 ++--------
  1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/input/keyboard/atakbd.c b/drivers/input/keyboard/atakbd.c
index 524a72bee55a..fdeda0b0fbd6 100644
--- a/drivers/input/keyboard/atakbd.c
+++ b/drivers/input/keyboard/atakbd.c
@@ -189,14 +189,8 @@ static void atakbd_interrupt(unsigned char scancode, char down)

 		scancode = atakbd_keycode[scancode];

-		if (scancode == KEY_CAPSLOCK) {	/* CapsLock is a toggle switch key on Amiga */
-			input_report_key(atakbd_dev, scancode, 1);
-			input_report_key(atakbd_dev, scancode, 0);
-			input_sync(atakbd_dev);
-		} else {
-			input_report_key(atakbd_dev, scancode, down);
-			input_sync(atakbd_dev);
-		}
+		input_report_key(atakbd_dev, scancode, down);
+		input_sync(atakbd_dev);
 	} else				/* scancodes >= 0xf3 are mouse data, most likely */
 		printk(KERN_INFO "atakbd: unhandled scancode %x\n", scancode);

--
2.17.1







Re: [PATCH v3 3/7] drivers: parisc: Avoids building driver if CONFIG_PARISC is disabled

2018-10-05 Thread Michael Schmitz




Am 05.10.2018 um 15:16 schrieb Leonardo Bras:

Well it's not really that persuasive.  Most people simply let the build
run to completion, but if you have a problem with a job control 3h
timelimit, then create a job that kills itself at 2:59 and then
resubmits itself.  That will produce a complete build in 3h chunks
without any need to call sub Makefiles.



Humm, I probably should have explained better how GitlabCI works.
It works by creating a docker container that has a limited lifespan of 3h max.
Once that time is over, the container ceases to exist, leaving no build
objects, only the console log. So there is no way of 'resuming' the build
from where it stopped. I used the term 'job' because that's what they call it,
and I understand it's easily confused with bash jobs.


All of our Makefiles are coded assuming the upper level can prevent
descent into the lower ones.  You're proposing to change that
assumption, requiring a fairly large patch set, which doesn't really
seem to provide a huge benefit.

James


I understand your viewpoint.
But what I propose is not to change that assumption, but instead to give some
Makefiles the additional ability to be called directly and still not build stuff
that was not enabled in .config.

But, why these chosen Makefiles, and not all of them?
Granularity.
What I am trying to achieve with this patchset is the ability of building
smaller sets of drivers without accidentally building what is not enabled
on .config.
And, in my viewpoint, building a single drivers/DRIVERNAME is small enough to
be fast in most situations.


That already works, doesn't it? So all that you'd need is an offline 
tool to precompute what drivers to actually build with a given config.


'make -n' with some suitable output mangling might do the job.

There may well be other ways to achieve your stated goal, without any 
need to make changes to the kernel build process (which is the result of 
many years of evolution and tuning, BTW).



This change is not supposed to bother the usual way of building the kernel, and


Enough people have voiced their concern to warrant that you should back 
up that claim, IMO. Have you verified that your patchset does not change 
current behaviour when building the entire set of default configurations 
for each supported architecture? Does it reduce or increase overall 
complexity of the build process?



it is not even supposed to add overhead to kernel compilation. And it would,
at least, solve my problem with the 3h limit, and enable the tool
I am building on GitlabCI to help other developers.


(Apropos of nothing: Am I the only one who thinks gitlab might take a 
rather dim view of your creativity in dealing with their limit?)



Thanks for reading,

Leonardo Bras


Thanks for listening!

Cheers,

Michael


Re: moving affs + RDB partition support to staging?

2018-06-27 Thread Michael Schmitz

Joanne,

Linux on m68k has supported lseek64 (or llseek) for a long time (from 
glibc version 2.1 according to what I found). About the only area where 
we are limited by 32 bits is the virtual memory size.
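
To make that concrete, here is a minimal userspace sketch of the 64 bit
offset interface (the device path is only an example, and error handling
is trimmed):

	#define _LARGEFILE64_SOURCE
	#include <sys/types.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("/dev/sda", O_RDONLY);	/* e.g. a disk > 2 TB */

		if (fd < 0)
			return 1;
		/* Seek past the 32 bit byte-offset limit; this works on m68k
		 * since glibc 2.1 because the offset is a 64 bit off64_t. */
		off64_t pos = lseek64(fd, 3ULL << 40, SEEK_SET);	/* 3 TiB */

		printf("now at byte %lld\n", (long long)pos);
		close(fd);
		return 0;
	}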


I'm not proposing to modify the RDB format definition, though an 
extension to store 64 bit offsets separate from the 32 bit ones would be 
one way to make certain such disks are safe to use on 3.1 and earlier 
versions of AmigaOS. (Another one would be to modify the disk drivers on 
older versions to do the offset calculation in 64 bit, and check for 
overflow just as we do here. Not sure whether that's feasible. And as 
you so eloquently describe, we can't rely on users listening.)
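
Purely as an illustration of that idea - this is NOT an existing or
proposed on-disk format, just a sketch of the layout described above:

	#include <stdint.h>

	/* Hypothetical partition-block extension: the legacy 32 bit fields
	 * stay where AmigaOS 3.1 and earlier expect them, and a separately
	 * stored pair of 64 bit block addresses is preferred by parsers
	 * that know about the extension.
	 */
	struct pb_env64 {
		uint32_t lowcyl;	/* legacy 32 bit fields, unchanged */
		uint32_t highcyl;
		uint64_t lowblock64;	/* full-precision block addresses */
		uint64_t highblock64;
	};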


Either way, we need the cooperation of partitioning tool writers to 
ensure partition information that is prone to overflows is never stored 
in the 32 bit, classic RDB. That appears to have failed already, as 
Martin's experience illustrates.


I'm only concerned with fixing the (dangerous) bug in the Linux 
partition format parser for RDB. Refusing to use any partitions that 
will cause havoc on old AmigaOS versions is all I can do to try and get 
the users' attention.


Your warning makes me wonder whether the log message should just say 
'report this bug to linux-m...@vger.kernel.org' to at least try and 
educate any that respond about the dangers of their partitioning scheme 
before telling them about the override option. Problem with that is, in 
about three years no one will remember any of this ...


Cheers,

Michael


Am 28.06.2018 um 15:44 schrieb jdow:

Michael, as long as m68k only supports int fseek( int ), 4 GB * block
size is your HARD limit. Versions that support __int64 fseek64( __int64
) can work with larger disks. RDBs could include RDSK and a new set of
other blocks that replace the last two characters with "64" and use
__int64 where needed in the various values. That way a clever disk
partitioner could allow normal (32 bit) RDB definitions where
possible. Then at least SOME of the disk could be supported AND a very
clever filesystem that abstracts very large disks properly could give
access to the whole disk. (Read the RDB's first 32 bits. Then, if a
filesystem or driveinit was loaded, re-read the RDBs to see if new 64 bit
partitions are revealed.)

I could be wrong but I do not think RDBs could be safely modified any
other way to work. And, trust me as I bet this is still true, you will
need a SERIOUSLY good bomb shelter on the Moon if you change RDBs.
Suppose Joe Amigoid uses it, and then Joe Amigoid loads Amigados 2.4
because he wants to run a game that crashes on anything newer. Then Joe
got far enough something writes to the disk and data is corrupted. Note
further that Amigoids do NOT, repeat NOT, listen to facts in such cases.
Hell, some of them never listened to facts about an incident at Jerry
Pournelle's place when a 1.1 DPaint session with Kelly Freas hung and we
lost a delightful drawing. Jerry reported it. Amigoids screamed. I tried
to tell them I was there, it was my machine, and 1.1 was, indeed, crap.

{o.o}

On 20180627 02:00, Michael Schmitz wrote:

Joanne,

I'm not at all allergic to avoiding RDB at all cost for new disks. If
AmigaOS 4.1 supports more recent partition formats, all the better.
This is all about supporting use of legacy RDB disks on Linux (though
2 TB does stretch the definition of 'legacy' a little). My interest in
this is to ensure we can continue to use RDB format disks on m68k
Amiga computers which have no other way to boot Linux from disk.

Not proposing to change the RDB format at all, either. Just trying to
make sure we translate RDB info into Linux 512-byte block offset and
size numbers correctly. The kernel won't modify the RDB at all
(intentionally, that is - with the correct choice of partition sizes,
Martin might well have wiped out his RDB with the current version of
the parser).

The choice of refusing to mount a disk (or mounting read-only) rests
with the VFS drivers alone - AFFS in that case. Not touching any of
that. At partition scan time, we only have the option of making the
partition available (with a warning printed), or refusing to make it
available to the kernel. Once it's made available, all bets are off.

 From what Martin writes, his test case RDB was valid and worked as
expected on 32 bit AmigaOS (4.1). Apparently, that version has the
necessary extensions to handle the large offsets resulting from 2 TB
disks. Not sure what safeguards are in place when connecting such a
disk to older versions of AmigaOS, but that is a different matter
entirely.

The overflows in partition offset and size are the only ones I can see
in the partition parser - there is no other overflow I've identified.
I just stated that in order to place a partition towards the end of a
2 TB disk, the offset calculation will overflow regardless of what
combination of rdb->rdb_BlockBytes and sector addresses stored in the
RDB (in units of 512 byte b


Re: moving affs + RDB partition support to staging?

2018-06-27 Thread Michael Schmitz
 read each of the variables with the nominal RDB size and convert it
to uint64_t before calculating byte indices.

However, consider my inputs as advice from an adult who has seen the
Amiga Elephant so to speak. I am not trying to assert any control. Do as
you wish; but, I would plead with you to avoid ANY chance you can for
the user to make a bonehead stupid move and lose all his treasured disk
archives. Doing otherwise is very poor form.

{o.o}   Joanne "Said enough, she has" Dow

On 20180626 18:07, Michael Schmitz wrote:

Joanne,

As far as I have been able to test, the change is backwards compatible
(RDB partitions from an old 80 GB disk are still recognized OK).
That's only been done on an emulator though.

Your advice about the dangers of using RDB disks that would have
failed the current Linux RDB parser on legacy 32 bit systems is well
taken though. Maybe Martin can clarify that for me - was the 2 TB disk
in question ever used on a 32 bit Amiga system?

RDB disk format is meant for legacy use only, so I think we can get
away with printing a big fat warning during boot, advising the user
that the oversize RDB partition(s) scanned are not compatible with
legacy 32 bit AmigaOS. With the proposed fix they will work under both
AmigaOS 4.1 and Linux instead of truncating the first overflowing
partition at disk end and trashing valid partitions that overlap,
which is what Martin was after.

If that still seems too risky, we can make the default behaviour to
bail out once a potential overflow is detected, and allow the user to
override that through a boot parameter. I'd leave that decision up for
the code review on linux-block.

Two more comments: Linux uses 512 byte block sizes for the partition
start and size calculations, so a change of the RDB blocksize to
reduce the block counts stored in the RDB would still result in the
same overflow. And amiga-fdisk is indeed utterly broken and needs
updating (along with probably most legacy m68k partitioners). Adrian
has advertised parted as replacement for the old tools - maybe this
would make a nice test case for parted?

Cheers,

   Michael



On Tue, Jun 26, 2018 at 9:45 PM, jdow  wrote:

If it is not backwards compatible I for one would refuse to use it.
And if
it still mattered that much to me I'd also generate a reasonable
alternative. Modifying RDBs may not be even an approximation of a
good idea.
You'd discover that as soon as an RDB uint64_t disk is tasted by a
uint32_t
only system. If it is for your personal use then you're entirely free to
reject my advice and are probably smart enough to keep it working for
yourself.

GPT is probably the right way to go. Preserve the ability to read
RDBs for
legacy disks only.

{^_^}


On 20180626 01:31, Michael Schmitz wrote:


Joanne,

I think we all agree that doing 32 bit calculations on 512-byte block
addresses that overflow on disks 2 TB and larger is a bug, causing the
issues Martin reported. Your patch addresses that by using the correct
data type for the calculations (as do other partition parsers that may
have to deal with large disks) and fixes Martin's bug, so appears to be
the right thing to do.

Using 64 bit data types for disks smaller than 2 TB where calculations
don't currently overflow is not expected to cause new issues, other
than
enabling use of disks and partitions larger than 2 TB (which may have
ramifications with filesystems on these partitions). So compatibility is
preserved.

Forcing larger block sizes might be a good strategy to avoid overflow
issues in filesystems as well, but I can't see how the block size
stored
in the RDB would enforce use of the same block size in filesystems.
We'll have to rely on the filesystem tools to get that right, too.
Linux
AFFS does allow block sizes up to 4k (VFS limitation) so this should
allow partitions larger than 2 TB to work already (but I suspect Al
Viro
may have found a few issues when he looked at the AFFS code so I won't
say more). Anyway partitioning tools and filesystems are unrelated to
the Linux partition parser code which is all we aim to fix in this
patch.

If you feel strongly about unknown ramifications of any filesystems on
partitions larger than 2 TB, say so and I'll have the kernel print a
warning about these partitions.

I'll get this patch tested on Martin's test case image as well as on a
RDB image from a disk known to currently work under Linux (thanks Geert
for the losetup hint). Can't do much more without procuring a working
Amiga disk image to use with an emulator, sorry. The Amiga I plan to
use
for tests is a long way away from my home indeed.

Cheers,

  Michael

Am 26.06.18 um 17:17 schrieb jdow:


As long as it preserves compatibility it should be OK, I suppose.
Personally I'd make any partitioning tool front end gently force the
block size towards 8k as the disk size gets larger. The file systems
may also run into 2TB issues that are not obvious. An unused blocks
list will have to go beyond a uint32_t size, for exam


Re: moving affs + RDB partition support to staging?

2018-06-26 Thread Michael Schmitz
Joanne,

As far as I have been able to test, the change is backwards compatible
(RDB partitions from an old 80 GB disk are still recognized OK).
That's only been done on an emulator though.

Your advice about the dangers of using RDB disks that would have
failed the current Linux RDB parser on legacy 32 bit systems is well
taken though. Maybe Martin can clarify that for me - was the 2 TB disk
in question ever used on a 32 bit Amiga system?

RDB disk format is meant for legacy use only, so I think we can get
away with printing a big fat warning during boot, advising the user
that the oversize RDB partition(s) scanned are not compatible with
legacy 32 bit AmigaOS. With the proposed fix they will work under both
AmigaOS 4.1 and Linux instead of truncating the first overflowing
partition at disk end and trashing valid partitions that overlap,
which is what Martin was after.

If that still seems too risky, we can make the default behaviour to
bail out once a potential overflow is detected, and allow the user to
override that through a boot parameter. I'd leave that decision up for
the code review on linux-block.

Two more comments: Linux uses 512 byte block sizes for the partition
start and size calculations, so a change of the RDB blocksize to
reduce the block counts stored in the RDB would still result in the
same overflow. And amiga-fdisk is indeed utterly broken and needs
updating (along with probably most legacy m68k partitioners). Adrian
has advertised parted as replacement for the old tools - maybe this
would make a nice test case for parted?

Cheers,

  Michael



On Tue, Jun 26, 2018 at 9:45 PM, jdow  wrote:
> If it is not backwards compatible I for one would refuse to use it. And if
> it still mattered that much to me I'd also generate a reasonable
> alternative. Modifying RDBs may not be even an approximation of a good idea.
> You'd discover that as soon as an RDB uint64_t disk is tasted by a uint32_t
> only system. If it is for your personal use then you're entirely free to
> reject my advice and are probably smart enough to keep it working for
> yourself.
>
> GPT is probably the right way to go. Preserve the ability to read RDBs for
> legacy disks only.
>
> {^_^}
>
>
> On 20180626 01:31, Michael Schmitz wrote:
>>
>> Joanne,
>>
>> I think we all agree that doing 32 bit calculations on 512-byte block
>> addresses that overflow on disks 2 TB and larger is a bug, causing the
>> issues Martin reported. Your patch addresses that by using the correct
>> data type for the calculations (as do other partition parsers that may
>> have to deal with large disks) and fixes Martin's bug, so appears to be
>> the right thing to do.
>>
>> Using 64 bit data types for disks smaller than 2 TB where calculations
>> don't currently overflow is not expected to cause new issues, other than
>> enabling use of disks and partitions larger than 2 TB (which may have
>> ramifications with filesystems on these partitions). So compatibility is
>> preserved.
>>
>> Forcing larger block sizes might be a good strategy to avoid overflow
>> issues in filesystems as well, but I can't see how the block size stored
>> in the RDB would enforce use of the same block size in filesystems.
>> We'll have to rely on the filesystem tools to get that right, too. Linux
>> AFFS does allow block sizes up to 4k (VFS limitation) so this should
>> allow partitions larger than 2 TB to work already (but I suspect Al Viro
>> may have found a few issues when he looked at the AFFS code so I won't
>> say more). Anyway partitioning tools and filesystems are unrelated to
>> the Linux partition parser code which is all we aim to fix in this patch.
>>
>> If you feel strongly about unknown ramifications of any filesystems on
>> partitions larger than 2 TB, say so and I'll have the kernel print a
>> warning about these partitions.
>>
>> I'll get this patch tested on Martin's test case image as well as on a
>> RDB image from a disk known to currently work under Linux (thanks Geert
>> for the losetup hint). Can't do much more without procuring a working
>> Amiga disk image to use with an emulator, sorry. The Amiga I plan to use
>> for tests is a long way away from my home indeed.
>>
>> Cheers,
>>
>>  Michael
>>
>> Am 26.06.18 um 17:17 schrieb jdow:
>>>
>>> As long as it preserves compatibility it should be OK, I suppose.
>>> Personally I'd make any partitioning tool front end gently force the
>>> block size towards 8k as the disk size gets larger. The file systems
>>> may also run into 2TB issues that are not obvious. An unused blocks
>>> list will have to go beyond a uint32_t size, for example. But a block
>>

Re: moving affs + RDB partition support to staging?

2018-06-26 Thread Michael Schmitz
Hi Martin,


Am 26.06.18 um 20:02 schrieb Martin Steigerwald:
> Michael.
>
> Michael Schmitz - 26.06.18, 04:23:
>> Joanne,
>>
>> Martin's boot log (including your patch) says:
>>
>> Jun 19 21:19:09 merkaba kernel: [ 7891.843284]  sdb: RDSK (512) sdb1
>> (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 spb
>> 4)
>> Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb]
>> Attached SCSI disk
>>
>> so it's indeed a case of self inflicted damage (RDSK (512) means 512
>> byte blocks) and can be worked around by using a different block size.
> Well I pretty much believe that this was the standard block size of the 
> tool I used in order to create the RDB. I think it has been Media 

No offense meant - 512 bytes per block would certainly have been the
default and it certainly wasn't obvious that this needed changing.

> Toolbox + the engine behind it, from AmigaOS 4.0 (not 4.1) or so. DOS-
> Type JXF points to JXFS, an AmigaOS 4 only filesystem that meanwhile has 
> been deprecated by AmigaOS upstream.
>
> Maybe HDToolbox + hdwrench.library by Joanne (AmigaOS 3.5/3.9) would 
> have set it up differently.
>
> Anyway: It works like this in AmigaOS 4 without any issues. With 512 
> Byte Blocks. I think it is good that Linux does not create data 
> corruption when writing to disks that work just fine in AmigaOS. 
> Especially as those using AmigaNG machines like AmigaOne X1000/X5000 or 
> Acube Sam machines may dual boot AmigaOS and Linux on their machines.

That's the goal.

> Thanks again for putting together a patch.

The hard work had been done by Joanne and you six years ago, so it was no
trouble at all. If Jens and the crowd at linux-block point out errors in
review, I'll take that as a learning experience...

Cheers,

 Michael


>
> Thanks,



Re: moving affs + RDB partition support to staging?

2018-06-26 Thread Michael Schmitz
Joanne,

I think we all agree that doing 32 bit calculations on 512-byte block
addresses that overflow on disks 2 TB and larger is a bug, causing the
issues Martin reported. Your patch addresses that by using the correct
data type for the calculations (as do other partition parsers that may
have to deal with large disks) and fixes Martin's bug, so appears to be
the right thing to do.
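
For illustration, a tiny standalone sketch of the failure mode - not the
kernel parser itself, and the geometry values are made up:

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		/* A partition starting near the end of a > 2 TB disk, with
		 * 512 byte RDB blocks (rdb_BlockBytes = 512, so blksize = 1
		 * in 512 byte units):
		 */
		uint32_t lowcyl = 270000;
		uint32_t surfaces = 16;
		uint32_t blocks_per_track = 1024;	/* 16384 blocks/cylinder */
		uint32_t blksize = 1;

		/* Buggy: all operands are 32 bit, so the product wraps at
		 * 2^32 sectors (2 TB) and the partition appears to start at
		 * a bogus low sector, overlapping earlier partitions:
		 */
		uint32_t bad_start = lowcyl * surfaces * blocks_per_track * blksize;

		/* Fixed: promote to a 64 bit type before multiplying, which
		 * is what using the correct data type amounts to:
		 */
		uint64_t good_start = (uint64_t)lowcyl * surfaces *
				      blocks_per_track * blksize;

		printf("32 bit: %u\n64 bit: %llu\n",
		       bad_start, (unsigned long long)good_start);
		return 0;
	}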

Using 64 bit data types for disks smaller than 2 TB, where calculations
don't currently overflow, is not expected to cause new issues, other than
enabling use of disks and partitions larger than 2 TB (which may have
ramifications with filesystems on these partitions). So compatibility is
preserved.

Forcing larger block sizes might be a good strategy to avoid overflow
issues in filesystems as well, but I can't see how the block size stored
in the RDB would enforce use of the same block size in filesystems.
We'll have to rely on the filesystem tools to get that right, too. Linux
AFFS does allow block sizes up to 4k (VFS limitation) so this should
allow partitions larger than 2 TB to work already (but I suspect Al Viro
may have found a few issues when he looked at the AFFS code so I won't
say more). Anyway partitioning tools and filesystems are unrelated to
the Linux partition parser code which is all we aim to fix in this patch.

If you feel strongly about unknown ramifications of any filesystems on
partitions larger than 2 TB, say so and I'll have the kernel print a
warning about these partitions.

I'll get this patch tested on Martin's test case image as well as on a
RDB image from a disk known to currently work under Linux (thanks Geert
for the losetup hint). Can't do much more without procuring a working
Amiga disk image to use with an emulator, sorry. The Amiga I plan to use
for tests is a long way away from my home indeed.

Cheers,

    Michael

Am 26.06.18 um 17:17 schrieb jdow:
> As long as it preserves compatibility it should be OK, I suppose.
> Personally I'd make any partitioning tool front end gently force the
> block size towards 8k as the disk size gets larger. The file systems
> may also run into 2TB issues that are not obvious. An unused blocks
> list will have to go beyond a uint32_t size, for example. But a block
> list (OFS for sure, don't remember for the newer AFS) uses a tad under
> 1% of the disk all by itself. A block bitmap is not quite so bad. {^_-}
>
> Just be sure you are aware of all the ramifications when you make a
> change. I remember thinking about this for awhile and then determining
> I REALLY did not want to think about it as my brain was getting tied
> into a gordian knot.
>
> {^_^}
>
> On 20180625 19:23, Michael Schmitz wrote:
>> Joanne,
>>
>> Martin's boot log (including your patch) says:
>>
>> Jun 19 21:19:09 merkaba kernel: [ 7891.843284]  sdb: RDSK (512) sdb1
>> (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 spb
>> 4)
>> Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb]
>> Attached SCSI disk
>>
>> so it's indeed a case of self inflicted damage (RDSK (512) means 512
>> byte blocks) and can be worked around by using a different block size.
>>
>> Your memory serves right indeed - blocksize is in 512 bytes units.
>> I'll still submit a patch to Jens anyway as this may bite others yet.
>>
>> Cheers,
>>
>>    Michael
>>
>>
>> On Sun, Jun 24, 2018 at 11:40 PM, jdow  wrote:
>>> BTW - anybody who uses 512 byte blocks with an Amiga file system is
>>> a famn
>>> dool.
>>>
>>> If memory serves the RDBs think in blocks rather than bytes so it
>>> should
>>> work up to 2 gigablocks whatever your block size is. 512 blocks is
>>> 219902322 bytes. But that wastes just a WHOLE LOT of disk in
>>> block maps.
>>> Go up to 4096 or 8192. The latter is 35 TB.
>>>
>>> {^_^}
>>> On 20180624 02:06, Martin Steigerwald wrote:
>>>>
>>>> Hi.
>>>>
>>>> Michael Schmitz - 27.04.18, 04:11:
>>>>>
>>>>> test results at https://bugzilla.kernel.org/show_bug.cgi?id=43511
>>>>> indicate the RDB parser bug is fixed by the patch given there, so if
>>>>> Martin now submits the patch, all should be well?
>>>>
>>>>
>>>> Ok, better be honest than having anyone waiting for it:
>>>>
>>>> I do not care enough about this to motivate myself into preparing
>>>> a patch from Joanne Dow's fix.
>>>>
>>>> I am not even using my Amiga boxes anymore, not even the Sam440ep
>>>> which
>>>> I still have in my apartment.
>>>>
>>>> So RDB support in Linux remains broken for disks larger than 2 TB,
>>>> unless someone else does it.
>>>>
>>>> Thanks.
>>>>





Re: moving affs + RDB partition support to staging?

2018-06-25 Thread Michael Schmitz
Joanne,

Martin's boot log (including your patch) says:

Jun 19 21:19:09 merkaba kernel: [ 7891.843284]  sdb: RDSK (512) sdb1
(LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 spb
4)
Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb]
Attached SCSI disk

so it's indeed a case of self-inflicted damage (RDSK (512) means 512
byte blocks) and can be worked around by using a different block size.

Your memory serves right indeed - blocksize is in 512 bytes units.
I'll still submit a patch to Jens anyway as this may bite others yet.
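
For concreteness, the ceilings implied by the RDB's 32-bit block count
work out as follows (back-of-envelope arithmetic, assuming 2^32
addressable blocks):

    2^32 blocks *  512 bytes/block =  2199023255552 bytes (~2.2 TB)
    2^32 blocks * 8192 bytes/block = 35184372088832 bytes (~35 TB)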

Cheers,

  Michael


On Sun, Jun 24, 2018 at 11:40 PM, jdow  wrote:
> BTW - anybody who uses 512 byte blocks with an Amiga file system is a famn
> dool.
>
> If memory serves the RDBs think in blocks rather than bytes so it should
> work up to 2 gigablocks whatever your block size is. 512-byte blocks is
> 2199023255552 bytes. But that wastes just a WHOLE LOT of disk in block maps.
> Go up to 4096 or 8192. The latter is 35 TB.
>
> {^_^}
> On 20180624 02:06, Martin Steigerwald wrote:
>>
>> Hi.
>>
>> Michael Schmitz - 27.04.18, 04:11:
>>>
>>> test results at https://bugzilla.kernel.org/show_bug.cgi?id=43511
>>> indicate the RDB parser bug is fixed by the patch given there, so if
>>> Martin now submits the patch, all should be well?
>>
>>
>> Ok, better be honest than having anyone waiting for it:
>>
>> I do not care enough about this to motivate myself to prepare
>> a patch from Joanne Dow's fix.
>>
>> I am not even using my Amiga boxes anymore, not even the Sam440ep which
>> I still have in my apartment.
>>
>> So RDB support in Linux remains broken for disks larger than 2 TB, unless
>> someone else does it.
>>
>> Thanks.
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-m68k" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: moving affs + RDB partition support to staging? (was: Re: Moving unmaintained filesystems to staging)

2018-06-25 Thread Michael Schmitz
Hi Martin,

OK, I'll prepare a patch and submit it to linux-block for review. I'll
have to refer to your testing back in 2012 since all I can test is
whether the patch still allows partition tables on small disks to be
recognized at this time (unless Adrian has a 2 TB disk and a SATA-SCSI
bridge to test this properly on).
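
For reference, the essence of the fix is 64-bit widening in the RDB
parser. A minimal sketch, modelled on block/partitions/amiga.c (the
Environment[] indices follow the RDB DosEnvec layout; illustrative
only, not Joanne Dow's exact diff):

    /* The partition start is the product of four 32-bit RDB fields;
     * in 32-bit arithmetic it wraps on disks of 2 TB and larger, so
     * accumulate in sector_t (64-bit with large block device support).
     */
    sector_t blksize, start_sect;

    blksize = be32_to_cpu(rdb->rdb_BlockBytes) / 512;
    start_sect = (sector_t)be32_to_cpu(pb->pb_Environment[9]) * /* LowCyl */
                 be32_to_cpu(pb->pb_Environment[3]) *           /* Surfaces */
                 be32_to_cpu(pb->pb_Environment[5]) *           /* BlocksPerTrack */
                 blksize;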

Cheers,

    Michael



On 24.06.18 21:06, Martin Steigerwald wrote:
> Hi.
>
> Michael Schmitz - 27.04.18, 04:11:
>> test results at https://bugzilla.kernel.org/show_bug.cgi?id=43511
>> indicate the RDB parser bug is fixed by the patch given there, so if
>> Martin now submits the patch, all should be well?
> Ok, better be honest than having anyone waiting for it:
>
> I do not care enough about this to motivate myself to prepare
> a patch from Joanne Dow's fix.
>
> I am not even using my Amiga boxes anymore, not even the Sam440ep which 
> I still have in my apartment.
>
> So RDB support in Linux remains broken for disks larger than 2 TB, unless
> someone else does it.
>
> Thanks.



Re: [RFC PATCH] m68k: set dma and coherent masks for Macintosh SONIC based ethernet

2018-05-29 Thread Michael Schmitz
Hi Finn,

On Tue, May 29, 2018 at 5:38 PM, Finn Thain  wrote:
> I found some patches here,
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/generic-dma-noncoherent.2

That's the most recent IIRC. Haven't begun looking at that yet - still stuck at

git://git.infradead.org/users/hch/misc.git

and waiting for a SCSI disk to be installed on elgar.

Christoph - any merit in testing that old repo?

Cheers,

  Michael


Re: [RFC PATCH] m68k: set dma and coherent masks for Macintosh SONIC based ethernet

2018-05-28 Thread Michael Schmitz
Hi Finn,

On 29.05.2018 14:15, Finn Thain wrote:
> 
> Since an arch gets to apply limits in the dma ops it implements, why would 
> arch code also have to set a limit in the form of default platform device 
> masks? Powerpc seems to be the only arch that does this.

One of Christoph's recent patches removed most of the arches' dma ops,
replacing them with a single generic implementation. m68k was one of
the affected arches. I concede his patch series is experimental still
and not in mainline, but may be included at some time.

Otherwise - your patches, so if you prefer to have each driver handle
this on an as-needed basis, I'm not objecting.

Cheers,

Michael


Re: [RFC PATCH] m68k: set dma and coherent masks for Macintosh SONIC based ethernet

2018-05-28 Thread Michael Schmitz
Hi Geert,

my preference would be Finn's patch introducing a m68k
arch_setup_pdev_archdata(). It nicely preserves what bus code sets up
prior to registering a platform device (important for Zorro devices
using platform or mfd devices), and allows overriding by drivers that
need it.
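
In rough outline, that approach amounts to the following (a sketch of
the idea only, not Finn's verbatim patch):

    void arch_setup_pdev_archdata(struct platform_device *pdev)
    {
        /* Respect masks that bus code set up before registration;
         * only unset masks get the 32-bit platform default.
         */
        if (!pdev->dev.dma_mask)
            pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
        if (!pdev->dev.coherent_dma_mask)
            pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
    }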

If ever a kernel-wide consensus is reached to move this setup to the
generic arch_setup_pdev_archdata(), we can still back out his patch at
that time. Finn found one counterexample where absence of DMA mask
signaled 'do not use DMA', so I think moving the DMA mask setup to
generic code is unlikely to happen any time soon.

Just my $0.02 ...

Cheers,

  Michael

On Mon, May 28, 2018 at 10:15 PM, Geert Uytterhoeven
 wrote:
> On Mon, May 28, 2018 at 7:26 AM, Finn Thain  
> wrote:
>> On Mon, 28 May 2018, Michael Schmitz wrote:
>>> On 27.05.2018 17:49, Finn Thain wrote:
>>> > On Sun, 27 May 2018, Michael Schmitz wrote:
>>> >
>>> >> That should have fixed the warning already ...
>>> >
>>> > It's still not fixed (hence my "acked-by" for Guenter's patch).
>>> >
>>>
>>> Odd - does link order still matter even though the
>>> arch_setup_pdev_archdata() function from the core platform code is
>>> declared as a weak symbol?
>>>
>>> I'll see what I can find out on elgar ...
>>>
>>
>> Any one of the numerous patches/rfcs/suggestions that I sent will avoid
>> the WARN splat.
>>
>> When I said "it's still not fixed", what I meant to say was, "it's still
>> not fixed in mainline and no proposed fix was accepted to the best of my
>> knowledge".
>
> Indeed.
>
> Do we have a consensus on the way forward? The merge window for
> v4.18 will open soon.
>
> Thanks!
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


Re: [RFC PATCH] m68k: set dma and coherent masks for Macintosh SONIC based ethernet

2018-05-27 Thread Michael Schmitz
Hi Finn,

On 27.05.2018 17:49, Finn Thain wrote:
> On Sun, 27 May 2018, Michael Schmitz wrote:
> 
>> That should have fixed the warning already ...
> 
> It's still not fixed (hence my "acked-by" for Guenter's patch).
> 

Odd - does link order still matter even though the
arch_setup_pdev_archdata() function from the core platform code is
declared as a weak symbol?

I'll see what I can find out on elgar ...

Cheers,

Michael


Re: [RFC PATCH] m68k: set dma and coherent masks for Macintosh SONIC based ethernet

2018-05-26 Thread Michael Schmitz
Hi Finn,

was your patch to implement this in arch_setup_pdev_archdata() rejected?
That should have fixed the warning already ...

On 27.05.2018 15:01, Finn Thain wrote:
> On Sat, 26 May 2018, Guenter Roeck wrote:
> 
>> As of commit 205e1b7f51e4 ("dma-mapping: warn when there is no
>> coherent_dma_mask") the NatSemi SONIC Ethernet driver is issuing the
>> following warning on driver initialization on Macintosh q800 systems.
>>
>> SONIC ethernet @50f0a000, MAC 08:00:07:12:34:56, IRQ 3
>> [ cut here ]
>> WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 
>> macsonic_init+0x6a/0x15a
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper Not tainted 4.17.0-rc6-mac-00286-g527f47c #1
>> Stack from 0781fdd8:
>> 0781fdd8 003615b3 000181ba 05c4 07a24cbc   
>> 20e8
>>  07a24800 002c196c 0001824e 00334c06 0204 001f782a 0009 
>>   003358d9 001f782a 00334c06 0204 0003  07a24800
>>  002b5cb6 000372ec 001f8b1a 07a24800 00359203 50f0a000 07a14a48 0003
>>   07845c0a 0039dcca 003c835c 003c835c 0035b924 001c19de 07845c00
>>  07845c0a 0039dcca 001c06dc 07845c0a 0781fed8 0007 0054d040 07845c0a
>> Call Trace: [<000181ba>] __warn+0xc0/0xc2
>>  [<20e8>] do_one_initcall+0x0/0x140
>>  [<0001824e>] warn_slowpath_null+0x26/0x2c
>>  [<001f782a>] macsonic_init+0x6a/0x15a
>>  [<001f782a>] macsonic_init+0x6a/0x15a
>>  [<002b5cb6>] memcmp+0x0/0x2a
>>  [<000372ec>] printk+0x0/0x18
>>  [<001f8b1a>] mac_sonic_platform_probe+0x380/0x404
>>
>> As per the warning the coherent_dma_mask is not set on this device.
>> There is nothing special about the DMA memory coherency on this hardware
>> so we can just set the mask to 32bits in the platform data for the FEC
>> ethernet devices.
>>
>> Signed-off-by: Guenter Roeck 
> 
> Acked-by: Finn Thain 
> 
> Geert, assuming that Guenter's patch is acceptable, would you also accept 
> a similar patch for the "macmace" platform device?
> 
>> ---
>> Modeled after f61e64310b75 ("m68k: set dma and coherent masks for platform
>> FEC ethernets"). 
>>
>> RFC: Is "nothing special about the DMA memory coherency" correect ?
>>
> 
> As I understand it, "cache-coherence" is meaningless unless you have 
> multiple CPU cores and caches. If there's only one CPU core, its cache 
> can't be said to be "coherent" or "non-coherent". Moreover, DMA (direct 
> memory access) concerns an IO device and a memory, not a cache, so the 
> term "coherent dma" is bogus IMHO.

DMA does concern the CPU cache insofar as (without explicit cache
maintenance) the CPU cache may hold stale data after a DMA write, or
data to be used in a DMA read may not have been written back from cache
to memory. As far as I understand it, the generic DMA API does handle
these cache maintenance aspects without a need for action at the driver
level. But you're correct that the question of coherent memory does not
arise on uniprocessor systems, so the whole warning could have been
avoided.
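
Concretely, the driver-visible contract for streaming DMA looks like
this (a generic sketch, not code from any of the drivers above):

    /* The arch's dma ops do whatever flush/invalidate is needed;
     * the driver only brackets the DMA window.
     */
    dma_addr_t handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
    /* ... device DMAs into buf ... */
    dma_unmap_single(dev, handle, len, DMA_FROM_DEVICE);
    /* CPU reads of buf now see the DMA'd data, not stale cache lines. */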

Unless 'coherent memory' in this context means no explicit cache
maintenance is required (the CPU does bus snooping - do any of our
platforms fully support that)?

Cheers,

Michael




Re: [PATCH net] macmace: Set platform device coherent_dma_mask

2018-05-11 Thread Michael Schmitz
Hi Finn,

On 11.05.2018 22:06, Finn Thain wrote:
>> You would have to be careful not to overwrite a pdev->dev.dma_mask and 
>> pdev->dev.dma_coherent_mask that might have been set in a platform 
>> device passed via platform_device_register here. Coldfire is the only 
>> m68k platform currently using that, but there might be others in future.
>>
> 
> That Coldfire patch could be reverted if this is a better solution.

True, but there might be other uses for deviating from a platform
default (I'm thinking of Atari SCSI and floppy drivers here). But we
could choose the correct mask to set in arch_setup_pdev_archdata()
instead, as it's a platform property not a driver property in that case.

>> ... But I don't think there are smaller DMA masks used by m68k drivers 
>> that use the platform device mechanism at present. I've only looked at 
>> arch/m68k though.
> 
> So we're back at the same problem that Geert's suggestion also raised: how 
> to identify potentially affected platform devices and drivers?
> 
> Maybe we can take a leaf out of Christoph's book, and leave a noisy 
> WARNING splat in the log.
> 
> void arch_setup_pdev_archdata(struct platform_device *pdev)
> {
> 	WARN_ON_ONCE(pdev->dev.coherent_dma_mask != DMA_MASK_NONE ||
> 		     pdev->dev.dma_mask != NULL);

I'd suggest using WARN_ON() so we catch all uses on a particular platform.

I initially thought it necessary to warn on unset mask here, but I see
that would throw up a lot of redundant false positives.

Cheers,

Michael


Re: [PATCH net] macmace: Set platform device coherent_dma_mask

2018-05-11 Thread Michael Schmitz
Hi Finn,

On 11.05.2018 17:28, Finn Thain wrote:
> On Fri, 11 May 2018, Michael Schmitz wrote:
> 
>>
>> I'm afraid using platform_device_register() (which you already use for 
>> the SCC devices) is the only option for handling this on a per-device basis 
>> without touching platform core code, while at the same time keeping the 
>> DMA mask setup out of device drivers
> 
> I don't think that will fly. If you call platform_device_register() and 
> follow that with a dma mask assignment, you could race with the bus 
> matching and driver probe, and we are back to the same WARNING message.

I wasn't proposing to follow platform_device_register() with the mask
assignment, but rather to use the same strategy from the Coldfire FEC
patch (f61e64310b75): set pdev.dev.coherent_dma_mask and
pdev.dev.dma_mask _before_ calling platform_device_register().
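
In sketch form, with a hypothetical macsonic device standing in (the
names are illustrative, not the mainline definitions):

    static u64 mac_sonic_dma_mask = DMA_BIT_MASK(32);

    static struct platform_device mac_sonic_device = {
        .name = "macsonic",
        .id   = -1,
        .dev  = {
            .dma_mask          = &mac_sonic_dma_mask,
            .coherent_dma_mask = DMA_BIT_MASK(32),
        },
    };

    /* ... in platform init code, with the masks already populated: */
    platform_device_register(&mac_sonic_device);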

> If you want to use platform_device_register(), you'd have to implement 
> arch_setup_pdev_archdata() and use that to set up the dma mask.

If you want to avoid the overhead of defining the macmace platform
device and using platform_device_register() ... seeing as you would not
be defining just the DMA mask and nothing else, that's probably the
least effort for the MACE and SONIC drivers.

>> (I can see Geert's point there - device driver code might be shared 
>> across implementations of the device on platforms with different DMA 
>> mask requirements,, something the driver can't be expected to know 
>> about).
> 
> As I said, these drivers might be expected to be portable between Macs and 
> early PowerMacs, but the same dma mask would apply AFAIK.
> 
> If a platform driver isn't expected to be portable, I think either method 
> is reasonable: arch_setup_pdev_archdata() or the method in the patch.
> 
> Anyway, there is this in arch/powerpc/kernel/setup-common.c:
> 
> void arch_setup_pdev_archdata(struct platform_device *pdev)
> {
> 	pdev->archdata.dma_mask = DMA_BIT_MASK(32);
> 	pdev->dev.dma_mask = &pdev->archdata.dma_mask;
> 	...

You would have to be careful not to overwrite a pdev->dev.dma_mask and
pdev->dev.coherent_dma_mask that might have been set in a platform
device passed via platform_device_register here. Coldfire is the only
m68k platform currently using that, but there might be others in future.

> I'm inclined to propose something similar for m68k. That should fix the 
> problem, since arch_setup_pdev_archdata() is already in the call chain:
> 
>   platform_device_register_simple()
>   platform_device_register_resndata()
>   platform_device_register_full()
>   platform_device_alloc()
>   arch_setup_pdev_archdata()
> 
> Thoughts? Will this have nasty side effects for m68k platforms that use 
> smaller dma masks?

These can still set a smaller mask in pdev->dev directly and use
platform_device_register() instead. But I don't think there are smaller
DMA masks used by m68k drivers that use the platform device mechanism at
present. I've only looked at arch/m68k though.

Cheers,

Michael


Re: [PATCH net] macmace: Set platform device coherent_dma_mask

2018-05-10 Thread Michael Schmitz
Hi Finn,

On 11.05.2018 15:28, Finn Thain wrote:
> On Fri, 11 May 2018, Michael Schmitz wrote:
> 
>>>> Which begs the question: why can't you set up all Nubus bus devices' 
>>>> DMA masks in nubus_device_register(), or nubus_add_board()?
>>>
>>> I am expecting to see the same WARNING from the nubus sonic driver but 
>>> it hasn't happened yet, so I don't have a patch for it yet. In 
>>> any case, the nubus fix would be a lot like the zorro bus fix, so I 
>>> don't see a problem.
>>
>> That's odd. But what I meant to say is that by setting up 
>> dma_coherent_mask in nubus_add_board(), and pointing dma_mask to that, 
>> ypu won't need any patches to Nubus device drivers.
> 
> Right. I think I've already acknowledged that. But it's off-topic, because 
> the patches under review are for platform drivers. Those patches fix an 
> actual bug that I've observed. Whereas, the nubus driver dma mask issue 
> that you raised is purely theoretical at this stage.

I had lost track of the fact that macsonic can be probed as either Nubus
or platform device. Sorry for the noise.

I'm afraid using platform_device_register() (which you already use for
the SCC devices) is the only option for handling this on a per-device basis
without touching platform core code, while at the same time keeping the
DMA mask setup out of device drivers (I can see Geert's point there -
device driver code might be shared across implementations of the device
on platforms with different DMA mask requirements, something the driver
can't be expected to know about).

Cheers,

Michael


Re: [PATCH net] macmace: Set platform device coherent_dma_mask

2018-05-10 Thread Michael Schmitz
Hi Finn,

On Fri, May 11, 2018 at 11:55 AM, Finn Thain  wrote:

>> > What's worse, if you do pass a dma_mask in struct
>> > platform_device_info, you end up with this problem in
>> > platform_device_register_full():
>> >
>> > 	if (pdevinfo->dma_mask) {
>> > 		/*
>> > 		 * This memory isn't freed when the device is put,
>> > 		 * I don't have a nice idea for that though.  Conceptually
>> > 		 * dma_mask in struct device should not be a pointer.
>> > 		 * See http://thread.gmane.org/gmane.linux.kernel.pci/9081
>> > 		 */
>> > 		pdev->dev.dma_mask =
>> > 			kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL);
>>
>> Maybe platform_device_register_full() should rather check whether
>> dev.coherent_dma_mask is set, and make dev.dma_mask point to that? This
>> is how we solved the warning issue for the Zorro bus devices...
>> (8614f1b58bd0e920a5859464a500b93152c5f8b1)
>>
>
> The claim in the comment above that a pointer is the wrong solution
> suggests that your proposal won't get far. Also, your proposal doesn't

I read the comment to be mostly concerned about not freeing memory,
and attempted to address that. I won't pretend it's the right thing to
do if the pointer will go away anyway, and I certainly won't submit a
patch. Sorry for muddling the issue.
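
For what it's worth, the Zorro-style check referred to above would
amount to something like this inside platform_device_register_full()
(a sketch only; it was never merged):

    /* Reuse the coherent mask's storage instead of kmalloc'ing a
     * separate u64, as the Zorro fix (8614f1b58bd0) did.
     */
    if (pdevinfo->dma_mask) {
        pdev->dev.coherent_dma_mask = pdevinfo->dma_mask;
        pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
    }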

> address the other issues I raised: a new
> platform_device_register_simple_dma() API would only have two callers, and
> the dma mask setup for device-tree probed platform devices is apparently a
> work-in-progress (which I don't want to churn up).

Yes, and that's why I would prefer your old patch handling this in the
device driver (which Geert didn't like), or in the alternative to set
the mask up when registering a device with its bus where appropriate.

I concede this won't help with pure platform devices but as we can't
test all these, we should leave the fix for platform devices up to
Christoph.

>
>> > > With people setting the mask to kill the WARNING splat, this may
>> > > become more common.
>> >
>> > Since the commit which introduced the WARNING, only commits f61e64310b75
>> > ("m68k: set dma and coherent masks for platform FEC ethernets") and
>> > 7bcfab202ca7 ("powerpc/macio: set a proper dma_coherent_mask") seem to be
>> > aimed at squelching that WARNING.
>> >
>> > (Am I missing any others?)
>>
>> Zorro devices :-)
>
> Right, I should add commit 55496d3fe2ac ("zorro: Set up z->dev.dma_mask
> for the DMA API") to that list.
>
>> Which begs the question: why can't you set up all Nubus bus devices' DMA
>> masks in nubus_device_register(), or nubus_add_board()?
>
> I am expecting to see the same WARNING from the nubus sonic driver but it
> hasn't happened yet, so I don't have a patch for it yet. In any case, the
> nubus fix would be a lot like the zorro bus fix, so I don't see a problem.

That's odd. But what I meant to say is that by setting up
coherent_dma_mask in nubus_add_board(), and pointing dma_mask to that,
you won't need any patches to Nubus device drivers.

I must be missing something else...

Cheers,

  Michael


>
> --


Re: [PATCH net] macmace: Set platform device coherent_dma_mask

2018-05-10 Thread Michael Schmitz
Hi Finn,

On Thu, May 10, 2018 at 1:25 PM, Finn Thain  wrote:
> On Thu, 3 May 2018, Geert Uytterhoeven wrote:
>
>>
>> Perhaps you can add a new helper
>> (platform_device_register_simple_dma()?) that takes the DMA mask, too?
[...]
> To actually hoist the dma mask setup out of existing platform drivers
> would have implications for every device that matches with those drivers.
>
> That's a bit risky since I can't test those devices -- that's assuming I
> could identify them all; sometimes platform device matching is not well
> defined at build time (see loongson_sysconf.ecname).
>
> So far, it looks like macmace and macsonic would be the only callers of
> this new API call.
>
> What's worse, if you do pass a dma_mask in struct platform_device_info,
> you end up with this problem in platform_device_register_full():
>
> 	if (pdevinfo->dma_mask) {
> 		/*
> 		 * This memory isn't freed when the device is put,
> 		 * I don't have a nice idea for that though.  Conceptually
> 		 * dma_mask in struct device should not be a pointer.
> 		 * See http://thread.gmane.org/gmane.linux.kernel.pci/9081
> 		 */
> 		pdev->dev.dma_mask =
> 			kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL);

Maybe platform_device_register_full() should rather check whether
dev.coherent_dma_mask is set, and make dev.dma_mask point to that?
This is how we solved the warning issue for the Zorro bus devices...
(8614f1b58bd0e920a5859464a500b93152c5f8b1)

Not sure what the ramifications of that change would be in the general
case (i.e. platforms where coherent and non-coherent DMA operations
must use different masks). I'd hope all those platforms explicitly set
up their DMA masks anyway.

Your other comment regarding the default used by dma_get_mask() is spot on.

> Most of the platform drivers that call dma_coerce_mask_and_coherent() are
> using pdev->of_match_table, not platform_device_register_simple(). Many of
> them have a comment like this:
>
> /*
>  * Right now device-tree probed devices don't get dma_mask set.
>  * Since shared usb code relies on it, set it here for now.
>  * Once we have dma capability bindings this can go away.
>  */
>
>> With people setting the mask to kill the WARNING splat, this may become
>> more common.
>
> Since the commit which introduced the WARNING, only commits f61e64310b75
> ("m68k: set dma and coherent masks for platform FEC ethernets") and
> 7bcfab202ca7 ("powerpc/macio: set a proper dma_coherent_mask") seem to be
> aimed at squelching that WARNING.
>
> (Am I missing any others?)

Zorro devices :-) Which begs the question: why can't you set up all
Nubus bus devices' DMA masks in nubus_device_register(), or
nubus_add_board()?

> So far, this is not looking like a common problem, and I'm having trouble
> finding some way to improve on my original patches.

Putting this in the core platform device code might have too many
unintended side effects. Platform specific bus drivers or device
drivers might be a safer place to put this. Makes it harder for
Christoph to find all instances of such workarounds though.

Cheers,

  Michael


Re: [PATCH] nubus: Unconditionally register bus type

2018-05-08 Thread Michael Schmitz
Hi Greg,

On 08.05.2018 19:25, Greg Kroah-Hartman wrote:
> On Tue, May 08, 2018 at 09:07:27AM +0200, Geert Uytterhoeven wrote:
>> Hi Greg,
>>
>> On Tue, May 8, 2018 at 9:00 AM, Greg Kroah-Hartman
>> <gre...@linuxfoundation.org> wrote:
>>> On Mon, May 07, 2018 at 09:51:12AM +1200, Michael Schmitz wrote:
>>>> the BUG() was triggered by loading a Mac Nubus network card module on
>>>> a multiplatform kernel running on an Amiga machine. Up to Finn's Nubus
>>>> core rewrite (this February), we've seen no errors. Since then, Nubus
>>>> drivers fail to register because the Nubus bus is only registered on
>>>> Macs.
>>>>
>>>> Can't see link order involved here at all.
>>>
>>> The link order is totally involved here :)
>>>
>>> Link order determines the order in which init calls are run, so you need
>>> to ensure that your bus code comes before any drivers that use that bus
>>> code in link order.  That way, at init time, your bus is created first,
>>> preventing this type of error to happen.
>>
>> The issue here is not due to link ordering, but due to the bus not being
>> registered on a system that doesn't have that particular bus.
> 
> But how can that happen if the bus code is not present in the system at
> that point in time?  Hardware doesn't matter at all here.
> 
>> Akin to booting a kernel on an old PC without PCI, and loading a driver
>> module for a PCI network card. I guess that doesn't crash (because no one
>> has a PC without PCI anymore? ;-)
> 
> No, it should work just fine, try it!  :)
> 
> The driver will not bind to anything, but the bus code should work
> properly, as long as it is initialized before the driver tries to
> register with that specific bus type.

Before Finn's patch, the Nubus init call used to do this:

static int __init nubus_init(void)
{
	int err;

	if (!MACH_IS_MAC)
		return 0;

	nubus_proc_init();
	err = nubus_bus_register();
	if (err)
		return err;
	nubus_scan_bus();
	return 0;
}

subsys_initcall(nubus_init);

MACH_IS_MAC returns false if run on Amiga, Atari, ... anything but Mac.
The Nubus bus driver is only registered on Mac hardware (on the
assumption that it won't be useful anywhere else and so should not be
needed; a little naive, as we've seen now).

The initcalls for Nubus hardware drivers attempt to register the driver
with the Nubus bus driver, regardless of what hardware we're running on,
i.e. regardless of whether or not the Nubus bus ever registered with the
core.

Finn's patch sidesteps the issue by registering the Nubus bus with the
core unconditionally (PCI probably does the same). Drivers can then
register successfully but will never see their probe code called because
the Nubus driver only scans the bus for devices if running on a Mac. So
no harm done either way.

An alternative might be to only allow devices to register if the Nubus
bus has successfully registered with the core (which it wouldn't do
unless running on Mac hardware). That would solve link order and wrong
hardware issues at the same time.
(Link order is Nubus before network drivers as far as I could see from
drivers/built-in.a)
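
Roughly, as a hypothetical sketch (nubus_bus_registered is an invented
flag that nubus_init() would set; this is not mainline nubus code):

    int nubus_driver_register(struct nubus_driver *ndrv)
    {
        if (!nubus_bus_registered)      /* invented flag */
            return -ENODEV;
        ndrv->driver.bus = &nubus_bus_type;
        return driver_register(&ndrv->driver);
    }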

Cheers,

Michael

> 
> thanks,
> 
> greg k-h
> 


Re: moving affs + RDB partition support to staging?

2018-05-07 Thread Michael Schmitz
Martin,

On Mon, May 7, 2018 at 7:08 PM, Martin Steigerwald <mar...@lichtvoll.de> wrote:
> Michael Schmitz - 07.05.18, 04:40:
>> Al,
>>
>> I don't think there is USB sticks with affs on them as yet. There
>> isn't even USB host controller support for Amiga hardware (yet).
>>
>> Last I tried USB on m68k (Atari, 060 accelerator) the desktop
>> experience was such that I'd rather not repeat that in a hurry (and
>> that was a simple FAT USB stick).
>
> There is USB support available on Amiga since a long time.

Good to hear that. I stand corrected.

> On "Classic" Amigas AmigaOS 3.x with Poseidon USB stack + some USB card.

Haven't seen a Linux driver for that 'some USB card' yet.

> On AmigaOS 4.x built-in. AmigaOS 4.x hardware like Sam boards from Acube
> Systems have USB controllers that work out of the box.

Forgot about the new (non-m68k) hardware. My focus is somewhat narrow,
on m68k and Linux.

> And I am pretty sure, you can also tell it to use Amiga Fast Filesystem
> (on Linux affs) on an USB stick. Also you can plug in an external
> harddisk with RDB partitions and whatever filesystems you wish.

I already conceded that's possible.

So our problem with the bug Al spotted and AFFS on USB media is twofold:

AmigaOS:
Exploitable: yes (unless the AmigaOS AFFS driver detects and mitigates this).
Likelihood: low (as Joanne said there are easier ways to do harm to
these systems)

Linux:
Exploitable: yes, except on hardware without USB support.
Likelihood: high

Can we blacklist affs from being autoloaded through udev on USB
storage media discovery?
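
A stock modprobe configuration would be one way (an illustrative
sketch; whether every udev autoload path honours it wasn't settled in
this thread):

    # /etc/modprobe.d/blacklist-affs.conf
    # "blacklist" stops alias-based autoloading; the "install" override
    # also defeats explicit modprobe-by-name requests.
    blacklist affs
    install affs /bin/false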

Cheers,

  Michael

