Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
On Thu, 15 Mar 2018 09:35:51 -0700 (PDT) Liran Alon  
wrote:
> - shmulik.ladk...@gmail.com wrote:
> 
> > On Thu, 15 Mar 2018 08:01:03 -0700 (PDT) Liran Alon
> >  wrote:  
> > > 
> > > I still think that default behavior should be to zero skb->mark only  
> > when skb  
> > > cross netdevs in different netns.  
> > 
> > But the previous default was scrub the mark in *both* xnet and
> > non-xnet
> > situations.
> > 
> > Therefore, there might be users which RELY on this (strange) default
> > behavior in their same-netns-veth-pair setups.
> > Meaning, changing the default behavior might break their apps relying
> > on
> > the former default behavior.
> > 
> > This is why the "disable mark scrubbing in non-xnet case" should be
> > opt-in.  
> 
> We think the same.
> The only difference is that I think this for now should be controllable
> by a global /proc/sys/net/core file instead of giving a flexible per-netdev
> control.
> Because that is a larger change that could be done later.

A flags attribute to veth newlink is a very scoped change.
User controls this per veth creation.
This is way more neat than /proc/sys/net and provides the desired granular
control.

Also, scoping this to veth has the advantage of not affecting the many other
dev_forward_skb callers.

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
On Thu, 15 Mar 2018 09:35:51 -0700 (PDT) Liran Alon  
wrote:
> - shmulik.ladk...@gmail.com wrote:
> 
> > On Thu, 15 Mar 2018 08:01:03 -0700 (PDT) Liran Alon
> >  wrote:  
> > > 
> > > I still think that default behavior should be to zero skb->mark only  
> > when skb  
> > > cross netdevs in different netns.  
> > 
> > But the previous default was scrub the mark in *both* xnet and
> > non-xnet
> > situations.
> > 
> > Therefore, there might be users which RELY on this (strange) default
> > behavior in their same-netns-veth-pair setups.
> > Meaning, changing the default behavior might break their apps relying
> > on
> > the former default behavior.
> > 
> > This is why the "disable mark scrubbing in non-xnet case" should be
> > opt-in.  
> 
> We think the same.
> The only difference is that I think this for now should be controllable
> by a global /proc/sys/net/core file instead of giving a flexible per-netdev
> control.
> Because that is a larger change that could be done later.

A flags attribute to veth newlink is a very scoped change.
User controls this per veth creation.
This is way more neat than /proc/sys/net and provides the desired granular
control.

Also, scoping this to veth has the advantage of not affecting the many other
dev_forward_skb callers.

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
On Thu, 15 Mar 2018 08:01:03 -0700 (PDT) Liran Alon  
wrote:
> 
> I still think that default behavior should be to zero skb->mark only when skb
> cross netdevs in different netns.

But the previous default was scrub the mark in *both* xnet and non-xnet
situations.

Therefore, there might be users which RELY on this (strange) default
behavior in their same-netns-veth-pair setups.
Meaning, changing the default behavior might break their apps relying on
the former default behavior.

This is why the "disable mark scrubbing in non-xnet case" should be opt-in.

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
On Thu, 15 Mar 2018 08:01:03 -0700 (PDT) Liran Alon  
wrote:
> 
> I still think that default behavior should be to zero skb->mark only when skb
> cross netdevs in different netns.

But the previous default was scrub the mark in *both* xnet and non-xnet
situations.

Therefore, there might be users which RELY on this (strange) default
behavior in their same-netns-veth-pair setups.
Meaning, changing the default behavior might break their apps relying on
the former default behavior.

This is why the "disable mark scrubbing in non-xnet case" should be opt-in.

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
On Thu, 15 Mar 2018 16:13:39 +0100 Daniel Borkmann <dan...@iogearbox.net> wrote:
> On 03/15/2018 01:50 PM, Shmulik Ladkani wrote:
> > 
> > It would be beneficial to have the mark preserved when skb is injected
> > to the slave device's rx path (especially when it's on the same netns).  
> 
> Right, I think also here the easiest would be to have a BPF_F_PRESERVE_MARK
> flag to opt-in in general case (xnet/non-xnet)

Sounds okay to me.

> But lets presume for a sec you would _not_ scrub it, then how are users
> supposed to make use of this? The feature/bug may not be critical enough
> (well, otherwise it wouldn't have been like this for long time) for stable,
> so to write an app relying on it the behavior will change from kernel A to
> kernel B, where you need to end up having a full blown veth run-time test
> in order to figure it out before you can use it, not really useful either.

Let's assume BPF_F_PRESERVE_MARK is a feature then, which is available only
in new kernels.
As said, this flag will not be honored by older kernels.

But your "run-time test" argument is true for every new flag-bit
introduced to bpf functions, for example:
 BPF_F_SEQ_NUMBER was added after other skb_set_tunnel_key flags,
 Same for BPF_F_INVALIDATE_HASH (skb_store_bytes), BPF_F_MARK_ENFORCE
 (l4_csum_replace) and others.

With every flag addition, the flag mask validation in the corresponding
bpf function has been relaxed to support it.

Why is BPF_F_PRESERVE_MARK any different from any previous flag addition?

Thanks,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
On Thu, 15 Mar 2018 16:13:39 +0100 Daniel Borkmann  wrote:
> On 03/15/2018 01:50 PM, Shmulik Ladkani wrote:
> > 
> > It would be beneficial to have the mark preserved when skb is injected
> > to the slave device's rx path (especially when it's on the same netns).  
> 
> Right, I think also here the easiest would be to have a BPF_F_PRESERVE_MARK
> flag to opt-in in general case (xnet/non-xnet)

Sounds okay to me.

> But lets presume for a sec you would _not_ scrub it, then how are users
> supposed to make use of this? The feature/bug may not be critical enough
> (well, otherwise it wouldn't have been like this for long time) for stable,
> so to write an app relying on it the behavior will change from kernel A to
> kernel B, where you need to end up having a full blown veth run-time test
> in order to figure it out before you can use it, not really useful either.

Let's assume BPF_F_PRESERVE_MARK is a feature then, which is available only
in new kernels.
As said, this flag will not be honored by older kernels.

But your "run-time test" argument is true for every new flag-bit
introduced to bpf functions, for example:
 BPF_F_SEQ_NUMBER was added after other skb_set_tunnel_key flags,
 Same for BPF_F_INVALIDATE_HASH (skb_store_bytes), BPF_F_MARK_ENFORCE
 (l4_csum_replace) and others.

With every flag addition, the flag mask validation in the corresponding
bpf function has been relaxed to support it.

Why is BPF_F_PRESERVE_MARK any different from any previous flag addition?

Thanks,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
Hi,

On Thu, 15 Mar 2018 12:56:13 +0100 Daniel Borkmann <dan...@iogearbox.net> wrote:
> On 03/15/2018 10:21 AM, Shmulik Ladkani wrote:
> > 
> > Regarding veth xmit, it does makes sense to preserve the fields if not
> > crossing netns. This is also the case when one uses tc mirred.
> > 
> > Regarding bpf redirect, well, it depends on the expectations of each bpf
> > program.
> > I'd argue that preserving the fields (at least the mark field) in the
> > *non* xnet makes sense and provides more information and therefore more
> > capabilities; Alas this might change behavior already being relied on.
> > 
> > Maybe Daniel can comment on the matter.  
> 
> Overall I think it might be nice to not need scrubbing skb in such cases,
> although my concern would be that this has potential to break existing
> setups when they would expect mark being zero on other veth peer in any
> case since it's the behavior for a long time already. The safer option
> would be to have some sort of explicit opt-in e.g. on link creation to let
> the skb->mark pass through unscrubbed. This would definitely be a useful
> option e.g. when mark is set in the netns facing veth via clsact/egress
> on xmit and when the container is unprivileged anyway.

For the veth xmit case, an opt-in flag which disables mark scrubbing in
the *non* xnet veth-pair seems reasonable.

But what about bpf_redirect BPF_F_INGRESS, in setups not invovling
containers?
Currently bpf_redirect is implemented using dev_forward_skb which
*fully* scrubs the skb, even if the target device is on same netns as
skb->dev is.

One might use ebpf programs that perform BPF_F_INGRESS bpf_redirect, for
example for demuxing skbs arriving on some "master" device into various
"slave" devices using specialized critiria.

It would be beneficial to have the mark preserved when skb is injected
to the slave device's rx path (especially when it's on the same netns).

Liran's patch fixes this - but at the cost of changing existing behavior
for BPF_F_INGRESS users (formerly: fully scrubbed; post patch: scrubbed
only if xnet).

I wonder, do you know of implementations that actually RELY on the fact
that BPF_F_INGRESS actually clears the mark, in the *non* xnet case?

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
Hi,

On Thu, 15 Mar 2018 12:56:13 +0100 Daniel Borkmann  wrote:
> On 03/15/2018 10:21 AM, Shmulik Ladkani wrote:
> > 
> > Regarding veth xmit, it does makes sense to preserve the fields if not
> > crossing netns. This is also the case when one uses tc mirred.
> > 
> > Regarding bpf redirect, well, it depends on the expectations of each bpf
> > program.
> > I'd argue that preserving the fields (at least the mark field) in the
> > *non* xnet makes sense and provides more information and therefore more
> > capabilities; Alas this might change behavior already being relied on.
> > 
> > Maybe Daniel can comment on the matter.  
> 
> Overall I think it might be nice to not need scrubbing skb in such cases,
> although my concern would be that this has potential to break existing
> setups when they would expect mark being zero on other veth peer in any
> case since it's the behavior for a long time already. The safer option
> would be to have some sort of explicit opt-in e.g. on link creation to let
> the skb->mark pass through unscrubbed. This would definitely be a useful
> option e.g. when mark is set in the netns facing veth via clsact/egress
> on xmit and when the container is unprivileged anyway.

For the veth xmit case, an opt-in flag which disables mark scrubbing in
the *non* xnet veth-pair seems reasonable.

But what about bpf_redirect BPF_F_INGRESS, in setups not invovling
containers?
Currently bpf_redirect is implemented using dev_forward_skb which
*fully* scrubs the skb, even if the target device is on same netns as
skb->dev is.

One might use ebpf programs that perform BPF_F_INGRESS bpf_redirect, for
example for demuxing skbs arriving on some "master" device into various
"slave" devices using specialized critiria.

It would be beneficial to have the mark preserved when skb is injected
to the slave device's rx path (especially when it's on the same netns).

Liran's patch fixes this - but at the cost of changing existing behavior
for BPF_F_INGRESS users (formerly: fully scrubbed; post patch: scrubbed
only if xnet).

I wonder, do you know of implementations that actually RELY on the fact
that BPF_F_INGRESS actually clears the mark, in the *non* xnet case?

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
Hi,

On Tue, 13 Mar 2018 17:07:22 +0200 Liran Alon  wrote:
> Before this commit, dev_forward_skb() always cleared packet's
> per-network-namespace info. Even if the packet doesn't cross
> network namespaces.
> 
> The comment above dev_forward_skb() describes that this is done
> because the receiving device may be in another network namespace.
> However, this case can easily be tested for and therefore we can
> scrub packet's per-network-namespace info only when receiving device
> is indeed in another network namespace.
> 
> Therefore, this commit changes dev_forward_skb() to tell
> skb_scrub_packet() that skb has crossed network-namespace only in case
> transmitting device (skb->dev) network namespace is different then
> receiving device (dev) network namespace.

Assuming the premise of this commit is correct, note it may not act as
intended for xnet situation in ipvlan_process_multicast, snip:

nskb->dev = ipvlan->dev;
if (tx_pkt)
ret = dev_forward_skb(ipvlan->dev, nskb);
else
ret = netif_rx(nskb);

as 'dev' gets already assigned to nskb prior dev_forward_skb (hence in
dev_forward_skb both dev and skb->dev are the same).
Fortunately every ipvlan_multicast_enqueue call is preceded by a forced
scrub; It would be future-proof to not assign nskb->dev in the
dev_forward_skb case (assign it only in the netif_rx case).


Regarding the premise of this commit, this "reduces" the
ipvs/orphan/mark scrubbing in the following *non* xnet situations:

 1. mac2vlan port xmit to other macvlan ports in Bridge Mode
 2. similarly for ipvlan
 3. veth xmit
 4. l2tp_eth_dev_recv
 5. bpf redirect/clone_redirect ingress actions

Regarding l2tp recv, this commit seems to align the srubbing behavior
with ip tunnels (full scrub only if crossing netns, see ip_tunnel_rcv).

Regarding veth xmit, it does makes sense to preserve the fields if not
crossing netns. This is also the case when one uses tc mirred.

Regarding bpf redirect, well, it depends on the expectations of each bpf
program.
I'd argue that preserving the fields (at least the mark field) in the
*non* xnet makes sense and provides more information and therefore more
capabilities; Alas this might change behavior already being relied on.

Maybe Daniel can comment on the matter.

Regards,
Shmulik


Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

2018-03-15 Thread Shmulik Ladkani
Hi,

On Tue, 13 Mar 2018 17:07:22 +0200 Liran Alon  wrote:
> Before this commit, dev_forward_skb() always cleared packet's
> per-network-namespace info. Even if the packet doesn't cross
> network namespaces.
> 
> The comment above dev_forward_skb() describes that this is done
> because the receiving device may be in another network namespace.
> However, this case can easily be tested for and therefore we can
> scrub packet's per-network-namespace info only when receiving device
> is indeed in another network namespace.
> 
> Therefore, this commit changes dev_forward_skb() to tell
> skb_scrub_packet() that skb has crossed network-namespace only in case
> transmitting device (skb->dev) network namespace is different then
> receiving device (dev) network namespace.

Assuming the premise of this commit is correct, note it may not act as
intended for xnet situation in ipvlan_process_multicast, snip:

nskb->dev = ipvlan->dev;
if (tx_pkt)
ret = dev_forward_skb(ipvlan->dev, nskb);
else
ret = netif_rx(nskb);

as 'dev' gets already assigned to nskb prior dev_forward_skb (hence in
dev_forward_skb both dev and skb->dev are the same).
Fortunately every ipvlan_multicast_enqueue call is preceded by a forced
scrub; It would be future-proof to not assign nskb->dev in the
dev_forward_skb case (assign it only in the netif_rx case).


Regarding the premise of this commit, this "reduces" the
ipvs/orphan/mark scrubbing in the following *non* xnet situations:

 1. mac2vlan port xmit to other macvlan ports in Bridge Mode
 2. similarly for ipvlan
 3. veth xmit
 4. l2tp_eth_dev_recv
 5. bpf redirect/clone_redirect ingress actions

Regarding l2tp recv, this commit seems to align the srubbing behavior
with ip tunnels (full scrub only if crossing netns, see ip_tunnel_rcv).

Regarding veth xmit, it does makes sense to preserve the fields if not
crossing netns. This is also the case when one uses tc mirred.

Regarding bpf redirect, well, it depends on the expectations of each bpf
program.
I'd argue that preserving the fields (at least the mark field) in the
*non* xnet makes sense and provides more information and therefore more
capabilities; Alas this might change behavior already being relied on.

Maybe Daniel can comment on the matter.

Regards,
Shmulik


Re: [PATCH 4.4 10/34] net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions

2016-11-13 Thread Shmulik Ladkani
On Sun, 13 Nov 2016 12:24:42 +0100
Greg Kroah-Hartman <gre...@linuxfoundation.org> wrote:

> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 

Looks ok, thanks.
Reviewed-by: Shmulik Ladkani <shmulik.ladk...@gmail.com>


Re: [PATCH 4.4 10/34] net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions

2016-11-13 Thread Shmulik Ladkani
On Sun, 13 Nov 2016 12:24:42 +0100
Greg Kroah-Hartman  wrote:

> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 

Looks ok, thanks.
Reviewed-by: Shmulik Ladkani 


Re: [PATCH 1/2] tun: Use memdup_user() rather than duplicating its implementation

2016-08-20 Thread Shmulik Ladkani
Hi,

On Sat, 20 Aug 2016 09:34:56 +0200 SF Markus Elfring 
<elfr...@users.sourceforge.net> wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Sat, 20 Aug 2016 08:54:15 +0200
> 
> Reuse existing functionality from memdup_user() instead of keeping
> duplicate source code.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>

Reviewed-by: Shmulik Ladkani <shmulik.ladk...@gmail.com>


Re: [PATCH 1/2] tun: Use memdup_user() rather than duplicating its implementation

2016-08-20 Thread Shmulik Ladkani
Hi,

On Sat, 20 Aug 2016 09:34:56 +0200 SF Markus Elfring 
 wrote:
> From: Markus Elfring 
> Date: Sat, 20 Aug 2016 08:54:15 +0200
> 
> Reuse existing functionality from memdup_user() instead of keeping
> duplicate source code.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 

Reviewed-by: Shmulik Ladkani 


Re: [RFC/PATCH] ubi: Replace wear leveling thread with a workqueue

2012-11-23 Thread Shmulik Ladkani
Hi Ezequiel,

On Fri, 23 Nov 2012 09:13:29 -0300 Ezequiel Garcia  
wrote:
>  /**
> - * ubi_thread - UBI background thread.
> + * ubi_wl_do_work - UBI background wl worker function.
>   * @u: the UBI device description object pointer
>   */
> -int ubi_thread(void *u)
> +void ubi_wl_do_work(struct work_struct *work)
>  {
> + int err;
>   int failures = 0;
> - struct ubi_device *ubi = u;
> -
> - ubi_msg("background thread \"%s\" started, PID %d",
> - ubi->bgt_name, task_pid_nr(current));
> -
> - set_freezable();
> - for (;;) {
> - int err;
> -
> - if (kthread_should_stop())
> - break;
> + struct ubi_device *ubi = container_of(work, struct ubi_device, work);
>  
> - if (try_to_freeze())
> - continue;
> -
> - spin_lock(>wl_lock);
> - if (list_empty(>works) || ubi->ro_mode ||

Originally, 'ubi_thread' did nothing if 'ubi->ro_mode'.
This filtering is missing from 'ubi_wl_do_work' implementation.
How do we guarantee 'ubi_wl_do_work' is never queued when in RO mode?

> - !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
> - set_current_state(TASK_INTERRUPTIBLE);
> - spin_unlock(>wl_lock);
> - schedule();
> - continue;
> - }
> - spin_unlock(>wl_lock);
> + err = do_work(ubi);
> + if (!err)
> + return;
>  
> - err = do_work(ubi);
> - if (err) {
> - ubi_err("%s: work failed with error code %d",
> - ubi->bgt_name, err);
> - if (failures++ > WL_MAX_FAILURES) {
> - /*
> -  * Too many failures, disable the thread and
> -  * switch to read-only mode.
> -  */
> - ubi_msg("%s: %d consecutive failures",
> - ubi->bgt_name, WL_MAX_FAILURES);
> - ubi_ro_mode(ubi);
> - ubi->thread_enabled = 0;
> - continue;
> - }
> - } else
> - failures = 0;
> -
> - cond_resched();
> + ubi_err("%s: work failed with error code %d",
> + ubi->wq_name, err);
> + if (failures++ > WL_MAX_FAILURES) {
> + /*
> +  * Too many failures, disable the workqueue and
> +  * switch to read-only mode.
> +  */

This condition will never be met (after your change), since 'failures'
is local to 'ubi_wl_do_work' (per work invocation).

Formerly, 'failures' was local to 'ubi_thread' (per ubi device's
thread), hence it was possible that several 'do_works()' calls have
failed during thread's lifetime, reaching the WL_MAX_FAILURES limit.

If we'd like to preseve the 'failures' semantics, 'failures' should be
an 'ubi_device' property.

One last thing:
Some variables and functions (debug and sysfs) are still named *bgt*,
which is confusing.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/PATCH] ubi: Replace wear leveling thread with a workqueue

2012-11-23 Thread Shmulik Ladkani
Hi Ezequiel,

On Fri, 23 Nov 2012 09:13:29 -0300 Ezequiel Garcia elezegar...@gmail.com 
wrote:
  /**
 - * ubi_thread - UBI background thread.
 + * ubi_wl_do_work - UBI background wl worker function.
   * @u: the UBI device description object pointer
   */
 -int ubi_thread(void *u)
 +void ubi_wl_do_work(struct work_struct *work)
  {
 + int err;
   int failures = 0;
 - struct ubi_device *ubi = u;
 -
 - ubi_msg(background thread \%s\ started, PID %d,
 - ubi-bgt_name, task_pid_nr(current));
 -
 - set_freezable();
 - for (;;) {
 - int err;
 -
 - if (kthread_should_stop())
 - break;
 + struct ubi_device *ubi = container_of(work, struct ubi_device, work);
  
 - if (try_to_freeze())
 - continue;
 -
 - spin_lock(ubi-wl_lock);
 - if (list_empty(ubi-works) || ubi-ro_mode ||

Originally, 'ubi_thread' did nothing if 'ubi-ro_mode'.
This filtering is missing from 'ubi_wl_do_work' implementation.
How do we guarantee 'ubi_wl_do_work' is never queued when in RO mode?

 - !ubi-thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
 - set_current_state(TASK_INTERRUPTIBLE);
 - spin_unlock(ubi-wl_lock);
 - schedule();
 - continue;
 - }
 - spin_unlock(ubi-wl_lock);
 + err = do_work(ubi);
 + if (!err)
 + return;
  
 - err = do_work(ubi);
 - if (err) {
 - ubi_err(%s: work failed with error code %d,
 - ubi-bgt_name, err);
 - if (failures++  WL_MAX_FAILURES) {
 - /*
 -  * Too many failures, disable the thread and
 -  * switch to read-only mode.
 -  */
 - ubi_msg(%s: %d consecutive failures,
 - ubi-bgt_name, WL_MAX_FAILURES);
 - ubi_ro_mode(ubi);
 - ubi-thread_enabled = 0;
 - continue;
 - }
 - } else
 - failures = 0;
 -
 - cond_resched();
 + ubi_err(%s: work failed with error code %d,
 + ubi-wq_name, err);
 + if (failures++  WL_MAX_FAILURES) {
 + /*
 +  * Too many failures, disable the workqueue and
 +  * switch to read-only mode.
 +  */

This condition will never be met (after your change), since 'failures'
is local to 'ubi_wl_do_work' (per work invocation).

Formerly, 'failures' was local to 'ubi_thread' (per ubi device's
thread), hence it was possible that several 'do_works()' calls have
failed during thread's lifetime, reaching the WL_MAX_FAILURES limit.

If we'd like to preseve the 'failures' semantics, 'failures' should be
an 'ubi_device' property.

One last thing:
Some variables and functions (debug and sysfs) are still named *bgt*,
which is confusing.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-29 Thread Shmulik Ladkani
On Wed, 29 Aug 2012 11:16:05 +0300 Artem Bityutskiy  wrote:
> On Sun, 2012-08-26 at 09:06 +0300, Shmulik Ladkani wrote:
> > root100m@0
> > kernel  100m@100m
> > rootfs  800m@200m (truncated)
> > user0@1g (truncated)
> > rest0@1g
> 
> Who would benefit from having those 2 0-sized partitions and how? How
> many users/scripts would be confused by this (these 2 ghost partitions
> would be visible in /proc/mtd and sysfs)? How much RAM would we spend
> for creating sysfs files and directories for these ghost partitions
> (note, one sysfs file costs a couple KiB I thing, because 'sizeof
> (struct inode)').
> 
> While you suggestion is clever, do we really benefit from this?

Artem, please note this is just a side effect of what I've suggested
(that its, continue parsing after first truncated partition), not a real
use case I'd expect and intentionally wish to support.

I am used to specify partitions explicitly using size@offset (specifying
offset for all parts, even if sometimes adjacent), and sometimes in an
unsorted fashion.
I never defined some partition that got truncated, so the whole
discussion is theoretical to _my_ usecase.

The only benefit of continuing to parse, is that if there _are_ later
partitions which are defined _explicitly_ with an offset, whose location
is _before_ the truncated partition, these would still be parsed and
registered.

Not so important, and would rarely happen, but simplistic and naive.

And reagrding 0 sized partitions, we can always skip these.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-29 Thread Shmulik Ladkani
On Wed, 29 Aug 2012 11:16:05 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 On Sun, 2012-08-26 at 09:06 +0300, Shmulik Ladkani wrote:
  root100m@0
  kernel  100m@100m
  rootfs  800m@200m (truncated)
  user0@1g (truncated)
  rest0@1g
 
 Who would benefit from having those 2 0-sized partitions and how? How
 many users/scripts would be confused by this (these 2 ghost partitions
 would be visible in /proc/mtd and sysfs)? How much RAM would we spend
 for creating sysfs files and directories for these ghost partitions
 (note, one sysfs file costs a couple KiB I thing, because 'sizeof
 (struct inode)').
 
 While you suggestion is clever, do we really benefit from this?

Artem, please note this is just a side effect of what I've suggested
(that its, continue parsing after first truncated partition), not a real
use case I'd expect and intentionally wish to support.

I am used to specify partitions explicitly using size@offset (specifying
offset for all parts, even if sometimes adjacent), and sometimes in an
unsorted fashion.
I never defined some partition that got truncated, so the whole
discussion is theoretical to _my_ usecase.

The only benefit of continuing to parse, is that if there _are_ later
partitions which are defined _explicitly_ with an offset, whose location
is _before_ the truncated partition, these would still be parsed and
registered.

Not so important, and would rarely happen, but simplistic and naive.

And reagrding 0 sized partitions, we can always skip these.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-26 Thread Shmulik Ladkani
Hi,

On Sat, 25 Aug 2012 05:26:51 -0400 Huang Shijie  wrote:
> On Sat, Aug 25, 2012 at 5:02 AM, Shmulik Ladkani
>  wrote:
> > Your analysis seems right, but let me offer an alternative approach.
> >
> > I would simply:
> >
> > -   part->num_parts = i;
> your code does not wors in such kernel command line(also with the 1GB
> nand chip):
> #gpmi-nand:100m(root),100m(kernel),1g(rootfs),1g(user),-(rest)
> 

Can you please detail what do you mean by "not work"?

To my understanding, in this example, according to my suggestion, the
resulting partitions would be:

root100m@0
kernel  100m@100m
rootfs  800m@200m (truncated)
user0@1g (truncated)
rest0@1g

Reasonable IMO, given the fact that the mtd device size is smaller than
the specified parts.

I saw you submitted a patch which sorts the cmdline parts; I don't
understand why this is necessary.
Also, sorting might not be desirable, as the user specified the unsorted
partitions might have _wanted_ them to appear in that order.

Now lets focus on your original suggestion and its consequences:

- Orignal code STOPPED parsing at the 1st truncated partition,
  this partition WAS NOT returned to the caller
- Your patch STOPS AFTER parsing the 1st truncated partition,
  this partiton IS returned to the caller (but partitions specified
  later are no longer parsed)
- My suggestion CONTINUES parsing all partitions.
  So later partitions (specified with the 'size' but *without* 'offset')
  will be truncated AND presented to the caller.
  AND, if later partitions are specified using the 'size@offset'
  explicit format, they are parsed normally.
  
Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-26 Thread Shmulik Ladkani
Hi,

On Sat, 25 Aug 2012 05:26:51 -0400 Huang Shijie shij...@gmail.com wrote:
 On Sat, Aug 25, 2012 at 5:02 AM, Shmulik Ladkani
 shmulik.ladk...@gmail.com wrote:
  Your analysis seems right, but let me offer an alternative approach.
 
  I would simply:
 
  -   part-num_parts = i;
 your code does not wors in such kernel command line(also with the 1GB
 nand chip):
 #gpmi-nand:100m(root),100m(kernel),1g(rootfs),1g(user),-(rest)
 

Can you please detail what do you mean by not work?

To my understanding, in this example, according to my suggestion, the
resulting partitions would be:

root100m@0
kernel  100m@100m
rootfs  800m@200m (truncated)
user0@1g (truncated)
rest0@1g

Reasonable IMO, given the fact that the mtd device size is smaller than
the specified parts.

I saw you submitted a patch which sorts the cmdline parts; I don't
understand why this is necessary.
Also, sorting might not be desirable, as the user specified the unsorted
partitions might have _wanted_ them to appear in that order.

Now lets focus on your original suggestion and its consequences:

- Orignal code STOPPED parsing at the 1st truncated partition,
  this partition WAS NOT returned to the caller
- Your patch STOPS AFTER parsing the 1st truncated partition,
  this partiton IS returned to the caller (but partitions specified
  later are no longer parsed)
- My suggestion CONTINUES parsing all partitions.
  So later partitions (specified with the 'size' but *without* 'offset')
  will be truncated AND presented to the caller.
  AND, if later partitions are specified using the 'size@offset'
  explicit format, they are parsed normally.
  
Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-25 Thread Shmulik Ladkani
On Sat, 25 Aug 2012 05:26:51 -0400 Huang Shijie  wrote:
> > The specified cmdline partitions might not be ordered (according to
> > start offset), so next partition specified after the truncated one might
> > define a partition at the beginning of the device, which is okay
> > (regardless the truncation of current partition).
> could you please give me an example of this specified cmdline?

Assume we have a 1GB(8Gb) nand chip:
  #gpmi-nand:1g@200m(rootfs),100m@0(boot),100m@100m(kernel)

I am used to explicitly specify size@offset for all my parts.

Obviously I won't define a partition above the device size... somewhat
hypothetical discussion here...

But your code will stop after creating 'rootfs' (and original code will
not create a single partition).

Is that what we want?

Or do we want to truncate 'rootfs', but still have the valid 'boot' and
'kerner' partitions?

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong check condition

2012-08-25 Thread Shmulik Ladkani
Hi Huang, Artem,

On Sat, 25 Aug 2012 16:06:50 -0400 Huang Shijie  wrote:
> diff --git a/drivers/mtd/cmdlinepart.c b/drivers/mtd/cmdlinepart.c
> index fc960a3..216d751 100644
> --- a/drivers/mtd/cmdlinepart.c
> +++ b/drivers/mtd/cmdlinepart.c
> @@ -322,13 +322,16 @@ static int parse_cmdline_partitions(struct mtd_info 
> *master,
>   struct cmdline_mtd_partition *part;
>   const char *mtd_id = master->name;
>  
> + if (!mtd_id)
> + return 0;
> +
>   /* parse command line */
>   if (!cmdline_parsed)
>   mtdpart_setup_real(cmdline);
>  
>   for(part = partitions; part; part = part->next)
>   {
> - if ((!mtd_id) || (!strcmp(part->mtd_id, mtd_id)))
> + if (!strcmp(part->mtd_id, mtd_id))
>   {
>   for(i = 0, offset = 0; i < part->num_parts; i++)
>   {

This changes the behavior of cmdling parsing, which might affect users
expecting the old behavior.

According to the remark above 'parse_cmdline_partitions':

 * It returns partitions for the requested mtd device, or
 * the first one in the chain if a NULL mtd_id is passed in.

I think the purpose of a NULL 'mtd_id' was to support simple systems
where there's a single driver and a single chip.
The driver could be dumb, not specifying its 'mtd_info->name'
(thus, a NULL mtd_id is passed).

In this case, since the system is simply configured (one driver, one
chip), 'parse_cmdline_partitions' simply disregards the "mtd-id" name
specified in the cmdline string, allowing the user to present some
arbitrary string there.

I quite remember seeing this pattern somewhere in the past, I don't know
if it's still used, though.

Obviously if you have many drivers (and many chips) in a system, that
won't work; the drivers must initialize 'mtd_info->name' and the user
should present a cmdline that has explicit 'mtd-id's.

So question is, would we like to prohibit NULL mtd-id?

If so, we must make sure all drivers are properly assigning their
'mtd_info->name', and all users correctly specifying 'mtd-id' in their
"mtdparts" cmdline strings.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-25 Thread Shmulik Ladkani
Hi Huang,

On Sat, 25 Aug 2012 10:26:07 -0400 Huang Shijie  wrote:
> diff --git a/drivers/mtd/cmdlinepart.c b/drivers/mtd/cmdlinepart.c
> index 4558e0f..fc960a3 100644
> --- a/drivers/mtd/cmdlinepart.c
> +++ b/drivers/mtd/cmdlinepart.c
> @@ -344,7 +344,8 @@ static int parse_cmdline_partitions(struct mtd_info 
> *master,
>  "%s: partitioning exceeds flash 
> size, truncating\n",
>  part->mtd_id);
>   part->parts[i].size = master->size - 
> offset;
> - part->num_parts = i;
> + part->num_parts = i + 1;
> + break;

Your analysis seems right, but let me offer an alternative approach.

I would simply:

-   part->num_parts = i;

(and not replace it with anything).

The specified cmdline partitions might not be ordered (according to
start offset), so next partition specified after the truncated one might
define a partition at the beginning of the device, which is okay
(regardless the truncation of current partition).

Your patch skips the definitions of next partitions, which can be legit.

I agree specifying "unsorted" partitions is not commonly used (and it
might make no sense when using the "remaining" syntax), but it is legit
to define all partitions _explicitly_ with their size@offset in an
unordered fashion.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-25 Thread Shmulik Ladkani
Hi Huang,

On Sat, 25 Aug 2012 10:26:07 -0400 Huang Shijie shij...@gmail.com wrote:
 diff --git a/drivers/mtd/cmdlinepart.c b/drivers/mtd/cmdlinepart.c
 index 4558e0f..fc960a3 100644
 --- a/drivers/mtd/cmdlinepart.c
 +++ b/drivers/mtd/cmdlinepart.c
 @@ -344,7 +344,8 @@ static int parse_cmdline_partitions(struct mtd_info 
 *master,
  %s: partitioning exceeds flash 
 size, truncating\n,
  part-mtd_id);
   part-parts[i].size = master-size - 
 offset;
 - part-num_parts = i;
 + part-num_parts = i + 1;
 + break;

Your analysis seems right, but let me offer an alternative approach.

I would simply:

-   part-num_parts = i;

(and not replace it with anything).

The specified cmdline partitions might not be ordered (according to
start offset), so next partition specified after the truncated one might
define a partition at the beginning of the device, which is okay
(regardless the truncation of current partition).

Your patch skips the definitions of next partitions, which can be legit.

I agree specifying unsorted partitions is not commonly used (and it
might make no sense when using the remaining syntax), but it is legit
to define all partitions _explicitly_ with their size@offset in an
unordered fashion.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong check condition

2012-08-25 Thread Shmulik Ladkani
Hi Huang, Artem,

On Sat, 25 Aug 2012 16:06:50 -0400 Huang Shijie shij...@gmail.com wrote:
 diff --git a/drivers/mtd/cmdlinepart.c b/drivers/mtd/cmdlinepart.c
 index fc960a3..216d751 100644
 --- a/drivers/mtd/cmdlinepart.c
 +++ b/drivers/mtd/cmdlinepart.c
 @@ -322,13 +322,16 @@ static int parse_cmdline_partitions(struct mtd_info 
 *master,
   struct cmdline_mtd_partition *part;
   const char *mtd_id = master-name;
  
 + if (!mtd_id)
 + return 0;
 +
   /* parse command line */
   if (!cmdline_parsed)
   mtdpart_setup_real(cmdline);
  
   for(part = partitions; part; part = part-next)
   {
 - if ((!mtd_id) || (!strcmp(part-mtd_id, mtd_id)))
 + if (!strcmp(part-mtd_id, mtd_id))
   {
   for(i = 0, offset = 0; i  part-num_parts; i++)
   {

This changes the behavior of cmdling parsing, which might affect users
expecting the old behavior.

According to the remark above 'parse_cmdline_partitions':

 * It returns partitions for the requested mtd device, or
 * the first one in the chain if a NULL mtd_id is passed in.

I think the purpose of a NULL 'mtd_id' was to support simple systems
where there's a single driver and a single chip.
The driver could be dumb, not specifying its 'mtd_info-name'
(thus, a NULL mtd_id is passed).

In this case, since the system is simply configured (one driver, one
chip), 'parse_cmdline_partitions' simply disregards the mtd-id name
specified in the cmdline string, allowing the user to present some
arbitrary string there.

I quite remember seeing this pattern somewhere in the past, I don't know
if it's still used, though.

Obviously if you have many drivers (and many chips) in a system, that
won't work; the drivers must initialize 'mtd_info-name' and the user
should present a cmdline that has explicit 'mtd-id's.

So question is, would we like to prohibit NULL mtd-id?

If so, we must make sure all drivers are properly assigning their
'mtd_info-name', and all users correctly specifying 'mtd-id' in their
mtdparts cmdline strings.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mtd: cmdlinepart: fix the wrong partitions number when truncating occurs

2012-08-25 Thread Shmulik Ladkani
On Sat, 25 Aug 2012 05:26:51 -0400 Huang Shijie shij...@gmail.com wrote:
  The specified cmdline partitions might not be ordered (according to
  start offset), so next partition specified after the truncated one might
  define a partition at the beginning of the device, which is okay
  (regardless the truncation of current partition).
 could you please give me an example of this specified cmdline?

Assume we have a 1GB(8Gb) nand chip:
  #gpmi-nand:1g@200m(rootfs),100m@0(boot),100m@100m(kernel)

I am used to explicitly specify size@offset for all my parts.

Obviously I won't define a partition above the device size... somewhat
hypothetical discussion here...

But your code will stop after creating 'rootfs' (and original code will
not create a single partition).

Is that what we want?

Or do we want to truncate 'rootfs', but still have the valid 'boot' and
'kerner' partitions?

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 6/8] UBI: replace MTD_UBI_BEB_LIMIT with module parameter

2012-08-19 Thread Shmulik Ladkani
Hi Richard,

On Fri, 17 Aug 2012 16:35:22 +0200 Richard Genoud  
wrote:
> +   "MTD devices may be specified by their number, name, or 
> path to the MTD character device node.\n"
> +   "Optional \"vid_hdr_offs\" parameter specifies UBI VID 
> header position to be used by UBI. (default value if 0 or not set)\n"
> +   "Optional \"max_beb_per1024\" parameter specifies the 
> maximum expected bad eraseblock per 1024 eraseblocks. (default value ("
> +   __stringify(CONFIG_MTD_UBI_BEB_LIMIT) ") if 0 or not 
> set)\n"

Looks like "mtd=/dev/mtd1,,25" is invalid, although we state "default
value if 0 or not set".

Also, lines are exceeding 80 chars, not sure this is more readable than
if they were broken.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/8] UBI: check max_beb_per1024 value in ubi_attach_mtd_dev

2012-08-19 Thread Shmulik Ladkani
Hi Richard,

On Fri, 17 Aug 2012 16:35:21 +0200 Richard Genoud  
wrote:
> + /*
> +  * A value of 0 is forced to the default value to keep the same
> +  * behavior between ubiattach command and module parameter
> +  */

Minor thing.

Since the module parameter is not yet introduced (only in a later
patch), and since last part of sentence isn't that important, I would
simply state:

+* Use the default if max_beb_per1024 isn't provided.

or alike.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/8] UBI: prepare for max_beb_per1024 module parameter addition

2012-08-19 Thread Shmulik Ladkani
On Fri, 17 Aug 2012 16:35:20 +0200 Richard Genoud  
wrote:
> This patch prepare the way for the addition of max_beb_per1024 module
> parameter.
> There's no functional change.
> 
> Signed-off-by: Richard Genoud 

Reviewed-by: Shmulik Ladkani 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/8] UBI: separate bad_peb_limit in a function

2012-08-19 Thread Shmulik Ladkani
On Fri, 17 Aug 2012 16:35:18 +0200 Richard Genoud  
wrote:
> No functional changes here, just to prepare for next patch.
> 
> Signed-off-by: Richard Genoud 

Reviewed-by: Shmulik Ladkani 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-19 Thread Shmulik Ladkani
On Thu, 16 Aug 2012 16:33:32 +0300 Artem Bityutskiy  wrote:
> We do not have that big user-base. No one uses 0 in the tree, most use
> the default. I never heard anyone using 0. I did not use it also. I
> think it is OK to have the lower range start from non-zero. But why it
> is 2 and not 1 - I am not sure :-)

Artem,

Note that originally range was 0..25 (percentage).

It was changed to be 2..256 (per 1024) by 07ad322 in linux-ubi.

Seems arbitrary change.
So if you'd like, we can fix to 1..256 (or 0..256) - as you prefer.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-19 Thread Shmulik Ladkani
On Thu, 16 Aug 2012 16:33:32 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 We do not have that big user-base. No one uses 0 in the tree, most use
 the default. I never heard anyone using 0. I did not use it also. I
 think it is OK to have the lower range start from non-zero. But why it
 is 2 and not 1 - I am not sure :-)

Artem,

Note that originally range was 0..25 (percentage).

It was changed to be 2..256 (per 1024) by 07ad322 in linux-ubi.

Seems arbitrary change.
So if you'd like, we can fix to 1..256 (or 0..256) - as you prefer.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/8] UBI: separate bad_peb_limit in a function

2012-08-19 Thread Shmulik Ladkani
On Fri, 17 Aug 2012 16:35:18 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 No functional changes here, just to prepare for next patch.
 
 Signed-off-by: Richard Genoud richard.gen...@gmail.com

Reviewed-by: Shmulik Ladkani shmulik.ladk...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/8] UBI: prepare for max_beb_per1024 module parameter addition

2012-08-19 Thread Shmulik Ladkani
On Fri, 17 Aug 2012 16:35:20 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 This patch prepare the way for the addition of max_beb_per1024 module
 parameter.
 There's no functional change.
 
 Signed-off-by: Richard Genoud richard.gen...@gmail.com

Reviewed-by: Shmulik Ladkani shmulik.ladk...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/8] UBI: check max_beb_per1024 value in ubi_attach_mtd_dev

2012-08-19 Thread Shmulik Ladkani
Hi Richard,

On Fri, 17 Aug 2012 16:35:21 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 + /*
 +  * A value of 0 is forced to the default value to keep the same
 +  * behavior between ubiattach command and module parameter
 +  */

Minor thing.

Since the module parameter is not yet introduced (only in a later
patch), and since last part of sentence isn't that important, I would
simply state:

+* Use the default if max_beb_per1024 isn't provided.

or alike.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 6/8] UBI: replace MTD_UBI_BEB_LIMIT with module parameter

2012-08-19 Thread Shmulik Ladkani
Hi Richard,

On Fri, 17 Aug 2012 16:35:22 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 +   MTD devices may be specified by their number, name, or 
 path to the MTD character device node.\n
 +   Optional \vid_hdr_offs\ parameter specifies UBI VID 
 header position to be used by UBI. (default value if 0 or not set)\n
 +   Optional \max_beb_per1024\ parameter specifies the 
 maximum expected bad eraseblock per 1024 eraseblocks. (default value (
 +   __stringify(CONFIG_MTD_UBI_BEB_LIMIT) ) if 0 or not 
 set)\n

Looks like mtd=/dev/mtd1,,25 is invalid, although we state default
value if 0 or not set.

Also, lines are exceeding 80 chars, not sure this is more readable than
if they were broken.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-16 Thread Shmulik Ladkani
On Thu, 16 Aug 2012 16:28:38 +0300 Artem Bityutskiy  wrote:
> On Thu, 2012-08-16 at 11:57 +0300, Shmulik Ladkani wrote:
> > 
> > For the simplest systems (those having one ubi device) that need a
> > limit
> > *other* than the default (20 per 1024), they can simply set the config
> > to their chosen value, as they were used to.
> > 
> > With you approach, these system MUST pass the limit parameter via the
> > ioctl / module-parameter. 
> 
> Yeah, when you change the default, you usually need to do an extra step.
> It does not feel too bad, and I would not keep additional configuration
> option for a hypothetical user. If someone suffers, we can add an option
> to change the default. But I'd start without it. So, I think it is OK to
> remove this.

Yes, but the main drawback I was referring to is those platforms already
setting MTD_UBI_BEB_RESERVE other than the default, by means of kernel
configuration.
(there's one platform known to do so in its defconfig, that's
sam9_l9260_defconfig, which uses 3% instead of the "standard" 2%).

These platforms must now change their usermode code to either pass a
module parameter during the insmod or change their attach ioctl of
their application.

We force these systems to change their usermode because we changed ubi's
default BEB limit to be 20/1024 _hardcoded_ (instead of kernel
configurable as previously was).

Is this ok?

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-16 Thread Shmulik Ladkani
Hi Richard, Artem,

On Thu, 16 Aug 2012 12:07:01 +0200 Richard Genoud  
wrote:
> > With you approach, these system MUST pass the limit parameter via the
> > ioctl / module-parameter.
> That's right.
> We can add a kernel config option to change the max_beb_per1024
> default value (actually, this is almost the patch I send first).
> But I see something disturbing with that:
> It means that an ubi_attach call from userspace, without specifying
> max_beb_per1024, won't have the same result, depending of the default
> config value the kernel has been compiled with.
> (Or maybe this behavior is acceptable).

Well, that was the previous behavior of MTD_UBI_BEB_RESERVE, long before
our patchsets.
I think it is acceptable, given the fact it simplifies the configuration
for most simple systems.

Anyway I'm just pointing out the consequences of your change and try to
suggest other alternatives.
Artem should decide as he's the maintainer.

> > Also, since max_beb_per1024 is always set, how one may specify a zero
> > limit?
> You can't.
> Do you think we need that ?

Well again, originally, prior our patchsets, one *could* set a zero
MTD_UBI_BEB_RESERVE for his system. So we're introducing a change that
affects the possible ways an ubi system can be configured, banning
a configuration that was valid in the past.

Does it make sense to set a zero limit? dunno.
For testing purposes, maybe.

Artem, what do you think? prohibit a zero beb limit?

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-16 Thread Shmulik Ladkani
Hi Richard,

Sorry for reviewing this late...

On Tue, 10 Jul 2012 18:23:42 +0200 Richard Genoud  
wrote:
> -config MTD_UBI_BEB_LIMIT
> - int "Maximum expected bad eraseblocks per 1024 eraseblocks"
> - default 20
> - range 2 256

I see some benefit keeping the config.

For the simplest systems (those having one ubi device) that need a limit
*other* than the default (20 per 1024), they can simply set the config
to their chosen value, as they were used to.

With you approach, these system MUST pass the limit parameter via the
ioctl / module-parameter.

> +static int get_bad_peb_limit(const struct ubi_device *ubi, int 
> max_beb_per1024)
> +{
> + int device_peb_count;
> + uint64_t device_size;
> + int beb_limit = 0;
> +
> + /* this has already been checked in ioctl */
> + if (max_beb_per1024 <= 0)
> + goto out;

Can you explain how 'max_beb_per1024 <= 0' may happen? 

It seems that all of your calls to 'ubi_attach_mtd_dev' pass a positive
max_beb_per1024 value (the default is always set). See your
'ubi_mtd_param_parse' and 'ctrl_cdev_ioctl'. Am I missing something?

Also, since max_beb_per1024 is always set, how one may specify a zero
limit?

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/4] UBI: use the whole MTD device size to get bad_peb_limit

2012-08-16 Thread Shmulik Ladkani
Hi,

One more thing...

On Wed, 15 Aug 2012 18:08:51 +0300 Artem Bityutskiy  wrote:
>  config MTD_UBI_BEB_LIMIT
> - int "Percentage of maximum expected bad eraseblocks"
> - default 2
> - range 0 25
> + int "Maximum expected bad eraseblock count per 1024 eraseblocks"
> + default 20
> + range 2 256

Those defconfigs that explicilty set an original LIMIT should be
adjusted as well, as the units have changed, no?

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/4] UBI: use the whole MTD device size to get bad_peb_limit

2012-08-16 Thread Shmulik Ladkani
Hi Artem, Richard,

On Wed, 15 Aug 2012 18:08:51 +0300 Artem Bityutskiy  wrote:
> 1. Invalid blocks are block that contain one or more bad bits beyond
> ECC.

I would remove this one sentence from the log, it is misleading; invalid
blocks are not necessarily related to ECC.

>   if (CONFIG_MTD_UBI_BEB_LIMIT > 0) {
> - int percent = CONFIG_MTD_UBI_BEB_LIMIT;
> - int limit = mult_frac(ubi->peb_count, percent, 100);
> + int per1024 = CONFIG_MTD_UBI_BEB_LIMIT;
> + int limit, device_pebs;
> + uint64_t device_size;
> +
> + /*
> +  * Here we are using size of the entire flash chip and
> +  * not just the MTD partition size because the maximum
> +  * number of bad eraseblocks is a percentage of the
> +  * whole device and bad eraseblocks are not fairly
> +  * distributed over the flash chip. So the worst case
> +  * is that all the bad eraseblocks of the chip are in
> +  * the MTD partition we are attaching (ubi->mtd).
> +  */
> + device_size = mtd_get_device_size(ubi->mtd);
> + device_pebs = mtd_div_by_eb(device_size, ubi->mtd);
> + limit = mult_frac(device_pebs, per1024, 1024);
>  
>   /* Round it up */
> - if (mult_frac(limit, 100, percent) < ubi->peb_count)
> + if (mult_frac(limit, 1024, per1024) < ubi->peb_count)

Oops... should be: 

+   if (mult_frac(limit, 1024, per1024) < device_pebs)

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/4] UBI: use the whole MTD device size to get bad_peb_limit

2012-08-16 Thread Shmulik Ladkani
Hi Artem, Richard,

On Wed, 15 Aug 2012 18:08:51 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 1. Invalid blocks are block that contain one or more bad bits beyond
 ECC.

I would remove this one sentence from the log, it is misleading; invalid
blocks are not necessarily related to ECC.

   if (CONFIG_MTD_UBI_BEB_LIMIT  0) {
 - int percent = CONFIG_MTD_UBI_BEB_LIMIT;
 - int limit = mult_frac(ubi-peb_count, percent, 100);
 + int per1024 = CONFIG_MTD_UBI_BEB_LIMIT;
 + int limit, device_pebs;
 + uint64_t device_size;
 +
 + /*
 +  * Here we are using size of the entire flash chip and
 +  * not just the MTD partition size because the maximum
 +  * number of bad eraseblocks is a percentage of the
 +  * whole device and bad eraseblocks are not fairly
 +  * distributed over the flash chip. So the worst case
 +  * is that all the bad eraseblocks of the chip are in
 +  * the MTD partition we are attaching (ubi-mtd).
 +  */
 + device_size = mtd_get_device_size(ubi-mtd);
 + device_pebs = mtd_div_by_eb(device_size, ubi-mtd);
 + limit = mult_frac(device_pebs, per1024, 1024);
  
   /* Round it up */
 - if (mult_frac(limit, 100, percent)  ubi-peb_count)
 + if (mult_frac(limit, 1024, per1024)  ubi-peb_count)

Oops... should be: 

+   if (mult_frac(limit, 1024, per1024)  device_pebs)

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/4] UBI: use the whole MTD device size to get bad_peb_limit

2012-08-16 Thread Shmulik Ladkani
Hi,

One more thing...

On Wed, 15 Aug 2012 18:08:51 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
  config MTD_UBI_BEB_LIMIT
 - int Percentage of maximum expected bad eraseblocks
 - default 2
 - range 0 25
 + int Maximum expected bad eraseblock count per 1024 eraseblocks
 + default 20
 + range 2 256

Those defconfigs that explicilty set an original LIMIT should be
adjusted as well, as the units have changed, no?

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-16 Thread Shmulik Ladkani
Hi Richard,

Sorry for reviewing this late...

On Tue, 10 Jul 2012 18:23:42 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 -config MTD_UBI_BEB_LIMIT
 - int Maximum expected bad eraseblocks per 1024 eraseblocks
 - default 20
 - range 2 256

I see some benefit keeping the config.

For the simplest systems (those having one ubi device) that need a limit
*other* than the default (20 per 1024), they can simply set the config
to their chosen value, as they were used to.

With you approach, these system MUST pass the limit parameter via the
ioctl / module-parameter.

 +static int get_bad_peb_limit(const struct ubi_device *ubi, int 
 max_beb_per1024)
 +{
 + int device_peb_count;
 + uint64_t device_size;
 + int beb_limit = 0;
 +
 + /* this has already been checked in ioctl */
 + if (max_beb_per1024 = 0)
 + goto out;

Can you explain how 'max_beb_per1024 = 0' may happen? 

It seems that all of your calls to 'ubi_attach_mtd_dev' pass a positive
max_beb_per1024 value (the default is always set). See your
'ubi_mtd_param_parse' and 'ctrl_cdev_ioctl'. Am I missing something?

Also, since max_beb_per1024 is always set, how one may specify a zero
limit?

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-16 Thread Shmulik Ladkani
Hi Richard, Artem,

On Thu, 16 Aug 2012 12:07:01 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
  With you approach, these system MUST pass the limit parameter via the
  ioctl / module-parameter.
 That's right.
 We can add a kernel config option to change the max_beb_per1024
 default value (actually, this is almost the patch I send first).
 But I see something disturbing with that:
 It means that an ubi_attach call from userspace, without specifying
 max_beb_per1024, won't have the same result, depending of the default
 config value the kernel has been compiled with.
 (Or maybe this behavior is acceptable).

Well, that was the previous behavior of MTD_UBI_BEB_RESERVE, long before
our patchsets.
I think it is acceptable, given the fact it simplifies the configuration
for most simple systems.

Anyway I'm just pointing out the consequences of your change and try to
suggest other alternatives.
Artem should decide as he's the maintainer.

  Also, since max_beb_per1024 is always set, how one may specify a zero
  limit?
 You can't.
 Do you think we need that ?

Well again, originally, prior our patchsets, one *could* set a zero
MTD_UBI_BEB_RESERVE for his system. So we're introducing a change that
affects the possible ways an ubi system can be configured, banning
a configuration that was valid in the past.

Does it make sense to set a zero limit? dunno.
For testing purposes, maybe.

Artem, what do you think? prohibit a zero beb limit?

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/4] UBI: replace MTD_UBI_BEB_LIMIT with user-space parameter

2012-08-16 Thread Shmulik Ladkani
On Thu, 16 Aug 2012 16:28:38 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 On Thu, 2012-08-16 at 11:57 +0300, Shmulik Ladkani wrote:
  
  For the simplest systems (those having one ubi device) that need a
  limit
  *other* than the default (20 per 1024), they can simply set the config
  to their chosen value, as they were used to.
  
  With you approach, these system MUST pass the limit parameter via the
  ioctl / module-parameter. 
 
 Yeah, when you change the default, you usually need to do an extra step.
 It does not feel too bad, and I would not keep additional configuration
 option for a hypothetical user. If someone suffers, we can add an option
 to change the default. But I'd start without it. So, I think it is OK to
 remove this.

Yes, but the main drawback I was referring to is those platforms already
setting MTD_UBI_BEB_RESERVE other than the default, by means of kernel
configuration.
(there's one platform known to do so in its defconfig, that's
sam9_l9260_defconfig, which uses 3% instead of the standard 2%).

These platforms must now change their usermode code to either pass a
module parameter during the insmod or change their attach ioctl of
their application.

We force these systems to change their usermode because we changed ubi's
default BEB limit to be 20/1024 _hardcoded_ (instead of kernel
configurable as previously was).

Is this ok?

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mtd: kill MTD_NAND_VERIFY_WRITE

2012-08-15 Thread Shmulik Ladkani
Hi Huang,

On Tue, 14 Aug 2012 22:38:45 -0400 Huang Shijie  wrote:
> diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
> index 588e989..0ca7257 100644
> --- a/drivers/mtd/nand/Kconfig
> +++ b/drivers/mtd/nand/Kconfig
> @@ -22,15 +22,6 @@ menuconfig MTD_NAND
>  
>  if MTD_NAND
>  
> -config MTD_NAND_VERIFY_WRITE
> - bool "Verify NAND page writes"
> - help
> -   This adds an extra check when data is written to the flash. The
> -   NAND flash device internally checks only bits transitioning
> -   from 1 to 0. There is a rare possibility that even though the
> -   device thinks the write was successful, a bit could have been
> -   flipped accidentally due to device wear or something else.
> -

There are some defconfig files which set CONFIG_MTD_NAND_VERIFY_WRITE.

I guess you should submit an accompanying patch that removes
CONFIG_MTD_NAND_VERIFY_WRITE from all defconfig files.

(also, trimmed the CC list for this specific discussion, seems unrelated
to all of the parties)

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mtd: kill MTD_NAND_VERIFY_WRITE

2012-08-15 Thread Shmulik Ladkani
Hi Huang,

On Tue, 14 Aug 2012 22:38:45 -0400 Huang Shijie shij...@gmail.com wrote:
 diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
 index 588e989..0ca7257 100644
 --- a/drivers/mtd/nand/Kconfig
 +++ b/drivers/mtd/nand/Kconfig
 @@ -22,15 +22,6 @@ menuconfig MTD_NAND
  
  if MTD_NAND
  
 -config MTD_NAND_VERIFY_WRITE
 - bool Verify NAND page writes
 - help
 -   This adds an extra check when data is written to the flash. The
 -   NAND flash device internally checks only bits transitioning
 -   from 1 to 0. There is a rare possibility that even though the
 -   device thinks the write was successful, a bit could have been
 -   flipped accidentally due to device wear or something else.
 -

There are some defconfig files which set CONFIG_MTD_NAND_VERIFY_WRITE.

I guess you should submit an accompanying patch that removes
CONFIG_MTD_NAND_VERIFY_WRITE from all defconfig files.

(also, trimmed the CC list for this specific discussion, seems unrelated
to all of the parties)

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: UBI fastmap updates

2012-08-05 Thread Shmulik Ladkani
Hi,

On Thu, 2 Aug 2012 19:45:38 +0200 Richard Weinberger  wrote:
> Okay, then let's explicitly reserve a few PEBs for fastmap.
> This should be very easy task.

Need to consider what's expected when migrating from a former non-FM
UBI system to an FM enabled system, in the case where all PEBs where
consumed (reserved) in the former system.

> How much PEB should be reserved? 2 x sizeof(fastmap)?

Since FM does not use EBA's atomic LEB change when writing the new
fastmap, but instead implements its own FM "leb change" internally -
then reserving 2x is needed if we'd like to avoid reusing the same
fastmap PEB.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: UBI fastmap updates

2012-08-05 Thread Shmulik Ladkani
Hi,

On Thu, 2 Aug 2012 19:45:38 +0200 Richard Weinberger rich...@nod.at wrote:
 Okay, then let's explicitly reserve a few PEBs for fastmap.
 This should be very easy task.

Need to consider what's expected when migrating from a former non-FM
UBI system to an FM enabled system, in the case where all PEBs where
consumed (reserved) in the former system.

 How much PEB should be reserved? 2 x sizeof(fastmap)?

Since FM does not use EBA's atomic LEB change when writing the new
fastmap, but instead implements its own FM leb change internally -
then reserving 2x is needed if we'd like to avoid reusing the same
fastmap PEB.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/5] ubi: Fix bad PEBs reserve caclulation

2012-07-31 Thread Shmulik Ladkani
Hi Artem,

On Mon, 30 Jul 2012 16:56:50 +0300 Artem Bityutskiy  wrote:
> Hi Shmulik, I've separated out the defconfig changes and pushed patches
> 1,2, and 3 to the UBI tree (the master branch). Patches 4 and 5 are
> already merged upstream. I did a couple of minor modifications in
> commentaries and messages and I think in variables declaration section,
> nothing else. I'll send you the patches separately.

Thanks!

I've noticed a diff in the Kconfig describing MTD_UBI_BEB_LIMIT.

In my original [PATCH 2/5] "ubi: Limit amount of reserved eraseblocks
for bad PEB handling" I've amended the MTD_UBI_BEB_LIMIT explanation a
bit.

The diff between what's on linux-ubi and my suggested description is:

- This option specifies the maximum bad physical eraseblocks UBI
- expects on the UBI device (percents of total number of physical
- eraseblocks on this MTD partition). If the underlying flash does not
- admit of bad eraseblocks (e.g. NOR flash), this value is ignored.
+ If the MTD device admits of bad eraseblocks (e.g. NAND flash), UBI
+ reserves some amount of physical eraseblocks to handle new bad
+ eraseblocks.
+ This option specifies the maximum bad eraseblocks UBI expects on the
+ ubi device (percents of total number of flash eraseblocks).
+ This limit is used in order to derive amount of eraseblock UBI
+ reserves for handling new bad blocks.
+ If the device has more bad eraseblocks than this limit, UBI does not
+ reserve any physical eraseblocks for new bad eraseblocks, but
+ attempts to use available eraseblocks (if any).
+ If the underlying flash does not admit of bad eraseblocks (e.g. NOR
+ flash), this value is ignored.

Just wanted to make sure you deliberately discarded these amendments.

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/5] ubi: Fix bad PEBs reserve caclulation

2012-07-31 Thread Shmulik Ladkani
Hi Artem,

On Mon, 30 Jul 2012 16:56:50 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 Hi Shmulik, I've separated out the defconfig changes and pushed patches
 1,2, and 3 to the UBI tree (the master branch). Patches 4 and 5 are
 already merged upstream. I did a couple of minor modifications in
 commentaries and messages and I think in variables declaration section,
 nothing else. I'll send you the patches separately.

Thanks!

I've noticed a diff in the Kconfig describing MTD_UBI_BEB_LIMIT.

In my original [PATCH 2/5] ubi: Limit amount of reserved eraseblocks
for bad PEB handling I've amended the MTD_UBI_BEB_LIMIT explanation a
bit.

The diff between what's on linux-ubi and my suggested description is:

- This option specifies the maximum bad physical eraseblocks UBI
- expects on the UBI device (percents of total number of physical
- eraseblocks on this MTD partition). If the underlying flash does not
- admit of bad eraseblocks (e.g. NOR flash), this value is ignored.
+ If the MTD device admits of bad eraseblocks (e.g. NAND flash), UBI
+ reserves some amount of physical eraseblocks to handle new bad
+ eraseblocks.
+ This option specifies the maximum bad eraseblocks UBI expects on the
+ ubi device (percents of total number of flash eraseblocks).
+ This limit is used in order to derive amount of eraseblock UBI
+ reserves for handling new bad blocks.
+ If the device has more bad eraseblocks than this limit, UBI does not
+ reserve any physical eraseblocks for new bad eraseblocks, but
+ attempts to use available eraseblocks (if any).
+ If the underlying flash does not admit of bad eraseblocks (e.g. NOR
+ flash), this value is ignored.

Just wanted to make sure you deliberately discarded these amendments.

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/5] ubi: introduce ubi->bad_peb_limit

2012-07-19 Thread Shmulik Ladkani
On Wed, 18 Jul 2012 13:40:53 +0300 Artem Bityutskiy  wrote:
> I've also amended the Kconfig text a tiny bit and dropped the defconfig
> changes - let's have them separately as a single patch at the end of the
> series.

Wouldn't having the defconfig change as the last patch break things for
those defconfigs that had explicitly set CONFIG_MTD_UBI_BEB_RESERVE
other than the default?

Meaning, if the one-before-last would be "kill CONFIG_MTD_UBI_BEB_RESERVE",
then those defconfigs that had _explicitly_ set a BEB_RESERVE value,
which do not YET set a BEB_LIMIT value, will have their BEB_LIMIT as
the default - but they actually meant a specific value other than the
default.

This is why I tried to:
- set the CONFIG_MTD_UBI_BEB_LIMIT in defconfigs as part of the commit
  which introduces this config (copy same value as their RESERVE config)
- kill all CONFIG_MTD_UBI_BEB_RESERVE references from defconfigs as part
  of the commit which kills it

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/5] ubi: introduce ubi-bad_peb_limit

2012-07-19 Thread Shmulik Ladkani
On Wed, 18 Jul 2012 13:40:53 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 I've also amended the Kconfig text a tiny bit and dropped the defconfig
 changes - let's have them separately as a single patch at the end of the
 series.

Wouldn't having the defconfig change as the last patch break things for
those defconfigs that had explicitly set CONFIG_MTD_UBI_BEB_RESERVE
other than the default?

Meaning, if the one-before-last would be kill CONFIG_MTD_UBI_BEB_RESERVE,
then those defconfigs that had _explicitly_ set a BEB_RESERVE value,
which do not YET set a BEB_LIMIT value, will have their BEB_LIMIT as
the default - but they actually meant a specific value other than the
default.

This is why I tried to:
- set the CONFIG_MTD_UBI_BEB_LIMIT in defconfigs as part of the commit
  which introduces this config (copy same value as their RESERVE config)
- kill all CONFIG_MTD_UBI_BEB_RESERVE references from defconfigs as part
  of the commit which kills it

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] ubi: Limit amount of reserved eraseblocks for bad PEB handling

2012-07-18 Thread Shmulik Ladkani
Hi Artem,

On Wed, 18 Jul 2012 13:28:37 +0300 Artem Bityutskiy  wrote:
> The whole thing will become simpler if we first mark the PEB as bad
> unconditionally (because it _is_ bad), then grab the lock and do all the
> re-calculations.

On first glance that would sound the right thing to do.

Actually, when looked at the original code, which does NOT mark the
block bad when 'beb_rsvd_pebs == 0' (instead, does a 'goto out_ro'),
I initially thought "this is SO wrong, the block IS bad, we MUST mark it
bad, what IS the deal here?".
But then I spent more and more time trying to backtrack the reason for
it - and I think I found a reason, quite an improtant one.

Suppose 'beb_rsvd_pebs == 0'.

In the original scheme, it means that there are NO available PEBs at
all. All good PEBs are "assigned" (to user volumes, layout volume, WL
and EBA).

Now, if one of these PEBs goes bad, and you DO mark it bad, and
decrement the good_peb_count, you'll have a shortage of good PEBs - it
will not match the amount of PEBs assigned to the consumers (user
volumes, layout, WL, EBA).
Meaning, next attach would simply fail with a "no enough physical
eraseblocks" message (grep for these in ubi_wl_init and ubi_eba_init).
You won't even be able to attach anymore. Not good.

However, if you DO NOT mark it bad, but instead go into RO mode, you
should be able to later re-attach because the good_peb_count would fit
(no shortage of PEBs).

So going into RO seems a "less worse" solution than marking bad, for
that particular case.

Anyways, I've really invested some thought into this patch, trying to
cover all sorts of esoteric cases.
I think the logic itself is pretty robust, although I would really
appreciate some reassurances on that...

I agree the code does not "read simple" enough, I tried to make the best
simplifying and adding some comments. Let me know if you'd like some
more polish on it.

I saw you submitted a simplified version, I can surely take a look, but
I'm afraid it lacks the proper treatment discussed above (NOT marking
the bad block as bad when beb_rsvd_pebs is zero).

Let me know.

Many thanks,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] ubi: Limit amount of reserved eraseblocks for bad PEB handling

2012-07-18 Thread Shmulik Ladkani
Hi Artem,

On Wed, 18 Jul 2012 13:28:37 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 The whole thing will become simpler if we first mark the PEB as bad
 unconditionally (because it _is_ bad), then grab the lock and do all the
 re-calculations.

On first glance that would sound the right thing to do.

Actually, when looked at the original code, which does NOT mark the
block bad when 'beb_rsvd_pebs == 0' (instead, does a 'goto out_ro'),
I initially thought this is SO wrong, the block IS bad, we MUST mark it
bad, what IS the deal here?.
But then I spent more and more time trying to backtrack the reason for
it - and I think I found a reason, quite an improtant one.

Suppose 'beb_rsvd_pebs == 0'.

In the original scheme, it means that there are NO available PEBs at
all. All good PEBs are assigned (to user volumes, layout volume, WL
and EBA).

Now, if one of these PEBs goes bad, and you DO mark it bad, and
decrement the good_peb_count, you'll have a shortage of good PEBs - it
will not match the amount of PEBs assigned to the consumers (user
volumes, layout, WL, EBA).
Meaning, next attach would simply fail with a no enough physical
eraseblocks message (grep for these in ubi_wl_init and ubi_eba_init).
You won't even be able to attach anymore. Not good.

However, if you DO NOT mark it bad, but instead go into RO mode, you
should be able to later re-attach because the good_peb_count would fit
(no shortage of PEBs).

So going into RO seems a less worse solution than marking bad, for
that particular case.

Anyways, I've really invested some thought into this patch, trying to
cover all sorts of esoteric cases.
I think the logic itself is pretty robust, although I would really
appreciate some reassurances on that...

I agree the code does not read simple enough, I tried to make the best
simplifying and adding some comments. Let me know if you'd like some
more polish on it.

I saw you submitted a simplified version, I can surely take a look, but
I'm afraid it lacks the proper treatment discussed above (NOT marking
the bad block as bad when beb_rsvd_pebs is zero).

Let me know.

Many thanks,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/5] ubi: Fix bad PEBs reserve caclulation

2012-07-17 Thread Shmulik Ladkani
Hi Artem,

On Mon, 16 Jul 2012 18:33:57 +0300 Artem Bityutskiy  wrote:
> But one more think is the mtd web-site. I've grepped for '1%' and there
> are plenty of them. I've changed them all to 2% more or less
> mechanically - only cleaned up one section by removing out-of-date
> information. Would you please grep for '2%' and review if the
> information there is reasonable? Also, would you please add some more
> info to this FAQ entry:
> 
> http://linux-mtd.infradead.org/faq/ubi.html#L_bad_blocks_exceeded
> 
> Or even better if you could write a separate section for this stuff in
> the documentation, then you could remove that FAQ entry completely.

Sure, I'll try to do it as well.

But it would only make sense after accepting Richard's patchset as well,
which suggests configuring per-ubi-device beb limit via the attach ioctl.

I didn't had the time to properly review it yet, but IMO it looks more
controversial.
http://lists.infradead.org/pipermail/linux-mtd/2012-July/042803.html

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/5] ubi: Fix bad PEBs reserve caclulation

2012-07-17 Thread Shmulik Ladkani
Hi Artem,

On Mon, 16 Jul 2012 18:33:57 +0300 Artem Bityutskiy dedeki...@gmail.com wrote:
 But one more think is the mtd web-site. I've grepped for '1%' and there
 are plenty of them. I've changed them all to 2% more or less
 mechanically - only cleaned up one section by removing out-of-date
 information. Would you please grep for '2%' and review if the
 information there is reasonable? Also, would you please add some more
 info to this FAQ entry:
 
 http://linux-mtd.infradead.org/faq/ubi.html#L_bad_blocks_exceeded
 
 Or even better if you could write a separate section for this stuff in
 the documentation, then you could remove that FAQ entry completely.

Sure, I'll try to do it as well.

But it would only make sense after accepting Richard's patchset as well,
which suggests configuring per-ubi-device beb limit via the attach ioctl.

I didn't had the time to properly review it yet, but IMO it looks more
controversial.
http://lists.infradead.org/pipermail/linux-mtd/2012-July/042803.html

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] ubi: Limit amount of reserved eraseblocks for bad PEB handling

2012-07-09 Thread Shmulik Ladkani
On Mon, 9 Jul 2012 12:15:17 +0200 Richard Genoud  
wrote:
> 2012/7/4 Shmulik Ladkani :
> > +   /*
> > +* Calculate the actual number of PEBs currently needed to be 
> > reserved
> > +* for future bad eraseblock handling.
> > +*/
> > +   ubi->beb_rsvd_level = ubi->bad_peb_limit - ubi->bad_peb_count;
> > +   if (ubi->beb_rsvd_level < 0) {
> > +   ubi->beb_rsvd_level = 0;
> > +   ubi_warn("number of bad PEBs (%d) is above the expected 
> > limit "
> > +"(%d), not reserving any PEBs for bad PEB 
> > handling, "
> > +"will use available PEBs (if any)",
> > +ubi->bad_peb_count, ubi->bad_peb_limit);
> > +   }
> >  }
> is it ok for beb_rsvd_level to be in the range [0..x[ instead of [2..x[ ?

Yes, it is ok in my new scheme.
It is even mandatory, otherwise more and more PEBs will attempt to be
reserved for future bad PEB handling (as 'beb_rsvd_pebs' always wishes
to reach 'beb_rsvd_level') even if passed the limit - this is exactly
what I'd like to fix.

Regards
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: UBI fastmap updates

2012-07-09 Thread Shmulik Ladkani
Hi Richard,

On Sun, 08 Jul 2012 14:07:41 +0200 Richard Weinberger  wrote:
> > +   /* TODO: in the new locking scheme, produce_free_peb is
> > +* called under wl_lock taken.
> > +* so when returning, should reacquire the lock
> > +*/
> 
> Which new locking scheme?

I am diffing linux-ubi fastmap HEAD against its fork point (vanilla
ubi), that's 6b16351..d41a140 on linux-ubi.

Which gives the following diff in produce_free_pebs:

@@ -261,7 +266,6 @@ static int produce_free_peb(struct ubi_device *ubi)
 {
int err;
 
-   spin_lock(>wl_lock);
while (!ubi->free.rb_node) {
spin_unlock(>wl_lock);
 
@@ -272,7 +276,6 @@ static int produce_free_peb(struct ubi_device *ubi)
 
spin_lock(>wl_lock);
}
-   spin_unlock(>wl_lock);
 
return 0;
 }

Which is probably okay, since you obtain the lock in the new
'ubi_refill_pools', which calls produce_free_peb:

+void ubi_refill_pools(struct ubi_device *ubi)
+{
+   spin_lock(>wl_lock);
+   refill_wl_pool(ubi);
+   refill_wl_user_pool(ubi);
+   spin_unlock(>wl_lock);
+}

However if 'do_work' fails within 'produce_free_peb', you return the
error but leave wl_lock unlocked - where it is expected to be locked
(otherwise, ubi_refill_pools will unlock it again):

static int produce_free_peb(struct ubi_device *ubi)
{
int err;

while (!ubi->free.rb_node) {
spin_unlock(>wl_lock);

dbg_wl("do one work synchronously");
err = do_work(ubi);
if (err)
return err;

spin_lock(>wl_lock);
}

return 0;
}

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: UBI fastmap updates

2012-07-09 Thread Shmulik Ladkani
Hi Richard,

On Sun, 08 Jul 2012 14:07:41 +0200 Richard Weinberger rich...@nod.at wrote:
  +   /* TODO: in the new locking scheme, produce_free_peb is
  +* called under wl_lock taken.
  +* so when returning, should reacquire the lock
  +*/
 
 Which new locking scheme?

I am diffing linux-ubi fastmap HEAD against its fork point (vanilla
ubi), that's 6b16351..d41a140 on linux-ubi.

Which gives the following diff in produce_free_pebs:

@@ -261,7 +266,6 @@ static int produce_free_peb(struct ubi_device *ubi)
 {
int err;
 
-   spin_lock(ubi-wl_lock);
while (!ubi-free.rb_node) {
spin_unlock(ubi-wl_lock);
 
@@ -272,7 +276,6 @@ static int produce_free_peb(struct ubi_device *ubi)
 
spin_lock(ubi-wl_lock);
}
-   spin_unlock(ubi-wl_lock);
 
return 0;
 }

Which is probably okay, since you obtain the lock in the new
'ubi_refill_pools', which calls produce_free_peb:

+void ubi_refill_pools(struct ubi_device *ubi)
+{
+   spin_lock(ubi-wl_lock);
+   refill_wl_pool(ubi);
+   refill_wl_user_pool(ubi);
+   spin_unlock(ubi-wl_lock);
+}

However if 'do_work' fails within 'produce_free_peb', you return the
error but leave wl_lock unlocked - where it is expected to be locked
(otherwise, ubi_refill_pools will unlock it again):

static int produce_free_peb(struct ubi_device *ubi)
{
int err;

while (!ubi-free.rb_node) {
spin_unlock(ubi-wl_lock);

dbg_wl(do one work synchronously);
err = do_work(ubi);
if (err)
return err;

spin_lock(ubi-wl_lock);
}

return 0;
}

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] ubi: Limit amount of reserved eraseblocks for bad PEB handling

2012-07-09 Thread Shmulik Ladkani
On Mon, 9 Jul 2012 12:15:17 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 2012/7/4 Shmulik Ladkani shmulik.ladk...@gmail.com:
  +   /*
  +* Calculate the actual number of PEBs currently needed to be 
  reserved
  +* for future bad eraseblock handling.
  +*/
  +   ubi-beb_rsvd_level = ubi-bad_peb_limit - ubi-bad_peb_count;
  +   if (ubi-beb_rsvd_level  0) {
  +   ubi-beb_rsvd_level = 0;
  +   ubi_warn(number of bad PEBs (%d) is above the expected 
  limit 
  +(%d), not reserving any PEBs for bad PEB 
  handling, 
  +will use available PEBs (if any),
  +ubi-bad_peb_count, ubi-bad_peb_limit);
  +   }
   }
 is it ok for beb_rsvd_level to be in the range [0..x[ instead of [2..x[ ?

Yes, it is ok in my new scheme.
It is even mandatory, otherwise more and more PEBs will attempt to be
reserved for future bad PEB handling (as 'beb_rsvd_pebs' always wishes
to reach 'beb_rsvd_level') even if passed the limit - this is exactly
what I'd like to fix.

Regards
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: UBI fastmap updates

2012-07-08 Thread Shmulik Ladkani
Hi Richard,

On Fri, 29 Jun 2012 17:14:18 +0200 Richard Weinberger  wrote:
> This is the next round of UBI fastmap updates.

Please examine some TODOs (and questions) I've spotted while diffing
against "vanilla" ubi.

This patch should apply to linux-ubi at d41a140

Sorry I couldn't review entirely, diff has gotten really enormous...
I'll try to pick it up again when you'll get to the final merge
submitting incremental patches.

Regards,
Shmulik

diff --git a/drivers/mtd/ubi/attach.c b/drivers/mtd/ubi/attach.c
index a343a41..dae9674 100644
--- a/drivers/mtd/ubi/attach.c
+++ b/drivers/mtd/ubi/attach.c
@@ -999,6 +999,13 @@ static int scan_peb(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
 
if (!find_fastmap &&
   (vol_id > UBI_MAX_VOLUMES && vol_id != UBI_LAYOUT_VOLUME_ID)) {
+
+   /* TODO: if find_fastmap==1, we do not enter this block at all.
+* shouldn't we? shouldn't we care of compatability of unknown
+* internal volumes OTHER than the fastmap ones, even if
+* find_fastmap==1?
+*/
+
int lnum = be32_to_cpu(vidh->lnum);
 
/* Unsupported internal volume */
@@ -1221,6 +1228,7 @@ static void destroy_ai(struct ubi_attach_info *ai)
  * scan_all - scan entire MTD device.
  * @ubi: UBI device description object
  * @ai: attach info object
+ * TODO: document @start
  *
  * This function does full scanning of an MTD device and returns complete
  * information about it in form of a "struct ubi_attach_info" object. In case
@@ -1350,6 +1358,11 @@ out_vidh:
 out_ech:
kfree(ech);
 out_ai:
+   /* TODO: doesn't seem clean to destroy 'ai' as it was allocated by the
+* caller, its his responsibility.
+* also looks like it leads to double freee in case 'err' returned is
+* negative
+*/
destroy_ai(ai);
return err;
 }
@@ -1441,6 +1454,7 @@ int ubi_attach(struct ubi_device *ubi, int force_scan)
 
err = scan_all(ubi, scan_ai, 0);
if (err) {
+   /* TODO: hmm... kfree or destroy_ai ? */
kfree(scan_ai);
goto out_wl;
}
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index 50b7590..f769c22 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -1053,6 +1053,7 @@ int ubi_detach_mtd_dev(int ubi_num, int anyway)
ubi_notify_all(ubi, UBI_VOLUME_REMOVED, NULL);
dbg_msg("detaching mtd%d from ubi%d", ubi->mtd->index, ubi_num);
 
+   /* TODO: any action on failure? */
ubi_update_fastmap(ubi);
 
/*
@@ -1070,7 +1071,6 @@ int ubi_detach_mtd_dev(int ubi_num, int anyway)
 
ubi_debugfs_exit_dev(ubi);
uif_close(ubi);
-
ubi_wl_close(ubi);
ubi_free_internal_volumes(ubi);
vfree(ubi->vtbl);
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 9f766ff..0b2d0cf 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -219,7 +219,7 @@ struct ubi_volume_desc;
  * @size: size of the fastmap in bytes
  * @used_blocks: number of used PEBs
  * @max_pool_size: maximal size of the user pool
- * @max_wl_pool_size: maximal size of the pooly used by the WL sub-system
+ * @max_wl_pool_size: maximal size of the pool used by the WL sub-system
  * @raw: the fastmap itself as byte array (only valid while attaching)
  */
 struct ubi_fastmap_layout {
@@ -242,7 +242,6 @@ struct ubi_fastmap_layout {
  * A pool gets filled with up to max_size.
  * If all PEBs within the pool are used a new fastmap will be written
  * to the flash and the pool gets refilled with empty PEBs.
- *
  */
 struct ubi_fm_pool {
int pebs[UBI_FM_MAX_POOL_SIZE];
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 6c69017..61b 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -272,6 +272,10 @@ static int produce_free_peb(struct ubi_device *ubi)
dbg_wl("do one work synchronously");
err = do_work(ubi);
if (err)
+   /* TODO: in the new locking scheme, produce_free_peb is
+* called under wl_lock taken.
+* so when returning, should reacquire the lock
+*/
return err;
 
spin_lock(>wl_lock);
@@ -377,6 +381,7 @@ static struct ubi_wl_entry *find_wl_entry(struct ubi_device 
*ubi,
 * as anchor PEB, hold it back and return the second best WL entry
 * such that fastmap can use the anchor PEB later. */
if (!ubi->fm_disabled && !ubi->fm && e->pnum < UBI_FM_MAX_START)
+   /* TODO: do we have a risk returning NULL here? */
return prev_e;
 
return e;
@@ -405,6 +410,7 @@ static struct ubi_wl_entry *find_mean_wl_entry(struct 
ubi_device *ubi,
/* If no fastmap has been written and this WL entry can be used
 

Re: UBI fastmap updates

2012-07-08 Thread Shmulik Ladkani
Hi Richard,

On Fri, 29 Jun 2012 17:14:18 +0200 Richard Weinberger rich...@nod.at wrote:
 This is the next round of UBI fastmap updates.

Please examine some TODOs (and questions) I've spotted while diffing
against vanilla ubi.

This patch should apply to linux-ubi at d41a140

Sorry I couldn't review entirely, diff has gotten really enormous...
I'll try to pick it up again when you'll get to the final merge
submitting incremental patches.

Regards,
Shmulik

diff --git a/drivers/mtd/ubi/attach.c b/drivers/mtd/ubi/attach.c
index a343a41..dae9674 100644
--- a/drivers/mtd/ubi/attach.c
+++ b/drivers/mtd/ubi/attach.c
@@ -999,6 +999,13 @@ static int scan_peb(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
 
if (!find_fastmap 
   (vol_id  UBI_MAX_VOLUMES  vol_id != UBI_LAYOUT_VOLUME_ID)) {
+
+   /* TODO: if find_fastmap==1, we do not enter this block at all.
+* shouldn't we? shouldn't we care of compatability of unknown
+* internal volumes OTHER than the fastmap ones, even if
+* find_fastmap==1?
+*/
+
int lnum = be32_to_cpu(vidh-lnum);
 
/* Unsupported internal volume */
@@ -1221,6 +1228,7 @@ static void destroy_ai(struct ubi_attach_info *ai)
  * scan_all - scan entire MTD device.
  * @ubi: UBI device description object
  * @ai: attach info object
+ * TODO: document @start
  *
  * This function does full scanning of an MTD device and returns complete
  * information about it in form of a struct ubi_attach_info object. In case
@@ -1350,6 +1358,11 @@ out_vidh:
 out_ech:
kfree(ech);
 out_ai:
+   /* TODO: doesn't seem clean to destroy 'ai' as it was allocated by the
+* caller, its his responsibility.
+* also looks like it leads to double freee in case 'err' returned is
+* negative
+*/
destroy_ai(ai);
return err;
 }
@@ -1441,6 +1454,7 @@ int ubi_attach(struct ubi_device *ubi, int force_scan)
 
err = scan_all(ubi, scan_ai, 0);
if (err) {
+   /* TODO: hmm... kfree or destroy_ai ? */
kfree(scan_ai);
goto out_wl;
}
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index 50b7590..f769c22 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -1053,6 +1053,7 @@ int ubi_detach_mtd_dev(int ubi_num, int anyway)
ubi_notify_all(ubi, UBI_VOLUME_REMOVED, NULL);
dbg_msg(detaching mtd%d from ubi%d, ubi-mtd-index, ubi_num);
 
+   /* TODO: any action on failure? */
ubi_update_fastmap(ubi);
 
/*
@@ -1070,7 +1071,6 @@ int ubi_detach_mtd_dev(int ubi_num, int anyway)
 
ubi_debugfs_exit_dev(ubi);
uif_close(ubi);
-
ubi_wl_close(ubi);
ubi_free_internal_volumes(ubi);
vfree(ubi-vtbl);
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 9f766ff..0b2d0cf 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -219,7 +219,7 @@ struct ubi_volume_desc;
  * @size: size of the fastmap in bytes
  * @used_blocks: number of used PEBs
  * @max_pool_size: maximal size of the user pool
- * @max_wl_pool_size: maximal size of the pooly used by the WL sub-system
+ * @max_wl_pool_size: maximal size of the pool used by the WL sub-system
  * @raw: the fastmap itself as byte array (only valid while attaching)
  */
 struct ubi_fastmap_layout {
@@ -242,7 +242,6 @@ struct ubi_fastmap_layout {
  * A pool gets filled with up to max_size.
  * If all PEBs within the pool are used a new fastmap will be written
  * to the flash and the pool gets refilled with empty PEBs.
- *
  */
 struct ubi_fm_pool {
int pebs[UBI_FM_MAX_POOL_SIZE];
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 6c69017..61b 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -272,6 +272,10 @@ static int produce_free_peb(struct ubi_device *ubi)
dbg_wl(do one work synchronously);
err = do_work(ubi);
if (err)
+   /* TODO: in the new locking scheme, produce_free_peb is
+* called under wl_lock taken.
+* so when returning, should reacquire the lock
+*/
return err;
 
spin_lock(ubi-wl_lock);
@@ -377,6 +381,7 @@ static struct ubi_wl_entry *find_wl_entry(struct ubi_device 
*ubi,
 * as anchor PEB, hold it back and return the second best WL entry
 * such that fastmap can use the anchor PEB later. */
if (!ubi-fm_disabled  !ubi-fm  e-pnum  UBI_FM_MAX_START)
+   /* TODO: do we have a risk returning NULL here? */
return prev_e;
 
return e;
@@ -405,6 +410,7 @@ static struct ubi_wl_entry *find_mean_wl_entry(struct 
ubi_device *ubi,
/* If no fastmap has been written and this WL entry can be used
  

Re: [PATCH 0/5] ubi: Fix bad PEBs reserve caclulation

2012-07-07 Thread Shmulik Ladkani
Hi Richard,

On Fri, 6 Jul 2012 17:27:59 +0200 Richard Genoud  
wrote:
> I've got an oops...
> this is my dev-kernel in 3.5-rc5 + some work to be able to boot on my board
> NB: If I use ubi_format it's ok.
> the mtd1 device has 1984 PEB
> the 4 last are UBI reserved + BBT
> 
> I didn't test without your patch, but anyway something is wrong there.

Many thanks for testing.

Could you please verify the crash only occurs with the patch?

Can you provide the vmlinux matching this oops, so I may analyze the
exact null dereferencing point?
It seems to be somewhere in ubi_wl_init, however the patch seems not to
affect these parts of ubi...

Also, what's your CONFIG_MTD_UBI_BEB_LIMIT?

Regards,
Shmulik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/5] ubi: Fix bad PEBs reserve caclulation

2012-07-07 Thread Shmulik Ladkani
Hi Richard,

On Fri, 6 Jul 2012 17:27:59 +0200 Richard Genoud richard.gen...@gmail.com 
wrote:
 I've got an oops...
 this is my dev-kernel in 3.5-rc5 + some work to be able to boot on my board
 NB: If I use ubi_format it's ok.
 the mtd1 device has 1984 PEB
 the 4 last are UBI reserved + BBT
 
 I didn't test without your patch, but anyway something is wrong there.

Many thanks for testing.

Could you please verify the crash only occurs with the patch?

Can you provide the vmlinux matching this oops, so I may analyze the
exact null dereferencing point?
It seems to be somewhere in ubi_wl_init, however the patch seems not to
affect these parts of ubi...

Also, what's your CONFIG_MTD_UBI_BEB_LIMIT?

Regards,
Shmulik
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/