[PATCH] scripts/config: fix variable substitution command
Commit 229455bc02b87f7128f190c4491b4ce38648 accidentally changed the separator between the sed `s' command and its parameters from ':' to '/'. Revert this change.

Signed-off-by: Clement Chauplannaz chaup...@gmail.com
---
 scripts/config | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/config b/scripts/config
index 58383e2..ae85564 100755
--- a/scripts/config
+++ b/scripts/config
@@ -80,7 +80,7 @@ txt_subst() {
 	local infile=$3
 	local tmpfile=$infile.swp
-	sed -e s/$before/$after/ $infile $tmpfile
+	sed -e s:$before:$after: $infile $tmpfile
 	# replace original file with the edited one
 	mv $tmpfile $infile
 }
--
1.8.3.2.dirty

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [Intel-gfx] [PATCH 1/2] drm/i915: kill set_need_resched
On Fri, Sep 13, 2013 at 2:59 AM, Rob Clark robdcl...@gmail.com wrote: I guess in i915 (and ttm) case, the issue arises due to need for CPU access to buffer via GTT? In which case I should be safe to drop the set_need_resched() as well? (Since CPU always has direct access to the pages.) Or am I missing something about the original issue that necessitated set_need_resched()? For drm/i915 the _only_ reason we've had it was to avoid life-locking with our gpu reset work when the gpu hung. We've fixed that properly now by using a wait-queue to stall when a gpu reset is pending and proper locking in the gpu reset handler (plus tons of evil tests to make sure it doesn't break, there's rather fragile lock-dropping and tricky ordering involved). So if you don't have i915's broken gpu reset handling from yonder you don't need our cargo-cult. ttm's usage with a trylock+yield is a different form of duct-tape to paper over locking inversions between copy_*_user callsites and the pagefault handler. In any case there's no way it actually works properly ;-) Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 3/3] kvm: Add VFIO device for handling IOMMU cache coherency
On 09/13/2013 07:23 AM, Alex Williamson wrote:
> So far we've succeeded at making KVM and VFIO mostly unaware of each other, but there's an important point where that breaks down. Intel VT-d hardware may or may not support snoop control. When snoop control is available, intel-iommu promotes No-Snoop transactions on PCIe to be cache coherent. That allows KVM to handle things like the x86 WBINVD opcode as a nop. When the hardware does not support this, KVM must implement a hardware visible WBINVD for the guest.
>
> We could simply let userspace tell KVM how to handle WBINVD, but it's privileged for a reason. Allowing an arbitrary user to enable physical WBINVD gives them more access to the hardware. Previously, this has only been enabled for guests supporting legacy PCI device assignment. In such cases it's necessary for proper guest execution.
>
> We therefore create a new KVM-VFIO virtual device. The user can add and remove VFIO groups to this device via file descriptors. KVM makes use of the VFIO external user interface to validate that the user has access to physical hardware and gets the coherency state of the IOMMU from VFIO. This provides equivalent functionality to legacy KVM assignment, while keeping (nearly) all the bits isolated.
>
> The one intrusion is the resulting flag indicating the coherency state. For this RFC it's placed on the x86 kvm_arch struct, however I know POWER has interest in using the VFIO external user interface, and I'm hoping we can share a common KVM-VFIO device. Perhaps they care about No-Snoop handling as well or the code can be #ifdef'd.

POWER does not support this cache-non-coherent stuff at all (at least book3s, i.e. server; not sure about others).

Regarding reusing this device with the external API for POWER: I posted a patch which introduces a KVM device to link KVM with the IOMMU. Besides the list of groups registered in KVM, it also provides a way to find a group by LIOBN (logical bus number), which is used in the DMA map/unmap hypercalls. So in my case the kvm_vfio_group struct needs a LIOBN, and it would be nice to have window_size there too (for a quick boundary check). I am not sure we want to mix everything here.

It is in "[PATCH v10 12/13] KVM: PPC: Add support for IOMMU in-kernel handling" if you are interested (kvmppc_spapr_tce_iommu_device).

--
Alexey
Re: [PATCH 3/4] Add physical count arch timer support for clocksource in ARMv7.
On 13 September 2013 00:39, Marc Zyngier marc.zyng...@arm.com wrote:
> On 12/09/13 17:07, cinifr wrote:
> >> This cannot be a compile-time option as above in a multiplatform build. Other platforms (e.g. KVM guests) *must* use the virtual counters to get any semblance of a consistent view of time.
> >
> > Yes, I accept that the compile-time option in my previous email is not perfect. But why *must* other platforms use the virtual counters? I think KVM should not limit how a guest OS uses the arch timer. Of course, a KVM guest using the virtual counter may be more efficient than one using the physical counter, but KVM should and must support a guest OS accessing the physical counter.
>
> The virtual counter is there for a good reason: it allows a virtual machine to:
> - see its time starting at zero
> - be migrated to another host without seeing time shifting one way or another.
>
> So using the physical counter in a VM is a recipe for disaster if you're doing any kind of time tracking. The counter being used for sched_clock(), we cannot afford to see it being shifted one way or another.

I accept that the virtual counter is better in a VM than the physical counter, because the hypervisor can adjust the VM's timer by setting CNTVOFF. But I think the hypervisor should still allow a VM to access the physical counter. When a VM uses the physical counter, the hypervisor could trap the guest OS's accesses to it and return the value the guest expects, just as it sets CNTVOFF for the virtual counter. That way the VM could also see its time start at zero, and could also be migrated to another host without seeing time shift.

> If you have issues with the use of the virtual counter, I suggest you fix your firmware to have a consistent CNTVOFF across CPUs. And/or even better, boot your kernel in HYP mode, as it will take care of setting CNTVOFF to zero.

I am wondering what the division of responsibility between the kernel and the bootloader is. What should be done in the bootloader and what should be done in the kernel? As you said, if the kernel boots in HYP mode it can set CNTVOFF to zero directly; do we add the code to set CNTVOFF to the kernel? But if the kernel boots in PL1 NS=0, does the kernel need to switch to HYP mode to set CNTVOFF and then return to PL1 NS=0? Or does the kernel not care, because it trusts that the bootloader has already set CNTVOFF?

Thanks,
Fan.
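To make the CNTVOFF discussion above concrete, here is a minimal userspace sketch in C. It relies only on the architectural relation CNTVCT = CNTPCT - CNTVOFF; the struct and every function name (timer_model, read_cntvct, set_guest_view, tick, migrate_guest) are hypothetical and exist only for this illustration. It is not kernel or hypervisor code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one CPU's generic timer registers. */
struct timer_model {
    uint64_t cntpct;  /* physical count, free-running */
    uint64_t cntvoff; /* virtual offset, owned by the hypervisor */
};

/* What a guest reads from the virtual counter: CNTVCT = CNTPCT - CNTVOFF. */
static uint64_t read_cntvct(const struct timer_model *t)
{
    return t->cntpct - t->cntvoff; /* unsigned arithmetic wraps modulo 2^64 */
}

/* Choose CNTVOFF so that the virtual counter currently reads guest_now. */
static void set_guest_view(struct timer_model *t, uint64_t guest_now)
{
    t->cntvoff = t->cntpct - guest_now;
}

/* Advance the physical counter, as the hardware would over time. */
static void tick(struct timer_model *t, uint64_t ticks)
{
    t->cntpct += ticks;
}

/* Carry the guest's view of time from the source host to the destination. */
static void migrate_guest(const struct timer_model *src, struct timer_model *dst)
{
    assert(src != NULL && dst != NULL);
    set_guest_view(dst, read_cntvct(src));
}
```

In this model a guest started with set_guest_view(&host, 0) sees time begin at zero, and migrate_guest() keeps its view continuous even when the two hosts' physical counters differ. Reading cntpct directly would expose that difference, which is the time shift Marc describes.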
Re: [PATCH 1/2] tools, perf: Add a precise event qualifier v2
On Thu, Sep 12, 2013 at 07:36:17PM +0200, Andi Kleen wrote: Your feature to export 'precise' requirements on events looks useful to me. We could implement it not by special casing it implicitly but by saying that if ../format/precise contains something like: attr:240-241 Since we currently have the pattern $name:bits to mean perf_event_attr::$name the above would imply and create a possible collision with perf_event_attr::attr. If we're going to do this I'd propose using something like _:240-241, for while '_' is a valid name in C its not something we're ever going to allow in perf_event_attr. then that's a natural extension of the config:X-Y format and should be interpreted to mean mean 2 bits in the perf attr field. I.e. we could go beyond the config bitfield. Basically the whole perf_event_attr can be thought of as a 'giant bitfield', in which we can specify values to export an enumerated list of events from the kernel to tooling. (Using attr:X-Y the config and config1 variants can be expressed as well, as the config fields are inside the attr structure.) The positions within the perf_attr are an ABI, so this would work pretty well. Wouldn't we need different bits for each architecture then? 32bit/64bit, some archs with weird alignment rules, maybe different for BE/LE too? Typically PMU drivers are per arch and all the format stuff is per pmu driver so I'd not worry about that just yet. But yes, while the perf_event_attr thing is ABI its not identical across archs. Ok I suppose it could be somehow auto generated in asm-offsets.c, although I'm not sure how to get a bitfield offset there. Yes, that is an unfortunate situation. I (and either Acme or Jolsa) tried wrapping the bitfield in an anonymous union to create a named variable for the entire u64 but older GCC completely fails with that. 
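As an illustration of the "$name:X-Y" format strings discussed above, the following C sketch parses such a string and stores a value into the named bit range of a byte buffer treated as one large bitfield. It is not the perf tool's parser. The names bit_range, parse_bit_range and set_bits are hypothetical, and the bit numbering (bit N is bit N%8 of byte N/8) is an assumption made for this example; as the thread notes, the real layout can differ between architectures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical descriptor for one "name:lo-hi" format entry. */
struct bit_range {
    char name[32];
    unsigned lo;
    unsigned hi;
};

/* Parse "name:lo-hi" or "name:bit". Returns 0 on success, -1 if malformed. */
static int parse_bit_range(const char *spec, struct bit_range *out)
{
    int n = sscanf(spec, "%31[^:]:%u-%u", out->name, &out->lo, &out->hi);

    if (n == 2)
        out->hi = out->lo;      /* single-bit form, e.g. "config:5" */
    else if (n != 3)
        return -1;
    return out->lo <= out->hi ? 0 : -1;
}

/*
 * Store `value` into bits lo..hi of buf.
 * Assumed numbering: bit N lives in byte N/8 at bit position N%8.
 */
static void set_bits(uint8_t *buf, size_t len, const struct bit_range *r,
                     uint64_t value)
{
    assert(r->hi < len * 8 && r->hi - r->lo < 64);
    for (unsigned bit = r->lo; bit <= r->hi; bit++) {
        uint8_t mask = (uint8_t)(1u << (bit % 8));

        if ((value >> (bit - r->lo)) & 1)
            buf[bit / 8] |= mask;
        else
            buf[bit / 8] &= (uint8_t)~mask;
    }
}
```

With the "_:240-241" spelling proposed above, parse_bit_range() yields the range 240..241, and set_bits() touches only byte 30 of the buffer under the assumed numbering.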
Re: [PATCH] net: smsc: remove deprecated IRQF_DISABLED
[adding David Brown]

On Fri, Sep 13, 2013 at 05:27:47AM +0100, Michael Opdenacker wrote:
> This patch proposes to remove the IRQF_DISABLED flag from code in drivers/net/ethernet/smsc/
>
> It's a NOOP since 2.6.35 and it will be removed one day.
>
> Signed-off-by: Michael Opdenacker michael.opdenac...@free-electrons.com
> ---
>  drivers/net/ethernet/smsc/smc91x.h   | 2 +-
>  drivers/net/ethernet/smsc/smsc9420.c | 3 +--
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/smsc/smc91x.h b/drivers/net/ethernet/smsc/smc91x.h
> index 370e13d..5730fe2 100644
> --- a/drivers/net/ethernet/smsc/smc91x.h
> +++ b/drivers/net/ethernet/smsc/smc91x.h
> @@ -271,7 +271,7 @@ static inline void mcf_outsw(void *a, unsigned char *p, int l)
>  #define SMC_insw(a, r, p, l)	mcf_insw(a + r, p, l)
>  #define SMC_outsw(a, r, p, l)	mcf_outsw(a + r, p, l)
>
> -#define SMC_IRQ_FLAGS	(IRQF_DISABLED)
> +#define SMC_IRQ_FLAGS	0

After this change, the only machine that defines the flags to anything other than IRQF_TRIGGER_RISING is MSM, which uses IRQF_TRIGGER_HIGH.

David: do you actually need this? The irq_chip code under mach-msm doesn't seem to distinguish between the two.

Will
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
Op 13-09-13 10:23, Thomas Hellstrom schreef: On 09/13/2013 09:51 AM, Maarten Lankhorst wrote: Op 13-09-13 09:46, Thomas Hellstrom schreef: On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. That hack goes back to the days when the locking around our error handler was somewhere between nonexistent and totally broken, nowadays we keep things from live-locking by a bit of magic in i915_mutex_lock_interruptible. 
I'll whip up a patch to rip this out. I'll also check that our testsuite properly exercises this path (needs a bit of work on a quick look for better coverage). The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) Ah the case where a mmap'd address is passed to the execbuf ioctl? :P Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve held. Actually, from looking at the mm code, it seems OK to do the following: if (!bo_tryreserve()) { up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks. bo_reserve(); // Wait for the BO to become available (interruptible) bo_unreserve(); // Where is bo_wait_unreserved() when we need it, Maarten :P return VM_FAULT_RETRY; // Go ahead and retry the VMA walk, after regrabbing } Is this meant as a jab at me? You're doing locking wrong here! Again! It's not meant as a jab at you. I'm sorry if it came out that way. It was meant as a joke. I wasn't aware the topic was sensitive. Anyway, could you describe what is wrong, with the above solution, because it seems perfectly legal to me. There is no substantial overhead, and there is no risc of deadlocks. Or do you mean it's bad because it confuses lockdep? 
Evil userspace can pass a bo as pointer to use for relocation lists, lockdep will warn when that locks up, but still.. This is already a problem now, and your fixing will only cause lockdep to explicitly warn on it. As previously mentioned, copy_from_user should return -EFAULT, since the VMAs are marked with VM_IO. It should not recurse into fault(), so evil user-space looses. You can make a complicated user program to test this, or simply use this function for debugging: void ttm_might_fault(void) { struct reservation_object obj; reservation_object_init(obj); ww_mutex_lock(obj.lock, NULL); ww_mutex_unlock(obj.lock); reservation_object_fini(obj); } Put it near every instance of copy_to_user/copy_from_user and you'll find the bugs. :) I'm still not convinced that there are any problems with this
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Fri, Sep 13, 2013 at 10:41:54AM +0200, Daniel Vetter wrote:
> On Fri, Sep 13, 2013 at 10:29 AM, Peter Zijlstra pet...@infradead.org wrote:
> > On Fri, Sep 13, 2013 at 09:46:03AM +0200, Thomas Hellstrom wrote:
> > > if (!bo_tryreserve()) {
> > >     up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks.
> > >     bo_reserve();       // Wait for the BO to become available (interruptible)
> > >     bo_unreserve();     // Where is bo_wait_unreserved() when we need it, Maarten :P
> > >     return VM_FAULT_RETRY; // Go ahead and retry the VMA walk, after regrabbing
> > > }
> > >
> > > Anyway, could you describe what is wrong, with the above solution, because it seems perfectly legal to me.
> >
> > Luckily the rule of law doesn't have anything to do with this stuff -- at least I sincerely hope so.
> >
> > The thing that's wrong with that pattern is that it's still not deterministic, although it's a lot better than the pure trylock. Because you have to release and re-acquire with the trylock, another user might have gotten in again. It's utterly prone to starvation.
> >
> > The acquire+release does remove the dead/live-lock scenario from the FIFO case, since blocking on the acquire will allow the other task to run (or even get boosted on -rt).
> >
> > Aside from that there's nothing particularly wrong with it and lockdep should be happy afaict (but I haven't had my morning juice yet).
>
> bo_reserve internally maps to a ww-mutex and the task can already hold a ww-mutex (potentially even the same one for especially nasty userspace).

OK, yes I wasn't aware of that. Yes in that case you're quite right.
RE: [PATCH] mei: me: downgrade suspend message to debug level
> Signed-off-by: Paul Bolle pebo...@tiscali.nl
> ---
> Lightly tested (on top of v3.10.10, actually). Should apply cleanly to v3.11 or current master.
>
> Perhaps a better fix is to remove this message entirely.
>
> Rechecking my logs I noticed a stop error on module unload. That message could be downgraded to debug level too. But it could also be removed. If Tomas could say what is preferable I'll send a second version of this trivial patch.

I prefer to have it on debug level.

Thanks
Tomas
Re: [PATCH v4 0/2] ARM: dts: Beaglebone MMC fixes
Op 13 sep. 2013, om 00:27 heeft Kevin Hilman khil...@linaro.org het volgende geschreven: Koen Kooi k...@dominion.thruhere.net writes: Here are two patches to fix MMC on beaglebone, one fixes card detect on BBW, the other adds the eMMC entry for BBB and its fixed regulator. After that mmc1 gets a nice speed boost by moving to 4-bit mode and LED triggers get assigned. This series depends on: http://comments.gmane.org/gmane.linux.kernel.stable/63648 https://lkml.org/lkml/2013/9/10/454 http://comments.gmane.org/gmane.linux.kernel.mmc/22381 Or as git-cherry would put it: [koen@rrMBP patches]$ git cherry -v + 564fc88cc64387af5312e2abd8019c75a13223b2 ARM: OMAP2+: am335x-bone*: add DT for BeagleBone Black + e5133ed98acc1c3e01c370b851041a8ca629cd15 ARM: EDMA: Fix clearing of unused list for DT DMA resources + ac71bb58605d3bdd5d14af770a639fb3ff11c612 ARM: dts: add AM33XX EDMA support + 31a8270a299c57c7de7510f44d9dc36fd1787243 ARM: dts: add AM33XX SPI DMA support + 4fa0a4cb9ea17da30cf43085c03e5ec1361a4fc2 ARM: dts: add AM33XX MMC support and documentation + 0553f50bd45f019a0cc11050e2f20bddbf07dfe0 ARM: dts: am335x-bone: add CD for mmc1 + 7d64f765630a2921a63b82f93f9959a6de37f29d ARM: dts: am335x-boneblack: add eMMC DT entry + dc96cd4003e2668d8ec7e7fe19e402e97a198f81 ARM: dts: am335x-bone-common: switch mmc1 to 4-bit mode + f8262e78830cda56c936724549ba9f04e312 ARM: dts: am335x-bone-common: add cpu0 and mmc1 triggers Also available as a git branch at https://github.com/koenkooi/linux/commits/mainline FWIW, tested this branch on BB black/white with MMC rootfs. Tested-by: Kevin Hilman khil...@linaro.org Koen, Thanks for your persistence getting this stuff merged. 
No problem, with all comments addressed I can safely disappear for 3 weeks to go on honeymoon :) regards, Koen-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
On (09/12/13 15:12), Greg KH wrote:
> On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > Dan Carpenter noted that handle_pending_slot_free() is racy with zram_reset_device(). Take write init_lock in zram_slot_free(), thus preventing any concurrent zram_slot_free(), zram_bvec_rw() or zram_reset_device(). This also allows us to safely check zram->init_done in handle_pending_slot_free().
> >
> > The initial intention was to minimize the number of handle_pending_slot_free() calls from zram_bvec_rw(), which were slowing down READ requests due to the slot_free_lock spin lock. Jerome Marchand suggested removing handle_pending_slot_free() from zram_bvec_rw().
> >
> > Link: https://lkml.org/lkml/2013/9/9/172
> > Signed-off-by: Sergey Senozhatsky sergey.senozhat...@gmail.com
>
> I have multiple versions of this and the other zram patches from you, with no idea of which to accept.

yes, please, drop all patches. I did not Cc you on these two patches, to stop spamming your inbox with multiple versions. I will send them back to you as soon as I get positive feedback.

> So, I'm going to drop them all, can you please resend what you wish to submit, and in the future, be a bit more obvious with your vN markings?

sorry for that.

	-ss

> thanks,
>
> greg k-h
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On 09/13/2013 10:58 AM, Maarten Lankhorst wrote: Op 13-09-13 10:23, Thomas Hellstrom schreef: On 09/13/2013 09:51 AM, Maarten Lankhorst wrote: Op 13-09-13 09:46, Thomas Hellstrom schreef: On 09/13/2013 09:16 AM, Maarten Lankhorst wrote: Op 13-09-13 08:44, Thomas Hellstrom schreef: On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra pet...@infradead.org wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. 
That hack goes back to the days when the locking around our error handler was somewhere between nonexistent and totally broken, nowadays we keep things from live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll whip up a patch to rip this out. I'll also check that our testsuite properly exercises this path (needs a bit of work on a quick look for better coverage). The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) Ah the case where a mmap'd address is passed to the execbuf ioctl? :P Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve held. Actually, from looking at the mm code, it seems OK to do the following: if (!bo_tryreserve()) { up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks. bo_reserve(); // Wait for the BO to become available (interruptible) bo_unreserve(); // Where is bo_wait_unreserved() when we need it, Maarten :P return VM_FAULT_RETRY; // Go ahead and retry the VMA walk, after regrabbing } Is this meant as a jab at me? You're doing locking wrong here! Again! It's not meant as a jab at you. I'm sorry if it came out that way. It was meant as a joke. I wasn't aware the topic was sensitive. 
Anyway, could you describe what is wrong, with the above solution, because it seems perfectly legal to me. There is no substantial overhead, and there is no risc of deadlocks. Or do you mean it's bad because it confuses lockdep? Evil userspace can pass a bo as pointer to use for relocation lists, lockdep will warn when that locks up, but still.. This is already a problem now, and your fixing will only cause lockdep to explicitly warn on it. As previously mentioned, copy_from_user should return -EFAULT, since the VMAs are marked with VM_IO. It should not recurse into fault(), so evil user-space looses. You can make a complicated user program to test this, or simply use this function for debugging: void ttm_might_fault(void) { struct reservation_object obj; reservation_object_init(obj); ww_mutex_lock(obj.lock, NULL); ww_mutex_unlock(obj.lock); reservation_object_fini(obj); } Put it near every instance of copy_to_user/copy_from_user and you'll find the bugs. :) I'm still not convinced that there are any problems with this solution. Did you
Re: [RFC] Restrict kernel spawning of threads to a specified set of cpus.
On Thu, Sep 12, 2013 at 08:30:25PM +0200, Frederic Weisbecker wrote: Now the issue doesn't only concern kthreads but all tasks in the system. No, only kernel threads, all other tasks have a parent they inherit (namespace, cgroup, affinity etc..) context from. If we really want to solve that race, then may be we can think of a kernel_parameter No bloody kernel params. I'd much rather create a pointless kthread to act as usermodehelper parent that people can set context on (move it into cgroups, set affinity, whatever) so it automagically propagates to all userspace helper thingies. Is there anything other than usermodehelper we need to be concerned with? One that comes to mind would be unbound workqueue threads. Do we want to share the parent with usermodehelpers or have these two classes have different parents? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 0/5] x86, memblock: Allocate memory near kernel image before SRAT parsed.
This patch-set is based on tj's suggestion, and is not fully tested. It is posted just for review and discussion.

This patch-set is based on the latest kernel (3.11). HEAD is:

commit d5d04bb48f0eb89c14e76779bb46212494de0bec
Author: Linus Torvalds torva...@linux-foundation.org
Date: Wed Sep 11 19:55:12 2013 -0700


[Problem]

The current Linux cannot migrate pages used by the kernel because of the kernel direct mapping. In Linux kernel space, va = pa + PAGE_OFFSET. When the pa is changed, we cannot simply update the pagetable and keep the va unmodified. So the kernel pages are not migratable.

There are also some other issues that make the kernel pages not migratable. For example, the physical address may be cached somewhere and used later. It is hard to update all such cached copies.

When doing memory hotplug in Linux, we first migrate all the pages in one memory device somewhere else, and then remove the device. But if pages are used by the kernel, they are not migratable. As a result, memory used by the kernel cannot be hot-removed.

Modifying the kernel direct mapping mechanism is too difficult, and it may degrade kernel performance and stability. So we use the following way to do memory hotplug.


[What we are doing]

In Linux, memory in one NUMA node is divided into several zones. One of the zones is ZONE_MOVABLE, which the kernel won't use.

In order to implement memory hotplug in Linux, we are going to arrange all hotpluggable memory in ZONE_MOVABLE so that the kernel won't use this memory. To do this, we need ACPI's help.

In ACPI, the SRAT (System Resource Affinity Table) contains NUMA info. The memory affinities in SRAT record every memory range in the system, along with flags specifying whether the memory range is hotpluggable. (Please refer to ACPI spec 5.0 5.2.16)

With the help of SRAT, we have to do the following two things to achieve our goal:

1. When doing memory hot-add, allow the users to arrange hotpluggable memory as ZONE_MOVABLE.
   (This has been done by the MOVABLE_NODE functionality in Linux.)

2. When the system is booting, prevent the bootmem allocator from allocating hotpluggable memory for the kernel before the memory initialization finishes.

Problem 2 is the key problem we are going to solve. But before solving it, we need some preparation. Please see below.


[Preparation]

The bootloader has to load the kernel image into memory, and this memory must be unhotpluggable. We cannot prevent this anyway. So in a memory hotplug system, we can assume any node the kernel resides in is not hotpluggable.

Before SRAT is parsed, we don't know which memory ranges are hotpluggable. But memblock has already started to work. In the current kernel, memblock allocates the following memory before SRAT is parsed:

setup_arch()
 |-memblock_x86_fill()            /* memblock is ready */
 |..
 |-early_reserve_e820_mpc_new()   /* allocate memory under 1MB */
 |-reserve_real_mode()            /* allocate memory under 1MB */
 |-init_mem_mapping()             /* allocate page tables, about 2MB to map 1GB memory */
 |-dma_contiguous_reserve()       /* specified by user, should be low */
 |-setup_log_buf()                /* specified by user, several mega bytes */
 |-relocate_initrd()              /* could be large, but will be freed after boot, should reorder */
 |-acpi_initrd_override()         /* several mega bytes */
 |-reserve_crashkernel()          /* could be large, should reorder */
 |..
 |-initmem_init()                 /* Parse SRAT */

According to Tejun's advice, before SRAT is parsed, we should try our best to allocate memory near the kernel image. Since the whole node the kernel resides in won't be hotpluggable, and for a modern server a node may have at least 16GB memory, allocating several megabytes of memory around the kernel image won't cross into hotpluggable memory.


[About this patch-set]

So this patch-set does the following:

1. Make memblock able to allocate memory from low address to high address.
   1) Keep all the memblock APIs' prototypes unmodified.
   2) When the direction is bottom up, keep the start address greater than the end of the kernel image.

2. Improve init_mem_mapping() to support allocating page tables in the bottom up direction.

3. Introduce the movablenode boot option to enable and disable this functionality.

PS: Reordering of relocate_initrd() has not been done yet. acpi_initrd_override() needs to access initrd with virtual addresses, so relocate_initrd() must be done before acpi_initrd_override().

Change log v2 -> v3:
1. According to Toshi's suggestion, move the direction checking logic into memblock, and simplify the code further.

Change log v1 -> v2:
1. According to tj's suggestion, implemented a new function memblock_alloc_bottom_up() to allocate memory from bottom upwards, which can simplify the code.

Tang Chen (5):
  memblock: Introduce allocation direction to memblock.
  memblock: Improve memblock to support allocation from lower address.
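The allocation-direction idea from the cover letter above can be illustrated with a small userspace toy allocator in C. This is not the memblock implementation. The types and names (toy_memblock, toy_init, toy_alloc) are hypothetical, the model has a single usable range with a fixed-size reserved list, and 0 is used as the failure value purely as a simplification for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RESERVED 16

struct toy_region { uint64_t base, size; };

/* Hypothetical model: one usable range [mem_start, mem_end). */
struct toy_memblock {
    uint64_t mem_start, mem_end;
    uint64_t kernel_end;            /* end of the kernel image */
    struct toy_region reserved[TOY_MAX_RESERVED];
    size_t nr_reserved;
    int bottom_up;                  /* 1: low to high, 0: high to low */
};

static uint64_t align_up(uint64_t x, uint64_t a)   { return (x + a - 1) & ~(a - 1); }
static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }

/* Set up the range and reserve the kernel image [kernel_base, kernel_end). */
static void toy_init(struct toy_memblock *mb, uint64_t start, uint64_t end,
                     uint64_t kernel_base, uint64_t kernel_end)
{
    mb->mem_start = start;
    mb->mem_end = end;
    mb->kernel_end = kernel_end;
    mb->reserved[0] = (struct toy_region){ kernel_base, kernel_end - kernel_base };
    mb->nr_reserved = 1;
    mb->bottom_up = 0;
}

/* Return the first reserved region overlapping [base, base + size), or NULL. */
static const struct toy_region *toy_overlap(const struct toy_memblock *mb,
                                            uint64_t base, uint64_t size)
{
    for (size_t i = 0; i < mb->nr_reserved; i++) {
        const struct toy_region *r = &mb->reserved[i];

        if (base < r->base + r->size && r->base < base + size)
            return r;
    }
    return NULL;
}

/* Allocate and reserve `size` bytes; returns the base, or 0 on failure. */
static uint64_t toy_alloc(struct toy_memblock *mb, uint64_t size, uint64_t align)
{
    const struct toy_region *hit;
    uint64_t base;

    assert(size > 0 && align > 0 && (align & (align - 1)) == 0);
    if (mb->nr_reserved == TOY_MAX_RESERVED || size > mb->mem_end - mb->mem_start)
        return 0;

    if (mb->bottom_up) {
        /* Start just above the kernel image and walk upwards. */
        base = mb->kernel_end > mb->mem_start ? mb->kernel_end : mb->mem_start;
        for (base = align_up(base, align); base + size <= mb->mem_end;
             base = align_up(hit->base + hit->size, align)) {
            hit = toy_overlap(mb, base, size);
            if (!hit)
                goto found;
        }
        return 0;
    }

    /* Default direction: start at the top of memory and walk downwards. */
    for (base = align_down(mb->mem_end - size, align); base >= mb->mem_start;
         base = align_down(hit->base - size, align)) {
        hit = toy_overlap(mb, base, size);
        if (!hit)
            goto found;
        if (hit->base < mb->mem_start + size)
            return 0;               /* no room left below this region */
    }
    return 0;

found:
    mb->reserved[mb->nr_reserved++] = (struct toy_region){ base, size };
    return base;
}
```

With bottom_up set, early allocations cluster immediately above the kernel image, which the cover letter assumes sits in a node that is not hotpluggable. Clearing the flag restores the default top-down behaviour, as the series proposes to do once SRAT has been parsed.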
[PATCH v3 5/5] mem-hotplug: Introduce movablenode boot option to control memblock allocation direction.
The Hot-Pluggable field in SRAT specifies which memory is hotpluggable. As we
mentioned before, if hotpluggable memory is used by the kernel, it cannot be
hot-removed. So memory hotplug users may want to set all hotpluggable memory
in ZONE_MOVABLE so that the kernel won't use it.

Memory hotplug users may also set a node as a movable node, which has
ZONE_MOVABLE only, so that the whole node can be hot-removed.

But the kernel cannot use memory in ZONE_MOVABLE. By doing this, the kernel
cannot use memory in movable nodes. This will degrade NUMA performance, and
other users may be unhappy.

So we need a way to allow users to enable and disable this functionality. In
this patch, we introduce the movablenode boot option to allow users to choose
whether to reserve hotpluggable memory and set it as ZONE_MOVABLE or not.

Users can specify movablenode on the kernel command line to enable this
functionality. For those who don't use memory hotplug or who don't want to
lose their NUMA performance, just don't specify anything. The kernel will work
as before.

After memblock is ready, before SRAT is parsed, we should allocate memory near
the kernel image. So this patch does the following:

1. After memblock is ready, make memblock allocate memory from low address to
   high.
2. After SRAT is parsed, make memblock behave as default, allocating memory
   from high address to low.

This behavior is controlled by the movablenode boot option.
Suggested-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com
Reviewed-by: Zhang Yanfei zhangyan...@cn.fujitsu.com
---
 Documentation/kernel-parameters.txt | 15 ++
 arch/x86/kernel/setup.c             | 36 +++
 include/linux/memory_hotplug.h      |  5
 mm/memory_hotplug.c                 |  9
 4 files changed, 65 insertions(+), 0 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1a036cd..8c056c4 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1769,6 +1769,21 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			that the amount of memory usable for all allocations
 			is not too small.

+	movablenode	[KNL,X86] This parameter enables/disables the
+			kernel to arrange hotpluggable memory ranges recorded
+			in ACPI SRAT(System Resource Affinity Table) as
+			ZONE_MOVABLE. And these memory can be hot-removed when
+			the system is up.
+			By specifying this option, all the hotpluggable memory
+			will be in ZONE_MOVABLE, which the kernel cannot use.
+			This will cause NUMA performance down. For users who
+			care about NUMA performance, just don't use it.
+			If all the memory ranges in the system are hotpluggable,
+			then the ones used by the kernel at early time, such as
+			kernel code and data segments, initrd file and so on,
+			won't be set as ZONE_MOVABLE, and won't be hotpluggable.
+			Otherwise the kernel won't have enough memory to boot.
+
 	MTD_Partition=	[MTD]
 			Format: <name>,<region-number>,<size>,<offset>

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 36cfce3..617af9a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1094,6 +1094,31 @@ void __init setup_arch(char **cmdline_p)
 	trim_platform_memory_ranges();
 	trim_low_memory_range();

+#ifdef CONFIG_MOVABLE_NODE
+	if (movablenode_enable_srat) {
+		/*
+		 * Memory used by the kernel cannot be hot-removed because Linux
+		 * cannot migrate the kernel pages. When memory hotplug is
+		 * enabled, we should prevent memblock from allocating memory
+		 * for the kernel.
+		 *
+		 * ACPI SRAT records all hotpluggable memory ranges. But before
+		 * SRAT is parsed, we don't know about it.
+		 *
+		 * The kernel image is loaded into memory at very early time. We
+		 * cannot prevent this anyway. So on NUMA system, we set any
+		 * node the kernel resides in as un-hotpluggable.
+		 *
+		 * Since on modern servers, one node could have double-digit
+		 * gigabytes memory, we can assume the memory around the kernel
+		 * image is also un-hotpluggable. So before SRAT is parsed, just
+		 * allocate memory near the kernel image to try the best to keep
+		 * the kernel away from hotpluggable memory.
+		 */
+
[PATCH v3 4/5] x86, mem-hotplug: Support initialize page tables from low to high.
init_mem_mapping() is called before SRAT is parsed. And memblock will allocate
memory for page tables. To prevent page tables being allocated within
hotpluggable memory, we will allocate page tables from the end of kernel image
to the higher memory.

Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
Reviewed-by: Zhang Yanfei zhangyan...@cn.fujitsu.com
---
 arch/x86/mm/init.c | 121 +++
 1 files changed, 92 insertions(+), 29 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 04664cd..bf7b732 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -401,13 +401,79 @@ static unsigned long __init init_range_memory_mapping(

 /* (PUD_SHIFT-PMD_SHIFT)/2 */
 #define STEP_SIZE_SHIFT 5
-void __init init_mem_mapping(void)
+
+#ifdef CONFIG_MOVABLE_NODE
+/**
+ * memory_map_from_low - Map [start, end) from low to high
+ * @start: start address of the target memory range
+ * @end: end address of the target memory range
+ *
+ * This function will setup direct mapping for memory range [start, end) in a
+ * heuristic way. In the beginning, step_size is small. The more memory we map
+ * memory in the next loop.
+ */
+static void __init memory_map_from_low(unsigned long start, unsigned long end)
+{
+	unsigned long next, new_mapped_ram_size;
+	unsigned long mapped_ram_size = 0;
+	/* step_size need to be small so pgt_buf from BRK could cover it */
+	unsigned long step_size = PMD_SIZE;
+
+	while (start < end) {
+		if (end - start > step_size) {
+			next = round_up(start + 1, step_size);
+			if (next > end)
+				next = end;
+		} else
+			next = end;
+
+		new_mapped_ram_size = init_range_memory_mapping(start, next);
+		min_pfn_mapped = start >> PAGE_SHIFT;
+		start = next;
+
+		if (new_mapped_ram_size > mapped_ram_size)
+			step_size <<= STEP_SIZE_SHIFT;
+		mapped_ram_size += new_mapped_ram_size;
+	}
+}
+#endif /* CONFIG_MOVABLE_NODE */
+
+/**
+ * memory_map_from_high - Map [start, end) from high to low
+ * @start: start address of the target memory range
+ * @end: end address of the target memory range
+ *
+ * This function is similar to memory_map_from_low() except it maps memory
+ * from high to low.
+ */
+static void __init memory_map_from_high(unsigned long start, unsigned long end)
 {
-	unsigned long end, real_end, start, last_start;
-	unsigned long step_size;
-	unsigned long addr;
+	unsigned long prev, new_mapped_ram_size;
 	unsigned long mapped_ram_size = 0;
-	unsigned long new_mapped_ram_size;
+	/* step_size need to be small so pgt_buf from BRK could cover it */
+	unsigned long step_size = PMD_SIZE;
+
+	while (start < end) {
+		if (end > step_size) {
+			prev = round_down(end - 1, step_size);
+			if (prev < start)
+				prev = start;
+		} else
+			prev = start;
+
+		new_mapped_ram_size = init_range_memory_mapping(prev, end);
+		min_pfn_mapped = prev >> PAGE_SHIFT;
+		end = prev;
+
+		if (new_mapped_ram_size > mapped_ram_size)
+			step_size <<= STEP_SIZE_SHIFT;
+		mapped_ram_size += new_mapped_ram_size;
+	}
+}
+
+void __init init_mem_mapping(void)
+{
+	unsigned long end;

 	probe_page_size_mask();

@@ -417,45 +483,42 @@ void __init init_mem_mapping(void)
 	end = max_low_pfn << PAGE_SHIFT;
 #endif

-	/* the ISA range is always mapped regardless of memory holes */
-	init_memory_mapping(0, ISA_END_ADDRESS);
+	max_pfn_mapped = 0; /* will get exact value next */
+	min_pfn_mapped = end >> PAGE_SHIFT;
+
+#ifdef CONFIG_MOVABLE_NODE
+	unsigned long kernel_end;
+
+	if (memblock_direction_bottom_up()) {
+		kernel_end = round_up(__pa_symbol(_end), PMD_SIZE);
+
+		memory_map_from_low(kernel_end, end);
+		memory_map_from_low(ISA_END_ADDRESS, kernel_end);
+		goto out;
+	}
+#endif /* CONFIG_MOVABLE_NODE */
+
+	unsigned long addr, real_end;

 	/* xen has big range in reserved near end of ram, skip it at first.*/
 	addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE, PMD_SIZE);
 	real_end = addr + PMD_SIZE;

-	/* step_size need to be small so pgt_buf from BRK could cover it */
-	step_size = PMD_SIZE;
-	max_pfn_mapped = 0; /* will get exact value next */
-	min_pfn_mapped = real_end >> PAGE_SHIFT;
-	last_start = start = real_end;
-
 	/*
 	 * We start from the top (end of memory) and go to the bottom.
 	 * The memblock_find_in_range() gets us a block of RAM from the
 	 * end of RAM in [min_pfn_mapped,
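
The step-size heuristic in memory_map_from_low() can be tried out in user space. The following is only a rough sketch under stated assumptions: every sketch_* name is made up for illustration, the PMD size is assumed to be 2 MiB, and init_range_memory_mapping() is replaced by a stand-in that treats the whole range as RAM. It counts how many chunks the loop would map.

```c
#define SKETCH_PMD_SIZE        (2ULL << 20) /* assumption: 2 MiB PMD size */
#define SKETCH_STEP_SIZE_SHIFT 5            /* same value as STEP_SIZE_SHIFT in the patch */

/* Stand-in for init_range_memory_mapping(): pretend all of [start, end) is RAM. */
static unsigned long long sketch_map_range(unsigned long long start,
					   unsigned long long end)
{
	return end - start;
}

/* Round x up to a multiple of step (step is a power of two here). */
static unsigned long long sketch_round_up(unsigned long long x,
					  unsigned long long step)
{
	return (x + step - 1) / step * step;
}

/* Walk [start, end) from low to high; return the number of chunks mapped. */
static unsigned int sketch_map_from_low(unsigned long long start,
					unsigned long long end)
{
	unsigned long long next, new_mapped, mapped = 0;
	unsigned long long step_size = SKETCH_PMD_SIZE;
	unsigned int chunks = 0;

	while (start < end) {
		if (end - start > step_size) {
			next = sketch_round_up(start + 1, step_size);
			if (next > end)
				next = end;
		} else {
			next = end;
		}

		new_mapped = sketch_map_range(start, next);
		start = next;

		/* Grow the step only after a chunk larger than everything mapped so far. */
		if (new_mapped > mapped)
			step_size <<= SKETCH_STEP_SIZE_SHIFT;
		mapped += new_mapped;
		chunks++;
	}

	return chunks;
}
```

With these assumptions, mapping 1 GiB starting at address 0 takes three chunks: 2 MiB, then up to the 64 MiB boundary, then the remainder.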
[PATCH v3 1/5] memblock: Introduce allocation direction to memblock.
The Linux kernel cannot migrate pages used by the kernel. As a result, kernel
pages cannot be hot-removed. So we cannot allocate hotpluggable memory for the
kernel.

ACPI SRAT (System Resource Affinity Table) contains the memory hotplug info.
But before SRAT is parsed, memblock has already started to allocate memory for
the kernel. So we need to prevent memblock from doing this.

In a memory hotplug system, any numa node the kernel resides in should be
unhotpluggable. And for a modern server, each node could have at least 16GB
memory. So memory around the kernel image is highly likely unhotpluggable.

So the basic idea is: Allocate memory from the end of the kernel image and to
the higher memory. Since memory allocation before SRAT is parsed won't be too
much, it could highly likely be in the same node with kernel image.

The current memblock can only allocate memory from high address to low. So
this patch introduces the allocation direction to memblock. It could be used
to tell memblock to allocate memory from high to low or from low to high.

Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
Reviewed-by: Zhang Yanfei zhangyan...@cn.fujitsu.com
---
 include/linux/memblock.h | 22 ++
 mm/memblock.c            | 13 +
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 31e95ac..a7d3436 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -19,6 +19,11 @@

 #define INIT_MEMBLOCK_REGIONS	128

+/* Allocation order. */
+#define MEMBLOCK_DIRECTION_HIGH_TO_LOW	0
+#define MEMBLOCK_DIRECTION_LOW_TO_HIGH	1
+#define MEMBLOCK_DIRECTION_DEFAULT	MEMBLOCK_DIRECTION_HIGH_TO_LOW
+
 struct memblock_region {
 	phys_addr_t base;
 	phys_addr_t size;
@@ -35,6 +40,7 @@ struct memblock_type {
 };

 struct memblock {
+	int current_direction; /* allocate from higher or lower address */
 	phys_addr_t current_limit;
 	struct memblock_type memory;
 	struct memblock_type reserved;
@@ -148,6 +154,12 @@ phys_addr_t memblock_alloc_try_nid(phys_addr_t size, phys_addr_t align, int nid)

 phys_addr_t memblock_alloc(phys_addr_t size, phys_addr_t align);

+static inline bool memblock_direction_bottom_up(void)
+{
+	return memblock.current_direction == MEMBLOCK_DIRECTION_LOW_TO_HIGH;
+}
+
+
 /* Flags for memblock_alloc_base() amd __memblock_alloc_base() */
 #define MEMBLOCK_ALLOC_ANYWHERE	(~(phys_addr_t)0)
 #define MEMBLOCK_ALLOC_ACCESSIBLE	0
@@ -175,6 +187,16 @@ static inline void memblock_dump_all(void)
 }

 /**
+ * memblock_set_current_direction - Set current allocation direction to allow
+ *                                  allocating memory from higher to lower
+ *                                  address or from lower to higher address
+ *
+ * @direction: In which order to allocate memory. Could be
+ *             MEMBLOCK_DIRECTION_{HIGH_TO_LOW|LOW_TO_HIGH}
+ */
+void memblock_set_current_direction(int direction);
+
+/**
  * memblock_set_current_limit - Set the current allocation limit to allow
  *                         limiting allocations to what is currently
  *                         accessible during boot
diff --git a/mm/memblock.c b/mm/memblock.c
index 0ac412a..f24ca2e 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -32,6 +32,7 @@ struct memblock memblock __initdata_memblock = {
 	.reserved.cnt		= 1,	/* empty dummy entry */
 	.reserved.max		= INIT_MEMBLOCK_REGIONS,

+	.current_direction	= MEMBLOCK_DIRECTION_DEFAULT,
 	.current_limit		= MEMBLOCK_ALLOC_ANYWHERE,
 };

@@ -995,6 +996,18 @@ void __init_memblock memblock_trim_memory(phys_addr_t align)
 	}
 }

+void __init_memblock memblock_set_current_direction(int direction)
+{
+	if (direction != MEMBLOCK_DIRECTION_HIGH_TO_LOW &&
+	    direction != MEMBLOCK_DIRECTION_LOW_TO_HIGH) {
+		pr_warn("memblock: Failed to set allocation order. "
+			"Invalid order type: %d\n", direction);
+		return;
+	}
+
+	memblock.current_direction = direction;
+}
+
 void __init_memblock memblock_set_current_limit(phys_addr_t limit)
 {
 	memblock.current_limit = limit;
--
1.7.1
Re: [PATCH] scripts/config: fix variable substitution command
On Fri, Sep 13, 2013 at 10:45 AM, Clement Chauplannaz chaup...@gmail.com wrote:

> Commit 229455bc02b87f7128f190c4491b4ce38648 accidentally changed the
> separator between sed `s' command and its parameters from ':' to '/'.
>
> Revert this change.
>
> Signed-off-by: Clement Chauplannaz chaup...@gmail.com

This patch fixes my issue with --set-str, thanks!

Tested-by: Linus Walleij linus.wall...@linaro.org

Yours,
Linus Walleij
[PATCH v3 2/5] memblock: Improve memblock to support allocation from lower address.
This patch modifies the memblock_find_in_range_node() to support two different
allocation directions. After this patch, memblock will check
memblock.current_direction, and decide in which direction to allocate memory.

Now it supports two allocation directions: bottom up and top down. When
direction is top down, it acts as before. When direction is bottom up, the
start address should be greater than the end of the kernel image. Otherwise,
it will be trimmed to kernel image end address.

Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
Reviewed-by: Zhang Yanfei zhangyan...@cn.fujitsu.com
---
 mm/memblock.c | 107 ++--
 1 files changed, 95 insertions(+), 12 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index f24ca2e..87a7f04 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -20,6 +20,8 @@
 #include <linux/seq_file.h>
 #include <linux/memblock.h>

+#include <asm-generic/sections.h>
+
 static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
 static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;

@@ -84,8 +86,81 @@ static long __init_memblock memblock_overlaps_region(struct memblock_type *type,
 }

 /**
+ * __memblock_find_range - find free area utility
+ * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
+ * @end: end of candidate range, can be %MEMBLOCK_ALLOC_{ANYWHERE|ACCESSIBLE}
+ * @size: size of free area to find
+ * @align: alignment of free area to find
+ * @nid: nid of the free area to find, %MAX_NUMNODES for any node
+ *
+ * Utility called from memblock_find_in_range_node(), find free area from
+ * lower address to higher address.
+ *
+ * RETURNS:
+ * Found address on success, %0 on failure.
+ */
+static phys_addr_t __init_memblock
+__memblock_find_range(phys_addr_t start, phys_addr_t end,
+		      phys_addr_t size, phys_addr_t align, int nid)
+{
+	phys_addr_t this_start, this_end, cand;
+	u64 i;
+
+	for_each_free_mem_range(i, nid, &this_start, &this_end, NULL) {
+		this_start = clamp(this_start, start, end);
+		this_end = clamp(this_end, start, end);
+
+		cand = round_up(this_start, align);
+		if (cand < this_end && this_end - cand >= size)
+			return cand;
+	}
+
+	return 0;
+}
+
+/**
+ * __memblock_find_range_rev - find free area utility, in reverse order
+ * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
+ * @end: end of candidate range, can be %MEMBLOCK_ALLOC_{ANYWHERE|ACCESSIBLE}
+ * @size: size of free area to find
+ * @align: alignment of free area to find
+ * @nid: nid of the free area to find, %MAX_NUMNODES for any node
+ *
+ * Utility called from memblock_find_in_range_node(), find free area from
+ * higher address to lower address.
+ *
+ * RETURNS:
+ * Found address on success, %0 on failure.
+ */
+static phys_addr_t __init_memblock
+__memblock_find_range_rev(phys_addr_t start, phys_addr_t end,
+			  phys_addr_t size, phys_addr_t align, int nid)
+{
+	phys_addr_t this_start, this_end, cand;
+	u64 i;
+
+	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
+		this_start = clamp(this_start, start, end);
+		this_end = clamp(this_end, start, end);
+
+		/*
+		 * Just in case that (this_end - size) underflows and cause
+		 * (cand >= this_start) to be true incorrectly.
+		 */
+		if (this_end < size)
+			break;
+
+		cand = round_down(this_end - size, align);
+		if (cand >= this_start)
+			return cand;
+	}
+
+	return 0;
+}
+
+/**
  * memblock_find_in_range_node - find free area in given range and node
- * @start: start of candidate range
+ * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
  * @end: end of candidate range, can be %MEMBLOCK_ALLOC_{ANYWHERE|ACCESSIBLE}
  * @size: size of free area to find
  * @align: alignment of free area to find
@@ -93,6 +168,11 @@ static long __init_memblock memblock_overlaps_region(struct memblock_type *type,
  *
  * Find @size free area aligned to @align in the specified range and node.
  *
+ * When allocation direction is from low to high, the @start should be greater
+ * than the end of the kernel image. Otherwise, it will be trimmed. And also,
+ * if allocation from low to high failed, will try to allocate memory from high
+ * to low again.
+ *
  * RETURNS:
  * Found address on success, %0 on failure.
  */
@@ -100,8 +180,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 					phys_addr_t end, phys_addr_t size,
 					phys_addr_t align, int nid)
 {
-	phys_addr_t this_start, this_end, cand;
-	u64 i;
+
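
The difference between the two search helpers comes down to how the candidate address is computed inside one free range. The sketch below is a user-space illustration only: it handles a single range [start, end) instead of iterating over memblock regions, assumes a power-of-two alignment, and uses made-up sketch_* names rather than the kernel helpers.

```c
#include <stdint.h>

typedef uint64_t sketch_addr_t;

/* Round up/down to a power-of-two alignment. */
static sketch_addr_t sketch_round_up(sketch_addr_t x, sketch_addr_t align)
{
	return (x + align - 1) & ~(align - 1);
}

static sketch_addr_t sketch_round_down(sketch_addr_t x, sketch_addr_t align)
{
	return x & ~(align - 1);
}

/* Bottom up: lowest aligned address in [start, end) that fits size, 0 on failure. */
static sketch_addr_t sketch_find_bottom_up(sketch_addr_t start, sketch_addr_t end,
					   sketch_addr_t size, sketch_addr_t align)
{
	sketch_addr_t cand = sketch_round_up(start, align);

	if (cand < end && end - cand >= size)
		return cand;
	return 0;
}

/* Top down: highest aligned address in [start, end) that fits size, 0 on failure. */
static sketch_addr_t sketch_find_top_down(sketch_addr_t start, sketch_addr_t end,
					  sketch_addr_t size, sketch_addr_t align)
{
	sketch_addr_t cand;

	/* Guard against (end - size) wrapping around. */
	if (end < size)
		return 0;

	cand = sketch_round_down(end - size, align);
	if (cand >= start)
		return cand;
	return 0;
}
```

For a free range [0x1000, 0x10000) and a 0x2000-byte request aligned to 0x1000, the bottom-up helper picks 0x1000 while the top-down helper picks 0xE000, so early allocations stay close to the start of the range in the first case.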
[PATCH v3 3/5] x86, acpi, crash, kdump: Do reserve_crashkernel() after SRAT is parsed.
Memory reserved for crashkernel could be large. So we should not allocate this
memory bottom up from the end of kernel image.

When SRAT is parsed, we will be able to know which memory is hotpluggable, and
we can avoid allocating this memory for the kernel. So reorder
reserve_crashkernel() after SRAT is parsed.

Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
Reviewed-by: Zhang Yanfei zhangyan...@cn.fujitsu.com
---
 arch/x86/kernel/setup.c | 8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f0de629..36cfce3 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1120,8 +1120,6 @@ void __init setup_arch(char **cmdline_p)
 	acpi_initrd_override((void *)initrd_start, initrd_end - initrd_start);
 #endif

-	reserve_crashkernel();
-
 	vsmp_init();

 	io_delay_init();
@@ -1136,6 +1134,12 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	memblock_find_dma_reserve();

+	/*
+	 * Reserve memory for crash kernel after SRAT is parsed so that it
+	 * won't consume hotpluggable memory.
+	 */
+	reserve_crashkernel();
+
 #ifdef CONFIG_KVM_GUEST
 	kvmclock_init();
 #endif
--
1.7.1
Re: [PATCH 3/4] Add physical count arch timer support for clocksource in ARMv7.
On 13/09/13 09:49, cinifr wrote:
> On 13 September 2013 00:39, Marc Zyngier marc.zyng...@arm.com wrote:
>> On 12/09/13 17:07, cinifr wrote:
>>>> This cannot be a compile-time option as above in a multiplatform build.
>>>> Other platforms (e.g. KVM guests) *must* use the virtual counters to get
>>>> any semblance of a consistent view of time.
>>>
>>> Yes I accept compile-time option is not perfect in my pre email, But, Why
>>> Other platforms *must* use the virtual counters? I think KVM should not
>>> limit how to use arch timer in its guest OS. Of course, if KVM guest use
>>> vct can be more efficiency then that use pct. but KVM should and must
>>> support guest OS to access pct.
>>
>> The virtual counter is there for a good reason: it allows a virtual
>> machine to:
>> - see its time starting at zero
>> - be migrated to another host without seeing time shifting one way or
>>   another.
>>
>> So using the physical counter in a VM is a recipe for disaster if you're
>> doing any kind of time tracking. The counter being used for sched_clock(),
>> we cannot afford to see it being shifted one way or another.
>
> I accept that virtual count is better in VM than physical counter because
> hypervisor can modify VM timer by set CNTVOFF. But I think hypervisor
> should support that VM should can access physical counter, When VM use
> physical count. hypervisor could trap accessing physical count from guest
> OS, and return a value that guest OS want liking hypervisor set CNTVOFF for
> virtual counter. On this way, VM could too see its timer at zero and VM
> could too be migrated to another host without seeing time shifting.

I urge you to read the ARM ARM, and specifically the section dedicated to
trapping access to CP15 operations. If you do, you'll quickly notice that you
*cannot* trap accesses to the timer subsystem. All you can do is disable
access to the physical timer/counter, resulting in an UNDEF in the *guest*.

Additionally, please realise the overhead of trapping is enormous, and that we
do try very hard to minimise it. Why do you think we went out of our way to
ensure that host and guest would use different timers, always?

>> If you have issues with the use of the virtual counter, I suggest you fix
>> your firmware to have a consistent CNTVOFF across CPUs. And/or even better,
>> boot your kernel in HYP mode, as it will take care of setting CNTVOFF to
>> zero.
>
> I am wondering what is the principle between kernel and bootload? What
> should be done in bootloader and what should be done in kernel? As you
> said, If kernel boot from hyp, Kernel can set CNTVOFF to zero directly,
> does we add the code to set CNTVOFF in kernel? But, if kernel boot from PL1
> NS=0, Does kernel need to switch hyp mode to set CNTVOFF and return PL1
> NS=0 mode? Or, kernel dont care it because kernel believe bootloader have
> set CNTVOFF before?

In an ideal world, the bootloader should set CNTVOFF to zero. The fact that
the kernel does it too when booted in HYP mode is to preserve itself from
broken bootloaders.

CNTVOFF can only be setup from either HYP or Secure Monitor mode with
SCR.NS == 1, so if you run your kernel in secure mode, it is always best to do
it in the bootloader.

	M.
--
Jazz is not dead. It just smells funny...
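
The two properties Marc lists for the virtual counter follow from the architectural relation CNTVCT = CNTPCT - CNTVOFF. The arithmetic can be sketched in plain C; the function names below are hypothetical and nothing here touches real timer registers, it only shows how a hypervisor-chosen offset keeps the guest's view continuous.

```c
#include <stdint.h>

/* Guest-visible (virtual) count for a given physical count and offset. */
static uint64_t sketch_virtual_count(uint64_t physical, uint64_t cntvoff)
{
	return physical - cntvoff;
}

/*
 * Offset a hypervisor would pick so that the guest reads guest_count at the
 * moment the host physical counter reads physical. Unsigned arithmetic wraps,
 * so this also works when the new host's counter is behind the guest's.
 */
static uint64_t sketch_offset_for(uint64_t physical, uint64_t guest_count)
{
	return physical - guest_count;
}
```

For example, a guest started when the host counter reads 1000 gets an offset of 1000 and sees time starting at zero. If it is later migrated at guest count 500 to a host whose counter reads only 90, the new offset wraps around, yet the guest still reads 500 at that instant and 510 ten ticks later.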
Re: [GIT PULL] perf fixes
Hi,

On 13 September 2013 07:09, Ingo Molnar mi...@kernel.org wrote:
>
> * David Ahern dsah...@gmail.com wrote:
>
>>> By default a simple 'make' should build perf to the maximum extent
>>> possible, with no other input required from the user - with warnings
>>> displayed as package install suggestions.
>>
>> By default there is no config. Autoprobing generates a first one or a user
>> can specify a defconfig.
>
> This could work if there's not two but three states for individual
> features:
>
>  - autoprobe
>  - on
>  - off
>
> and if autoprobe, if a system feature has been probed successfully,
> automatically turned 'autoprobe' entries into 'on'.
>
> That would give us the best of all worlds - autodetection, configurability
> and caching:
>
>  - initial user types 'make' and gets a .config that has almost all
>    entries 'on', a few 'autoprobe'.
>
>  - once the user installs a dependency, the corresponding .config entry
>    turns into 'on'.
>
>  - the regular user or developers would have libraries that turn all
>    entries in the .config to 'on'.
>
>  - if a user is genuinely uninterested in a feature, he can mark it 'off',
>    which would then stay off permanently. This could also be used by
>    embedded/specialized builds.
>
>  - other specialized users, like distro builds, could use a .config with
>    all entries 'on' and could enforce the presence of all dependencies for
>    a successful build. [We could add 'make allyesconfig' to help that.]

Is there a way to detect the presence of a dependency and _also_ check its
version? Some new features are depending on a recent version of a library,
e.g. dwarf unwinding depends on libunwind >= 1.1 (cf.
http://www.spinics.net/lists/kernel/msg1598951.html).

Thanks,
Jean

> Thanks,
>
>         Ingo
Re: [PATCH 2/4] scripts/config: use sed's POSIX interface
On Fri, Sep 13, 2013 at 10:38 AM, Clément Chauplannaz chaup...@gmail.com wrote:

> Thank you for this report. I was able to reproduce this bug and fix it.

Thanks! Tested and works fine.

> My previous commit changed the separator between sed's substitute
> command and its parameters, from ':' to '/'. The latter conflicted
> with the slashes found in the value of variable CMDLINE, as provided
> in your email.

Hm it could actually be useful to be able to have colons in a CMDLINE, I
wonder if we can think about some better separator ... oh well that is another
issue, all old scripts work now anyway.

Yours,
Linus Walleij
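
The separator problem discussed here is that sed's s command breaks as soon as the chosen delimiter also occurs in the pattern or the replacement. One possible way to think about a "better separator" is to pick, per call, a character that appears in neither string. The helper below is purely a hypothetical illustration of that idea in C; scripts/config does nothing of the sort, and the candidate list is an arbitrary assumption.

```c
#include <string.h>

/*
 * Return the first character from candidates that occurs in neither before
 * nor after, or '\0' if every candidate is already used by one of them.
 */
static char sketch_pick_separator(const char *before, const char *after,
				  const char *candidates)
{
	const char *c;

	for (c = candidates; *c != '\0'; c++) {
		if (strchr(before, *c) == NULL && strchr(after, *c) == NULL)
			return *c;
	}

	return '\0';
}
```

With the candidates "/:|", a replacement value containing a path such as root=/dev/sda1 would yield ':', while a value that contains both slashes and colons would fall through to '|'.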
Re: [PATCH] dma: pl330: Simplify irq allocation
On 09/12/2013 03:42 PM, Vinod Koul wrote:
> On Wed, Sep 04, 2013 at 04:40:17PM +0200, Michal Simek wrote:
>> Use devm_request_irq function.
> Applied, thanks

Thanks.

I have one change in my tree which I would like to get to the mainline but it
will need more interaction with others pl330 users.

It is about adding support for interrupt per channel feature. For our case we
have 8 channels and every channel has specific interrupt + we have one abort
IRQ.

Based on the binding and Linux code two changes are necessary.
1. Extend AMBA_NR_IRQS because we have 9 IRQs
2. Driver change which is just register irqs when they are available.

Are these two changes OK for you?

Thanks,
Michal

--
Michal Simek, Ing. (M.Eng), OpenPGP - KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
Re: [GIT PULL] target updates for v3.12-rc1
On 9/13/13, Nicholas A. Bellinger n...@linux-iscsi.org wrote:

.., Removed individual CCs

> The patches to add COMPARE_AND_WRITE and EXTENDED_COPY support are of
> particular significance, which make us the first and only open source
> target to support the full set of VAAI primitives.

Probably not the first and only open source target. quadstor
(http://www.quadstor.com) also has full VAAI support. Thought you should know
and my apologies if this reply is inappropriate for the devel lists.

Many thanks for the VAAI support !

- jb
[PATCH v4] mfd: rtsx: Modify rts5249_optimize_phy
From: Wei WANG wei_w...@realsil.com.cn

In some platforms, especially the Thinkpad series, rts5249 won't be
initialized properly. So we need to adjust some phy parameters to improve
compatibility.

It is a little different between simulation and real chip. We have no idea
about which configuration is better before tape-out. We set default settings
according to simulation, but need to tune these parameters after getting the
real chip.

I can't explain every change in detail here. The below information is just a
rough description:

PHY_REG_REV: Disable internal clkreq_tx, enable rx_pwst
PHY_BPCR: No change, just turn the magic number to macro definitions
PHY_PCR: Change OOBS sensitivity, from 60mV to 90mV
PHY_RCR2: Control charge-pump current automatically
PHY_FLD4: Use TX cmu reference clock
PHY_RDR: Change RXDSEL from 30nF to 1.9nF
PHY_RCR1: Change the duration between adp_st and asserting cp_en from 0.32 us
          to 0.64us
PHY_FLD3: Adjust internal timers
PHY_TUNE: Fine tune the regulator12 output voltage

Signed-off-by: Wei WANG wei_w...@realsil.com.cn
---
 drivers/mfd/rts5249.c        | 48 --
 include/linux/mfd/rtsx_pci.h | 53 ++
 2 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/drivers/mfd/rts5249.c b/drivers/mfd/rts5249.c
index 3b835f5..573de7b 100644
--- a/drivers/mfd/rts5249.c
+++ b/drivers/mfd/rts5249.c
@@ -130,13 +130,57 @@ static int rts5249_optimize_phy(struct rtsx_pcr *pcr)
 {
 	int err;

-	err = rtsx_pci_write_phy_register(pcr, PHY_REG_REV, 0xFE46);
+	err = rtsx_pci_write_phy_register(pcr, PHY_REG_REV,
+			PHY_REG_REV_RESV | PHY_REG_REV_RXIDLE_LATCHED |
+			PHY_REG_REV_P1_EN | PHY_REG_REV_RXIDLE_EN |
+			PHY_REG_REV_RX_PWST | PHY_REG_REV_CLKREQ_DLY_TIMER_1_0 |
+			PHY_REG_REV_STOP_CLKRD | PHY_REG_REV_STOP_CLKWR);
 	if (err < 0)
 		return err;

 	msleep(1);

-	return rtsx_pci_write_phy_register(pcr, PHY_BPCR, 0x05C0);
+	err = rtsx_pci_write_phy_register(pcr, PHY_BPCR,
+			PHY_BPCR_IBRXSEL | PHY_BPCR_IBTXSEL |
+			PHY_BPCR_IB_FILTER | PHY_BPCR_CMIRROR_EN);
+	if (err < 0)
+		return err;
+	err = rtsx_pci_write_phy_register(pcr, PHY_PCR,
+			PHY_PCR_FORCE_CODE | PHY_PCR_OOBS_CALI_50 |
+			PHY_PCR_OOBS_VCM_08 | PHY_PCR_OOBS_SEN_90 |
+			PHY_PCR_RSSI_EN);
+	if (err < 0)
+		return err;
+	err = rtsx_pci_write_phy_register(pcr, PHY_RCR2,
+			PHY_RCR2_EMPHASE_EN | PHY_RCR2_NADJR |
+			PHY_RCR2_CDR_CP_10 | PHY_RCR2_CDR_SR_2 |
+			PHY_RCR2_FREQSEL_12 | PHY_RCR2_CPADJEN |
+			PHY_RCR2_CDR_SC_8 | PHY_RCR2_CALIB_LATE);
+	if (err < 0)
+		return err;
+	err = rtsx_pci_write_phy_register(pcr, PHY_FLD4,
+			PHY_FLD4_FLDEN_SEL | PHY_FLD4_REQ_REF |
+			PHY_FLD4_RXAMP_OFF | PHY_FLD4_REQ_ADDA |
+			PHY_FLD4_BER_COUNT | PHY_FLD4_BER_TIMER |
+			PHY_FLD4_BER_CHK_EN);
+	if (err < 0)
+		return err;
+	err = rtsx_pci_write_phy_register(pcr, PHY_RDR, PHY_RDR_RXDSEL_1_9);
+	if (err < 0)
+		return err;
+	err = rtsx_pci_write_phy_register(pcr, PHY_RCR1,
+			PHY_RCR1_ADP_TIME | PHY_RCR1_VCO_COARSE);
+	if (err < 0)
+		return err;
+	err = rtsx_pci_write_phy_register(pcr, PHY_FLD3,
+			PHY_FLD3_TIMER_4 | PHY_FLD3_TIMER_6 |
+			PHY_FLD3_RXDELINK);
+	if (err < 0)
+		return err;
+	return rtsx_pci_write_phy_register(pcr, PHY_TUNE,
+			PHY_TUNE_TUNEREF_1_0 | PHY_TUNE_VBGSEL_1252 |
+			PHY_TUNE_SDBUS_33 | PHY_TUNE_TUNED18 |
+			PHY_TUNE_TUNED12);
 }

 static int rts5249_turn_on_led(struct rtsx_pcr *pcr)
diff --git a/include/linux/mfd/rtsx_pci.h b/include/linux/mfd/rtsx_pci.h
index d1382df..0ce7721 100644
--- a/include/linux/mfd/rtsx_pci.h
+++ b/include/linux/mfd/rtsx_pci.h
@@ -756,6 +756,59 @@
 #define PCR_SETTING_REG2		0x814
 #define PCR_SETTING_REG3		0x747

+/* Phy bits */
+#define PHY_PCR_FORCE_CODE		0xB000
+#define PHY_PCR_OOBS_CALI_50		0x0800
+#define PHY_PCR_OOBS_VCM_08		0x0200
+#define PHY_PCR_OOBS_SEN_90		0x0040
+#define PHY_PCR_RSSI_EN			0x0002
+
+#define PHY_RCR1_ADP_TIME		0x0100
+#define PHY_RCR1_VCO_COARSE		0x001F
+
+#define PHY_RCR2_EMPHASE_EN		0x8000
+#define PHY_RCR2_NADJR			0x4000
+#define PHY_RCR2_CDR_CP_10		0x0400
+#define
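
To see what the patch's "magic number to macro" conversion amounts to, the register words can be rebuilt from the bit definitions visible in the (truncated) header hunk. The sketch below copies only the PHY_PCR and PHY_RCR1 values shown in the diff; the sketch_* helper names are made up, and registers whose macros are cut off are left out rather than guessed.

```c
#include <stdint.h>

/* Bit definitions copied from the visible part of the rtsx_pci.h hunk. */
#define PHY_PCR_FORCE_CODE	0xB000
#define PHY_PCR_OOBS_CALI_50	0x0800
#define PHY_PCR_OOBS_VCM_08	0x0200
#define PHY_PCR_OOBS_SEN_90	0x0040
#define PHY_PCR_RSSI_EN		0x0002

#define PHY_RCR1_ADP_TIME	0x0100
#define PHY_RCR1_VCO_COARSE	0x001F

/* Word written to PHY_PCR by the new rts5249_optimize_phy(). */
static uint16_t sketch_phy_pcr_value(void)
{
	return PHY_PCR_FORCE_CODE | PHY_PCR_OOBS_CALI_50 |
	       PHY_PCR_OOBS_VCM_08 | PHY_PCR_OOBS_SEN_90 |
	       PHY_PCR_RSSI_EN;
}

/* Word written to PHY_RCR1 by the new rts5249_optimize_phy(). */
static uint16_t sketch_phy_rcr1_value(void)
{
	return PHY_RCR1_ADP_TIME | PHY_RCR1_VCO_COARSE;
}
```

OR-ing the listed bits gives 0xBA42 for PHY_PCR and 0x011F for PHY_RCR1, which is the kind of raw constant the named macros replace.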
Re: [GIT PULL] perf fixes
* Jean Pihet jean.pi...@linaro.org wrote: Hi, On 13 September 2013 07:09, Ingo Molnar mi...@kernel.org wrote: * David Ahern dsah...@gmail.com wrote: By default a simple 'make' should build perf to the maximum extent possible, with no other input required from the user - with warnings displayed as package install suggestions. By default there is no config. Autoprobing generates a first one or a user can specify a defconfig. This could work if there's not two but three states for individual features: - autoprobe - on - off and if autoprobe, if a system feature has been probed successfully, automatically turned 'autoprobe' entries into 'on'. That would give us the best of all worlds - autodetection, configurability and caching: - initial user types 'make' and gets a .config that has almost all entries 'on', a few 'autoprobe'. - once the user installs a dependency, the corresponding .config entry turns into 'on'. - the regular user or developers would have libraries that turn all entries in the .config to 'on'. - if a user is genuinely uninterested in a feature, he can mark it 'off', which would then stay off permanently. This could also be used by embedded/specialized builds. - other specialized users, like distro builds, could use a .config with all entries 'on' and could enforce the presence of all dependencies for a successful build. [We could add 'make allyesconfig' to help that.] Is there a way to detect the presence of a dependency and _also_ check its version? Some new features are depending on a recent version of a library, e.g. dwarf unwinding depends on libunwind = 1.1 (cf. http://www.spinics.net/lists/kernel/msg1598951.html). Yeah, see the testcases in tools/perf/config/feature-tests.mak, they typically include the latest library API usages, which will fail on older versions. That kind of 'does it actually work?' test is a lot more robust than explicit version checks, and combined with caching it should be fast and parallelizable as well. 
(One of the problems of the current simple implementation of the feature tests is that they are 20 serial tests with no parallelization.)

Thanks,

	Ingo
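The three-state scheme discussed above can be sketched in C. This is only an illustration under stated assumptions: the enum, the struct and resolve_feature() are hypothetical names, the real perf build logic lives in make, and treating a failed probe on an 'on' entry as a build error is my reading of the "enforce the presence of all dependencies" case, not something the thread spells out.

```c
#include <stdbool.h>

/* Hypothetical tri-state for one build feature, mirroring the
 * autoprobe / on / off idea from the thread. */
enum feature_state {
	FEATURE_AUTOPROBE,
	FEATURE_ON,
	FEATURE_OFF,
};

/* Outcome of resolving one feature for the current build. */
struct feature_result {
	enum feature_state cached;	/* value to write back to .config */
	bool enabled;			/* build the feature now? */
	bool build_error;		/* 'on' was requested but the probe failed */
};

static struct feature_result resolve_feature(enum feature_state cfg,
					     bool probe_ok)
{
	struct feature_result r = { cfg, false, false };

	switch (cfg) {
	case FEATURE_OFF:
		/* Stays off permanently; the probe result is ignored. */
		break;
	case FEATURE_ON:
		/* Assumption: an enforced feature with a missing
		 * dependency fails the build. */
		r.enabled = probe_ok;
		r.build_error = !probe_ok;
		break;
	case FEATURE_AUTOPROBE:
		/* A successful probe is cached as 'on'; a failed one
		 * leaves the entry at 'autoprobe' for the next run. */
		if (probe_ok) {
			r.cached = FEATURE_ON;
			r.enabled = true;
		}
		break;
	}
	return r;
}
```

Calling resolve_feature(FEATURE_AUTOPROBE, true) yields a result whose cached state is FEATURE_ON, which is the "entry turns into 'on' once the dependency is installed" step.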
Re: [PATCH 5/5] ARM64: Add support for ILP32 ABI.
On Fri, Sep 13, 2013 at 07:18:48AM +0100, Andrew Pinski wrote: On Wed, Sep 11, 2013 at 7:32 AM, Catalin Marinas catalin.mari...@arm.com wrote: On Mon, Sep 09, 2013 at 10:32:59PM +0100, Andrew Pinski wrote: This patch adds full support of the ABI to the ARM64 target. This description is too short. Please describe what the ABI is, what are the commonalities with AArch64 and AArch32, what other non-obvious things had to be done (like __kernel_long_t being long long). Split this patch into multiple patches like base syscall handling, signal handling, pselect6/ppoll, vdso etc. It's too much to review at once. Ok. I will do so after my vacation next week. On top of these, I would really like to see Documentation/arm64/ilp32.txt describing the ABI. No other target does not, not even x86_64 for x32. Well, I'm sure they wouldn't mind if you submitted documentation for them too. I would also like to know (you can state this in the cover letter) the level of testing for all 3 types of ABI. I'm worried that at least this patch breaks the current compat ABI (has LTP reported anything?). We did test LTP on an earlier version of this patch for all three ABIs, I will make sure that the next version I send out is tested on all three ABIs also. We also tested ILP32/LP64 on big-endian at the same time which we will continue to do (I should push for our team here to push out the big-endian patches). We also have some BE patches internally, but obviously they just target LP64 and AArch32 compat. I'd hope to get these out shortly (the current issue is extensive testing, since we don't have much of a BE userspace). 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index cc64df5..7fdc994 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -248,7 +248,7 @@ source fs/Kconfig.binfmt config COMPAT def_bool y - depends on ARM64_AARCH32 + depends on ARM64_AARCH32 || ARM64_ILP32 select COMPAT_BINFMT_ELF config ARM64_AARCH32 (nitpick) We used to have an option like this, called CONFIG_AARCH32_EMULATION, which I think is clearer than CONFIG_ARM64_AARCH32. diff --git a/arch/arm64/kernel/vdsoilp32/Makefile b/arch/arm64/kernel/vdsoilp32/Makefile new file mode 100644 index 000..ec93f3f --- /dev/null +++ b/arch/arm64/kernel/vdsoilp32/Makefile Could we not keep vdso in the same directory? I started out that way but make clean ARCH=arm64 did not clean the vdso files all the time. Can you elaborate please? I'd much rather we fix broken make rules instead of botching around the issue by creating new directories. Will -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] tools, perf: Add a precise event qualifier v2
* Peter Zijlstra pet...@infradead.org wrote: On Thu, Sep 12, 2013 at 07:36:17PM +0200, Andi Kleen wrote: Your feature to export 'precise' requirements on events looks useful to me. We could implement it not by special casing it implicitly but by saying that if ../format/precise contains something like: attr:240-241 Since we currently have the pattern $name:bits to mean perf_event_attr::$name the above would imply and create a possible collision with perf_event_attr::attr. If we're going to do this I'd propose using something like _:240-241, for while '_' is a valid name in C its not something we're ever going to allow in perf_event_attr. Ok - and I'm not against adding individual 'names' one by one as well, that allows us to expose only the fields that relate to event configuration. For example if we added 'type' as well we could expose the generic, hardware-independent events via sysfs as well. ( Eventually this scheme would be fit to expose more advanced events as well: such as composite groups of events with simple arithmetic operations between them. That would allow the exposure of E1+E2-E3 type of simple calculations. ) Wouldn't we need different bits for each architecture then? 32bit/64bit, some archs with weird alignment rules, maybe different for BE/LE too? Typically PMU drivers are per arch and all the format stuff is per pmu driver so I'd not worry about that just yet. ok. But yes, while the perf_event_attr thing is ABI its not identical across archs. Yeah, like syscalls - it's not an on-disk format. Ok I suppose it could be somehow auto generated in asm-offsets.c, although I'm not sure how to get a bitfield offset there. Yes, that is an unfortunate situation. I (and either Acme or Jolsa) tried wrapping the bitfield in an anonymous union to create a named variable for the entire u64 but older GCC completely fails with that. 
We could be careful with bitfields and enumerate their offsets explicitly, with a build time testcase that makes sure that the offsets match reality.

Thanks,

	Ingo
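A rough sketch of such a testcase, under stated assumptions: struct demo_flags is a hypothetical stand-in for one u64 of flag bitfields (not the real perf_event_attr), the enumerated offsets assume the little-endian GCC bitfield layout, and the check runs as a small test program during the build rather than as a pure compile-time assertion, since C offers no offsetof() for bitfields.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a u64 worth of flag bitfields. */
struct demo_flags {
	uint64_t disabled : 1,
		 inherit  : 1,
		 precise  : 2,
		 reserved : 60;
};

/* Explicitly enumerated offsets, as they would be exported
 * (assumed: little-endian GCC layout, first field at bit 0). */
enum { DISABLED_BIT = 0, INHERIT_BIT = 1, PRECISE_BIT = 2 };

/* Index of the lowest set bit in v, or -1 if v is zero. */
static int lowest_set_bit(uint64_t v)
{
	for (int i = 0; i < 64; i++)
		if (v & (UINT64_C(1) << i))
			return i;
	return -1;
}

/* Generate probe_<field>(): set only the field's low bit, copy the
 * whole word out and report which bit of the u64 ended up set. */
#define DEFINE_PROBE(field)				\
static int probe_##field(void)				\
{							\
	struct demo_flags f;				\
	uint64_t raw;					\
							\
	memset(&f, 0, sizeof(f));			\
	f.field = 1;					\
	memcpy(&raw, &f, sizeof(raw));			\
	return lowest_set_bit(raw);			\
}

DEFINE_PROBE(disabled)
DEFINE_PROBE(inherit)
DEFINE_PROBE(precise)

/* 1 if the enumerated offsets match what the compiler really did. */
static int layout_matches(void)
{
	return probe_disabled() == DISABLED_BIT &&
	       probe_inherit() == INHERIT_BIT &&
	       probe_precise() == PRECISE_BIT;
}
```

A build could run a tiny program that fails when layout_matches() returns 0, so a mismatch between the exported offsets and the compiler's layout is caught before anything ships.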
[PATCHv2 0/11] staging: usbip: Userland crypto and ACLs
Hi,

this patch series includes an updated version of the IPv6 support patch
(a call to freeaddrinfo() was missing) as well as:

 - The client/server authentication support using GnuTLS Tobias already
   announced on the usbip-devel mailing list some time ago[1]
 - Support for restricting the access to devices to specific IP address
   ranges
 - Improved error reporting and new error codes to be passed over the
   TCP protocol.

We think that the added features justify a version bump to 1.2.0. The
corresponding patch is also included. All protocol changes are
backwards-compatible, thus, we don't increment the protocol version.

We've already sent this patch series, but forgot to specify a subject
line. linux-usb apparently received it[2], the LKML didn't. Sorry if you
received it twice now.

Regards,
	Tobias Polzer and Dominik Paulus

[1] See 6aeb926e1c4572e79488a91a827333c9.squir...@faumail.uni-erlangen.de
[2] http://www.mail-archive.com/linux-usb@vger.kernel.org/msg28689.html
[PATCHv2 04/11] staging: usbip: Add CIDR matching helper functions
This patch adds a few utility functions to match IP addresses against
CIDR masks.

Signed-off-by: Dominik Paulus <dominik.pau...@fau.de>
Signed-off-by: Tobias Polzer <tobias.pol...@fau.de>
---
 drivers/staging/usbip/userspace/src/utils.c | 84 +
 drivers/staging/usbip/userspace/src/utils.h | 15 ++
 2 files changed, 99 insertions(+)

diff --git a/drivers/staging/usbip/userspace/src/utils.c b/drivers/staging/usbip/userspace/src/utils.c
index 2d4966e..df40817 100644
--- a/drivers/staging/usbip/userspace/src/utils.c
+++ b/drivers/staging/usbip/userspace/src/utils.c
@@ -74,3 +74,87 @@ int modify_match_busid(char *busid, int add)

 	return ret;
 }
+
+/*
+ * Parses a string of form ip/prefix into a subnet mask to dest.
+ * Returns -1 on error, 0 on success
+ */
+int parse_cidr(const char *src, struct subnet *dest)
+{
+	char *ip, *prefix, *saveptr;
+	char *endptr;
+	struct in6_addr ip6;
+	struct in_addr ip4;
+	int bits;
+	long int tmp;
+	char buf[128]; /* For strtok */
+
+	strncpy(buf, src, sizeof(buf));
+	buf[sizeof(buf)-1] = 0;
+
+	ip = strtok_r(buf, "/", &saveptr);
+	prefix = strtok_r(NULL, "/", &saveptr);
+	if (strtok_r(NULL, "/", &saveptr) || !ip ||
+			strlen(src) > sizeof(buf) - 1)
+		return -1;
+
+	if (inet_pton(AF_INET6, ip, &ip6) == 1) {
+		dest->ai_family = AF_INET6;
+		bits = 128;
+		dest->address.ip6 = ip6;
+	} else if (inet_pton(AF_INET, ip, &ip4) == 1) {
+		dest->ai_family = AF_INET;
+		bits = 32;
+		dest->address.ip4 = ip4;
+	} else {
+		return -1;
+	}
+
+	/*
+	 * We also accept single IPs without an explicitely
+	 * specified prefix
+	 */
+	if (prefix) {
+		tmp = strtol(prefix, &endptr, 10);
+		if (tmp < 0 || tmp > bits || *endptr != '\0')
+			return -1;
+		dest->prefix = tmp;
+	} else {
+		dest->prefix = bits;
+	}
+
+	return 0;
+}
+
+/*
+ * Checks if addr is in range. Expects addr to be a struct in6_addr* if
+ * ai_family == AF_INET6, else struct in_addr*.
+ * Returns 1 if in range, 0 otherwise.
+ */
+int in_range(struct sockaddr_storage *addr, struct subnet range)
+{
+	if (addr->ss_family != range.ai_family)
+		return 0;
+	if (addr->ss_family == AF_INET6) {
+		int i;
+		struct sockaddr_in6 *in6addr = (struct sockaddr_in6 *) addr;
+		unsigned char *ip = in6addr->sin6_addr.s6_addr;
+		for (i = 0; i < range.prefix; ++i) {
+			int idx = i/8, mask = 1 << (7 - i%8);
+			if ((ip[idx] & mask) != (range.address.ip6.s6_addr[idx]
+					& mask))
+				return 0;
+		}
+	} else {
+		int i;
+		struct sockaddr_in *inaddr = (struct sockaddr_in *) addr;
+		uint32_t ip = ntohl(inaddr->sin_addr.s_addr);
+		uint32_t comp = ntohl(range.address.ip4.s_addr);
+		for (i = 0; i < range.prefix; ++i) {
+			int mask = 1 << (31-i);
+			if ((ip & mask) != (comp & mask))
+				return 0;
+		}
+	}
+	return 1;
+}
diff --git a/drivers/staging/usbip/userspace/src/utils.h b/drivers/staging/usbip/userspace/src/utils.h
index 5916fd3..a3704ef 100644
--- a/drivers/staging/usbip/userspace/src/utils.h
+++ b/drivers/staging/usbip/userspace/src/utils.h
@@ -19,7 +19,22 @@
 #ifndef __UTILS_H
 #define __UTILS_H

+#include <arpa/inet.h>
+#include <sys/socket.h>
+#include <netinet/ip.h>
+
+struct subnet {
+	int ai_family;
+	int prefix;
+	union {
+		struct in6_addr ip6;
+		struct in_addr ip4;
+	} address;
+};
+
 int modify_match_busid(char *busid, int add);
+int parse_cidr(const char *src, struct subnet *dest);
+int in_range(struct sockaddr_storage *addr, struct subnet range);

 #endif /* __UTILS_H */
-- 
1.8.4
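For comparison, the IPv4 range check can also be written as a single mask comparison instead of a per-bit loop. The following is a minimal sketch, not the code from the patch: ipv4_in_prefix() is a hypothetical helper that takes textual addresses so it can stand alone, and it covers IPv4 only.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Returns 1 if addr lies inside net/prefix, 0 if it does not and -1 on
 * unparsable input. Hypothetical helper, IPv4 only. */
static int ipv4_in_prefix(const char *addr, const char *net, int prefix)
{
	struct in_addr a, n;
	uint32_t mask;

	if (prefix < 0 || prefix > 32)
		return -1;
	if (inet_pton(AF_INET, addr, &a) != 1 ||
	    inet_pton(AF_INET, net, &n) != 1)
		return -1;

	/* Shifting a 32-bit value by 32 is undefined in C, so a /0
	 * prefix gets an explicit all-zero mask. */
	mask = prefix ? (uint32_t)(UINT32_MAX << (32 - prefix)) : 0;

	/* Compare in host byte order so the mask covers the leading bits. */
	return (ntohl(a.s_addr) & mask) == (ntohl(n.s_addr) & mask);
}
```

With this helper, ipv4_in_prefix("127.0.0.1", "127.0.0.0", 8) reports a match while ipv4_in_prefix("128.0.0.1", "127.0.0.0", 8) does not. The same idea extends to IPv6 by comparing whole bytes first and masking only the last partial byte.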
Re: [PATCH 3/5] FS: Export poll_select_copy_remaining and rename poll_select_copy_remaining in compat.c so it does not pick the wrong copy.
On Wed, Sep 11, 2013 at 10:00:14PM +0100, Andrew Pinski wrote: On Wed, Sep 11, 2013 at 4:09 AM, Catalin Marinas catalin.mari...@arm.com wrote: On Mon, Sep 09, 2013 at 10:32:57PM +0100, Andrew Pinski wrote: The ILP32 ABI in ARM64 uses a slightly different pselect from either the compat or even the native LP64 ABI. We would want to reuse some of the code path that are used as the size of the timespec is the same, so this patch exports poll_select_copy_remaining from fs/select.c and renames the copy in fs/compat.c to make sure that it is not being used. Signed-off-by: Andrew Pinski apin...@cavium.com I think this patch has to wait until we review the ILP32 ABI for arm64. When I looked at this some time ago I thought we can just use the native arm64 pselect6 and ppoll. Once we agree that's not possible we can push this patch. On its own it doesn't have much value. Since fd_set is defined by XPG4.2 to be a struct of an array of long's, we cannot change the definition in user space. I tried using the native ppoll/pselect for the ABI first. It worked for little-endian just fine but failed hard when big-endian. If we only care about little-endian arm64 at this point, I can remove this part of the patch and keep it for when we (Cavium) submits the big-endian patches. I see the issue now. I think for the initial set of patches we can assume little endian. We need a lot more testing, at least for AArch32 mode (we can kick off some tests here once these patches get closer to merging). -- Catalin -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
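One plausible reading of the little-endian/big-endian difference described above, offered as an assumption rather than as Andrew's analysis: an ILP32 process lays out fd_set as an array of 32-bit longs, while the LP64 kernel reads the same memory as an array of 64-bit longs. The sketch below uses a hypothetical helper, fd_byte_offset(), to compute which byte of the bitmap holds a given descriptor's bit for either word size.

```c
#include <stddef.h>

/* Byte offset, from the start of the bitmap, of the byte that holds
 * bit 'fd' when the bitmap is an array of words of 'word_bytes' bytes.
 * Hypothetical helper for illustration only. */
static size_t fd_byte_offset(unsigned int fd, size_t word_bytes,
			     int big_endian)
{
	size_t bits_per_word = 8 * word_bytes;
	size_t word = fd / bits_per_word;		/* array element */
	size_t byte_in_word = (fd % bits_per_word) / 8;	/* from the LSB */

	/* On big-endian the least significant byte is stored last. */
	if (big_endian)
		byte_in_word = word_bytes - 1 - byte_in_word;

	return word * word_bytes + byte_in_word;
}
```

On little-endian the result is fd / 8 for both 4-byte and 8-byte words, so the two views of the bitmap agree. On big-endian, descriptor 0 lands in byte 3 with 32-bit words but in byte 7 with 64-bit words, so the kernel would look for the bit in the wrong place.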
[PATCHv2 03/11] staging: usbip: Add kernel support for client ACLs
This patch adds the possibility to store ACLs for allowed clients for
each stub device in sysfs. It adds a new sysfs entry called usbip_acl
for each stub device, containing a list of CIDR masks of allowed
clients. This file will be used by usbip and usbipd to store the ACL.

Signed-off-by: Kurt Kanzenbach <ly80t...@cip.cs.fau.de>
Signed-off-by: Dominik Paulus <dominik.pau...@fau.de>
Signed-off-by: Tobias Polzer <tobias.pol...@fau.de>
---
 drivers/staging/usbip/stub.h     |  5 +++
 drivers/staging/usbip/stub_dev.c | 68 +++-
 2 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/usbip/stub.h b/drivers/staging/usbip/stub.h
index a73e437..cfe75d1 100644
--- a/drivers/staging/usbip/stub.h
+++ b/drivers/staging/usbip/stub.h
@@ -60,6 +60,11 @@ struct stub_device {
 	struct list_head unlink_free;

 	wait_queue_head_t tx_waitq;
+
+	/* list of allowed IP addrs */
+	char *acls;
+	/* for locking list operations */
+	spinlock_t ip_lock;
 };

 /* private data into urb->priv */
diff --git a/drivers/staging/usbip/stub_dev.c b/drivers/staging/usbip/stub_dev.c
index d8957a5..c44d5f2 100644
--- a/drivers/staging/usbip/stub_dev.c
+++ b/drivers/staging/usbip/stub_dev.c
@@ -142,6 +142,62 @@ err:
 }
 static DEVICE_ATTR(usbip_sockfd, S_IWUSR, NULL, store_sockfd);

+/*
+ * This function replaces the current ACL list
+ */
+static ssize_t store_acl(struct device *dev, struct device_attribute *attr,
+			 const char *buf, size_t count)
+{
+	struct stub_device *sdev = dev_get_drvdata(dev);
+	int retval = 0;
+
+	if (!sdev)
+		return -ENODEV;
+
+	if (count >= PAGE_SIZE)
+		/* Prevent storing oversized ACLs in kernel memory */
+		return -EINVAL;
+
+	/* Store ACL */
+	spin_lock_irq(&sdev->ip_lock);
+	kfree(sdev->acls);
+	sdev->acls = kstrdup(buf, GFP_KERNEL);
+	if (IS_ERR(sdev->acls)) {
+		retval = PTR_ERR(sdev->acls);
+		sdev->acls = NULL;
+	} else {
+		retval = strlen(sdev->acls);
+	}
+	spin_unlock_irq(&sdev->ip_lock);
+
+	return retval;
+}
+
+/*
+ * This functions prints all allowed IP addrs for this dev
+ */
+static ssize_t show_acl(struct device *dev, struct device_attribute *attr,
+			char *buf)
+{
+	struct stub_device *sdev = dev_get_drvdata(dev);
+	int retval = 0;
+
+	if (!sdev)
+		return -ENODEV;
+
+	spin_lock_irq(&sdev->ip_lock);
+	if (sdev->acls == NULL) {
+		retval = 0;
+	} else {
+		strcpy(buf, sdev->acls);
+		retval = strlen(buf);
+	}
+	spin_unlock_irq(&sdev->ip_lock);
+
+	return retval;
+}
+static DEVICE_ATTR(usbip_acl, S_IWUSR | S_IRUGO, show_acl, store_acl);
+
 static int stub_add_files(struct device *dev)
 {
 	int err = 0;
@@ -157,9 +213,13 @@ static int stub_add_files(struct device *dev)
 	err = device_create_file(dev, &dev_attr_usbip_debug);
 	if (err)
 		goto err_debug;
+	err = device_create_file(dev, &dev_attr_usbip_acl);
+	if (err)
+		goto err_ip;

 	return 0;
-
+err_ip:
+	device_remove_file(dev, &dev_attr_usbip_debug);
 err_debug:
 	device_remove_file(dev, &dev_attr_usbip_sockfd);
 err_sockfd:
@@ -173,6 +233,7 @@ static void stub_remove_files(struct device *dev)
 	device_remove_file(dev, &dev_attr_usbip_status);
 	device_remove_file(dev, &dev_attr_usbip_sockfd);
 	device_remove_file(dev, &dev_attr_usbip_debug);
+	device_remove_file(dev, &dev_attr_usbip_acl);
 }

 static void stub_shutdown_connection(struct usbip_device *ud)
@@ -306,12 +367,14 @@ static struct stub_device *stub_device_alloc(struct usb_device *udev,
 	sdev->ud.status = SDEV_ST_AVAILABLE;
 	spin_lock_init(&sdev->ud.lock);
 	sdev->ud.tcp_socket = NULL;
+	sdev->acls = NULL;

 	INIT_LIST_HEAD(&sdev->priv_init);
 	INIT_LIST_HEAD(&sdev->priv_tx);
 	INIT_LIST_HEAD(&sdev->priv_free);
 	INIT_LIST_HEAD(&sdev->unlink_free);
 	INIT_LIST_HEAD(&sdev->unlink_tx);
+	spin_lock_init(&sdev->ip_lock);
 	spin_lock_init(&sdev->priv_lock);

 	init_waitqueue_head(&sdev->tx_waitq);
@@ -507,6 +570,9 @@ static void stub_disconnect(struct usb_interface *interface)
 	usb_put_dev(sdev->udev);
 	usb_put_intf(interface);

+	/* free ACL list */
+	kfree(sdev->acls);
+
 	/* free sdev */
 	busid_priv->sdev = NULL;
 	stub_device_free(sdev);
-- 
1.8.4
Re: [PATCH 2/4] scripts/config: use sed's POSIX interface
On Sep 13, 2013, at 11:32 AM, Linus Walleij linus.wall...@linaro.org wrote:

> On Fri, Sep 13, 2013 at 10:38 AM, Clément Chauplannaz chaup...@gmail.com wrote:
>
>> Thank you for this report. I was able to reproduce this bug and fix it.
>
> Thanks! Tested and works fine.

Glad to read the patch solves your issue. Thanks for the quick feedback!

>> My previous commit changed the separator between sed's substitute
>> command and its parameters, from ':' to '/'. The latter conflicted
>> with the slashes found in the value of variable CMDLINE, as provided
>> in your email.
>
> Hm it could actually be useful to be able to have colons in a CMDLINE,
> I wonder if we can think about some better separator ... oh well that
> is another issue, all old scripts work now anyway.

Indeed, the config script may not work with all possible string values.
My first concern for now was to fall back to the previous interface. We
may look into hardening the script later on.

Best regards,
Clement Chauplannaz
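The separator clash can be illustrated outside the shell. The following is a minimal sketch in C, assuming a hypothetical helper pick_separator() that is not part of scripts/config: it scans a list of candidate delimiters and returns the first one that appears in neither operand of the substitution, which is one possible direction for the hardening mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Return the first character of 'candidates' that occurs in neither
 * 'before' nor 'after', or '\0' if every candidate is already used.
 * Hypothetical helper for illustration only. */
static char pick_separator(const char *before, const char *after,
			   const char *candidates)
{
	for (const char *c = candidates; *c; c++)
		if (!strchr(before, *c) && !strchr(after, *c))
			return *c;
	return '\0';
}
```

For a value such as root=/dev/sda1 and the candidate list "/:|", the helper skips '/' and returns ':'. If the value also contains a colon it moves on to '|', and it reports failure with '\0' once no candidate is free, at which point escaping the operands would be the only safe option.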
[PATCHv2 09/11] staging: usbip: Improve debug output
For IPv6, IP:Port is unreadable.

Signed-off-by: Dominik Paulus <dominik.pau...@fau.de>
Signed-off-by: Tobias Polzer <tobias.pol...@fau.de>
---
 drivers/staging/usbip/userspace/src/usbipd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/usbip/userspace/src/usbipd.c b/drivers/staging/usbip/userspace/src/usbipd.c
index ae572c6..6550460 100644
--- a/drivers/staging/usbip/userspace/src/usbipd.c
+++ b/drivers/staging/usbip/userspace/src/usbipd.c
@@ -519,7 +519,7 @@ static int do_accept(int listenfd)
 		return -1;
 	}
 #endif
-	info("connection from %s:%s", host, port);
+	info("connection from %s, port %s", host, port);

 	return connfd;
 }
-- 
1.8.4
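The readability problem comes from IPv6 literals containing colons themselves, so "::1:3240" cannot be split back into host and port. The patch solves this by spelling out "port". An alternative, shown here only as a sketch and not as what usbipd does, is the bracket notation familiar from URLs; format_endpoint() is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Format host and port unambiguously: a host containing ':' is taken
 * to be an IPv6 literal and gets brackets, anything else stays
 * host:port. Returns snprintf()'s result. Hypothetical helper, not
 * part of usbipd. */
static int format_endpoint(char *buf, size_t len,
			   const char *host, const char *port)
{
	if (strchr(host, ':'))
		return snprintf(buf, len, "[%s]:%s", host, port);
	return snprintf(buf, len, "%s:%s", host, port);
}
```

Given the host "::1" and port "3240" this produces "[::1]:3240", while an IPv4 host keeps the plain "192.0.2.1:3240" form.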
Re: increased vmap_area_lock contentions on n_tty: Move buffers into n_tty_data
On 09/12/2013 11:44 PM, Greg KH wrote: On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote: On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote: On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote: Hi Peter, FYI, we noticed much increased vmap_area_lock contentions since this commit: What does that mean? What is happening, are we allocating/removing more memory now? No. Same amount of memory, allocated and freed with the same frequency as before. What type of load were you running that showed this problem? The increased contentions and lock hold/wait time showed up in a number of test cases. [...] That's a lot of slowdowns, especially for such a simple patch. Peter, any ideas? Looks like this patch incidentally triggers some worst-case behavior in the memory manager. I'm not sure how this is possible with two 4k buffers, but the evidence is substantial. This patch isn't critical so I suggest we back out this patch for mainline but use the patch to find out what's wrong in the vmap area. Unfortunately, I'm on my way out the door and won't be back til Sunday pm (EST) so I'll get a revert to you then. Sorry 'bout that. Regards, Peter Hurley -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCHv2 08/11] staging: usbip: Handle usbip being started as user
usbip now prints an error message when started as user and requiring root access. Also, some debug messages are changed to error messages so the command line utilities now print less confusing (and more verbose) error messages when not used correctly. Signed-off-by: Dominik Paulus dominik.pau...@fau.de Signed-off-by: Tobias Polzer tobias.pol...@fau.de --- drivers/staging/usbip/userspace/src/usbip_attach.c | 3 +++ drivers/staging/usbip/userspace/src/usbip_bind.c | 16 ++-- 2 files changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/staging/usbip/userspace/src/usbip_attach.c b/drivers/staging/usbip/userspace/src/usbip_attach.c index 2a3f313..651e93a 100644 --- a/drivers/staging/usbip/userspace/src/usbip_attach.c +++ b/drivers/staging/usbip/userspace/src/usbip_attach.c @@ -210,6 +210,9 @@ int usbip_attach(int argc, char *argv[]) int opt; int ret = -1; + if (geteuid() != 0) + err(not running as root?); + for (;;) { opt = getopt_long(argc, argv, r:b:, opts, NULL); diff --git a/drivers/staging/usbip/userspace/src/usbip_bind.c b/drivers/staging/usbip/userspace/src/usbip_bind.c index d2739fc..ab26b30f 100644 --- a/drivers/staging/usbip/userspace/src/usbip_bind.c +++ b/drivers/staging/usbip/userspace/src/usbip_bind.c @@ -158,7 +158,7 @@ static int unbind_other(char *busid) busid_dev = sysfs_open_device(bus_type, busid); if (!busid_dev) { - dbg(sysfs_open_device %s failed: %s, busid, strerror(errno)); + err(sysfs_open_device %s failed: %s, busid, strerror(errno)); return -1; } @@ -166,7 +166,7 @@ static int unbind_other(char *busid) bDevClass = sysfs_get_device_attr(busid_dev, bDeviceClass); bNumIntfs = sysfs_get_device_attr(busid_dev, bNumInterfaces); if (!bConfValue || !bDevClass || !bNumIntfs) { - dbg(problem getting device attributes: %s, + err(problem getting device attributes: %s, strerror(errno)); goto err_close_busid_dev; } @@ -181,7 +181,7 @@ static int unbind_other(char *busid) bConfValue-value, i); intf_dev = sysfs_open_device(bus_type, intf_busid); if 
(!intf_dev) { - dbg(could not open interface device: %s, + err(could not open interface device: %s, strerror(errno)); goto err_close_busid_dev; } @@ -202,14 +202,14 @@ static int unbind_other(char *busid) /* unbinding */ intf_drv = sysfs_open_driver(bus_type, intf_dev-driver_name); if (!intf_drv) { - dbg(could not open interface driver on %s: %s, + err(could not open interface driver on %s: %s, intf_dev-name, strerror(errno)); goto err_close_intf_dev; } unbind_attr = sysfs_get_driver_attr(intf_drv, unbind); if (!unbind_attr) { - dbg(problem getting interface driver attribute: %s, + err(problem getting interface driver attribute: %s, strerror(errno)); goto err_close_intf_drv; } @@ -218,7 +218,8 @@ static int unbind_other(char *busid) SYSFS_BUS_ID_SIZE); if (rc 0) { /* NOTE: why keep unbinding other interfaces? */ - dbg(unbind driver at %s failed, intf_dev-bus_id); + err(unbind driver at %s failed: %s, intf_dev-bus_id, + strerror(errno)); status = UNBIND_ST_FAILED; } @@ -287,6 +288,9 @@ int usbip_bind(int argc, char *argv[]) allow[0] = 0; + if (geteuid() != 0) + err(not running as root?); + for (;;) { opt = getopt_long(argc, argv, a:b:, opts, NULL); -- 1.8.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCHv2 05/11] staging: usbip: Add ACL support to usbip bind
Add the command line argument -a (--allow) to usbip bind to specify networks allowed to attach to the device and code to store the ACLs in sysfs. Signed-off-by: Kurt Kanzenbach ly80t...@cip.cs.fau.de Signed-off-by: Dominik Paulus dominik.pau...@fau.de Signed-off-by: Tobias Polzer tobias.pol...@fau.de --- drivers/staging/usbip/userspace/doc/usbip.8 | 8 ++- drivers/staging/usbip/userspace/src/usbip_bind.c | 74 2 files changed, 69 insertions(+), 13 deletions(-) diff --git a/drivers/staging/usbip/userspace/doc/usbip.8 b/drivers/staging/usbip/userspace/doc/usbip.8 index b5050ed..b818bde 100644 --- a/drivers/staging/usbip/userspace/doc/usbip.8 +++ b/drivers/staging/usbip/userspace/doc/usbip.8 @@ -62,9 +62,15 @@ Detach an imported USB device. .PP .HP -\fBbind\fR \-\-busid=\fIbusid\fR +\fBbind\fR \-\-busid=\fIbusid\fR [\-\-allow=\fICIDR mask\fR...] .IP Make a device exportable. +.br +\-\-allow accepts CIDR masks like 127.0.0.0/8 or fd00::/64 +.br +Only hosts in (at least) one of the allowed ranges are accepted. If +\-\-allow is omitted, 0.0.0.0/0 and ::/0 are added to the list. The list can +be read/written from corresponding \fBusbip_acl\fR file in sysfs after bind. 
.PP .HP diff --git a/drivers/staging/usbip/userspace/src/usbip_bind.c b/drivers/staging/usbip/userspace/src/usbip_bind.c index 9ecaf6e..d2739fc 100644 --- a/drivers/staging/usbip/userspace/src/usbip_bind.c +++ b/drivers/staging/usbip/userspace/src/usbip_bind.c @@ -37,8 +37,9 @@ enum unbind_status { static const char usbip_bind_usage_string[] = usbip bind args\n - -b, --busid=busidBind USBIP_HOST_DRV_NAME .ko to device - on busid\n; + -b, --busid=busidBind USBIP_HOST_DRV_NAME .ko to + device on busid\n + -a, --allow=CIDR maskRestrict device access to CIDR mask\n; void usbip_bind_usage(void) { @@ -46,17 +47,19 @@ void usbip_bind_usage(void) } /* call at unbound state */ -static int bind_usbip(char *busid) +static int bind_usbip(char *busid, char *allow) { char bus_type[] = usb; char attr_name[] = bind; char sysfs_mntpath[SYSFS_PATH_MAX]; char bind_attr_path[SYSFS_PATH_MAX]; char intf_busid[SYSFS_BUS_ID_SIZE]; + char ip_attr_path[SYSFS_PATH_MAX]; struct sysfs_device *busid_dev; struct sysfs_attribute *bind_attr; struct sysfs_attribute *bConfValue; struct sysfs_attribute *bNumIntfs; + struct sysfs_attribute *usbip_ip; int i, failed = 0; int rc, ret = -1; @@ -101,8 +104,32 @@ static int bind_usbip(char *busid) dbg(bind driver at %s failed, intf_busid); failed = 1; } + + } + + /* +* store allowed IP ranges +* specified by `usbip bind -b busid --allow CIDR mask` +*/ + snprintf(ip_attr_path, sizeof(ip_attr_path), + %s/%s/%s/%s/%s/%s:%.1s.%d/%s, + sysfs_mntpath, SYSFS_BUS_NAME, bus_type, + SYSFS_DRIVERS_NAME, USBIP_HOST_DRV_NAME, busid, + bConfValue-value, 0, usbip_acl); + + usbip_ip = sysfs_open_attribute(ip_attr_path); + if (!usbip_ip) { + err(sysfs_open_attribute failed: path=%s, + ip_attr_path); + goto err_close_busid_dev; } + rc = sysfs_write_attribute(usbip_ip, allow, strlen(allow)); + if (rc) + err(sysfs_write_attribute failed); + + sysfs_close_attribute(usbip_ip); + if (!failed) ret = 0; @@ -213,7 +240,7 @@ out: return status; } -static int bind_device(char *busid) 
+static int bind_device(char *busid, char *allow) { int rc; @@ -233,7 +260,7 @@ static int bind_device(char *busid) return -1; } - rc = bind_usbip(busid); + rc = bind_usbip(busid, allow); if (rc 0) { err(could not bind device to %s, USBIP_HOST_DRV_NAME); modify_match_busid(busid, 0); @@ -249,29 +276,52 @@ int usbip_bind(int argc, char *argv[]) { static const struct option opts[] = { { busid, required_argument, NULL, 'b' }, + { allow, required_argument, NULL, 'a' }, { NULL,0, NULL, 0 } }; - int opt; - int ret = -1; + int opt, rc; + char allow[4096]; + char *device = NULL; + struct subnet subnet; + + allow[0] = 0; for (;;) { - opt = getopt_long(argc, argv, b:, opts, NULL); + opt = getopt_long(argc, argv, a:b:, opts, NULL); if (opt == -1) break; switch (opt) { + case 'a': + rc = parse_cidr(optarg, subnet); + if (rc 0) { +
[PATCHv2 06/11] staging: usbip: Add support for ACLs in usbipd
Interpret the ACLs stored in sysfs in usbipd and reject clients not matching one of the ACLs. Signed-off-by: Kurt Kanzenbach ly80t...@cip.cs.fau.de Signed-off-by: Dominik Paulus dominik.pau...@fau.de Signed-off-by: Tobias Polzer tobias.pol...@fau.de --- drivers/staging/usbip/userspace/src/Makefile.am | 2 +- drivers/staging/usbip/userspace/src/usbipd.c| 79 + 2 files changed, 80 insertions(+), 1 deletion(-) diff --git a/drivers/staging/usbip/userspace/src/Makefile.am b/drivers/staging/usbip/userspace/src/Makefile.am index a113003..5161bae 100644 --- a/drivers/staging/usbip/userspace/src/Makefile.am +++ b/drivers/staging/usbip/userspace/src/Makefile.am @@ -9,4 +9,4 @@ usbip_SOURCES := usbip.h utils.h usbip.c utils.c usbip_network.c \ usbip_bind.c usbip_unbind.c -usbipd_SOURCES := usbip_network.h usbipd.c usbip_network.c +usbipd_SOURCES := usbip_network.h usbipd.c usbip_network.c utils.c diff --git a/drivers/staging/usbip/userspace/src/usbipd.c b/drivers/staging/usbip/userspace/src/usbipd.c index 8db2f27..bc1fd19 100644 --- a/drivers/staging/usbip/userspace/src/usbipd.c +++ b/drivers/staging/usbip/userspace/src/usbipd.c @@ -48,6 +48,7 @@ #include usbip_host_driver.h #include usbip_common.h #include usbip_network.h +#include utils.h #undef PROGNAME #define PROGNAME usbipd @@ -169,12 +170,69 @@ static void usbipd_help(void) printf(%s\n, usbipd_help_string); } +/* + * Checks whether client IP matches at least one + * ACL entry + * + * Returns: + * 1 if matches + * 0 if not + * -1 on error + */ +static int check_allowed(char *acls, int sockfd) +{ + int rc, match; + struct sockaddr_storage sa; + char *acl_cpy, *iter, *saveptr; + socklen_t sa_len = sizeof(sa); + + rc = getpeername(sockfd, (struct sockaddr *) sa, sa_len); + if (rc || sa_len sizeof(sa)) { + err(getpeername failed: %s, strerror(errno)); + return -1; + } + + /* +* We are going to modify our argument, +* thus, we need to duplicate it. 
+*/ + acl_cpy = strdup(acls); + if (!acl_cpy) { + err(strdup(): %s, strerror(errno)); + return -1; + } + + match = 0; + iter = strtok_r(acl_cpy, \n, saveptr); + /* +* Iterate over ACL entries and check for +* matching one. +*/ + while (iter) { + struct subnet net; + + if (parse_cidr(iter, net) 0) { + dbg(parse_cidr() failed); + } else if (in_range(sa, net)) { + match = 1; + break; + } + + iter = strtok_r(NULL, \n, saveptr); + } + + free(acl_cpy); + return match; +} + static int recv_request_import(int sockfd) { struct op_import_request req; struct op_common reply; struct usbip_exported_device *edev; struct usbip_usb_device pdu_udev; + struct sysfs_attribute *usbip_acl; + char ip_attr_path[SYSFS_PATH_MAX]; int found = 0; int error = 0; int rc; @@ -206,6 +264,27 @@ static int recv_request_import(int sockfd) rc = usbip_host_export_device(edev, sockfd); if (rc 0) error = 1; + + /* check for allowed IPs */ + snprintf(ip_attr_path, sizeof(ip_attr_path), %s/%s:%d.%d/%s, + edev-udev.path, edev-udev.busid, + edev-udev.bConfigurationValue, 0, usbip_acl); + + usbip_acl = sysfs_open_attribute(ip_attr_path); + if (usbip_acl) { + rc = sysfs_read_attribute(usbip_acl); + if (rc 0) { + err(Unable to open sysfs); + error = 1; + } else if (check_allowed(usbip_acl-value, sockfd) != 1) { + info(Access denied to device %s, + edev-udev.busid); + error = 1; + } + sysfs_close_attribute(usbip_acl); + } else { + err(failed to get ip list); + } } else { info(requested device not found: %s, req.busid); error = 1; -- 1.8.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
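The ACL walk described in this patch (split the sysfs value on newlines, skip entries that do not parse, stop at the first match) can be tried out in isolation. The following is a standalone sketch under stated assumptions: acl_allows() and entry_matches() are hypothetical names, the matching is IPv4 only, and the client address is passed as text instead of being read with getpeername(), so it is not a drop-in for the patch's check_allowed().

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Match one "a.b.c.d/len" entry against a host-order IPv4 address.
 * Returns 1 on match, 0 on mismatch or unparsable entry. */
static int entry_matches(const char *entry, uint32_t client)
{
	char ip[INET_ADDRSTRLEN];
	const char *slash = strchr(entry, '/');
	size_t iplen = slash ? (size_t)(slash - entry) : strlen(entry);
	struct in_addr net;
	long prefix = 32;	/* a bare address means an exact match */
	char *end;
	uint32_t mask;

	if (iplen >= sizeof(ip))
		return 0;
	memcpy(ip, entry, iplen);
	ip[iplen] = '\0';
	if (inet_pton(AF_INET, ip, &net) != 1)
		return 0;
	if (slash) {
		prefix = strtol(slash + 1, &end, 10);
		if (end == slash + 1 || *end != '\0' ||
		    prefix < 0 || prefix > 32)
			return 0;
	}
	/* /0 needs an explicit zero mask: a 32-bit shift is undefined. */
	mask = prefix ? (uint32_t)(UINT32_MAX << (32 - prefix)) : 0;
	return (client & mask) == (ntohl(net.s_addr) & mask);
}

/* Walk a newline-separated ACL. Returns 1 if the client is allowed,
 * 0 if no entry matches and -1 on error. */
static int acl_allows(const char *acls, const char *client_ip)
{
	struct in_addr client;
	char *copy, *line, *saveptr;
	int match = 0;

	if (inet_pton(AF_INET, client_ip, &client) != 1)
		return -1;
	copy = strdup(acls);	/* strtok_r() modifies its input */
	if (!copy)
		return -1;
	for (line = strtok_r(copy, "\n", &saveptr); line;
	     line = strtok_r(NULL, "\n", &saveptr)) {
		if (entry_matches(line, ntohl(client.s_addr))) {
			match = 1;
			break;
		}
	}
	free(copy);
	return match;
}
```

An ACL of "127.0.0.0/8\n10.0.0.0/24\n" admits the client 10.0.0.7 through its second entry, and an empty ACL admits nobody. Note that in this sketch the default is deny, so a caller wanting the "allow everyone" behaviour has to store 0.0.0.0/0 explicitly.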
Re: [PATCH V2 1/1] X86: Hyper-V: Get the local APIC timer frequency from the hypervisor
On Fri, Sep 13, 2013 at 01:43:09AM +, KY Srinivasan wrote: -Original Message- From: H. Peter Anvin [mailto:h...@zytor.com] Sent: Thursday, September 12, 2013 5:28 PM To: KY Srinivasan Cc: x...@kernel.org; gre...@linuxfoundation.org; linux-kernel@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com; jasow...@redhat.com; t...@linutronix.de; jbeul...@suse.com; b...@alien8.de Subject: Re: [PATCH V2 1/1] X86: Hyper-V: Get the local APIC timer frequency from the hypervisor On 09/12/2013 05:06 PM, KY Srinivasan wrote: Peter, Let me know if you want me to address any additional issues in this patch. Please address Jan and Gleb's feedback. Gleb's feedback was a question and I answered that as I did Jan's feedback as well. Gleb, Jan, please let me know if there is something else you want addressed here. No, I am just interesting to know some details about the interface since I cannot find it documented in Hyper-V spec. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCHv2 07/11] staging: usbip: Add proper error reporting
This patch adds new error codes and features extended error reporting in op_common packets. Signed-off-by: Dominik Paulus dominik.pau...@fau.de Signed-off-by: Tobias Polzer tobias.pol...@fau.de --- drivers/staging/usbip/userspace/src/usbip_attach.c | 4 +- drivers/staging/usbip/userspace/src/usbip_list.c | 3 +- .../staging/usbip/userspace/src/usbip_network.c| 50 -- .../staging/usbip/userspace/src/usbip_network.h| 17 +++- drivers/staging/usbip/userspace/src/usbipd.c | 29 +++-- 5 files changed, 74 insertions(+), 29 deletions(-) diff --git a/drivers/staging/usbip/userspace/src/usbip_attach.c b/drivers/staging/usbip/userspace/src/usbip_attach.c index 2363e56..2a3f313 100644 --- a/drivers/staging/usbip/userspace/src/usbip_attach.c +++ b/drivers/staging/usbip/userspace/src/usbip_attach.c @@ -147,7 +147,7 @@ static int query_import_device(int sockfd, char *busid) /* receive a reply */ rc = usbip_net_recv_op_common(sockfd, code); if (rc 0) { - err(recv op_common); + err(recv op_common: %s, usbip_net_strerror(rc)); return -1; } @@ -177,7 +177,7 @@ static int attach_device(char *host, char *busid) sockfd = usbip_net_connect(host); if (sockfd 0) { - err(tcp connect); + err(connection attempt failed); return -1; } diff --git a/drivers/staging/usbip/userspace/src/usbip_list.c b/drivers/staging/usbip/userspace/src/usbip_list.c index e4fa5b8..ff7acf8 100644 --- a/drivers/staging/usbip/userspace/src/usbip_list.c +++ b/drivers/staging/usbip/userspace/src/usbip_list.c @@ -64,7 +64,8 @@ static int get_exported_devices(char *host, int sockfd) rc = usbip_net_recv_op_common(sockfd, code); if (rc 0) { - dbg(usbip_net_recv_op_common failed); + err(usbip_net_recv_op_common failed: %s, + usbip_net_strerror(rc)); return -1; } diff --git a/drivers/staging/usbip/userspace/src/usbip_network.c b/drivers/staging/usbip/userspace/src/usbip_network.c index eda641f..61cd8db 100644 --- a/drivers/staging/usbip/userspace/src/usbip_network.c +++ b/drivers/staging/usbip/userspace/src/usbip_network.c @@ 
-178,7 +178,7 @@ int usbip_net_recv_op_common(int sockfd, uint16_t *code) rc = usbip_net_recv(sockfd, op_common, sizeof(op_common)); if (rc 0) { dbg(usbip_net_recv failed: %d, rc); - goto err; + return -ERR_SYSERR; } PACK_OP_COMMON(0, op_common); @@ -186,30 +186,48 @@ int usbip_net_recv_op_common(int sockfd, uint16_t *code) if (op_common.version != USBIP_VERSION) { dbg(version mismatch: %d %d, op_common.version, USBIP_VERSION); - goto err; + return -ERR_MISMATCH; } switch (*code) { case OP_UNSPEC: break; default: - if (op_common.code != *code) { + /* +* Only accept expected opcode. Exception: OP_REPLY +* flag set may be sent as a reply to all requests, +* if only used for status reporting. +*/ + if (op_common.code != *code op_common.code != OP_REPLY) { dbg(unexpected pdu %#0x for %#0x, op_common.code, *code); - goto err; + return -ERR_UNEXPECTED; } } - if (op_common.status != ST_OK) { - dbg(request failed at peer: %d, op_common.status); - goto err; - } - *code = op_common.code; - return 0; -err: - return -1; + return -op_common.status; +} + +const char *usbip_net_strerror(int status) +{ + static const char *const errs[] = { + /* ERR_OK */ Success, + /* ERR_NA */ Command failed, + /* ERR_MISMATCH */ Protocol version mismatch, + /* ERR_SYSERR */ System error, + /* ERR_UNEXPECTED */ Unexpected opcode received, + /* ERR_AUTHREQ */ Server requires authentication, + /* ERR_PERM */ Permission denied, + /* ERR_NOTFOUND */ Requested device not found, + /* ERR_NOAUTH */ Server doesn't support authentication + }; + if (status 0) + status = -status; + if (status = (int) (sizeof(errs) / sizeof(*errs))) + return Invalid; + return errs[status]; } int usbip_net_set_reuseaddr(int sockfd) @@ -360,6 +378,7 @@ int usbip_net_connect(char *hostname) #ifdef HAVE_GNUTLS if (usbip_srp_password) { int rc; + uint16_t code = OP_REP_STARTTLS; rc = usbip_net_send_op_common(sockfd, OP_REQ_STARTTLS, 0); if (rc 0) { @@ -367,6
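The (truncated) patch above switches usbip_net_recv_op_common() to returning a negated status code and adds usbip_net_strerror() to turn that code into text. The following is a minimal standalone sketch of that convention, not the patch's code: the demo_ names are hypothetical, and only the first five table entries are reproduced, in the order the patch lists them.

```c
/* Hypothetical status codes, in the same order as the patch's message table. */
enum demo_status {
	DEMO_ERR_OK,
	DEMO_ERR_NA,
	DEMO_ERR_MISMATCH,
	DEMO_ERR_SYSERR,
	DEMO_ERR_UNEXPECTED
};

/* Map a status (positive or negated) to text; unknown values give "Invalid". */
static const char *demo_strerror(int status)
{
	static const char *const errs[] = {
		"Success",
		"Command failed",
		"Protocol version mismatch",
		"System error",
		"Unexpected opcode received"
	};

	if (status < 0)
		status = -status;
	/* The bounds check must be >=, otherwise errs[] is read one past its end. */
	if (status >= (int)(sizeof(errs) / sizeof(*errs)))
		return "Invalid";
	return errs[status];
}

/* Hypothetical receive step: returns 0 on success or a negated status code. */
static int demo_recv(int peer_version, int my_version)
{
	if (peer_version != my_version)
		return -DEMO_ERR_MISMATCH;
	return -DEMO_ERR_OK;
}
```

A caller can then report failures in one line, for example err("recv op_common: %s", demo_strerror(rc)), which is the pattern the patch applies in usbip_attach.c and usbip_list.c.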
[PATCHv2 01/11] staging: usbip: Fix IPv6 support in usbipd
getaddrinfo() leaves the order of the returned addrinfo structs
unspecified. On systems with bindv6only disabled (this is the default),
PF_INET6 sockets bind to IPv4, too. Thus, IPv6 support in usbipd was
broken when getaddrinfo returned first IPv4 and then IPv6 addrinfos, as
the IPv6 bind failed with EADDRINUSE.

This patch uses separate sockets for IPv4 and IPv6 and sets IPV6_V6ONLY
on all IPv6 sockets. Two command line arguments, -4 and -6, were added
to manually select the socket family.

Signed-off-by: Tobias Polzer tobias.pol...@fau.de
Signed-off-by: Dominik Paulus dominik.pau...@fau.de
---
 .../staging/usbip/userspace/src/usbip_network.c    | 12
 .../staging/usbip/userspace/src/usbip_network.h    |  1 +
 drivers/staging/usbip/userspace/src/usbipd.c       | 69 --
 3 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/usbip/userspace/src/usbip_network.c b/drivers/staging/usbip/userspace/src/usbip_network.c
index c39a07f..e78279c 100644
--- a/drivers/staging/usbip/userspace/src/usbip_network.c
+++ b/drivers/staging/usbip/userspace/src/usbip_network.c
@@ -239,6 +239,18 @@ int usbip_net_set_keepalive(int sockfd)
 	return ret;
 }
 
+int usbip_net_set_v6only(int sockfd)
+{
+	const int val = 1;
+	int ret;
+
+	ret = setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY, &val, sizeof(val));
+	if (ret < 0)
+		dbg("setsockopt: IPV6_V6ONLY");
+
+	return ret;
+}
+
 /*
  * IPv6 Ready
  */
diff --git a/drivers/staging/usbip/userspace/src/usbip_network.h b/drivers/staging/usbip/userspace/src/usbip_network.h
index 2d0e427..f19ae19 100644
--- a/drivers/staging/usbip/userspace/src/usbip_network.h
+++ b/drivers/staging/usbip/userspace/src/usbip_network.h
@@ -180,6 +180,7 @@ int usbip_net_recv_op_common(int sockfd, uint16_t *code);
 int usbip_net_set_reuseaddr(int sockfd);
 int usbip_net_set_nodelay(int sockfd);
 int usbip_net_set_keepalive(int sockfd);
+int usbip_net_set_v6only(int sockfd);
 int usbip_net_tcp_connect(char *hostname, char *port);
 
 #endif /* __USBIP_NETWORK_H */
diff --git 
a/drivers/staging/usbip/userspace/src/usbipd.c b/drivers/staging/usbip/userspace/src/usbipd.c index 1c76cfd..7980f8b 100644 --- a/drivers/staging/usbip/userspace/src/usbipd.c +++ b/drivers/staging/usbip/userspace/src/usbipd.c @@ -56,6 +56,13 @@ static const char usbip_version_string[] = PACKAGE_STRING; static const char usbipd_help_string[] = usage: usbipd [options]\n + \n + -4, --ipv4\n + Bind to IPv4. Default is both.\n + \n + -6, --ipv6\n + Bind to IPv6. Default is both.\n + \n -D, --daemon\n Run as a daemon process.\n \n @@ -354,14 +361,15 @@ static void addrinfo_to_text(struct addrinfo *ai, char buf[], snprintf(buf, buf_size, %s:%s, hbuf, sbuf); } -static int listen_all_addrinfo(struct addrinfo *ai_head, int sockfdlist[]) +static int listen_all_addrinfo(struct addrinfo *ai_head, int sockfdlist[], +int maxsockfd) { struct addrinfo *ai; int ret, nsockfd = 0; const size_t ai_buf_size = NI_MAXHOST + NI_MAXSERV + 2; char ai_buf[ai_buf_size]; - for (ai = ai_head; ai nsockfd MAXSOCKFD; ai = ai-ai_next) { + for (ai = ai_head; ai nsockfd maxsockfd; ai = ai-ai_next) { int sock; addrinfo_to_text(ai, ai_buf, ai_buf_size); dbg(opening %s, ai_buf); @@ -374,6 +382,9 @@ static int listen_all_addrinfo(struct addrinfo *ai_head, int sockfdlist[]) usbip_net_set_reuseaddr(sock); usbip_net_set_nodelay(sock); + /* We use seperate sockets for IPv4 and IPv6 +* (see do_standalone_mode()) */ + usbip_net_set_v6only(sock); if (sock = FD_SETSIZE) { err(FD_SETSIZE: %s: sock=%d, max=%d, @@ -402,11 +413,6 @@ static int listen_all_addrinfo(struct addrinfo *ai_head, int sockfdlist[]) sockfdlist[nsockfd++] = sock; } - if (nsockfd == 0) - return -1; - - dbg(listening on %d address%s, nsockfd, (nsockfd == 1) ? 
: es); - return nsockfd; } @@ -473,11 +479,11 @@ static void remove_pid_file() } } -static int do_standalone_mode(int daemonize) +static int do_standalone_mode(int daemonize, int ipv4, int ipv6) { struct addrinfo *ai_head; int sockfdlist[MAXSOCKFD]; - int nsockfd; + int nsockfd, family; int i, terminate; struct pollfd *fds; struct timespec timeout; @@ -501,21 +507,36 @@ static int do_standalone_mode(int daemonize) set_signal(); write_pid_file(); - ai_head = do_getaddrinfo(NULL, PF_UNSPEC); + info(starting PROGNAME (%s), usbip_version_string); + + /* +* To suppress warnings on systems with bindv6only
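The core of the IPv6 fix above is that each PF_INET6 listening socket gets IPV6_V6ONLY set, so it no longer claims the IPv4 side of the port and a separate IPv4 socket can bind next to it. Below is a small standalone sketch of that step using only the standard sockets API. The function names open_v6only_socket and is_v6only are made up for this illustration and do not appear in the patch.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create an IPv6 TCP socket that will not accept IPv4-mapped connections.
 * Returns the descriptor, or -1 on failure (for example, no IPv6 support).
 */
static int open_v6only_socket(void)
{
	const int on = 1;
	int fd = socket(AF_INET6, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	/* Must be done before bind(): afterwards the address family mix is fixed. */
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Read the option back; returns 1 if set, 0 if clear, -1 on error. */
static int is_v6only(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0)
		return -1;
	return val != 0;
}
```

Note that the patch calls usbip_net_set_v6only() on every socket returned by getaddrinfo(), including IPv4 ones, and only logs a debug message if the call fails. The sketch instead treats a failure as fatal for that socket, which is a stricter choice than the patch makes.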
[PATCHv2 02/11] staging: usbip: Add support for client authentication
This patch adds support for authenticating both client and server using a pre-shared passphrase using SRP (Secure Remote Password) over TLS (see RFC 5054) using GnuTLS. Both usbip and usbipd now accept a shared secret as a command line argument. Currently, the established TLS connection is only used to perform a secure handshake and dropped before the socket is passed to the kernel. The code may be extended to exchange a session key over TLS and pass it to the kernel to perform IPsec. Signed-off-by: Dominik Paulus dominik.pau...@fau.de Signed-off-by: Tobias Polzer tobias.pol...@fau.de --- drivers/staging/usbip/userspace/configure.ac | 14 ++ drivers/staging/usbip/userspace/doc/usbip.8| 6 + drivers/staging/usbip/userspace/doc/usbipd.8 | 6 + drivers/staging/usbip/userspace/src/usbip.c| 30 ++- drivers/staging/usbip/userspace/src/usbip_attach.c | 2 +- drivers/staging/usbip/userspace/src/usbip_list.c | 2 +- .../staging/usbip/userspace/src/usbip_network.c| 82 .../staging/usbip/userspace/src/usbip_network.h| 9 +- drivers/staging/usbip/userspace/src/usbipd.c | 217 ++--- 9 files changed, 335 insertions(+), 33 deletions(-) diff --git a/drivers/staging/usbip/userspace/configure.ac b/drivers/staging/usbip/userspace/configure.ac index 2be4060..7bba496 100644 --- a/drivers/staging/usbip/userspace/configure.ac +++ b/drivers/staging/usbip/userspace/configure.ac @@ -84,6 +84,20 @@ AC_ARG_WITH([tcp-wrappers], AC_DEFINE([HAVE_LIBWRAP], [1], [use tcp wrapper])], [AC_MSG_RESULT([no]); LIBS=$saved_LIBS])]) +# Checks for the GnuTLS library +AC_ARG_WITH([gnutls], + [AS_HELP_STRING([--with-gnutls], + [use the GnuTLS library for authentication])], + dnl [ACTION-IF-GIVEN] + [if test $withval = yes; then +PKG_CHECK_MODULES([GNUTLS], [gnutls]) +AC_DEFINE([HAVE_GNUTLS], [1], [use gnutls]) +CFLAGS=$CFLAGS $GNUTLS_CFLAGS +LDFLAGS=$LDFLAGS $GNUTLS_LIBS +fi + ], + ) + # Sets directory containing usb.ids. 
AC_ARG_WITH([usbids-dir], [AS_HELP_STRING([--with-usbids-dir=DIR], diff --git a/drivers/staging/usbip/userspace/doc/usbip.8 b/drivers/staging/usbip/userspace/doc/usbip.8 index a6097be..b5050ed 100644 --- a/drivers/staging/usbip/userspace/doc/usbip.8 +++ b/drivers/staging/usbip/userspace/doc/usbip.8 @@ -29,6 +29,12 @@ Log to syslog. Connect to PORT on remote host (used for attach and list --remote). .PP +.HP +\fB\-\-auth\fR +.IP +Set the password to be used for client authentication. See usbipd(8) for more information. +.PP + .SH COMMANDS .HP \fBversion\fR diff --git a/drivers/staging/usbip/userspace/doc/usbipd.8 b/drivers/staging/usbip/userspace/doc/usbipd.8 index ac4635d..b2b9eee 100644 --- a/drivers/staging/usbip/userspace/doc/usbipd.8 +++ b/drivers/staging/usbip/userspace/doc/usbipd.8 @@ -54,6 +54,12 @@ If no FILE specified, use /var/run/usbipd.pid Listen on TCP/IP port PORT. .PP +.HP +\fB\-s\fR, \fB\-\-auth\fR +.IP +Sets the password to be used for client authentication. If -a is used, the server will only accept connections from authenticated clients. Note: USB traffic will still be unencrypted, this currently only serves for authentication. +.PP + \fB\-h\fR, \fB\-\-help\fR .IP Print the program help message and exit. 
diff --git a/drivers/staging/usbip/userspace/src/usbip.c b/drivers/staging/usbip/userspace/src/usbip.c index 04a5f20..8a5de83 100644 --- a/drivers/staging/usbip/userspace/src/usbip.c +++ b/drivers/staging/usbip/userspace/src/usbip.c @@ -25,6 +25,12 @@ #include getopt.h #include syslog.h +#include ../config.h + +#ifdef HAVE_GNUTLS +#include gnutls/gnutls.h +#endif + #include usbip_common.h #include usbip_network.h #include usbip.h @@ -35,8 +41,12 @@ static int usbip_version(int argc, char *argv[]); static const char usbip_version_string[] = PACKAGE_STRING; static const char usbip_usage_string[] = - usbip [--debug] [--log] [--tcp-port PORT] [version]\n -[help] command args\n; + usbip +#ifdef HAVE_GNUTLS + [--auth PASSWORD] +#endif + [--debug] [--log] [--tcp-port PORT]\n +[version] [help] command args\n; static void usbip_usage(void) { @@ -142,6 +152,7 @@ int main(int argc, char *argv[]) { debug,no_argument, NULL, 'd' }, { log, no_argument, NULL, 'l' }, { tcp-port, required_argument, NULL, 't' }, + { auth, required_argument, NULL, 's' }, { NULL, 0, NULL, 0 } }; @@ -152,12 +163,25 @@ int main(int argc, char *argv[]) usbip_use_stderr = 1; opterr = 0;
[PATCHv2 10/11] staging: usbip: Separate protocol/program version
Not all new program versions necessarily introduce
non-backwards-compatible protocol changes. We thus move the definition
of the protocol version from configure.ac to usbip_network.h, where it
logically belongs.

Signed-off-by: Dominik Paulus dominik.pau...@fau.de
Signed-off-by: Tobias Polzer tobias.pol...@fau.de
---
 drivers/staging/usbip/userspace/configure.ac        | 1 -
 drivers/staging/usbip/userspace/src/usbip_network.c | 6 +++---
 drivers/staging/usbip/userspace/src/usbip_network.h | 6 ++
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/usbip/userspace/configure.ac b/drivers/staging/usbip/userspace/configure.ac
index 7bba496..099d24b 100644
--- a/drivers/staging/usbip/userspace/configure.ac
+++ b/drivers/staging/usbip/userspace/configure.ac
@@ -2,7 +2,6 @@ dnl Process this file with autoconf to produce a configure script.
 
 AC_PREREQ(2.59)
 AC_INIT([usbip-utils], [1.1.1], [linux-...@vger.kernel.org])
-AC_DEFINE([USBIP_VERSION], [0x0111], [binary-coded decimal version number])
 
 CURRENT=0
 REVISION=1
diff --git a/drivers/staging/usbip/userspace/src/usbip_network.c b/drivers/staging/usbip/userspace/src/usbip_network.c
index 61cd8db..f5955c2 100644
--- a/drivers/staging/usbip/userspace/src/usbip_network.c
+++ b/drivers/staging/usbip/userspace/src/usbip_network.c
@@ -153,7 +153,7 @@ int usbip_net_send_op_common(int sockfd, uint32_t code, uint32_t status)
 
 	memset(&op_common, 0, sizeof(op_common));
 
-	op_common.version = USBIP_VERSION;
+	op_common.version = PROTOCOL_VERSION;
 	op_common.code    = code;
 	op_common.status  = status;
 
@@ -183,9 +183,9 @@ int usbip_net_recv_op_common(int sockfd, uint16_t *code)
 
 	PACK_OP_COMMON(0, &op_common);
 
-	if (op_common.version != USBIP_VERSION) {
+	if (op_common.version != PROTOCOL_VERSION) {
 		dbg("version mismatch: %d %d", op_common.version,
-		    USBIP_VERSION);
+		    PROTOCOL_VERSION);
 		return -ERR_MISMATCH;
 	}
 
diff --git a/drivers/staging/usbip/userspace/src/usbip_network.h b/drivers/staging/usbip/userspace/src/usbip_network.h
index d3c1b71..6a41fd8 100644
--- a/drivers/staging/usbip/userspace/src/usbip_network.h
+++ b/drivers/staging/usbip/userspace/src/usbip_network.h
@@ -14,6 +14,12 @@
 
 #include <stdint.h>
 
+/*
+ * Protocol version. Incremented only on non-backwards-compatible
+ * changes.
+ */
+#define PROTOCOL_VERSION 0x111
+
 extern int usbip_port;
 extern char *usbip_port_string;
 extern char *usbip_srp_password;
-- 
1.8.4
[PATCHv2 11/11] staging: usbip: Increment version to 1.2.0
Signed-off-by: Dominik Paulus dominik.pau...@fau.de
Signed-off-by: Tobias Polzer tobias.pol...@fau.de
---
 drivers/staging/usbip/userspace/configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/usbip/userspace/configure.ac b/drivers/staging/usbip/userspace/configure.ac
index 099d24b..0b0e035 100644
--- a/drivers/staging/usbip/userspace/configure.ac
+++ b/drivers/staging/usbip/userspace/configure.ac
@@ -1,7 +1,7 @@
 dnl Process this file with autoconf to produce a configure script.
 
 AC_PREREQ(2.59)
-AC_INIT([usbip-utils], [1.1.1], [linux-...@vger.kernel.org])
+AC_INIT([usbip-utils], [1.2.0], [linux-...@vger.kernel.org])
 
 CURRENT=0
 REVISION=1
-- 
1.8.4
Re: [PATCH 5/5] ARM64: Add support for ILP32 ABI.
On Fri, Sep 13, 2013 at 10:47:12AM +0100, Will Deacon wrote:
> On Fri, Sep 13, 2013 at 07:18:48AM +0100, Andrew Pinski wrote:
> > On Wed, Sep 11, 2013 at 7:32 AM, Catalin Marinas catalin.mari...@arm.com wrote:
> > > On Mon, Sep 09, 2013 at 10:32:59PM +0100, Andrew Pinski wrote:
> > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > > index cc64df5..7fdc994 100644
> > > > --- a/arch/arm64/Kconfig
> > > > +++ b/arch/arm64/Kconfig
> > > > @@ -248,7 +248,7 @@ source "fs/Kconfig.binfmt"
> > > >
> > > >  config COMPAT
> > > >  	def_bool y
> > > > -	depends on ARM64_AARCH32
> > > > +	depends on ARM64_AARCH32 || ARM64_ILP32
> > > >  	select COMPAT_BINFMT_ELF
> > > >
> > > >  config ARM64_AARCH32
>
> (nitpick) We used to have an option like this, called
> CONFIG_AARCH32_EMULATION, which I think is clearer than
> CONFIG_ARM64_AARCH32.

I think avoiding EMULATION is better, we don't actually emulate the
instruction set ;).

-- 
Catalin
Re: [PATCH v2 1/9] i2c: prepare runtime PM support for I2C client devices
On Fri, Sep 13, 2013 at 09:54:34AM +0300, Mika Westerberg wrote:
> On Thu, Sep 12, 2013 at 02:34:21PM -0700, Kevin Hilman wrote:
> > For hardware that is disabled/powered-off on startup, there will now
> > be a mismatch between the hardware state and the RPM core state.
>
> The call to pm_runtime_get_noresume() should make sure that the device is
> in active state (at least in state where it can access the bus) if I'm
> understanding this right.

Accessing the bus isn't an issue for I2C outside of ACPI, the power
management of the device is totally disassociated from the bus and the
controller is responsible for ensuring it is available during transfers.
Re: [PATCH v2 1/9] i2c: prepare runtime PM support for I2C client devices
On Fri, Sep 13, 2013 at 09:14:20AM +0800, Aaron Lu wrote:
> On 09/13/2013 06:06 AM, Sylwester Nawrocki wrote:
> > So there is currently no way to avoid this behaviour, i.e. to have the
> > adapter not activated before any of its client devices is probed, but
> > only later on, after explicit call to pm_runtime_get*(&client->dev) in
> > the client driver ?
>
> The above pm_runtime_get_sync is used to make sure when the client I2C
> device is going to be probed, its host adapter device is turned on (or we
> will fail the probe). It doesn't affect the adapter's status before the
> probe of I2C client device.

The expectation is that if the adaptor needs to do anything to transfer
it'll do that when asked to transfer - that way it can sit in a low power
state when the bus is idle.
Re: [PATCH 5/5] ARM64: Add support for ILP32 ABI.
On Fri, Sep 13, 2013 at 10:57:40AM +0100, Catalin Marinas wrote:
> On Fri, Sep 13, 2013 at 10:47:12AM +0100, Will Deacon wrote:
> > On Fri, Sep 13, 2013 at 07:18:48AM +0100, Andrew Pinski wrote:
> > > On Wed, Sep 11, 2013 at 7:32 AM, Catalin Marinas catalin.mari...@arm.com wrote:
> > > > On Mon, Sep 09, 2013 at 10:32:59PM +0100, Andrew Pinski wrote:
> > > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > > > index cc64df5..7fdc994 100644
> > > > > --- a/arch/arm64/Kconfig
> > > > > +++ b/arch/arm64/Kconfig
> > > > > @@ -248,7 +248,7 @@ source "fs/Kconfig.binfmt"
> > > > >
> > > > >  config COMPAT
> > > > >  	def_bool y
> > > > > -	depends on ARM64_AARCH32
> > > > > +	depends on ARM64_AARCH32 || ARM64_ILP32
> > > > >  	select COMPAT_BINFMT_ELF
> > > > >
> > > > >  config ARM64_AARCH32
> >
> > (nitpick) We used to have an option like this, called
> > CONFIG_AARCH32_EMULATION, which I think is clearer than
> > CONFIG_ARM64_AARCH32.
>
> I think avoiding EMULATION is better, we don't actually emulate the
> instruction set ;).

Bah, you suggest something better then!

Will
[f2fs-dev][PATCH] f2fs: limit nr_iovecs in bio_alloc
This patch adds the macro MAX_BIO_BLOCKS to limit the value of npages
passed to f2fs_bio_alloc. This avoids bio_alloc returning NULL when
npages is larger than UIO_MAXIOV.

Signed-off-by: Yu Chao chao2...@samsung.com
---
 fs/f2fs/segment.c |    4 +++-
 fs/f2fs/segment.h |    3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 09af9c7..bd79bbe 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -657,6 +657,7 @@ static void submit_write_page(struct f2fs_sb_info *sbi, struct page *page,
 				block_t blk_addr, enum page_type type)
 {
 	struct block_device *bdev = sbi->sb->s_bdev;
+	int bio_blocks;
 
 	verify_block_addr(sbi, blk_addr);
 
@@ -676,7 +677,8 @@ retry:
 			goto retry;
 		}
 
-		sbi->bio[type] = f2fs_bio_alloc(bdev, max_hw_blocks(sbi));
+		bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
+		sbi->bio[type] = f2fs_bio_alloc(bdev, bio_blocks);
 		sbi->bio[type]->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
 		sbi->bio[type]->bi_private = priv;
 		/*
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index bdd10ea..9cc95eb 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -9,6 +9,7 @@
  * published by the Free Software Foundation.
  */
 #include <linux/blkdev.h>
+#include <linux/uio.h>
 
 /* constant macro */
 #define NULL_SEGNO			((unsigned int)(~0))
@@ -90,6 +91,8 @@
 	(blk_addr << ((sbi)->log_blocksize - F2FS_LOG_SECTOR_SIZE))
 #define SECTOR_TO_BLOCK(sbi, sectors)					\
 	(sectors >> ((sbi)->log_blocksize - F2FS_LOG_SECTOR_SIZE))
+#define MAX_BIO_BLOCKS(max_hw_blocks)					\
+	(min((int)max_hw_blocks, UIO_MAXIOV))
 
 /* during checkpoint, bio_private is used to synchronize the last bio */
 struct bio_private {
---
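To make the intent of the f2fs patch above concrete: the requested page count is clamped to an upper bound before it reaches the allocator, so the allocator is never asked for more vectors than it can provide. The userspace sketch below shows only that clamping step. DEMO_MAX_IOVECS and demo_max_bio_blocks are hypothetical stand-ins; the value 1024 is assumed here to match UIO_MAXIOV on Linux.

```c
/* Assumed cap; in the patch this role is played by UIO_MAXIOV. */
#define DEMO_MAX_IOVECS 1024u

/*
 * Clamp a requested block count to the cap. The comparison is done on
 * unsigned values, so a very large request cannot wrap to a negative int.
 */
static int demo_max_bio_blocks(unsigned int max_hw_blocks)
{
	if (max_hw_blocks > DEMO_MAX_IOVECS)
		return (int)DEMO_MAX_IOVECS;
	return (int)max_hw_blocks;
}
```

The patch itself casts to int first and then applies the kernel's min() macro. The sketch compares before casting, which is a deliberate difference and not something the patch does.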
Re: [PATCH v2 1/9] i2c: prepare runtime PM support for I2C client devices
On Fri, Sep 13, 2013 at 10:59:50AM +0100, Mark Brown wrote:
> On Fri, Sep 13, 2013 at 09:54:34AM +0300, Mika Westerberg wrote:
> > On Thu, Sep 12, 2013 at 02:34:21PM -0700, Kevin Hilman wrote:
> > > For hardware that is disabled/powered-off on startup, there will now
> > > be a mismatch between the hardware state and the RPM core state.
> >
> > The call to pm_runtime_get_noresume() should make sure that the device is
> > in active state (at least in state where it can access the bus) if I'm
> > understanding this right.
>
> Accessing the bus isn't an issue for I2C outside of ACPI, the power
> management of the device is totally disassociated from the bus and the
> controller is responsible for ensuring it is available during transfers.

Yes, but since we want to support ACPI as well, we must make sure that the
adapter (and the associated controller) is available when client ->probe()
is called.
RE: [PATCH v5 3/3] dma: Add Freescale eDMA engine driver support
Hi, Vinod,

Could you please help review this patch? Thanks!

Best Regards,
Jingchang

-----Original Message-----
From: Lu Jingchang-B35083
Sent: Thursday, September 05, 2013 5:55 PM
To: vinod.k...@intel.com
Cc: d...@fb.com; shawn@linaro.org; linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org; Lu Jingchang-B35083; Wang Huan-B18965
Subject: [PATCH v5 3/3] dma: Add Freescale eDMA engine driver support

Add Freescale enhanced direct memory (eDMA) controller support.
The eDMA controller deploys DMAMUXs routing DMA request sources (slot)
to eDMA channels.
This module can be found on Vybrid and LS-1 SoCs.

Signed-off-by: Alison Wang b18...@freescale.com
Signed-off-by: Jingchang Lu b35...@freescale.com
---
changes in v5:
  config slave_id when dmaengine_slave_config instead of channel request.
  adding residue calculation and dma pause/resume device control.

changes in v4:
  using exact compatible string in binding document.

changes in v3:
  handle all pending interrupt one time.
  add protect lock on dma transfer complete handling.
  change desc and tcd alloc flag to GFP_NOWAIT.
  add sanity check and error messages.

changes in v2:
  using generic dma-channels property instead of fsl,dma-channel.
  rename the binding document to fsl-edma.txt.

 Documentation/devicetree/bindings/dma/fsl-edma.txt |  84 ++
 drivers/dma/Kconfig                                |  10 +
 drivers/dma/Makefile                               |   1 +
 drivers/dma/fsl-edma.c                             | 925 +
 4 files changed, 1020 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/fsl-edma.txt
 create mode 100644 drivers/dma/fsl-edma.c
[PATCH] Time: Clocksource: fix 'ret' data type of sysfs_override_clocksource() and sysfs_unbind_clocksource()
From: Elad Wexler elad.wex...@gmail.com

sysfs_override_clocksource(): The expression 'if (ret >= 0)' is always
true, because 'ret' is declared as the unsigned type size_t. This causes
clocksource_select() to always run. Thus change ret to be of type
ssize_t.

sysfs_unbind_clocksource(): The expression 'if (ret < 0)' is always
false. So if sysfs_get_uname() fails, the check has no effect. Thus
change ret to be of type ssize_t.

Signed-off-by: Elad Wexler elad.wex...@gmail.com
---
 kernel/time/clocksource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 50a8736..6d28a52 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -924,7 +924,7 @@ static ssize_t sysfs_override_clocksource(struct device *dev,
 					  struct device_attribute *attr,
 					  const char *buf, size_t count)
 {
-	size_t ret;
+	ssize_t ret;
 
 	mutex_lock(&clocksource_mutex);
 
@@ -952,7 +952,7 @@ static ssize_t sysfs_unbind_clocksource(struct device *dev,
 {
 	struct clocksource *cs;
 	char name[CS_NAME_LEN];
-	size_t ret;
+	ssize_t ret;
 
 	ret = sysfs_get_uname(buf, name, count);
 	if (ret < 0)
-- 
1.8.3.1
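The clocksource fix above comes down to a C type rule: a negative error value stored in an unsigned variable becomes a large positive number, so a later "less than zero" check can never fire. The following userspace sketch reproduces the effect. demo_get_name, failed_unsigned and failed_signed are hypothetical helpers written for this illustration, and -22 merely stands in for -EINVAL.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper that reports failure with a negative value. */
static ssize_t demo_get_name(int fail)
{
	return fail ? -22 : 5;	/* -22 stands in for -EINVAL */
}

/* Buggy variant: the result is stored in an unsigned size_t. */
static int failed_unsigned(int fail)
{
	size_t ret = demo_get_name(fail);

	return ret < 0;		/* always 0: an unsigned value is never negative */
}

/* Fixed variant: ssize_t keeps the sign, so the error check works. */
static int failed_signed(int fail)
{
	ssize_t ret = demo_get_name(fail);

	return ret < 0;
}
```

Compilers can flag the buggy comparison; with GCC the relevant warning is -Wtype-limits, which -Wextra enables.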
Re: [PATCH v2 1/9] i2c: prepare runtime PM support for I2C client devices
On Fri, Sep 13, 2013 at 01:16:11PM +0300, Mika Westerberg wrote:
> On Fri, Sep 13, 2013 at 10:59:50AM +0100, Mark Brown wrote:
> > Accessing the bus isn't an issue for I2C outside of ACPI, the power
> > management of the device is totally disassociated from the bus and the
> > controller is responsible for ensuring it is available during transfers.
>
> Yes, but since we want to support ACPI as well, we must make sure that the
> adapter (and the associated controller) is available when client ->probe()
> is called.

Right, but this probably needs to be highlighted more since it's a very
surprising thing for I2C and is causing confusion.
Re: [PATCH 1/2] tick: broadcast: Deny per-cpu clockevents from being broadcast sources
Hi Soren,

On 09/13/2013 03:50 PM, Preeti Murthy wrote:
> Hi,
>
> So the patch that Daniel points out, http://lwn.net/Articles/566270/ ,
> enables broadcast functionality without using an external global clock
> device. It uses one of the per cpu clock devices to enable the broadcast
> functionality.
>
> The way it achieves this is by creating a pseudo clock device and
> associating it with one of the cpus' clock devices and by having a
> hrtimer queued on the same cpu. This pseudo clock device acts as the
> broadcast device, and the per cpu clock device that it is associated
> with acts as the broadcast source.
>
> The disadvantages that Soren mentions in having a per cpu clock device
> as the broadcast source can be overcome by following the approach
> proposed in this patch in the way described below:
>
> 1. What if the cpu whose clock device is the broadcast source goes
>    offline?
>
>    The solution that the above patch proposes is to associate the pseudo
>    clock device with another cpu and move the hrtimer, whose function is
>    explained in the next point, to another cpu. The broadcast
>    functionality continues to remain active transparently.
>
> 2. The cpu that requires broadcast functionality is different from the
>    cpu whose clock device is the broadcast source. So how will the former
>    cpu program/control the clock device of the latter cpu?
>
>    The above patch queues a hrtimer on the cpu whose clock device is the
>    broadcast source, which expires at
>    max(tick_broadcast_period, dev->next_event), where
>    tick_broadcast_period is what we define and dev is the pseudo device
>    whose next event is set by the broadcast framework.
>
>    On expiry of this hrtimer, do broadcast handling and reprogram the
>    hrtimer with the same value as above,
>    max(tick_broadcast_period, dev->next_event).
>
>    This ensures that a cpu that requires the broadcast function to be
>    activated need not program the broadcast source, which also happens to
>    be a per cpu clock device. The hrtimer queued on the cpu whose clock
>    device is the broadcast source takes care of when to do broadcast
>    handling. tick_broadcast_period ensures that we do not miss wakeups.
>    This is introduced to overcome the constraint of a cpu not being able
>    to program the clock device of another cpu.
>
> Soren, do let me know if the above approach described in the patch has
> not addressed any of the challenges that you see with having a per cpu
> clock device as the broadcast source.
>
> Regards
> Preeti U Murthy
>
> On Fri, Sep 13, 2013 at 1:55 PM, Daniel Lezcano daniel.lezc...@linaro.org wrote:
> > On 09/12/2013 10:30 PM, Thomas Gleixner wrote:
> > > On Thu, 12 Sep 2013, Soren Brinkmann wrote:
> > > > From: Stephen Boyd sb...@codeaurora.org
> > > >
> > > > On most ARM systems the per-cpu clockevents are truly per-cpu in
> > > > the sense that they can't be controlled on any other CPU besides
> > > > the CPU that they interrupt. If one of these clockevents were to
> > > > become a broadcast source we will run into a lot of trouble
> > > > because the broadcast source is enabled on the first CPU to go
> > > > into deep idle (if that CPU suffers from FEAT_C3_STOP) and that
> > > > could be a different CPU than what the clockevent is interrupting
> > > > (or even worse the CPU that the clockevent interrupts could be
> > > > offline).
> > > >
> > > > Theoretically it's possible to support per-cpu clockevents as the
> > > > broadcast source but so far we haven't needed this and supporting
> > > > it is rather complicated. Let's just deny the possibility for now
> > > > until this becomes a reality (let's hope it never does!).
> > >
> > > Well, we can't do it this way. There are globally accessible clock
> > > event devices which deliver only to cpu0. So the mask check might be
> > > causing failure here.
> > >
> > > Just add a feature flag CLOCK_EVT_FEAT_PERCPU to the clock event
> > > device and check for it.
> >
> > It sounds probably more understandable than dealing with the cpumasks.
> >
> > I am wondering if this is semantically opposed to
> > http://lwn.net/Articles/566270/ ?
> >
> > [PATCH V3 0/6] cpuidle/ppc: Enable broadcast support for deep idle states
> >
> > -- Daniel

So the point I am trying to make is that the fix that you have proposed on
this thread is valid. It is difficult to ensure that a per cpu clock device
doubles up as the broadcast source without significant code changes to the
current broadcast code and the timer code.

But the patch [PATCH V3 0/6] cpuidle/ppc: Enable broadcast support for deep
idle states, attempts to overcome the disadvantage on certain architectures
of not having an external clock device to perform broadcast *without*
significant code changes in broadcast or timer.

This patch does not conflict with what you are proposing in this thread of
having a feature flag CLOCK_EVT_FEAT_PERCPU, since the pseudo clock device
that the patch introduces will not have this flag set anyway.

So ideally architectures, without having a planned infrastructure in them
cannot nominate their per cpu clock device as the broadcast source. And if
they do have some infrastructure to support a per cpu clock device as
broadcast source, they should ensure that the device
Re: trinity finds ftrace/perf bug. Film at 11.
On Thu, Sep 12, 2013 at 02:19:13PM -0400, Steven Rostedt wrote:

The good news is I can reproduce that very quickly. (Apply http://paste.fedoraproject.org/38721/37890755 on top of trinity.git, and run ./trinity -l off -q -C32 -c inotify_init1 -c perf_event_open -c newlstat)

The paste you are looking for does not exist

And that line on its own doesn't seem to do much.. it's been running for minutes now. Or does 'quickly' mean something else?
Re: [PATCH] vt: properly ignore xterm-256 colour codes
Hi On Thu, Sep 12, 2013 at 3:24 PM, Adam Borowski kilob...@angband.pl wrote: On Thu, Sep 12, 2013 at 02:37:26PM +0200, David Herrmann wrote: On Mon, Sep 9, 2013 at 6:46 PM, Adam Borowski kilob...@angband.pl wrote: On Mon, Sep 09, 2013 at 05:53:19PM +0200, David Herrmann wrote: [...] Btw., you should put Greg Kroah-Hartman and Andrew Morton on CC. Both are the most likely to pick this up. Thanks for the suggestion. I've sent the patch two days ago to Jiri Slaby (listed as a maintainer besides Greg) together with a newbie question, but he's apparently busy. Jiri Slaby maintains the TTY subsystem (together with Greg). This does not include the VT layer, though. drivers/tty/vt/ and drivers/video/console/ are unmaintained. You need to get the attention of any maintainer who is willing to take it through their tree (hint: most maintainers don't dare touching the VT layer. Greg and Andrew were brave enough in the past.). I'm willing to review your patches Thanks. So I shouldn't bother anyone else for now to get My First Kernel Patch(tm) into 3.12 (or, if it's too late, into something included in 3.13), right? Too late for 3.12. We want such stuff in linux-next before it is merged. So yeah, 3.13 should be your aim. I've got more changes for the vt, but there's no hurry, I wanted to test the waters with a single minor one in 3.12 first. drivers/tty/vt/ and drivers/video/console/ are unmaintained. [...] but history taught me touching the VT layer is a waste of time. Could you tell me why? Because few people care (did anyone but me respond to this?). Furthermore, most people just want it to not break, they often don't care for improvements. The console is an important tool when something fails. That's true for any recovery tool. Of course, fancy schmancy stuff like combining characters, etc, could be better done in that legendary userspace alternate console layer, but the built-in VT must remain at least functional. 
And getting corrupted text is not nice; the VT has fallen woefully behind what works on any other modern terminal. Back by ~2000, I'd say it worked better than rxvt, xterm, or, Cthulhu help us, Solaris' terminal. It just needs some maintenance.

Yes, the linux-console used to be something people were proud of. But there ought to be a reason why it has fallen behind. You might wanna figure that out before spending time fixing the symptom.

I had a bunch of other improvements planned; if you say that's a waste of time perhaps I should scale that back. But I'd still want to at least make sure programs that don't use terminfo (terminfo is a bad joke) won't spew ANSI codes to the screen; at least more popular ones like set window title, etc.

You might know of other clean-ups and fixes that need to be done here, though. I'm not saying that I dislike the effort. Please go ahead. But you might want to have a look at git log drivers/tty/vt/vt.c. There haven't been any serious VT changes since 2010. Neither for drivers/char/vt.c which it was back then. I think this effort is better spent on user-space consoles, but I might be biased.

Regards
David
Re: [PATCH 2/4] scripts/config: use sed's POSIX interface
On 13.9.2013 11:54, Clément Chauplannaz wrote:

On Sep 13, 2013, at 11:32 AM, Linus Walleij linus.wall...@linaro.org wrote:

On Fri, Sep 13, 2013 at 10:38 AM, Clément Chauplannaz chaup...@gmail.com wrote:

Thank you for this report. I was able to reproduce this bug and fix it.

Thanks! Tested and works fine.

Glad to read the patch solves your issue. Thanks for the quick feedback!

My previous commit changed the separator between sed's substitute command and its parameters, from ':' to '/'. The latter conflicted with the slashes found in the value of variable CMDLINE, as provided in your email.

Hm it could actually be useful to be able to have colons in a CMDLINE, I wonder if we can think about some better separator ... oh well that is another issue, all old scripts work now anyway.

Indeed config script may not work with all possible string values. My first concern for now was to fall back to the previous interface. We may look into hardening the script later on.

Right. I will merge the patch because it reverts a regression. But feel free to submit another patch that escapes the colons in $after. The script already uses #!/bin/bash, so a ${after//:/\:} should work.

Michal
[PATCH] shmem: fixup memory reservation during truncating
Shared anon mappings created without MAP_NORESERVE may hold a reservation of memory commitment for the whole size of the shmem segment. There was no way to change that size, but the recently introduced 'map_files' in proc makes that possible. This patch adjusts the memory reservation in shmem_setattr() during truncation.

Signed-off-by: Konstantin Khlebnikov khlebni...@openvz.org
---
 mm/shmem.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index ff08920..a15c8dd 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -148,6 +148,19 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size)
 		vm_unacct_memory(VM_ACCT(size));
 }
 
+static inline int shmem_reacct_size(unsigned long flags,
+		loff_t oldsize, loff_t newsize)
+{
+	if (!(flags & VM_NORESERVE)) {
+		if (VM_ACCT(newsize) > VM_ACCT(oldsize))
+			return security_vm_enough_memory_mm(current->mm,
+					VM_ACCT(newsize) - VM_ACCT(oldsize));
+		else if (VM_ACCT(newsize) < VM_ACCT(oldsize))
+			vm_unacct_memory(VM_ACCT(oldsize) - VM_ACCT(newsize));
+	}
+	return 0;
+}
+
 /*
  * ... whereas tmpfs objects are accounted incrementally as
  * pages are allocated, in order to allow huge sparse files.
@@ -607,6 +620,10 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
 		loff_t newsize = attr->ia_size;
 
 		if (newsize != oldsize) {
+			error = shmem_reacct_size(SHMEM_I(inode)->flags,
+					oldsize, newsize);
+			if (error)
+				return error;
 			i_size_write(inode, newsize);
 			inode->i_ctime = inode->i_mtime = CURRENT_TIME;
 		}
[PATCH] mm: catch memory commitment underflow
This adds debug for vm_committed_as under CONFIG_DEBUG_VM=y

Signed-off-by: Konstantin Khlebnikov khlebni...@openvz.org
---
 mm/mmap.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 9d54851..2c7e6aa 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -131,6 +131,12 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
 
 	vm_acct_memory(pages);
 
+#ifdef CONFIG_DEBUG_VM
+	WARN_ONCE(percpu_counter_read(&vm_committed_as) <
+			-(s64)vm_committed_as_batch * num_online_cpus(),
+			"memory commitment underflow");
+#endif
+
 	/*
 	 * Sometimes we want to use more memory than we have
 	 */
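The threshold in that check is not zero, and a small sketch may help explain why. As far as I understand the per-cpu counter design, percpu_counter_read() returns only the global part of the counter, while each CPU may hold an unflushed local delta of less than one batch in magnitude. The approximate value can therefore legitimately dip below zero by up to batch * ncpus. The function below is a hypothetical user-space restatement of the comparison, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the approximate global count is so negative that
 * unflushed per-CPU deltas (each assumed to stay below `batch` in
 * magnitude) cannot explain it, i.e. the counter really underflowed.
 */
bool sketch_commit_underflow(int64_t approx_count, int32_t batch, int ncpus)
{
	assert(batch >= 0 && ncpus > 0);
	return approx_count < -(int64_t)batch * ncpus;
}
```

With a batch of 32 and 4 CPUs, for example, an approximate reading of -128 is still tolerated and only -129 or lower would trigger the warning.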
[ANN] Ubuntu PPA for bcache-tools and blocks
Hello,

I've published various bcache-related packages to https://launchpad.net/~g2p/+archive/storage/

The packages are built for saucy and raring, and can be added with

    sudo add-apt-repository ppa:g2p/storage
    sudo apt-get update
    sudo apt-get install bcache-tools blocks

bcache is a hybrid caching layer that speeds up HDDs with SSDs; it was merged in Linux 3.10. Backported Ubuntu kernels are available at http://kernel.ubuntu.com/~kernel-ppa/mainline/

bcache-tools contains udev and initramfs integration and some command-line tools.

blocks is a conversion tool that can convert block devices to bcache and lvm. Usage is described here: https://github.com/g2p/blocks#bcache-conversion

This release of blocks adds a maintboot mode that can convert root filesystems to bcache without a live cd.
Re: [PATCH RESEND][pciutils] libpci: pci_id_lookup - add udev/hwdb support
On Wed, Sep 4, 2013 at 4:59 PM, Tom Gundersen t...@jklm.no wrote: On Wed, Sep 4, 2013 at 3:57 PM, Martin Mares m...@ucw.cz wrote: Hello! First of all: Sorry for not replying to the first mail. I do not follow linux-pci too much these days (or, I do that in big batches). No problem, I guessed as much. This lets you select hwdb support at compile time. hwdb is an efficient hardware database shipped with recent versions of udev. It contains among other sources pci.ids so querying hwdb rather than reading pci.ids directly should give the same result. Ideally Linux distros using udev could stop shipping pci.ids, but use hwdb as the only source of this information, which this patch allows. Generally, I will be glad to include hwdb support in libpci. Great. + if [ -f /usr/include/libudev.h -o -f /usr/local/include/libudev.h ] ; then + HWDB=yes + else + HWDB=no + fi Does this make sense? Does every version of libudev support hwdb? Good point. I'll replace it with a pkg-config call, is that acceptable? @@ -86,8 +91,58 @@ char *pci_id_lookup(struct pci_access *a, int flags, int cat, int id1, int id2, int id3, int id4) { struct id_entry *n, *best; - u32 id12 = id_pair(id1, id2); - u32 id34 = id_pair(id3, id4); + u32 id12, id34; + +#ifdef PCI_HAVE_HWDB + if (!(flags PCI_LOOKUP_SKIP_LOCAL)) +{ As you wrote it, hwdb has always priority over pci.ids (unless local lookup is disabled). As a user, I would expect that pci.ids (being a part of the pciutils) is the primary source of data and other sources (network lookups, hwdb) are used only if pci.ids do not match or if explicitly requested. Hm, this was actually intentional. The reason being that I'd like to avoid reading in the pci.ids db in the common case, as using the hwdb should be much more efficient (it is most likely already in memory and lookup is constant time), and also we (at the distro level) want to move away from the {usb,pci}.ids and rather default to hwdb everywhere. 
My original intention was to make hwdb a replacement for pci.ids, but I ended up going the less invasive route, would making it a replacement be more acceptable? If not, I'll just swap around the priority, not a problem.

Hi Martin,

Any comments on the above before I resubmit?

Cheers,
Tom
Re: trinity finds ftrace/perf bug. Film at 11.
On Thu, Sep 12, 2013 at 02:19:13PM -0400, Steven Rostedt wrote: WARNING: CPU: 3 PID: 861 at kernel/events/core.c:5566 perf_swevent_add+0x18d/0x1a0() Modules linked in: ipt_ULOG nfnetlink can_bcm can scsi_transport_iscsi ax25 nfc rfkill af_802154 irda crc_ccitt rds x25 atm appletalk ipx p8023 psnap p8022 llc snd_hda_codec_realtek snd_hda_codec_hdmi xfs snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc libcrc32c snd_timer snd e1000e pcspkr ptp pps_core soundcore usb_debug CPU: 3 PID: 861 Comm: trinity-child31 Not tainted 3.11.0+ #67 81a2aa43 8801e6c65ae8 8171d5cb 8801e6c65b20 81053e5d 8801e66a2e68 880245dcf3e0 0004 0001 04392ac6 8801e6c65b30 Call Trace: [8171d5cb] dump_stack+0x54/0x74 [81053e5d] warn_slowpath_common+0x7d/0xa0 [81053f3a] warn_slowpath_null+0x1a/0x20 [8114302d] perf_swevent_add+0x18d/0x1a0 [81143ba7] event_sched_in.isra.78+0x87/0x1c0 [81144a9a] group_sched_in+0x6a/0x1c0 [8114580c] ctx_sched_in+0x17c/0x290 [8114595a] perf_event_sched_in+0x3a/0x90 [8114940b] perf_event_context_sched_in+0x7b/0xc0 [81149f67] __perf_event_task_sched_in+0x477/0x490 So I've got an idea how this can happen. If we have a per-cpu swevent and group it with an uncore counter which lives on another cpu we'll migrate the swevent using perf_pmu_migrate_context() but it doesn't migrate the swhash. The below should be able to confirm that theory if one can reproduce the issue. 
---
 kernel/events/core.c | 8
 1 file changed, 8 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2207efc..e1441f5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5621,11 +5621,6 @@ static void swevent_hlist_put(struct perf_event *event)
 {
 	int cpu;
 
-	if (event->cpu != -1) {
-		swevent_hlist_put_cpu(event, event->cpu);
-		return;
-	}
-
 	for_each_possible_cpu(cpu)
 		swevent_hlist_put_cpu(event, cpu);
 }
@@ -5659,9 +5654,6 @@ static int swevent_hlist_get(struct perf_event *event)
 	int err;
 	int cpu, failed_cpu;
 
-	if (event->cpu != -1)
-		return swevent_hlist_get_cpu(event, event->cpu);
-
 	get_online_cpus();
 	for_each_possible_cpu(cpu) {
 		err = swevent_hlist_get_cpu(event, cpu);
Re: [PATCH 0/4] Add smp support for Allwinner A20 and phy arch count timer
On Thu, Sep 12, 2013 at 04:46:42PM +0100, cinifr wrote: You seem to be suggesting a kernel change (using CNTPCT), but also bootloader changes (setting CNTHCTL.PL1PCTEN) to make this possible at all. If the bootloader needs to be modified, why can it not be modified to set CNTVOFF (or to boot the kernel in Hyp where it can set it itself)? I think kernel should can support both CNTVCT and CNTPCT. Yes, if bootloader have set CNTVOFF to zero, then CNTVCT is OK, kvm guest using CNTVCT can run more efficient then that using CNTPCT. but if bootloader dont set it, how about kernel booting? I think kernel should try it's best to boot and run ok even bootload dont set any generic timer register including CNTVOFF and CNTHCTL. So i gave a compile options using CNTPCT. That is only options, If CNTVCT can not working, you have others choice. Of cause, It is best that kerne can select which timer count is used in running time, CNTVOFF doesn't need to be zero -- when a guest runs under a hypervisor, CNTVOFF may change across a suspend/resume of a VM (to give the guest the illusion that time wasn't ticking when it wasn't running). All that's required is that all the CPUs have the same CNTVOFF value, and this has been valid for all platforms so far. Does CNTVOFF vary between your CPUs, or are they a consistent value (event if it's not zero)? For ARMv8 systems, CNTVOFF and CNTHCTL reset to an UNDEFINED value, so we cannot rely on the physical timers and counters being available -- the firmware and/or hypervisor must set at least one of them for an OS to be able to use the system. The virtual timers and counters are *always* available to PL1/EL1, so our best bet is to use them. I'd prefer not to have to have a run-time solution to a problem that can be avoided entirely with a simple modification to the bootloader now. I'm not sure what you mean by selecting which timer to use be reading the current running mode. 
We currently decide to use CNTVCT if booted in PL1, or CNTPCT if booted in Hyp. I assume this isn't the mode you're referring to? Yep, kernel can run PL1 NS=1, PL1 NS=0, PL2. If kernel can know current running Mode.then kernel can chose which timer is OK in running time. 1: If kernel is runing at PL2 and PL1 NS=0, then CNTPCT is OK in any case even CNTVOFF is not zero and CNTHCTL.PL1PCTEN is zero. 2: if kernel is running at PL1 NS=1,then kernel maybe should select CNTVCT. But it has risk to using CNTVCT when CNTVOFF is not zero. How to deal with the case CNTHCTL.PL1PCTEN is zero and CNTVOFF is not zero? current kernel cant using any arch timer incluing CNTVCT and CNTPCT. with this case, I think kernel should use CNTVCT by other ways: Kernel runing CPUn read CNTVCT and save it to local variable for example InitVctVALUEn in initialization, then kernel running CPUn read timer later return a value as ReadTIMERn=CNTVCTn-InitVctVALUEn, This way can run in any generic timer registe set and in any kernel runing mode. I try to write this patch for new way. But the new way should need more time than old in read timer funcation because it need more calculate. I don't think that would work. You have no way of ensuring that all CPUs read CNTVCT at the same time, so they may record offsets that give them different views of time. Consider the case that CNTVOFF was zero on all CPUs. CPU0 and CPU1 might read CNTVCT at different instants, and CPU1 could record its offset as 100 while CPU1 could record its offset as 2000. That would leave CPU1 thinking it's further ahead in time than CPU0, which could break all sorts of things. AFAICS there's no way of telling this apart from each CPU booting with a different CNTVOFF. As SMP support for this platform is not yet in mainline, and the bootloader can be fixed to set CNTVOFF (as KVM and Xen do for guests), we should get the bootloader to set CNTVOFF to a consistent value across all CPUs. Thanks, Mark. . Thanks for your question. 
On 12 September 2013 22:39, Mark Rutland mark.rutl...@arm.com wrote: On Thu, Sep 12, 2013 at 07:51:23AM +0100, Fan Rong wrote: The patchs add smp support for Allwinner A20. It add cpuregister node in dts forsmp configure. The patchs also add a options for phy count timer to replace vir count timer as ARM arch timer clocksource. About ARM arch timer: 1. Current kernel use vir count timer, vir count timer can be accessed in any cpu mode for kernel, but it need bootloader set vir count offset rigister zero at first. 2. Phy count timer can be accessed in most cpu mode for kernel except NS-PL1 mode when register CNTHCTL.PL1PCTEN is set to zero. To ensure to use phy count timer, bootloader should set register CNTHCTL.PL1PCTEN is 1 at first. At all, to ensure kernel can use arch timer, bootload should set some generic timer register(cntvoff or cnthctl) at first. the kernel should select which count timer by reading current kernel running mode. Sorry, but I find the above text
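The objection to per-CPU baselines in this thread can be made concrete with a little arithmetic. The sketch below is a user-space illustration with hypothetical names, not kernel code: it assumes a single shared counter (CNTVOFF identical everywhere) and two CPUs that each record a baseline at a different instant, using the values 100 and 2000 from the example in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* A CPU's private view: shared counter minus the value it saw at init. */
uint64_t sketch_local_time(uint64_t counter_now, uint64_t baseline)
{
	assert(counter_now >= baseline);
	return counter_now - baseline;
}

/* How far apart two CPUs' views are when sampled at the same instant. */
int64_t sketch_view_skew(uint64_t counter_now, uint64_t base_a,
			 uint64_t base_b)
{
	return (int64_t)sketch_local_time(counter_now, base_a) -
	       (int64_t)sketch_local_time(counter_now, base_b);
}
```

When the shared counter reads 5000, the CPU with baseline 100 sees 4900 and the one with baseline 2000 sees 3000. The difference of 1900 ticks never goes away, and from inside a CPU it looks the same as a genuinely different CNTVOFF, which is why the thread concludes the bootloader should set a consistent value instead.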
Re: [PATCH 15/16] bootparam: Pass acpi_rsdp pointer in bootparam
On Fri, Sep 13, 2013 at 03:12:40PM +0800, Dave Young wrote:

Also the fw_vendor, runtime, tables elements will be fixed up to use virtual address after 1st kernel call SetVirtualAddress, so even with 1:1 mapping we still need save them and use in kexec kernel.

As I said a couple of times already, 1:1 mapping is a no go. Concerning the runtime services, we need to pass the UEFI memory map to the kexec'ed kernel because it needs to know those to start mapping them from -4G virtual downwards.

-- 
Regards/Gruss,
Boris.
[PATCH v3 6/6] uas: remove BROKEN
xhci streams support is fixed, unblock usb attached scsi.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 drivers/usb/storage/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/storage/Kconfig b/drivers/usb/storage/Kconfig
index 8470e1b..4761a28 100644
--- a/drivers/usb/storage/Kconfig
+++ b/drivers/usb/storage/Kconfig
@@ -202,7 +202,7 @@ config USB_STORAGE_ENE_UB6250
 
 config USB_UAS
 	tristate "USB Attached SCSI"
-	depends on SCSI && BROKEN
+	depends on SCSI
 	help
 	  The USB Attached SCSI protocol is supported by some USB
 	  storage devices.  It permits higher performance by supporting
-- 
1.8.3.1
[PATCH v3 4/6] uas: add dead request list
This patch adds a new list to which all canceled requests are added, so we don't lose them. Then, after killing all inflight urbs on bus reset (and disconnect) we'll walk over the list and clean them up.

Without this we can end up with aborted requests lingering around in case of status pipe transfer errors.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 drivers/usb/storage/uas.c | 50 +++
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 3cf5a5f..f049038 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -53,6 +53,7 @@ struct uas_dev_info {
 	spinlock_t lock;
 	struct work_struct work;
 	struct list_head work_list;
+	struct list_head dead_list;
 };
 
 enum {
@@ -80,6 +81,7 @@ struct uas_cmd_info {
 	struct urb *data_in_urb;
 	struct urb *data_out_urb;
 	struct list_head work;
+	struct list_head dead;
 };
 
 /* I hate forward declarations, but I actually have a loop */
@@ -89,6 +91,7 @@ static void uas_do_work(struct work_struct *work);
 static int uas_try_complete(struct scsi_cmnd *cmnd, const char *caller);
 static void uas_configure_endpoints(struct uas_dev_info *devinfo);
 static void uas_free_streams(struct uas_dev_info *devinfo);
+static void uas_log_cmd_state(struct scsi_cmnd *cmnd, const char *caller);
 
 static void uas_unlink_data_urbs(struct uas_dev_info *devinfo,
 				 struct uas_cmd_info *cmdinfo)
@@ -150,16 +153,12 @@ static void uas_abort_work(struct uas_dev_info *devinfo)
 		struct scsi_pointer *scp = (void *)cmdinfo;
 		struct scsi_cmnd *cmnd = container_of(scp,
 						      struct scsi_cmnd, SCp);
+		uas_log_cmd_state(cmnd, __func__);
+		WARN_ON(cmdinfo->state & COMMAND_ABORTED);
 		cmdinfo->state |= COMMAND_ABORTED;
 		cmdinfo->state &= ~IS_IN_WORK_LIST;
-		if (devinfo->resetting) {
-			/* uas_stat_cmplt() will not do that
-			 * when a device reset is in
-			 * progress */
-			cmdinfo->state &= ~COMMAND_INFLIGHT;
-		}
-		uas_try_complete(cmnd, __func__);
 		list_del(&cmdinfo->work);
+		list_add_tail(&cmdinfo->dead, &devinfo->dead_list);
 	}
 	spin_unlock_irqrestore(&devinfo->lock, flags);
 }
@@ -176,6 +175,28 @@ static void uas_add_work(struct uas_cmd_info *cmdinfo)
 	schedule_work(&devinfo->work);
 }
 
+static void uas_zap_dead(struct uas_dev_info *devinfo)
+{
+	struct uas_cmd_info *cmdinfo;
+	struct uas_cmd_info *temp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&devinfo->lock, flags);
+	list_for_each_entry_safe(cmdinfo, temp, &devinfo->dead_list, dead) {
+		struct scsi_pointer *scp = (void *)cmdinfo;
+		struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd,
+						      SCp);
+		uas_log_cmd_state(cmnd, __func__);
+		WARN_ON(!(cmdinfo->state & COMMAND_ABORTED));
+		/* all urbs are killed, clear inflight bits */
+		cmdinfo->state &= ~(COMMAND_INFLIGHT |
+				    DATA_IN_URB_INFLIGHT |
+				    DATA_OUT_URB_INFLIGHT);
+		uas_try_complete(cmnd, __func__);
+	}
+	spin_unlock_irqrestore(&devinfo->lock, flags);
+}
+
 static void uas_sense(struct urb *urb, struct scsi_cmnd *cmnd)
 {
 	struct sense_iu *sense_iu = urb->transfer_buffer;
@@ -263,6 +284,7 @@ static int uas_try_complete(struct scsi_cmnd *cmnd, const char *caller)
 	if (cmdinfo->state & COMMAND_ABORTED) {
 		scmd_printk(KERN_INFO, cmnd, "abort completed\n");
 		cmnd->result = DID_ABORT << 16;
+		list_del(&cmdinfo->dead);
 	}
 	cmnd->scsi_done(cmnd);
 	return 0;
@@ -292,7 +314,13 @@ static void uas_stat_cmplt(struct urb *urb)
 	u16 tag;
 
 	if (urb->status) {
-		dev_err(&urb->dev->dev, "URB BAD STATUS %d\n", urb->status);
+		if (urb->status == -ENOENT) {
+			dev_err(&urb->dev->dev, "stat urb: killed, stream %d\n",
+				urb->stream_id);
+		} else {
+			dev_err(&urb->dev->dev, "stat urb: status %d\n",
+				urb->status);
+		}
 		usb_free_urb(urb);
 		return;
 	}
@@ -743,7 +771,9 @@ static int uas_eh_abort_handler(struct scsi_cmnd *cmnd)
 
 	uas_log_cmd_state(cmnd, __func__);
 	spin_lock_irqsave(&devinfo->lock, flags);
+	WARN_ON(cmdinfo->state & COMMAND_ABORTED);
 	cmdinfo->state |= COMMAND_ABORTED;
+	list_add_tail(&cmdinfo->dead, &devinfo->dead_list);
 	if (cmdinfo->state
[PATCH v3 2/6] uas: properly reinitialize in uas_eh_bus_reset_handler
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 drivers/usb/storage/uas.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index d966b59..fc08ee9 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -85,6 +85,8 @@ static int uas_submit_urbs(struct scsi_cmnd *cmnd,
 				struct uas_dev_info *devinfo, gfp_t gfp);
 static void uas_do_work(struct work_struct *work);
 static int uas_try_complete(struct scsi_cmnd *cmnd, const char *caller);
+static void uas_configure_endpoints(struct uas_dev_info *devinfo);
+static void uas_free_streams(struct uas_dev_info *devinfo);
 
 static DECLARE_WORK(uas_work, uas_do_work);
 static DEFINE_SPINLOCK(uas_work_lock);
@@ -800,7 +802,10 @@ static int uas_eh_bus_reset_handler(struct scsi_cmnd *cmnd)
 	usb_kill_anchored_urbs(&devinfo->cmd_urbs);
 	usb_kill_anchored_urbs(&devinfo->sense_urbs);
 	usb_kill_anchored_urbs(&devinfo->data_urbs);
+	uas_free_streams(devinfo);
 	err = usb_reset_device(udev);
+	if (!err)
+		uas_configure_endpoints(devinfo);
 	devinfo->resetting = 0;
 
 	if (err) {
-- 
1.8.3.1
[PATCH v3 5/6] uas: replace BUG_ON() + WARN_ON() with WARN_ON_ONCE()
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 drivers/usb/storage/uas.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index f049038..046eedf 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -154,7 +154,7 @@ static void uas_abort_work(struct uas_dev_info *devinfo)
 		struct scsi_cmnd *cmnd = container_of(scp,
 						      struct scsi_cmnd, SCp);
 		uas_log_cmd_state(cmnd, __func__);
-		WARN_ON(cmdinfo->state & COMMAND_ABORTED);
+		WARN_ON_ONCE(cmdinfo->state & COMMAND_ABORTED);
 		cmdinfo->state |= COMMAND_ABORTED;
 		cmdinfo->state &= ~IS_IN_WORK_LIST;
 		list_del(&cmdinfo->work);
@@ -169,7 +169,7 @@ static void uas_add_work(struct uas_cmd_info *cmdinfo)
 	struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd, SCp);
 	struct uas_dev_info *devinfo = cmnd->device->hostdata;
 
-	WARN_ON(!spin_is_locked(&devinfo->lock));
+	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
 	list_add_tail(&cmdinfo->work, &devinfo->work_list);
 	cmdinfo->state |= IS_IN_WORK_LIST;
 	schedule_work(&devinfo->work);
@@ -187,7 +187,7 @@ static void uas_zap_dead(struct uas_dev_info *devinfo)
 		struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd,
 						      SCp);
 		uas_log_cmd_state(cmnd, __func__);
-		WARN_ON(!(cmdinfo->state & COMMAND_ABORTED));
+		WARN_ON_ONCE(!(cmdinfo->state & COMMAND_ABORTED));
 		/* all urbs are killed, clear inflight bits */
 		cmdinfo->state &= ~(COMMAND_INFLIGHT |
 				    DATA_IN_URB_INFLIGHT |
@@ -271,13 +271,13 @@ static int uas_try_complete(struct scsi_cmnd *cmnd, const char *caller)
 	struct uas_cmd_info *cmdinfo = (void *)&cmnd->SCp;
 	struct uas_dev_info *devinfo = (void *)cmnd->device->hostdata;
 
-	WARN_ON(!spin_is_locked(&devinfo->lock));
+	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
 	if (cmdinfo->state & (COMMAND_INFLIGHT |
 			      DATA_IN_URB_INFLIGHT |
 			      DATA_OUT_URB_INFLIGHT |
 			      UNLINK_DATA_URBS))
 		return -EBUSY;
-	BUG_ON(cmdinfo->state & COMMAND_COMPLETED);
+	WARN_ON_ONCE(cmdinfo->state & COMMAND_COMPLETED);
 	cmdinfo->state |= COMMAND_COMPLETED;
 	usb_free_urb(cmdinfo->data_in_urb);
 	usb_free_urb(cmdinfo->data_out_urb);
@@ -398,8 +398,9 @@ static void uas_data_cmplt(struct urb *urb)
 		sdb = scsi_out(cmnd);
 		cmdinfo->state &= ~DATA_OUT_URB_INFLIGHT;
 	}
-	BUG_ON(sdb == NULL);
-	if (urb->status) {
+	if (sdb == NULL) {
+		WARN_ON_ONCE(1);
+	} else if (urb->status) {
 		/* error: no data transfered */
 		sdb->resid = sdb->length;
 	} else {
@@ -573,7 +574,7 @@ static int uas_submit_urbs(struct scsi_cmnd *cmnd,
 	struct uas_cmd_info *cmdinfo = (void *)&cmnd->SCp;
 	int err;
 
-	WARN_ON(!spin_is_locked(&devinfo->lock));
+	WARN_ON_ONCE(!spin_is_locked(&devinfo->lock));
 	if (cmdinfo->state & SUBMIT_STATUS_URB) {
 		err = uas_submit_sense_urb(cmnd->device->host, gfp,
 					   cmdinfo->stream);
@@ -771,7 +772,7 @@ static int uas_eh_abort_handler(struct scsi_cmnd *cmnd)
 
 	uas_log_cmd_state(cmnd, __func__);
 	spin_lock_irqsave(&devinfo->lock, flags);
-	WARN_ON(cmdinfo->state & COMMAND_ABORTED);
+	WARN_ON_ONCE(cmdinfo->state & COMMAND_ABORTED);
 	cmdinfo->state |= COMMAND_ABORTED;
 	list_add_tail(&cmdinfo->dead, &devinfo->dead_list);
 	if (cmdinfo->state & IS_IN_WORK_LIST) {
-- 
1.8.3.1
[PATCH v3 1/6] xhci: fix usb3 streams
xhci maintains a radix tree for each stream endpoint because it must be able to map a trb address to the stream ring. Each ring segment must be added to the radix tree for this to work. Currently xhci sticks only the first segment of each stream ring into the radix tree.

Result is that things work initially, but as soon as the first segment is full xhci can't map the trb address from the completion event to the stream ring any more - BOOM. You'll find this message in the logs:

  ERROR Transfer event for disabled endpoint or incorrect stream ring

This patch adds a helper function to update the radix tree. It can both insert and remove ring segments. It loops over the segment list and handles all segments instead of just the first. It is called whenever an update is needed: when allocating a ring, when expanding a ring and when releasing a ring.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 drivers/usb/host/xhci-mem.c | 53 ++---
 drivers/usb/host/xhci.h     |  2 ++
 2 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 6f8c2fd..de8b006 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -154,8 +154,11 @@ void xhci_ring_free(struct xhci_hcd *xhci, struct xhci_ring *ring)
 	if (!ring)
 		return;
 
-	if (ring->first_seg)
+	if (ring->first_seg) {
+		if (ring->type == TYPE_STREAM)
+			xhci_update_stream_ring(ring, false);
 		xhci_free_segments_for_ring(xhci, ring->first_seg);
+	}
 
 	kfree(ring);
 }
@@ -351,6 +354,11 @@ int xhci_ring_expansion(struct xhci_hcd *xhci, struct xhci_ring *ring,
 	xhci_dbg(xhci, "ring expansion succeed, now has %d segments\n",
 			ring->num_segs);
 
+	if (ring->type == TYPE_STREAM) {
+		ret = xhci_update_stream_ring(ring, true);
+		WARN_ON(ret); /* FIXME */
+	}
+
 	return 0;
 }
 
@@ -601,6 +609,35 @@ static int xhci_test_radix_tree(struct xhci_hcd *xhci,
  * extended systems (where the DMA address can be bigger than 32-bits),
  * if we allow the PCI dma mask to be bigger than 32-bits.  So don't do that.
  */
+
+int xhci_update_stream_ring(struct xhci_ring *ring, bool insert)
+{
+	struct xhci_segment *seg;
+	unsigned long key;
+	bool present;
+	int ret;
+
+	if (WARN_ON_ONCE(ring->trb_address_map == NULL))
+		return 0;
+
+	seg = ring->first_seg;
+	do {
+		key = (unsigned long)(seg->dma >> TRB_SEGMENT_SHIFT);
+		present = radix_tree_lookup(ring->trb_address_map, key) != NULL;
+		if (!present && insert) {
+			ret = radix_tree_insert(ring->trb_address_map,
+						key, ring);
+			if (ret)
+				return ret;
+		}
+		if (present && !insert)
+			radix_tree_delete(ring->trb_address_map, key);
+		seg = seg->next;
+	} while (seg != ring->first_seg);
+
+	return 0;
+}
+
 struct xhci_stream_info *xhci_alloc_stream_info(struct xhci_hcd *xhci,
 		unsigned int num_stream_ctxs,
 		unsigned int num_streams, gfp_t mem_flags)
@@ -608,7 +645,6 @@ struct xhci_stream_info *xhci_alloc_stream_info(struct xhci_hcd *xhci,
 	struct xhci_stream_info *stream_info;
 	u32 cur_stream;
 	struct xhci_ring *cur_ring;
-	unsigned long key;
 	u64 addr;
 	int ret;
 
@@ -663,6 +699,7 @@ struct xhci_stream_info *xhci_alloc_stream_info(struct xhci_hcd *xhci,
 		if (!cur_ring)
 			goto cleanup_rings;
 		cur_ring->stream_id = cur_stream;
+		cur_ring->trb_address_map = stream_info->trb_address_map;
 		/* Set deq ptr, cycle bit, and stream context type */
 		addr = cur_ring->first_seg->dma |
 			SCT_FOR_CTX(SCT_PRI_TR) |
@@ -672,10 +709,7 @@ struct xhci_stream_info *xhci_alloc_stream_info(struct xhci_hcd *xhci,
 		xhci_dbg(xhci, "Setting stream %d ring ptr to 0x%08llx\n",
 				cur_stream, (unsigned long long) addr);
 
-		key = (unsigned long)
-			(cur_ring->first_seg->dma >> TRB_SEGMENT_SHIFT);
-		ret = radix_tree_insert(stream_info->trb_address_map,
-				key, cur_ring);
+		ret = xhci_update_stream_ring(cur_ring, true);
 		if (ret) {
 			xhci_ring_free(xhci, cur_ring);
 			stream_info->stream_rings[cur_stream] = NULL;
@@ -702,9 +736,6 @@ cleanup_rings:
 	for (cur_stream = 1; cur_stream < num_streams; cur_stream++) {
 		cur_ring = stream_info->stream_rings[cur_stream];
 		if (cur_ring) {
-			addr =
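The core of the fix is the walk over the circular segment list, so a stand-alone sketch of that loop may be useful. This is only an illustration under stated assumptions: the sketch_* types are hypothetical, a small fixed-size table stands in for the kernel radix tree, and the key is a plain number rather than a shifted DMA address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAP_SLOTS 16	/* arbitrary capacity for the toy map */

struct sketch_seg {
	unsigned long key;		/* stand-in for seg->dma >> TRB_SEGMENT_SHIFT */
	struct sketch_seg *next;	/* circular: last segment points to first */
};

/* Toy replacement for the radix tree: a flat table of keys. */
struct sketch_map {
	unsigned long keys[SKETCH_MAP_SLOTS];
	bool used[SKETCH_MAP_SLOTS];
};

/* Index of the slot holding `key`, or -1 if the key is absent. */
static int sketch_find(const struct sketch_map *m, unsigned long key)
{
	for (int i = 0; i < SKETCH_MAP_SLOTS; i++)
		if (m->used[i] && m->keys[i] == key)
			return i;
	return -1;
}

bool sketch_map_has(const struct sketch_map *m, unsigned long key)
{
	return sketch_find(m, key) >= 0;
}

/*
 * Insert (insert == true) or remove (insert == false) every segment of
 * the ring, not just the first one. Returns 0, or -1 if the map is full.
 */
int sketch_update_ring(struct sketch_map *m, struct sketch_seg *first,
		       bool insert)
{
	struct sketch_seg *seg = first;

	assert(m != NULL && first != NULL);
	do {
		int slot = sketch_find(m, seg->key);

		if (slot < 0 && insert) {
			int free_slot = -1;

			for (int i = 0; i < SKETCH_MAP_SLOTS; i++) {
				if (!m->used[i]) {
					free_slot = i;
					break;
				}
			}
			if (free_slot < 0)
				return -1;
			m->keys[free_slot] = seg->key;
			m->used[free_slot] = true;
		}
		if (slot >= 0 && !insert)
			m->used[slot] = false;
		seg = seg->next;
	} while (seg != first);	/* stop once the walk wraps around */

	return 0;
}
```

Because already-present keys are skipped on insert, calling the helper again after a ring expansion only adds the new segments, which mirrors how the patch reuses one function for allocation, expansion and release.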
[PATCH v3 3/6] uas: make work list per-device
Simplifies locking, we'll protect the list with the device spin lock.
Also plugs races which can happen when two devices operate on the
global list.

While being at it rename the list head from "list" to "work", preparing
for the addition of a second list.

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 drivers/usb/storage/uas.c | 106 +++---
 1 file changed, 44 insertions(+), 62 deletions(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index fc08ee9..3cf5a5f 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -51,6 +51,8 @@ struct uas_dev_info {
 	unsigned uas_sense_old:1;
 	struct scsi_cmnd *cmnd;
 	spinlock_t lock;
+	struct work_struct work;
+	struct list_head work_list;
 };
 
 enum {
@@ -77,7 +79,7 @@ struct uas_cmd_info {
 	struct urb *cmd_urb;
 	struct urb *data_in_urb;
 	struct urb *data_out_urb;
-	struct list_head list;
+	struct list_head work;
 };
 
 /* I hate forward declarations, but I actually have a loop */
@@ -88,10 +90,6 @@ static int uas_try_complete(struct scsi_cmnd *cmnd, const char *caller);
 static void uas_configure_endpoints(struct uas_dev_info *devinfo);
 static void uas_free_streams(struct uas_dev_info *devinfo);
 
-static DECLARE_WORK(uas_work, uas_do_work);
-static DEFINE_SPINLOCK(uas_work_lock);
-static LIST_HEAD(uas_work_list);
-
 static void uas_unlink_data_urbs(struct uas_dev_info *devinfo,
 				 struct uas_cmd_info *cmdinfo)
 {
@@ -118,75 +116,66 @@ static void uas_unlink_data_urbs(struct uas_dev_info *devinfo,
 
 static void uas_do_work(struct work_struct *work)
 {
+	struct uas_dev_info *devinfo =
+		container_of(work, struct uas_dev_info, work);
 	struct uas_cmd_info *cmdinfo;
 	struct uas_cmd_info *temp;
-	struct list_head list;
 	unsigned long flags;
 	int err;
 
-	spin_lock_irq(&uas_work_lock);
-	list_replace_init(&uas_work_list, &list);
-	spin_unlock_irq(&uas_work_lock);
-
-	list_for_each_entry_safe(cmdinfo, temp, &list, list) {
+	spin_lock_irqsave(&devinfo->lock, flags);
+	list_for_each_entry_safe(cmdinfo, temp, &devinfo->work_list, work) {
 		struct scsi_pointer *scp = (void *)cmdinfo;
-		struct scsi_cmnd *cmnd = container_of(scp,
-						struct scsi_cmnd, SCp);
-		struct uas_dev_info *devinfo = (void *)cmnd->device->hostdata;
-		spin_lock_irqsave(&devinfo->lock, flags);
+		struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd,
+						      SCp);
 		err = uas_submit_urbs(cmnd, cmnd->device->hostdata, GFP_ATOMIC);
-		if (!err)
+		if (!err) {
 			cmdinfo->state &= ~IS_IN_WORK_LIST;
-		spin_unlock_irqrestore(&devinfo->lock, flags);
-		if (err) {
-			list_del(&cmdinfo->list);
-			spin_lock_irq(&uas_work_lock);
-			list_add_tail(&cmdinfo->list, &uas_work_list);
-			spin_unlock_irq(&uas_work_lock);
-			schedule_work(&uas_work);
+			list_del(&cmdinfo->work);
+		} else {
+			schedule_work(&devinfo->work);
 		}
 	}
+	spin_unlock_irqrestore(&devinfo->lock, flags);
 }
 
 static void uas_abort_work(struct uas_dev_info *devinfo)
 {
 	struct uas_cmd_info *cmdinfo;
 	struct uas_cmd_info *temp;
-	struct list_head list;
 	unsigned long flags;
 
-	spin_lock_irq(&uas_work_lock);
-	list_replace_init(&uas_work_list, &list);
-	spin_unlock_irq(&uas_work_lock);
-
 	spin_lock_irqsave(&devinfo->lock, flags);
-	list_for_each_entry_safe(cmdinfo, temp, &list, list) {
+	list_for_each_entry_safe(cmdinfo, temp, &devinfo->work_list, work) {
 		struct scsi_pointer *scp = (void *)cmdinfo;
-		struct scsi_cmnd *cmnd = container_of(scp,
-						struct scsi_cmnd, SCp);
-		struct uas_dev_info *di = (void *)cmnd->device->hostdata;
-
-		if (di == devinfo) {
-			cmdinfo->state |= COMMAND_ABORTED;
-			cmdinfo->state &= ~IS_IN_WORK_LIST;
-			if (devinfo->resetting) {
-				/* uas_stat_cmplt() will not do that
-				 * when a device reset is in
-				 * progress */
-				cmdinfo->state &= ~COMMAND_INFLIGHT;
-			}
-			uas_try_complete(cmnd, __func__);
-		} else {
-			/* not our uas device, relink into list */
-			list_del(&cmdinfo->list);
-
Re: [PATCH 1/2] tools, perf: Add a precise event qualifier v2
On Fri, Sep 13, 2013 at 11:50:57AM +0200, Ingo Molnar wrote:
> For example if we added 'type' as well we could expose the generic,
> hardware-independent events via sysfs as well.

Type is already fully implied by where you'll find the event in sysfs:

  /sys/bus/event_sources/devices/$PMU/events/

needs

  perf_event_attr::type := /sys/bus/event_sources/devices/$PMU/type
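
A rough userspace sketch of that lookup follows. It only shows the "read the number from the PMU's type file" step; the helper name read_pmu_type() is hypothetical, and the caller is assumed to build the path and to copy the result into perf_event_attr::type itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the numeric PMU type from a sysfs-style "type" file.
 * Returns 0 on success and stores the value in *type, -1 on failure.
 */
int read_pmu_type(const char *type_path, unsigned int *type)
{
	FILE *f;
	int ok;

	assert(type_path != NULL && type != NULL);
	f = fopen(type_path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%u", type) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass something like the $PMU/type path quoted above. The exact directory name under /sys/bus may differ from the spelling in the mail, so it is worth checking on the target system.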
Crypto Fixes for 3.12
Hi Linus:

This push fixes a 7+ year race condition in the crypto API that causes
sporadic crashes when multiple threads load the same algorithm.

It also fixes the crct10dif algorithm again to prevent boot failures on
systems where the initramfs tool ignores module softdeps.

Please pull from

  git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git

or

  master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6.git

Herbert Xu (2):
      crypto: api - Fix race condition in larval lookup
      crypto: crct10dif - Add fallback for broken initrds

 crypto/Makefile                             |    2 +-
 crypto/api.c                                |    7 +-
 crypto/{crct10dif.c => crct10dif_common.c}  |  100 +--
 crypto/{crct10dif.c => crct10dif_generic.c} |   53 +-
 lib/crc-t10dif.c                            |   11 ++-
 5 files changed, 20 insertions(+), 153 deletions(-)

Thanks,
--
Email: Herbert Xu <herb...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] /dev/random: Insufficient of entropy on many architectures
Stephan Mueller <smueller at chronox.de> writes:

> A monotonic counter is fully ok. Note, for /dev/random, the occurrence of
> events delivers entropy. Thus, we have to be able to precisely measure that
> occurrence. The timer itself does not need to deliver any entropy as long as
> it is fast.

Rather the other way round… the get_cycles() value is XORd with jiffies
in drivers/char/random.c in one instance, and used alongside it in the
other, so it should NOT be derived from jiffies or the clocksource.

I think the focus on it being high-frequency enough to return a different
result every call is also too strict: better have a 16-bit 700 kHz counter
than none at all (plus, chances are very good it’s increased between calls
for drivers/char/random.c at least).

Geert’s question was probably about the requirement to be monotonic…
other callers do do things like "& 0xFF" but the random code doesn’t,
so something like "use the 16 bit of the fast counter and fill the slower
counter into the upper bits" would work… but this is a trade-off against
interrupt speed. I can’t judge whether it’s better to only use the fast
16-bit counter, considering bus speeds and interrupt counts. We do have
jiffies too and use get_cycles() only to introduce more uncertainty, so
it may very well be enough…

[OT] Oh and for all who have not yet read e.g. Fefe today:
http://people.umass.edu/gbecker/BeckerChes13.pdf
Basically, RDRAND must not be used except to mix it into the pool,
now confirmed. (Kudos to Theodore and Matt for always insisting on this.)

bye,
//mirabilos
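
The "fast counter in the low bits, slow counter in the upper bits" idea could look roughly like this. It is a sketch only: combine_counters() is a hypothetical name, and how an architecture would actually read the two counters is not shown.

```c
#include <stdint.h>

/*
 * Hypothetical helper: place the fast 16-bit counter in the low half and
 * the low 16 bits of a slower counter in the upper half of a 32-bit value.
 * Callers that mask with "& 0xFF" still see the fast-changing bits.
 */
uint32_t combine_counters(uint16_t fast, uint32_t slow)
{
	return (slow << 16) | fast;
}
```

Only the low 16 bits of the slow counter survive the shift, and reading two counters costs more than reading one, which is the trade-off against interrupt speed mentioned above.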
Re: [PATCH 2/2] cpufreq: serialize calls to __cpufreq_governor()
On 4 September 2013 01:10, Rafael J. Wysocki <r...@sisk.pl> wrote:
> On Tuesday, September 03, 2013 06:50:05 PM Srivatsa S. Bhat wrote:
>> This doesn't solve the problem completely: it prevents the store_*() task
>> from continuing *only* when it concurrently executes the
>> __cpufreq_governor() function along with the CPU offline task. But if the
>> two calls don't overlap, we will still have the possibility where the
>> store_*() task tries to acquire the timer mutex after the CPU offline task
>> has just finished destroying it.
>
> Yeah, I overlooked that.

As background: I had an IRC chat with Srivatsa about this mail (I had
marked it unread as there were other important topics to close). He had
some other code in mind, and these synchronization problems aren't there
with my patch at all (he agrees).

Rafael, probably both Srivatsa and I are missing something that you
understood. Can you please share what problem you see here with my patch?

And yes, even with Srivatsa's patchset I found a problem: two threads,
one changing the governor from ondemand->conservative and the other one
changing min/max freq. The first one will try to STOP the governor and
the other one will try to change the limits of the governor. Suppose the
2nd one gets to ->governor() and after this the first one stops the
governor. Now the first one tries to access the lock and crashes.

On IRC, Srivatsa agreed about the problem.

So, probably the first thing to do is to get this patch back, i.e. revert
Srivatsa's patch:

commit 56d07db274b7b15ca38b60ea4a762d40de093000
Author: Srivatsa S. Bhat <srivatsa.b...@linux.vnet.ibm.com>
Date:   Sat Sep 7 01:23:55 2013 +0530

    cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes

Srivatsa also asked if we can get a big lock around the call to
->governor(), which would create recursive locks on a call to the EXIT
event (as we have seen earlier). But he believes he can pull it off and
will try this later.

So, for now we can revert the above patch :)

--
viresh
Re: [PATCH v2 1/9] i2c: prepare runtime PM support for I2C client devices
On Fri, Sep 13, 2013 at 11:31:52AM +0100, Mark Brown wrote:
> On Fri, Sep 13, 2013 at 01:16:11PM +0300, Mika Westerberg wrote:
>> On Fri, Sep 13, 2013 at 10:59:50AM +0100, Mark Brown wrote:
>>> Accessing the bus isn't an issue for I2C outside of ACPI, the power
>>> management of the device is totally disassociated from the bus and the
>>> controller is responsible for ensuring it is available during transfers.
>>
>> Yes, but since we want to support ACPI as well, we must make sure that the
>> adapter (and the associated controller) is available when client ->probe()
>> is called.
>
> Right, but this probably needs to be highlighted more since it's a very
> surprising thing for I2C and is causing confusion.

By highlighted more, do you mean something like adding a comment in the
code about this or?
Re: [f2fs-dev][PATCH] f2fs: limit nr_iovecs in bio_alloc
Did this patch pass the basic build? There seems to be a typo regarding
MAX_BIO_BLOCK.

--
Jin

On 13/09/2013 18:07, Chao Yu wrote:
> This patch add macro MAX_BIO_BLOCKS to limit value of npages in
> f2fs_bio_alloc, it can avoid to return NULL in bio_alloc caused by npages is
> larger than UIO_MAXIOV.
>
> Signed-off-by: Yu Chao <chao2...@samsung.com>
> ---
>  fs/f2fs/segment.c |    4 +++-
>  fs/f2fs/segment.h |    3 +++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 09af9c7..bd79bbe 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -657,6 +657,7 @@ static void submit_write_page(struct f2fs_sb_info *sbi, struct page *page,
>  				block_t blk_addr, enum page_type type)
>  {
>  	struct block_device *bdev = sbi->sb->s_bdev;
> +	int bio_blocks;
>
>  	verify_block_addr(sbi, blk_addr);
>
> @@ -676,7 +677,8 @@ retry:
>  			goto retry;
>  		}
>
> -		sbi->bio[type] = f2fs_bio_alloc(bdev, max_hw_blocks(sbi));
> +		bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
> +		sbi->bio[type] = f2fs_bio_alloc(bdev, bio_blocks);
>  		sbi->bio[type]->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
>  		sbi->bio[type]->bi_private = priv;
>  		/*
> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> index bdd10ea..9cc95eb 100644
> --- a/fs/f2fs/segment.h
> +++ b/fs/f2fs/segment.h
> @@ -9,6 +9,7 @@
>   * published by the Free Software Foundation.
>   */
>  #include <linux/blkdev.h>
> +#include <linux/uio.h>
>
>  /* constant macro */
>  #define NULL_SEGNO			((unsigned int)(~0))
> @@ -90,6 +91,8 @@
>  	(blk_addr << ((sbi)->log_blocksize - F2FS_LOG_SECTOR_SIZE))
>  #define SECTOR_TO_BLOCK(sbi, sectors)					\
>  	(sectors >> ((sbi)->log_blocksize - F2FS_LOG_SECTOR_SIZE))
> +#define MAX_BIO_BLOCK(max_hw_blocks)					\
> +	(min((int)max_hw_blocks, UIO_MAXIOV))
>
>  /* during checkpoint, bio_private is used to synchronize the last bio */
>  struct bio_private {
> ---
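
The clamping the patch is after can be tried outside the kernel with a small sketch. The names below are hypothetical, and SKETCH_MAXIOV is only an assumed stand-in for UIO_MAXIOV; this version also compares before converting to int, which differs slightly from the cast-then-min() form in the patch.

```c
#define SKETCH_MAXIOV 1024   /* assumed stand-in for UIO_MAXIOV */

/*
 * Clamp a requested block count so that an allocator with a hard upper
 * limit is never asked for more than it can provide.
 */
int max_bio_blocks(unsigned int max_hw_blocks)
{
	return max_hw_blocks < SKETCH_MAXIOV ? (int)max_hw_blocks
					     : SKETCH_MAXIOV;
}
```

Whatever the final macro is called, the definition and its single call site have to agree on the name, which is the build problem pointed out above.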
Re: [GIT PULL] Btrfs
On Fri, Sep 13, 2013 at 2:44 AM, Geert Uytterhoeven <ge...@linux-m68k.org> wrote:
> On Thu, Sep 12, 2013 at 10:38 PM, Josh Boyer <jwbo...@fedoraproject.org> wrote:
>> On Thu, Sep 12, 2013 at 11:36 AM, Chris Mason <chris.ma...@fusionio.com> wrote:
>>> Mark Fasheh (4):
>>>       btrfs: offline dedupe
>>
>> This commit adds calls to __put_user_unaligned, which causes build
>> failures on ARM if btrfs is configured:
>>
>> + make -s ARCH=arm V=1 -j4 modules
>> fs/btrfs/ioctl.c: In function 'btrfs_ioctl_file_extent_same':
>> fs/btrfs/ioctl.c:2802:3: error: implicit declaration of function
>> '__put_user_unaligned' [-Werror=implicit-function-declaration]
>>    if (__put_user_unaligned(info.status, &args->info[i].status) ||
>>    ^
>> cc1: some warnings being treated as errors
>> make[2]: *** [fs/btrfs/ioctl.o] Error 1
>> make[1]: *** [fs/btrfs] Error 2
>> make[1]: *** Waiting for unfinished jobs
>> make: *** [fs] Error 2
>> make: *** Waiting for unfinished jobs
>
> Cfr. my early warning 10 days ago:
> "Btrfs is the first user of __put_user_unaligned() outside the compat
> code, hence now all 32-bit architectures should make sure to
> implement this, too."
>
> http://marc.info/?l=linux-arch&m=137820065929216&w=2

Indeed. I missed that as it was an m68k patch.

I'm not an ARM expert, so I don't know if ARM should use the
asm-generic implementations, or just use __get_user/__put_user in all
cases. I've CC'd rmk.

josh
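
As a userspace analogy only (this is not the kernel accessor, and the function names are hypothetical), an unaligned-safe store and load can be written as a byte-wise copy, so the compiler never emits a wider access to a misaligned address:

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value at a possibly misaligned address. */
void store_u32_unaligned(void *dst, uint32_t val)
{
	memcpy(dst, &val, sizeof(val));
}

/* Load a 32-bit value from a possibly misaligned address. */
uint32_t load_u32_unaligned(const void *src)
{
	uint32_t val;

	memcpy(&val, src, sizeof(val));
	return val;
}
```

Whether an architecture needs such a detour, or can rely on its normal accessors, is exactly the open question in this thread.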
Re: [PATCH] /dev/random: Insufficient of entropy on many architectures
Stephan Mueller <smueller at chronox.de> writes:

> And here the RNG theory breaks: a whitening function (crypto function) like
> the used SHA1 does not add entropy. Thus, the SHA1 just spreads out the
> entropy evenly over the output buffer. As entropy can be considered

That’s why you also use a (faster, less powerful) mixing function on input.

I was wondering: why not use Jenkins’ one-at-a-time hash there? I’ve
worked with it a bit, and it has good avalanche behaviour; I didn’t
measure any cycles though.

Basically, let h be a uint32_t register (usually loaded from a memory
location and stored to one afterwards), then you do:

	(h) += (uint8_t)(b);
#ifdef with_mirabilos_changes
	++(h);
#endif
	(h) += (h) << 10;
	(h) ^= (h) >> 6;

I’m adding 1 after the first step (adding the byte) in order to make
even NUL bytes make up a difference.

I’m toying with code that has 32 such uint32_t values, making for a
total of 128 Bytes of state, and when asked to add some memory content
into that pool, just distribute the bytes over the values array,
incrementing a counter. (One could probably do something with cmpxchg
to make that lockless, but I’m not doing any parallel programming myself.)

Then, every once in a while, you’d run the finish function on every
value, which is:

#ifdef with_mirabilos_changes
	(h) += (h) << 10;
	(h) ^= (h) >> 6;
#endif
	(h) += (h) << 3;
	(h) ^= (h) >> 11;
	(h) += (h) << 15;

My change here is because Jenkins’ OAAT has good avalanche except for
the last input byte, so I just fake adding NUL (which doesn’t get
avalanched so well, but doesn’t matter) to get the last actual data
byte avalanched.

Then I write those finalised hash values back to the 128-Byte buffer,
then I add that into the main pool using a cryptographic (hash, or
stream cipher rekey) function. (In my case, arc4random, if someone
wonders.)

> It doesn't matter if you collect predictable data - it neither helps

Oh yes, it hurts, if you update the entropy estimator on those
predictable bits. Because then you get a deterministic RNG like I’ve seen
Theodore apply exponential backoff to any estimation, which is probably
a good thing. But yes, you probably will want to guess the entropy of
the bytes added and not count things where you’re not too sure of.
(In the interrupt case, we have jiffies, cycles and num, so we’d
probably best estimate how often interrupts are called, base a number
of “timing” bits on that, and account for num; this may very well be
less than 8 bit “credited” for a long and two u_int added to the pool,
but as long as you get that right and don’t blindly credit a byte as
8 bit, you’re safe. To quote the author of RANDOM.SYS for DOS: “Every
bit counts.” It adds uncertainty to the pool, at least by stirring
around the already-existing entropy without adding any new (creditable)
entropy.)

bye,
//mirabilos
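
The mixing and finishing steps described in this mail can be put into a small self-contained C sketch. The function names and the two flag parameters are hypothetical; they merely switch the steps the mail guards with with_mirabilos_changes on or off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One mixing step of Jenkins' one-at-a-time hash for a single input byte. */
uint32_t oaat_mix(uint32_t h, uint8_t b, int count_nul)
{
	h += b;
	if (count_nul)
		++h;            /* variant from the mail: NUL bytes change h too */
	h += h << 10;
	h ^= h >> 6;
	return h;
}

/* Final avalanche of the hash. */
uint32_t oaat_finish(uint32_t h, int extra_round)
{
	if (extra_round) {      /* variant from the mail: fake a trailing NUL */
		h += h << 10;
		h ^= h >> 6;
	}
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return h;
}

/* Plain one-at-a-time hash over a buffer, without the variant steps. */
uint32_t oaat_hash(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t h = 0;

	assert(buf != NULL || len == 0);
	while (len--)
		h = oaat_mix(h, *p++, 0);
	return oaat_finish(h, 0);
}
```

With count_nul set, a zero byte mixed into a zero state no longer leaves the state at zero, which is the effect the "++(h)" step is meant to have.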
Re: [patch 3/4] skd: use strncpy() as a cleanup
On Fri, Sep 13, 2013 at 10:05 AM, Dan Carpenter <dan.carpen...@oracle.com> wrote:
> The code here is copying the version to inq.driver_version but we don't
> want it to be NUL terminated. Instead we pad the rest of the array with
> spaces. It's fewer lines to use strncpy() and maybe a little nicer.
>
> Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>
>
> diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c
> index f892d95..20ad843 100644
> --- a/drivers/block/skd_main.c
> +++ b/drivers/block/skd_main.c
> @@ -2899,9 +2899,7 @@ static void skd_do_inq_page_da(struct skd_device *skdev,
>  				volatile struct fit_comp_error_info *skerr,
>  				uint8_t *cdb, uint8_t *buf)
>  {
> -	unsigned ver_byte;
>  	unsigned max_bytes;
> -	char *ver = DRV_VER_COMPL;
>  	struct driver_inquiry_data inq;
>  	u16 val;
>
> @@ -2945,12 +2943,8 @@ static void skd_do_inq_page_da(struct skd_device *skdev,
>  	/* Driver version, fixed lenth, padded with spaces on the right */
>  	inq.driver_version_length = sizeof(inq.driver_version);
>  	memset(inq.driver_version, ' ', sizeof(inq.driver_version));
> -	for (ver_byte = 0; ver_byte < sizeof(inq.driver_version); ver_byte++) {
> -		if (ver[ver_byte] != 0)
> -			inq.driver_version[ver_byte] = ver[ver_byte];
> -		else
> -			break;
> -	}
> +	strncpy(inq.driver_version, DRV_VER_COMPL,
> +		min(sizeof(inq.driver_version), strlen(DRV_VER_COMPL)));

This does the exact same thing as memcpy(), right? So why not use that?
memcpy() has much simpler semantics than strncpy().

>  	inq.page_length = cpu_to_be16((sizeof(inq) - 4));

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
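
The memcpy() variant suggested here might look like the following userspace sketch. fill_space_padded() is a hypothetical helper, not code from the driver; it copies at most the field width and pads the rest with spaces, leaving the field without a terminating NUL on purpose.

```c
#include <assert.h>
#include <string.h>

/*
 * Fill a fixed-width field: copy at most dst_len bytes of src and pad the
 * remainder with spaces. The field is deliberately not NUL-terminated.
 */
void fill_space_padded(char *dst, size_t dst_len, const char *src)
{
	size_t n;

	assert(dst != NULL && src != NULL);
	n = strlen(src);
	if (n > dst_len)
		n = dst_len;
	memcpy(dst, src, n);
	memset(dst + n, ' ', dst_len - n);
}
```

Because the length is bounded explicitly, none of strncpy()'s special cases (zero-filling, missing terminator) need to be reasoned about.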
RE: [f2fs-dev][PATCH] f2fs: limit nr_iovecs in bio_alloc
-----Original Message-----
From: Jin Xu [mailto:linuxclim...@gmail.com]
Sent: Friday, September 13, 2013 7:49 PM
To: Chao Yu
Cc: ???; linux-f2fs-de...@lists.sourceforge.net; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; 谭姝
Subject: Re: [f2fs-dev][PATCH] f2fs: limit nr_iovecs in bio_alloc

> Did this patch pass the basic build? There seems have a typo regarding
> MAX_BIO_BLOCK.

I am so sorry about that. I missed the 'S' when merging the code by hand
from the build path to the git branch path.
I will check the patch carefully and resubmit it.
Thanks for the reminder!

> --
> Jin
>
> On 13/09/2013 18:07, Chao Yu wrote:
>> This patch add macro MAX_BIO_BLOCKS to limit value of npages in
>> f2fs_bio_alloc, it can avoid to return NULL in bio_alloc caused by npages is
>> larger than UIO_MAXIOV.
>>
>> Signed-off-by: Yu Chao <chao2...@samsung.com>
>> ---
>>  fs/f2fs/segment.c |    4 +++-
>>  fs/f2fs/segment.h |    3 +++
>>  2 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>> index 09af9c7..bd79bbe 100644
>> --- a/fs/f2fs/segment.c
>> +++ b/fs/f2fs/segment.c
>> @@ -657,6 +657,7 @@ static void submit_write_page(struct f2fs_sb_info *sbi, struct page *page,
>>  				block_t blk_addr, enum page_type type)
>>  {
>>  	struct block_device *bdev = sbi->sb->s_bdev;
>> +	int bio_blocks;
>>
>>  	verify_block_addr(sbi, blk_addr);
>>
>> @@ -676,7 +677,8 @@ retry:
>>  			goto retry;
>>  		}
>>
>> -		sbi->bio[type] = f2fs_bio_alloc(bdev, max_hw_blocks(sbi));
>> +		bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
>> +		sbi->bio[type] = f2fs_bio_alloc(bdev, bio_blocks);
>>  		sbi->bio[type]->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
>>  		sbi->bio[type]->bi_private = priv;
>>  		/*
>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
>> index bdd10ea..9cc95eb 100644
>> --- a/fs/f2fs/segment.h
>> +++ b/fs/f2fs/segment.h
>> @@ -9,6 +9,7 @@
>>   * published by the Free Software Foundation.
>>   */
>>  #include <linux/blkdev.h>
>> +#include <linux/uio.h>
>>
>>  /* constant macro */
>>  #define NULL_SEGNO			((unsigned int)(~0))
>> @@ -90,6 +91,8 @@
>>  	(blk_addr << ((sbi)->log_blocksize - F2FS_LOG_SECTOR_SIZE))
>>  #define SECTOR_TO_BLOCK(sbi, sectors)					\
>>  	(sectors >> ((sbi)->log_blocksize - F2FS_LOG_SECTOR_SIZE))
>> +#define MAX_BIO_BLOCK(max_hw_blocks)					\
>> +	(min((int)max_hw_blocks, UIO_MAXIOV))
>>
>>  /* during checkpoint, bio_private is used to synchronize the last bio */
>>  struct bio_private {
>> ---
Re: [PATCH v2 1/9] i2c: prepare runtime PM support for I2C client devices
On Fri, Sep 13, 2013 at 02:50:35PM +0300, Mika Westerberg wrote:
> On Fri, Sep 13, 2013 at 11:31:52AM +0100, Mark Brown wrote:
>> Right, but this probably needs to be highlighted more since it's a very
>> surprising thing for I2C and is causing confusion.
>
> By highlighted more, do you mean something like adding a comment in the
> code about this or?

Perhaps, yes. Or possibly the commit log is going to be enough going
forwards.
Re: [PATCH 5/5] ARM64: Add support for ILP32 ABI.
On Fri, Sep 13, 2013 at 07:18:48AM +0100, Andrew Pinski wrote: On Wed, Sep 11, 2013 at 7:32 AM, Catalin Marinas catalin.mari...@arm.com wrote: On Mon, Sep 09, 2013 at 10:32:59PM +0100, Andrew Pinski wrote: On top of these, I would really like to see Documentation/arm64/ilp32.txt describing the ABI. No other target does not, not even x86_64 for x32. That's not really a good argument. The other approach I've been looking at is just using the native siginfo instead of the compat one for ILP32. But this requires wider debate (cc'ed Arnd if he has time). This is not useful and as you shown can be very messy and even worse when it comes taking into account big and little-endian. Even x32 does not do that. Well, please don't bring the x32 does not do that argument. It doesn't mean we shouldn't investigate better ways. Initially x32 got the siginfo members alignment wrong and they ended up __ARCH_SI_CLOCK_T and __ARCH_SI_ATTRIBUTES, changing the generic uapi files. Basically if you use the current siginfo in the ILP32 context with __kernel_clock_t being 64-bit you end up with a structure that doesn't match any of the native or compat siginfo. This is because we have some pointers which will turn into 32-bit values in ILP32: void __user *sival_ptr; /* accessed via si_ptr */ void __user *_addr; /* accessed via si_addr */ void __user *_call_addr;/* accessed via si_call_addr */ We also have __ARCH_SI_BAND_T defined as long. I had first thought about this and even started to implement it but I found the glibc and the kernel messier than it was already. The kernel part wasn't bad IMO (of course, needs ack from generic headers maintainer). I can't talk about glibc but wouldn't it just access these members explicitly? AFAICT, Linux only does a put_user() on these and never reads them from user space. 
This means that we can add the right padding on either side of these pointers (for endianness reasons) and Linux would write 0 as the top part of a 64-bit pointer (since the user address is restricted to 32-bit anyway). User ILP32 would only access the corresponding pointer as a 32-bit value and ignore the padding. And I am not a fan of changing the generic UAPI files just so it is no longer generic like you are doing. As I said above, x32 did that already and your are doing similar things for __ARCH_SI_CLOCK_T. So, I'm looking for feedback on this proposal. As I mentioned before even x32 does not do that and it is very messy to make sure things get zero'd on the glibc and kernel sides. (not the x32 argument again) On the kernel side, they get zeroed automatically because the kernel assumes it is a 64-bit address for user space, which is restricted to 32-bit only. Are these members ever read back by the kernel? That's where glibc zeroing would be needed (and I wouldn't like it either). diff --git a/arch/arm64/include/uapi/asm/siginfo.h b/arch/arm64/include/uapi/asm/siginfo.h index 5a74a08..297fb4f 100644 --- a/arch/arm64/include/uapi/asm/siginfo.h +++ b/arch/arm64/include/uapi/asm/siginfo.h @@ -16,7 +16,13 @@ #ifndef __ASM_SIGINFO_H #define __ASM_SIGINFO_H +#ifdef __LP64__ #define __ARCH_SI_PREAMBLE_SIZE(4 * sizeof(int)) +#else /* ILP32 */ +typedef long long __kernel_si_clock_t __attribute__((aligned(4))); +#define __ARCH_SI_CLOCK_T __kernel_si_clock_t +#define __ARCH_SI_ATTRIBUTES __attribute__((aligned(8))) +#endif This could go away if we manage to use the native siginfo. See above why I think this is a bad thing and even worse since even x32 did not do that already; it was the last added ABI like ILP32 to the kernel. The x32 thing is becoming the central theme. --- /dev/null +++ b/arch/arm64/kernel/sys_ilp32.c @@ -0,0 +1,274 @@ +/* + * AArch64- ILP32 specific system calls implementation + * + * Copyright (C) 2013 Cavium Inc. 
+ * Author: Andrew Pinski apin...@cavium.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ + +/* Adjust unistd.h to provide 32-bit numbers and functions. */ +#define __SYSCALL_COMPAT No. We need to use as many native syscalls as possible and only define those absolutely necessary. In my investigation, I only ended up needing these: No using __SYSCALL_COMPAT is the correct thing to do and then only reverting
Re: [PATCH 5/5] ARM64: Add support for ILP32 ABI.
On Fri, Sep 13, 2013 at 11:04:53AM +0100, Will Deacon wrote:
> On Fri, Sep 13, 2013 at 10:57:40AM +0100, Catalin Marinas wrote:
>> On Fri, Sep 13, 2013 at 10:47:12AM +0100, Will Deacon wrote:
>>> On Fri, Sep 13, 2013 at 07:18:48AM +0100, Andrew Pinski wrote:
>>>> On Wed, Sep 11, 2013 at 7:32 AM, Catalin Marinas <catalin.mari...@arm.com> wrote:
>>>>> On Mon, Sep 09, 2013 at 10:32:59PM +0100, Andrew Pinski wrote:
>>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>> index cc64df5..7fdc994 100644
>>>>>> --- a/arch/arm64/Kconfig
>>>>>> +++ b/arch/arm64/Kconfig
>>>>>> @@ -248,7 +248,7 @@ source "fs/Kconfig.binfmt"
>>>>>>
>>>>>>  config COMPAT
>>>>>>  	def_bool y
>>>>>> -	depends on ARM64_AARCH32
>>>>>> +	depends on ARM64_AARCH32 || ARM64_ILP32
>>>>>>  	select COMPAT_BINFMT_ELF
>>>>>>
>>>>>>  config ARM64_AARCH32
>>>
>>> (nitpick) We used to have an option like this, called
>>> CONFIG_AARCH32_EMULATION, which I think is clearer than
>>> CONFIG_ARM64_AARCH32.
>>
>> I think avoiding EMULATION is better, we don't actually emulate the
>> instruction set ;).
>
> Bah, you suggest something better then!

CONFIG_AARCH32_EL0.

--
Catalin
Re: [GIT PULL] Btrfs
On Fri, Sep 13, 2013 at 07:53:21AM -0400, Josh Boyer wrote:
> I'm not an ARM expert, so I don't know if ARM should use the asm-generic
> implementations, or just use __get_user/__put_user in all cases. I've CC'd rmk.

Why do we have uaccess-unaligned.h? Normally, these kinds of things are spawned by architectures which have problems with unaligned accesses, ARM being one of them, but afaik we've never needed this. With the kernel-side trapping of unaligned accesses on older hardware, we've always dealt with the normal accessor faulting.

From what I can tell in the git history, these unaligned put_user and get_user have existed all the way back to the dawn of git use. Can someone enlighten me why we have them?

-- 
Russell King
Re: [PATCH v4] mmc: sdhci-msm: Add support for MSM chipsets
Hi Georgi, Several comments bellow. On Thu, 2013-09-12 at 17:56 +0300, Georgi Djakov wrote: This platform driver adds the support of Secure Digital Host Controller Interface compliant controller found in Qualcomm MSM chipsets. CC: Asutosh Das asuto...@codeaurora.org CC: Venkat Gopalakrishnan venk...@codeaurora.org CC: Sahitya Tummala stumm...@codeaurora.org CC: Subhash Jadavani subha...@codeaurora.org Signed-off-by: Georgi Djakov gdja...@mm-sol.com --- Changes from v3: - Allocate memory for all required structs at once - Added termination entry in sdhci_msm_dt_match[] - Fixed a missing sdhci_pltfm_free() in probe() - Removed redundant of_match_ptr - Removed the unneeded function sdhci_msm_vreg_reset() Changes from v2: - Added DT bindings for clocks - Moved voltage regulators data to platform data - Removed unneeded includes - Removed obsolete and wrapper functions - Removed error checking where unnecessary - Removed redundant _clk suffix from clock names - Just return instead of goto where possible - Minor fixes Changes from v1: - GPIO references are replaced by pinctrl - DT parsing is done mostly by mmc_of_parse() - Use of_match_device() for DT matching - A few minor changes .../devicetree/bindings/mmc/sdhci-msm.txt | 71 +++ drivers/mmc/host/Kconfig | 13 + drivers/mmc/host/Makefile |1 + drivers/mmc/host/sdhci-msm.c | 660 4 files changed, 745 insertions(+) create mode 100644 Documentation/devicetree/bindings/mmc/sdhci-msm.txt create mode 100644 drivers/mmc/host/sdhci-msm.c diff --git a/Documentation/devicetree/bindings/mmc/sdhci-msm.txt b/Documentation/devicetree/bindings/mmc/sdhci-msm.txt new file mode 100644 index 000..ee112da --- /dev/null +++ b/Documentation/devicetree/bindings/mmc/sdhci-msm.txt @@ -0,0 +1,71 @@ +* Qualcomm SDHCI controller (sdhci-msm) + +This file documents differences between the core properties in mmc.txt +and the properties used by the sdhci-msm driver. 
+ +Required properties: +- compatible: should be qcom,sdhci-msm +- reg: should contain SDHC, SD Core register map +- reg-names: indicates various resources passed to driver (via reg proptery) by name + reg-names examples are hc_mem and core_mem +- interrupts: should contain SDHC interrupts +- interrupt-names: indicates interrupts passed to driver (via interrupts property) by name + interrupt-names examples are hc_irq and pwr_irq +- supply-name-supply: phandle to the regulator device tree node + supply-name examples are vdd and vdd-io +- pinctrl-names: Should contain only one value - default. +- pinctrl-0: Should specify pin control groups used for this controller. +- clocks: phandles to clock instances of the device tree nodes +- clock-names: + iface: Main peripheral bus clock (PCLK/HCLK - AHB Bus clock) (required) + core: SDC MMC clock (MCLK) (required) + bus: SDCC bus voter clock (optional) + +Optional properties: +- qcom,bus-speed-mode - specifies supported bus speed modes by host + The supported bus speed modes are : + HS200_1p8v - indicates that host can support HS200 at 1.8v + HS200_1p2v - indicates that host can support HS200 at 1.2v + DDR_1p8v - indicates that host can support DDR mode at 1.8v + DDR_1p2v - indicates that host can support DDR mode at 1.2v + +In the following, supply can be vdd (flash core voltage) or vdd-io (I/O voltage). +- qcom,supply-always-on - specifies whether supply should be kept on always. +- qcom,supply-lpm-sup - specifies whether supply can be kept in low power mode (lpm). +- qcom,supply-voltage-level - specifies voltage levels for supply. Should be +specified in pairs (min, max), units uV. +- qcom,supply-current-level - specifies load levels for supply in lpm or high power mode + (hpm). Should be specified in pairs (lpm, hpm), units uA. 
+ +Example: + + aliases { + sdhc1 = sdhc_1; + }; + + sdhc_1: qcom,sdhc@f9824900 { + compatible = qcom,sdhci-msm; + reg = 0xf9824900 0x11c, 0xf9824000 0x800; + reg-names = hc_mem, core_mem; + interrupts = 0 123 0, 0 138 0; + interrupt-names = hc_irq, pwr_irq; + bus-width = 4; + non-removable; + + vdd-supply = pm8941_l21; + vdd-io-supply = pm8941_l13; + qcom,vdd-voltage-level = 295 295; + qcom,vdd-current-level = 9000 80; + qcom,vdd-io-always-on; + qcom,vdd-io-lpm-sup; + qcom,vdd-io-voltage-level = 180 295; + qcom,vdd-io-current-level = 6 22000; + qcom,bus-speed-mode = HS200_1p8v, DDR_1p8v; + + pinctrl-names = default; + pinctrl-0 = sdc1_clk sdc1_cmd sdc1_data; + + clocks = iface, core, bus;
Re: [BUG] kernel panic during shutdown in run_timer_softirq()
On Fri, Sep 13, 2013 at 12:01:56PM +0200, Knut Petersen wrote:
> Hi everybody!
>
> Since about July I observe occasional kernel panics happening only during system shutdown on two systems.
>
> Hardware: mobos: both AOpen i915GMm-hfs mobos, cpus: Pentium-M Dothan / Banias, mem: 2GB
>
> Although the stack traces differ, there is one thing in common: CS:EIP is in run_timer_softirq().
>
> A first report was posted to lkml on 2013-07-10:
> http://www.gossamer-threads.com/lists/linux/kernel/1744892?#1744892
> jpg of the stack trace:
> http://www.gossamer-threads.com/lists/engine?do=post_attachment;postatt_id=57017;list=linux
>
> There were several reports of kernel panics with run_timer_softirq() in the call_trace.
>
> @Grant: You reported a kernel panic during shutdown for kernel 3.7.5, but the jpg is not accessible. Could you please verify if your report is related?
>
> The problem is still present in yesterday's git master, see attached stacktrace.
>
> A shortened transcription of that stacktrace:
>
> Bug: unable to handle kernel paging request at ...
> [...]
> Call Trace:
>   __do_softirq+[...]
>   irq_exit+[...]
>   smp_apic_timer_interrupt+[...]
>   apic_timer_interrupt+[...]
>   ?default_idle+[...]
>   ?clocksource_mark_unstable+[...]
>   ?default_idle+[...]
>   arch_cpu_idle+[...]
>   cpu_startup_entry+[...]
>   rest_init+[...]
>   start_kernel+[...]
>   i386_start_kernel+[...]
> [...]
> EIP: run_timer_softirq+[...]
> Kernel panic - not syncing: Fatal exception in interrupt
>
> Any idea how to debug that problem? Insert some debugging code and constantly rebooting one of the machines would be an option, but I don't have a clue what to insert

Adding Thomas, in case he has a clue.

Thanks.
Re: [PATCH] hwrng: via-rng: Mark device ID table as __maybe_unused
On Thu, Sep 05, 2013 at 12:46:12AM +0100, Ben Hutchings wrote:
> It is only used in modular builds.
>
> Reported-by: kbuild test robot fengguang...@intel.com
> Signed-off-by: Ben Hutchings b...@decadent.org.uk

Patch applied. Thanks.
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: Fwd: ARC Fix for 3.12-rc1
On Fri, Sep 13, 2013 at 01:19:18PM +0530, Vineet Gupta wrote: Hi Greg, Please consider for stable 3.11.x. Mainline commit c3567f8a359b7917dcffa442301f88ed0a75211f Thanks, will do after 3.11.1 is out. greg k-h
Re: [ANN] Ubuntu PPA for bcache-tools and blocks
> Is it possible to create a bcache-enabled kernel for raring? I like autoupdating and packages created the standard way, so that I can install the source package and headers and build dkms modules against it.

A backported dkms module isn't feasible; bcache comes with work on the block layer. Raring users should use the kernels I've linked:

Backported Ubuntu kernels are available at
http://kernel.ubuntu.com/~kernel-ppa/mainline/
Latest is this one
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11-saucy/
And there's a bit of docs here:
https://wiki.ubuntu.com/Kernel/MainlineBuilds

Other options for raring users:
upgrade to saucy http://askubuntu.com/questions/12909/
Or build your own kernel https://wiki.ubuntu.com/KernelTeam/GitKernelBuild
Re: increased vmap_area_lock contentions on n_tty: Move buffers into n_tty_data
On Fri, Sep 13, 2013 at 05:55:47AM -0400, Peter Hurley wrote: On 09/12/2013 11:44 PM, Greg KH wrote: On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote: On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote: On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote: Hi Peter, FYI, we noticed much increased vmap_area_lock contentions since this commit: What does that mean? What is happening, are we allocating/removing more memory now? No. Same amount of memory, allocated and freed with the same frequency as before. Ok, I thought so. What type of load were you running that showed this problem? The increased contentions and lock hold/wait time showed up in a number of test cases. [...] That's a lot of slowdowns, especially for such a simple patch. Peter, any ideas? Looks like this patch incidentally triggers some worst-case behavior in the memory manager. I'm not sure how this is possible with two 4k buffers, but the evidence is substantial. This patch isn't critical so I suggest we back out this patch for mainline but use the patch to find out what's wrong in the vmap area. Unfortunately, I'm on my way out the door and won't be back til Sunday pm (EST) so I'll get a revert to you then. Sorry 'bout that. No rush, we have plenty of time. I think it would be good to track down the real root cause of this, so that the allocators don't run into this problem again with some other innocuous change. thanks, greg k-h Regards, Peter Hurley -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2] USB: EHCI: make ehci-w90X900 a separate driver
Separate the W90X900(W90P910) on-chip host controller driver from ehci-hcd host code so that it can be built as a separate driver module. This work is part of enabling multi-platform kernels on ARM; however, note that other changes are still needed before W90X900(W90P910) can be booted with a multi-platform kernel and an ehci driver that only works on one of them. With the infrastructure added by Alan Stern in patch 3e0232039 USB: EHCI: prepare to make ehci-hcd a library module, we can avoid this problem by turning a bus glue into a separate module, as we do here for the w90X900 bus glue. Signed-off-by: Manjunath Goudar manjunath.gou...@linaro.org Signed-off-by: Deepak Saxena dsax...@linaro.org Acked-by: Arnd Bergmann a...@arndb.de Acked-by: Wan ZongShun mcuos@gmail.com Acked-by: Alan Stern st...@rowland.harvard.edu Cc: Greg KH g...@kroah.com Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org V1-V2: -Arranged #include's in alphabetical order. -Replaced w90p910 by w90x900 because it is supports all series of w90x900. --- drivers/usb/host/Kconfig|2 +- drivers/usb/host/Makefile |1 + drivers/usb/host/ehci-hcd.c |5 --- drivers/usb/host/ehci-w90x900.c | 89 --- 4 files changed, 39 insertions(+), 58 deletions(-) diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig index 8ea1afc..4e298a4 100644 --- a/drivers/usb/host/Kconfig +++ b/drivers/usb/host/Kconfig @@ -224,7 +224,7 @@ config USB_EHCI_MV on-chip EHCI USB controller for those. 
config USB_W90X900_EHCI - bool W90X900(W90P910) EHCI support + tristate W90X900(W90P910) EHCI support depends on ARCH_W90X900 ---help--- Enables support for the W90X900 USB controller diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile index 8fcb8da..0b9fdee 100644 --- a/drivers/usb/host/Makefile +++ b/drivers/usb/host/Makefile @@ -38,6 +38,7 @@ obj-$(CONFIG_USB_EHCI_S5P)+= ehci-s5p.o obj-$(CONFIG_USB_EHCI_HCD_AT91) += ehci-atmel.o obj-$(CONFIG_USB_EHCI_MSM) += ehci-msm.o obj-$(CONFIG_USB_EHCI_TEGRA) += ehci-tegra.o +obj-$(CONFIG_USB_W90X900_EHCI) += ehci-w90x900.o obj-$(CONFIG_USB_OXU210HP_HCD) += oxu210hp-hcd.o obj-$(CONFIG_USB_ISP116X_HCD) += isp116x-hcd.o diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c index 5d6022f..3e3ca83 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -1238,11 +1238,6 @@ MODULE_LICENSE (GPL); #define XILINX_OF_PLATFORM_DRIVER ehci_hcd_xilinx_of_driver #endif -#ifdef CONFIG_USB_W90X900_EHCI -#include ehci-w90x900.c -#definePLATFORM_DRIVER ehci_hcd_w90x900_driver -#endif - #ifdef CONFIG_USB_OCTEON_EHCI #include ehci-octeon.c #define PLATFORM_DRIVERehci_octeon_driver diff --git a/drivers/usb/host/ehci-w90x900.c b/drivers/usb/host/ehci-w90x900.c index 1c370df..cdad843 100644 --- a/drivers/usb/host/ehci-w90x900.c +++ b/drivers/usb/host/ehci-w90x900.c @@ -11,13 +11,28 @@ * */ +#include linux/dma-mapping.h +#include linux/io.h +#include linux/kernel.h +#include linux/module.h +#include linux/of.h #include linux/platform_device.h +#include linux/usb.h +#include linux/usb/hcd.h + +#include ehci.h /* enable phy0 and phy1 for w90p910 */ #defineENPHY (0x018) #define PHY0_CTR (0xA4) #define PHY1_CTR (0xA8) +#define DRIVER_DESC EHCI w90x900 driver + +static const char hcd_name[] = ehci-w90x900 ; + +static struct hc_driver __read_mostly ehci_w90x900_hc_driver; + static int usb_w90x900_probe(const struct hc_driver *driver, struct platform_device *pdev) { @@ -90,8 +105,8 @@ err1: return 
retval; } -static -void usb_w90x900_remove(struct usb_hcd *hcd, struct platform_device *pdev) +static void usb_w90x900_remove(struct usb_hcd *hcd, + struct platform_device *pdev) { usb_remove_hcd(hcd); iounmap(hcd-regs); @@ -99,54 +114,6 @@ void usb_w90x900_remove(struct usb_hcd *hcd, struct platform_device *pdev) usb_put_hcd(hcd); } -static const struct hc_driver ehci_w90x900_hc_driver = { - .description = hcd_name, - .product_desc = Nuvoton w90x900 EHCI Host Controller, - .hcd_priv_size = sizeof(struct ehci_hcd), - - /* -* generic hardware linkage -*/ - .irq = ehci_irq, - .flags = HCD_USB2|HCD_MEMORY|HCD_BH, - - /* -* basic lifecycle operations -*/ - .reset = ehci_setup, - .start = ehci_run, - - .stop = ehci_stop, - .shutdown = ehci_shutdown, - - /* -* managing i/o requests and associated device resources -*/ - .urb_enqueue = ehci_urb_enqueue, - .urb_dequeue = ehci_urb_dequeue, - .endpoint_disable = ehci_endpoint_disable, - .endpoint_reset = ehci_endpoint_reset, - - /* -* scheduling support -*/ -
Re: [00/23] 3.4.62-stable review
On Thu, Sep 12, 2013 at 10:04:37PM -0700, Guenter Roeck wrote:
> Odd, the 00/23 mail for 3.4.62 doesn't show up on lkml. So this mail will most likely show up as reply to 01/23.
>
> Anyway, here are my build results for 3.4.62:
>
> total: 103 pass: 89 skipped: 10 fail: 4
>
> More configurations (added two crisv32 as well as several arm builds), one less failure (m32r:defconfig now builds).

That's looking better, thanks for testing and letting me know.

greg k-h
Re: [GIT PULL] Btrfs
On Fri, Sep 13, 2013 at 2:15 PM, Russell King r...@arm.linux.org.uk wrote:
> On Fri, Sep 13, 2013 at 07:53:21AM -0400, Josh Boyer wrote:
>> I'm not an ARM expert, so I don't know if ARM should use the asm-generic
>> implementations, or just use __get_user/__put_user in all cases. I've CC'd rmk.
>
> Why do we have uaccess-unaligned.h? Normally, these kinds of things are spawned by architectures which have problems with unaligned accesses, ARM being one of them, but afaik we've never needed this. With the kernel-side trapping of unaligned accesses on older hardware, we've always dealt with the normal accessor faulting.
>
> From what I can tell in the git history, these unaligned put_user and get_user have existed all the way back to the dawn of git use. Can someone enlighten me why we have them?

You removed the answer when trimming the quoted part:

| Btrfs is the first user of __put_user_unaligned() outside the compat code,

__put_user_unaligned() is used in fs/compat.c, presumably because alignment restrictions may differ between 32- and 64-bit versions of the same CPU family.

No one seems to actually use __get_user_unaligned().

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
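
As an aside on what an "unaligned" accessor has to guarantee, here is a minimal userspace sketch in C. It is not the kernel's __put_user_unaligned() implementation; the helper names store_u64_unaligned() and load_u64_unaligned() are hypothetical, and the sketch only assumes that copying byte-wise through memcpy() is a portable way to touch a 64-bit value at an address with no alignment guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helpers, for illustration only.
 * Store a 64-bit value at an address that may not be 8-byte aligned.
 * memcpy() makes no alignment assumption about either pointer, so the
 * compiler is free to emit byte accesses where the CPU requires them.
 */
static void store_u64_unaligned(void *dst, uint64_t val)
{
	assert(dst != NULL);
	memcpy(dst, &val, sizeof(val));
}

/* Load a 64-bit value from an address that may not be 8-byte aligned. */
static uint64_t load_u64_unaligned(const void *src)
{
	uint64_t val;

	assert(src != NULL);
	memcpy(&val, src, sizeof(val));
	return val;
}
```

Calling store_u64_unaligned(buf + 1, v) on a byte buffer and reading the value back with load_u64_unaligned(buf + 1) round-trips v without ever dereferencing a misaligned uint64_t pointer, which is the property a plain put_user-style access cannot promise on every architecture.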
Re: [RFC PATCH 3/3] kvm: Add VFIO device for handling IOMMU cache coherency
On Thu, Sep 12, 2013 at 03:23:15PM -0600, Alex Williamson wrote: So far we've succeeded at making KVM and VFIO mostly unaware of each other, but there's any important point where that breaks down. Intel VT-d hardware may or may not support snoop control. When snoop control is available, intel-iommu promotes No-Snoop transactions on PCIe to be cache coherent. That allows KVM to handle things like the x86 WBINVD opcode as a nop. When the hardware does not support this, KVM must implement a hardware visible WBINVD for the guest. We could simply let userspace tell KVM how to handle WBINVD, but it's privileged for a reason. Allowing an arbitrary user to enable physical WBINVD gives them a more access to the hardware. Previously, this has only been enabled for guests supporting legacy PCI device assignment. In such cases it's necessary for proper guest execution. We therefore create a new KVM-VFIO virtual device. The user can add and remove VFIO groups to this device via file descriptors. KVM makes use of the VFIO external user interface to validate that the user has access to physical hardware and gets the coherency state of the IOMMU from VFIO. This provides equivalent functionality to legacy KVM assignment, while keeping (nearly) all the bits isolated. So how is the isolation handled then? How is this better than a ioctl to grant WBINVD to guest? kvm char device can be opened by any user, so any user can grant itself these priveledges. What did I miss? The one intrusion is the resulting flag indicating the coherency state. For this RFC it's placed on the x86 kvm_arch struct, however I know POWER has interest in using the VFIO external user interface, and I'm hoping we can share a common KVM-VFIO device. Perhaps they care about No-Snoop handling as well or the code can be #ifdef'd. 
Signed-off-by: Alex Williamson alex.william...@redhat.com --- Documentation/virtual/kvm/devices/vfio.txt | 22 +++ arch/x86/include/asm/kvm_host.h|1 arch/x86/kvm/Makefile |2 arch/x86/kvm/vmx.c |5 - arch/x86/kvm/x86.c |5 - include/linux/kvm_host.h |1 include/uapi/linux/kvm.h |4 virt/kvm/kvm_main.c|3 virt/kvm/vfio.c| 237 9 files changed, 275 insertions(+), 5 deletions(-) create mode 100644 Documentation/virtual/kvm/devices/vfio.txt create mode 100644 virt/kvm/vfio.c diff --git a/Documentation/virtual/kvm/devices/vfio.txt b/Documentation/virtual/kvm/devices/vfio.txt new file mode 100644 index 000..831e6a6 --- /dev/null +++ b/Documentation/virtual/kvm/devices/vfio.txt @@ -0,0 +1,22 @@ +VFIO virtual device +=== + +Device types supported: + KVM_DEV_TYPE_VFIO + +Only one VFIO instance may be created per VM. The created device +tracks VFIO groups in use by the VM and features of those groups +important to the correctness and acceleration of the VM. As groups +are enabled and disabled for use by the VM, KVM should be updated +about their presence. When registered with KVM, a reference to the +VFIO-group is held by KVM. + +Groups: + KVM_DEV_VFIO_ADD_GROUP + KVM_DEV_VFIO_DEL_GROUP + +Each takes a int32_t file descriptor for kvm_device_attr.addr and +does not support any group device kvm_device_attr.attr. + +RFC - Should we use Group KVM_DEV_VFIO_GROUP with Attributes + KVM_DEV_VFIO_GROUP_ADD DEL? 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c76ff74..5b9350d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -588,6 +588,7 @@ struct kvm_arch { spinlock_t pvclock_gtod_sync_lock; bool use_master_clock; + bool vfio_noncoherent; u64 master_kernel_ns; cycle_t master_cycle_now; diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index bf4fb04..25d22b2 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -9,7 +9,7 @@ KVM := ../../../virt/kvm kvm-y+= $(KVM)/kvm_main.o $(KVM)/ioapic.o \ $(KVM)/coalesced_mmio.o $(KVM)/irq_comm.o \ - $(KVM)/eventfd.o $(KVM)/irqchip.o + $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += $(KVM)/assigned-dev.o $(KVM)/iommu.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 1f1da43..94f7786 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -7395,8 +7395,9 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) */ if (is_mmio) ret = MTRR_TYPE_UNCACHABLE VMX_EPT_MT_EPTE_SHIFT; - else if (vcpu-kvm-arch.iommu_domain -
[GIT] kconfig fix for v3.12-rc1
Hi Linus,

there is a fix for a regression caused by my previous pull request. A sed command in scripts/config that used colons as separator was accidentally changed to use slashes, which fails when you use slashes in a value. Changing it back to colons is of course not a proper fix, but at least it will be broken in the same way it had been for four years. A proper fix is pending.

Michal

The following changes since commit 5b4197845ad1a33bc57da7ee5ea41de58c2f86bf:

  Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild (2013-09-11 08:34:25 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git kconfig

Clement Chauplannaz (1):
      scripts/config: fix variable substitution command

 scripts/config |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
Re: [PATCH 0/4] perf tools: New comm infrastructure
On Thu, Sep 12, 2013 at 10:36:58PM +0200, Ingo Molnar wrote: * Frederic Weisbecker fweis...@gmail.com wrote: The way we handle hists sorted by comm is to first gather them by tid then in the end merge/collapse hists that end up with the same comm. But merging hists has shown some performances issues, especially with callchain where the operation can be very heavy. So this new comm infrastructure aims at removing comm collapses. It brings two features: 1) Keep track of comms lifecycle by storing timestamps when the comms are set. This way we can map the precise comm to any thread:time couple. This only works if the PERF_SAMPLE_ID comes along comm and fork events, otherwise we only track the latest comm set for a thread. This can provide us more precise comm sorted hists by distinguishing pre and post exec timeframes into seperate hists for a single thread. Note that although the comm infrastructure is ready to do this, I haven't yet made the perf tools support that. It's a TODO entry. 2) Allocate comms only once instead of duplicating them for all threads sharing a same one. Two threads having the same comm should now point to the same string. As a result we can compare hists thread comm by address. The big upside is that we can now live sort comm hists instead of collapsing them in the end of the processing. I've seen very nice performance results on perf report. Roughly a 1.5x to 2x on perf report default stdio output with callchains. You can try this branch: git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git perf/comm May be merging that with Namhyung callchains patches could provide some cumulative nice results. It would be nice to try Linus's testcase, which is, in essence a kernel build profile: make defconfig perf record -g make -j64 bzImage and to make sure that it can analyze the data in same, non-annoying runtimes. What I saw was 30 minutes of runtime - a 2x improvement is not nearly enough, 15 minutes is still an eternity. 
I doubt we can reach anything near non-annoying runtimes after recording all the callchains of a whole kernel build with perf record. My patches and Namhyung's should improve the comm situation a lot, but we can't work miracles. The only way would perhaps be to limit the depth of the callchain branches.

Now maybe we can find other big contention points in perf. It's possible we also have some endless loop somewhere.
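
To illustrate the second feature from the quoted cover letter (one shared string per distinct comm, so that two comms can be compared by address), here is a minimal string-interning sketch in C. It is not the perf code: comm_intern(), comm_table and MAX_COMMS are hypothetical names, and a small linear table stands in for whatever lookup structure a real implementation would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COMMS 64	/* hypothetical fixed capacity, for brevity */

static char *comm_table[MAX_COMMS];
static size_t comm_count;

/*
 * Return the single shared copy of 'name', allocating it on first use.
 * Every caller asking for the same comm gets the same pointer back.
 * Returns NULL if the table is full or the allocation fails.
 */
static const char *comm_intern(const char *name)
{
	size_t i, len;
	char *copy;

	assert(name != NULL);

	/* Already interned? Hand back the existing copy. */
	for (i = 0; i < comm_count; i++)
		if (strcmp(comm_table[i], name) == 0)
			return comm_table[i];

	if (comm_count == MAX_COMMS)
		return NULL;

	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return NULL;
	memcpy(copy, name, len);
	comm_table[comm_count++] = copy;
	return copy;
}
```

With every thread holding an interned pointer, "same comm" becomes a pointer equality test instead of a strcmp(), which is what lets entries be sorted by comm as they arrive rather than merged afterwards.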
Re: [PATCH 0/4] perf tools: New comm infrastructure
On Fri, Sep 13, 2013 at 03:32:34PM +0900, Namhyung Kim wrote: Hi Frederic, On Thu, 12 Sep 2013 22:29:39 +0200, Frederic Weisbecker wrote: The way we handle hists sorted by comm is to first gather them by tid then in the end merge/collapse hists that end up with the same comm. But merging hists has shown some performances issues, especially with callchain where the operation can be very heavy. So this new comm infrastructure aims at removing comm collapses. It brings two features: 1) Keep track of comms lifecycle by storing timestamps when the comms are set. This way we can map the precise comm to any thread:time couple. This only works if the PERF_SAMPLE_ID comes along comm and fork events, otherwise we only track the latest comm set for a thread. This can provide us more precise comm sorted hists by distinguishing pre and post exec timeframes into seperate hists for a single thread. Note that although the comm infrastructure is ready to do this, I haven't yet made the perf tools support that. It's a TODO entry. 2) Allocate comms only once instead of duplicating them for all threads sharing a same one. Two threads having the same comm should now point to the same string. As a result we can compare hists thread comm by address. The big upside is that we can now live sort comm hists instead of collapsing them in the end of the processing. I've seen very nice performance results on perf report. Roughly a 1.5x to 2x on perf report default stdio output with callchains. You can try this branch: git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git perf/comm May be merging that with Namhyung callchains patches could provide some cumulative nice results. 
> I got this:
>
> ui/browsers/hists.c: In function ‘hists__browser_title’:
> ui/browsers/hists.c:1258:10: error: passing argument 1 of ‘thread__comm_curr’ discards ‘const’ qualifier from pointer target type [-Werror]
> In file included from ui/browsers/../../util/sort.h:24:0,
>                  from ui/browsers/hists.c:11:
> ui/browsers/../../util/thread.h:39:13: note: expected ‘struct thread *’ but argument is of type ‘const struct thread *’
> ui/browsers/hists.c: In function ‘perf_evsel__hists_browse’:
> ui/browsers/hists.c:1581:9: error: passing argument 1 of ‘thread__comm_curr’ discards ‘const’ qualifier from pointer target type [-Werror]
> In file included from ui/browsers/../../util/sort.h:24:0,
>                  from ui/browsers/hists.c:11:
> ui/browsers/../../util/thread.h:39:13: note: expected ‘struct thread *’ but argument is of type ‘const struct thread *’
> ui/browsers/hists.c:1704:10: error: passing argument 1 of ‘thread__comm_curr’ discards ‘const’ qualifier from pointer target type [-Werror]
> In file included from ui/browsers/../../util/sort.h:24:0,
>                  from ui/browsers/hists.c:11:
> ui/browsers/../../util/thread.h:39:13: note: expected ‘struct thread *’ but argument is of type ‘const struct thread *’
> cc1: all warnings being treated as errors
> make: *** [ui/browsers/hists.o] Error 1
> make: *** Waiting for unfinished jobs

Oops, I'm missing the libs to build the ui, so I didn't see this. Will fix, thanks!

> Thanks,
> Namhyung
[patch 3/4 v2] skd: use memcpy() as a cleanup
The code here is copying the version to inq.driver_version but we don't want it to be NUL terminated. Instead we pad the rest of the array with spaces. It's fewer lines to use memcpy() and maybe a little nicer.

Signed-off-by: Dan Carpenter dan.carpen...@oracle.com
---
v2: use memcpy() instead of strncpy()

diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c
index f892d95..20ad843 100644
--- a/drivers/block/skd_main.c
+++ b/drivers/block/skd_main.c
@@ -2899,9 +2899,7 @@ static void skd_do_inq_page_da(struct skd_device *skdev,
 				volatile struct fit_comp_error_info *skerr,
 				uint8_t *cdb, uint8_t *buf)
 {
-	unsigned ver_byte;
 	unsigned max_bytes;
-	char *ver = DRV_VER_COMPL;
 	struct driver_inquiry_data inq;
 	u16 val;
 
@@ -2945,12 +2943,8 @@ static void skd_do_inq_page_da(struct skd_device *skdev,
 	/* Driver version, fixed lenth, padded with spaces on the right */
 	inq.driver_version_length = sizeof(inq.driver_version);
 	memset(inq.driver_version, ' ', sizeof(inq.driver_version));
-	for (ver_byte = 0; ver_byte < sizeof(inq.driver_version); ver_byte++) {
-		if (ver[ver_byte] != 0)
-			inq.driver_version[ver_byte] = ver[ver_byte];
-		else
-			break;
-	}
+	memcpy(inq.driver_version, DRV_VER_COMPL,
+	       min(sizeof(inq.driver_version), strlen(DRV_VER_COMPL)));
 
 	inq.page_length = cpu_to_be16((sizeof(inq) - 4));
 
--
To unsubscribe from this list: send the line unsubscribe kernel-janitors in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
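
The pattern in the patch above (pre-fill a fixed-width field with spaces, then copy at most the field width and never write a NUL) can be tried outside the driver with a small standalone sketch. The helper name fill_padded_field() is hypothetical and is not part of skd_main.c; the kernel's min() macro is replaced by a plain conditional so the code builds as ordinary C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Fill a fixed-width field with 'src', left-justified and padded with
 * spaces on the right. The field is deliberately NOT NUL terminated;
 * a source longer than the field is truncated to field_len bytes.
 */
static void fill_padded_field(char *field, size_t field_len, const char *src)
{
	size_t src_len;

	assert(field != NULL && src != NULL);

	src_len = strlen(src);
	memset(field, ' ', field_len);
	memcpy(field, src, src_len < field_len ? src_len : field_len);
}
```

For an 8-byte field, the string "1.2" ends up as "1.2" followed by five spaces, and "1.2.3.4.5.6" is cut to "1.2.3.4."; in both cases exactly 8 bytes are written. Since no terminator is stored, consumers have to use the field length rather than strlen() on the result.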