Re: AGP bogosities
Jesse Barnes writes:

> I have a system in my office with several gfx pipes on different AGP busses,
> and I'd like that to work well too! :)

Interesting, could you post the output from lspci -v on that system?

What is the relationship in the PCI device tree between the video
cards and their bridges? Is there for instance only one AGP bridge
per host bridge?

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
AGP bogosities
Linus,

I see that you did a cset -x on a changeset from Dave Jones that added
a bogus test for which AGP bridge a device is under. That has left us
with code in agp_collect_device_status that will never find any device
(just take a look and you'll see).

In fact there are other bogosities in drivers/char/agp/generic.c. I
can't believe Dave ever tested that code with an AGP 3.0 device. If
you pass in a mode that has the AGP 3.0 bit set, agp_v3_parse_one()
will first clear that bit (and print a message), and then complain
because you haven't got that bit set in the mode, with a message that
the caller is broken. Furthermore, if the mode passed in has both the
4x and 8x bits set, the new code will give you 4x where the old code
would give you 8x (which is what the caller wanted).

The patch below fixes these problems. It will work in the 99.99% of
cases where we have one AGP bridge and one AGP video card. We should
eventually cope with multiple AGP bridges, but doing the matching of
bridges to video cards is a hard problem because the video card is not
necessarily a child or sibling of the PCI device that we use for
controlling the AGP bridge. I think we need to see an actual example
of a system with multiple AGP bridges first.

Oh, and by the way, I have 3D working relatively well on my G5 with a
64-bit kernel (and 32-bit X server and clients), which is why I care
about AGP 3.0 support. :)

Paul.
diff -urN linux-2.5/drivers/char/agp/agp.h g5-bad/drivers/char/agp/agp.h
--- linux-2.5/drivers/char/agp/agp.h	2005-03-07 14:01:44.0 +1100
+++ g5/drivers/char/agp/agp.h	2005-03-11 11:54:54.0 +1100
@@ -322,7 +322,7 @@
 #define AGPCTRL_GTLBEN (1<<7)
 
 #define AGP2_RESERVED_MASK 0x00fffcc8
-#define AGP3_RESERVED_MASK 0x00ff00cc
+#define AGP3_RESERVED_MASK 0x00ff00c4
 
 #define AGP_ERRATA_FASTWRITES 1<<0
 #define AGP_ERRATA_SBA 1<<1
diff -urN linux-2.5/drivers/char/agp/generic.c g5-bad/drivers/char/agp/generic.c
--- linux-2.5/drivers/char/agp/generic.c	2005-03-11 11:47:37.0 +1100
+++ g5/drivers/char/agp/generic.c	2005-03-11 12:08:29.0 +1100
@@ -515,13 +515,9 @@
 		printk (KERN_INFO PFX "%s tried to set rate=x0. Setting to AGP3 x4 mode.\n", current->comm);
 		*requested_mode |= AGPSTAT3_4X;
 	}
-	if (tmp == 3) {
-		printk (KERN_INFO PFX "%s tried to set rate=x3. Setting to AGP3 x4 mode.\n", current->comm);
-		*requested_mode |= AGPSTAT3_4X;
-	}
-	if (tmp >3) {
-		printk (KERN_INFO PFX "%s tried to set rate=x%d. Setting to AGP3 x8 mode.\n", current->comm, tmp);
-		*requested_mode |= AGPSTAT3_8X;
+	if (tmp >= 3) {
+		printk (KERN_INFO PFX "%s tried to set rate=x%d. Setting to AGP3 x8 mode.\n", current->comm, tmp * 4);
+		*requested_mode = (*requested_mode & ~7) | AGPSTAT3_8X;
 	}
 
 	/* ARQSZ - Set the value to the maximum one.
@@ -642,11 +638,6 @@
 			return 0;
 		}
 		cap_ptr = pci_find_capability(device, PCI_CAP_ID_AGP);
-		if (!cap_ptr) {
-			pci_dev_put(device);
-			continue;
-		}
-		cap_ptr = 0;
 	}
 
 	/*
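To make the rate handling in the generic.c hunk concrete, here is a minimal standalone sketch of the replacement logic: a request whose low three rate bits encode 3 or more has the whole rate field cleared and is then given the 8x bit, so a caller that sets both 4x and 8x ends up with 8x. The bit values chosen for AGPSTAT3_4X and AGPSTAT3_8X and the helper name agp3_fixup_rate() are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdint.h>

/* Assumed bit values for the AGP 3.0 rate field (illustrative only). */
#define AGPSTAT3_4X  (1u << 0)
#define AGPSTAT3_8X  (1u << 1)

/*
 * Hypothetical helper mirroring the patched rate handling: inspect the low
 * three bits of the requested mode and, if they encode 3 or more, replace
 * the whole rate field with the 8x bit.
 */
static uint32_t agp3_fixup_rate(uint32_t requested_mode)
{
	uint32_t tmp = requested_mode & 7;

	if (tmp == 0)
		requested_mode |= AGPSTAT3_4X;	/* no rate requested: fall back to 4x */
	if (tmp >= 3)
		requested_mode = (requested_mode & ~7u) | AGPSTAT3_8X;
	return requested_mode;
}
```

Bits above the rate field pass through untouched, which is why the sketch masks with ~7 rather than assigning the 8x bit directly.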
[PATCH] PPC64 NUMA memory fixup
This patch is from Mike Kravetz <[EMAIL PROTECTED]>.

When I booted my new 720 on a kernel configured for NUMA, I received
the following during bootup:

WARNING: Unexpected node layout: region start 4400 length 200
NUMA is disabled

This is due to memory 'holes' within nodes. If such holes are
encountered, then NUMA is disabled. The following patch adds support
for such configurations. My 720 now boots with the following message:

[boot]0012 Setup Arch
Node 0 Memory: 0x0-0x800 0x4400-0x12a00
Node 1 Memory: 0x800-0x4400 0x12a00-0x1ea00

Signed-off-by: Mike Kravetz <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -Naupr linux-2.6.11-rc3/arch/ppc64/mm/numa.c linux-2.6.11-rc3.work/arch/ppc64/mm/numa.c
--- linux-2.6.11-rc3/arch/ppc64/mm/numa.c	2005-02-03 01:57:16.0 +
+++ linux-2.6.11-rc3.work/arch/ppc64/mm/numa.c	2005-03-01 19:39:21.0 +
@@ -40,7 +40,6 @@ int nr_cpus_in_node[MAX_NUMNODES] = { [0
 
 struct pglist_data *node_data[MAX_NUMNODES];
 bootmem_data_t __initdata plat_node_bdata[MAX_NUMNODES];
-static unsigned long node0_io_hole_size;
 static int min_common_depth;
 
 /*
@@ -49,7 +48,8 @@ static int min_common_depth;
  */
 static struct {
 	unsigned long node_start_pfn;
-	unsigned long node_spanned_pages;
+	unsigned long node_end_pfn;
+	unsigned long node_present_pages;
 } init_node_data[MAX_NUMNODES] __initdata;
 
 EXPORT_SYMBOL(node_data);
@@ -348,33 +348,28 @@ new_range:
 		if (max_domain < numa_domain)
 			max_domain = numa_domain;
 
-		/*
-		 * For backwards compatibility, OF splits the first node
-		 * into two regions (the first being 0-4GB). Check for
-		 * this simple case and complain if there is a gap in
-		 * memory
+		/*
+		 * Initialize new node struct, or add to an existing one.
 		 */
-		if (init_node_data[numa_domain].node_spanned_pages) {
-			unsigned long shouldstart =
-				init_node_data[numa_domain].node_start_pfn +
-				init_node_data[numa_domain].node_spanned_pages;
-			if (shouldstart != (start / PAGE_SIZE)) {
-				/* Revert to non-numa for now */
-				printk(KERN_ERR
-				       "WARNING: Unexpected node layout: "
-				       "region start %lx length %lx\n",
-				       start, size);
-				printk(KERN_ERR "NUMA is disabled\n");
-				goto err;
-			}
-			init_node_data[numa_domain].node_spanned_pages +=
+		if (init_node_data[numa_domain].node_end_pfn) {
+			if ((start / PAGE_SIZE) <
+			    init_node_data[numa_domain].node_start_pfn)
+				init_node_data[numa_domain].node_start_pfn =
+					start / PAGE_SIZE;
+			else
+				init_node_data[numa_domain].node_end_pfn =
+					(start / PAGE_SIZE) +
+					(size / PAGE_SIZE);
+
+			init_node_data[numa_domain].node_present_pages +=
 				size / PAGE_SIZE;
 		} else {
 			node_set_online(numa_domain);
 
 			init_node_data[numa_domain].node_start_pfn =
 				start / PAGE_SIZE;
-			init_node_data[numa_domain].node_spanned_pages =
+			init_node_data[numa_domain].node_end_pfn =
+				init_node_data[numa_domain].node_start_pfn +
 				size / PAGE_SIZE;
 		}
 
@@ -391,14 +386,6 @@ new_range:
 		node_set_online(i);
 
 	return 0;
-err:
-	/* Something has gone wrong; revert any setup we've done */
-	for_each_node(i) {
-		node_set_offline(i);
-		init_node_data[i].node_start_pfn = 0;
-		init_node_data[i].node_spanned_pages = 0;
-	}
-	return -1;
 }
 
 static void __init setup_nonnuma(void)
@@ -426,12 +413,11 @@ static void __init setup_nonnuma(void)
 	node_set_online(0);
 
 	init_node_data[0].node_start_pfn = 0;
-	init_node_data[0].node_spanned_pages = lmb_end_of_DRAM() / PAGE_SIZE;
+	init_node_data[0].node_end_pfn = lmb_end_of_DRAM() / PAGE_SIZE;
+	init_node_data[0].node_present_pages = total_ram / PAGE_SIZE;
 
 	for (i = 0 ; i < top_of_ram; i += MEMORY_INCREMENT)
 		numa_memory_lookup_table[i >> MEMORY_INCREMENT_S
[PATCH] PPC64 fix eeh.h compile warnings
This patch is from Nathan Lynch <[EMAIL PROTECTED]>.

Use static inlines instead of #defines for stub functions when
CONFIG_EEH=n, to eliminate "statement with no effect" warnings with
some toolchains.

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

Index: linux-2.6.11/include/asm-ppc64/eeh.h
===
--- linux-2.6.11.orig/include/asm-ppc64/eeh.h	2005-03-02 07:38:38.0 +
+++ linux-2.6.11/include/asm-ppc64/eeh.h	2005-03-03 01:39:25.0 +
@@ -104,17 +104,30 @@ int eeh_unregister_notifier(struct notif
  */
 #define EEH_IO_ERROR_VALUE(size) (~0U >> ((4 - (size)) * 8))
 
-#else
-#define eeh_init()
-#define eeh_check_failure(token, val) (val)
-#define eeh_dn_check_failure(dn, dev) (0)
-#define pci_addr_cache_build()
-#define eeh_add_device_early(dn)
-#define eeh_add_device_late(dev)
-#define eeh_remove_device(dev)
+#else /* !CONFIG_EEH */
+static inline void eeh_init(void) { }
+
+static inline unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val)
+{
+	return val;
+}
+
+static inline int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev)
+{
+	return 0;
+}
+
+static inline void pci_addr_cache_build(void) { }
+
+static inline void eeh_add_device_early(struct device_node *dn) { }
+
+static inline void eeh_add_device_late(struct pci_dev *dev) { }
+
+static inline void eeh_remove_device(struct pci_dev *dev) { }
+
 #define EEH_POSSIBLE_ERROR(val, type) (0)
 #define EEH_IO_ERROR_VALUE(size) (-1UL)
-#endif
+#endif /* CONFIG_EEH */
 
 /*
  * MMIO read/write operations with EEH support.
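As a rough illustration of why the stubs changed form: a macro stub such as `#define eeh_dn_check_failure(dn, dev) (0)` leaves a bare `(0);` behind when the call is used as a statement, whereas an empty static inline is a real, type-checked call that the compiler can still optimise away. The sketch below uses a hypothetical HAVE_FEATURE switch and made-up function names in place of CONFIG_EEH and the eeh_* functions; it is not the kernel header.

```c
/* Hypothetical feature switch standing in for CONFIG_EEH. */
#ifdef HAVE_FEATURE
void feature_init(void);
int feature_check(int dev);
#else
/* Stubs: arguments are type-checked and each call is a proper expression. */
static inline void feature_init(void) { }

static inline int feature_check(int dev)
{
	(void)dev;	/* unused in the stub */
	return 0;
}
#endif

/* The caller compiles the same way whether or not the feature is built in. */
static int probe_device(int dev)
{
	feature_init();
	return feature_check(dev);
}
```

With HAVE_FEATURE undefined, probe_device() reduces to returning 0 and no warning about an effect-free statement is produced.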
[PATCH] PPC64 call idle_task_exit with irqs disabled
This patch is from Nathan Lynch <[EMAIL PROTECTED]>.

Seeing this very occasionally during cpu hotplug testing:

Badness in slb_flush_and_rebolt at arch/ppc64/mm/slb.c:52
Call Trace:
[c000ef0efbe0] [c00127a0] .__switch_to+0xa4/0xf0 (unreliable)
[c000ef0efc80] [c0050178] .idle_task_exit+0xbc/0x15c
[c000ef0efd10] [c000d108] .cpu_die+0x18/0x68
[c000ef0efd90] [c001023c] .dedicated_idle+0x1fc/0x254
[c000ef0efe80] [c000fc80] .cpu_idle+0x3c/0x54
[c000ef0eff00] [c003aa90] .start_secondary+0x108/0x148
[c000ef0eff90] [c000bd28] .enable_64b_mode+0x0/0x28

idle_task_exit can result in a call to slb_flush_and_rebolt, which
must not be called with interrupts enabled. Make the call with
interrupts disabled.

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

 pSeries_setup.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.11-bk2/arch/ppc64/kernel/pSeries_setup.c
===
--- linux-2.6.11-bk2.orig/arch/ppc64/kernel/pSeries_setup.c	2005-03-07 04:09:29.0 +
+++ linux-2.6.11-bk2/arch/ppc64/kernel/pSeries_setup.c	2005-03-07 04:15:22.0 +
@@ -322,8 +322,8 @@ static void __init pSeries_discover_pic
 
 static void pSeries_mach_cpu_die(void)
 {
-	idle_task_exit();
 	local_irq_disable();
+	idle_task_exit();
 	/* Some hardware requires clearing the CPPR, while other hardware does not
 	 * it is safe either way
 	 */
[PATCH] PPC64 update irq affinity mask when migrating irqs
This patch is from Nathan Lynch <[EMAIL PROTECTED]>.

When offlining a cpu, any device interrupts which are bound to the cpu
have their affinity forcibly reset to all cpus (the default).
However, the value in /proc/irq/XXX/smp_affinity remains unchanged.
Since we're doing this while all the other cpus are stopped, it should
be safe to just call desc->handler->set_affinity and manually update
the irq_affinity array.

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

 xics.c |   11 ++-
 1 files changed, 2 insertions(+), 9 deletions(-)

Index: linux-2.6.11-bk2/arch/ppc64/kernel/xics.c
===
--- linux-2.6.11-bk2.orig/arch/ppc64/kernel/xics.c	2005-03-02 07:38:10.0 +
+++ linux-2.6.11-bk2/arch/ppc64/kernel/xics.c	2005-03-07 03:52:08.0 +
@@ -704,15 +704,8 @@ void xics_migrate_irqs_away(void)
 		       virq, cpu);
 
 		/* Reset affinity to all cpus */
-		xics_status[0] = default_distrib_server;
-
-		status = rtas_call(ibm_set_xive, 3, 1, NULL, irq,
-				xics_status[0], xics_status[1]);
-		if (status)
-			printk(KERN_ERR "migrate_irqs_away: irq=%d "
-					"ibm,set-xive returns %d\n",
-					virq, status);
-
+		desc->handler->set_affinity(virq, CPU_MASK_ALL);
+		irq_affinity[virq] = CPU_MASK_ALL;
 unlock:
 		spin_unlock_irqrestore(&desc->lock, flags);
 	}
[PATCH] PPC64 error code cleanups for rtas wrappers
This patch is from John Rose <[EMAIL PROTECTED]>.

This patch changes the rtas wrapper functions in rtas.c to map RTAS
failure codes to conventional error values. The goal is to make
failure conditions obvious in the wrapper functions and in the caller
code.

Signed-off-by: John Rose <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -puN arch/ppc64/kernel/pSeries_smp.c~01_rtas_rcs arch/ppc64/kernel/pSeries_smp.c
--- 2_6_linus_3/arch/ppc64/kernel/pSeries_smp.c~01_rtas_rcs	2005-03-02 14:50:33.0 -0600
+++ 2_6_linus_3-johnrose/arch/ppc64/kernel/pSeries_smp.c	2005-03-02 14:50:33.0 -0600
@@ -151,7 +151,7 @@ static unsigned int find_physical_cpu_to
 		if (index) {
 			int state;
 			int rc = rtas_get_sensor(9003, *index, &state);
-			if (rc != 0 || state != 1)
+			if (rc < 0 || state != 1)
 				continue;
 		}
 
diff -puN arch/ppc64/kernel/rtas.c~01_rtas_rcs arch/ppc64/kernel/rtas.c
--- 2_6_linus_3/arch/ppc64/kernel/rtas.c~01_rtas_rcs	2005-03-02 14:50:33.0 -0600
+++ 2_6_linus_3-johnrose/arch/ppc64/kernel/rtas.c	2005-03-02 14:50:33.0 -0600
@@ -255,29 +255,59 @@ rtas_extended_busy_delay_time(int status
 	return ms;
 }
 
-int
-rtas_get_power_level(int powerdomain, int *level)
+int rtas_error_rc(int rtas_rc)
+{
+	int rc;
+
+	switch (rtas_rc) {
+	case -1:	/* Hardware Error */
+		rc = -EIO;
+		break;
+	case -3:	/* Bad indicator/domain/etc */
+		rc = -EINVAL;
+		break;
+	case -9000:	/* Isolation error */
+		rc = -EFAULT;
+		break;
+	case -9001:	/* Outstanding TCE/PTE */
+		rc = -EEXIST;
+		break;
+	case -9002:	/* No usable slot */
+		rc = -ENODEV;
+		break;
+	default:
+		printk(KERN_ERR "%s: unexpected RTAS error %d\n",
+				__FUNCTION__, rtas_rc);
+		rc = -ERANGE;
+		break;
+	}
+	return rc;
+}
+
+int rtas_get_power_level(int powerdomain, int *level)
 {
 	int token = rtas_token("get-power-level");
 	int rc;
 
 	if (token == RTAS_UNKNOWN_SERVICE)
-		return RTAS_UNKNOWN_OP;
+		return -ENOENT;
 
 	while ((rc = rtas_call(token, 1, 2, level, powerdomain)) == RTAS_BUSY)
 		udelay(1);
+
+	if (rc < 0)
+		return rtas_error_rc(rc);
 	return rc;
 }
 
-int
-rtas_set_power_level(int powerdomain, int level, int *setlevel)
+int rtas_set_power_level(int powerdomain, int level, int *setlevel)
 {
 	int token = rtas_token("set-power-level");
 	unsigned int wait_time;
 	int rc;
 
 	if (token == RTAS_UNKNOWN_SERVICE)
-		return RTAS_UNKNOWN_OP;
+		return -ENOENT;
 
 	while (1) {
 		rc = rtas_call(token, 2, 2, setlevel, powerdomain, level);
@@ -289,18 +319,20 @@ rtas_set_power_level(int powerdomain, in
 		} else
 			break;
 	}
+
+	if (rc < 0)
+		return rtas_error_rc(rc);
 	return rc;
 }
 
-int
-rtas_get_sensor(int sensor, int index, int *state)
+int rtas_get_sensor(int sensor, int index, int *state)
 {
 	int token = rtas_token("get-sensor-state");
 	unsigned int wait_time;
 	int rc;
 
 	if (token == RTAS_UNKNOWN_SERVICE)
-		return RTAS_UNKNOWN_OP;
+		return -ENOENT;
 
 	while (1) {
 		rc = rtas_call(token, 2, 2, state, sensor, index);
@@ -312,18 +344,20 @@ rtas_get_sensor(int sensor, int index, i
 		} else
 			break;
 	}
+
+	if (rc < 0)
+		return rtas_error_rc(rc);
 	return rc;
 }
 
-int
-rtas_set_indicator(int indicator, int index, int new_value)
+int rtas_set_indicator(int indicator, int index, int new_value)
 {
 	int token = rtas_token("set-indicator");
 	unsigned int wait_time;
 	int rc;
 
 	if (token == RTAS_UNKNOWN_SERVICE)
-		return RTAS_UNKNOWN_OP;
+		return -ENOENT;
 
 	while (1) {
 		rc = rtas_call(token, 3, 1, NULL, indicator, index, new_value);
@@ -337,6 +371,8 @@ rtas_set_indicator(int indicator, int in
 			break;
 	}
 
+	if (rc < 0)
+		return rtas_error_rc(rc);
 	return rc;
 }
 
diff -puN arch/ppc64/kernel/rtasd.c~01_rtas_rcs arch/ppc64/kernel/rtasd.c
--- 2_6_linus_3/arch/ppc64/kernel/r
[PATCH] PPC64 error code cleanups rpa[php,dlpar]
This patch is from John Rose <[EMAIL PROTECTED]>.

This patch changes the RPA PCI Hotplug and DLPAR modules to use more
conventional error values for return codes. The goal is to make
failure conditions obvious in the wrapper functions and in the caller
code.

Signed-off-by: John Rose <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -puN drivers/pci/hotplug/rpaphp.h~02_rpaphp_rcs drivers/pci/hotplug/rpaphp.h
--- 2_6_linus_3/drivers/pci/hotplug/rpaphp.h~02_rpaphp_rcs	2005-03-07 17:52:20.0 -0600
+++ 2_6_linus_3-johnrose/drivers/pci/hotplug/rpaphp.h	2005-03-07 17:52:20.0 -0600
@@ -45,11 +45,6 @@
 #define LED_ID 2 /* slow blinking */
 #define LED_ACTION 3 /* fast blinking */
 
-/* Error status from rtas_get-sensor */
-#define NEED_POWER	-9000	/* slot must be power up and unisolated to get state */
-#define PWR_ONLY	-9001	/* slot must be powerd up to get state, leave isolated */
-#define ERR_SENSE_USE	-9002	/* No DR operation will succeed, slot is unusable */
-
 /* Sensor values from rtas_get-sensor */
 #define EMPTY 0 /* No card in slot */
 #define PRESENT 1 /* Card in slot */
diff -puN drivers/pci/hotplug/rpaphp_core.c~02_rpaphp_rcs drivers/pci/hotplug/rpaphp_core.c
--- 2_6_linus_3/drivers/pci/hotplug/rpaphp_core.c~02_rpaphp_rcs	2005-03-07 17:52:20.0 -0600
+++ 2_6_linus_3-johnrose/drivers/pci/hotplug/rpaphp_core.c	2005-03-07 17:52:20.0 -0600
@@ -256,12 +256,12 @@ int rpaphp_get_drc_props(struct device_n
 	my_index = (int *) get_property(dn, "ibm,my-drc-index", NULL);
 	if (!my_index) {
 		/* Node isn't DLPAR/hotplug capable */
-		return 1;
+		return -EINVAL;
 	}
 
 	rc = get_children_props(dn->parent, &indexes, &names, &types, &domains);
 	if (rc < 0) {
-		return 1;
+		return -EINVAL;
 	}
 
 	name_tmp = (char *) &names[1];
@@ -284,7 +284,7 @@ int rpaphp_get_drc_props(struct device_n
 		type_tmp += (strlen(type_tmp) + 1);
 	}
 
-	return 1;
+	return -EINVAL;
 }
 
 static int is_php_type(char *drc_type)
diff -puN drivers/pci/hotplug/rpaphp_pci.c~02_rpaphp_rcs drivers/pci/hotplug/rpaphp_pci.c
--- 2_6_linus_3/drivers/pci/hotplug/rpaphp_pci.c~02_rpaphp_rcs	2005-03-07 17:52:20.0 -0600
+++ 2_6_linus_3-johnrose/drivers/pci/hotplug/rpaphp_pci.c	2005-03-07 17:52:20.0 -0600
@@ -81,8 +81,8 @@ static int rpaphp_get_sensor_state(struc
 
 	rc = rtas_get_sensor(DR_ENTITY_SENSE, slot->index, state);
 
-	if (rc) {
-		if (rc == NEED_POWER || rc == PWR_ONLY) {
+	if (rc < 0) {
+		if (rc == -EFAULT || rc == -EEXIST) {
 			dbg("%s: slot must be power up to get sensor-state\n",
 			    __FUNCTION__);
 
@@ -91,14 +91,14 @@ static int rpaphp_get_sensor_state(struc
 			 */
 			rc = rtas_set_power_level(slot->power_domain, POWER_ON,
 						  &setlevel);
-			if (rc) {
+			if (rc < 0) {
 				dbg("%s: power on slot[%s] failed rc=%d.\n",
 				    __FUNCTION__, slot->name, rc);
 			} else {
 				rc = rtas_get_sensor(DR_ENTITY_SENSE,
 						     slot->index, state);
 			}
-		} else if (rc == ERR_SENSE_USE)
+		} else if (rc == -ENODEV)
 			info("%s: slot is unusable\n", __FUNCTION__);
 		else
 			err("%s failed to get sensor state\n", __FUNCTION__);
@@ -413,7 +413,7 @@ static int setup_pci_hotplug_slot_info(s
 	if (slot->hotplug_slot->info->adapter_status == NOT_VALID) {
 		err("%s: NOT_VALID: skip dn->full_name=%s\n",
 		    __FUNCTION__, slot->dn->full_name);
-		return -1;
+		return -EINVAL;
 	}
 	return 0;
 }
@@ -426,15 +426,15 @@ static int set_phb_slot_name(struct slot
 
 	dn = slot->dn;
 	if (!dn) {
-		return 1;
+		return -EINVAL;
 	}
 	phb = dn->phb;
 	if (!phb) {
-		return 1;
+		return -EINVAL;
 	}
 	bus = phb->bus;
 	if (!bus) {
-		return 1;
+		return -EINVAL;
 	}
 
 	sprintf(slot->name, "%04x:%02x:%02x.%x", pci_domain_nr(bus),
@@ -448,7 +448,7 @@ static int setup_pci_slot(struct slot *s
 
 	if (slot->type == PHB) {
 		rc = set_phb_slot_name(slot);
-		if (rc) {
+		if (
[PATCH] PPC64 set pci_io_base dynamically if necessary
This patch is from John Rose <[EMAIL PROTECTED]>.

Upon DLPAR addition of a PCI Host Bridge to a system with purely
virtual I/O, set pci_io_base as necessary.

Signed-off-by: John Rose <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc64/kernel/pSeries_pci.c test/arch/ppc64/kernel/pSeries_pci.c
--- linux-2.5/arch/ppc64/kernel/pSeries_pci.c	2005-01-12 18:20:48.0 +1100
+++ test/arch/ppc64/kernel/pSeries_pci.c	2005-03-07 21:04:02.0 +1100
@@ -424,16 +424,18 @@
 	unsigned int root_size_cells = 0;
 	struct pci_controller *phb;
 	struct pci_bus *bus;
+	int primary;
 
 	root_size_cells = prom_n_size_cells(root);
 
+	primary = list_empty(&hose_list);
 	phb = alloc_phb_dynamic(dn, root_size_cells);
 	if (!phb)
 		return NULL;
 
 	pci_process_bridge_OF_ranges(phb, dn);
 
-	pci_setup_phb_io_dynamic(phb);
+	pci_setup_phb_io_dynamic(phb, primary);
 	of_node_put(root);
 
 	pci_devs_phb_init_dynamic(phb);
diff -urN linux-2.5/arch/ppc64/kernel/pci.c test/arch/ppc64/kernel/pci.c
--- linux-2.5/arch/ppc64/kernel/pci.c	2005-03-07 08:21:53.0 +1100
+++ test/arch/ppc64/kernel/pci.c	2005-03-07 21:04:02.0 +1100
@@ -619,7 +619,8 @@
 	res->end += io_virt_offset;
 }
 
-void __devinit pci_setup_phb_io_dynamic(struct pci_controller *hose)
+void __devinit pci_setup_phb_io_dynamic(struct pci_controller *hose,
+					int primary)
 {
 	unsigned long size = hose->pci_io_size;
 	unsigned long io_virt_offset;
@@ -631,6 +632,9 @@
 		hose->global_number, hose->io_base_phys,
 		(unsigned long) hose->io_base_virt);
 
+	if (primary)
+		pci_io_base = (unsigned long)hose->io_base_virt;
+
 	io_virt_offset = (unsigned long)hose->io_base_virt - pci_io_base;
 	res = &hose->io_resource;
 	res->start += io_virt_offset;
diff -urN linux-2.5/arch/ppc64/kernel/pci.h test/arch/ppc64/kernel/pci.h
--- linux-2.5/arch/ppc64/kernel/pci.h	2005-01-12 18:20:48.0 +1100
+++ test/arch/ppc64/kernel/pci.h	2005-03-07 21:06:52.0 +1100
@@ -16,8 +16,7 @@
 extern void pci_setup_pci_controller(struct pci_controller *hose);
 extern void pci_setup_phb_io(struct pci_controller *hose, int primary);
-
-extern void pci_setup_phb_io_dynamic(struct pci_controller *hose);
+extern void pci_setup_phb_io_dynamic(struct pci_controller *hose, int primary);
 
 extern struct list_head hose_list;
[PATCH] ppc64: kprobes: handle trap variants while processing probes
This patch is from Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>.

While processing a kprobe, we were not handling all the trap variants
available on PowerPC. This led to the breakage of BUG() handling on
ppc64.

Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -Naurp temp/linux-2.6.11-rc3/arch/ppc64/kernel/kprobes.c linux-2.6.11-rc3/arch/ppc64/kernel/kprobes.c
--- temp/linux-2.6.11-rc3/arch/ppc64/kernel/kprobes.c	2005-02-03 07:26:53.0 +0530
+++ linux-2.6.11-rc3/arch/ppc64/kernel/kprobes.c	2005-02-10 18:08:25.0 +0530
@@ -105,8 +105,16 @@ static inline int kprobe_handler(struct
 		p = get_kprobe(addr);
 		if (!p) {
 			unlock_kprobes();
-#if 0
 			if (*addr != BREAKPOINT_INSTRUCTION) {
+				/*
+				 * PowerPC has multiple variants of the "trap"
+				 * instruction. If the current instruction is a
+				 * trap variant, it could belong to someone else
+				 */
+				kprobe_opcode_t cur_insn = *addr;
+				if (IS_TW(cur_insn) || IS_TD(cur_insn) ||
+				    IS_TWI(cur_insn) || IS_TDI(cur_insn))
+					goto no_kprobe;
 				/*
 				 * The breakpoint instruction was removed right
 				 * after we hit it. Another cpu has removed
@@ -116,7 +124,6 @@ static inline int kprobe_handler(struct
 				 */
 				ret = 1;
 			}
-#endif
 			/* Not one of ours: let kernel handle it */
 			goto no_kprobe;
 		}
diff -Naurp temp/linux-2.6.11-rc3/include/asm-ppc64/kprobes.h linux-2.6.11-rc3/include/asm-ppc64/kprobes.h
--- temp/linux-2.6.11-rc3/include/asm-ppc64/kprobes.h	2005-02-03 07:25:50.0 +0530
+++ linux-2.6.11-rc3/include/asm-ppc64/kprobes.h	2005-02-10 18:08:58.0 +0530
@@ -35,6 +35,11 @@ typedef unsigned int kprobe_opcode_t;
 #define BREAKPOINT_INSTRUCTION 0x7fe8 /* trap */
 #define MAX_INSN_SIZE 1
 
+#define IS_TW(instr) (((instr) & 0xfc0007fe) == 0x7c08)
+#define IS_TD(instr) (((instr) & 0xfc0007fe) == 0x7c88)
+#define IS_TDI(instr) (((instr) & 0xfc00) == 0x0800)
+#define IS_TWI(instr) (((instr) & 0xfc00) == 0x0c00)
+
 #define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)((func_descr_t *)pentry)
 
 /* Architecture specific copy of original instruction */
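For readers who want to see what the trap-variant test amounts to, here is a small standalone sketch of classifying a 32-bit PowerPC instruction word by opcode. The masks and opcode values below are worked out from the instruction encoding as I understand it (primary opcode in the top six bits; tw and td share primary opcode 31 and are told apart by a 10-bit extended opcode), so treat them as an assumption of this sketch rather than a copy of the patch's macros. The helper name is_trap_variant() is made up for illustration.

```c
#include <stdint.h>

typedef uint32_t insn_t;

/* Top 6 bits: primary opcode. Bits 1..10: extended opcode (used by tw/td). */
#define IS_TW(i)   (((i) & 0xfc0007feu) == 0x7c000008u)	/* opcode 31, xo 4  */
#define IS_TD(i)   (((i) & 0xfc0007feu) == 0x7c000088u)	/* opcode 31, xo 68 */
#define IS_TDI(i)  (((i) & 0xfc000000u) == 0x08000000u)	/* opcode 2 */
#define IS_TWI(i)  (((i) & 0xfc000000u) == 0x0c000000u)	/* opcode 3 */

/* Returns non-zero if insn is any of the four trap forms. */
static int is_trap_variant(insn_t insn)
{
	return IS_TW(insn) || IS_TD(insn) || IS_TWI(insn) || IS_TDI(insn);
}
```

Under these encodings the unconditional trap (tw 31,0,0) is classified as a trap variant, while an ordinary instruction such as a nop is not.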
[PATCH] PPC64: C99 initializers for hw_interrupt_type
This patch is from Thomas Gleixner <[EMAIL PROTECTED]>.

Convert the initializers of hw_interrupt_type structures to C99
initializers.

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN 2.6.11-rc5.orig/arch/ppc64/kernel/i8259.c 2.6.11-rc5/arch/ppc64/kernel/i8259.c
--- 2.6.11-rc5.orig/arch/ppc64/kernel/i8259.c	2005-01-24 12:25:36.0 +0100
+++ 2.6.11-rc5/arch/ppc64/kernel/i8259.c	2005-02-26 20:54:19.0 +0100
@@ -131,14 +131,11 @@
 }
 
 struct hw_interrupt_type i8259_pic = {
-	" i8259",
-	NULL,
-	NULL,
-	i8259_unmask_irq,
-	i8259_mask_irq,
-	i8259_mask_and_ack_irq,
-	i8259_end_irq,
-	NULL
+	.typename = " i8259",
+	.enable = i8259_unmask_irq,
+	.disable = i8259_mask_irq,
+	.ack = i8259_mask_and_ack_irq,
+	.end = i8259_end_irq,
 };
 
 void __init i8259_init(int offset)
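The benefit of the conversion is easiest to see in isolation: with designated initializers each member is named, the order no longer matters, and members that are left out are implicitly zero, so the explicit NULL placeholders can go. Below is a minimal sketch using a cut-down, hypothetical struct irq_ops in place of struct hw_interrupt_type; the member set and the demo_* functions are invented for the example.

```c
#include <stddef.h>

/* Hypothetical cut-down analogue of struct hw_interrupt_type. */
struct irq_ops {
	const char *typename;
	void (*startup)(unsigned int irq);
	void (*shutdown)(unsigned int irq);
	void (*enable)(unsigned int irq);
	void (*disable)(unsigned int irq);
};

static unsigned int enabled_mask;

static void demo_enable(unsigned int irq)  { enabled_mask |= 1u << irq; }
static void demo_disable(unsigned int irq) { enabled_mask &= ~(1u << irq); }

/* Members not named here (startup, shutdown) are implicitly NULL. */
static struct irq_ops demo_pic = {
	.typename = "demo",
	.enable   = demo_enable,
	.disable  = demo_disable,
};
```

Adding or reordering members of struct irq_ops later would not silently shift which function lands in which slot, which is the main hazard of the positional form.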
[PATCH] PPC64 Fix init_boot_display link error
This patch is from Amos Waterland <[EMAIL PROTECTED]>.

In pmac_setup.c, the function init_boot_display as currently written
only makes sense with CONFIG_BOOTX_TEXT enabled, and causes a link
error if it is not enabled.

Signed-off-by: Amos Waterland <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

--- 1.15/arch/ppc64/kernel/pmac_setup.c	2005-01-08 00:43:52 -05:00
+++ edited/arch/ppc64/kernel/pmac_setup.c	2005-03-02 19:37:31 -05:00
@@ -244,7 +244,6 @@
 {
 	btext_drawchar(c);
 }
-#endif /* CONFIG_BOOTX_TEXT */
 
 static void __init init_boot_display(void)
 {
@@ -280,6 +279,7 @@
 		return;
 	}
 }
+#endif /* CONFIG_BOOTX_TEXT */
 
 /*
  * Early initialization.
[PATCH] ppc64: Mode 2 PCI-X config space size fix
This patch is from Brian King <[EMAIL PROTECTED]>.

When working with a PCI-X Mode 2 adapter on a PCI-X Mode 1 PPC64
system, the current code used to determine the config space size of a
device results in a PCI Master abort and an EEH error, resulting in
the device being taken offline. This patch checks OF to see if the
PCI bridge supports PCI-X Mode 2 and fails config accesses beyond 256
bytes if it does not.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc64/kernel/iSeries_pci.c test/arch/ppc64/kernel/iSeries_pci.c
--- linux-2.5/arch/ppc64/kernel/iSeries_pci.c	2005-01-22 09:25:41.0 +1100
+++ test/arch/ppc64/kernel/iSeries_pci.c	2005-03-07 18:21:41.0 +1100
@@ -610,6 +610,10 @@
 
 	if (node == NULL)
 		return PCIBIOS_DEVICE_NOT_FOUND;
+	if (offset > 255) {
+		*val = ~0;
+		return PCIBIOS_BAD_REGISTER_NUMBER;
+	}
 
 	fn = hv_cfg_read_func[(size - 1) & 3];
 	HvCall3Ret16(fn, &ret, node->DsaAddr.DsaAddr, offset, 0);
@@ -636,6 +640,8 @@
 
 	if (node == NULL)
 		return PCIBIOS_DEVICE_NOT_FOUND;
+	if (offset > 255)
+		return PCIBIOS_BAD_REGISTER_NUMBER;
 
 	fn = hv_cfg_write_func[(size - 1) & 3];
 	ret = HvCall4(fn, node->DsaAddr.DsaAddr, offset, val, 0);
diff -urN linux-2.5/arch/ppc64/kernel/pSeries_pci.c test/arch/ppc64/kernel/pSeries_pci.c
--- linux-2.5/arch/ppc64/kernel/pSeries_pci.c	2005-01-12 18:20:48.0 +1100
+++ test/arch/ppc64/kernel/pSeries_pci.c	2005-03-07 18:21:41.0 +1100
@@ -52,6 +52,16 @@
 
 extern struct mpic *pSeries_mpic;
 
+static int config_access_valid(struct device_node *dn, int where)
+{
+	if (where < 256)
+		return 1;
+	if (where < 4096 && dn->pci_ext_config_space)
+		return 1;
+
+	return 0;
+}
+
 static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val)
 {
 	int returnval = -1;
@@ -60,10 +70,11 @@
 
 	if (!dn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
-	if (where & (size - 1))
+	if (!config_access_valid(dn, where))
 		return PCIBIOS_BAD_REGISTER_NUMBER;
 
-	addr = (dn->busno << 16) | (dn->devfn << 8) | where;
+	addr = ((where & 0xf00) << 20) | (dn->busno << 16) |
+		(dn->devfn << 8) | (where & 0xff);
 	buid = dn->phb->buid;
 	if (buid) {
 		ret = rtas_call(ibm_read_pci_config, 4, 2, &returnval,
@@ -108,10 +119,11 @@
 
 	if (!dn)
 		return PCIBIOS_DEVICE_NOT_FOUND;
-	if (where & (size - 1))
+	if (!config_access_valid(dn, where))
 		return PCIBIOS_BAD_REGISTER_NUMBER;
 
-	addr = (dn->busno << 16) | (dn->devfn << 8) | where;
+	addr = ((where & 0xf00) << 20) | (dn->busno << 16) |
+		(dn->devfn << 8) | (where & 0xff);
 	buid = dn->phb->buid;
 	if (buid) {
 		ret = rtas_call(ibm_write_pci_config, 5, 1, NULL, addr, buid >> 32, buid & 0x, size, (ulong) val);
diff -urN linux-2.5/arch/ppc64/kernel/pci_dn.c test/arch/ppc64/kernel/pci_dn.c
--- linux-2.5/arch/ppc64/kernel/pci_dn.c	2005-01-12 18:20:48.0 +1100
+++ test/arch/ppc64/kernel/pci_dn.c	2005-03-07 18:21:41.0 +1100
@@ -37,6 +37,7 @@
 static void * __devinit update_dn_pci_info(struct device_node *dn, void *data)
 {
 	struct pci_controller *phb = data;
+	int *type = (int *)get_property(dn, "ibm,pci-config-space-type", NULL);
 	u32 *regs;
 
 	dn->phb = phb;
@@ -46,6 +47,8 @@
 		dn->busno = (regs[0] >> 16) & 0xff;
 		dn->devfn = (regs[0] >> 8) & 0xff;
 	}
+
+	dn->pci_ext_config_space = (type && *type == 1);
 	return NULL;
 }
 
diff -urN linux-2.5/include/asm-ppc64/prom.h test/include/asm-ppc64/prom.h
--- linux-2.5/include/asm-ppc64/prom.h	2005-01-29 09:58:49.0 +1100
+++ test/include/asm-ppc64/prom.h	2005-03-07 18:21:41.0 +1100
@@ -137,6 +137,7 @@
 	int devfn; /* for pci devices */
 	int eeh_mode; /* See eeh.h for possible EEH_MODEs */
 	int eeh_config_addr;
+	int pci_ext_config_space; /* for pci devices */
 	struct pci_controller *phb;/* for pci devices */
 	struct iommu_table *iommu_table; /* for phb's or bridges */
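The two pieces of the pSeries change, the range check and the new address layout, can be tried out in user space. The sketch below copies the logic of config_access_valid() and the addr computation from the patch, but uses a hypothetical struct cfg_dev in place of struct device_node and a made-up helper name config_addr(); it makes no claim about what RTAS does with the resulting value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct device_node fields used here. */
struct cfg_dev {
	int busno;
	int devfn;
	int pci_ext_config_space;	/* non-zero if extended config space is supported */
};

/* Offsets below 256 are always fine; 256..4095 only with extended config space. */
static int config_access_valid(const struct cfg_dev *dn, int where)
{
	if (where < 256)
		return 1;
	if (where < 4096 && dn->pci_ext_config_space)
		return 1;
	return 0;
}

/* Bits 8..11 of the offset move up to bits 28..31; the low byte stays in place. */
static uint32_t config_addr(const struct cfg_dev *dn, int where)
{
	return (((uint32_t)where & 0xf00u) << 20) |
	       ((uint32_t)dn->busno << 16) |
	       ((uint32_t)dn->devfn << 8) |
	       ((uint32_t)where & 0xffu);
}
```

For example, bus 1, devfn 0x10 and register offset 0x104 give an address of 0x10011004: the extended register nibble ends up above the bus number instead of colliding with the devfn field.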
Re: [PATCH] trivial fix for 2.6.11 raid6 compilation on ppc w/ Altivec
Jeff Garzik writes:

> Rene Rebe wrote:
> > Hi,
> >
> >
> > --- linux-2.6.11/drivers/md/raid6altivec.uc.vanilla	2005-03-02 16:44:56.407107752 +0100
> > +++ linux-2.6.11/drivers/md/raid6altivec.uc	2005-03-02 16:45:22.424152560 +0100
> > @@ -108,7 +108,7 @@
> >  int raid6_have_altivec(void)
> >  {
> >  	/* This assumes either all CPUs have Altivec or none does */
> > -	return cur_cpu_spec->cpu_features & CPU_FTR_ALTIVEC;
> > +	return cur_cpu_spec[0]->cpu_features & CPU_FTR_ALTIVEC;
>
>
> I nominate this as a candidate for linux-2.6.11 release branch. :)

No. Unfortunately if you fix ppc64 here you will break ppc, and vice
versa. Yes, we are going to reconcile the cur_cpu_spec definitions
between ppc and ppc64. :)

Paul.
Re: Page fault scalability patch V18: Drop first acquisition of ptl
Andrew Morton writes:

> But if the approach which these patches take is not suitable for these
> architectures then they have no solution to the scalability problem. The
> machines will perform suboptimally and more (perhaps conflicting)
> development will be needed.

We can do a pte_cmpxchg on ppc64. We already have a busy bit in the
PTE and do most operations atomically, in order to ensure that we
don't get races or inconsistencies due to accesses to the PTE by the
low-level hash_page() routine (which instantiates a hardware PTE in
the hardware hash table based on a Linux PTE), because it already
accesses the linux page tables without taking the
mm->page_table_lock.

However, there are other developments we are considering in this area:
notably Ben wants to change things so that when we invalidate a Linux
PTE we leave it busy until we actually remove the hardware PTE from
the hash table. Also we are looking forward to DaveM's patch which
will change the generic MM code to give us the mm and address on all
PTE operations, which will simplify some things for us. I don't
really want to have to think about pte_cmpxchg until those other
things are sorted out.

More generally, I would be interested to know what sorts of
applications or benchmarks show scalability problems on large machines
due to contention on mm->page_table_lock.

Paul.
Re: [PATCH] Consolidate compat_sys_waitid
Stephen Rothwell writes:

> This patch does:
> 	- consolidate the three implementations of compat_sys_waitid
> 	  (some were called sys32_waitid).
> 	- adds sys_waitid syscall to ppc
> 	- adds sys_waitid and compat_sys_waitid syscalls to ppc64

Looks good to me.  Are you going to submit it to Andrew?

Paul.
[PATCH] PPC64 collect and export low-level cpu usage statistics
POWER5 machines have a per-hardware-thread register which counts at a
rate which is proportional to the percentage of cycles on which the
cpu dispatches an instruction for this thread (if the thread gets all
the dispatch cycles it counts at the same rate as the timebase
register).  This register is also context-switched by the hypervisor.
Thus it gives a fine-grained measure of the actual cpu usage by the
thread over time.

This patch adds code to read this register every timer interrupt and
on every context switch.  The total over all virtual processors is
available through the existing /proc/ppc64/lparcfg file, giving a way
to measure the total cpu usage over the whole partition.

Andrew, this is relatively non-invasive, but nevertheless you may
prefer to put it in -mm until 2.6.11 is out.

Signed-off-by: Manish Ahuja <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc64/kernel/lparcfg.c test/arch/ppc64/kernel/lparcfg.c
--- linux-2.5/arch/ppc64/kernel/lparcfg.c	2005-01-06 13:13:08.0 +1100
+++ test/arch/ppc64/kernel/lparcfg.c	2005-02-09 22:38:05.508190616 +1100
@@ -33,8 +33,9 @@
 #include
 #include
 #include
+#include
 
-#define MODULE_VERS "1.5"
+#define MODULE_VERS "1.6"
 #define MODULE_NAME "lparcfg"
 
 /* #define LPARCFG_DEBUG */
@@ -214,13 +215,20 @@
 }
 
 static unsigned long get_purr(void);
-/* ToDo: get sum of purr across all processors.  The purr collection code
- * is coming, but at this time is still problematic, so for now this
- * function will return 0.
- */
+
+/* Track sum of all purrs across all processors. This is used to further */
+/* calculate usage values by different applications */
+
 static unsigned long get_purr(void)
 {
 	unsigned long sum_purr = 0;
+	int cpu;
+	struct cpu_usage *cu;
+
+	for_each_cpu(cpu) {
+		cu = &per_cpu(cpu_usage_array, cpu);
+		sum_purr += cu->current_tb;
+	}
 
 	return sum_purr;
 }
diff -urN linux-2.5/arch/ppc64/kernel/process.c test/arch/ppc64/kernel/process.c
--- linux-2.5/arch/ppc64/kernel/process.c	2005-01-29 09:58:49.0 +1100
+++ test/arch/ppc64/kernel/process.c	2005-02-10 08:09:22.428216944 +1100
@@ -51,6 +51,7 @@
 #include
 #include
 #include
+#include
 
 #ifndef CONFIG_SMP
 struct task_struct *last_task_used_math = NULL;
@@ -168,6 +169,8 @@
 
 #endif /* CONFIG_ALTIVEC */
 
+DEFINE_PER_CPU(struct cpu_usage, cpu_usage_array);
+
 struct task_struct *__switch_to(struct task_struct *prev,
 				struct task_struct *new)
 {
@@ -206,6 +209,21 @@
 	new_thread = &new->thread;
 	old_thread = &current->thread;
 
+/* Collect purr utilization data per process and per processor wise */
+/* purr is nothing but processor time base */
+
+#if defined(CONFIG_PPC_PSERIES)
+	if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) {
+		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		long unsigned start_tb, current_tb;
+		start_tb = old_thread->start_tb;
+		cu->current_tb = current_tb = mfspr(SPRN_PURR);
+		old_thread->accum_tb += (current_tb - start_tb);
+		new_thread->start_tb = current_tb;
+	}
+#endif
+
+
 	local_irq_save(flags);
 	last = _switch(old_thread, new_thread);
 
diff -urN linux-2.5/arch/ppc64/kernel/time.c test/arch/ppc64/kernel/time.c
--- linux-2.5/arch/ppc64/kernel/time.c	2005-01-22 09:25:41.0 +1100
+++ test/arch/ppc64/kernel/time.c	2005-02-10 08:09:34.412257896 +1100
@@ -334,6 +334,14 @@
 	}
 #endif
 
+/* collect purr register values often, for accurate calculations */
+#if defined(CONFIG_PPC_PSERIES)
+	if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) {
+		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		cu->current_tb = mfspr(SPRN_PURR);
+	}
+#endif
+
 	irq_exit();
 
 	return 1;
diff -urN linux-2.5/include/asm-ppc64/processor.h test/include/asm-ppc64/processor.h
--- linux-2.5/include/asm-ppc64/processor.h	2005-01-17 08:47:37.0 +1100
+++ test/include/asm-ppc64/processor.h	2005-02-09 22:38:05.528187576 +1100
@@ -562,7 +562,9 @@
 	double		fpr[32];	/* Complete floating point set */
 	unsigned long	fpscr;		/* Floating point status (plus pad) */
 	unsigned long	fpexc_mode;	/* Floating-point exception mode */
-	unsigned long	pad[3];		/* was saved_msr, saved_softe */
+	unsigned long	start_tb;	/* Start purr when proc switched in */
+	unsigned long	accum_tb;	/* Total accumulated purr for process */
+	unsigned long	pad;		/* was saved_msr, saved_softe */
 #ifdef CONFIG_ALTIVEC
 	/* Complete AltiVec register set */
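The per-thread bookkeeping in the __switch_to() hunk boils down to
"charge the outgoing thread for the counts since it was switched in,
then stamp the incoming thread".  A minimal userspace sketch of that
arithmetic follows; the struct and function names are invented here,
and the counter value is passed in rather than read with mfspr().

```c
/* Stand-in for the two fields the patch adds to the thread struct. */
struct thread_usage {
	unsigned long start_tb;	/* counter value when switched in */
	unsigned long accum_tb;	/* total counts charged so far */
};

/*
 * Account a context switch from prev to next.  'now' plays the role
 * of the PURR value read at switch time.
 */
static void account_switch(struct thread_usage *prev,
			   struct thread_usage *next,
			   unsigned long now)
{
	prev->accum_tb += now - prev->start_tb;
	next->start_tb = now;
}
```

Because the subtraction is done in unsigned arithmetic, a single
wrap-around of the counter between two switches still yields the
right difference.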
Re: A scrub daemon (prezeroing)
Christoph Lameter writes:

> scrubd clears pages of orders 7-4 by default. That means 2^4 to 2^7
> pages are cleared at once.

So are you saying that clearing an order 4 page will take measurably
less time than clearing 16 order 0 pages?  I find that hard to
believe.

Paul.
[PATCH] PPC64 replace last usage of vio dma mapping routines
This patch is from Stephen Rothwell <[EMAIL PROTECTED]>.

This patch just replaces the last usage of the vio dma mapping
routines with the equivalent generic dma mapping routines.

Signed-off-by: Stephen Rothwell <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -ruNp linus-bk/drivers/net/ibmveth.c linus-bk-vio.1/drivers/net/ibmveth.c
--- linus-bk/drivers/net/ibmveth.c	2004-12-08 04:06:06.0 +1100
+++ linus-bk-vio.1/drivers/net/ibmveth.c	2005-01-31 16:45:28.0 +1100
@@ -218,7 +218,8 @@ static void ibmveth_replenish_buffer_poo
 		ibmveth_assert(index != IBM_VETH_INVALID_MAP);
 		ibmveth_assert(pool->skbuff[index] == NULL);
 
-		dma_addr = vio_map_single(adapter->vdev, skb->data, pool->buff_size, DMA_FROM_DEVICE);
+		dma_addr = dma_map_single(&adapter->vdev->dev, skb->data,
+				pool->buff_size, DMA_FROM_DEVICE);
 
 		pool->free_map[free_index] = IBM_VETH_INVALID_MAP;
 		pool->dma_addr[index] = dma_addr;
@@ -238,7 +239,9 @@ static void ibmveth_replenish_buffer_poo
 			pool->free_map[free_index] = IBM_VETH_INVALID_MAP;
 			pool->skbuff[index] = NULL;
 			pool->consumer_index--;
-			vio_unmap_single(adapter->vdev, pool->dma_addr[index], pool->buff_size, DMA_FROM_DEVICE);
+			dma_unmap_single(&adapter->vdev->dev,
+					pool->dma_addr[index], pool->buff_size,
+					DMA_FROM_DEVICE);
 			dev_kfree_skb_any(skb);
 			adapter->replenish_add_buff_failure++;
 			break;
@@ -299,7 +302,7 @@ static void ibmveth_free_buffer_pool(str
 		for(i = 0; i < pool->size; ++i) {
 			struct sk_buff *skb = pool->skbuff[i];
 			if(skb) {
-				vio_unmap_single(adapter->vdev,
+				dma_unmap_single(&adapter->vdev->dev,
 						 pool->dma_addr[i],
 						 pool->buff_size,
 						 DMA_FROM_DEVICE);
@@ -337,7 +340,7 @@ static void ibmveth_remove_buffer_from_p
 
 	adapter->rx_buff_pool[pool].skbuff[index] = NULL;
 
-	vio_unmap_single(adapter->vdev,
+	dma_unmap_single(&adapter->vdev->dev,
 			 adapter->rx_buff_pool[pool].dma_addr[index],
 			 adapter->rx_buff_pool[pool].buff_size,
 			 DMA_FROM_DEVICE);
@@ -408,7 +411,9 @@ static void ibmveth_cleanup(struct ibmve
 {
 	if(adapter->buffer_list_addr != NULL) {
 		if(!dma_mapping_error(adapter->buffer_list_dma)) {
-			vio_unmap_single(adapter->vdev, adapter->buffer_list_dma, 4096, DMA_BIDIRECTIONAL);
+			dma_unmap_single(&adapter->vdev->dev,
+					adapter->buffer_list_dma, 4096,
+					DMA_BIDIRECTIONAL);
 			adapter->buffer_list_dma = DMA_ERROR_CODE;
 		}
 		free_page((unsigned long)adapter->buffer_list_addr);
@@ -417,7 +422,9 @@ static void ibmveth_cleanup(struct ibmve
 
 	if(adapter->filter_list_addr != NULL) {
 		if(!dma_mapping_error(adapter->filter_list_dma)) {
-			vio_unmap_single(adapter->vdev, adapter->filter_list_dma, 4096, DMA_BIDIRECTIONAL);
+			dma_unmap_single(&adapter->vdev->dev,
+					adapter->filter_list_dma, 4096,
+					DMA_BIDIRECTIONAL);
 			adapter->filter_list_dma = DMA_ERROR_CODE;
 		}
 		free_page((unsigned long)adapter->filter_list_addr);
@@ -426,7 +433,10 @@ static void ibmveth_cleanup(struct ibmve
 
 	if(adapter->rx_queue.queue_addr != NULL) {
 		if(!dma_mapping_error(adapter->rx_queue.queue_dma)) {
-			vio_unmap_single(adapter->vdev, adapter->rx_queue.queue_dma, adapter->rx_queue.queue_len, DMA_BIDIRECTIONAL);
+			dma_unmap_single(&adapter->vdev->dev,
+					adapter->rx_queue.queue_dma,
+					adapter->rx_queue.queue_len,
+					DMA_BIDIRECTIONAL);
 			adapter->rx_queue.queue_dma = DMA_ERROR_CODE;
 		}
 		kfree(adapter->rx_queue.queue_addr);
@@ -472,9 +482,13 @@ static int ibmveth_open(struct net_devi
[PATCH] Fix devfs name for the hvcs driver
This patch is from Jimi Xenidis <[EMAIL PROTECTED]>.

The hvcs driver does not register a devfs_name resulting in devfs
creating /dev/* entries.  The following one line patch remedies the
problem.

Signed-off-by: Jimi Xenidis <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

--- orig/drivers/char/hvcs.c
+++ mod/drivers/char/hvcs.c
@@ -1363,6 +1363,7 @@
 
 	hvcs_tty_driver->driver_name = hvcs_driver_name;
 	hvcs_tty_driver->name = hvcs_device_node;
+	hvcs_tty_driver->devfs_name = hvcs_device_node;
 
 	/*
 	 * We'll let the system assign us a major number, indicated by leaving
[PATCH] PPC64 show -1 for physical_id of non-present cpus
This patch is from Nathan Lynch <[EMAIL PROTECTED]>.

Make the physical_id cpu sysfs attribute on ppc64 show -1 instead of
65535 for non-present cpus.

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -puN arch/ppc64/kernel/sysfs.c~make-cpu-physical_id-signed arch/ppc64/kernel/sysfs.c
--- linux-2.6.11-rc2-mm1/arch/ppc64/kernel/sysfs.c~make-cpu-physical_id-signed	2005-01-27 15:03:16.0 -0600
+++ linux-2.6.11-rc2-mm1-nathanl/arch/ppc64/kernel/sysfs.c	2005-01-27 15:05:12.0 -0600
@@ -387,7 +387,7 @@ static ssize_t show_physical_id(struct s
 {
 	struct cpu *cpu = container_of(dev, struct cpu, sysdev);
 
-	return sprintf(buf, "%u\n", get_hard_smp_processor_id(cpu->sysdev.id));
+	return sprintf(buf, "%d\n", get_hard_smp_processor_id(cpu->sysdev.id));
 }
 static SYSDEV_ATTR(physical_id, 0444, show_physical_id, NULL);
 
diff -puN include/asm-ppc64/paca.h~make-cpu-physical_id-signed include/asm-ppc64/paca.h
--- linux-2.6.11-rc2-mm1/include/asm-ppc64/paca.h~make-cpu-physical_id-signed	2005-01-27 15:04:14.0 -0600
+++ linux-2.6.11-rc2-mm1-nathanl/include/asm-ppc64/paca.h	2005-01-27 15:04:51.0 -0600
@@ -68,7 +68,7 @@ struct paca_struct {
 	u64 stab_real;			/* Absolute address of segment table */
 	u64 stab_addr;			/* Virtual address of segment table */
 	void *emergency_sp;		/* pointer to emergency stack */
-	u16 hw_cpu_id;			/* Physical processor number */
+	s16 hw_cpu_id;			/* Physical processor number */
 	u8 cpu_start;			/* At startup, processor spins until */
 					/* this becomes non-zero. */
 
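Why both halves of the change are needed can be seen in a small
userspace model: the same 16-bit pattern (all ones) reads as 65535
through an unsigned field and "%u", and as -1 through a signed field
and "%d".  The function names below are made up for the illustration;
the kernel code uses sprintf() into the sysfs buffer instead.

```c
#include <stdint.h>
#include <stdio.h>

/* Old behaviour: unsigned 16-bit field printed with "%u". */
static int format_id_unsigned(char *buf, size_t len, uint16_t hw_cpu_id)
{
	return snprintf(buf, len, "%u\n", (unsigned int)hw_cpu_id);
}

/* New behaviour: signed 16-bit field printed with "%d". */
static int format_id_signed(char *buf, size_t len, int16_t hw_cpu_id)
{
	return snprintf(buf, len, "%d\n", (int)hw_cpu_id);
}
```

Called with (uint16_t)-1 the first function produces "65535", and
called with (int16_t)-1 the second produces "-1".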
[PATCH] PPC64 correct return code in syscall auditing
This patch is from David Woodhouse <[EMAIL PROTECTED]>.

We were pretending that every syscall returned zero. Don't do that.

Signed-Off-By: David Woodhouse <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

= arch/ppc64/kernel/entry.S 1.51 vs edited =
--- 1.51/arch/ppc64/kernel/entry.S	Thu Jan 13 09:48:36 2005
+++ edited/arch/ppc64/kernel/entry.S	Thu Jan 20 16:14:50 2005
@@ -231,6 +231,7 @@
 syscall_exit_trace:
 	std	r3,GPR3(r1)
 	bl	.save_nvgprs
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	.do_syscall_trace_leave
 	REST_NVGPRS(r1)
 	ld	r3,GPR3(r1)
@@ -324,6 +325,7 @@
 	ld	r4,TI_FLAGS(r4)
 	andi.	r4,r4,(_TIF_SYSCALL_T_OR_A|_TIF_SINGLESTEP)
 	beq+	81f
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	.do_syscall_trace_leave
 81:	b	.ret_from_except
 
= arch/ppc64/kernel/ptrace.c 1.13 vs edited =
--- 1.13/arch/ppc64/kernel/ptrace.c	Fri Dec 17 08:09:09 2004
+++ edited/arch/ppc64/kernel/ptrace.c	Thu Jan 20 16:24:12 2005
@@ -313,10 +313,10 @@
 	do_syscall_trace();
 }
 
-void do_syscall_trace_leave(void)
+void do_syscall_trace_leave(struct pt_regs *regs)
 {
 	if (unlikely(current->audit_context))
-		audit_syscall_exit(current, 0);	/* FIXME: pass pt_regs */
+		audit_syscall_exit(current, regs->result);
 
 	if ((test_thread_flag(TIF_SYSCALL_TRACE)
 	     || test_thread_flag(TIF_SINGLESTEP))
Re: A scrub daemon (prezeroing)
Christoph Lameter writes:

> If the program does not use these cache lines then you have wasted time
> in the page fault handler allocating and handling them. That is what
> prezeroing does for you.

The program is going to access at least one cache line of the new
page.  On my G5, it takes _less_ time to clear the whole page and pull
in one cache line from L2 cache to L1 than it does to pull in that
same cache line from memory.

> Yes but its a short burst that only occurs very infrequestly and it takes

It occurs just as often as we clear pages in the page fault handler.
We aren't clearing any fewer pages by prezeroing, we are just clearing
them a bit earlier.

> advantage of all the optimizations that modern memory subsystems have for
> linear accesses. And if hardware exists that can offload that from the cpu
> then the cpu caches are only minimally affected.

I can believe that prezeroing could provide a benefit on some
machines, but I don't think it will provide any on ppc64.

Paul.
Re: A scrub daemon (prezeroing)
Christoph Lameter writes:

> You need to think about this in a different way. Prezeroing only makes
> sense if it can avoid using cache lines that the zeroing in the
> hot paths would have to use since it touches all cachelines on
> the page (the ppc instruction is certainly nice and avoids a cacheline
> read but it still uses a cacheline!). The zeroing in itself (within the

The dcbz instruction on the G5 (PPC970) establishes the new cache line
in the L2 cache and doesn't disturb the L1 cache (except to invalidate
the line in the L1 data cache if it is present there).  The L2 cache
is 512kB and 8-way set associative (LRU).  So zeroing a page is
unlikely to disturb the cache lines that the page fault handler is
using.  Then, when the page fault handler returns to the user program,
any cache lines that the program wants to touch are available in 12
cycles (L2 hit latency) instead of 200 - 300 (memory access latency).

> cpu caches) is extraordinarily fast and the zeroing of large portions of
> memory is so too. That is why the impact of scrubd is negligible since
> its extremely fast.

But that also disturbs cache lines that may well otherwise be useful.

> The point is to save activating cachelines not the time zeroing in itself
> takes. This only works if only parts of the page are needed immediately
> after the page fault. All of that has been documented in earlier posts on
> the subject.

As has my scepticism about pre-zeroing actually providing any benefit
on ppc64.  Nevertheless, the only definitive answer is to actually
measure the performance both ways.

Paul.
Re: A scrub daemon (prezeroing)
Rik van Riel writes:

> I'm not convinced.  Zeroing a page takes 2000-4000 CPU
> cycles, while faulting the page from RAM into cache takes
> 200-400 CPU cycles per cache line, or 6000-12000 CPU
> cycles.

On my G5 it takes ~200 cycles to zero a whole page.  In other words it
takes about the same time to zero a page as to bring in a single cache
line from memory.  (PPC has an instruction to establish a whole cache
line of zeroes in modified state without reading anything from
memory.)

Thus I can't see how prezeroing can ever be a win on ppc64.

Regards,
Paul.
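The per-page figure quoted above is just "cache lines per page times
cycles per line".  The helper below spells that arithmetic out; it is
only an illustration, and the 4096-byte page and 128-byte line used in
the note after it are assumptions, since neither size is stated in
the thread.

```c
/*
 * Cycles needed to pull a whole page in from memory, one cache line
 * at a time, ignoring any overlap between the fetches.
 */
static unsigned long page_fetch_cycles(unsigned long page_size,
				       unsigned long line_size,
				       unsigned long cycles_per_line)
{
	return (page_size / line_size) * cycles_per_line;
}
```

With the assumed sizes there are 32 lines per page, so 200-400 cycles
per line gives 6400-12800 cycles per page, in the same range as the
6000-12000 quoted.  Against that, the ~200 cycles reported for zeroing
a whole page on the G5 is about the cost of one line fetch.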
[PATCH] PPC64 use kref for device_node refcounting
This patch is from Nathan Lynch <[EMAIL PROTECTED]>.

This changes struct device_node and associated code to use the kref
api for object refcounting and freeing.  I've given it some testing on
pSeries with cpu add/remove and verified that the release function
works.  The change is somewhat cosmetic but it does make the code
easier to understand... at least I think so =)

The only real change is that the refcount on all device_nodes is
initialized at 1, and the device node is freed when the refcount
reaches 0 (of_remove_node has the extra "put" to ensure that this
happens).  This lets us get rid of the OF_STALE flag and macros in
prom.h.

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc64/kernel/prom.c test/arch/ppc64/kernel/prom.c
--- linux-2.5/arch/ppc64/kernel/prom.c	2005-01-22 09:25:41.0 +1100
+++ test/arch/ppc64/kernel/prom.c	2005-01-22 20:52:35.0 +1100
@@ -717,6 +717,7 @@
 			dad->next->sibling = np;
 			dad->next = np;
 		}
+		kref_init(&np->kref);
 	}
 	while(1) {
 		u32 sz, noff;
@@ -1475,24 +1476,31 @@
  *	@node:	Node to inc refcount, NULL is supported to
  *		simplify writing of callers
  *
- *	Returns the node itself or NULL if gone.
+ *	Returns node.
  */
 struct device_node *of_node_get(struct device_node *node)
 {
-	if (node && !OF_IS_STALE(node)) {
-		atomic_inc(&node->_users);
-		return node;
-	}
-	return NULL;
+	if (node)
+		kref_get(&node->kref);
+	return node;
 }
 EXPORT_SYMBOL(of_node_get);
 
+static inline struct device_node * kref_to_device_node(struct kref *kref)
+{
+	return container_of(kref, struct device_node, kref);
+}
+
 /**
- *	of_node_cleanup - release a dynamically allocated node
- *	@arg:  Node to be released
+ *	of_node_release - release a dynamically allocated node
+ *	@kref:  kref element of the node to be released
+ *
+ *	In of_node_put() this function is passed to kref_put()
+ *	as the destructor.
  */
-static void of_node_cleanup(struct device_node *node)
+static void of_node_release(struct kref *kref)
 {
+	struct device_node *node = kref_to_device_node(kref);
 	struct property *prop = node->properties;
 
 	if (!OF_IS_DYNAMIC(node))
@@ -1518,19 +1526,8 @@
  */
 void of_node_put(struct device_node *node)
 {
-	if (!node)
-		return;
-
-	WARN_ON(0 == atomic_read(&node->_users));
-
-	if (OF_IS_STALE(node)) {
-		if (atomic_dec_and_test(&node->_users)) {
-			of_node_cleanup(node);
-			return;
-		}
-	}
-	else
-		atomic_dec(&node->_users);
+	if (node)
+		kref_put(&node->kref, of_node_release);
 }
 EXPORT_SYMBOL(of_node_put);
 
@@ -1773,7 +1770,7 @@
 
 	np->properties = proplist;
 	OF_MARK_DYNAMIC(np);
-	of_node_get(np);
+	kref_init(&np->kref);
 	np->parent = derive_parent(path);
 	if (!np->parent) {
 		kfree(np);
@@ -1809,8 +1806,9 @@
 }
 
 /*
- * Remove an OF device node from the system.
- * Caller should have already "gotten" np.
+ * "Unplug" a node from the device tree.  The caller must hold
+ * a reference to the node.  The memory associated with the node
+ * is not freed until its refcount goes to zero.
  */
 int of_remove_node(struct device_node *np)
 {
@@ -1828,7 +1826,6 @@
 	of_cleanup_node(np);
 
 	write_lock(&devtree_lock);
-	OF_MARK_STALE(np);
 	remove_node_proc_entries(np);
 	if (allnodes == np)
 		allnodes = np->allnext;
@@ -1853,6 +1850,7 @@
 	}
 	write_unlock(&devtree_lock);
 	of_node_put(parent);
+	of_node_put(np); /* Must decrement the refcount */
 	return 0;
 }
 
diff -urN linux-2.5/include/asm-ppc64/prom.h test/include/asm-ppc64/prom.h
--- linux-2.5/include/asm-ppc64/prom.h	2005-01-06 13:13:10.0 +1100
+++ test/include/asm-ppc64/prom.h	2005-01-22 20:52:35.0 +1100
@@ -149,18 +149,15 @@
 	struct proc_dir_entry *pde;	/* this node's proc directory */
 	struct proc_dir_entry *name_link; /* name symlink */
 	struct proc_dir_entry *addr_link; /* addr symlink */
-	atomic_t _users;		/* reference count */
+	struct kref kref;
 	unsigned long _flags;
 };
 
 extern struct device_node *of_chosen;
 
 /* flag descriptions */
-#define OF_STALE   0 /* node is slated for deletion */
 #define OF_DYNAMIC 1 /* node and properties were allocated via kmalloc */
 
-#define OF_IS_STALE(x) test_bit(OF_STALE, &x->_flags)
-#define OF_MARK_STALE(x) set_bi
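The lifetime rule the patch describes (count starts at 1, object is
freed when the count reaches 0, release function passed to the put)
can be modelled in a few lines of userspace C.  This is only a sketch
of the pattern: struct ref, struct node and every function here are
invented stand-ins, and the plain int counter is not atomic the way
the kernel's kref is.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal, non-atomic stand-in for a kref-style counter. */
struct ref {
	int count;
};

static void ref_init(struct ref *r) { r->count = 1; }
static void ref_get(struct ref *r)  { r->count++; }

/* Drop one reference; run release() when the count hits zero. */
static int ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

/* Hypothetical object embedding the counter, like device_node. */
struct node {
	const char *name;
	struct ref ref;
	int *freed_flag;	/* set to 1 on release; test aid only */
};

/* Release function: recover the containing node and free it. */
static void node_release(struct ref *r)
{
	struct node *n = (struct node *)((char *)r - offsetof(struct node, ref));

	if (n->freed_flag)
		*n->freed_flag = 1;
	free(n);
}

static struct node *node_new(const char *name, int *freed_flag)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->name = name;
	n->freed_flag = freed_flag;
	ref_init(&n->ref);	/* creator holds the first reference */
	return n;
}

/* NULL is accepted, as in of_node_get()/of_node_put(). */
static struct node *node_get(struct node *n)
{
	if (n)
		ref_get(&n->ref);
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		ref_put(&n->ref, node_release);
}
```

A node created with node_new() and then taken once with node_get()
survives the first node_put() and is freed by the second, which is
the role the extra "put" in of_remove_node() plays for the initial
reference.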
[PATCH] PPC64 sparse fixes for cpu feature constants
This patch is originally from Nathan Lynch <[EMAIL PROTECTED]>.

Sparse gives a warning "constant ... is so big it is long" for every
expression where we check bits in the cur_cpu_spec->cpu_features
value.  This patch removes the warnings by using the ASM_CONST macro.

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/include/asm-ppc64/cacheflush.h test/include/asm-ppc64/cacheflush.h
--- linux-2.5/include/asm-ppc64/cacheflush.h	2004-05-31 19:02:01.0 +1000
+++ test/include/asm-ppc64/cacheflush.h	2005-01-22 20:13:46.0 +1100
@@ -40,7 +40,7 @@
 
 static inline void flush_icache_range(unsigned long start, unsigned long stop)
 {
-	if (!(cur_cpu_spec->cpu_features & ASM_CONST(CPU_FTR_COHERENT_ICACHE)))
+	if (!(cur_cpu_spec->cpu_features & CPU_FTR_COHERENT_ICACHE))
 		__flush_icache_range(start, stop);
 }
 
diff -urN linux-2.5/include/asm-ppc64/cputable.h test/include/asm-ppc64/cputable.h
--- linux-2.5/include/asm-ppc64/cputable.h	2004-06-28 14:30:55.0 +1000
+++ test/include/asm-ppc64/cputable.h	2005-01-22 20:13:46.0 +1100
@@ -16,6 +16,7 @@
 #define __ASM_PPC_CPUTABLE_H
 
 #include
+#include  /* for ASM_CONST */
 
 /* Exposed to userland CPU features - Must match ppc32 definitions */
 #define PPC_FEATURE_32			0x8000
@@ -103,38 +104,38 @@
 /* CPU kernel features */
 
 /* Retain the 32b definitions for the time being - use bottom half of word */
-#define CPU_FTR_SPLIT_ID_CACHE		0x0001
-#define CPU_FTR_L2CR			0x0002
-#define CPU_FTR_SPEC7450		0x0004
-#define CPU_FTR_ALTIVEC			0x0008
-#define CPU_FTR_TAU			0x0010
-#define CPU_FTR_CAN_DOZE		0x0020
-#define CPU_FTR_USE_TB			0x0040
-#define CPU_FTR_604_PERF_MON		0x0080
-#define CPU_FTR_601			0x0100
-#define CPU_FTR_HPTE_TABLE		0x0200
-#define CPU_FTR_CAN_NAP			0x0400
-#define CPU_FTR_L3CR			0x0800
-#define CPU_FTR_L3_DISABLE_NAP		0x1000
-#define CPU_FTR_NAP_DISABLE_L2_PR	0x2000
-#define CPU_FTR_DUAL_PLL_750FX		0x4000
+#define CPU_FTR_SPLIT_ID_CACHE		ASM_CONST(0x0001)
+#define CPU_FTR_L2CR			ASM_CONST(0x0002)
+#define CPU_FTR_SPEC7450		ASM_CONST(0x0004)
+#define CPU_FTR_ALTIVEC			ASM_CONST(0x0008)
+#define CPU_FTR_TAU			ASM_CONST(0x0010)
+#define CPU_FTR_CAN_DOZE		ASM_CONST(0x0020)
+#define CPU_FTR_USE_TB			ASM_CONST(0x0040)
+#define CPU_FTR_604_PERF_MON		ASM_CONST(0x0080)
+#define CPU_FTR_601			ASM_CONST(0x0100)
+#define CPU_FTR_HPTE_TABLE		ASM_CONST(0x0200)
+#define CPU_FTR_CAN_NAP			ASM_CONST(0x0400)
+#define CPU_FTR_L3CR			ASM_CONST(0x0800)
+#define CPU_FTR_L3_DISABLE_NAP		ASM_CONST(0x1000)
+#define CPU_FTR_NAP_DISABLE_L2_PR	ASM_CONST(0x2000)
+#define CPU_FTR_DUAL_PLL_750FX		ASM_CONST(0x4000)
 
 /* Add the 64b processor unique features in the top half of the word */
-#define CPU_FTR_SLB			0x0001
-#define CPU_FTR_16M_PAGE		0x0002
-#define CPU_FTR_TLBIEL			0x0004
-#define CPU_FTR_NOEXECUTE		0x0008
-#define CPU_FTR_NODSISRALIGN		0x0010
-#define CPU_FTR_IABR			0x0020
-#define CPU_FTR_MMCRA			0x0040
-#define CPU_FTR_PMC8			0x0080
-#define CPU_FTR_SMT			0x0100
-#define CPU_FTR_COHERENT_ICACHE		0x0200
-#define CPU_FTR_LOCKLESS_TLBIE		0x0400
-#define CPU_FTR_MMCRA_SIHV		0x0800
+#define CPU_FTR_SLB			ASM_CONST(0x0001)
+#define CPU_FTR_16M_PAGE		ASM_CONST(0x0002)
+#define CPU_FTR_TLBIEL			ASM_CONST(0x0004)
+#define CPU_FTR_NOEXECUTE		ASM_CONST(0x0008)
+#define CPU_FTR_NODSISRALIGN		ASM_CONST(0x0010)
+#define CPU_FTR_IABR			ASM_CONST(0x0020)
+#define CPU_FTR_MMCRA			ASM_CONST(0x0040)
+#define CPU_FTR_PMC8			ASM_CONST(0x0080)
+#define CPU_FTR_SMT			ASM_CONST(0x0100)
+#define CPU_FTR_COHERE
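The definition of ASM_CONST is not part of the patch, so the macro
below is only a guess at its general shape: something that pastes a
UL suffix onto a constant when compiled as C, so that the constant is
explicitly unsigned long rather than silently widened.  EX_ASM_CONST
and the EX_FTR_* names and values are invented for this illustration.

```c
/* Hypothetical stand-in for ASM_CONST: append "UL" to the constant. */
#define EX_ASM_CONST_(x)	x##UL
#define EX_ASM_CONST(x)		EX_ASM_CONST_(x)

/* Example feature bits; names and values are made up. */
#define EX_FTR_LOW	EX_ASM_CONST(0x0000000000000008)
/* A bit in the upper half only fits where unsigned long is 64 bits. */
#define EX_FTR_HIGH	EX_ASM_CONST(0x0000020000000000)

/* Test one feature bit in a feature word. */
static int has_feature(unsigned long features, unsigned long bit)
{
	return (features & bit) != 0;
}
```

With the suffix in the definition, callers can write
has_feature(features, EX_FTR_LOW) without wrapping the name in the
macro at each use, which matches the direction of the cacheflush.h
hunk above.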
Re: [PATCH] PPC: fix stack alignment for signal handlers
Roland McGrath writes:

> For PPC32 signal handlers, while the frame itself was of properly aligned
> size, no alignment of the starting stack pointer was done at all, so that a
> signal handler can still get a misaligned stack pointer if the interrupted
> registers had one, though the kernel isn't gratuitously misaligning good
> ones like it is for PPC64.  I added explicit alignment to fix that.

This part is unnecessary, because
arch/ppc/kernel/signal.c:do_signal() already aligns the stack pointer
to a 16-byte boundary:

	if ((ka.sa.sa_flags & SA_ONSTACK) && current->sas_ss_size
	    && !on_sig_stack(regs->gpr[1]))
		newsp = current->sas_ss_sp + current->sas_ss_size;
	else
		newsp = regs->gpr[1];
	newsp &= ~0xfUL;

	/* Whee!  Actually deliver the signal.  */
	if (ka.sa.sa_flags & SA_SIGINFO)
		handle_rt_signal(signr, &ka, &info, oldset, regs, newsp);
	else
		handle_signal(signr, &ka, &info, oldset, regs, newsp);

The additions to arch/ppc64/kernel/signal32.c are likewise
unnecessary, because do_signal32() also does newsp &= ~0xfUL (in fact
the code there is very similar to the ppc32 code).

You are correct about the 64-bit case though.  I thought we had fixed
that but evidently not.  Your patch looks fine as far as
arch/ppc64/kernel/signal.c is concerned.

Regards,
Paul.
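For anyone unsure what the newsp &= ~0xfUL line does: it clears the
low four bits, rounding the stack pointer down to the nearest
multiple of 16.  A standalone version, with a helper name chosen
here purely for illustration:

```c
#include <stdint.h>

/* Round a stack pointer down to a 16-byte boundary. */
static uintptr_t align_down_16(uintptr_t sp)
{
	return sp & ~(uintptr_t)0xf;
}
```

Rounding down rather than up is the safe direction because the stack
grows towards lower addresses, so the aligned value never lands on
top of data already in use above it.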
[PATCH] PPC64: Trivial Cleanup: EEH_REGION
This patch is originally from Linas Vepstas <[EMAIL PROTECTED]>.

This is a dumb, dorky cleanup patch: Per last round of emails, the
concept of EEH_REGION is gone, but a few stubs remained.  This patch
removes them.

Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

diff -urN linux-2.5/arch/ppc64/mm/hash_utils.c test/arch/ppc64/mm/hash_utils.c
--- linux-2.5/arch/ppc64/mm/hash_utils.c	2005-01-06 13:13:08.0 +1100
+++ test/arch/ppc64/mm/hash_utils.c	2005-01-22 16:42:48.0 +1100
@@ -294,12 +294,6 @@
 		vsid = get_kernel_vsid(ea);
 		break;
 #if 0
-	case EEH_REGION_ID:
-		/*
-		 * Should only be hit if there is an access to MMIO space
-		 * which is protected by EEH.
-		 * Send the problem up to do_page_fault
-		 */
 	case KERNEL_REGION_ID:
 		/*
 		 * Should never get here - entire 0xC0... region is bolted.
diff -urN linux-2.5/arch/ppc64/mm/slb.c test/arch/ppc64/mm/slb.c
--- linux-2.5/arch/ppc64/mm/slb.c	2005-01-06 13:13:08.0 +1100
+++ test/arch/ppc64/mm/slb.c	2005-01-22 16:44:26.0 +1100
@@ -78,7 +78,7 @@
 void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 {
 	unsigned long offset = get_paca()->slb_cache_ptr;
-	unsigned long esid_data;
+	unsigned long esid_data = 0;
 	unsigned long pc = KSTK_EIP(tsk);
 	unsigned long stack = KSTK_ESP(tsk);
 	unsigned long unmapped_base;
@@ -97,11 +97,8 @@
 	}
 
 	/* Workaround POWER5 < DD2.1 issue */
-	if (offset == 1 || offset > SLB_CACHE_ENTRIES) {
-		/* flush segment in EEH region, we shouldn't ever
-		 * access addresses in this region. */
-		asm volatile("slbie %0" : : "r"(EEHREGIONBASE));
-	}
+	if (offset == 1 || offset > SLB_CACHE_ENTRIES)
+		asm volatile("slbie %0" : : "r" (esid_data));
 
 	get_paca()->slb_cache_ptr = 0;
 	get_paca()->context = mm->context;
diff -urN linux-2.5/include/asm-ppc64/page.h test/include/asm-ppc64/page.h
--- linux-2.5/include/asm-ppc64/page.h	2005-01-06 13:13:10.0 +1100
+++ test/include/asm-ppc64/page.h	2005-01-22 16:42:48.0 +1100
@@ -205,10 +205,8 @@
 #define KERNELBASE	PAGE_OFFSET
 #define VMALLOCBASE	ASM_CONST(0xD000)
 #define IOREGIONBASE	ASM_CONST(0xE000)
-#define EEHREGIONBASE	ASM_CONST(0xA000)
 
 #define IO_REGION_ID		(IOREGIONBASE>>REGION_SHIFT)
-#define EEH_REGION_ID		(EEHREGIONBASE>>REGION_SHIFT)
 #define VMALLOC_REGION_ID	(VMALLOCBASE>>REGION_SHIFT)
 #define KERNEL_REGION_ID	(KERNELBASE>>REGION_SHIFT)
 #define USER_REGION_ID		(0UL)
[PATCH] PPC64 replace schedule_timeout in __cpu_up
This patch is from Nishanth Aravamudan <[EMAIL PROTECTED]>.

Replace schedule_timeout() with msleep to simplify the code and to
express the delay in milliseconds instead of HZ.

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

--- 2.6.11-rc1-kj-v/arch/ppc64/kernel/smp.c	2005-01-15 16:55:41.0 -0800
+++ 2.6.11-rc1-kj/arch/ppc64/kernel/smp.c	2005-01-15 17:30:16.0 -0800
@@ -459,8 +459,7 @@ int __devinit __cpu_up(unsigned int cpu)
 		 * hotplug case.  Wait five seconds.
 		 */
 		for (c = 25; c && !cpu_callin_map[cpu]; c--) {
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(HZ/5);
+			msleep(200);
 		}
 #endif
 
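The equivalence the patch relies on is that HZ/5 ticks is one fifth
of a second whatever HZ happens to be, i.e. 200 ms, and that 25 such
sleeps add up to the five seconds mentioned in the comment.  The
helper below (name invented here) checks that arithmetic in
userspace for a few tick rates:

```c
/* Convert a tick count to milliseconds for a given tick rate. */
static unsigned int ticks_to_msecs(unsigned long ticks, unsigned int hz)
{
	return (unsigned int)(ticks * 1000UL / hz);
}
```

For tick rates that divide evenly by 5 (100, 250, 1000) the result of
ticks_to_msecs(hz / 5, hz) is exactly 200, which is why a fixed
msleep(200) can stand in for schedule_timeout(HZ/5).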
[PATCH] PPC64 replace schedule_timeout in pSeries_cpu_die
This patch is from Nishanth Aravamudan <[EMAIL PROTECTED]>.

Replace schedule_timeout() with msleep to simplify the code and to
express the delay in milliseconds instead of HZ.

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>
Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

--- 2.6.11-rc1-kj-v/arch/ppc64/kernel/pSeries_smp.c	2005-01-15 16:55:41.0 -0800
+++ 2.6.11-rc1-kj/arch/ppc64/kernel/pSeries_smp.c	2005-01-15 17:21:12.0 -0800
@@ -107,8 +107,7 @@ void pSeries_cpu_die(unsigned int cpu)
 		cpu_status = query_cpu_stopped(pcpu);
 		if (cpu_status == 0 || cpu_status == -1)
 			break;
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ/5);
+		msleep(200);
 	}
 	if (cpu_status != 0) {
 		printk("Querying DEAD? cpu %i (%i) shows %i\n",
[PATCH] PPC64 replace schedule_timeout in iSeries_pci_reset
This patch is from Nishanth Aravamudan <[EMAIL PROTECTED]>. Replace schedule_timeout() with msleep to simplify the code and to express the delay in milliseconds instead of HZ. Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]> --- 2.6.11-rc1-kj-v/arch/ppc64/kernel/iSeries_pci_reset.c 2005-01-15 16:55:41.0 -0800 +++ 2.6.11-rc1-kj/arch/ppc64/kernel/iSeries_pci_reset.c 2005-01-15 17:17:54.0 -0800 @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -49,7 +50,7 @@ int iSeries_Device_ToggleReset(struct pci_dev *PciDev, int AssertTime, int DelayTime) { - unsigned long AssertDelay, WaitDelay; + unsigned int AssertDelay, WaitDelay; struct iSeries_Device_Node *DeviceNode = (struct iSeries_Device_Node *)PciDev->sysdata; @@ -62,14 +63,14 @@ int iSeries_Device_ToggleReset(struct pc * Set defaults, Assert is .5 second, Wait is 3 seconds. */ if (AssertTime == 0) - AssertDelay = (5 * HZ) / 10; + AssertDelay = 500; else - AssertDelay = (AssertTime * HZ) / 10; + AssertDelay = AssertTime * 100; if (DelayTime == 0) - WaitDelay = (30 * HZ) / 10; + WaitDelay = 3000; else - WaitDelay = (DelayTime * HZ) / 10; + WaitDelay = DelayTime * 100; /* * Assert reset @@ -77,8 +78,7 @@ int iSeries_Device_ToggleReset(struct pc DeviceNode->ReturnCode = HvCallPci_setSlotReset(ISERIES_BUS(DeviceNode), 0x00, DeviceNode->AgentId, 1); if (DeviceNode->ReturnCode == 0) { - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(AssertDelay); /* Sleep for the time */ + msleep(AssertDelay);/* Sleep for the time */ DeviceNode->ReturnCode = HvCallPci_setSlotReset(ISERIES_BUS(DeviceNode), 0x00, DeviceNode->AgentId, 0); @@ -86,8 +86,7 @@ int iSeries_Device_ToggleReset(struct pc /* * Wait for device to reset */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(WaitDelay); + msleep(WaitDelay); } if (DeviceNode->ReturnCode == 0) PCIFR("Slot 0x%04X.%02 Reset\n", ISERIES_BUS(DeviceNode), - To unsubscribe from this list: send 
the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
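As a rough check of the unit change in the patch above: AssertTime and DelayTime are in tenths of a second, so the old `(AssertTime * HZ) / 10` jiffies and the new `AssertTime * 100` milliseconds should describe the same interval. The following is a userspace sketch only; the helper names are made up for this example and are not kernel functions, and `hz` is passed in as a parameter rather than taken from the kernel's HZ.

```c
#include <assert.h>

/* Hypothetical helpers, not kernel code. Time arguments are in tenths
 * of a second, as in iSeries_Device_ToggleReset(). */

/* Old style: delay in jiffies, which depends on the tick rate hz. */
static unsigned long tenths_to_jiffies(unsigned int tenths, unsigned int hz)
{
	assert(hz > 0);
	return ((unsigned long)tenths * hz) / 10;
}

/* New style: delay in milliseconds, independent of the tick rate. */
static unsigned int tenths_to_msecs(unsigned int tenths)
{
	return tenths * 100;
}

/* Convert a jiffies count back to milliseconds, for comparison only. */
static unsigned long jiffies_to_msecs_approx(unsigned long j, unsigned int hz)
{
	assert(hz > 0);
	return (j * 1000) / hz;
}
```

With the defaults from the patch, 5 tenths gives 500 ms and 30 tenths gives 3000 ms, whatever value the tick rate has.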
[PATCH] PPC64 replace schedule_timeout in die
This patch is from Nishanth Aravamudan <[EMAIL PROTECTED]>. Replace schedule_timeout() with ssleep to simplify the code and to express the delay in seconds instead of HZ. Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]> --- 2.6.11-rc1-kj-v/arch/ppc64/kernel/traps.c 2005-01-15 16:55:41.0 -0800 +++ 2.6.11-rc1-kj/arch/ppc64/kernel/traps.c 2005-01-15 17:30:39.0 -0800 @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -137,8 +138,7 @@ int die(const char *str, struct pt_regs if (panic_on_oops) { printk(KERN_EMERG "Fatal exception: panic in 5 seconds\n"); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(5 * HZ); + ssleep(5); panic("Fatal exception"); } do_exit(SIGSEGV); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] PPC64 Clear MSR_RI earlier in syscall exit path
This patch is from Craig Chaney <[EMAIL PROTECTED]>. This patch moves the restoring of the stack pointer in the system call exit path to after the point where we clear the RI (recoverable interrupt) bit in the MSR. Normally, loading the stack pointer before clearing RI doesn't cause any problem because there is no trap that can normally occur in between. But if we are tracing the code using a tool that single-steps instructions, this can cause a problem. In this case, clearing RI serves as an indication that the following code can't be safely single-stepped. Signed-off-by: Craig Chaney <[EMAIL PROTECTED]> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]> diff -Naur clean/arch/ppc64/kernel/entry.S edited/arch/ppc64/kernel/entry.S --- clean/arch/ppc64/kernel/entry.S 2004-09-26 14:24:27.0 + +++ edited/arch/ppc64/kernel/entry.S2004-09-27 14:36:29.221308744 + @@ -185,10 +185,10 @@ beq-1f /* only restore r13 if */ ld r13,GPR13(r1) /* returning to usermode */ 1: ld r2,GPR2(r1) - ld r1,GPR1(r1) li r12,MSR_RI andcr10,r10,r12 mtmsrd r10,1 /* clear MSR.RI */ + ld r1,GPR1(r1) mtlrr4 mtcrr5 mtspr SRR0,r7 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Extend clear_page by an order parameter
Christoph Lameter writes:

> I had the name "zero_page" in V1 and V2 of the patch where it was
> separate. Then someone complained about code duplication.

Well, if you duplicated each arch's clear_page implementation in zero_page, then yes, that would be unnecessary code duplication. I would suggest that for architectures where the clear_page implementation can easily be extended, rename it to clear_page_order (or something) and #define clear_page(x) to be clear_page_order(x, 0). For architectures where it can't, leave clear_page as clear_page and define clear_page_order as an inline function that calls clear_page in a loop.

> clear_page is called clear_page because it clears one page of *any* order
> not just higher orders. zero-order pages are not segregated nor are they
> intrinsically better just because they contain more memory ;-).

You have missed my point, which was about address constraints, not a distinction between zero-order pages and higher-order pages.

Anyway, I remain of the opinion that your naming is inconsistent with the naming of other functions that deal with zero-order and higher-order pages, such as get_free_pages, alloc_pages, free_pages, etc., and that your patch is unnecessarily intrusive. I guess it's up to Andrew to decide which way we go.

Paul.
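The fallback suggested above (clear_page_order as an inline loop over clear_page) might look roughly like this. It is a userspace sketch under stated assumptions: memset stands in for an architecture's existing clear_page, and SKETCH_PAGE_SIZE is an assumed page size, not a kernel constant.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/* Stand-in for an architecture's existing single-page routine. */
static void clear_page(void *page)
{
	memset(page, 0, SKETCH_PAGE_SIZE);
}

/* Generic fallback: clear 2^order contiguous pages, one page at a time. */
static inline void clear_page_order(void *page, unsigned int order)
{
	unsigned long i, n = 1UL << order;
	char *p = page;

	assert(order < 16);	/* sanity limit chosen for the sketch */
	for (i = 0; i < n; i++)
		clear_page(p + i * SKETCH_PAGE_SIZE);
}
```

Existing callers of clear_page(x) stay untouched in this arrangement; only code that wants to clear a higher-order block calls the new function.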
[PATCH] PPC64 Fix in_be64 definition
This patch is from Jake Moilanen <[EMAIL PROTECTED]>. The instruction syntax for the in_be64 inline asm was incorrect for the "m" constraint for the address parameter. This patch fixes the instruction in the inline asm. Signed-off-by: Jake Moilanen <[EMAIL PROTECTED]> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]> diff -puN include/asm-ppc64/io.h~in_be64-fix include/asm-ppc64/io.h --- linux-2.6-bk/include/asm-ppc64/io.h~in_be64-fix Tue Jan 4 15:33:22 2005 +++ linux-2.6-bk-moilanen/include/asm-ppc64/io.hWed Jan 5 08:08:03 2005 @@ -371,7 +371,7 @@ static inline unsigned long in_be64(cons { unsigned long ret; - __asm__ __volatile__("ld %0,0(%1); twi 0,%0,0; isync" + __asm__ __volatile__("ld%U1%X1 %0,%1; twi 0,%0,0; isync" : "=r" (ret) : "m" (*addr)); return ret; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] PPC64 xmon data breakpoints on partitioned systems
This patch is originally from Jake Moilanen <[EMAIL PROTECTED]>, substantially modified by me. On PPC64 systems with a hypervisor, we can't set the Data Address Breakpoint Register (DABR) directly, we have to do it through a hypervisor call. Signed-off-by: Jake Moilanen <[EMAIL PROTECTED]> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]> diff -urN linux-2.5/arch/ppc64/xmon/xmon.c test/arch/ppc64/xmon/xmon.c --- linux-2.5/arch/ppc64/xmon/xmon.c2005-01-12 18:20:48.0 +1100 +++ test/arch/ppc64/xmon/xmon.c 2005-01-22 10:55:46.664345064 +1100 @@ -624,6 +624,17 @@ return 0; } +/* On systems with a hypervisor, we can't set the DABR + (data address breakpoint register) directly. */ +static void set_controlled_dabr(unsigned long val) +{ + if (systemcfg->platform == PLATFORM_PSERIES_LPAR) { + int rc = plpar_hcall_norets(H_SET_DABR, val); + if (rc != H_Success) + xmon_printf("Warning: setting DABR failed (%d)\n", rc); + } else + set_dabr(val); +} static struct bpt *at_breakpoint(unsigned long pc) { @@ -711,7 +722,7 @@ static void insert_cpu_bpts(void) { if (dabr.enabled) - set_dabr(dabr.address | (dabr.enabled & 7)); + set_controlled_dabr(dabr.address | (dabr.enabled & 7)); if (iabr && (cur_cpu_spec->cpu_features & CPU_FTR_IABR)) set_iabr(iabr->address | (iabr->enabled & (BP_IABR|BP_IABR_TE))); @@ -739,7 +750,7 @@ static void remove_cpu_bpts(void) { - set_dabr(0); + set_controlled_dabr(0); if ((cur_cpu_spec->cpu_features & CPU_FTR_IABR)) set_iabr(0); } @@ -1049,8 +1060,8 @@ "b [cnt] set breakpoint at given instr addr\n" "bc clear all breakpoints\n" "bc clear breakpoint number n or at addr\n" -"bi [cnt] set hardware instr breakpoint (broken?)\n" -"bd [cnt] set hardware data breakpoint (broken?)\n" +"bi [cnt] set hardware instr breakpoint (POWER3/RS64 only)\n" +"bd [cnt] set hardware data breakpoint\n" ""; static void - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at 
http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Extend clear_page by an order parameter
Andrew Morton writes:

> It is, actually, from the POV of the page allocator. It's a "higher order
> page" and is controlled by a struct page*, just like a zero-order page...

So why is the function that gets me one of these "higher order pages" called "get_free_pages" with an "s"? :)

Christoph's patch is bigger than it needs to be because he has to change all the occurrences of clear_page(x) to clear_page(x, 0), and then he has to change a lot of architectures' clear_page functions to be called _clear_page instead. If he picked a different name for the "clear a higher order page" function it would end up being less invasive as well as less confusing.

The argument that clear_page is called that because it clears a higher order page won't wash; all the clear_page implementations in his patch are perfectly capable of clearing any contiguous set of 2^order pages (oops, I mean "zero-order pages"), not just a "higher order page".

Paul.
Re: Extend clear_page by an order parameter
Andrew Morton writes:

> It is, actually, from the POV of the page allocator. It's a "higher order
> page" and is controlled by a struct page*, just like a zero-order page...

OK. I still reckon it's confusing terminology for the rest of us who don't have our heads deep in the page allocator code.

Paul.
Re: Extend clear_page by an order parameter
Christoph Lameter writes:

> clear_page clears one page of the specified order.

Now you're really being confusing. A cluster of 2^n contiguous pages isn't one page by any normal definition. Call it "clear_page_cluster" or "clear_page_order" or something, but not "clear_page".

Paul.
Re: Extend clear_page by an order parameter
Christoph Lameter writes:

> The zeroing of a page of a arbitrary order in page_alloc.c and in hugetlb.c
> may benefit from a clear_page that is capable of zeroing multiple pages at
> once (and scrubd too but that is now an independent patch). The following
> patch extends clear_page with a second parameter specifying the order of
> the page to be zeroed to allow an efficient zeroing of pages. Hope I caught
> everything

Wouldn't it be nicer to call the version that takes the order parameter "clear_pages" and then define clear_page(p) as clear_pages(p, 0)?

Paul.
Re: [PATCH] PPC64: EEH Recovery
Linas Vepstas writes:

> > 2. I don't see why the device nodes for the PCI subtree being reset
> >    would go away, and thus I don't see the need for your eeh_cfg_tree
> >    struct.
>
> It's not the reset, it's the hot-plug remove. The hot plug code assumes
> that you are going to physically remove the device from the slot, so
> it removes the device_node as part of the "unconfig".

OK, I missed that. It seems a bit bogus to me. Could you point me at where in the code this happens?

> > 3. Is there a good reason why we can't use the assigned-addresses
> >    property on the relevant device tree nodes to tell us what to set
> >    the BARs to?
>
> Yes, the reason is that after a reset, that property doesn't hold any
> decent data. I discussed this with the firmware developers, and their
> response was that it is the kernel's responsibility to compute
> (or save/restore) such values. (Except for bridges, which they will do for
> us).

The property not holding any decent data is a consequence of the device nodes getting thrown away, isn't it? I fail to see how resetting the device can of itself affect our copy of the device tree.

> >    In particular I think it should be a
> >    userland write to a sysfs file that kicks off the restart process
> >    rather than it just happening after 5 seconds. Anyway, what
> >    process or thread is executing that 5 second sleep? Is it keventd
> >    or something?
>
> It's a workqueue.

Which gets run in keventd's context. In other words, no other workqueues will get run during the 5 second sleep, or at least not on that cpu.

Paul.
Re: Horrible regression with -CURRENT from "Don't busy-lock-loop in preemptable spinlocks" patch
Ingo Molnar writes:

> * Peter Chubb <[EMAIL PROTECTED]> wrote:
>
> > >> Here's a patch that adds the missing read_is_locked() and
> > >> write_is_locked() macros for IA64. When combined with Ingo's
> > >> patch, I can boot an SMP kernel with CONFIG_PREEMPT on.
> > >>
> > >> However, I feel these macros are misnamed: read_is_locked() returns
> > >> true if the lock is held for writing; write_is_locked() returns
> > >> true if the lock is held for reading or writing.
> >
> > Ingo> well, 'read_is_locked()' means: "will a read_lock() succeed"
> >
> > Fail, surely?
>
> yeah ... and with that i proved beyond doubt that the naming is indeed
> unintuitive :-)

Yes. Intuitively read_is_locked() is true when someone has done a read_lock and write_is_locked() is true when someone has done a write lock.

I suggest read_poll(), write_poll(), spin_poll(), which are like {read,write,spin}_trylock but don't do the atomic op to get the lock, that is, they don't change the lock value but return true if the trylock would succeed, assuming no other cpu takes the lock in the meantime.

Regards,
Paul.
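To make the proposed semantics concrete, here is a toy single-threaded model of read_poll() and write_poll() next to the corresponding trylocks. The struct and its fields are invented for this illustration; there are no atomic operations, so it only shows what the functions would report, not how a real lock would implement them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock state for illustration only (hypothetical, no atomics). */
struct toy_rwlock {
	int readers;	/* number of read holders */
	bool writer;	/* true while write-held */
};

/* True if read_trylock() would succeed now; does not touch the lock. */
static bool read_poll(const struct toy_rwlock *l)
{
	return !l->writer;
}

/* True if write_trylock() would succeed now; does not touch the lock. */
static bool write_poll(const struct toy_rwlock *l)
{
	return !l->writer && l->readers == 0;
}

/* The trylocks differ from the polls only in that they take the lock. */
static bool read_trylock(struct toy_rwlock *l)
{
	assert(l->readers >= 0);
	if (!read_poll(l))
		return false;
	l->readers++;
	return true;
}

static bool write_trylock(struct toy_rwlock *l)
{
	assert(l->readers >= 0);
	if (!write_poll(l))
		return false;
	l->writer = true;
	return true;
}
```

In this model a read-held lock gives read_poll() true and write_poll() false, which matches the "would the trylock succeed" reading of the names.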
Re: [PATCH] raid6: altivec support
David Woodhouse writes:

> Yeah I'm increasingly tempted to merge ppc32/ppc64 into one arch
> like mips/parisc/s390. Or would that get vetoed on the basis that we
> don't have all that horrid non-OF platform support in ppc64 yet, and
> we're still kidding ourselves that all those embedded vendors will
> either not notice ppc64 or will use OF?

I'm going to insist that every new ppc64 platform supplies a device tree. They don't have to have OF but they do need to have the booter or wrapper supply a flattened device tree (which is just a few kB of binary data as far as the booter/wrapper is concerned). It doesn't have to include all the

As for merging ppc32 and ppc64, I think it would end up an awful ifdef mess, but if you can see a clean way to do it, send me a patch. :)

Paul.
Re: [PATCH] PPC64: EEH Recovery
Linas Vepstas writes:

> p.s. It was not clear to me if the EEH patch previously sent
> (6 January 2005, same subject line) will be wending its way into
> the main Torvalds kernel tree, or not. I hadn't really gotten
> confirmation one way or another.

I'm not really totally happy with it yet, on a number of fronts:

1. You're adding more PCI-specific stuff to the device_node struct, which I don't like. I would prefer that the device_node tree contains basically just what we get from OF, and that we have a separate struct for storing ppc64-specific information for each PCI device. Fixing that is outside the scope of your patch, though.

2. I don't see why the device nodes for the PCI subtree being reset would go away, and thus I don't see the need for your eeh_cfg_tree struct.

3. Is there a good reason why we can't use the assigned-addresses property on the relevant device tree nodes to tell us what to set the BARs to?

4. I think the 5 second sleep is quite bogus, and shows that we have the flow of control wrong. In particular I think it should be a userland write to a sysfs file that kicks off the restart process rather than it just happening after 5 seconds. Anyway, what process or thread is executing that 5 second sleep? Is it keventd or something?

5. AFAICS userland will get an unplug notification for the device, but nothing to indicate that is due to an EEH slot isolation event. I think userland should be told about EEH events.

Regards,
Paul.
Re: Horrible regression with -CURRENT from "Don't busy-lock-loop in preemptable spinlocks" patch
Chris Wedgwood writes:

> +#define rwlock_is_write_locked(x) ((x)->lock == 0)

AFAICS on i386 the lock word, although it goes to 0 when write-locked, can then go negative temporarily when another cpu tries to get a read or write lock. So I think this should be ((signed int)(x)->lock <= 0) (or the equivalent using atomic_read).

Paul.
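The point about transient negative values can be shown with plain integer arithmetic. This sketch assumes the biased-counter scheme as I understand it for i386 rwlocks of that era: the word starts at a bias value, a writer subtracts the whole bias, and a reader subtracts 1, with a failed attempt undoing its subtraction afterwards. The bias constant and the helper names below are assumptions made for the illustration, not code taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_RW_LOCK_BIAS 0x01000000	/* assumed bias value */

/* Value of the lock word just after a reader's attempt (before any undo). */
static int after_reader_attempt(int lock)
{
	assert(lock <= SKETCH_RW_LOCK_BIAS);
	return lock - 1;
}

/* Value of the lock word just after a writer's attempt (before any undo). */
static int after_writer_attempt(int lock)
{
	assert(lock <= SKETCH_RW_LOCK_BIAS);
	return lock - SKETCH_RW_LOCK_BIAS;
}

/* The test from the quoted patch. */
static bool write_locked_eq(int lock)
{
	return lock == 0;
}

/* The signed test suggested in the reply. */
static bool write_locked_le(int lock)
{
	return lock <= 0;
}
```

Starting from a write-locked word (0), a contending reader momentarily leaves -1 in the word; the `== 0` test then reports "not write-locked" while the `<= 0` test still reports the lock as held.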
[PATCH] Ioctl compatibility for TIOCMIWAIT and TIOCGICOUNT
This patch lets us use TIOCMIWAIT and TIOCGICOUNT from a 32-bit process on a 64-bit processor. TIOCMIWAIT uses the argument as a bitmap of things to wait for. The argument for TIOCGICOUNT points to a struct serial_icounter_struct, which only contains ints and arrays of int. Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]> diff -urN linux-2.5/include/linux/compat_ioctl.h test/include/linux/compat_ioctl.h --- linux-2.5/include/linux/compat_ioctl.h 2004-11-17 09:38:21.0 +1100 +++ test/include/linux/compat_ioctl.h 2005-01-17 14:25:41.0 +1100 @@ -25,6 +25,8 @@ COMPATIBLE_IOCTL(TIOCLINUX) COMPATIBLE_IOCTL(TIOCSBRK) COMPATIBLE_IOCTL(TIOCCBRK) +ULONG_IOCTL(TIOCMIWAIT) +COMPATIBLE_IOCTL(TIOCGICOUNT) /* Little t */ COMPATIBLE_IOCTL(TIOCGETD) COMPATIBLE_IOCTL(TIOCSETD) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Memory region check in drivers/pcmcia/rsrc_mgr.c
Linus Torvalds writes: > HOWEVER, you can change the resource checking to use the proper "parent > resource" instead of using the root resource. I absolutely agree that > using the root resource is wrong per se - it depends (incorrectly) on the > fact that on all laptops the PCMCIA controller tends to be on the root > bus. I was able to do this more easily than I had expected, and there is a (lightly-tested) patch below for comment and testing. The main thing is that the routines in rsrc_mgr.c now basically need to get a handle for the parent resource for the pcmcia socket controller that we are concerned with at the moment. To get this I use the s->cap.cb_dev field, which AFAICS gets set to the pci_dev for the controller for PCI controllers and should be NULL for ISA controllers. If there is a better way to get hold of the pci_dev let me know. I have added a socket_info_t *s argument to validate_mem, find_io_region and find_mem_region, so that we can get at s->cap.cb_dev. The callers of these routines all have the socket_info_t pointer readily to hand. We could pass in &s->cap or s->cap.cb_dev instead, but passing s seems to be the easiest and most generally useful option. If s or s->cap.cb_dev is NULL the routines fall back to the old behaviour, i.e. using ioport_resource or iomem_resource. Of course it is possible that the ISA memory and I/O space could be a sub-node in the ioport/mem_resource trees, and that we should be using those nodes for ISA pcmcia controllers rather than ioport/mem_resource. If that is so then we need to define new isa_ioport_resource and isa_iomem_resource variables and set them in the architecture-specific PCI code. I also fixed the problem that Jeff Garzik pointed out, which is that the existing code in find_io_region does a check_io_resource followed by a request_region, without checking the return from request_region, which is potentially racy (anyone for an SMP laptop? :). (And find_mem_region does the analogous thing.) 
I replaced the pair of calls with a single call to a new function, request_io_resource, which attempts to allocate the region in the socket controller's parent resource. Similarly there is a new request_mem_resource function used in find_mem_region. > Note that the CardBus side gets this all right - I assume that a 32-bit > CardBus card with a PCI driver should work on your powerbook even without > this patch, no? I assume so, but I don't have any cardbus devices to test it with. Regards, Paul. diff -urN linux/drivers/pcmcia/cistpl.c pmac/drivers/pcmcia/cistpl.c --- linux/drivers/pcmcia/cistpl.c Thu Feb 22 14:25:19 2001 +++ pmac/drivers/pcmcia/cistpl.cSun Jul 8 17:57:37 2001 @@ -264,11 +264,11 @@ (s->cis_mem.sys_start == 0)) { int low = !(s->cap.features & SS_CAP_PAGE_REGS); vs = s; - validate_mem(cis_readable, checksum_match, low); + validate_mem(cis_readable, checksum_match, low, s); s->cis_mem.sys_start = 0; vs = NULL; if (find_mem_region(&s->cis_mem.sys_start, s->cap.map_size, - s->cap.map_size, low, "card services")) { + s->cap.map_size, low, "card services", s)) { printk(KERN_NOTICE "cs: unable to map card memory!\n"); return CS_OUT_OF_RESOURCE; } diff -urN linux/drivers/pcmcia/cs.c pmac/drivers/pcmcia/cs.c --- linux/drivers/pcmcia/cs.c Wed Jul 4 14:33:24 2001 +++ pmac/drivers/pcmcia/cs.cSun Jul 8 17:57:36 2001 @@ -797,7 +797,7 @@ return 1; for (i = 0; i < MAX_IO_WIN; i++) { if (s->io[i].NumPorts == 0) { - if (find_io_region(base, num, align, name) == 0) { + if (find_io_region(base, num, align, name, s) == 0) { s->io[i].Attributes = attr; s->io[i].BasePort = *base; s->io[i].NumPorts = s->io[i].InUse = num; @@ -809,7 +809,7 @@ /* Try to extend top of window */ try = s->io[i].BasePort + s->io[i].NumPorts; if ((*base == 0) || (*base == try)) - if (find_io_region(&try, num, 0, name) == 0) { + if (find_io_region(&try, num, 0, name, s) == 0) { *base = try; s->io[i].NumPorts += num; s->io[i].InUse += num; @@ -818,7 +818,7 @@ /* Try to extend bottom of window */ try 
= s->io[i].BasePort - num; if ((*base == 0) || (*base == try)) - if (find_io_region(&try, num, 0, name) == 0) { + if (find_io_region(&try, num, 0, name, s) == 0) { s->io[i].BasePort = *base = try; s->io[i].NumPorts += num; s->io[i].InUse += num; @@ -1960,7 +1960,7 @@ find_mem_region(&win->base, win->size, align, (req->Attributes & WIN_MAP_BELOW_1MB) || !(s->cap.features & SS_CAP_PAGE_REGS), - (*handle)->dev_info)) + (*handle)->dev_info, s))
Memory region check in drivers/pcmcia/rsrc_mgr.c
In drivers/pcmcia/rsrc_mgr.c, there is code that checks whether a given range of PCI memory addresses is available for the pcmcia code to use. This code uses a macro, check_mem_resource(), to check whether a particular region is available, defined like this:

#define check_mem_resource(b,n) check_resource(&iomem_resource, (b), (n))

This code is now causing me problems on my powerbook because we now register the regions mapped by each PCI host bridge in the iomem_resource structure. The basic problem is that check_resource only checks at the top level of the iomem_resource tree.

I think that we should be using check_mem_region instead, which will descend the tree until it finds out whether the region is actually in use or not. The patch below does this (and makes a similar correction for I/O space).

With this patch applied, the pcmcia stuff works fine on my powerbook, and I end up with something like this in /proc/iomem:

8000-afff : /pci@f200
8000-8007 : Apple Computer Inc. KeyLargo Mac I/O
9000-9fff : PCI CardBus #02
a000-afff : Texas Instruments PCI1211
a0001000-a0001fff : Apple Computer Inc. KeyLargo USB (#2)
a0001000-a0001fff : usb-ohci
a0002000-a0002fff : Apple Computer Inc. KeyLargo USB
a0002000-a0002fff : usb-ohci
a700-a7000fff : card services
b000-bfff : /pci@f000
b000-b0003fff : ATI Technologies Inc Mobility M3 AGP 2x
b000-b0003fff : aty128fb MMIO
b400-b7ff : ATI Technologies Inc Mobility M3 AGP 2x
b400-b7ff : aty128fb FB
f100-f1ff : /pci@f000
f300-f3ff : /pci@f200
f300-f33f : PCI CardBus #02
f500-f5ff : /pci@f400
f500-f5000fff : Apple Computer Inc. UniNorth FireWire
f520-f53f : Apple Computer Inc. UniNorth GMAC

Linus, would you apply this patch to your tree?

Paul.
diff -urN linux/drivers/pcmcia/rsrc_mgr.c pmac/drivers/pcmcia/rsrc_mgr.c --- linux/drivers/pcmcia/rsrc_mgr.c Sat Mar 31 03:06:19 2001 +++ pmac/drivers/pcmcia/rsrc_mgr.c Wed Jun 20 14:25:25 2001 @@ -104,8 +104,8 @@ ==*/ -#define check_io_resource(b,n) check_resource(&ioport_resource, (b), (n)) -#define check_mem_resource(b,n)check_resource(&iomem_resource, (b), (n)) +#define check_io_resource(b,n) check_region((b), (n)) +#define check_mem_resource(b,n)check_mem_region((b), (n)) /*== - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
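The difference between checking only the top level of the resource tree and descending into it can be illustrated with a toy tree. Everything here is hypothetical: the struct, the two functions and the addresses are invented for the example and are not the kernel's struct resource, check_resource or check_mem_region. A non-busy node models a host bridge window; a busy node models a region claimed by a driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy resource node (hypothetical). Ranges are inclusive. */
struct toy_res {
	unsigned long start, end;
	bool busy;			/* claimed by a driver */
	struct toy_res *child, *sibling;
};

static bool overlaps(const struct toy_res *r, unsigned long s, unsigned long e)
{
	return r->start <= e && s <= r->end;
}

/* Top level only: any overlapping child of root counts as a conflict. */
static bool free_top_level(const struct toy_res *root,
			   unsigned long s, unsigned long n)
{
	const struct toy_res *r;
	unsigned long e = s + n - 1;

	assert(n > 0);
	for (r = root->child; r != NULL; r = r->sibling)
		if (overlaps(r, s, e))
			return false;
	return true;
}

/* Descending: step into non-busy windows that contain the whole range. */
static bool free_descend(const struct toy_res *root,
			 unsigned long s, unsigned long n)
{
	const struct toy_res *r;
	unsigned long e = s + n - 1;

	assert(n > 0);
	for (r = root->child; r != NULL; r = r->sibling) {
		if (!overlaps(r, s, e))
			continue;
		if (r->busy)
			return false;		/* really in use */
		if (r->start <= s && e <= r->end)
			return free_descend(r, s, n);
		return false;			/* straddles a window edge */
	}
	return true;
}
```

Once a bridge window is registered at the top level, the top-level check rejects every range inside it, while the descending check rejects only ranges that hit a busy region.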
[PATCH] PPC interrupt mapping fix
Linus, The patch below fixes the interrupt assignments on PPC machines that use Open Firmware, in the case where we have devices behind a PCI-PCI bridge and multiple PCI host bridges. The patch is moderately large because I rewrote the procedure that parsed the open firmware interrupt tree. The previous routine was monolithic and almost unreadable - I wrote a new version which uses several subroutines and should be much more readable. There are also some fixes to allow us to use the interrupt tree on powermacs when booted with BootX, which we couldn't do previously. Please apply to your tree. Paul. diff -urN linux/arch/ppc/kernel/prom.c pmac/arch/ppc/kernel/prom.c --- linux/arch/ppc/kernel/prom.cWed Jul 4 14:33:18 2001 +++ pmac/arch/ppc/kernel/prom.c Wed Jul 4 22:53:29 2001 @@ -116,8 +116,11 @@ unsigned int rtas_size; unsigned int old_rtas; -/* Set for a newworld machine */ +/* Set for a newworld or CHRP machine */ int use_of_interrupt_tree; +struct device_node *dflt_interrupt_controller; +int num_interrupt_controllers; + int pmac_newworld; static struct device_node *allnodes; @@ -1153,7 +1156,19 @@ *prev_propp = PTRUNRELOC(pp); prev_propp = &pp->next; } - *prev_propp = 0; + if (np->node != NULL) { + /* Add a "linux,phandle" property" */ + pp = (struct property *) mem_start; + *prev_propp = PTRUNRELOC(pp); + prev_propp = &pp->next; + namep = (char *) (pp + 1); + pp->name = PTRUNRELOC(namep); + strcpy(namep, RELOC("linux,phandle")); + mem_start = ALIGN((unsigned long)namep + strlen(namep) + 1); + pp->value = (unsigned char *) PTRUNRELOC(&np->node); + pp->length = sizeof(np->node); + } + *prev_propp = NULL; /* get the node's full name */ l = (int) call_prom(RELOC("package-to-path"), 3, 1, node, @@ -1186,19 +1201,46 @@ finish_device_tree(void) { unsigned long mem = (unsigned long) klimit; + struct device_node *np; - /* All newworld machines now use the interrupt tree */ - struct device_node *np = allnodes; - - while(np && (_machine == _MACH_Pmac)) { + /* All newworld 
pmac machines and CHRPs now use the interrupt tree */ + for (np = allnodes; np != NULL; np = np->allnext) { if (get_property(np, "interrupt-parent", 0)) { - pmac_newworld = 1; + use_of_interrupt_tree = 1; break; } - np = np->allnext; } - if ((_machine == _MACH_chrp) || (boot_infos == 0 && pmac_newworld)) - use_of_interrupt_tree = 1; + if (_machine == _MACH_Pmac && use_of_interrupt_tree) + pmac_newworld = 1; + +#ifdef CONFIG_BOOTX_TEXT + if (boot_infos && pmac_newworld) { + prom_print("WARNING ! BootX/miBoot booting is not supported on this +machine\n"); + prom_print(" You should use an Open Firmware bootloader\n"); + } +#endif /* CONFIG_BOOTX_TEXT */ + + if (use_of_interrupt_tree) { + /* +* We want to find out here how many interrupt-controller +* nodes there are, and if we are booted from BootX, +* we need a pointer to the first (and hopefully only) +* such node. But we can't use find_devices here since +* np->name has not been set yet. -- paulus +*/ + int n = 0; + char *name; + + for (np = allnodes; np != NULL; np = np->allnext) { + if ((name = get_property(np, "name", NULL)) == NULL + || strcmp(name, "interrupt-controller") != 0) + continue; + if (n == 0) + dflt_interrupt_controller = np; + ++n; + } + num_interrupt_controllers = n; + } mem = finish_node(allnodes, mem, NULL, 1, 1); dev_tree_size = mem - (unsigned long) allnodes; @@ -1240,9 +1282,8 @@ if (ifunc != NULL) { mem_start = ifunc(np, mem_start, naddrc, nsizec); } - if (use_of_interrupt_tree) { + if (use_of_interrupt_tree) mem_start = finish_node_interrupts(np, mem_start); - } /* Look for #address-cells and #size-cells properties. */ ip = (int *) get_property(np, "#address-cells", 0); @@ -1298,141 +1339,210 @@ return mem_start; } -/* This routine walks the interrupt tree for a given device node and gather - * all necessary informations according to the draft interrupt mapping - * for CHRP. The current version was only tested on Apple "Core99" machines - * and may not handle cascaded controllers correctly. 
+/* + * Find the interrupt pare
[PATCH] fix drivers/usb/scanner.c ioctl return
The following patch corrects the return value from the ioctl function in the USB scanner code, in the case where the ioctl is unrecognized.

Linus, please apply.

Paul.

diff -urN linux/drivers/usb/scanner.c pmac/drivers/usb/scanner.c
--- linux/drivers/usb/scanner.c Sat Apr 28 23:02:49 2001
+++ pmac/drivers/usb/scanner.c Thu Jun 28 17:28:25 2001
@@ -909,7 +909,7 @@
 		return result;
 	}
 	default:
-		return -ENOIOCTLCMD;
+		return -ENOTTY;
 	}
 	return 0;
 }
drivers/ide/sl82c105.c
I am wondering who maintains drivers/ide/sl82c105.c, and who sent in the recent changes to it. We now have, at around line 278, this code:

unsigned int pci_init_sl82c105(struct pci_dev *dev, const char *msg)
{
	return ide_special_settings(dev, msg);
}

The call to ide_special_settings gives a link error because ide_special_settings is not exported from drivers/ide/ide-pci.c. I can't see what the point of calling it is anyway, even if it were exported, since ide_special_settings consists of a switch statement on the device ID and none of the cases will match.

Paul (who uses sl82c105.c on his longtrail PPC CHRP box).
[PATCH] fix compile error in usb-ohci.c
The following patch fixes a trivial error in drivers/usb/usb-ohci.c, where a missing argument to ohci_pci_suspend will cause a compile error if you have powerbook support enabled.

Linus, please apply.

Paul.

diff -urN linux/drivers/usb/usb-ohci.c pmac/drivers/usb/usb-ohci.c
--- linux/drivers/usb/usb-ohci.c Wed Jul 4 14:33:36 2001
+++ pmac/drivers/usb/usb-ohci.c Fri Jul 6 16:20:58 2001
@@ -2749,7 +2749,7 @@
 	switch (when) {
 	case PBOOK_SLEEP_NOW:
-		ohci_pci_suspend (ohci->ohci_dev);
+		ohci_pci_suspend (ohci->ohci_dev, 3);
 		break;
 	case PBOOK_WAKE:
 		ohci_pci_resume (ohci->ohci_dev);
[PATCH] fix compile error in imsttfb.c
As it currently stands, drivers/video/imsttfb.c will give a compile error if FBCON_HAS_CFB32 is defined. This patch fixes that.

There used to be a declaration of `i' which was only used if FBCON_HAS_CFB32 was defined. I suspect that somebody was compiling without FBCON_HAS_CFB32 and saw an unused variable warning from gcc and decided to take out the declaration. This patch will avoid that warning.

Linus, please apply.

Paul.

diff -urN linux/drivers/video/imsttfb.c linuxppc_2_4/drivers/video/imsttfb.c
--- linux/drivers/video/imsttfb.c Thu Jul 5 14:46:16 2001
+++ linuxppc_2_4/drivers/video/imsttfb.c Thu Jul 5 10:58:09 2001
@@ -1278,10 +1278,11 @@
 		break;
 #endif
 #ifdef FBCON_HAS_CFB32
-	case 32:
-		i = (regno << 8) | regno;
+	case 32: {
+		int i = (regno << 8) | regno;
 		p->fbcon_cmap.cfb32[regno] = (i << 16) | i;
 		break;
+	}
 #endif
 	}
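The idiom used in the patch, a braced block after the case label so that the temporary lives only inside that case, can be tried in isolation. The function below is a hypothetical stand-in written for this example, not the driver's code; it uses an unsigned type for the temporary to keep the shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the 32bpp colormap word for regno,
 * i.e. regno replicated into all four bytes. */
static uint32_t cmap_entry(int bpp, unsigned int regno)
{
	assert(regno < 256);
	switch (bpp) {
	case 32: {
		/* i is scoped to this case, so nothing is left unused
		 * if the case is compiled out by an #ifdef. */
		uint32_t i = (regno << 8) | regno;
		return (i << 16) | i;
	}
	default:
		return 0;
	}
}
```

For regno 0x12 the intermediate value is 0x1212 and the result is 0x12121212.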
RE: Trouble Booting Linux PPC On Mac G4 2000
Tim McDaniel writes:

> We are having a great degree of difficulty getting Linux PPC2
> running on a Mac G4 466 tower with 128MB of memory, One 30MB HD and one
> CR RW. This is not a NuBus based system. To the best of our knowledge we
> have followed the user manual to the tee, and even tried forcing video
> settings at the Xboot screen.

One possible problem is that many Apple monitors only work at a fixed horizontal frequency. The Apple Studio 17 monitor (with the transparent case) that I use with my G4 cube is like that: it will only operate at horizontal scan rates between 79 and 82 kHz. If the kernel video driver chooses a video mode with a scan rate outside that range the screen goes black. So I have to put

    video=aty128fb:vmode:20

on the kernel command line to avoid that. (It would be nice if the kernel driver did DDC but it doesn't.)

Other than that, you might get more useful suggestions if you ask on the [EMAIL PROTECTED] mailing list.

Paul.
Re: floating point problem
[EMAIL PROTECTED] writes: > In Linux PPC, the MSR[FP] bit (that is floating point available bit) is off > (atleast for non-SMP). > > Due to this, whenever some floating point instruction is executed in 'user > mode', it leads to a exception 'FPUnavailable'. The exception handler for Yes, this is so that we don't have to save and restore the floating point registers on every context switch. > this exception apart from setting the MSR[FP] bit, also sets the MSR[FE0] > and MSR[FE1] bits. These bits basically enables the floating point > exceptions so that if there are some floating point exception conditions > encountered while exeuting a floating point instruction, an appropriate > exception is raised. You have control at user-level over whether the cpu will take an exception (leading to a SIGFPE signal) or not by means of the FPSCR register. The VE, OE, UE, ZE and XE bits in the FPSCR control whether the cpu will take an exception on floating-point invalid operation, overflow, underflow, divide by zero and inexact result respectively. If the kernel cleared the FE0 and FE1 bits, there would be no way for an application to get a signal when a floating-point error occurred. With FE0 and FE1 set, the application can control this using the FPSCR, and get a signal, or not, as it prefers. > But whenever some floating point instruction is executed in 'kernel mode', > 'FPUnavailabe' exception handler code does not set the 'MSR[FE0] and > MSR[FE1]' bits. Floating point is not intended to be used in the kernel except in a couple of specific places. > Problem is that we want to get the good results without changing the > kernel. Either by having the user mode application to interact with some > special module which can set the MSR[FP] bit before we execute the floating > point instruction or by some other trick.Is there any solution apart > from changing the kernel? Clear the appropriate bits in the FPSCR. 
There is almost certainly a glibc interface to do this but I don't know what it would be. Paul.
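As a rough illustration of "clear the appropriate bits in the FPSCR", here is a userspace sketch that only manipulates a copy of the register value. The mask values follow the PowerPC bit numbering (VE, OE, UE, ZE, XE are FPSCR bits 24 to 28, counting from the most significant bit); the macro and function names are invented for this sketch. It does not touch the real register; on a glibc system the GNU extensions feenableexcept() and fedisableexcept() from <fenv.h> are, as far as I know, the usual way to do that.

```c
#include <stdint.h>

/* FPSCR exception-enable bits expressed as 32-bit masks.
 * Names are local to this sketch, not taken from any header. */
#define SK_FPSCR_VE 0x00000080u  /* invalid operation enable */
#define SK_FPSCR_OE 0x00000040u  /* overflow enable */
#define SK_FPSCR_UE 0x00000020u  /* underflow enable */
#define SK_FPSCR_ZE 0x00000010u  /* divide-by-zero enable */
#define SK_FPSCR_XE 0x00000008u  /* inexact enable */
#define SK_FPSCR_ENABLES \
    (SK_FPSCR_VE | SK_FPSCR_OE | SK_FPSCR_UE | SK_FPSCR_ZE | SK_FPSCR_XE)

/* Return a copy of fpscr with the requested enable bits cleared, so the
 * corresponding conditions would no longer raise an exception.
 * Bits outside the enable field are never modified. */
static uint32_t fpscr_disable_traps(uint32_t fpscr, uint32_t which)
{
    return fpscr & ~(which & SK_FPSCR_ENABLES);
}
```

For example, clearing only SK_FPSCR_ZE leaves the other four enables as they were, which matches the point above that the application, not the kernel, chooses which conditions produce a SIGFPE.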
[PATCH] fix typo in 2.4.6 for PPC
The patch below fixes a typo in the PowerPC code in 2.4.6. Without this change, people attempting to compile a kernel for a powermac will get a compile error.

Paul.

diff -urN linux/arch/ppc/kernel/pmac_pci.c linuxppc_2_4/arch/ppc/kernel/pmac_pci.c
--- linux/arch/ppc/kernel/pmac_pci.c Tue Jul 3 13:38:19 2001
+++ linuxppc_2_4/arch/ppc/kernel/pmac_pci.c Tue Jul 3 15:00:40 2001
@@ -249,7 +249,7 @@
  out_le32(bp->cfg_addr, (1UL << BANDIT_DEVNUM) + PCI_VENDOR_ID);
  udelay(2);
  vendev = in_le32((volatile unsigned int *)bp->cfg_data);
- if (vendev == (PCI_VENDOR_ID_APPLE_BANDIT << 16) +
+ if (vendev == (PCI_DEVICE_ID_APPLE_BANDIT << 16) +
    PCI_VENDOR_ID_APPLE) {
   /* read the revision id */
   out_le32(bp->cfg_addr,
Re: about kmap_high function
Stephen C. Tweedie writes: > On Tue, Jul 03, 2001 at 10:47:20PM +1000, Paul Mackerras wrote: > > On PPC it is a bit different. Flushing a single TLB entry is > > relatively cheap - the hardware broadcasts the TLB invalidation on the > > bus (in most implementations) so there are no cross-calls required. But > > flushing the whole TLB is expensive because we (strictly speaking) > > have to flush the whole of the MMU hash table as well. > > How much difference is there? Between flushing a single TLB entry and flushing the whole TLB, or between flushing a single entry and flushing a range? Flushing the whole TLB (including the MMU hash table) would be extremely expensive. Consider a machine with 1GB of RAM. The recommended MMU hash table size would be 16MB (1024MB/64), although we generally run with much less, maybe a quarter of that. That's still 4MB of memory we have to scan through in order to find and clear all the entries in the hash table, which is what would be required for flushing the whole hash table. What we do at present is (a) have a bit in the linux page tables which indicates whether there is a corresponding entry in the MMU hash table and (b) only flush the kernel portion of the address space (0xc000 - 0x) in flush_tlb_all(). We have a single page table tree for kernel addresses, shared between all processes. That all helps but we still have to scan through all the page table pages for kernel addresses to do a flush_tlb_all(). I just did some measurements on a 400MHz POWER3 machine with 1GB of RAM. This is a 64-bit machine but running a 32-bit kernel (so both the kernel and userspace run in 32-bit mode). It is a 1-cpu machine and I am running an SMP kernel with highmem enabled, with 512MB of lowmem and 512MB of highmem. The MMU hash table is 4MB. The time taken inside a single flush_tlb_page call depends on whether the linux PTE indicates that there is a hardware PTE in the hash table. 
If not, it takes about 110ns, if it does, it takes 1us (I measured 998.5ns but I rounded it :). A call to flush_tlb_range for 1024 pages from flush_all_zero_pkmaps (replacing the flush_tlb_all call) takes around 1080us, which is pretty much linear. The time for flush_tlb_page was measured inside the procedure whereas the time for flush_tlb_range was measured in the caller, so the flush_tlb_range number includes procedure call and loop overhead which the flush_tlb_page number doesn't. I expect that almost all the PTEs in the pkmap range would have a corresponding hash table entry, since we would almost always touch a page that we have kmap'd. > We only flush once per kmap sweep, and > we have 1024 entries in the global kmap pool, so the single tlb flush > would have to be more than a thousand times less expensive overall > than the global flush for that change to be worthwhile. The time for doing a flush_tlb_all call in flush_all_zero_pkmaps was 3280us. That is for the version which only flushes the kernel portion of the address space. Just doing a memset to 0 on the hash table takes over 11ms (the memset goes at around 360MB/s but there is 4MB to clear). Clearing out the hash table properly would take much longer since you are supposed to synchronize with the hardware when changing each entry in the hash table and the memset is certainly not doing that. So yes, the ratio is more than 1024 to 1. > If the page flush really is _that_ much faster, then sure, this > decision can easily be made per-architecture: the kmap_high code > already has all of the locking and refcounting to know when a per-page > tlb flush would be safe. My preference would be for architectures to be able to make this decision. I don't mind whether it is a flush call per page inside the loop in flush_all_zero_pkmaps or a flush_tlb_range call at the end of the loop. 
I counted the average number of pages needing to be flushed in the loop in flush_all_zero_pkmaps - it was 1023.9 for the workload I was using, which was a kernel compile. Using flush_tlb_range would be fine on PPC but as I noted before some architectures assume that flush_tlb_range is only used on user addresses at the moment. Paul.
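The figures quoted in this message can be checked with a few lines of arithmetic. The sketch below only restates numbers given above (1GB of RAM, a 4MB hash table, memset at about 360MB/s, about 1us per flush_tlb_page, 1080us for a 1024-page flush_tlb_range, 3280us for flush_tlb_all); the constant and function names are invented, and reading "the ratio" as flush_tlb_all time over single-page flush time is my interpretation.

```c
/* Figures quoted in the message; nothing here is newly measured. */
#define SK_RAM_MB              1024.0
#define SK_HASH_TABLE_MB          4.0
#define SK_MEMSET_MB_PER_S      360.0
#define SK_FLUSH_PAGE_US          1.0   /* PTE present in hash table */
#define SK_FLUSH_RANGE_1024_US 1080.0
#define SK_FLUSH_ALL_US        3280.0

/* Recommended hash table size: RAM / 64. */
static double recommended_hash_mb(double ram_mb)
{
    return ram_mb / 64.0;
}

/* Time in milliseconds to memset a table of the given size. */
static double memset_ms(double table_mb, double mb_per_s)
{
    return table_mb / mb_per_s * 1000.0;
}

/* How many single-page flushes cost the same as one flush_tlb_all. */
static double flush_all_vs_page(void)
{
    return SK_FLUSH_ALL_US / SK_FLUSH_PAGE_US;
}

/* Per-page cost of the 1024-page flush_tlb_range call, in microseconds. */
static double range_us_per_page(void)
{
    return SK_FLUSH_RANGE_1024_US / 1024.0;
}
```

Under these assumptions the recommended table size comes out at 16MB, the memset at a little over 11ms, and the per-page cost of the range flush at just over 1us, which is consistent with the "pretty much linear" remark.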
Re: virt_to_bus and virt_to_phys on Apple G4 target
[EMAIL PROTECTED] writes:

> I am running linux 2.4.2 on Apple G4 machine. I think the 'PCI bus
> addresses' and 'physical addresses' are same on this architecture. I

They are the same on an Apple G4 but not necessarily on other PowerPC machines. It depends on the PCI host bridge implementation.

> expected the two be different but according to asm/io.h 'virt_to_bus(addr)
> = virt_to_phys(addr) + PCI_DRAM_OFFSET'. I printed the value of
> 'PCI_DRAM_OFFSET' and that come out to be zero. Is this correct?

Yes, for an Apple G4.

> If I somehow get the physical address of a user space buffer in a module
> and take this as a PCI bus address, will I be able to do DMA properly?

Yes, on an Apple G4. If you use virt_to_bus then it should work on all PowerPC machines that I know of (that run 32-bit PPC/Linux). But as Dave points out, you should use the interfaces described in Documentation/DMA-mapping.txt instead if at all possible. It's quite possible that virt_to_bus will be removed during 2.5.x development.

Paul.
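The relationship quoted from asm/io.h can be written out as a small userspace model. This is only a sketch: the _sketch names are invented, the kernel base of 0xc0000000 is an assumption about a typical 32-bit PPC configuration, and the real virt_to_phys only has this simple form for directly mapped kernel memory.

```c
#include <stdint.h>

/* Assumed start of the kernel's direct mapping on 32-bit PPC. */
#define KERNELBASE_SKETCH 0xc0000000u

/* Model of virt_to_phys for directly mapped kernel addresses. */
static uint32_t virt_to_phys_sketch(uint32_t vaddr)
{
    return vaddr - KERNELBASE_SKETCH;
}

/* Model of virt_to_bus: physical address plus the host bridge's
 * DRAM offset.  The offset is a parameter here so that both the
 * Apple G4 case (zero) and a non-zero bridge can be tried. */
static uint32_t virt_to_bus_sketch(uint32_t vaddr, uint32_t pci_dram_offset)
{
    return virt_to_phys_sketch(vaddr) + pci_dram_offset;
}
```

With an offset of zero, as reported for the G4, the bus address equals the physical address; with a hypothetical bridge offset of 0x80000000 the two differ, which is why code that assumes they match is not portable.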
Re: readl() / writel() on PowerPC
David T Eger writes:

> Am I missing something? Is there some reason that readl() and
> writel() should byte-swap by default?

readl()/writel() are defined to access PCI memory space in units of 32 bits. PCI is by definition little-endian, PowerPC is (natively at least) big-endian, hence the byte-swap. Same for inl/outl etc., but not insl/outsl - they don't swap because they are typically used for transferring arrays of bytes, just doing it 4 bytes at a time (2 at a time for insw/outsw).

You can use __raw_readl/__raw_writel if you don't want byte-swapping, but they also don't give you any barriers. Thus if you do

    __raw_writel(v, addr);
    x = __raw_readl(addr);

it is quite possible for the read to hit the device before the write. If you want to prevent that you need to put an iobarrier_rw() call in between the write and the read. You don't need a barrier between successive writes unless you want to prevent any potential store-gathering from happening, because PowerPCs don't reorder writes to I/O regions.

Paul.
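To make the byte-swap concrete, here is a small portable sketch of what the swap amounts to. The names swab32_sketch and le32_load_sketch are invented for this example and are not the kernel's own helpers; the second function assembles a little-endian 32-bit value byte by byte, so it gives the same answer on big-endian and little-endian hosts.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swab32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) <<  8) |
           ((v & 0x00ff0000u) >>  8) |
           ((v & 0xff000000u) >> 24);
}

/* Read a 32-bit little-endian quantity (as a PCI device would present
 * it) from a byte buffer, independent of the host's endianness. */
static uint32_t le32_load_sketch(const volatile uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] <<  8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

A device register holding the bytes 78 56 34 12 in increasing address order reads back as 0x12345678 through le32_load_sketch, whereas a raw native load on a big-endian CPU would see 0x78563412 and need the swap.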
Re: about kmap_high function
Stephen C. Tweedie writes: > kmap_high is intended to be called routinely for access to highmem > pages. It is coded to be as fast as possible as a result. TLB > flushes are expensive, especially on SMP, so kmap_high tries hard to > avoid unnecessary flushes. The code assumes that flushing a single TLB entry is expensive on SMP, while flushing the whole TLB is relatively cheap - certainly cheaper than flushing several individual entries. And that assumption is of course true on i386. On PPC it is a bit different. Flushing a single TLB entry is relatively cheap - the hardware broadcasts the TLB invalidation on the bus (in most implementations) so there are no cross-calls required. But flushing the whole TLB is expensive because we (strictly speaking) have to flush the whole of the MMU hash table as well. The MMU gets its PTEs from a hash table (which can be very large) and we use the hash table as a kind of level-2 cache of PTEs, which means that the flush_tlb_* routines have to flush entries from the MMU hash table as well. The hash table can store PTEs from many contexts, so it can have a lot of PTEs in it at any given time. So flushing the whole TLB would imply going through every single entry in the hash table and clearing it. In fact, currently we cheat - flush_tlb_all actually only flushes the kernel portion of the address space, which is all that is required in the three places where flush_tlb_all is called at the moment. This is not a criticism, rather a request that we expand the interfaces so that the architecture-specific code can make the decisions about when and how to flush TLB entries. For example, I would like to get rid of flush_tlb_all and define a flush_tlb_kernel_range instead. In all the places where flush_tlb_all is currently used, we do actually know the range of addresses which are affected, and having that information would let us do things a lot more efficiently on PPC. 
On other platforms we could define flush_tlb_kernel_range to just flush the whole TLB, or whatever. Note that there is already a flush_tlb_range which could be used, but some architectures assume that it is only used on user addresses. Regards, Paul.
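The proposal can be sketched as two interchangeable implementations of one interface. Everything below is a userspace mock under stated assumptions: the _mock functions just count calls, the 4096-byte page size is assumed, and neither flush_tlb_kernel_range variant is the kernel's actual code. It only shows how an architecture could pick either a whole-TLB flush or a per-page loop behind the same call.

```c
/* Assumed page size for the sketch. */
#define PAGE_SIZE_SKETCH 4096ul

/* Call counters standing in for the real (expensive) operations. */
static unsigned long full_flushes;
static unsigned long page_flushes;

static void flush_tlb_all_mock(void)
{
    full_flushes++;
}

static void flush_tlb_page_mock(unsigned long addr)
{
    (void)addr;
    page_flushes++;
}

/* Variant for platforms where a full flush is cheap: ignore the range. */
static void flush_tlb_kernel_range_generic(unsigned long start,
                                           unsigned long end)
{
    (void)start;
    (void)end;
    flush_tlb_all_mock();
}

/* Variant for platforms where single-entry flushes are cheap:
 * walk the range [start, end) one page at a time. */
static void flush_tlb_kernel_range_perpage(unsigned long start,
                                           unsigned long end)
{
    unsigned long addr;

    for (addr = start; addr < end; addr += PAGE_SIZE_SKETCH)
        flush_tlb_page_mock(addr);
}
```

The caller passes the range it already knows in both cases; only the architecture-specific side decides whether that information is used.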
hang from HUP'ing init in linuxrc
Recently I tried running the Debian installer on top of a 2.4.6-pre6 kernel. It got up to the point of installing libc and then the system hung. It was still taking interrupts (I could change vt's, etc.) but no user processes were running.

What was happening was rather interesting. The init process was stuck inside prepare_namespace(), in the while loop here (this is lines 749 - 751 of init/main.c):

        pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
        if (pid>0)
                while (pid != wait(&i));

The installer had sent a HUP signal to init. The init process thus had current->sigpending == 1. When it called wait, it got down into sys_wait4 which worked out that there were children but none were zombies, and at that point it would normally sleep, but because there were signals pending, it returned -ERESTARTSYS. Now, on the way out from the system call, the kernel noticed that it was returning to kernel mode and thus didn't deliver any signals, and sigpending stayed at 1. Thus the system was sitting in a tight loop calling wait() over and over again in kernel mode in the init process.

This was on PPC. I had a look at the i386 code and AFAICS it will do the same thing. The check for whether we are returning to user mode is in do_signal there (whereas PPC does the check in entry.S) but the net effect in both cases is that we don't execute the main body of do_signal when we are returning from a syscall from a process running in kernel mode.

I'm not sure what the best way to fix this is. The problem would crop up whenever we have a kernel thread which wants to wait for a child process. I don't think we want to start delivering signals to kernel threads in the same way that we do to usermode processes though. Any suggestions?

Paul.
Re: Cosmetic JFFS patch.
Cort Dougan writes:

> Can we then expect to see all mention of authors in drivers disappear from
> the boot?

I think we'll either see a lot more or a lot less. In my example I would have had no particular problem with a message saying "PPP driver copyright Al Longyear and Michael Callahan" or whatever. What annoyed me was the noisy copyright message about something that was only 20 or 30 lines of code, and not especially clever code at that.

If copyright messages on boot are the way we get credit for the work we've done, then I have a few to add myself. :) My personal preference is for a quieter boot, with basically no copyright messages. It's Linus' call though.

> Same with url's, version #'s and the like?

See all the previous messages in this thread. :)

> The built by
> user@host message is a good bit of "drumming ones own drum" while
> contributing very little (running 'make' vs. writing the system).

Isn't that more a "who to blame" than credit?

Paul.
Re: Cosmetic JFFS patch.
Linus Torvalds writes:

> There's another side to "drumming your own drum": it is often seen as
> actively offensive to some people who don't want to do the same thing.

I agree. What usually seems to end up happening is that someone writes 95% and gets no credit, someone else does 5% and puts in a printk announcing their contribution loudly every time the system boots.

I recall that the old PPP driver used to print "PPP Dynamic channel allocation code copyright 1995 Caldera, Inc." which always annoyed me because it was a completely trivial piece of code that the notice was referring to.

Paul.
Re: softirq in pre3 and all linux ports
Andrea Arcangeli writes:

> We should release the stack before running the softirq (some place uses
> softirqs to release the stack and avoid overflows).

Well if they are relying on having a lot of stack available then those places are buggy. Once the softirq is made pending it can run at any time that interrupts are enabled. You can't rely on a softirq handler having any more stack available than a hard interrupt handler has.

> ip + tcp are more intensive than just queueing a packet in a blacklog.
> That's why they're not done in irq context in first place.

Ah, ok, I misunderstood, I thought you were saying that the softirq framework itself had a lot of overhead.

> I don't have gigabit ethernet so I cannot flood my boxes to death.
> But I think it's real, and a softirq marking itself runnable again is
> another case to handle without live lockups or starvation.

As for the gigabit ethernet case, if we are having packets coming in and generating hard interrupts at that sort of a rate then what we really need is the sort of interrupt throttling that Jamal talked about at the 2.5 kernel kickoff.

It seems to me that possibly softirqs are being used in some places where a kernel thread would be more appropriate. Instead of making softirqs use a kernel thread, I think it would be better to find the places that should use a thread and make them do so. Softirqs are still after all interrupt handlers (ones that run at a lower priority than any hardware interrupt) and should be treated as such.

Paul.
Re: softirq in pre3 and all linux ports
Andrea Arcangeli writes:

> With pre3 there are bugs introduced into mainline that are getting
> extended to all architectures.
>
> First of all nucking the handle_softirq from entry.S is wrong. ppc
> copied without thinking and we'll need to resurrect it too for example

Well, I object to the "without thinking" bit. It seems to me that code that raises a softirq without having either hard interrupts or BHs disabled is buggy - why would you want to do that? And if we do want to allow that, shouldn't we put the check in raise_softirq or the equivalent, to get the minimum latency?

> Fourth if the tasklet or softirq or bottom half hander is been marked
> running again because of another even (like a nested irq) the kernel can
> starve userspace too. (softirqs are much heavier than the irq handler so
> it can also live lockup much more easily this way)

Soft irqs should definitely not be much heavier than an irq handler; if they are then we have implemented them wrongly somehow.

> So I recommend Linus merging this patch that fixes all the above
> mentioned bugs (the anti starvation/live lockup logic is called
> ksoftirqd):

ksoftirqd seems like the wrong solution to the problem to me. If we really are getting starved by softirqs then we need to look at whether whatever is doing it should be a kernel thread itself rather than doing it in softirqs.

Do you have a concrete example of the starvation/live lockup that you can describe to us?

Regards, Paul.
Re: Linux/PPC maintainer changing
Cort has put an enormous amount of time and effort into maintaining the PowerPC port of Linux over the past 5 or 6 years, and I for one would like to acknowledge that publicly and thank him for that. It has not always been an easy task, I know, because there is a wide range of opinions within the PPC/Linux camp and Cort has been the man on the spot to sort out the balance between the competing interests. And I for one will miss the time, effort and resources he has put into the infrastructure things such as the repository, web pages, ftp site etc.

I would also like to thank FSM Labs for contributing the space and bandwidth for the PPC/Linux repository over the last couple of years.

Paul.
Re: any good diff merging utility?
Ivan Vadovic writes: > Well, are there any utilities to merge diffs? I couldn't find any on freshmeat. > So what are you using to stack many patches onto the kernel tree? Just manualy > modify the diff? I'll try to write something more automatic if nothing comes up. Try dirdiff - ftp://ftp.samba.org/pub/paulus/dirdiff-1.2.tar.gz. I use it all the time for merging in changes between Linus' official tree, my own development tree, and the PPC/Linux bitkeeper trees. Dirdiff is a tcl/tk-based utility for graphically displaying the difference between directory trees. It can handle from 2 to 5 trees. It displays a main window where it shows which files are different. You can select a file and get it to show the diffs between that file in any two of the directory trees. This comes up in another window in a format like a unified diff but with the background of the line colored according to which file it comes from. You can also copy files between trees with a menu item - in fact you can select whole groups of files to be copied. And you can use it to generate patches too. :) Once you have the differences between two versions of a file displayed, you can do a merge between the two versions. Each line of differences has a little check box beside it. If you check the box it means you want to make that change (right-click or shift-click selects a whole group of boxes). When you have checked all the boxes you want you select an item from the merge menu to say which tree you want to update. The new version of the file comes up in an edit window and you can check it, make any further changes you want, etc. Then you can either save the result or close the window (discarding the merge). It's hard to explain in words everything about how it works and how you use it. It isn't really a utility to merge diffs but it is very useful in tracking and merging changes between several large source trees. 
I find it particularly useful because I am usually interested only in a subset of the files (i.e. particularly arch/ppc and include/asm-ppc). So when Linus releases a new pre-patch, I update my "official Linus source" tree and do another dirdiff. If there are changes to files under fs/ for instance, I just select all of them and copy them over to my tree without looking at the diffs. If there are changes in arch/i386 for instance, I look at the diff to see if I am going to need to make a similar change in arch/ppc. Regards, Paul.
Re: Inconsistent "#ifdef __KERNEL__" on different architectures
Adrian Bunk writes:

> Whatever the right policy is, the main concern in my initial mail was the
> _consistency_ of the kernel headers between different architectures.
> So when you want to flush out these programs I see no reason to
> inconsistetly change it only on one architecture.

Different architectures are maintained by different people who have different perspectives on things. The only thing you have any right to expect any consistency in is the kernel API, and even there things like error numbers etc. differ between architectures.

If you want consistency, you would either have to persuade Linus to issue an edict or else persuade every single architecture maintainer to do things the same way. But if the motivation is to make it easier for user-level programs to use things which are not intended to be exported to userspace, then all you will achieve is that we will make sure that you can't use those things from userspace. And this definitely includes things like atomics, bitops, memory barriers etc. Take a copy by all means but don't rely on the kernel definitions for your userspace programs.

It is the policy for all architectures that kernel headers should not be used in userspace programs. The "inconsistency" that you are complaining about is only a difference in the extent to which this policy is enforced.

Paul.
Re: Inconsistent "#ifdef __KERNEL__" on different architectures
Adrian Bunk writes:

> (my main concern wasn't whether the "#ifdef __KERNEL__" is correct or not
> but I was wondering whether there's a reason why it's different on
> different architectures)

The only valid reason for userspace programs to be including kernel headers is to get definitions that are part of the kernel API. (And in fact others here will go further and assert that there are *no* valid reasons for userspace programs to include kernel headers.)

If you want some atomic functions or whatever for your userspace program and the ones in the kernel look like they would be useful, then take a copy of the relevant kernel code if you like, but don't include the kernel headers directly. If you do, you will get bitten at some point in the future when we decide to change some internal implementation detail in the kernel, and your program suddenly won't compile any more.

This is why I added #ifdef __KERNEL__ around most of the contents of include/asm-ppc/*.h. It was done deliberately to flush out those programs which are depending on kernel headers when they shouldn't.

Paul.
Re: SyncPPP Generic PPP merge
Jeff Mcadams writes:

> Indeed. And let me just throw out another thought. A clean abstraction
> of the various portions of the PPP functionality is beneficial in other
> ways. My personal pet project being to add L2TP support to the kernel
> eventually. A good abstraction of the framing capabilities and basic
> PPP processing would be rather useful in that project.

That is exactly what ppp_generic.c is intended to do - it abstracts out the framing and encapsulation and low-level transport of PPP frames into ppp "channels" (see for example ppp_async.c, ppp_synctty.c) while ppp_generic.c does the basic PPP processing (compression, multilink, handling the network interface device etc.).

You should be able to write an L2TP channel to work with ppp_generic - all your code would need to know about is how to take a PPP frame and encapsulate and send it, and how to receive and decapsulate PPP frames.

[Note to myself: send in a Documentation/ppp_generic.txt which describes the interface between ppp_generic.c and the channels.]

> I would agree that such a project would be 2.5 material.

Do it today if you like. I can't see that adding a new PPP channel could break anything else; it would be like adding a new driver.

Paul.
Re: PATCH: New iSeries Device Drivers (small update)
Alan Cox writes:

> I was ignoring them because I think they should come via the PPC maintainers

It's OK Alan, Tom is one of the maintainers for Linux on i-Series (AS/400) machines (we just haven't got around to sending the patch to the MAINTAINERS file yet). Cort and Tom and I are discussing how best to merge in the i-Series support into arch/ppc and include/asm-ppc but these drivers can go in as far as I am concerned (and AFAIK Cort agrees).

Regards, Paul.
Re: add page argument to copy/clear_user_page
Linus Torvalds writes: > > As for the `to' argument, yes it is redundant since it is just kmap(page). > > And why not let "clear_page()" just do that itself? OK, here's a patch that does that. > The thing is, copy/clear_page shouldn't exist at all (or rather, the > "highpage" versions should be renamed to the non-highpage names, because > the non-highmem case simply isn't interesting any more). Each architecture already had a clear_page that was functionally equivalent to memset(p, 0, PAGE_SIZE), but often in assembler, and likewise a copy_page that was equivalent to memcpy(d, s, PAGE_SIZE). So I renamed all the existing clear_page's and copy_page's to __clear_page and __copy_page (since they are "lower-level" or "raw" clear/copy page routines). In highmem.h I have renamed copy_highpage to copy_page and clear_highpage to clear_page. I also have default versions of copy_user_page and clear_user_page which just do copy_page/clear_page for those architectures that don't have any cache issues to deal with. Architectures can define __HAVE_ARCH_USER_PAGE in asm/page.h and then define their own copy/clear_user_page routines if they want to. I have fixed up all the architectures except sparc64. There the copy/clear_user_page routines are in assembler and my sparc assembler is pretty rusty these days (particularly when DaveM goes doing hairy things with the %g registers :). I'll let Dave fix that one up; the change is that copy/clear_user_page take page * arguments instead of void * arguments. This patch is a fair bit bigger than the last one, but most of the bulk is just the renaming of clear_page to __clear_page and copy_page to __copy_page. I also renamed memclear_highpage to memclear_page (which isn't actually used anywhere) and memclear_highpage_flush to memclear_page_flush. Let me know what you think of this; if it's OK, could you apply it to your tree? Thanks, Paul. 
diff -urN linux/Documentation/cachetlb.txt linux.new/Documentation/cachetlb.txt --- linux/Documentation/cachetlb.txtSat Mar 31 03:05:54 2001 +++ linux.new/Documentation/cachetlb.txtWed May 23 20:48:38 2001 @@ -260,8 +260,9 @@ Here is the new interface: - void copy_user_page(void *to, void *from, unsigned long address) - void clear_user_page(void *to, unsigned long address) + void copy_user_page(struct page *to, struct page *from, + unsigned long address) + void clear_user_page(struct page *to, unsigned long address) These two routines store data in user anonymous or COW pages. It allows a port to efficiently avoid D-cache alias @@ -279,6 +280,11 @@ If D-cache aliasing is not an issue, these two routines may simply call memcpy/memset directly and do nothing more. + + There are default versions of these procedures supplied in + include/linux/highmem.h. If a port does not want to use the + default versions it should declare them and define the symbol + __HAVE_ARCH_USER_PAGE in include/asm/page.h. 
void flush_dcache_page(struct page *page) diff -urN linux/arch/alpha/kernel/alpha_ksyms.c linux.new/arch/alpha/kernel/alpha_ksyms.c --- linux/arch/alpha/kernel/alpha_ksyms.c Sat Apr 28 23:02:30 2001 +++ linux.new/arch/alpha/kernel/alpha_ksyms.c Wed May 23 20:39:23 2001 @@ -98,8 +98,8 @@ EXPORT_SYMBOL(__memset); EXPORT_SYMBOL(__memsetw); EXPORT_SYMBOL(__constant_c_memset); -EXPORT_SYMBOL(copy_page); -EXPORT_SYMBOL(clear_page); +EXPORT_SYMBOL(__copy_page); +EXPORT_SYMBOL(__clear_page); EXPORT_SYMBOL(__direct_map_base); EXPORT_SYMBOL(__direct_map_size); diff -urN linux/arch/alpha/lib/clear_page.S linux.new/arch/alpha/lib/clear_page.S --- linux/arch/alpha/lib/clear_page.S Thu Feb 22 14:24:52 2001 +++ linux.new/arch/alpha/lib/clear_page.S Wed May 23 20:39:23 2001 @@ -6,9 +6,9 @@ .text .align 4 - .global clear_page - .ent clear_page -clear_page: + .global __clear_page + .ent __clear_page +__clear_page: .prologue 0 lda $0,128 @@ -36,4 +36,4 @@ unop nop - .end clear_page + .end __clear_page diff -urN linux/arch/alpha/lib/copy_page.S linux.new/arch/alpha/lib/copy_page.S --- linux/arch/alpha/lib/copy_page.SThu Feb 22 14:24:52 2001 +++ linux.new/arch/alpha/lib/copy_page.SWed May 23 21:05:31 2001 @@ -6,9 +6,9 @@ .text .align 4 - .global copy_page - .ent copy_page -copy_page: + .global __copy_page + .ent __copy_page +__copy_page: .prologue 0 lda $18,128 @@ -46,4 +46,4 @@ unop nop - .end copy_page + .end __copy_page diff -urN linux/arch/alpha/lib/ev6-clear_page.S linux.new/arch/alpha/lib/ev6-clear_page.S --- linux/arch/alpha/lib/ev6-clear_page.S Thu Feb 22 14:24:52 2001 +++ linux.new/arch/alpha/lib/ev6-clear_page.S Wed May 23 20:39:23 2001 @@ -6,9 +6,9 @@ .text .align 4 -.global clear_page
Re: SyncPPP IPCP/LCP loop problem and patch
[EMAIL PROTECTED] writes:

> I've hit a problem with the syncPPP module within Linux.
>
> Under certain conditions (hard to quantify exactly, but try several 8Mbps
> streams hitting a relatively slow, say 200MHz processor) the LCP/IPCP
> negotiation hits the following loop.

[snip]

> My solution in the patch that follows is to detect the flip-flop using a
> counter and then after three occurrences with no genuine IPCP traffic to
> modify behavior on receipt of the LCP conf REQ. After three attempts we
> acknowledge the LCP conf REQ but stay in the opened state rather than
> dropping back and restarting our own LCP negotiation. This is non-RFC1661
> behavior unless you consider it part of the general loop avoidance directive.

Seems to me that when you get the conf-request in opened state, you
should send your conf-request before sending the conf-ack to the
peer's conf-request. I think this would short-circuit the loop (I
could be wrong though, it's getting late).

That behaviour would be in line with the FSM in RFC 1661, where the
action for event RCR+ in Opened state is "tld,scr,sca/8", i.e. the one
action involves sending both the conf-request and the conf-ack. It is
debatable to what extent that specifies the order of the messages but
it does list the conf-request first FWIW.

Paul.
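A minimal sketch of the ordering discussed in this message, assuming a toy action log rather than the real syncppp code. On RCR+ in the Opened state it records tld, then scr, then sca, and moves to state 8 (Ack-Sent), following the "tld,scr,sca/8" entry quoted from RFC 1661. The names fsm, fsm_do and fsm_rcr_good are hypothetical and only serve the illustration.

```c
#include <stddef.h>

/* RFC 1661 state numbers used by this sketch. */
enum { ST_ACK_SENT = 8, ST_OPENED = 9 };

/* Actions named in the transition table entry "tld,scr,sca". */
enum action { ACT_TLD, ACT_SCR, ACT_SCA };

#define MAX_LOG 8

/* Hypothetical automaton: current state plus a log of actions taken. */
struct fsm {
	int state;
	enum action log[MAX_LOG];	/* actions in the order they happened */
	size_t nlog;
};

/* Record one action; a real driver would send a packet or notify upper layers. */
static void fsm_do(struct fsm *f, enum action a)
{
	if (f->nlog < MAX_LOG)
		f->log[f->nlog++] = a;
}

/*
 * Event RCR+ (acceptable Configure-Request received).
 * Only the Opened state is modelled here.
 */
static void fsm_rcr_good(struct fsm *f)
{
	if (f->state != ST_OPENED)
		return;
	fsm_do(f, ACT_TLD);	/* this-layer-down */
	fsm_do(f, ACT_SCR);	/* our own Configure-Request goes out first */
	fsm_do(f, ACT_SCA);	/* then the Configure-Ack for the peer's request */
	f->state = ST_ACK_SENT;
}
```

Whether sending the conf-request first really breaks the loop is the open question raised above; the sketch only pins down the ordering.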
Re: [RFD w/info-PATCH] device arguments from lookup, partion code
Alexander Viro writes:

> drivers/net/ppp_generic.c:
> ppp_set_compress(struct ppp *ppp, unsigned long arg)
> {

[snip]

>         if (copy_from_user(&data, (void *) arg, sizeof(data))
>             || (data.length <= CCP_MAX_OPTION_LENGTH
>                 && copy_from_user(ccp_option, data.ptr, data.length)))
>                 goto out;
>
> And that's far from being uncommon. They _do_ follow pointers. Some - more
> than once. :)

That particular example is one that would probably be much cleaner as
a write on a control fd. What is there currently is just a relatively
ugly way of getting a variable-sized lump of data from usermode into
the kernel.

Paul.
Re: add page argument to copy/clear_user_page
Linus Torvalds writes:

> If you add the page argument, why leave the old arguments lingering there
> at all? They only create confusion, and add no information.

You mean the `to' pointer argument, or the `vaddr' argument?

The `vaddr' argument isn't redundant, it's the user virtual address
where the page is mapped, and sparc64 needs it in order to avoid
D-cache aliasing issues I believe. (Dave?)

As for the `to' argument, yes it is redundant since it is just
kmap(page). But copy/clear_user_page isn't the interface that gets
called from the MM stuff, copy/clear_user_highpage is, defined in
include/linux/highmem.h. These are two of a whole series of functions
which all do kmap, do something, kunmap.

IMHO having the kmap/kunmap calls in copy/clear_user_highpage in
include/linux/highmem.h is the best approach because it means that
most architectures can just #define copy/clear_user_page as
copy/clear_page in include/asm/page.h (as they do at the moment). It
means that the kmap/kunmap calls are in one place only instead of
being duplicated in every architecture.

But we could instead push the kmap/kunmap down into
copy/clear_user_page. Then we might as well rename them into
copy/clear_user_highpage. Here is how it might turn out in
include/asm-i386/page.h (and asm-alpha, asm-arm, asm-cris, asm-s390,
asm-sh, asm-s390x...):

extern void clear_page(void *page);
extern void copy_page(void * _to, void * _from);

#define clear_user_highpage(page, vaddr)                \
        do {                                            \
                struct page *__page = page;             \
                clear_page(kmap(__page));               \
                kunmap(__page);                         \
        } while (0)

#define copy_user_highpage(to, from, vaddr)             \
        do {                                            \
                struct page *__to = to, *__from = from; \
                copy_page(kmap(__to), kmap(__from));    \
                kunmap(__from);                         \
                kunmap(__to);                           \
        } while (0)

Doing it with inline functions would be cleaner but would mean that we
would need the declaration of kmap/kunmap in page.h. That would mean
that we would need to #include in include/asm/page.h which is starting
to get pretty messy and inviting circular inclusions.

We could move these declarations to another file in include/asm -
include/asm/highmem.h might seem the natural place but it is only used
if CONFIG_HIGHMEM is defined and not all ports have it. I assume
nobody wants to do these functions out-of-line. :)

So on the whole the way I had it seems cleanest to me. But I can whip
up a patch to do the kmap/kunmap in the architecture-specific files
instead, if you prefer - if so, do you prefer the macro version or the
inline function version?

Paul.
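For comparison with the macro form discussed in this message, here is a rough sketch of what the inline-function variant might look like. It is a userspace mock, not kernel code: struct page, kmap and kunmap below are simplified stand-ins invented for the illustration (the real ones live in the kernel headers), and the kmap_count field exists only so that balanced map/unmap calls can be checked.

```c
#include <string.h>

#define PAGE_SIZE 4096

/* Mock page: the real struct page does not carry its data inline. */
struct page {
	unsigned char data[PAGE_SIZE];
	int kmap_count;		/* number of outstanding mock kmap() calls */
};

/* Stand-ins for the kernel's kmap/kunmap (assumption: trivially succeed). */
static void *kmap(struct page *p)
{
	p->kmap_count++;
	return p->data;
}

static void kunmap(struct page *p)
{
	p->kmap_count--;
}

/* Inline version of clear_user_highpage: map, clear, unmap. */
static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
{
	(void)vaddr;	/* unused on ports without D-cache aliasing */
	memset(kmap(page), 0, PAGE_SIZE);
	kunmap(page);
}

/* Inline version of copy_user_highpage: map both pages, copy, unmap both. */
static inline void copy_user_highpage(struct page *to, struct page *from,
				      unsigned long vaddr)
{
	void *vto, *vfrom;

	(void)vaddr;
	vto = kmap(to);
	vfrom = kmap(from);
	memcpy(vto, vfrom, PAGE_SIZE);
	kunmap(from);
	kunmap(to);
}
```

Unlike the macros, these need kmap/kunmap declared before the definitions, which is the header-ordering cost mentioned above.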
add page argument to copy/clear_user_page
Linus, The patch below adds a page * argument to copy_user_page and clear_user_page. These functions are used only in include/linux/highmem.h to implement clear_user_highpage and copy_user_highpage. The idea is to pass in the pointer to the page struct for the destination page so that, on architectures where it is needed, we can use the PG_arch_1 bit of page->flags to indicate whether the i-cache and d-cache are consistent for the page. With the extra argument, copy/clear_user_page can set the PG_arch_1 bit to the "inconsistent" state. We can then add architecture-specific code to flush the i-cache when the bit is in the "inconsistent" state and a user process wants to be able to execute from the page. Sparc64, ppc and ia64 at least will benefit from this. Using the PG_arch_1 bit in this way lets us avoid doing unnecessary i-cache flushes for the page - on ppc I have measured a 2.5% reduction in time for a kernel compile by doing this. David Miller and David Mosberger-Tang agree with this change, in fact if you look in include/asm-ia64/pgalloc.h you will see that copy/clear_user_page already have the extra page * argument. The patch below does nothing more than add the extra argument to all the definitions of copy_user_page and clear_user_page (for all architectures), and to the places where they are called. At this stage this extra argument will be unused (except on ia64). Once this change goes in, the various architecture maintainers who care can send you the patches which will make use of the extra argument on their architecture. We have patches for ppc tested and ready to be included, and I know DaveM has patches for sparc64. Please apply this to your tree. Thanks, Paul. 
diff -urN linux/Documentation/cachetlb.txt linux.new/Documentation/cachetlb.txt --- linux/Documentation/cachetlb.txtSat Mar 31 03:05:54 2001 +++ linux.new/Documentation/cachetlb.txtSun May 20 16:44:46 2001 @@ -260,8 +260,9 @@ Here is the new interface: - void copy_user_page(void *to, void *from, unsigned long address) - void clear_user_page(void *to, unsigned long address) + void copy_user_page(void *to, void *from, unsigned long address, + struct page *page) + void clear_user_page(void *to, unsigned long address, struct page *page) These two routines store data in user anonymous or COW pages. It allows a port to efficiently avoid D-cache alias @@ -279,6 +280,12 @@ If D-cache aliasing is not an issue, these two routines may simply call memcpy/memset directly and do nothing more. + + The "page" parameter points to the page struct for the page. + This allows a port to store information about the cache status + of the page in the page struct (for example, by using the + PG_arch_1 bit of the flags field) and update that status to + reflect the effect of the clear or copy. void flush_dcache_page(struct page *page) diff -urN linux/arch/sh/mm/cache.c linux.new/arch/sh/mm/cache.c --- linux/arch/sh/mm/cache.cSat Apr 28 23:02:38 2001 +++ linux.new/arch/sh/mm/cache.cSun May 20 16:45:47 2001 @@ -506,14 +506,15 @@ /* Page is 4K, OC size is 16K, there are four lines. 
*/ #define CACHE_ALIAS 0x3000 -void clear_user_page(void *to, unsigned long address) +void clear_user_page(void *to, unsigned long address, struct page *page) { clear_page(to); if (((address ^ (unsigned long)to) & CACHE_ALIAS)) __flush_page_to_ram(to); } -void copy_user_page(void *to, void *from, unsigned long address) +void copy_user_page(void *to, void *from, unsigned long address, + struct page *page) { copy_page(to, from); if (((address ^ (unsigned long)to) & CACHE_ALIAS)) diff -urN linux/include/asm-alpha/page.h linux.new/include/asm-alpha/page.h --- linux/include/asm-alpha/page.h Thu Feb 22 14:25:37 2001 +++ linux.new/include/asm-alpha/page.h Sun May 20 16:50:42 2001 @@ -13,10 +13,10 @@ #define STRICT_MM_TYPECHECKS extern void clear_page(void *page); -#define clear_user_page(page, vaddr) clear_page(page) +#define clear_user_page(page, vaddr, pg) clear_page(page) extern void copy_page(void * _to, void * _from); -#define copy_user_page(to, from, vaddr)copy_page(to, from) +#define copy_user_page(to, from, vaddr, page) copy_page(to, from) #ifdef STRICT_MM_TYPECHECKS /* diff -urN linux/include/asm-arm/page.h linux.new/include/asm-arm/page.h --- linux/include/asm-arm/page.hMon Aug 14 02:54:15 2000 +++ linux.new/include/asm-arm/page.hSun May 20 16:50:41 2001 @@ -14,8 +14,8 @@ #define clear_page(page) memzero((void *)(page), PAGE_SIZE) extern void copy_page(void *to, void *from); -#define clear_user_page(page, vaddr) clear_page(page) -#define copy_user_page(to, from, vaddr)copy_page(to, from) +#define clear_user_page(page, vaddr, pg) clear_page(page) +#define copy_user_page(to, from, vaddr, page) co
icache flushing in kernel/ptrace.c
I would like to change access_one_page in kernel/ptrace.c to call
something other than flush_icache_page. Currently it calls
flush_icache_page on the page after modifying it. Now of course on
many architectures (including PPC) we need to do some sort of i-cache
flush - my contention is that flush_icache_page is the wrong
interface, we should be calling flush_icache_range or something like
it instead.

The problem with flush_icache_page is that it is also called in
do_no_page and do_swap_page in mm/memory.c. In the do_no_page case it
is called on a page which we have usually just got from the page
cache. If the page is clean and has previously had the i-cache
flushed for it then there is no need to do the flush again. But there
is no way (no reasonable way, anyway) for flush_icache_page to tell
whether it has been called from do_no_page or from access_one_page.

I have been able to get good speedups on PPC by using the PG_arch_1
bit on the page to indicate whether a page is i-cache clean (has had
the flush done), by delaying flushing until necessary (i.e. until a
process maps in the page and has requested execute permission on it),
and by not flushing the page if it has already been flushed. (Anton
Blanchard has actually done a lot of this work with input from Dave
Miller.) But to do this I need to make flush_icache_page do nothing,
which breaks ptrace.

For now I have duplicated most of the contents of kernel/ptrace.c
inside arch/ppc/kernel/ptrace.c and changed the flush_icache_page to
flush_icache_range (with appropriate parameters) to fix this. But
this is not ideal.

AFAICT the architectures that need to maintain i-cache coherency in
software are alpha, ia64, m68k, mips, mips64, parisc, ppc and sparc64.
There seems to be a lot of variation in the assumptions about what
sorts of addresses flush_icache_range will be used on and what it
should do.

Going by the name, flush_icache_range would be the ideal interface for
flushing the range of bytes that have been modified by
access_one_page. But it looks to me like using it might be suboptimal
on other architectures, e.g. alpha, due to the way that
flush_icache_range has been implemented.

Anyway, here's a proposed patch. Could the various architecture
maintainers (particularly alpha) comment on what the impact would be
on their architectures? If flush_icache_range isn't the right
interface either, could we invent one that would be?

Thanks,
Paul.

diff -urN linux/kernel/ptrace.c pmac/kernel/ptrace.c
--- linuxppc_2_4/kernel/ptrace.c	Wed Mar 21 09:39:08 2001
+++ pmac/kernel/ptrace.c	Mon Apr 16 12:00:11 2001
@@ -58,10 +58,11 @@
 	flush_cache_page(vma, addr);
 
 	if (write) {
-		maddr = kmap(page);
-		memcpy(maddr + (addr & ~PAGE_MASK), buf, len);
+		maddr = kmap(page) + (addr & ~PAGE_MASK);
+		memcpy(maddr, buf, len);
 		flush_page_to_ram(page);
-		flush_icache_page(vma, page);
+		flush_icache_range((unsigned long) maddr,
+				   (unsigned long) maddr + len);
 		kunmap(page);
 	} else {
 		maddr = kmap(page);
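The lazy-flush scheme described in this message can be sketched roughly as follows. This is a userspace simulation under stated assumptions, not the PPC implementation: the struct page, the bit position chosen for PG_arch_1, and the helper names page_modified, map_page and flush_icache_for_page are all hypothetical. The bit being set is taken to mean "i-cache clean", as in the description above.

```c
/* Bit position is arbitrary in this sketch. */
#define PG_arch_1 (1UL << 0)

/* Mock page descriptor: only the flags word matters here. */
struct page {
	unsigned long flags;
};

/* Counts simulated i-cache flushes so the saving can be observed. */
static unsigned long icache_flushes;

/* Stand-in for the real, expensive per-page i-cache flush. */
static void flush_icache_for_page(struct page *pg)
{
	(void)pg;
	icache_flushes++;
}

/* The kernel wrote to the page (clear, copy, ptrace poke): mark it dirty. */
static void page_modified(struct page *pg)
{
	pg->flags &= ~PG_arch_1;
}

/*
 * A process maps the page. Flush only if execute permission is wanted
 * and the page has not already been flushed since its last modification.
 */
static void map_page(struct page *pg, int want_exec)
{
	if (!want_exec || (pg->flags & PG_arch_1))
		return;
	flush_icache_for_page(pg);
	pg->flags |= PG_arch_1;
}
```

In this model, mapping the same clean page for execution a second time costs nothing, which is the effect the PG_arch_1 approach is after.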
[PATCH] [RESEND] fs/binfmt_elf.c changes vs 2.4.5-pre3
Linus,

This patch against 2.4.5-pre3 makes 3 changes to fs/binfmt_elf.c:

1. It fixes the csp calculation so that it actually achieves the 16
   byte final alignment that the comment claims. Previously the csp
   calculation didn't take the AT_NULL entry into account. If you
   look at the current fs/binfmt_elf.c there is a "sp -= 2" that is
   not reflected in the csp calculation, unlike all the other
   decrements of sp.

2. It allows each architecture to add extra aux table entries by
   defining DLINFO_ARCH_ITEMS and ARCH_DLINFO in . We need this on
   PowerPC to add entries for the cache line size, and to add entries
   for compatibility with older broken glibc's.

3. It removes the extra 16 bytes that were left free for PowerPC - in
   the past we had to move the auxiliary table up to cope with broken
   glibc's (now we cope by adding special AT_IGNORE entries using the
   ARCH_DLINFO macro).

Please apply this to your tree.

Thanks,
Paul.

diff -Nru a/fs/binfmt_elf.c b/fs/binfmt_elf.c
--- a/fs/binfmt_elf.c	Wed May 16 18:45:10 2001
+++ b/fs/binfmt_elf.c	Wed May 16 18:45:10 2001
@@ -135,12 +135,13 @@
 
 	/*
 	 * Force 16 byte _final_ alignment here for generality.
-	 * Leave an extra 16 bytes free so that on the PowerPC we
-	 * can move the aux table up to start on a 16-byte boundary.
 	 */
-	sp = (elf_addr_t *)((~15UL & (unsigned long)(u_platform)) - 16UL);
+	sp = (elf_addr_t *)(~15UL & (unsigned long)(u_platform));
 	csp = sp;
-	csp -= DLINFO_ITEMS*2 + (k_platform ? 2 : 0);
+	csp -= (1+DLINFO_ITEMS)*2 + (k_platform ? 2 : 0);
+#ifdef DLINFO_ARCH_ITEMS
+	csp -= DLINFO_ARCH_ITEMS*2;
+#endif
 	csp -= envc+1;
 	csp -= argc+1;
 	csp -= (!ibcs ? 3 : 1);	/* argc itself */
@@ -174,6 +175,13 @@
 	NEW_AUX_ENT(10, AT_EUID, (elf_addr_t) current->euid);
 	NEW_AUX_ENT(11, AT_GID, (elf_addr_t) current->gid);
 	NEW_AUX_ENT(12, AT_EGID, (elf_addr_t) current->egid);
+#ifdef ARCH_DLINFO
+	/*
+	 * ARCH_DLINFO must come last so platform specific code can enforce
+	 * special alignment requirements on the AUXV if necessary (eg. PPC).
+	 */
+	ARCH_DLINFO;
+#endif
 #undef NEW_AUX_ENT
 
 	sp -= envc+1;
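To see why leaving the AT_NULL pair out of the csp calculation breaks the final alignment, here is a small arithmetic model. It is a sketch under assumptions: addresses are plain integers, final_sp is a hypothetical helper, and the pre-adjustment step (lowering sp by csp & 15 so that the predicted final pointer is aligned) is recalled from the surrounding kernel code rather than shown in the patch. The effect is visible with 4-byte entries; with 8-byte entries two missing entries happen to be a multiple of 16 bytes.

```c
#include <stddef.h>

/*
 * Model of the stack setup:
 *   top        - address of the platform string (u_platform)
 *   entry_size - sizeof(elf_addr_t)
 *   planned    - number of entries the csp calculation accounts for
 *   pushed     - number of entries actually pushed afterwards
 * Returns the final stack pointer after all entries are pushed.
 */
static unsigned long final_sp(unsigned long top, size_t entry_size,
			      unsigned long planned, unsigned long pushed)
{
	unsigned long sp = top & ~15UL;
	unsigned long csp = sp - planned * entry_size;

	/* Lower sp so that the predicted final pointer is 16-byte aligned. */
	sp -= csp & 15UL;

	/* The real pushes; any entry missing from "planned" shifts the result. */
	return sp - pushed * entry_size;
}
```

For example, with 31 four-byte entries pushed, planning for all 31 gives an aligned result, while planning for only 29 (the AT_NULL pair forgotten) leaves the final pointer 8 bytes off a 16-byte boundary.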
Re: isa_read/write not available on ppc - solution suggestions ??
Linus Torvalds writes:

> I would suggest the opposite approach instead: make the PPC just support
> isa_readx/isa_writex instead.

We can certainly do that, no problem.

BUT that won't get a token ring pcmcia card working in the newer
powerbooks, such as the titanium G4 powerbook, because the PCI host
bridge doesn't map any cpu addresses to the bottom 16MB of PCI memory
space. This is not a problem as far as pcmcia cards are concerned -
the pcmcia stuff just picks an appropriate address (typically in the
range 0x9000 - 0x9fff) and sets the pcmcia/cardbus bridge to map that
to the card. But it means that the physical addresses for the card's
memory space will be above the 16MB point, so it is essential to do
the ioremap.

Paul.
Re: 2.4.3 oopses at lots of ppp sessions
Marcell GAL writes:

> 2.4.3 (UP kernel UP machine, http://home.sch.bme.hu/~cell/.config)
> oopses when I start lots of pppd eth0 simultaneously.
> (I guess the problem is not pppoe specific, but I do not know exactly)
>
> The last pppd sighs: PPP: couldn't register device (-17)
> This is 2 oops not just 1...

Hmmm, somehow the list of ppp units has got a null pointer in it. At
the moment I don't see how that can happen, but I will look into it.

Paul.
Re: [CHECKER] security rules?
William Ie writes:

> 4.linux/2.4.3/drivers/net/ppp_async.c:345:ppp_async_ioctl
>         case PPPIOCGFLAGS:
>                 val = ap->flags | ap->rbits;
>                 if (put_user(val, (int *) arg))
>                         break;
>                 err = 0;
>                 break;
>         case PPPIOCSFLAGS:
>                 if (get_user(val, (int *) arg))
>                         break;
>                 ap->flags = val & ~SC_RCV_BITS;
>                 spin_lock_bh(&ap->recv_lock);
>                 ap->rbits = val & SC_RCV_BITS;
>                 spin_unlock_bh(&ap->recv_lock);
>                 err = 0;
>                 break;
> seems to be getting and setting some flags without CAP_NET_ADMIN like in
> ppp_synctty.c

It is OK because this is a channel ioctl routine called from
ppp_generic.c as a result of an ioctl call on /dev/ppp, and it is not
possible to open /dev/ppp unless you have CAP_NET_ADMIN.

Paul.
Re: [PATCH] PPP update against 2.4.4-pre5
Byeong-ryeol Kim writes:

> I met 'unresolved symbol sk_chk_filter ...' after applying this patch
> and rebooting. (with CONFIG_PPP_FILTER=y)
> There should be the following lines in linux/net/netsyms.c or so:
>
> #ifdef CONFIG_PPP_FILTER
> EXPORT_SYMBOL(sk_chk_filter);
> #endif

Good idea, actually let's put it next to the export of sk_run_filter,
as in the patch below.

Linus, could you apply this patch please?

Paul.

diff -urN linux/net/netsyms.c pmac/net/netsyms.c
--- linux/net/netsyms.c	Sun Apr 22 17:07:40 2001
+++ pmac/net/netsyms.c	Mon Apr 23 11:24:31 2001
@@ -158,6 +158,7 @@
 
 #ifdef CONFIG_FILTER
 EXPORT_SYMBOL(sk_run_filter);
+EXPORT_SYMBOL(sk_chk_filter);
 #endif
 
 EXPORT_SYMBOL(neigh_table_init);
RE: [PATCH] ppp_generic, kernel 2.4.3
Tim Wilson writes:

> Thanks for your reply. It seems I am finally talking to the right person (I
> had previously tried posting this on the pptp-server mailing list, and I
> also tried sending it to you directly, but no luck).

Sorry, life has been a little turbulent for me over the last couple of
months.

> Well, I do know that people set up Linux gateways as PPTP servers, and that
> they use MPPE to allow win98 clients to connect to those servers. That's
> what I was trying to do anyway. After the connect, the gateway log says that
> MPPE is negotiated, and the win98 client claims MPPE is being used, so all
> looks OK, but the gateway sends PPP frames in cleartext. If that's not a
> security hole, it is certainly not a Good Thing.

Well, it's a consequence of using a knife to drive in a nail. :)
Neither CCP nor the Linux CCP implementation are really designed to
support encryption. There is a fairly strong assumption that if
things go pear-shaped you can always take CCP down and send stuff
uncompressed - it will be slower but it will still work.

> As my patch shows, the fix
> is quite easy, so regardless of what we call it, might as well fix it.

Sure, we can fix the problem you've pointed out, but that won't make
for a secure MPPE implementation. (Is that an oxymoron, actually?)

What I am saying is that even with your fix there is still a lot more
work to do if you want to make sure that you never send or accept
unencrypted PPP frames.

> Server                Client
> 1)            <--ConfReq
> 2) ConfAck-->
> 3) ConfReq-->
> 4)
>
> The existing code (correctly) enables the compressor when it sends the
> ConfAck (2). Then, it (incorrectly) disables the compressor when sending the
> ConfReq in (3). With my fix, that doesn't happen; the compressor is disabled
> by reception of the ConfReq at (1), but it's not enabled yet anyway, so no
> harm done.

Good point.

>         if( ppp->flags & SC_CCP_UP) {
>                 ppp->rstate &= ~SC_DECOMP_RUN;
>                 ppp->xstate &= ~SC_COMP_RUN;
>                 ppp->flags &= ~SC_CCP_UP;
>         }

Yep, with the exception that I wouldn't clear SC_CCP_UP, since that is
set and cleared by pppd.

Here is an updated patch.

Paul.

diff -urN linux/drivers/net/ppp_generic.c pmac/drivers/net/ppp_generic.c
--- linux/drivers/net/ppp_generic.c	Sun Apr 22 17:07:28 2001
+++ pmac/drivers/net/ppp_generic.c	Mon Apr 23 10:12:27 2001
@@ -1993,10 +1993,10 @@
 		/*
 		 * CCP is going down - disable compression.
 		 */
-		if (inbound)
+		if (ppp->flags & SC_CCP_UP) {
 			ppp->rstate &= ~SC_DECOMP_RUN;
-		else
 			ppp->xstate &= ~SC_COMP_RUN;
+		}
 		break;
 
 	case CCP_CONFACK:
@@ -2054,7 +2054,7 @@
 		ppp->xc_state = 0;
 	}
 
-	ppp->xstate &= ~SC_DECOMP_RUN;
+	ppp->rstate &= ~SC_DECOMP_RUN;
 	if (ppp->rc_state) {
 		ppp->rcomp->decomp_free(ppp->rc_state);
 		ppp->rc_state = 0;
Re: [PATCH] CONFIG_PPP_FILTER in -ac12 / -pre6
Andrzej Krzysztofowicz writes:

> CONFIG_PPP_FILTER depends on CONFIG_FILTER (2.4.4-pre6, 2.4.3-ac12)
> [ sk_run_filter(), ...]
> So updated Config.in ...

> -   bool ' PPP filtering' CONFIG_PPP_FILTER
> +   dep_bool ' PPP filtering' CONFIG_PPP_FILTER $CONFIG_FILTER

Yep, definitely a good idea. Thanks.

Paul.
Re: pmd_alloc, pte_alloc, Was Re: 2.4.3 and Alpha
[EMAIL PROTECTED] writes:

> Basically in the pmd, it would seem that the current design in 2.4.3 forces
> you to have pointers in there. Currently in our source we're using offsets
> instead of a 64 bit pointer... this of course saved us from having to alloc 2
> contiguous pages in memory.

Nope, the representation of the pgd/pmd/pte entries is entirely up to
you (us :). The pmd entries for example are accessed through
pmd_none, pmd_present, pte_offset, etc., and are set with
pmd_populate. Those functions are all defined in asm/pgtable.h and
asm/pgalloc.h. So you can make the representation whatever you like
as long as those functions all do the right thing. Same goes for the
pgd and pte levels.

Paul.
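As a rough illustration of that point, the sketch below stores a 32-bit offset in the pmd entry instead of a pointer and hides the choice behind accessor functions. It is a userspace mock under stated assumptions: pte_pool is a hypothetical region holding the PTE pages, pmd_to_ptr is an invented helper, and pmd_populate here takes simplified arguments rather than the kernel's real signature.

```c
#include <stddef.h>
#include <stdint.h>

/* pmd entry holding an offset rather than a pointer; 0 means "none". */
typedef struct {
	uint32_t off;
} pmd_t;

/* Hypothetical pool that all PTE pages are allocated from. */
static unsigned char pte_pool[4][4096];

/* True if the entry does not point at any PTE page. */
static int pmd_none(pmd_t pmd)
{
	return pmd.off == 0;
}

/* Point the entry at a PTE page inside the pool. */
static void pmd_populate(pmd_t *pmd, void *pte_page)
{
	/* Store offset + 1 so that offset 0 stays distinct from "none". */
	pmd->off = (uint32_t)((unsigned char *)pte_page - &pte_pool[0][0]) + 1;
}

/* Recover the PTE page address from the stored offset. */
static void *pmd_to_ptr(pmd_t pmd)
{
	if (pmd_none(pmd))
		return NULL;
	return &pte_pool[0][0] + (pmd.off - 1);
}
```

Callers that only go through the accessors never see whether the entry is a pointer or an offset, which is the freedom described above.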
Re: [PATCH] PPP update against 2.4.4-pre5
Brown-paper bag time... The patch I sent earlier didn't include the accompanying changes to if_ppp.h and ppp_channel.h. Here they are. Paul. diff -urN linux/include/linux/if_ppp.h pmac/include/linux/if_ppp.h --- linux/include/linux/if_ppp.hTue Mar 28 04:28:55 2000 +++ pmac/include/linux/if_ppp.h Mon Mar 5 12:16:15 2001 @@ -1,4 +1,4 @@ -/* $Id: if_ppp.h,v 1.19 1999/03/31 06:07:57 paulus Exp $ */ +/* $Id: if_ppp.h,v 1.21 2000/03/27 06:03:36 paulus Exp $ */ /* * if_ppp.h - Point-to-Point Protocol definitions. @@ -21,7 +21,7 @@ */ /* - * ==FILEVERSION 2324== + * ==FILEVERSION 2724== * * NOTE TO MAINTAINERS: * If you modify this file at all, please set the above date. @@ -130,6 +130,8 @@ #define PPPIOCSCOMPRESS_IOW('t', 77, struct ppp_option_data) #define PPPIOCGNPMODE _IOWR('t', 76, struct npioctl) /* get NP mode */ #define PPPIOCSNPMODE _IOW('t', 75, struct npioctl) /* set NP mode */ +#define PPPIOCSPASS_IOW('t', 71, struct sock_fprog) /* set pass filter */ +#define PPPIOCSACTIVE _IOW('t', 70, struct sock_fprog) /* set active filt */ #define PPPIOCGDEBUG _IOR('t', 65, int) /* Read debug level */ #define PPPIOCSDEBUG _IOW('t', 64, int) /* Set debug level */ #define PPPIOCGIDLE_IOR('t', 63, struct ppp_idle) /* get idle time */ diff -urN linux/include/linux/ppp_channel.h pmac/include/linux/ppp_channel.h --- linux/include/linux/ppp_channel.h Mon Apr 2 02:20:35 2001 +++ pmac/include/linux/ppp_channel.hThu Apr 19 19:16:39 2001 @@ -22,7 +22,6 @@ #include #include #include -#include struct ppp_channel; @@ -32,7 +31,6 @@ int (*start_xmit)(struct ppp_channel *, struct sk_buff *); /* Handle an ioctl call that has come in via /dev/ppp. */ int (*ioctl)(struct ppp_channel *, unsigned int, unsigned long); - }; struct ppp_channel { @@ -78,16 +76,6 @@ * in the start_xmit and ioctl routines for the channel by the time * that ppp_unregister_channel returns. 
*/ - -/* The following are temporary compatibility stuff */ -ssize_t ppp_channel_read(struct ppp_channel *chan, struct file *file, -char *buf, size_t count); -ssize_t ppp_channel_write(struct ppp_channel *chan, const char *buf, - size_t count); -unsigned int ppp_channel_poll(struct ppp_channel *chan, struct file *file, - poll_table *wait); -int ppp_channel_ioctl(struct ppp_channel *chan, unsigned int cmd, - unsigned long arg); #endif /* __KERNEL__ */ #endif - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.4.3 Compile Errors - Power Mac
Jeff Galloway writes:

> Compiler error message:
>
> fork.c: In function `copy_mm':
> fork.c:353: fixed or forbidden register 68 (0) was spilled for class
> CR0_REGS.
> This may be due to a compiler bug or to impossible asm statements or
> clauses.

You need a newer gcc. I suspect you have egcs installed; you need to
upgrade to gcc-2.95.2 or later.

Paul.
[PATCH] PPP update against 2.4.4-pre5
Linus, Alan,

The patch below does two things:

- It takes out the rest of the compatibility stuff that is no longer
  used, and which has the possibility of accessing memory that has
  been kfree'd (this could happen if you did a blocking read on a tty
  in PPP line discipline, and the tty hangs up).  This possibility
  was pointed out by Kevin Buhr.

- It adds packet filtering to the PPP driver.  The main point of this
  is so that you can specify that certain sorts of packets don't
  count as activity, so they don't reset the idle timer and they
  don't bring up a demand-dialled link.  This is a useful feature
  that I get asked for periodically; it's a small amount of code (in
  fact it's no extra code if you don't enable CONFIG_PPP_FILTER), and
  it's something I have had in my tree since last July without any
  problems.

Linus, could this go in 2.4.4 please?

Thanks,
Paul.

diff -urN linux/Documentation/Configure.help pmac/Documentation/Configure.help
--- linux/Documentation/Configure.help	Fri Apr 20 17:04:13 2001
+++ pmac/Documentation/Configure.help	Fri Apr 20 17:45:20 2001
@@ -1756,6 +1756,10 @@
   certain types of data to get through the socket. Linux Socket
   Filtering works on all socket types except TCP for now. See the text
   file Documentation/networking/filter.txt for more information.
+
+  You need to say Y here if you want to use PPP packet filtering
+  (see the CONFIG_PPP_FILTER option below).
+
   If unsure, say N.
 
 Network packet filtering
@@ -7087,6 +7091,17 @@
 
   If unsure, say N.
 
+PPP filtering (EXPERIMENTAL)
+CONFIG_PPP_FILTER
+  Say Y here if you want to be able to filter the packets passing over
+  PPP interfaces.  This allows you to control which packets count as
+  activity (i.e. which packets will reset the idle timer or bring up
+  a demand-dialled link) and which packets are to be dropped entirely.
+  You need to say Y here if you wish to use the pass-filter and
+  active-filter options to pppd.
+
+  If unsure, say N.
+
 PPP support for async serial ports
 CONFIG_PPP_ASYNC
   Say Y (or M) here if you want to be able to use PPP over standard
diff -urN linux/drivers/net/Config.in pmac/drivers/net/Config.in
--- linux/drivers/net/Config.in	Fri Apr 20 17:04:33 2001
+++ pmac/drivers/net/Config.in	Fri Apr 20 17:24:04 2001
@@ -227,6 +227,7 @@
 tristate 'PPP (point-to-point protocol) support' CONFIG_PPP
 if [ ! "$CONFIG_PPP" = "n" ]; then
    dep_bool ' PPP multilink support (EXPERIMENTAL)' CONFIG_PPP_MULTILINK $CONFIG_EXPERIMENTAL
+   bool ' PPP filtering' CONFIG_PPP_FILTER
    dep_tristate ' PPP support for async serial ports' CONFIG_PPP_ASYNC $CONFIG_PPP
    dep_tristate ' PPP support for sync tty ports' CONFIG_PPP_SYNC_TTY $CONFIG_PPP
    dep_tristate ' PPP Deflate compression' CONFIG_PPP_DEFLATE $CONFIG_PPP
diff -urN linux/drivers/net/ppp_async.c pmac/drivers/net/ppp_async.c
--- linux/drivers/net/ppp_async.c	Thu Feb 22 14:25:14 2001
+++ pmac/drivers/net/ppp_async.c	Thu Mar 29 13:47:47 2001
@@ -244,11 +244,6 @@
 		err = 0;
 		break;
 
-	case PPPIOCATTACH:
-	case PPPIOCDETACH:
-		err = ppp_channel_ioctl(&ap->chan, cmd, arg);
-		break;
-
 	default:
 		err = -ENOIOCTLCMD;
 	}
diff -urN linux/drivers/net/ppp_generic.c pmac/drivers/net/ppp_generic.c
--- linux/drivers/net/ppp_generic.c	Fri Apr 20 17:04:35 2001
+++ pmac/drivers/net/ppp_generic.c	Fri Apr 20 17:31:04 2001
@@ -19,7 +19,7 @@
  * PPP driver, written by Michael Callahan and Al Longyear, and
  * subsequently hacked by Paul Mackerras.
  *
- * ==FILEVERSION 2417==
+ * ==FILEVERSION 2902==
  */
 
 #include
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -121,6 +122,10 @@
 	struct sk_buff_head mrq;	/* MP: receive reconstruction queue */
 #endif /* CONFIG_PPP_MULTILINK */
 	struct net_device_stats stats;	/* statistics */
+#ifdef CONFIG_PPP_FILTER
+	struct sock_fprog pass_filter;	/* filter for packets to pass */
+	struct sock_fprog active_filter;/* filter for pkts to reset idle */
+#endif /* CONFIG_PPP_FILTER */
 };
 
 /*
@@ -621,6 +626,43 @@
 		err = 0;
 		break;
 
+#ifdef CONFIG_PPP_FILTER
+	case PPPIOCSPASS:
+	case PPPIOCSACTIVE:
+	{
+		struct sock_fprog uprog, *filtp;
+		struct sock_filter *code = NULL;
+		int len;
+
+		if (copy_from_user(&uprog, (void *) arg, sizeof(uprog)))
+			break;
+		if (uprog.len > 0) {
+			err = -ENOMEM;
+			len = uprog.len * sizeof(struct sock_filter);
+			code = kmalloc(len, GFP_KERNEL);
+			if (code == 0)
+
Re: FW: Linux 2.4.3 Compile Errors - Power Mac
Jeff Galloway writes:

> I sent this report to the people indicated below, whose names I got from the
> MAINTAINERS file in the 2.4.3 distribution, but the email address for Mr.
> MacKerras is no longer good and Mr. Chastain wrote me back that he is not
> following 2.4 issues.

I have left Linuxcare and [EMAIL PROTECTED] no longer works.  Please use [EMAIL PROTECTED]

> The compiler error message along with the menuconfig-generated configuration
> file are set out in the attached MS Word document.  I've had similar
> problems with other versions of 2.4.

Hmmm, I have to go to a lot of trouble to read Word documents, so I don't like receiving them.

Paul.
[PATCH] [RESENT] fix bugs in HID driver
[Oops, re-sent with a subject line this time...]

Linus,

This patch fixes some bugs in drivers/usb/hid.c.  Johannes Erdfelt (the maintainer) sent it to you previously but it got missed.  Could it go in 2.4.4 please?

Here are the comments explaining the patch that I wrote originally:

> The first hunk just fixes some typos in s32ton.  For example, with
> n == 8, the code as it was would return 0x80 if value > 127 but 0xff
> if value < -128.  With my change it returns 0x7f for value > 127 and
> 0x80 for value < -128.
>
> The second hunk fixes the "cdcd" problem that we see on apple
> keyboards that can only handle 2-key rollover.  If you type "c" "d"
> quickly on these keyboards, you get a report with the
> error-rollover code (1) in bytes 2 - 7 (instead of the codes for the
> keys that are down).  Without this patch the code thinks that all the
> keys that were down are now up.  When you release one key you get a
> normal report again and the code thinks that the remaining keys have
> been pressed again.  The patch makes the code just discard the report
> once it sees the error-rollover code.
>
> The remaining hunks fix some endianness problems in the code that sets
> the keyboard leds.

Thanks,
Paul.

diff -urN linux/drivers/usb/hid.c linuxppc_2_4/drivers/usb/hid.c
--- linux/drivers/usb/hid.c	Thu Feb 22 14:25:27 2001
+++ linuxppc_2_4/drivers/usb/hid.c	Mon Feb 12 13:35:00 2001
@@ -698,7 +698,7 @@
 static __inline__ __u32 s32ton(__s32 value, unsigned n)
 {
 	__s32 a = value >> (n - 1);
-	if (a && a != -1) return value > 0 ? 1 << (n - 1) : (1 << n) - 1;
+	if (a && a != -1) return value < 0 ? 1 << (n - 1) : (1 << (n - 1)) - 1;
 	return value & ((1 << n) - 1);
 }
 
@@ -1016,9 +1016,15 @@
 	__s32 max = field->logical_maximum;
 	__s32 value[count]; /* WARNING: gcc specific */
 
-	for (n = 0; n < count; n++)
+	for (n = 0; n < count; n++) {
 		value[n] = min < 0 ? snto32(extract(data, offset + n * size, size), size) :
 				     extract(data, offset + n * size, size);
+		/* Handle the ErrorRollOver code (1) by simply ignoring this report */
+		if (!(field->flags & HID_MAIN_ITEM_VARIABLE)
+		    && value[n] >= min && value[n] <= max
+		    && field->usage[value[n] - min].hid == HID_UP_KEYBOARD + 1)
+			return;
+	}
 
 	for (n = 0; n < count; n++) {
 
@@ -1231,7 +1237,7 @@
 
 static int hid_submit_out(struct hid_device *hid)
 {
-	hid->urbout.transfer_buffer_length = hid->out[hid->outtail].dr.length;
+	hid->urbout.transfer_buffer_length = le16_to_cpup(&hid->out[hid->outtail].dr.length);
 	hid->urbout.transfer_buffer = hid->out[hid->outtail].buffer;
 	hid->urbout.setup_packet = (void *) &(hid->out[hid->outtail].dr);
 	hid->urbout.dev = hid->dev;
@@ -1271,8 +1277,8 @@
 	hid_set_field(field, offset, value);
 	hid_output_report(field->report, hid->out[hid->outhead].buffer);
 
-	hid->out[hid->outhead].dr.value = 0x200 | field->report->id;
-	hid->out[hid->outhead].dr.length = ((field->report->size - 1) >> 3) + 1;
+	hid->out[hid->outhead].dr.value = cpu_to_le16(0x200 | field->report->id);
+	hid->out[hid->outhead].dr.length = cpu_to_le16((field->report->size + 7) >> 3);
 
 	hid->outhead = (hid->outhead + 1) & (HID_CONTROL_FIFO_SIZE - 1);
 
@@ -1445,7 +1451,7 @@
 	for (n = 0; n < HID_CONTROL_FIFO_SIZE; n++) {
 		hid->out[n].dr.requesttype = USB_TYPE_CLASS | USB_RECIP_INTERFACE;
 		hid->out[n].dr.request = USB_REQ_SET_REPORT;
-		hid->out[n].dr.index = hid->ifnum;
+		hid->out[n].dr.index = cpu_to_le16(hid->ifnum);
 	}
 
 	hid->input.name = hid->name;
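The s32ton behaviour described in the first hunk can be checked outside the kernel. What follows is only a minimal user-space sketch, not the driver code: the old and new expressions are copied into two functions whose names (s32ton_old, s32ton_fixed) are made up for this illustration, stdint types stand in for the kernel's __s32/__u32, and unsigned constants are used for the shifts. It assumes, as gcc does, that right-shifting a negative value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Expression as it stood before the patch: saturates to the wrong values. */
uint32_t s32ton_old(int32_t value, unsigned n)
{
	assert(n >= 1 && n <= 31);
	/* Assumption: arithmetic right shift for negative values (true for gcc). */
	int32_t a = value >> (n - 1);

	if (a && a != -1)	/* value does not fit in n signed bits */
		return value > 0 ? 1u << (n - 1) : (1u << n) - 1;
	return (uint32_t)value & ((1u << n) - 1);
}

/* Expression after the patch: clamps to the most positive or most negative
 * n-bit two's complement value. */
uint32_t s32ton_fixed(int32_t value, unsigned n)
{
	assert(n >= 1 && n <= 31);
	int32_t a = value >> (n - 1);

	if (a && a != -1)	/* value does not fit in n signed bits */
		return value < 0 ? 1u << (n - 1) : (1u << (n - 1)) - 1;
	return (uint32_t)value & ((1u << n) - 1);
}
```

With n == 8, s32ton_old(200, 8) gives 0x80 and s32ton_old(-200, 8) gives 0xff, while s32ton_fixed returns 0x7f and 0x80 for the same inputs, which matches the numbers quoted in the message. Values that already fit (for example -1 or 100) pass through both versions unchanged.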
[PATCH] update drivers/input/keybdev.c
Linus, The following patch updates drivers/input/keybdev.c so that we can generate either linux keycodes or ADB keycodes from keyboards that use the input layer. We now have ADB keyboards and mice using the input layer as well as USB, so it is very useful to have the flexibility to choose at runtime which type of keycodes you want to receive. You already have in your tree all of the other changes that we need in order to do this, just this one file got missed somehow. This change has been approved by the maintainer, Vojtech Pavlik. If you decide not to take this patch, please let me know so I can send you a patch to back out the corresponding changes that have already been made in other files. Paul. diff -urN linux/drivers/input/keybdev.c pmac/drivers/input/keybdev.c --- linux/drivers/input/keybdev.c Thu Apr 19 15:03:43 2001 +++ pmac/drivers/input/keybdev.cFri Apr 20 16:47:48 2001 @@ -38,7 +38,8 @@ #include #if defined(CONFIG_X86) || defined(CONFIG_IA64) || defined(__alpha__) || \ -defined(__mips__) || defined(CONFIG_SPARC64) || defined(CONFIG_SUPERH) +defined(__mips__) || defined(CONFIG_SPARC64) || defined(CONFIG_SUPERH) || \ +defined(CONFIG_PPC) || defined(__mc68000__) static int x86_sysrq_alt = 0; #ifdef CONFIG_SPARC64 @@ -63,8 +64,46 @@ 308,310,313,314,315,317,318,319,320,321,322,323,324,325,326,330, 332,340,341,342,343,344,345,346,356,359,365,368,369,370,371,372 }; +#ifdef CONFIG_MAC_EMUMOUSEBTN +extern int mac_hid_mouse_emulate_buttons(int, int, int); +#endif /* CONFIG_MAC_EMUMOUSEBTN */ +#ifdef CONFIG_MAC_ADBKEYCODES +extern int mac_hid_keyboard_sends_linux_keycodes(void); +#else +#define mac_hid_keyboard_sends_linux_keycodes()0 +#endif /* CONFIG_MAC_ADBKEYCODES */ +#if defined(CONFIG_MAC_ADBKEYCODES) || defined(CONFIG_ADB_KEYBOARD) +static unsigned char mac_keycodes[256] = { + 0, 53, 18, 19, 20, 21, 23, 22, 26, 28, 25, 29, 27, 24, 51, 48, +12, 13, 14, 15, 17, 16, 32, 34, 31, 35, 33, 30, 36, 54,128, 1, + 2, 3, 5, 4, 38, 40, 37, 41, 39, 50, 56, 42, 6, 7, 8, 
9, +11, 45, 46, 43, 47, 44,123, 67, 58, 49, 57,122,120, 99,118, 96, +97, 98,100,101,109, 71,107, 89, 91, 92, 78, 86, 87, 88, 69, 83, +84, 85, 82, 65, 42, 0, 10,103,111, 0, 0, 0, 0, 0, 0, 0, +76,125, 75,105,124,110,115, 62,116, 59, 60,119, 61,121,114,117, + 0, 0, 0, 0,127, 81, 0,113, 0, 0, 0, 0, 95, 55, 55, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 94, 0, 93, 0, 0, 0, 0, 0, 0,104,102 }; +#endif /* CONFIG_MAC_ADBKEYCODES || CONFIG_ADB_KEYBOARD */ + static int emulate_raw(unsigned int keycode, int down) { +#ifdef CONFIG_MAC_EMUMOUSEBTN + if (mac_hid_mouse_emulate_buttons(1, keycode, down)) + return 0; +#endif /* CONFIG_MAC_EMUMOUSEBTN */ +#if defined(CONFIG_MAC_ADBKEYCODES) || defined(CONFIG_ADB_KEYBOARD) + if (!mac_hid_keyboard_sends_linux_keycodes()) { + if (keycode > 255 || !mac_keycodes[keycode]) + return -1; + + handle_scancode((mac_keycodes[keycode] & 0x7f), down); + return 0; + } +#endif /* CONFIG_MAC_ADBKEYCODES || CONFIG_ADB_KEYBOARD */ + if (keycode > 255 || !x86_keycodes[keycode]) return -1; @@ -103,28 +142,6 @@ if (keycode == KEY_STOP) sparc_l1_a_state = down; #endif - - return 0; -} - -#elif defined(CONFIG_ADB_KEYBOARD) - -static unsigned char mac_keycodes[128] = - { 0, 53, 18, 19, 20, 21, 23, 22, 26, 28, 25, 29, 27, 24, 51, 48, -12, 13, 14, 15, 17, 16, 32, 34, 31, 35, 33, 30, 36, 54,128, 1, - 2, 3, 5, 4, 38, 40, 37, 41, 39, 50, 56, 42, 6, 7, 8, 9, -11, 45, 46, 43, 47, 44,123, 67, 58, 49, 57,122,120, 99,118, 96, -97, 98,100,101,109, 71,107, 89, 91, 92, 78, 86, 87, 88, 69, 83, -84, 85, 82, 65, 42, 0, 10,103,111, 0, 0, 0, 0, 0, 0, 0, -76,125, 75,105,124, 0,115, 62,116, 59, 60,119, 61,121,114,117, - 0, 0, 0, 0,127, 81, 0,113, 0, 0, 0, 0, 0, 55, 55 }; - -static int emulate_raw(unsigned int keycode, int down) -{ - if (keycode > 127 || !mac_keycodes[keycode]) - return -1; - - handle_scancode(mac_keycodes[keycode] & 0x7f, down); return 
0; }
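The ADB branch that the patch adds to emulate_raw() is a bounds-checked table lookup followed by a 7-bit mask. A small user-space sketch of that lookup follows, under stated assumptions: only the first two rows of mac_keycodes from the patch are reproduced, and the helper name linux_to_adb_keycode is invented here (the kernel code calls handle_scancode() instead of returning a value).

```c
#include <stddef.h>

/* First two rows of the mac_keycodes table from the patch, i.e. the
 * entries for Linux keycodes 0..31.  A 0 entry means "no mapping". */
static const unsigned char mac_keycodes_excerpt[32] = {
	 0, 53, 18, 19, 20, 21, 23, 22, 26, 28, 25, 29, 27, 24, 51, 48,
	12, 13, 14, 15, 17, 16, 32, 34, 31, 35, 33, 30, 36, 54,128,  1,
};

/* Hypothetical helper: returns the ADB keycode for a Linux keycode,
 * or -1 if the keycode is out of range or unmapped. */
int linux_to_adb_keycode(unsigned int keycode)
{
	size_t nkeys = sizeof(mac_keycodes_excerpt);

	if (keycode >= nkeys || !mac_keycodes_excerpt[keycode])
		return -1;
	/* Same masking as the patch: only the low 7 bits are passed on. */
	return mac_keycodes_excerpt[keycode] & 0x7f;
}
```

Note the entry 128 at index 30: after masking it becomes 0. Presumably this is how the table encodes ADB keycode 0 without colliding with the 0 that marks an unmapped key, though the message itself does not say so.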
Re: [2.4.3] PPP errors
Manfred H. Winter writes:

> Apr 4 02:05:21 marvin pppd[1227]: Plugin /usr/lib/passwordfd.so loaded.
> Apr 4 02:05:21 marvin pppd[1227]: pppd 2.4.0 started by mahowi, uid 500
> Apr 4 02:05:21 marvin pppd[1227]: Perms of /dev/ttyS0 are ok, no 'mesg n' neccesary.

Just out of curiosity, what pppd are you running, with what patches?  I don't recognize the message about 'perms of /dev/ttyS0'.  Or does this message come from the passwordfd.so plugin?

> Modules Loaded serial sb sb_lib uart401 isa-pnp NVdriver opl3 sound
> soundcore ipt_MASQUERADE iptable_nat ip_conntrack ppp_generic slhc iptable_filter
> ip_tables af_packet khttpd autofs4 unix 8139too ide-scsi aic7xxx scsi_mod

No ppp_async loaded - that's the problem.

Paul.
[PATCH] [RESEND] update chipsfb driver
Linus, At present, drivers/video/chipsfb.c can only be used on PPC, and it doesn't compile even on PPC. The patch below makes it compile, and by changing it to use the generic inb/outb, means that there is at least a chance it can be used on other platforms. The patch is against 2.4.3-pre7, could you apply it please? Paul. diff -urN linux/drivers/video/chipsfb.c pmac/drivers/video/chipsfb.c --- linux/drivers/video/chipsfb.c Thu Feb 22 14:25:27 2001 +++ pmac/drivers/video/chipsfb.cSat Mar 3 21:17:19 2001 @@ -29,17 +29,19 @@ #include #include #include +#include + #ifdef CONFIG_FB_COMPAT_XPMAC #include -#endif -#include -#include #include +#endif #ifdef CONFIG_PMAC_BACKLIGHT #include #endif +#ifdef CONFIG_PMAC_PBOOK #include #include +#endif #include #include @@ -56,14 +58,13 @@ struct { __u8 red, green, blue; } palette[256]; + struct pci_dev *pdev; unsigned long frame_buffer_phys; __u8 *frame_buffer; unsigned long blitter_regs_phys; __u32 *blitter_regs; unsigned long blitter_data_phys; __u8 *blitter_data; - unsigned long io_base_phys; - __u8 *io_base; struct fb_info_chips *next; #ifdef CONFIG_PMAC_PBOOK unsigned char *save_framebuffer; @@ -74,10 +75,10 @@ }; #define write_ind(num, val, ap, dp)do { \ - out_8(p->io_base + (ap), (num)); out_8(p->io_base + (dp), (val)); \ + outb((num), (ap)); outb((val), (dp)); \ } while (0) #define read_ind(num, var, ap, dp) do { \ - out_8(p->io_base + (ap), (num)); var = in_8(p->io_base + (dp)); \ + outb((num), (ap)); var = inb((dp)); \ } while (0); /* extension registers */ @@ -97,10 +98,10 @@ #define read_sr(num, var) read_ind(num, var, 0x3c4, 0x3c5) /* attribute registers - slightly strange */ #define write_ar(num, val) do { \ - in_8(p->io_base + 0x3da); write_ind(num, val, 0x3c0, 0x3c0); \ + inb(0x3da); write_ind(num, val, 0x3c0, 0x3c0); \ } while (0) #define read_ar(num, var) do { \ - in_8(p->io_base + 0x3da); read_ind(num, var, 0x3c0, 0x3c1); \ + inb(0x3da); read_ind(num, var, 0x3c0, 0x3c1); \ } while (0) static struct 
fb_info_chips *all_chips; @@ -117,7 +118,7 @@ */ int chips_init(void); -static void chips_of_init(struct device_node *dp); +static void chips_pci_init(struct pci_dev *dp); static int chips_get_fix(struct fb_fix_screeninfo *fix, int con, struct fb_info *info); static int chips_get_var(struct fb_var_screeninfo *var, int con, @@ -253,29 +254,29 @@ #endif /* CONFIG_PMAC_BACKLIGHT */ /* get the palette from the chip */ for (i = 0; i < 256; ++i) { - out_8(p->io_base + 0x3c7, i); + outb(i, 0x3c7); udelay(1); - p->palette[i].red = in_8(p->io_base + 0x3c9); - p->palette[i].green = in_8(p->io_base + 0x3c9); - p->palette[i].blue = in_8(p->io_base + 0x3c9); + p->palette[i].red = inb(0x3c9); + p->palette[i].green = inb(0x3c9); + p->palette[i].blue = inb(0x3c9); } for (i = 0; i < 256; ++i) { - out_8(p->io_base + 0x3c8, i); + outb(i, 0x3c8); udelay(1); - out_8(p->io_base + 0x3c9, 0); - out_8(p->io_base + 0x3c9, 0); - out_8(p->io_base + 0x3c9, 0); + outb(0, 0x3c9); + outb(0, 0x3c9); + outb(0, 0x3c9); } } else { #ifdef CONFIG_PMAC_BACKLIGHT set_backlight_enable(1); #endif /* CONFIG_PMAC_BACKLIGHT */ for (i = 0; i < 256; ++i) { - out_8(p->io_base + 0x3c8, i); + outb(i, 0x3c8); udelay(1); - out_8(p->io_base + 0x3c9, p->palette[i].red); - out_8(p->io_base + 0x3c9, p->palette[i].green); - out_8(p->io_base + 0x3c9, p->palette[i].blue); + outb(p->palette[i].red, 0x3c9); + outb(p->palette[i].green, 0x3c9); + outb(p->palette[i].blue, 0x3c9); } } } @@ -307,11 +308,11 @@ p->palette[regno].red = red; p->palette[regno].green = green; p->palette[regno].blue = blue; - out_8(p->io_base + 0x3c8, regno); + outb(regno, 0x3c8); udelay(1); - out_8(p->io_base + 0x3c9, red); - out_8(p->io_base + 0x3c9, green); - out_8(p->io_base + 0x3c9, blue); + outb(red, 0x3c9); + outb(green, 0x3c9); + outb(blue, 0x3c9); #ifdef FBCON_HAS_CFB16 if (regno < 16) @@ -388,7 +389,7 @@ disp->visual = fix->
[PATCH] MM update for PPC
Linus, The patch below updates the MM code for PowerPC to correspond with the recent generic MM changes. The patch is against 2.4.3-pre7, and it affects only arch/ppc/mm/init.c, include/asm-ppc/pgalloc.h, and include/asm-ppc/semaphore.h. The changes to semaphore.h are only necessary because the definition of INIT_MM in sched.h uses __RWSEM_INITIALIZER with the argument of RW_LOCK_BIAS, meaning an unlocked semaphore. I think RW_LOCK_BIAS is at the very least a horrible name for something that means an unlocked semaphore, and in fact it is really a private definition used in the i386 semaphore code which should never be used in generic code like this. (But no I don't have a patch to fix this properly at the moment.) Paul. diff -urN linux/arch/ppc/mm/init.c linuxppc_2_4/arch/ppc/mm/init.c --- linux/arch/ppc/mm/init.cWed Mar 21 15:43:54 2001 +++ linuxppc_2_4/arch/ppc/mm/init.c Thu Mar 22 10:39:23 2001 @@ -110,7 +110,7 @@ #endif void MMU_init(void); -static void *MMU_get_page(void); +void *early_get_page(void); unsigned long prep_find_end_of_memory(void); unsigned long pmac_find_end_of_memory(void); unsigned long apus_find_end_of_memory(void); @@ -125,7 +125,7 @@ unsigned long m8260_find_end_of_memory(void); #endif /* CONFIG_8260 */ static void mapin_ram(void); -void map_page(unsigned long va, unsigned long pa, int flags); +int map_page(unsigned long va, unsigned long pa, int flags); void set_phys_avail(unsigned long total_ram); extern void die_if_kernel(char *,struct pt_regs *,long); @@ -206,41 +206,20 @@ pmd_val(*pmd) = (unsigned long) BAD_PAGETABLE; } -pte_t *get_pte_slow(pmd_t *pmd, unsigned long offset) -{ -pte_t *pte; - -if (pmd_none(*pmd)) { - if (!mem_init_done) - pte = (pte_t *) MMU_get_page(); - else if ((pte = (pte_t *) __get_free_page(GFP_KERNEL))) - clear_page(pte); -if (pte) { -pmd_val(*pmd) = (unsigned long)pte; -return pte + offset; -} - pmd_val(*pmd) = (unsigned long)BAD_PAGETABLE; -return NULL; -} -if (pmd_bad(*pmd)) { -__bad_pte(pmd); -return NULL; -} 
-return (pte_t *) pmd_page(*pmd) + offset; -} - int do_check_pgt_cache(int low, int high) { int freed = 0; - if(pgtable_cache_size > high) { + if (pgtable_cache_size > high) { do { - if(pgd_quicklist) - free_pgd_slow(get_pgd_fast()), freed++; - if(pmd_quicklist) - free_pmd_slow(get_pmd_fast()), freed++; - if(pte_quicklist) - free_pte_slow(get_pte_fast()), freed++; - } while(pgtable_cache_size > low); +if (pgd_quicklist) { + free_pgd_slow(get_pgd_fast()); + freed++; + } + if (pte_quicklist) { + pte_free_slow(pte_alloc_one_fast()); + freed++; + } + } while (pgtable_cache_size > low); } return freed; } @@ -383,6 +362,7 @@ __ioremap(unsigned long addr, unsigned long size, unsigned long flags) { unsigned long p, v, i; + int err; /* * Choose an address to map it to. @@ -453,10 +433,20 @@ flags |= _PAGE_GUARDED; /* -* Is it a candidate for a BAT mapping? +* Should check if it is a candidate for a BAT mapping */ - for (i = 0; i < size; i += PAGE_SIZE) - map_page(v+i, p+i, flags); + + spin_lock(&init_mm.page_table_lock); + err = 0; + for (i = 0; i < size && err == 0; i += PAGE_SIZE) + err = map_page(v+i, p+i, flags); + spin_unlock(&init_mm.page_table_lock); + if (err) { + if (mem_init_done) + vfree((void *)v); + return NULL; + } + out: return (void *) (v + (addr & ~PAGE_MASK)); } @@ -492,7 +482,7 @@ return (pte_val(*pg) & PAGE_MASK) | (addr & ~PAGE_MASK); } -void +int map_page(unsigned long va, unsigned long pa, int flags) { pmd_t *pd; @@ -501,10 +491,13 @@ /* Use upper 10 bits of VA to index the first level map */ pd = pmd_offset(pgd_offset_k(va), va); /* Use middle 10 bits of VA to index the second-level map */ - pg = pte_alloc(pd, va); + pg = pte_alloc(&init_mm, pd, va); + if (pg == 0) + return -ENOMEM; set_pte(pg, mk_pte_phys(pa & PAGE_MASK, __pgprot(flags))); if (mem_init_done) flush_hash_page(0, va); + return 0; } #ifndef CONFIG_8xx @@ -830,21 +823,16 @@ } } -/* In fact t
Re: PATCH against 2.4.2: TTY hangup on PPP channel corrupts kernel memory
Kevin Buhr writes:

> I didn't realize my specific hang was a peculiarity of the older
> attachment style.  The channel created by pushing the PPP line

I didn't realize you were talking about linux 2.4.0 and pppd 2.3.11.

> discipline onto a TTY was connected to a unit with a PPPIOCATTACH
> ioctl on the TTY---this didn't really "attach" the channel; it still
> had a refcnt of only one.  Through the old compatibility interface, it
> was possible to call ppp_asynctty_read -> ppp_channel_read -> ppp_read
> on the channel's "struct ppp_file" and wait on the channel's "rwait".
> If the modem hung up, "do_tty_hangup" would call "ppp_asynctty_close"
> (with a reader still in "ppp_asynctty_read") and the "struct channel"
> would be freed in "ppp_unregister_channel".

That's one of the main reasons why I removed the compatibility stuff. :)

> I think your analysis of how things presently are with 2.4.2 and a
> modern "pppd" is correct...
>
> Since the new "pppd" uses an explicit PPPIOCATTCHAN / PPPIOCCONNECT
> sequence, the refcnt gets bumped to 2 and stays there while the
> channel is attached.  So, this specific hang isn't a problem anymore
> for "ppp_async.c".  It's still a problem with "ppp_synctty.c", though
> (when used with "pppd" 2.3.11, say).  Is the compatibility stuff in
> there slated for removal, too?

Yep, and we should take out the stuff in ppp_generic.c that was called by the compatibility stuff in the channels, too.

> In particular, the comment above "ppp_asynctty_close" is misleading.
> It's true that the TTY layer won't call any further line discipline
> entries while the "close" is executing; however, there may be
> processes already sleeping in line discipline functions called before
> the hangup.  For example, "ppp_asynctty_close" could be called while
> we sleep in the "get_user" in "ppp_channel_ioctl" (called from
> "ppp_asynctty_ioctl").  Therefore, calling "PPPIOCATTACH" on an
> unattached PPP-disciplined TTY could, in unlikely circumstances
> (argument swapped out), lead to a crash.

Yuck.  I don't see that we can protect against this without having some sort of lock in the tty structure, though.  We can't protect the existence of the channel structure with a lock inside that structure.  Ideally the necessary protection would be provided at the tty level.

> I assume PPPIOCATTACH (on the TTY) is deprecated in favor of
> PPPIOCATTCHAN / PPPIOCCONNECT (on the "/dev/ppp" handle).  Can we
> eliminate "ppp_channel_ioctl" from "ppp_async.c" entirely, as in the
> patch below?  We're requiring people to upgrade to "pppd" 2.4.0
> anyway, and it has no need for these calls.  This would give me a warm,
> fuzzy feeling.

Sure, that would be fine.  I'll make up a patch and send it to Linus.

Paul.
Re: PATCH against 2.4.2: TTY hangup on PPP channel corrupts kernel memory
Kevin Buhr writes:

> If there's a hangup in the TTY layer on an async PPP channel,
> do_tty_hangup shuts down the PPP line discipline, and, in ppp_async.c,
> the function ppp_asynctty_close unregisters the channel.  In
> ppp_generic.c, ppp_unregister_channel merrily wakes up the rwait
> queue, then proceeds to destroy the channel, freeing the "struct
> channel" which contains the "struct ppp_file" that contains the
> "wait_queue_head_t rwait".  When the waiting process wakes up, it
> removes itself from the wait queue, modifying freed memory.

But the waiting process must have had an instance of /dev/ppp open and attached to the channel in order to be doing anything with rwait, within either ppp_file_read or ppp_poll.  The process of attaching to the channel increases its refcnt, meaning that the channel shouldn't be destroyed until the instance of /dev/ppp is closed and ppp_release is called.

Note that pppd will not be blocking inside ppp_file_read since it sets the file descriptor non-blocking.  Most of the time pppd would be inside a select, so rwait would be in use by the poll/select code.

I presume that the generic file descriptor code ensures that the file release function doesn't get called while any task is inside the read or write function for that file, or while the file descriptor is in use in a select or poll.  If that assumption is wrong then it would indeed be possible for the channel to be destroyed while some process is waiting on rwait.  But in any case it shouldn't be a problem in practice since it would only be pppd that would have the channel open and pppd is single-threaded, i.e. it couldn't be closing the file descriptor while it is blocked inside read or select.

So, to put it in other words, this is the sequence (simplified):

	fd = open("/dev/ppp", O_RDWR);
	ioctl(fd, PPPIOCATTCHAN, &channel_number);
	fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
	select(...);	/* fd_sets including fd */
	read(fd, ...);
	...
	close(fd);

I believe the channel structure is guaranteed to exist from the ioctl to the close, and all the selects and reads (i.e. all the uses of rwait) have to happen within that time interval.

> A patch against 2.4.2 follows.  I've overloaded the "refcnt" in
> "struct ppp_file" to also keep track of rwaiters.  The last refcnt
> user destroys the channel and decreases the module use count.  I've
> tested this with printks in all the right places, and it seems to fix
> the problem correctly.

I'm not sure this is the right fix.  This sounds to me like the refcounts are going awry somehow or there is an SMP race that I haven't considered, and I am concerned that this patch will just cover over the real problem.  Actually, given that you've seen it 4 times in 6 months, it's more likely that it is an SMP race IMHO.

In any case I don't think your patch does the right thing with ppp_poll, because poll_wait doesn't actually wait; it just adds rwait to a list of things to watch for wakeups.  In other words, rwait will be in use from the time poll_wait is called until the time that the poll/select logic (in fs/select.c) decides that it's time to return to the user.  So increasing the refcount around just the poll_wait call won't help much.

Do you have a way to reproduce the problem at will?  Have you seen it happen on a UP box (i.e. could it be an SMP race)?  How sure are you that your patch really fixes the problem?

Regards,
Paul.

-- 
Paul Mackerras, Open Source Research Fellow, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
[EMAIL PROTECTED], http://www.linuxcare.com.au/
Linuxcare.  Putting Open Source to work.
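The lifetime argument above (the channel exists from the PPPIOCATTCHAN ioctl until the close) rests on ordinary reference counting. Here is a minimal user-space sketch of that idea only; struct chan and the chan_* helpers are hypothetical names for this illustration and are not the kernel's struct channel or the ppp_generic.c functions. It is single-threaded and deliberately ignores the locking a real driver needs.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a channel; not the kernel structure. */
struct chan {
	int refcnt;	/* one for the tty side, one per attached /dev/ppp fd */
	int *freed;	/* set to 1 when the object is released (for the demo) */
};

/* Corresponds loosely to registering the channel: starts with one reference. */
struct chan *chan_register(int *freed)
{
	struct chan *c = malloc(sizeof(*c));

	if (c == NULL)
		return NULL;
	c->refcnt = 1;
	c->freed = freed;
	*freed = 0;
	return c;
}

/* Taken when a /dev/ppp instance attaches to the channel. */
void chan_get(struct chan *c)
{
	c->refcnt++;
}

/* Dropped on unregister and on release; the last holder frees the object. */
void chan_put(struct chan *c)
{
	if (--c->refcnt == 0) {
		*c->freed = 1;
		free(c);
	}
}
```

In this model a hangup corresponds to one chan_put() from the tty side. If an attached file descriptor still holds its reference, the object survives until that descriptor's own chan_put(), which is the property the reasoning above relies on.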
Re: Bug in ppp_async.c
Albert D. Cahalan writes:

> Even Red Hat 7 only has the 2.3.11 version.
>
> The 2.4.xx series is supposed to be stable.  If there is any way
> you could add a compatibility hack, please do so.

Stable != backwards compatible to the year dot.  ppp-2.4.0 has been out for over 5 months now.  Adding the compatibility stuff back in would make the PPP subsystem much more complicated and less robust.

And pppd is not the only thing you would have to upgrade if you are using a 2.4.0 kernel with Red Hat 7.0 - I would expect that you would also at least have to upgrade modutils, and switch over from ipchains to iptables if you use the netfilter stuff.

Paul.

-- 
Paul Mackerras, Open Source Research Fellow, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
[EMAIL PROTECTED], http://www.linuxcare.com.au/
Linuxcare.  Support for the revolution.
Re: Bug in ppp_async.c
Jo l'Indien writes:

> I found a bug in the 2.4.1-pre10 version of ppp_async.c
>
> In fact, a lot of ioctl are not supported any more,
> which make the pppd start fail.

I'll bet you're using an old pppd.  You need version 2.4.0 of pppd, available from ftp://linuxcare.com.au/pub/ppp/, as documented in the Documentation/Changes file.

> PS: sorry, but I don't know who is the actual maintainer of this
> driver...

Me.

-- 
Paul Mackerras, Open Source Research Fellow, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
[EMAIL PROTECTED], http://www.linuxcare.com.au/
Linuxcare.  Support for the revolution.