On Tue, Jan 31, 2017 at 05:03:51PM +0100, Jan Stancek wrote:
> On 01/30/2017 07:49 PM, Jiri Olsa wrote:
> > so basically we're changing from avail to online cpus
> >
> > have you checked all the users of this FEATURE
> > if such change is ok?
>
> Jiri,
>
> It wasn't OK as there are other users who index cpu_topology_map by CPU id.
> I decided to give the alternative a try (attached): keep cpu_topology_map
> indexed by CPU id, but extend it to fit max present CPU.

please send this next time as a standard patchset, it's hard
to discuss over attachments

SNIP

> When build_cpu_topo() encounters offline/absent CPUs,
> it fails to find any sysfs entries and returns failure.
> This leads to build_cpu_topology() and write_cpu_topology()
> failing as well.
>
> Because HEADER_CPU_TOPOLOGY has not been written, read leaves
> cpu_topology_map NULL and we get NULL ptr deref at:
>
> ...
> cmd_test
> __cmd_test
> test_and_print
> run_test
> test_session_topology
> check_cpu_topology

So IIUIC that's the key issue here.. write_cpu_topology fails
to write the TOPO data and the following readers crash on processing
incomplete data? if that's the case write_cpu_topology needs to be
fixed, instead of doing workarounds

SNIP

>  	u32 nr, i;
>  	size_t sz;
>  	long ncpus;
> -	int ret = -1;
> +	int ret = 0;
> +	struct cpu_map *map;
>
>  	ncpus = sysconf(_SC_NPROCESSORS_CONF);
>  	if (ncpus < 0)
> -		return NULL;
> +		goto out;

can just return NULL

> +
> +	/* build online CPU map */
> +	map = cpu_map__new(NULL);
> +	if (map == NULL) {
> +		pr_debug("failed to get system cpumap\n");
> +		goto out;
> +	}
>
>  	nr = (u32)(ncpus & UINT_MAX);
>
>  	sz = nr * sizeof(char *);
> -
>  	addr = calloc(1, sizeof(*tp) + 2 * sz);
>  	if (!addr)
> -		return NULL;
> +		goto out_free;
>
>  	tp = addr;
>  	tp->cpu_nr = nr;
> @@ -530,14 +537,21 @@ static struct cpu_topo *build_cpu_topology(void)
>  	tp->thread_siblings = addr;
>
>  	for (i = 0; i < nr; i++) {
> +		if (!cpu_map__has(map, i))
> +			continue;
> +

so this prevents build_cpu_topo from failing due to missing topology
info because the cpu is offline.. can it fail for other reasons?

>  		ret = build_cpu_topo(tp, i);
>  		if (ret < 0)
>  			break;

SNIP
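
To make the "other reasons" question concrete, here is a minimal standalone
sketch of the skip-offline loop. It is not perf code: parse_cpu_list(),
build_topology(), fake_build_cpu_topo() and struct probe_ctx are made-up
names for illustration, and the CPU list string is assumed to be in the
"0-3,5" format that sysfs uses for online CPUs. It shows that skipping
offline CPUs avoids one failure, but a probe that fails on an online CPU
still aborts the whole build:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CPUS 1024

/* Hypothetical helper: parse a CPU list such as "0-3,5" into a bool
 * array. Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_cpu_list(const char *list, bool *online, int max)
{
	const char *p = list;

	assert(list != NULL && online != NULL);
	for (int i = 0; i < max; i++)
		online[i] = false;

	while (*p && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)
			return -1;
		if (*end == '-') {		/* range "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= max)
			return -1;
		for (long c = lo; c <= hi; c++)
			online[c] = true;
		if (*end == ',')
			end++;
		else if (*end && *end != '\n')
			return -1;
		p = end;
	}
	return 0;
}

/* Stand-in for build_cpu_topo(): counts successful probes and fails
 * on one chosen CPU, whatever the reason for that failure is. */
struct probe_ctx {
	int probed;
	int failing_cpu;
};

static int fake_build_cpu_topo(int cpu, void *ctx)
{
	struct probe_ctx *pc = ctx;

	if (cpu == pc->failing_cpu)
		return -1;
	pc->probed++;
	return 0;
}

/* Walk all configured CPUs, skip the offline ones, and stop on the
 * first probe failure, as the loop in the quoted hunk does. */
static int build_topology(int ncpus, const bool *online,
			  int (*probe)(int cpu, void *ctx), void *ctx)
{
	int ret = 0;

	for (int i = 0; i < ncpus; i++) {
		if (!online[i])
			continue;
		ret = probe(i, ctx);
		if (ret < 0)
			break;
	}
	return ret;
}
```

With CPUs "0-1,3" online and the probe failing only on offline CPU 2, the
build succeeds; if the probe fails on online CPU 1 instead, the error still
propagates, so the writer side has to cope with that case either way.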