Signed-off-by: Mike Holmes <mike.hol...@linaro.org>
---
 doc/users-guide/users-guide.adoc | 161 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 161 insertions(+)

diff --git a/doc/users-guide/users-guide.adoc b/doc/users-guide/users-guide.adoc
index cf77fa0..d2e1a16 100644
--- a/doc/users-guide/users-guide.adoc
+++ b/doc/users-guide/users-guide.adoc
@@ -431,6 +431,167 @@ Applications only include the 'include/odp.h file which includes the 'platform/<
 The doxygen documentation defining the behavior of the ODP API is all contained in the public API files, and the actual definitions for an implementation will be found in the per platform directories.
 Per-platform data that might normally be a #define can be recovered via the appropriate access function if the #define is not directly visible to the application.
 
+== Helpers
+Many small helper functions and definitions are needed to enable ODP
+applications to be hardware optimized without being tied to particular hardware
+or a particular execution environment. These are typically implemented with
+inline functions, preprocessor macros, or compiler builtin features. Thus API
+definitions are normally inline when possible.
+
+=== Core enumeration
+Applications and middleware often need to handle physical and/or logical core
+IDs, core counts, and core masks. Core enumeration has to remain consistent even
+when core deployment changes during application execution (e.g., due to
+adaptation to a changing traffic profile).
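To illustrate the kind of core-mask handling this section describes, here is a minimal standalone sketch. It does not use the ODP API: `mask_t` and the `mask_*` helpers are hypothetical stand-ins, backed by a single 64-bit word, that only mirror the roles of `odp_cpumask_zero()`, `odp_cpumask_set()`, `odp_cpumask_isset()`, `odp_cpumask_count()`, `odp_cpumask_first()` and `odp_cpumask_next()`. The 64-core limit and the return value of -1 for "no more cores" are assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU mask: one bit per core, at most 64 cores.
 * The real ODP mask type is opaque and implementation defined. */
typedef struct {
    uint64_t bits;
} mask_t;

/* Clear every core in the mask. */
static void mask_zero(mask_t *m)
{
    m->bits = 0;
}

/* Add one core to the mask. */
static void mask_set(mask_t *m, int cpu)
{
    assert(cpu >= 0 && cpu < 64);
    m->bits |= UINT64_C(1) << cpu;
}

/* Return 1 if the core is in the mask, 0 otherwise. */
static int mask_isset(const mask_t *m, int cpu)
{
    return cpu >= 0 && cpu < 64 && ((m->bits >> cpu) & 1);
}

/* Count the cores in the mask by clearing the lowest set bit repeatedly. */
static int mask_count(const mask_t *m)
{
    int n = 0;

    for (uint64_t b = m->bits; b != 0; b &= b - 1)
        n++;
    return n;
}

/* Lowest core in the mask at or above 'from', or -1 if there is none. */
static int mask_next_from(const mask_t *m, int from)
{
    for (int cpu = from < 0 ? 0 : from; cpu < 64; cpu++)
        if (mask_isset(m, cpu))
            return cpu;
    return -1;
}

/* First core in the mask, or -1 if the mask is empty. */
static int mask_first(const mask_t *m)
{
    return mask_next_from(m, 0);
}

/* Next core in the mask after 'cpu', or -1 if 'cpu' was the last one. */
static int mask_next(const mask_t *m, int cpu)
{
    return mask_next_from(m, cpu + 1);
}
```

A worker-launch loop would typically walk such a mask with `mask_first()` and `mask_next()` until -1 is returned, so that the iteration order stays the same regardless of which cores happen to be enabled.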
+
+* +odp_cpumask_from_str()+
+* +odp_cpumask_to_str()+
+* +odp_cpumask_zero()+
+* +odp_cpumask_set()+
+* +odp_cpumask_setall()+
+* +odp_cpumask_clr()+
+* +odp_cpumask_isset()+
+* +odp_cpumask_count()+
+* +odp_cpumask_and()+
+* +odp_cpumask_or()+
+* +odp_cpumask_xor()+
+* +odp_cpumask_equal()+
+* +odp_cpumask_copy()+
+* +odp_cpumask_first()+
+* +odp_cpumask_last()+
+* +odp_cpumask_next()+
+* +odp_cpumask_default_worker()+
+* +odp_cpumask_default_control()+
+
+=== Memory alignments
+For optimal performance and scalability (e.g., to avoid false sharing and cache
+line aliasing), some application data structures need to be aligned to cache
+(cache line) and/or memory subsystem (page, DRAM burst) boundaries. NUMA
+systems also support location awareness and potentially different cache line
+sizes on a per-memory basis. Static memory allocation serves application needs
+for portable definitions for global and core/thread local data.
+
+* +ODP_ALIGNED+
+* +ODP_PACKED+
+* +ODP_OFFSETOF+
+* +ODP_FIELD_SIZEOF+
+* +ODP_CACHE_LINE_SIZE+
+* +ODP_PAGE_SIZE+
+* +ODP_ALIGNED_CACHE+
+* +ODP_ALIGNED_PAGE+
+
+=== Compiler hints
+The compiler and linker can optimize better if code includes hints on
+expected application behavior. Examples of these are classification of
+branches with likely/unlikely hints, or marking code with hot (optimize for
+speed) or cold (optimize for size) tags.
+
+* +odp_likely()+
+* +odp_unlikely()+
+* +odp_prefetch()+
+* +odp_prefetch_store()+
+
+=== Atomic operations
+Modern ISAs offer various atomic instructions to access/manipulate data
+concurrently from multiple cores. Scalable multicore software is possible
+only through correct usage (and combination) of hardware acceleration and
+atomic instructions. Applications use atomic operations to update global
+statistics, sequence counters, quotas, etc., and to build concurrent data
+structures.
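The statistics and quota use cases above can be sketched with standard C11 atomics. This is a stand-in written against `<stdatomic.h>`, not the ODP API: `counter_t`, `counter_init()`, `counter_fetch_add()`, `counter_load()` and `quota_take()` are hypothetical names, and the choice of relaxed memory ordering is an assumption that suits plain counters but not data that must be published to other cores.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit atomic counter, standing in for an ODP atomic u64. */
typedef struct {
    _Atomic uint64_t v;
} counter_t;

/* Set the initial value; call before any other core touches the counter. */
static void counter_init(counter_t *c, uint64_t val)
{
    atomic_init(&c->v, val);
}

/* Read the current value. */
static uint64_t counter_load(counter_t *c)
{
    return atomic_load_explicit(&c->v, memory_order_relaxed);
}

/* Add 'n' and return the value the counter held before the addition. */
static uint64_t counter_fetch_add(counter_t *c, uint64_t n)
{
    return atomic_fetch_add_explicit(&c->v, n, memory_order_relaxed);
}

/* Take 'n' units from a shared quota only if that many are left.
 * Returns 1 on success and 0 if the quota is too small. The
 * compare-and-swap loop retries when another core changed the quota
 * between the load and the update. */
static int quota_take(counter_t *q, uint64_t n)
{
    uint64_t cur = atomic_load_explicit(&q->v, memory_order_relaxed);

    while (cur >= n) {
        if (atomic_compare_exchange_weak_explicit(&q->v, &cur, cur - n,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return 1;
        /* On failure 'cur' now holds the latest value; re-check and retry. */
    }
    return 0;
}
```

The fetch-and-add form covers statistics and sequence counters in a single instruction on most targets, while the quota needs a conditional update and therefore a retry loop.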
+
+* +odp_atomic_init_u64()+
+* +odp_atomic_load_u64()+
+* +odp_atomic_store_u64()+
+* +odp_atomic_fetch_add_u64()+
+* +odp_atomic_add_u64()+
+* +odp_atomic_fetch_sub_u64()+
+* +odp_atomic_sub_u64()+
+* +odp_atomic_fetch_inc_u64()+
+* +odp_atomic_inc_u64()+
+* +odp_atomic_fetch_dec_u64()+
+* +odp_atomic_dec_u64()+
+
+=== Memory synchronization barriers
+Applications (or middleware) need a portable way to synchronize data
+modifications into main memory before messaging other cores or hardware
+accelerators about the changes. The nature of the synchronization needed is
+cache coherence protocol specific.
+
+* +odp_barrier_t+
+* +odp_rwlock_t+
+* +odp_ticketlock_t+
+* +odp_barrier_init()+
+* +odp_barrier_wait()+
+* +odp_rwlock_init()+
+* +odp_rwlock_read_lock()+
+* +odp_rwlock_read_unlock()+
+* +odp_rwlock_write_lock()+
+* +odp_rwlock_write_unlock()+
+* +odp_sync_stores()+
+* +odp_ticketlock_init()+
+* +odp_ticketlock_lock()+
+* +odp_ticketlock_trylock()+
+* +odp_ticketlock_unlock()+
+* +odp_ticketlock_is_locked()+
+
+=== Execution barriers and spinlocks
+Although software locking should be avoided (especially in fast path code), at
+times there is no practical way to synchronize cores other than using execution
+barriers or spinlocks. For example, the application initialization phase
+typically is not performance critical and may be much simpler with synchronous
+interfaces and locking.
+
+* +odp_spinlock_t+
+* +odp_spinlock_init()+
+* +odp_spinlock_lock()+
+* +odp_spinlock_trylock()+
+* +odp_spinlock_unlock()+
+* +odp_spinlock_is_locked()+
+
+=== Profiling and debugging
+Although there are (external) tools for profiling and debugging, some level of
+application code instrumentation is typically needed (e.g., for in-field
+debug/profiling). Typically an SoC supports CPU level (e.g., cycle count, cache
+misses, branch prediction misses) and SoC level (system cache misses,
+interconnect/DRAM utilization) performance counters.
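Cycle-count instrumentation of the kind mentioned above can be sketched as follows. The sketch is standalone and does not call ODP: reading the hardware counter is platform specific, so the samples are passed in as plain integers. `CYCLES_MAX`, `cycles_diff()`, `probe_t` and `probe_record()` are hypothetical names, and the 32-bit counter width is an assumption chosen to make the wrap-around case visible.

```c
#include <stdint.h>

/* Hypothetical maximum value of the cycle counter before it wraps to 0.
 * A real platform would report its own maximum. */
#define CYCLES_MAX UINT64_C(0xFFFFFFFF)

/* Cycles elapsed from sample c1 to the later sample c2, allowing for at
 * most one counter wrap between the two samples. */
static uint64_t cycles_diff(uint64_t c2, uint64_t c1)
{
    if (c2 >= c1)
        return c2 - c1;
    /* Counter wrapped: cycles up to the maximum, one for the wrap to 0,
     * then the cycles counted since the wrap. */
    return c2 + (CYCLES_MAX - c1) + 1;
}

/* Hypothetical per-code-path probe collecting simple statistics. */
typedef struct {
    uint64_t calls; /* number of recorded intervals */
    uint64_t total; /* sum of all intervals, in cycles */
    uint64_t worst; /* longest single interval, in cycles */
} probe_t;

/* Record one measured interval given its start and end samples. */
static void probe_record(probe_t *p, uint64_t start, uint64_t end)
{
    uint64_t d = cycles_diff(end, start);

    p->calls++;
    p->total += d;
    if (d > p->worst)
        p->worst = d;
}
```

Keeping the wrap handling in one helper means the instrumented code only takes a sample before and after the section of interest and never compares raw counter values directly.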
+
+* +odp_errno()+
+* +odp_errno_zero()+
+* +odp_errno_print()+
+* +odp_errno_str()+
+
+* +odp_override_log()+
+* +odp_override_abort()+
+
+=== SoC Hardware info
+The application may be interested in generic performance characteristics of the
+SoC it is running on so that it can adapt optimally to the system.
+
+* +odp_cpu_id()+
+* +odp_cpu_count()+
+* +odp_cpu_cycles()+
+* +odp_cpu_cycles_diff()+
+* +odp_cpu_cycles_max()+
+* +odp_cpu_cycles_resolution()+
+
+=== Data manipulation
+There are some data manipulation operations that are typical to networking
+applications. Examples of these are byte order swap for big/little-endian
+conversion, various checksum algorithms, and bit shuffling/shifting.
+
+* +odp_be_to_cpu_16()+
+* +odp_be_to_cpu_32()+
+* +odp_be_to_cpu_64()+
+* +odp_cpu_to_be_16()+
+* +odp_cpu_to_be_32()+
+* +odp_cpu_to_be_64()+
+* +odp_le_to_cpu_16()+
+* +odp_le_to_cpu_32()+
+* +odp_le_to_cpu_64()+
+* +odp_cpu_to_le_16()+
+* +odp_cpu_to_le_32()+
+* +odp_cpu_to_le_64()+
+
 .Users include structure
 ----
 ./
-- 
2.5.0