Signed-off-by: Mike Holmes <mike.hol...@linaro.org>
---
 doc/users-guide/users-guide.adoc | 161 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 161 insertions(+)

diff --git a/doc/users-guide/users-guide.adoc b/doc/users-guide/users-guide.adoc
index cf77fa0..d2e1a16 100644
--- a/doc/users-guide/users-guide.adoc
+++ b/doc/users-guide/users-guide.adoc
@@ -431,6 +431,167 @@ Applications only include the 'include/odp.h file which includes the 'platform/<
 The doxygen documentation defining the behavior of the ODP API is all contained in the public API files, and the actual definitions for an implementation will be found in the per platform directories.
 Per-platform data that might normally be a #define can be recovered via the appropriate access function if the #define is not directly visible to the application.
 
+== Helpers
+Many small helper functions and definitions are needed to let ODP applications
+be optimized for the hardware without being tied to a particular hardware
+platform or execution environment. These are typically implemented as inline
+functions, preprocessor macros, or compiler built-in features, so API
+definitions are normally inline where possible.
+
+=== Core enumeration
+Applications and middleware often need to handle physical and/or logical core
+IDs, core counts, and core masks. Core enumeration has to remain consistent
+even when core deployment changes during application execution (e.g., to
+adapt to a changing traffic profile).
+
+* +odp_cpumask_from_str()+
+* +odp_cpumask_to_str()+
+* +odp_cpumask_zero()+
+* +odp_cpumask_set()+
+* +odp_cpumask_setall()+
+* +odp_cpumask_clr()+
+* +odp_cpumask_isset()+
+* +odp_cpumask_count()+
+* +odp_cpumask_and()+
+* +odp_cpumask_or()+
+* +odp_cpumask_xor()+
+* +odp_cpumask_equal()+
+* +odp_cpumask_copy()+
+* +odp_cpumask_first()+
+* +odp_cpumask_last()+
+* +odp_cpumask_next()+
+* +odp_cpumask_default_worker()+
+* +odp_cpumask_default_control()+
+
+=== Memory alignments
+For optimal performance and scalability (e.g., to avoid false sharing and cache
+line aliasing), some application data structures need to be aligned to cache
+(cache line) and/or memory subsystem (page, DRAM burst) boundaries. NUMA
+systems also support location-awareness and potentially different cache line
+sizes on a per-memory basis. Static memory allocation serves application needs
+for portable definitions of global and core/thread-local data.
+
+* +ODP_ALIGNED+
+* +ODP_PACKED+
+* +ODP_OFFSETOF+
+* +ODP_FIELD_SIZEOF+
+* +ODP_CACHE_LINE_SIZE+
+* +ODP_PAGE_SIZE+
+* +ODP_ALIGNED_CACHE+
+* +ODP_ALIGNED_PAGE+
+
+=== Compiler hints
+The compiler and linker can optimize better if the code includes hints about
+expected application behavior. Examples are classifying branches with
+likely/unlikely hints, or marking code with hot (optimize for speed) or cold
+(optimize for size) tags.
+
+* +odp_likely()+
+* +odp_unlikely()+
+* +odp_prefetch()+
+* +odp_prefetch_store()+
+
+=== Atomic operations
+Modern ISAs offer various atomic instructions to access and manipulate data
+concurrently from multiple cores. Scalable multicore software is possible
+only through correct use (and combination) of hardware acceleration and
+atomic instructions. Applications use atomic operations to update global
+statistics, sequence counters, quotas, etc., and to build concurrent data
+structures.
+
+* +odp_atomic_init_u64()+
+* +odp_atomic_load_u64()+
+* +odp_atomic_store_u64()+
+* +odp_atomic_fetch_add_u64()+
+* +odp_atomic_add_u64()+
+* +odp_atomic_fetch_sub_u64()+
+* +odp_atomic_sub_u64()+
+* +odp_atomic_fetch_inc_u64()+
+* +odp_atomic_inc_u64()+
+* +odp_atomic_fetch_dec_u64()+
+* +odp_atomic_dec_u64()+
+
+=== Memory synchronization barriers
+Applications (or middleware) need a portable way to synchronize data
+modifications into main memory before notifying other cores or hardware
+accelerators about the changes. The synchronization required depends on the
+cache coherence protocol.
+
+* +odp_barrier_t+
+* +odp_rwlock_t+
+* +odp_ticketlock_t+
+* +odp_barrier_init()+
+* +odp_barrier_wait()+
+* +odp_rwlock_init()+
+* +odp_rwlock_read_lock()+
+* +odp_rwlock_read_unlock()+
+* +odp_rwlock_write_lock()+
+* +odp_rwlock_write_unlock()+
+* +odp_sync_stores()+
+* +odp_ticketlock_init()+
+* +odp_ticketlock_lock()+
+* +odp_ticketlock_trylock()+
+* +odp_ticketlock_unlock()+
+* +odp_ticketlock_is_locked()+
+
+=== Execution barriers and spinlocks
+Although software locking should be avoided (especially in fast path code), at
+times there is no practical way to synchronize cores other than using execution
+barriers or spinlocks. For example, the application initialization phase
+is typically not performance critical and may be much simpler with synchronous
+interfaces and locking.
+
+* +odp_spinlock_t+
+* +odp_spinlock_init()+
+* +odp_spinlock_lock()+
+* +odp_spinlock_trylock()+
+* +odp_spinlock_unlock()+
+* +odp_spinlock_is_locked()+
+
+=== Profiling and debugging
+Although there are (external) tools for profiling and debugging, some level of
+application code instrumentation is typically needed (e.g., for in-field
+debugging and profiling). An SoC typically supports CPU-level (e.g., cycle
+count, cache misses, branch prediction misses) and SoC-level (system cache
+misses, interconnect/DRAM utilization) performance counters.
+
+* +odp_errno()+
+* +odp_errno_zero()+
+* +odp_errno_print()+
+* +odp_errno_str()+
+
+* +odp_override_log()+
+* +odp_override_abort()+
+
+=== SoC Hardware info
+The application may be interested in generic performance characteristics of the
+SoC it is running on so that it can adapt optimally to the system.
+
+* +odp_cpu_id()+
+* +odp_cpu_count()+
+* +odp_cpu_cycles()+
+* +odp_cpu_cycles_diff()+
+* +odp_cpu_cycles_max()+
+* +odp_cpu_cycles_resolution()+
+
+=== Data manipulation
+There are some data manipulation operations that are typical of networking
+applications. Examples are byte order swaps for big/little-endian
+conversion, various checksum algorithms, and bit shuffling/shifting.
+
+* +odp_be_to_cpu_16()+
+* +odp_be_to_cpu_32()+
+* +odp_be_to_cpu_64()+
+* +odp_cpu_to_be_16()+
+* +odp_cpu_to_be_32()+
+* +odp_cpu_to_be_64()+
+* +odp_le_to_cpu_16()+
+* +odp_le_to_cpu_32()+
+* +odp_le_to_cpu_64()+
+* +odp_cpu_to_le_16()+
+* +odp_cpu_to_le_32()+
+* +odp_cpu_to_le_64()+
+
 .Users include structure
 ----
 ./
-- 
2.5.0

