Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Devicemap Wiki" for change notification.
The "Patterns2" page has been changed by rezan:
https://wiki.apache.org/devicemap/Patterns2?action=diff&rev1=2&rev2=3

Comment:
type

  Draft 1, 2014-01-09

  This is the DeviceMap data specification for patterns and attributes.
- 
- All encodings in this document are UTF8.

  === Overview ===

@@ -58, +56 @@

  Each pattern file defines the domain input parsing rules:

- inputTransformers::
+ InputTransformers::
   :: Type: list of transformation steps
   :: Optional. Default: none
   :: TODO: define what exactly these can be.
- tokenSeparators::
+ TokenSeparators::
   :: Type: list of token seperator strings
   :: Optional. Default: none
- ngramConcatSize::
+ NgramConcatSize::
   :: Type: greater than zero integer
   :: Optional. Default: 1

@@ -84, +82 @@

  pattern matching step before moving on to the next token. This algorithm is pipeline and thread safe.

- If the ngramConcatSize is greater than 1, the largest ngram must be
+ If the Ngram``Concat``Size is greater than 1, the largest ngram must be
  made first before creating the smaller ngrams.

  === Example ===

  {{{
- inputTransformers: lowercase, [0-9]+ => _NUM
+ InputTransformers: lowercase, [0-9]+ => _NUM
- tokenSeparators: [space]
+ TokenSeparators: [space]
- ngramConcatSize: 2
+ NgramConcatSize: 2

  Input string: A 12 xyZ

@@ -122, +120 @@

  All the pattern types are prefixed with 'Simple'. This means that each pattern token is matched using a plain UTF8 string comparison. No regex or other syntax is allowed in Simple patterns.

- This allows the algorithm to use simple string hashing for matching. This gives maximum performance and scaling complexity equal to a hashtable implementation. A Simple``HashCount attribute can be optionally defined which hints the classifier as to how many unique hashes it would need to generate to support the pattern set.
+ This allows the algorithm to use simple string hashing for matching. This gives maximum performance and scaling complexity equal to a hashtable implementation. A Simple``Hash``Count attribute can be optionally defined which hints the classifier as to how many unique hashes it would need to generate to support the pattern set.

  Pattern attributes:

@@ -149, +147 @@

  Default::
   :: Type: boolean
   :: Optional. Default: false.
-  :: Only 1 pattern can have a true value of false.
+  :: Only 1 pattern can have a true value.

  == PatternType ==
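
The input parsing rules quoted in the diff above (InputTransformers, TokenSeparators, NgramConcatSize) can be illustrated with a minimal Python sketch. This is not the DeviceMap implementation. The function name `tokenize`, its signature, the choice to join ngram tokens with no separator, and the per-position emission order are all assumptions made for illustration; the quoted text only requires that the largest ngram be built before the smaller ones.

```python
import re
from typing import Callable, List


def tokenize(
    text: str,
    transformers: List[Callable[[str], str]],
    separators: List[str],
    ngram_size: int = 1,
) -> List[str]:
    """Hypothetical sketch of the parsing rules; names are illustrative."""
    # 1. Apply each input transformer in the order it is listed.
    for transform in transformers:
        text = transform(text)

    # 2. Split on any of the token separator strings, dropping empty tokens.
    if separators:
        pattern = "|".join(re.escape(sep) for sep in separators)
        tokens = [tok for tok in re.split(pattern, text) if tok]
    else:
        tokens = [text] if text else []

    # 3. At each token position, build the largest ngram first and then
    #    the smaller ones (assumption: tokens are joined with no separator).
    ngrams: List[str] = []
    for i in range(len(tokens)):
        largest = min(ngram_size, len(tokens) - i)
        for n in range(largest, 0, -1):
            ngrams.append("".join(tokens[i:i + n]))
    return ngrams


# Settings taken from the Example section quoted in the diff:
# lowercase, [0-9]+ => _NUM, separator [space], NgramConcatSize 2.
transformers = [str.lower, lambda s: re.sub(r"[0-9]+", "_NUM", s)]
result = tokenize("A 12 xyZ", transformers, [" "], 2)
print(result)
```

With these assumptions the sketch yields a_NUM, a, _NUMxyz, _NUM, xyz for the input string A 12 xyZ. The page's own expected output is not part of this diff, so the result should be checked against the full specification.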
