John et al,

> WLOG is mathspeak for Without Loss of Generality.  Sorry :)

OK, WTF.  :-)

> Ocaml is a programming language.  Ocaml strings are multiple words
> with null padding at the end plus an indicator how many bytes of
> padding got added.

OK, I guess if you know the language, then the format is a good
metaphor.

So how do you know when you're in the last word, to say it's time to
stop instead of continuing?  Any pattern you can invent with nulls and a
count, I can assert to exist somewhere in the middle of a longer string,
so long as all bytes 0..255 are valid in strings.  (No distinguished
values like a terminator.)  So far this just sounds to me like a
variation on C length-terminated (by nulls) with a fancier termination
code.

Anyway if I had more time and energy, I'd be inclined to revisit this
topic and maybe even write some CODE.  Well, maybe someday when I'm
caught up with O(N^2) other retirement tasks!

OK, at least you motivated me to glance at Judy.h.  I found this:

// The JLAP_INVALID pattern never appears in a JudyL array pointer.  Hence
// applications that build trees of JudyL arrays can use this pattern to
// distinguish a JudyL array value that is a pointer to another JudyL array
// from one that is a pointer to some other, application-defined object.

OK I remember now, there's ONE pointer LSB pattern an application can
use to embed a non-Judy pointer (JudyL value) in a meta-trie (array of
array of JudyL).  This should be sufficient!

Basically if all common substrings through some point in the meta-trie
continue at least one level deeper, then you just make the present JudyL
value be a pointer to another JudyL array.  But if one or more
substrings terminate "at this level" then you store a pointer (to your
own user data) with JLAP_INVALID to mark it as such.  Of course your
meta-library must encode/decode this above libJudy while descending the
meta-trie for any operation.

"Your own user data" is some kind of node/struct that must include at
least one of:  A value area for 1..4 or 1..8 terminating substrings
(depending on word size), and possibly also one of:  A value area
holding a pointer to a subsidiary JudyL array for superstrings.

For example, suppose I store ALL of these trailing substrings:

  ... 0x01
  ... 0x0100
  ... 0x010000
  ... 0x01000000
  ... 0x01000000 0x05...

  ... 0x0102
  ... 0x010200
  ... 0x01020000
  ... 0x01020000 0x05...

  ... 0x010203
  ... 0x01020300
  ... 0x01020300 0x05...

  ... 0x01020304
  ... 0x01020304 0x05...

Each group above occupies one key in a given JudyL array (meta-trie
node), but as you can see, in the worst case (the first group), four
different terminating strings plus one or more superstrings is
represented.  All appear under a four-byte key of 0x01000000 (since
nulls are appended to shorter substrings).  The value associated with
that key is a JLAP_INVALID pointer to my own data structure where I list
which combinations are valid, like this (informally):

  slot 1: yes + value  # meaning 0x01 exists.
  slot 2: yes + value  # meaning 0x0100 exists (and so on).
  slot 3: yes + value
  slot 4: yes + value
  slot 5: yes + value (which is a pointer to a JudyL array)

Having never coded this, I don't know if it would pan out, or how it
would perform.  The size would have to be at least 5 bits + 5 words, or
really 6 words, on a 32-bit machine.

Just to be clearer, suppose I only stored these three strings/keys in an
entire improved JudyNL array:

  length = 1:  0x01
  length = 5:  0x01000000 0x05
  length = 8:  0x01000000 0x05060708

The 0x01000000 key in the top JudyL array would point (with
JLAP_INVALID) to a JudyNL special node that says:

  slot 1: yes + value  # meaning 0x01 exists, and here's its value area.
  slot 2: no
  slot 3: no
  slot 4: no
  slot 5: yes + value (which is a pointer to a JudyL array containing
          0x05 and 0x05060708)

Got it?

Cheers,
Alan

------------------------------------------------------------------------------
Flow-based real-time traffic analytics software. Cisco certified tool.
Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer
Customize your own dashboards, set traffic alerts and generate reports.
Network behavioral analysis & security monitoring. All-in-one tool.
http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel

Reply via email to