Re: [Sip] Is body handling NP-complete?

Paul Kyzivat Wed, 01 Oct 2008 09:43:45 -0700

I think I would prefer a declarative specification of body handling asthe normative part. But if we have a good processing algorithm to show,I would like to see it included, perhaps as a non-normative appendix.Its a good demonstration that the declarative spec is self consistent,and it gives developers a good starting point.


        Thanks,
        Paul


DRAGE, Keith (Keith) wrote:

I would note that if this does end up in the specification, it can carry
a rider that implementations should exhibit behaviour as demonstrated by
the algorithm, not that they need to implement the algorithm itself.
However I leave it to the working group to ascertain whether they prefer
this as a means of specification.

Regards

Keith
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] OnBehalf Of Paul Kyzivat
Sent: Tuesday, September 30, 2008 2:53 PM
To: [EMAIL PROTECTED]
Cc: [email protected]
Subject: Re: [Sip] Is body handling NP-complete?

Dale,

I like the thorough treatment. It seems plausible to me.
One implication of this is that body parts need to beprocessed twice in a content-type+content-disposition+contextdependent way: once to extract references, and again forfinal effect. (Hmm - is extracting of references dependent onthe disposition and context?)
If you imagine that various content types are handled byplugins, then the plugins will each require two entry points- one for each of these processing modes. While that iscertainly possible, it might mean that plugins that alreadyexist for non-sip applications (e.g. email or html) may notbe reusable for sip. (I am not familiar with the content-typeplugins for other protocols such as email. But I presume thatemail plugins may be useful in some cases, at least for MESSAGE.)
Do plugins for email have this complexity?
An alternative is to do all the processing in one pass, andthen at the end check if all body parts have been processed.BUT that means all the side effects of processing will havebeen realized before discovering that the message is inerror. That is certainly not ideal.
This example also points out that the requirement that referencesappear before the parts they reference is needed to make
the problem
tractable. Otherwise, there could be circular dependencies betweenthe choices for different multipart/alternatives, which might beimpossible to analyze by an algorithm as simple as a tree-walk.
This requirement is satisfied by lots of simple cases. But itmeans that it is impossible to describe any kind of circularstructure. I'm not familiar enough with the use ofmultipart/related to know if it is a restriction that istolerable there. Perhaps someone can comment on that.
        Thanks,
        Paul

[EMAIL PROTECTED] wrote:
Referring back to last March, in
http://www.ietf.org/mail-archive/web/sip/current/msg22426.html, I was
contemplating whether the rules for testing a message body requiredsearching through combinations of choices regarding which
alternatives
of multipart/alternatives are chosen. As I then conceptualized theproblem, the question was "Is there a choice of
alternatives such that
all the constraints for 'successful processing' can be
met?" It turns
out that if you phrase the question that way, the problem isNP-complete, which means in practice that any algorithm to solve itmust do backtracking.
The new formulation of the problem attempts to avoid this
by dividing
the problem into three phases: (1) assemble a tree which
represents
the dependencies of super-parts' being processable upon that ofsub-parts, (2) walk the tree, tagging parts from the bottom
up as to
whether they are processable, (3) if the overall tree is
processable,
selecting for processing the correct subset of parts as directed bythe tags in the tree. As long as the size of the tree is
commensurate
with the number of body parts, the algorithm is inherently
linear, and
clearly it doesn't involve backtracking.

In regard to step (1), the tree is assembled:

- the root node represents the message body
- a node for a multipart/alternative part has one child for each
  component
- a node for a multipart/mixed part has one child for each
  component
- a note for a multipart/related part is not automatically given
  children for its components
- a header in a part that refers to another component generates a
  child of that part that represents the referred-to component
- similarly, a reference within a part that refers to another
  component generates a child of that part that represents the
  referred-to component (This is how multipart/related nodes get
  children.)
Of course, if a component is referred to multiple times,
the component
is represented by multiple nodes in the tree, one node for
each of its
roles.
However, there are subtleties I might be overlooking, so I
am going to
revisit the embedding of the NP-complete problem "3-satisfiability"
into MIME to see what it reveals about my sketch of an algorithm.

Referring to the previous message:

    The reduction is as follows:

    http://en.wikipedia.org/wiki/NP-complete
    http://en.wikipedia.org/wiki/Reduction_%28complexity%29
    http://en.wikipedia.org/wiki/3-satisfiability#3-satisfiability
Choose an instance of the 3-satisfiability problem,
that is: a set of
    boolean variables A, B, C, etc. and a formula composed of:
            a series of "clauses" joined by AND, consisting of
a series of variables and their negations
joined by OR.
The problem is to determine if there is any assignment
of truth values
to the variables which makes the formula true, that is,
which makes
    true one of the variables or negations in each of the clauses.

    For example, choose 4 variables A, B, C, and D, and the formula
(A OR B OR C) AND (B OR NOT-C OR D) AND (NOT-A OR NOT-B
OR NOT-D)
This can be satisfied by setting A=TRUE, B=FALSE,
C=TRUE, and D=FALSE.
    Construct a multipart/mixed with each component labeled
handling=required. Within that, for each variable,
have a part which
is multipart/alternate with 2 parts. If processing
chooses the first
part of the part corresponding to a variable, that will
model the
variable being TRUE. If processing chooses the second
part of the
part corresponding to a variable, that will model the
variable being
    FALSE.

    For example, our body starts:

            multipart/mixed
                    multipart/alternative;handling=required
first part corresponding to A (not
yet specified)
                            second part corresponding to NOT-A
                    multipart/alternative;handling=required
                            first part corresponding to B
                            second part corresponding to NOT-B
                    multipart/alternative;handling=required
                            first part corresponding to C
                            second part corresponding to NOT-C
                    multipart/alternative;handling=required
                            first part corresponding to D
                            second part corresponding to NOT-D
For each clause of the formula, add an additional body
part, marked
"Content-Disposition: by-reference; handling=required".
 Give the part
a 'cid' URL. The ability of the recipient to
successfully process
    this part will model making the corresponding clause TRUE.

    For example, our body is now:

            multipart/mixed
                    multipart/alternative;handling=required
first part corresponding to A (not
yet specified)
                            second part corresponding to NOT-A
                    multipart/alternative;handling=required
                            first part corresponding to B
                            second part corresponding to NOT-B
                    multipart/alternative;handling=required
                            first part corresponding to C
                            second part corresponding to NOT-C
                    multipart/alternative;handling=required
                            first part corresponding to D
                            second part corresponding to NOT-D
part (not otherwise
specified);by-reference;handling=required;cid=clause1
part (not otherwise
specified);by-reference;handling=required;cid=clause2
part (not otherwisespecified);by-reference;handling=required;cid=clause3
We now want to arrange things so that one of the last 3
parts can be
successfully processed only if we choose for processing
one of the
earlier parts that corresponds to a variable value that
will make the
    corresponding clause true.  This is done by embedding forward
    references in the variable-corresponding parts that refer to the
    clause-corresponding parts.

    For example:

 1          multipart/mixed
 2                  multipart/alternative;handling=required
 3                          first part corresponding to A
                                    references cid:clause1
 4                          second part corresponding to NOT-A
                                    references cid:clause3
 5                  multipart/alternative;handling=required
 6                          first part corresponding to B
                                    references cid:clause1
                                    references cid:clause2
 7                          second part corresponding to NOT-B
                                    references cid:clause3
 8                  multipart/alternative;handling=required
 9                          first part corresponding to C
                                    references cid:clause1
10                          second part corresponding to NOT-C
                                    references cid:clause2
11                  multipart/alternative;handling=required
12                          first part corresponding to D
                                    references cid:clause2
13                          second part corresponding to NOT-D
                                    references cid:clause3
14                  part;by-reference;handling=required;cid=clause1
15                  part;by-reference;handling=required;cid=clause2
16                  part;by-reference;handling=required;cid=clause3
In order to process this body, the recipient must
choose exactly one
of each pair of bodies in the multipart/alternative
body parts.  This
choice is equivalent to choosing a set of values for
the variables.
The references embedded in the chosen parts refer to
the body parts
    that correspond to the clauses which are made true by the
corresponding variable's value. In order to
successfully process the
final body parts, the recipient must have chosen a set
of variable
    values that makes all of the clauses true.

The resulting tree is:

        1 -+- 2 --+- 3 ---- 14
           |      |
           |      +- 4 ---- 16
           |
           +- 5 --+- 6 --+- 14
           |      |      |
           |      |      +- 15
           |      |
           |      +- 7 ---- 16
           |
           +- 8 --+- 9 ---- 14
           |      |
           |      +- 10 --- 15
           |
           +- 11 -+- 12 --- 15
           |      |
           |      +- 13 --- 16
           |
           +- 14
           |
           +- 15
           |
           +- 16
Though the number of nodes is greater than the number of
body parts,
it's limited approximately by the number of appearances of
variables
in the original boolean formula, so it's linear in the
number of body
parts.  So in building the tree, we don't have exponential effort.
In the new approach, we walk the tree, tagging each node as
to whether
it is processable. In the above model, we assume that all
the simple
parts are processable, so all nodes labeled 2 through 16 are markedprocessable. And as we tag the nodes for processability, we choosefor each multipart/alternative node which child will be processed(namely, the last child that is processable). In this tree, thechoices are 2 -> 4, 5 -> 7, 8 -> 10, and 11 -> 13.
Where things get interesting is deciding if node 1 is processable.
All of its children are processable, of course. But the
crux of the
construction is how the "by-reference;handling=required"
constraints
on nodes 14, 15, and 16 are handled. In this algorithm, we checkwhether each part is referenced by a part that will be processed.Fortunately, all references to these nodes must have been
tree-walked
before node 1 is examined, so can determine unambiguously
whether used
references point to those nodes. In this case, we have chosen toprocess 2, 4, 5, 7, 8, 10, 11, and 13, which means that
there are used
references to 16, 16, 15, and 16. This leaves 17
unreferenced, and so
node 1 is determined to be unprocessable.
Note that the new algorithm avoids being NP-complete essentially bymaking choices as soon as they are seen. In this model,
values NOT-A,
NOT-B, NOT-C, and NOT-D are chosen outright. Since the exampleformula is not satisfied, the body that encodes it is not
processable.
This example also points out that the requirement that referencesappear before the parts they reference is needed to make
the problem
tractable. Otherwise, there could be circular dependencies betweenthe choices for different multipart/alternatives, which might beimpossible to analyze by an algorithm as simple as a tree-walk.
Dale
_______________________________________________
Sip mailing list  https://www.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol Use[EMAIL PROTECTED] for questions on current sip Use[EMAIL PROTECTED] for new developments on the application of sip
_______________________________________________
Sip mailing list  https://www.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol Use[EMAIL PROTECTED] for questions on current sipUse [EMAIL PROTECTED] for new developments on the application of sip
_______________________________________________
Sip mailing list  https://www.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol
Use [EMAIL PROTECTED] for questions on current sip
Use [EMAIL PROTECTED] for new developments on the application of sip

_______________________________________________
Sip mailing list  https://www.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol
Use [EMAIL PROTECTED] for questions on current sip
Use [EMAIL PROTECTED] for new developments on the application of sip

Re: [Sip] Is body handling NP-complete?

Reply via email to