Author: ks
Date: Sun Jul 15 23:28:46 2007
New Revision: 5716

Log:
- Added design.txt and prototypes updated.

Added:
    experimental/Document/design/design.txt
    experimental/Document/src/output/
    experimental/Document/src/output.php
Removed:
    experimental/Document/src/writer.php
    experimental/Document/src/writers/
Modified:
    experimental/Document/design/requirements.txt
    experimental/Document/src/conversion.php
    experimental/Document/src/transformer.php
    experimental/Document/src/validator.php

Added: experimental/Document/design/design.txt
==============================================================================
--- experimental/Document/design/design.txt (added)
+++ experimental/Document/design/design.txt [iso-8859-1] Sun Jul 15 23:28:46 2007
@@ -1,0 +1,160 @@
+eZ Component: Document, Design
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Design description
+==================
+
+ezcDocConversion
+----------------
+
+The ezcDocConversion class is a wrapper around the real conversion classes.
+It keeps track of the available formats and the possible conversion paths,
+and chooses a path from the user's input format to the requested output
+format.
+
+Often direct conversion is impossible and intermediate formats are needed.
+The sequence of intermediate formats and corresponding conversion classes is
+called a "conversion chain". Sometimes there is more than one chain; in that
+case the shortest one is chosen automatically. To override this choice, use
+getConversionChains() to get all the chains and useChain() to select one of
+them.
+
+This class also handles the input/output data types, which can be text, DOM or file,
+and converts data (parses/dumps XML and reads/writes files) when necessary.
+Not all data can be converted to another type directly (for example, plain text
+formats can't be converted to DOM); in that case it throws an error.
+
+ezcDocConverterBase
+-------------------
+
+A small abstract class that other converters are based on.
+
+ezcDocParser
+------------
+
+A class based on ezcDocConverterBase for creating text format parsers. Contains
+methods for document parsing using a formal grammar. Exact formal grammars and
+format-specific callback handlers (if needed) are set in derived classes.
+
+ezcDocTransformer
+-----------------
+
+A class based on ezcDocConverterBase for transforming DOM documents using special
+rules and element callback handlers. Contains only methods for document
+transformation. Exact rules and element handlers are set in derived classes.
+
+ezcDocOutput
+------------
+
+Outputs the given document tree as text using a simple internal templating
+system. It also handles text indentation to show the structure of the
+document. The exact templates for element output and the helper formatting
+functions are set in derived classes.
+
+ezcDocOutputTemplate
+--------------------
+
+Implemented in the DocumentTemplateTieIn component. It extends the ezcDocOutput
+class to use the Template component for element output.
+
+ezcDocValidator
+---------------
+
+Validates a document or a separate element against its schema. This class uses
+the RelaxNG schema format as input.
+
+Converter classes
+-----------------
+
+Classes derived from ezcDocConverterBase or from classes that extend
+ezcDocConverterBase are used to convert a document from one format to another.
+
+
+Algorithms
+==========
+
+Transforming XML
+----------------
+
+The component supports 2 ways of transforming DOM documents:
+1) Using XSLT stylesheet. In this case the converter is derived from
+ ezcDocConverterBase class and applies XSLT stylesheets to the document using
+ PHP XSL extension.
+2) Using ezcDocTransformer class. In this case the converter is derived from
+ ezcDocTransformer class.
+ 
+The ezcDocTransformer class provides the interface to process the document.
+Its principle is completely different from XSL: it assumes that the result
+of the transformation is the same document, but with a different schema.
+Its main function walks the document tree and calls callback handlers for
+elements. Derived classes contain the control arrays and element handlers.
+
+An element's handler can pass information to the next processed element's
+handler. This makes it possible to handle complex transformations that involve
+many elements. For instance, the handler of a "text" element "knows" that it
+needs a <p> element as its parent. It creates a new <p> element and passes it
+to the next element's handler by reference. So if the next element is also
+text, it already has a new parent element to attach to.
+
+Parsing text/XML
+----------------
+
+The ezcDocParser class parses the input text and presents it as a
+DOM tree.
+
+This is not an implementation of a real context-free parser.
+It assumes that the input language is XML-like, i.e. that it consists
+of elements that have opening and closing parts and some
+content between them (which may contain other elements).
+
+Sometimes it's hard or impossible to formalize the input in these terms,
+so special algorithms or custom element handlers will be used
+in those cases.
+
+Document output
+---------------
+
+The ezcDocOutput class outputs the given document tree as text using a
+simple internal templating system. It also handles text indentation
+to show the structure of the document.
+
+The exact templates for element output and the helper formatting functions
+are set in derived classes. Templates are simple strings in which a character
+or sequence is replaced with another string using str_replace.
+
+The ezcDocOutputTemplate class is implemented in the DocumentTemplateTieIn
+component. It extends this class to use the Template component for element
+output.
+
+Validating documents
+--------------------
+
+ezcDocValidator is used to validate a document or a separate element
+against its schema.
+
+This class uses the RelaxNG schema format as input, then transforms it
+to an internal format for fast processing. The processed schema is stored
+in a cached .php file for faster access in the future.
+
+The idea for fast validation is to use regular expressions and strings.
+Here is an example:
+ 
+ <element name="elem1">
+   <zeroOrMore>
+     <element name="elem2">
+       ...
+     </element>
+   </zeroOrMore>
+   <element name="elem3">
+     ...
+   </element>
+ </element>
+
+This RelaxNG schema for the element's content can be represented by the regexp:
+
+'#(elem2)*elem3#'
+
+The children of the document element being validated can also be represented
+by a string, for instance 'elem2elem2elem3', which is matched against this regexp.
+
+A similar process is used for attributes.
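
Reviewer sketch (not part of the commit): the regexp idea described at the end of design.txt could look roughly like this in PHP. The helper name childrenMatch() is hypothetical, and the ^ and $ anchors are an assumption added so that the whole string of child names has to match; the design text shows the pattern without anchors.

```php
<?php
// Hypothetical helper, not part of the committed prototypes: joins the
// child element names into one string and matches it against a content regexp.
function childrenMatch( DOMElement $element, $contentRegexp )
{
    $names = '';
    foreach ( $element->childNodes as $child )
    {
        // Only element children contribute to the string; text nodes are skipped.
        if ( $child instanceof DOMElement )
        {
            $names .= $child->nodeName;
        }
    }
    return preg_match( $contentRegexp, $names ) === 1;
}

$doc = new DOMDocument();
$doc->loadXML( '<elem1><elem2/><elem2/><elem3/></elem1>' );

// Assumption: anchors are added so partial matches are rejected.
$valid = childrenMatch( $doc->documentElement, '#^(elem2)*elem3$#' );

$bad = new DOMDocument();
$bad->loadXML( '<elem1><elem3/><elem2/></elem1>' );
$invalid = childrenMatch( $bad->documentElement, '#^(elem2)*elem3$#' );
```

Plain concatenation can be ambiguous when one element name is a prefix of another, so a real implementation would probably need a separator between names; that detail is not covered by the design text.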

Modified: experimental/Document/design/requirements.txt
==============================================================================
--- experimental/Document/design/requirements.txt [iso-8859-1] (original)
+++ experimental/Document/design/requirements.txt [iso-8859-1] Sun Jul 15 23:28:46 2007
@@ -106,7 +106,7 @@
 - eZ publish 4 XML text
 - eZ publish 4 simplified XML text
 
-The attached picture (document-formats.png) shows which formats will be
+The attached diagram (document-formats.svg) shows which formats will be
 supported in the first release of the component and possible directions of
 transforming from one format to anthoer.
 

Modified: experimental/Document/src/conversion.php
==============================================================================
--- experimental/Document/src/conversion.php [iso-8859-1] (original)
+++ experimental/Document/src/conversion.php [iso-8859-1] Sun Jul 15 23:28:46 2007
@@ -39,16 +39,20 @@
     /**
     * List of available formats
     */
-    private $availableFormats = array( 'oehtmlinput', 'oehtml', 'ezxmltext', 'docbook', 'xhtml', 'xhtmlbody' );
+    private $availableFormats = array( 'oe', 'oehtml', 'ezp', 'docbook', 'xhtml', 'xhtmlbody', 'simple', 'simplexml' );
 
     /**
     * List of available converter classes
     */
-    private $availableConverters = array( 'oehtmlinput' => array( 'oehtml' => 'ezcParserOe' ),
-                                          'oehtml' => array( 'ezxmltext' => 'ezcTransformOeEzp' ),
-                                          'ezxmltext' => array( 'docbook' => 'ezcTransformEzpDocbook',
-                                                                'xhtml' => 'ezcTransformEzpXhtml',
-                                                                'xhtmlbody' => 'ezcWriteEzpXhtml' ) );
+    private $availableConverters = array( 'oe' => array( 'oehtml' => 'ezcParseOE' ),
+                                          'oehtml' => array( 'ezp' => 'ezcTransformOEhtmlEzp' ),
+                                          'ezp' => array( 'docbook' => 'ezcTransformEzpDocbook',
+                                                          'xhtml' => 'ezcTransformEzpXhtml',
+                                                          'xhtmlbody' => 'ezcOutputEzpXhtml',
+                                                          'simplexml' => 'ezcTransformEzpSimplexml' ),
+                                          'docbook' => array( 'ezp' => 'ezcTransformDocbookEzp',
+                                                              'xhtml' => 'ezcTransformDocbookXhtml' ),
+                                          'simple' => array( 'simplexml' => 'ezcParseSimple' ) );
     /**
     * Sets source format, data type is TEXT, DOM or FILE
     */
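
Reviewer sketch (not part of the commit): design.txt says that the shortest conversion chain is picked automatically. One way this could work on the converter map above is a breadth-first search. The function name findShortestChain() and the choice of breadth-first search are assumptions for illustration; the commit does not show how chains are actually built.

```php
<?php
// Converter map copied from the new $availableConverters value in this hunk.
$converters = array(
    'oe' => array( 'oehtml' => 'ezcParseOE' ),
    'oehtml' => array( 'ezp' => 'ezcTransformOEhtmlEzp' ),
    'ezp' => array( 'docbook' => 'ezcTransformEzpDocbook',
                    'xhtml' => 'ezcTransformEzpXhtml',
                    'xhtmlbody' => 'ezcOutputEzpXhtml',
                    'simplexml' => 'ezcTransformEzpSimplexml' ),
    'docbook' => array( 'ezp' => 'ezcTransformDocbookEzp',
                        'xhtml' => 'ezcTransformDocbookXhtml' ),
    'simple' => array( 'simplexml' => 'ezcParseSimple' ),
);

// Hypothetical function: breadth-first search that returns the converter
// class names of a shortest chain, or null when no chain exists.
function findShortestChain( array $converters, $from, $to )
{
    $queue = array( array( $from, array() ) );
    $visited = array( $from => true );
    while ( count( $queue ) > 0 )
    {
        list( $format, $chain ) = array_shift( $queue );
        if ( $format === $to )
        {
            return $chain;
        }
        $targets = isset( $converters[$format] ) ? $converters[$format] : array();
        foreach ( $targets as $next => $class )
        {
            // Each format is queued once, so the first hit is a shortest chain.
            if ( !isset( $visited[$next] ) )
            {
                $visited[$next] = true;
                $queue[] = array( $next, array_merge( $chain, array( $class ) ) );
            }
        }
    }
    return null;
}

$chain = findShortestChain( $converters, 'oe', 'xhtml' );
$none = findShortestChain( $converters, 'xhtml', 'oe' );
```

With this map, the chain from 'oe' to 'xhtml' goes through 'oehtml' and 'ezp' and skips the longer route via 'docbook'.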

Added: experimental/Document/src/output.php
==============================================================================
--- experimental/Document/src/output.php (added)
+++ experimental/Document/src/output.php [iso-8859-1] Sun Jul 15 23:28:46 2007
@@ -1,0 +1,71 @@
+<?php
+/**
+ * File containing the ezcDocOutput class
+ *
+ * @package Document
+ * @version //autogen//
+ * @copyright Copyright (C) 2005-2007 eZ systems as. All rights reserved.
+ * @license http://ez.no/licenses/new_bsd New BSD License
+ */
+
+/**
+ * ezcDocOutput class performs an output of the given document tree in the text
+ * format using simple internal templating system. Also it cares about text
+ * indenting to show the structure of the document.
+ * 
+ * Exact templates for element output and helper formatting functions are set
+ * in derived classes.
+ *
+ * ezcDocOutputTemplate class is implemented in the DocumentTemplateTieIn
+ * component. It extends this class to use Template component for elements
+ * output.
+ *
+ */
+
+abstract class ezcDocOutput extends ezcDocConverterBase
+{
+
+    /**
+    * Rules for elements output, specified in derived classes
+    *
+    * $elementsOutput = array( 'element1' => array( 'startTag' => '<element1$>',
+    *                                               'endTag' => "</element1>\n",
+    *                                               'attribute1' => ' attr1="$"',
+    *                                               'attribute2' => ' attr2="$"' ... ),
+    *                           ... );
+    *
+    * '$' sign is replaced with attributes string in tag's start or end template,
+    * and with attributes' values in attributes templates.
+    *
+    * (There should also be a possibility to set default view)
+    *
+    */
+    protected $elementsOutput;
+
+    /**
+    *  String to use for indentations 
+    */
+    protected $indentString;
+
+    /** Conversion function
+     *  @return output text string
+     */
+    public function convert( $source )
+    {
+        //...
+        return $text;
+    }
+
+    /** Main tree-walk recursive function.
+     *  @return string part
+     */
+    protected function elementOutput( $element )
+    {
+
+    }
+
+    $sourceDataType = DOM;
+    $destDataType = TEXT;
+}
+
+?>
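
Reviewer sketch (not part of the commit): the '$' substitution documented for $elementsOutput could be done with str_replace roughly as follows. The helper renderStartTag(), the element name 'para' and the attribute 'class' are hypothetical, and escaping values with htmlspecialchars() is an assumption that the prototype does not mention.

```php
<?php
// Template array in the shape documented for $elementsOutput above.
// The 'para' element and its 'class' attribute are made-up examples.
$elementsOutput = array(
    'para' => array( 'startTag' => '<p$>',
                     'endTag' => "</p>\n",
                     'class' => ' class="$"' ),
);

// Hypothetical helper: renders the start tag of $name with the given attributes.
function renderStartTag( array $elementsOutput, $name, array $attributes )
{
    $rules = $elementsOutput[$name];
    $attrString = '';
    foreach ( $attributes as $attrName => $value )
    {
        // Attributes without a template are silently dropped in this sketch.
        if ( isset( $rules[$attrName] ) )
        {
            // '$' in an attribute template is replaced with the escaped value.
            $attrString .= str_replace( '$', htmlspecialchars( $value ), $rules[$attrName] );
        }
    }
    // '$' in the start tag template is replaced with the attribute string.
    return str_replace( '$', $attrString, $rules['startTag'] );
}

$withAttr = renderStartTag( $elementsOutput, 'para', array( 'class' => 'intro' ) );
$withoutAttr = renderStartTag( $elementsOutput, 'para', array() );
```

The templates are written as single-quoted strings so that PHP does not try to interpolate the '$' placeholder.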

Modified: experimental/Document/src/transformer.php
==============================================================================
--- experimental/Document/src/transformer.php [iso-8859-1] (original)
+++ experimental/Document/src/transformer.php [iso-8859-1] Sun Jul 15 23:28:46 2007
@@ -22,7 +22,6 @@
 
 abstract class ezcDocTransformer extends ezcDocConverterBase
 {
-
     /**
     * Attribute conversion rules:
     * 
@@ -46,7 +45,7 @@
 
     /** Main walk-tree recursive function.
      */
-    protected function transform( $element, &$handlersData )
+    protected function transform( &$element, &$handlersData )
     {
         // call of the 1st element's handler
         $this->initHandler( $element, &$handlersData );
@@ -55,9 +54,10 @@
         $child = $element->firstChild;
         do
         {
+            $next =& $element->nextSibiling;
             $this->transform( $child, &$handlersData );
-            $child = $element->nextSibiling;
-
+            $child =& $next;
+            
         }while( $child );
 
         // call of the 2nd element's handler
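
Reviewer sketch (not part of the commit): the hunk above saves the next sibling before recursing, presumably so that a handler can move or remove the current child without breaking the walk. A stand-alone version of that idea is sketched below. It assumes that the sibling should be read from the current child through DOM's nextSibling property; walkTree() and the $collect handler are hypothetical stand-ins for transform() and the element handlers.

```php
<?php
// Hypothetical stand-alone walk: $handler stands in for the element handlers.
function walkTree( DOMNode $element, $handler, array &$handlersData )
{
    $handler( $element, $handlersData );

    // A while loop also copes with elements that have no children.
    $child = $element->firstChild;
    while ( $child !== null )
    {
        // Remember the sibling first: the handler may move or remove $child.
        $next = $child->nextSibling;
        walkTree( $child, $handler, $handlersData );
        $child = $next;
    }
}

$doc = new DOMDocument();
$doc->loadXML( '<section><title/><para/><para/></section>' );

// The handler receives the shared data by reference and records node names.
$visited = array();
$collect = function ( DOMNode $node, array &$data )
{
    $data[] = $node->nodeName;
};
walkTree( $doc->documentElement, $collect, $visited );
```

Objects are passed by handle in PHP 5, so the node itself does not need a reference parameter in this sketch; only the shared handler data does.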

Modified: experimental/Document/src/validator.php
==============================================================================
--- experimental/Document/src/validator.php [iso-8859-1] (original)
+++ experimental/Document/src/validator.php [iso-8859-1] Sun Jul 15 23:28:46 2007
@@ -13,9 +13,8 @@
  * against it's schema.
  * 
  * This class uses RelaxNG schema format as the input, then transforms it
- * to the inner format for fast processing.
- * 
- * (cache to a file?? use cache component?)
+ * to the inner format for fast processing. The processed schema is stored
+ * in cached .php file for faster access in the future.
  * 
  * The idea for fast validation is using regular expressions and strings.
  * Here is an example:


