[incubator-echarts-doc] branch next updated: add doc for data-transform

sushuang Mon, 24 Aug 2020 01:54:43 -0700

This is an automated email from the ASF dual-hosted git repository.

sushuang pushed a commit to branch next
in repository https://gitbox.apache.org/repos/asf/incubator-echarts-doc.git



The following commit(s) were added to refs/heads/next by this push:
     new d8744a6  add doc for data-transform
d8744a6 is described below

commit d8744a602ba39dbcf51a48e7dcad30b8930e0e24
Author: 100pah <sushuang0...@gmail.com>
AuthorDate: Mon Aug 24 16:53:28 2020 +0800

    add doc for data-transform
---
 build.js                                       |   2 +-
 en/option/component/data-transform-external.md |  20 +
 en/option/component/data-transform-filter.md   |  14 +
 en/option/component/data-transform-sort.md     |  14 +
 en/option/component/dataset.md                 |  10 +-
 en/option/partial/data-transform.md            |  25 ++
 en/tutorial/data-transform.md                  | 556 +++++++++++++++++++++++++
 en/tutorial/dataset.md                         |  12 +
 en/tutorial/tutorial.md                        |   1 +
 tool/extractDesc.js                            |   3 +-
 zh/option/component/data-transform-external.md |  21 +
 zh/option/component/data-transform-filter.md   |  14 +
 zh/option/component/data-transform-sort.md     |  14 +
 zh/option/component/dataset.md                 |  12 +-
 zh/option/partial/data-transform.md            |  27 ++
 zh/tutorial/data-transform.md                  | 545 ++++++++++++++++++++++++
 zh/tutorial/dataset.md                         |   4 +
 zh/tutorial/tutorial.md                        |   1 +
 18 files changed, 1291 insertions(+), 4 deletions(-)

diff --git a/build.js b/build.js
index b8cc3df..9e3bc24 100644
--- a/build.js
+++ b/build.js
@@ -131,7 +131,7 @@ async function run() {
 
     for (let language of languages) {
         await md2jsonAsync({
-            sectionsAnyOf: ['visualMap', 'dataZoom', 'series', 'graphic.elements'],
+            sectionsAnyOf: ['visualMap', 'dataZoom', 'series', 'graphic.elements', 'dataset.transform'],
             entry: 'option',
             language
         });
diff --git a/en/option/component/data-transform-external.md b/en/option/component/data-transform-external.md
new file mode 100644
index 0000000..6b7f1d9
--- /dev/null
+++ b/en/option/component/data-transform-external.md
@@ -0,0 +1,20 @@
+{{ target: component-data-transform-external }}
+
+## transform.xxx:xxx(Object)
+
+Besides the built-in transforms (such as 'filter' and 'sort'), external transforms can be used to provide more powerful functionality.
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+### type(string) = 'xxx:xxx'
+
+Built-in transforms have no namespace (like `type: 'filter'`, `type: 'sort'`).
+
+External transforms have a namespace (like `type: 'ecStat:regression'`).
+
+### config
+
+The parameters required by this data transform. Each type of transform defines its own `config` format.
+
+
+{{ use: partial-data-transform-print }}
diff --git a/en/option/component/data-transform-filter.md b/en/option/component/data-transform-filter.md
new file mode 100644
index 0000000..698da5d
--- /dev/null
+++ b/en/option/component/data-transform-filter.md
@@ -0,0 +1,14 @@
+{{ target: component-data-transform-filter }}
+
+## transform.filter(Object)
+
+### type(string) = 'filter'
+
+### config
+
+The condition of the "filter" transform.
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+
+{{ use: partial-data-transform-print }}
diff --git a/en/option/component/data-transform-sort.md b/en/option/component/data-transform-sort.md
new file mode 100644
index 0000000..a8d3007
--- /dev/null
+++ b/en/option/component/data-transform-sort.md
@@ -0,0 +1,14 @@
+{{ target: component-data-transform-sort }}
+
+## transform.sort(Object)
+
+### type(string) = 'sort'
+
+### config
+
+The condition of the "sort" transform.
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+
+{{ use: partial-data-transform-print }}
diff --git a/en/option/component/dataset.md b/en/option/component/dataset.md
index 1ee7c35..b02e1f2 100644
--- a/en/option/component/dataset.md
+++ b/en/option/component/dataset.md
@@ -7,7 +7,6 @@
 
 More details about `dataset` can be checked in the 
[tutorial](tutorial.html#Dataset).
 
----
 
 {{use: partial-component-id(prefix="#")}}
 
@@ -64,3 +63,12 @@ Whether the first row/column of `dataset.source` represents 
[dimension names](da
 + `false`: data start from the first row/column.
 
 Note: "the first row/column" means that if 
[series.seriesLayoutBy](~series.seriesLayoutBy) is set as `'column'`, pick the 
first row, otherwise, if it is set as `'row'`, pick the first column.
+
+
+## transform(Object)
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+{{ import: component-data-transform-filter }}
+{{ import: component-data-transform-sort }}
+{{ import: component-data-transform-external }}
diff --git a/en/option/partial/data-transform.md b/en/option/partial/data-transform.md
new file mode 100644
index 0000000..7344d22
--- /dev/null
+++ b/en/option/partial/data-transform.md
@@ -0,0 +1,25 @@
+{{ target: partial-data-transform-tutorial-ref }}
+See the tutorial on [data transform](tutorial.html#Data%20Transform).
+
+
+{{ target: partial-data-transform-print }}
+### print(boolean) = false
+
+When using data transform, the final chart may not display correctly, and it may be unclear which part of the config is wrong. The property `transform.print` can help in such cases. (`transform.print` is only available in the dev environment.)
+
+```js
+option = {
+    dataset: [{
+        source: [ ... ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { ... },
+            // The result of this transform will be printed
+            // in dev tool via `console.log`.
+            print: true
+        }
+    }],
+    ...
+}
+```
diff --git a/en/tutorial/data-transform.md b/en/tutorial/data-transform.md
new file mode 100644
index 0000000..9141544
--- /dev/null
+++ b/en/tutorial/data-transform.md
@@ -0,0 +1,556 @@
+{{target: data-transform}}
+
+# Data Transform
+
+`Data transform` has been supported since Apache ECharts (incubating)<sup>TM</sup> 5. In echarts, the term `data transform` means generating new data from user-provided source data and transform functions. This feature enables users to process data in a declarative way, and provides some common "transform functions" so that such tasks work "out-of-the-box". (For consistency, this document uses "transform" rather than "transformation" as the noun form.)
+
+The abstract formula of data transform is: `outData = f(inputData)`, where the transform function `f` can be like `filter`, `sort`, `regression`, `boxplot`, `cluster`, `aggregate`(todo) ...
+With the help of these transform methods, users can implement features like:
++ Partition data into multiple series.
++ Compute statistics and visualize the result.
++ Apply visualization algorithms to data and display the result.
++ Sort data.
++ Remove or select certain empty or special data items.
++ ...
+
+
+## Getting started with data transform
+
+In echarts, data transform is implemented based on the concept of [dataset](~dataset). A [dataset.transform](option.html#dataset.transform) can be configured in a dataset instance to indicate that this dataset is to be generated from this `transform`. For example:
+
+```js
+var option = {
+    dataset: [{
+        // This dataset is on `datasetIndex: 0`.
+        source: [
+            ['Product', 'Sales', 'Price', 'Year'],
+            ['Cake', 123, 32, 2011],
+            ['Cereal', 231, 14, 2011],
+            ['Tofu', 235, 5, 2011],
+            ['Dumpling', 341, 25, 2011],
+            ['Biscuit', 122, 29, 2011],
+            ['Cake', 143, 30, 2012],
+            ['Cereal', 201, 19, 2012],
+            ['Tofu', 255, 7, 2012],
+            ['Dumpling', 241, 27, 2012],
+            ['Biscuit', 102, 34, 2012],
+            ['Cake', 153, 28, 2013],
+            ['Cereal', 181, 21, 2013],
+            ['Tofu', 395, 4, 2013],
+            ['Dumpling', 281, 31, 2013],
+            ['Biscuit', 92, 39, 2013],
+            ['Cake', 223, 29, 2014],
+            ['Cereal', 211, 17, 2014],
+            ['Tofu', 345, 3, 2014],
+            ['Dumpling', 211, 35, 2014],
+            ['Biscuit', 72, 24, 2014],
+        ],
+        // id: 'a'
+    }, {
+        // This dataset is on `datasetIndex: 1`.
+        // A `transform` is configured to indicate that the
+        // final data of this dataset is transformed via this
+        // transform function.
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Year', value: 2011 }
+        },
+        // The optional properties `fromDatasetIndex` or `fromDatasetId`
+        // indicate where the input data of the transform comes from.
+        // For example, `fromDatasetIndex: 0` specifies that the input data
+        // is from the dataset on `datasetIndex: 0`, and `fromDatasetId: 'a'`
+        // specifies that the input data is from the dataset having `id: 'a'`.
+        // [DEFAULT_RULE]
+        // If both `fromDatasetIndex` and `fromDatasetId` are omitted,
+        // `fromDatasetIndex: 0` is used by default.
+    }, {
+        // This dataset is on `datasetIndex: 2`.
+        // Similarly, if neither `fromDatasetIndex` nor `fromDatasetId` is
+        // specified, `fromDatasetIndex: 0` is used by default
+        transform: {
+            // The "filter" transform keeps only the data items that match
+            // the condition given in the property `config`.
+            type: 'filter',
+            // Transforms have a property `config`. In this "filter" transform,
+            // `config` specifies the condition that each result data item
+            // should satisfy. In this case, the transform gets all of the
+            // data items whose value on dimension "Year" equals 2012.
+            config: { dimension: 'Year', value: 2012 }
+        }
+    }, {
+        // This dataset is on `datasetIndex: 3`
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Year', value: 2013 }
+        }
+    }],
+    series: [{
+        type: 'pie', radius: 50, center: ['25%', '50%'],
+        // In this case, each "pie" series references a dataset that holds
+        // the result of its "filter" transform.
+        datasetIndex: 1
+    }, {
+        type: 'pie', radius: 50, center: ['50%', '50%'],
+        datasetIndex: 2
+    }, {
+        type: 'pie', radius: 50, center: ['75%', '50%'],
+        datasetIndex: 3
+    }]
+};
+```
+
+This example produces three pies, representing the data from 2011, 2012, and 2013.
+~[800x300](${galleryViewPath}data-transform-multiple-pie&reset=1&edit=1)
+
+
+Let's summarize the key points of using data transform:
++ Generate new data from existing declared data by declaring `transform` and `fromDatasetIndex`/`fromDatasetId` in an otherwise blank dataset.
++ Series reference these datasets to show the result.
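To make the flow concrete, the sketch below mimics in plain JavaScript what the three `filter` datasets above compute. It only illustrates the idea and is not how echarts implements it; `filterByYear` is a hypothetical helper and the source table is shortened.

```js
// Illustration only: `filterByYear` is a hypothetical helper, not an echarts API.
const source = [
    ['Product', 'Sales', 'Price', 'Year'],
    ['Cake', 123, 32, 2011],
    ['Tofu', 235, 5, 2011],
    ['Cake', 143, 30, 2012],
    ['Tofu', 255, 7, 2012],
    ['Cake', 153, 28, 2013]
];

// Keep the header row and every data row whose "Year" column equals `year`.
function filterByYear(table, year) {
    const [header, ...rows] = table;
    const yearIndex = header.indexOf('Year');
    return [header, ...rows.filter(row => row[yearIndex] === year)];
}

// One derived table per pie series, like the datasets on index 1, 2 and 3.
const derived = [2011, 2012, 2013].map(year => filterByYear(source, year));
console.log(derived.map(table => table.length - 1)); // data rows per year
```

The original table is never modified; each derived table is new data, which matches the `outData = f(inputData)` formula.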
+
+
+
+## Advanced usage
+
+### Piped transform
+
+There is syntactic sugar for piping transforms:
+```js
+option = {
+    dataset: [{
+        source: [ ... ] // The original data
+    }, {
+        // Declare transforms in an array to pipe multiple transforms,
+        // which makes them execute one by one and take the output of
+        // the previous transform as the input of the next transform.
+        transform: [{
+            type: 'filter',
+            config: { dimension: 'Product', value: 'Tofu' }
+        }, {
+            type: 'sort',
+            config: { dimension: 'Year', order: 'desc' }
+        }]
+    }],
+    series: {
+        type: 'pie',
+        // Display the result of the piped transform.
+        datasetIndex: 1
+    }
+}
+```
+
+> Note: theoretically, any type of transform can have multiple inputs and multiple outputs. But when a transform is piped, it can only take one input (unless it is the first transform of the pipe) and produce one output (unless it is the last transform of the pipe).
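Conceptually, a pipe is function composition. The sketch below is a plain JavaScript illustration with hypothetical stand-ins (`keepTofu`, `sortByYearDesc`, `pipe`) for the "filter" and "sort" transforms; it is not the echarts implementation.

```js
// Illustration only: a pipe is function composition over one table of rows.
const rows = [
    { Product: 'Tofu', Sales: 235, Year: 2011 },
    { Product: 'Cake', Sales: 143, Year: 2012 },
    { Product: 'Tofu', Sales: 395, Year: 2013 },
    { Product: 'Tofu', Sales: 255, Year: 2012 }
];

// Hypothetical stand-ins for the "filter" and "sort" transforms.
const keepTofu = data => data.filter(item => item.Product === 'Tofu');
const sortByYearDesc = data => [...data].sort((a, b) => b.Year - a.Year);

// Run the transforms in order; each output is the input of the next one.
function pipe(transforms, input) {
    return transforms.reduce((data, transform) => transform(data), input);
}

const piped = pipe([keepTofu, sortByYearDesc], rows);
console.log(piped.map(item => item.Year)); // [2013, 2012, 2011]
```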
+
+
+
+### Output multiple data
+
+In most cases, a transform function only needs to produce one result data. But there are scenarios where a transform function needs to produce multiple result data, each of which might be used by a different series.
+
+For example, the built-in boxplot transform produces the outlier data in addition to the boxplot data, and the outlier data can be used in a scatter series. See the [example](${galleryEditorPath}boxplot-light-velocity&edit=1&reset=1).
+
+
+```js
+option = {
+    dataset: [{
+        // Original source data.
+        source: [...]
+    }, {
+        transform: {
+            type: 'boxplot'
+        }
+        // This "boxplot" transform generates two result data:
+        // result[0]: The boxplot data
+        // result[1]: The outlier data
+        // By default, when a series or another dataset references this
+        // dataset, only result[0] is accessible.
+        // To access result[1], we have to use another dataset
+        // as follows:
+    }, {
+        // This extra dataset references the dataset above, and retrieves
+        // the result[1] as its own data. Thus series or other dataset can
+        // reference this dataset to get the data from result[1].
+        fromDatasetIndex: 1,
+        fromTransformResult: 1
+    }],
+    xAxis: {
+        type: 'category'
+    },
+    yAxis: {
+    },
+    series: [{
+        name: 'boxplot',
+        type: 'boxplot',
+        // Reference the data from result[0].
+        datasetIndex: 1
+    }, {
+        name: 'outlier',
+        type: 'scatter',
+        // Reference the data from result[1].
+        datasetIndex: 2
+    }]
+};
+```
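The relationship between a multi-result transform and `fromTransformResult` can be pictured as below. This is a plain JavaScript sketch; `splitByRange` and `pickResult` are hypothetical helpers that only loosely resemble the boxplot case, and the numbers are made up for illustration.

```js
// Illustration only: a hypothetical transform that returns two results,
// loosely modeled on "boxplot" producing boxplot data plus outliers.
function splitByRange(values, low, high) {
    const inside = values.filter(v => v >= low && v <= high);
    const outside = values.filter(v => v < low || v > high);
    return [inside, outside]; // result[0], result[1]
}

// Pick one result; index 0 is used when no index is given.
function pickResult(results, fromTransformResult = 0) {
    return results[fromTransformResult];
}

const results = splitByRange([850, 740, 900, 1070, 930, 620], 700, 1000);
console.log(pickResult(results));    // what a reference to the dataset sees
console.log(pickResult(results, 1)); // what `fromTransformResult: 1` exposes
```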
+
+
+### Debug in the development environment
+
+When using data transform, the final chart may not display correctly, and it may be unclear which part of the config is wrong. The property `transform.print` can help in such cases. (`transform.print` is only available in the dev environment.)
+
+```js
+option = {
+    dataset: [{
+        source: [ ... ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { ... },
+            // The result of this transform will be printed
+            // in dev tool via `console.log`.
+            print: true
+        }
+    }],
+    ...
+}
+```
+
+
+## The transform "filter"
+
+The "filter" transform is a built-in transform that filters data according to specified conditions. A basic option looks like:
+
+```js
+option = {
+    dataset: [{
+        source: [
+            ['Product', 'Sales', 'Price', 'Year'],
+            ['Cake', 123, 32, 2011],
+            ['Latte', 231, 14, 2011],
+            ['Tofu', 235, 5, 2011],
+            ['Milk Tee', 341, 25, 2011],
+            ['Porridge', 122, 29, 2011],
+            ['Cake', 143, 30, 2012],
+            ['Latte', 201, 19, 2012],
+            ['Tofu', 255, 7, 2012],
+            ['Milk Tee', 241, 27, 2012],
+            ['Porridge', 102, 34, 2012],
+            ['Cake', 153, 28, 2013],
+            ['Latte', 181, 21, 2013],
+            ['Tofu', 395, 4, 2013],
+            ['Milk Tee', 281, 31, 2013],
+            ['Porridge', 92, 39, 2013],
+            ['Cake', 223, 29, 2014],
+            ['Latte', 211, 17, 2014],
+            ['Tofu', 345, 3, 2014],
+            ['Milk Tee', 211, 35, 2014],
+            ['Porridge', 72, 24, 2014]
+        ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Year', '=': 2011 }
+            // The config is the "condition" of this filter.
+            // This transform traverses the source data and
+            // retrieves all the items whose "Year" is `2011`.
+        }
+    }],
+    series: {
+        type: 'pie',
+        datasetIndex: 1
+    }
+};
+```
+
+**About dimension:**
+
+The `config.dimension` can be:
++ A dimension name declared in the dataset, like `config: { dimension: 'Year', '=': 2011 }`. Declaring dimension names is not mandatory.
++ A dimension index (starting from 0), like `config: { dimension: 3, '=': 2011 }`.
+
+**About relational operator:**
+
+The relational operator can be:
+`>`(`gt`), `>=`(`gte`), `<`(`lt`), `<=`(`lte`), `=`(`eq`), `!=`(`ne`, `<>`), `reg`. (The names in parentheses are aliases.) They follow the common semantics.
+Besides common number comparison, there are some extra features:
++ Multiple operators can appear in one {} item, like `{ dimension: 'Price', '>=': 20, '<': 30 }`, which means logical "and" (Price >= 20 and Price < 30).
++ The data value can be a "numeric string", that is, a string that can be converted to a number, like ' 123 '. White spaces and line breaks are trimmed automatically in the conversion.
++ To compare a JS `Date` instance or a date string (like '2012-05-12'), `parser: 'time'` must be specified manually, like `config: { dimension: 3, lt: '2012-05-12', parser: 'time' }`.
++ Pure string comparison is supported, but only with `=` and `!=`. `>`, `>=`, `<`, `<=` do not support pure string comparison (the "right value" of these four operators cannot be a "string").
++ The operator `reg` performs a regular expression test. For example, `{ dimension: 'Name', reg: /\s+Müller\s*$/ }` selects all data items whose "Name" dimension contains the family name Müller.
+
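The rules above can be sketched as a small evaluator for one relational condition. This is a simplified illustration in plain JavaScript, not the echarts implementation: `matches`, `toComparable`, `OPS` and `ALIASES` are hypothetical names, and `parser` is ignored.

```js
// Illustration only: a simplified evaluator for one relational condition.
const ALIASES = { '>': 'gt', '>=': 'gte', '<': 'lt', '<=': 'lte', '=': 'eq', '!=': 'ne', '<>': 'ne' };

// Ordering operators only apply to numbers, mirroring the rule that
// `>`, `>=`, `<`, `<=` do not support pure string comparison.
const bothNumbers = (a, b) => typeof a === 'number' && typeof b === 'number';
const OPS = {
    gt: (a, b) => bothNumbers(a, b) && a > b,
    gte: (a, b) => bothNumbers(a, b) && a >= b,
    lt: (a, b) => bothNumbers(a, b) && a < b,
    lte: (a, b) => bothNumbers(a, b) && a <= b,
    eq: (a, b) => a === b,
    ne: (a, b) => a !== b,
    reg: (a, b) => new RegExp(b).test(String(a))
};

// Convert numeric strings (like ' 123 ') to numbers; leave other values as-is.
function toComparable(value) {
    if (typeof value === 'string' && value.trim() !== '' && !isNaN(Number(value))) {
        return Number(value);
    }
    return value;
}

// All operators present in one condition object must hold (logical "and").
function matches(item, condition) {
    const value = toComparable(item[condition.dimension]);
    return Object.keys(condition).every(key => {
        const op = OPS[ALIASES[key] || key];
        // Keys that are not operators (`dimension`, `parser`) are skipped.
        return op ? op(value, toComparable(condition[key])) : true;
    });
}

console.log(matches({ Price: ' 25 ' }, { dimension: 'Price', '>=': 20, '<': 30 }));
console.log(matches({ Name: 'Anna Müller' }, { dimension: 'Name', reg: /\s+Müller\s*$/ }));
```

Because `dimension` is used as a plain property key, the same function also works with array rows and a dimension index, such as `matches(['Cake', 123, 32, 2011], { dimension: 3, '=': 2011 })`.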
+**About logical relationship:**
+
+Sometimes we also need to express logical relationships (`and` / `or` / `not`):
+```js
+option = {
+    dataset: [{
+        source: [...]
+    }, {
+        transform: {
+            type: 'filter',
+            config: {
+                // Use operator "and".
+                // Similarly, we can also use "or", "not" in the same place.
+                // But "not" should be followed by a `{...}` rather than `[...]`.
+                and: [
+                    { dimension: 'Year', '=': 2011 },
+                    { dimension: 'Price', '>=': 20, '<': 30 }
+                ]
+            }
+            // The condition is "Year" is 2011 and "Price" is greater
+            // or equal to 20 but less than 30.
+        }
+    }],
+    series: {
+        type: 'pie',
+        datasetIndex: 1
+    }
+};
+```
+`and`/`or`/`not` can be nested like:
+```js
+transform: {
+    type: 'filter',
+    config: {
+        or: [{
+            and: [{
+                dimension: 'Price', '>=': 10, '<': 20
+            }, {
+                dimension: 'Sales', '<': 100
+            }, {
+                not: { dimension: 'Product', '=': 'Tofu' }
+            }]
+        }, {
+            and: [{
+                dimension: 'Price', '>=': 10, '<': 20
+            }, {
+                dimension: 'Sales', '<': 100
+            }, {
+                not: { dimension: 'Product', '=': 'Cake' }
+            }]
+        }]
+    }
+}
+```
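Nested conditions like the one above map naturally onto a recursive function. The following plain JavaScript sketch is an assumption-laden illustration, not the echarts implementation: `evaluate` is a hypothetical name, and leaf conditions are reduced to the `=`, `>=` and `<` operators.

```js
// Illustration only: recursive evaluation of `and` / `or` / `not`.
function evaluate(item, condition) {
    if (typeof condition === 'boolean') {
        return condition;
    }
    if (condition.and) {
        return condition.and.every(child => evaluate(item, child));
    }
    if (condition.or) {
        return condition.or.some(child => evaluate(item, child));
    }
    if (condition.not !== undefined) {
        // "not" takes a single condition object, not an array.
        return !evaluate(item, condition.not);
    }
    // Simplified leaf: every operator present must hold.
    const value = item[condition.dimension];
    return ('=' in condition ? value === condition['='] : true)
        && ('>=' in condition ? value >= condition['>='] : true)
        && ('<' in condition ? value < condition['<'] : true);
}

const condition = {
    or: [{
        and: [
            { dimension: 'Price', '>=': 10, '<': 20 },
            { not: { dimension: 'Product', '=': 'Tofu' } }
        ]
    }, {
        dimension: 'Sales', '<': 100
    }]
};

console.log(evaluate({ Product: 'Cake', Price: 15, Sales: 150 }, condition));
console.log(evaluate({ Product: 'Tofu', Price: 15, Sales: 150 }, condition));
```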
+
+**About parser:**
+
+A "parser" can be specified for value comparison. At present the supported parsers are:
++ `parser: 'time'`: Parse the value to date time before comparing. The parsing rule is the same as `echarts.time.parse`: a JS `Date` instance, a timestamp number (in milliseconds) and a time string (like `'2012-05-12 03:11:22'`) can be parsed to a timestamp number, while any other value is parsed to `NaN`.
++ `parser: 'trim'`: Trim the string before comparing. Non-string values are returned unchanged.
++ `parser: 'number'`: Force conversion of the value to a number before comparing. If the value cannot be converted to a meaningful number, it is converted to `NaN`. In most cases this is not necessary, because by default the value is converted to a number automatically, if possible, before comparing. But the default conversion is strict while this parser provides a loose strategy. For a number string with a unit suffix (like `'33%'`, `'12px'`), we should use `parser: 'number'` to [...]
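The difference between the three parsers can be illustrated with rough plain JavaScript stand-ins. These are assumptions for illustration only: the real parsing rules live inside echarts, the built-in `Date` is used here instead of `echarts.time.parse`, and `Number` merely stands in for a strict conversion.

```js
// Illustration only: rough stand-ins for the three parsers.
const parsers = {
    // Date instances, timestamps and time strings become timestamps;
    // anything else becomes NaN.
    time(value) {
        if (value instanceof Date) {
            return value.getTime();
        }
        if (typeof value === 'number' || typeof value === 'string') {
            return new Date(value).getTime();
        }
        return NaN;
    },
    // Strings are trimmed; non-strings are returned unchanged.
    trim: value => (typeof value === 'string' ? value.trim() : value),
    // Loose conversion: a unit suffix after the digits is tolerated.
    number: value => parseFloat(value)
};

console.log(Number('33%'), parsers.number('33%')); // strict vs. loose
console.log(parsers.time('2012-05-12') < parsers.time('2012-06-02'));
```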
+
+This example shows `parser: 'time'`:
+```js
+option = {
+    dataset: [{
+        source: [
+            ['Product', 'Sales', 'Price', 'Date'],
+            ['Milk Tee', 311, 21, '2012-05-12'],
+            ['Cake', 135, 28, '2012-05-22'],
+            ['Latte', 262, 36, '2012-06-02'],
+            ['Milk Tee', 359, 21, '2012-06-22'],
+            ['Cake', 121, 28, '2012-07-02'],
+            ['Latte', 271, 36, '2012-06-22'],
+            ...
+        ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: {
+                dimension: 'Date', '>=': '2012-05', '<': '2012-06', parser: 'time'
+            }
+        }
+    }]
+}
+```
+
+**Formal definition:**
+
+Finally, we give the formal definition of the filter transform config here:
+```ts
+type FilterTransform = {
+    type: 'filter';
+    config: ConditionalExpressionOption;
+};
+type ConditionalExpressionOption =
+    true | false | RelationalExpressionOption | LogicalExpressionOption;
+type RelationalExpressionOption = {
+    dimension: DimensionName | DimensionIndex;
+    parser?: 'time' | 'trim' | 'number';
+    lt?: DataValue; // less than
+    lte?: DataValue; // less than or equal
+    gt?: DataValue; // greater than
+    gte?: DataValue; // greater than or equal
+    eq?: DataValue; // equal
+    ne?: DataValue; // not equal
+    '<'?: DataValue; // lt
+    '<='?: DataValue; // lte
+    '>'?: DataValue; // gt
+    '>='?: DataValue; // gte
+    '='?: DataValue; // eq
+    '!='?: DataValue; // ne
+    '<>'?: DataValue; // ne (SQL style)
+    reg?: RegExp | string; // RegExp
+};
+type LogicalExpressionOption = {
+    and?: ConditionalExpressionOption[];
+    or?: ConditionalExpressionOption[];
+    not?: ConditionalExpressionOption;
+};
+type DataValue = string | number | Date;
+type DimensionName = string;
+type DimensionIndex = number;
+```
+
+
+
+
+## The transform "sort"
+
+Another built-in transform is "sort".
+
+```js
+option = {
+    dataset: [{
+        dimensions: ['name', 'age', 'profession', 'score', 'date'],
+        source: [
+            [' Hannah Krause ', 41, 'Engineer', 314, '2011-02-12'],
+            ['Zhao Qian ', 20, 'Teacher', 351, '2011-03-01'],
+            [' Jasmin Krause ', 52, 'Musician', 287, '2011-02-14'],
+            ['Li Lei', 37, 'Teacher', 219, '2011-02-18'],
+            [' Karle Neumann ', 25, 'Engineer', 253, '2011-04-02'],
+            [' Adrian Groß', 19, 'Teacher', null, '2011-01-16'],
+            ['Mia Neumann', 71, 'Engineer', 165, '2011-03-19'],
+            [' Böhm Fuchs', 36, 'Musician', 318, '2011-02-24'],
+            ['Han Meimei ', 67, 'Engineer', 366, '2011-03-12'],
+        ]
+    }, {
+        transform: {
+            type: 'sort',
+            // Sort by score.
+            config: { dimension: 'score', order: 'asc' }
+        }
+    }],
+    series: {
+        type: 'bar',
+        datasetIndex: 1
+    },
+    ...
+};
+```
+
+~[600x350](${galleryViewPath}data-transform-sort-bar&reset=1&edit=1)
+
+
+
+Some extra features of the "sort" transform:
++ Ordering by multiple dimensions is supported. See the example below.
++ The sort rules:
+  + By default, "numeric" values (that is, numbers and numeric strings like `' 123 '`) are sorted in numeric order.
+  + Otherwise, "non-numeric strings" can also be ordered among themselves. This helps in cases like grouping data items with the same tag, especially when multiple dimensions participate in the sort (see the example below).
+  + When a "numeric" value is compared with a "non-numeric string", or either of them is compared with a value of another type, they are not comparable. We call the latter one "incomparable" and treat it as the "min value" or "max value" according to the property `incomparable: 'min' | 'max'`. This feature usually helps to decide whether to put empty values (like `null`, `undefined`, `NaN`, `''`, `'-'`) or other illegal values at the head or the tail.
++ `parser: 'time' | 'trim' | 'number'` can be used, the same as in the "filter" transform.
+  + To sort time values (JS `Date` instances or time strings like `'2012-03-12 11:13:54'`), `parser: 'time'` should be specified, like `config: { dimension: 'date', order: 'desc', parser: 'time' }`.
+  + To sort values with a unit suffix (like `'33%'`, `'16px'`), use `parser: 'number'`.
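The effect of `incomparable` on a numeric dimension can be sketched as follows. This is a simplified plain JavaScript illustration, not the echarts implementation: `sortAsc` and `toNumeric` are hypothetical helpers, only ascending numeric order is covered, and ordering among non-numeric strings is left out.

```js
// Illustration only: sort numeric values ascending while placing
// incomparable values (null, undefined, NaN, '', '-') at one end.
function toNumeric(value) {
    if (typeof value === 'number') {
        return value; // NaN stays NaN
    }
    if (typeof value === 'string' && value.trim() !== '') {
        return Number(value); // ' 287 ' becomes 287, '-' becomes NaN
    }
    return NaN;
}

function sortAsc(values, incomparable) {
    const fallback = incomparable === 'min' ? -Infinity : Infinity;
    const key = value => {
        const n = toNumeric(value);
        return Number.isNaN(n) ? fallback : n;
    };
    // Compare instead of subtracting, so two infinite keys yield 0, not NaN.
    return [...values].sort((a, b) => {
        const ka = key(a);
        const kb = key(b);
        return ka === kb ? 0 : (ka < kb ? -1 : 1);
    });
}

const scores = [314, null, ' 287 ', 351, '-'];
console.log(sortAsc(scores, 'min')); // empty values first
console.log(sortAsc(scores, 'max')); // empty values last
```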
+
+
+Here is an example of ordering by multiple dimensions:
+```js
+option = {
+    dataset: [{
+        dimensions: ['name', 'age', 'profession', 'score', 'date'],
+        source: [
+            [' Hannah Krause ', 41, 'Engineer', 314, '2011-02-12'],
+            ['Zhao Qian ', 20, 'Teacher', 351, '2011-03-01'],
+            [' Jasmin Krause ', 52, 'Musician', 287, '2011-02-14'],
+            ['Li Lei', 37, 'Teacher', 219, '2011-02-18'],
+            [' Karle Neumann ', 25, 'Engineer', 253, '2011-04-02'],
+            [' Adrian Groß', 19, 'Teacher', null, '2011-01-16'],
+            ['Mia Neumann', 71, 'Engineer', 165, '2011-03-19'],
+            [' Böhm Fuchs', 36, 'Musician', 318, '2011-02-24'],
+            ['Han Meimei ', 67, 'Engineer', 366, '2011-03-12'],
+        ]
+    }, {
+        transform: {
+            type: 'sort',
+            config: [
+                // Sort by the two dimensions.
+                { dimension: 'profession', order: 'desc' },
+                { dimension: 'score', order: 'desc' }
+            ]
+        }
+    }],
+    series: {
+        type: 'bar',
+        datasetIndex: 1
+    },
+    ...
+};
+```
+~[600x350](${galleryViewPath}doc-example/data-transform-multiple-sort-bar&reset=1&edit=1)
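Ordering by multiple dimensions amounts to comparing by the first order expression and falling back to the next one on ties. The sketch below shows that idea in plain JavaScript; `compareBy` is a hypothetical helper, the data is a shortened variant of the table above, and `incomparable` and `parser` are not handled.

```js
// Illustration only: compare by each order expression in turn and
// stop at the first dimension where the two items differ.
function compareBy(orders) {
    return (a, b) => {
        for (const { dimension, order } of orders) {
            const x = a[dimension];
            const y = b[dimension];
            if (x !== y) {
                const result = x < y ? -1 : 1;
                return order === 'desc' ? -result : result;
            }
        }
        return 0;
    };
}

const people = [
    { name: 'Hannah', profession: 'Engineer', score: 314 },
    { name: 'Zhao', profession: 'Teacher', score: 351 },
    { name: 'Jasmin', profession: 'Musician', score: 287 },
    { name: 'Li', profession: 'Teacher', score: 219 },
    { name: 'Han', profession: 'Engineer', score: 366 }
];

const sorted = [...people].sort(compareBy([
    { dimension: 'profession', order: 'desc' },
    { dimension: 'score', order: 'desc' }
]));
console.log(sorted.map(person => person.name));
```

Items with the same `profession` end up adjacent, and within each group they are ordered by `score`.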
+
+
+Finally, we give the formal definition of the sort transform config here:
+```ts
+type SortTransform = {
+    type: 'sort';
+    config: OrderExpression | OrderExpression[];
+};
+type OrderExpression = {
+    dimension: DimensionName | DimensionIndex;
+    order: 'asc' | 'desc';
+    incomparable?: 'min' | 'max';
+    parser?: 'time' | 'trim' | 'number';
+};
+type DimensionName = string;
+type DimensionIndex = number;
+```
+
+
+## Use external transforms
+
+Besides the built-in transforms (such as 'filter' and 'sort'), external transforms can be used to provide more powerful functionality. Here we use a third-party library [ecStat](https://github.com/ecomfe/echarts-stat) as an example:
+
+This example shows how to make a regression line via ecStat:
+```js
+// Register the external transform at first.
+echarts.registerTransform(ecStatTransform(ecStat).regression);
+```
+```js
+option = {
+    dataset: [{
+        source: rawData
+    }, {
+        transform: {
+            // Reference the registered external transform.
+            // Note that an external transform has a namespace (like 'ecStat:xxx'
+            // has namespace 'ecStat').
+            // Built-in transforms (like 'filter', 'sort') do not have a namespace.
+            type: 'ecStat:regression',
+            config: {
+                // Parameters needed by the external transform.
+                method: 'exponential'
+            }
+        }
+    }, {
+        fromDatasetIndex: 1,
+        fromTransformResult: 1
+    }],
+    xAxis: { type: 'category' },
+    yAxis: {},
+    series: [{
+        name: 'scatter',
+        type: 'scatter',
+        datasetIndex: 0
+    }, {
+        name: 'regression',
+        type: 'line',
+        symbol: 'none',
+        datasetIndex: 1
+    }]
+};
+```
+
+example: ecStat regression
+
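For readers wondering what an external transform might look like on the inside, here is a hedged sketch. The object shape used below (`type` with a namespace, a `transform(params)` function reading `params.upstream` and `params.config`, and a returned `{ data }`) is an assumption that should be checked against the echarts API documentation; `myLib:scale` and `mockUpstream` are hypothetical and exist only so the function can be tried outside echarts.

```js
// Illustration only: a hypothetical external transform 'myLib:scale'.
// Assumption: echarts calls `transform(params)` with `params.upstream`
// (the input data) and `params.config`, and expects `{ data }` back.
const scaleTransform = {
    type: 'myLib:scale',
    transform(params) {
        const { dimensionIndex, factor } = params.config;
        const rows = params.upstream.cloneRawData();
        return {
            data: rows.map(row => row.map(
                (value, i) => (i === dimensionIndex ? value * factor : value)
            ))
        };
    }
};

// With echarts loaded, registration would follow the ecStat example:
// echarts.registerTransform(scaleTransform);

// A mock upstream, only for trying the function outside echarts.
const mockUpstream = { cloneRawData: () => [['Cake', 2], ['Tofu', 5]] };
const out = scaleTransform.transform({
    upstream: mockUpstream,
    config: { dimensionIndex: 1, factor: 10 }
});
console.log(out.data);
```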
diff --git a/en/tutorial/dataset.md b/en/tutorial/dataset.md
index 0a55f42..b5d97cf 100644
--- a/en/tutorial/dataset.md
+++ b/en/tutorial/dataset.md
@@ -511,6 +511,13 @@ var option = {
 }
 ```
 
+## Data transform
+
+`Data transform` has been supported since Apache ECharts (incubating)<sup>TM</sup> 5. In echarts, the term `data transform` means generating new data from user-provided source data and transform functions. This feature enables users to process data in a declarative way, and provides some common "transform functions" so that such tasks work "out-of-the-box".
+
+See the details of data transform in this [doc](~data-transform).
+
+
 
 ## ECharts3 data setting approach (series.data) can be used normally
 
@@ -542,6 +549,11 @@ The data setting approach before ECharts4 can still be 
used normally. If a serie
 In fact, setting data via [series.data](option.html#series.data) is not 
deprecated and useful in some cases. For example, for some charts, like 
[treemap](option.html#series-treemap), [graph](option.html#series-graph), 
[lines](option.html#series-lines), that do not apply table data, `dataset` is 
not supported for yet. Moreover, for the case of large data rendering (for 
example, millions of data), [appendData](api.html#echartsInstance.appendData) 
is probably needed to load data incremental [...]
 
 
+## Data transform
+
+See [data transform](~Data%20Transform).
+
+
 ## Others
 
 Currently, not all types of series support dataset. Series that support 
dataset includes:
diff --git a/en/tutorial/tutorial.md b/en/tutorial/tutorial.md
index a61fec7..178dd5e 100644
--- a/en/tutorial/tutorial.md
+++ b/en/tutorial/tutorial.md
@@ -8,6 +8,7 @@
 {{ import: style-overview }}
 {{ import: dynamic-data }}
 {{ import: dataset }}
+{{ import: data-transform }}
 {{ import: data-zoom }}
 {{ import: media-query }}
 {{ import: visual-map }}
diff --git a/tool/extractDesc.js b/tool/extractDesc.js
index 972ac8f..e0e489e 100644
--- a/tool/extractDesc.js
+++ b/tool/extractDesc.js
@@ -73,7 +73,8 @@ function convertToTree(rootSchema, rootNode) {
             childNode.arrayItemType = 
schema.properties.type.default.replace(/'/g, '');
         }
         else {
-            throw new Error('Some thing wrong happens', schema);
+            console.error('schema', schema);
+            throw new Error('Some thing wrong happens');
         }
         return childNode;
     }
diff --git a/zh/option/component/data-transform-external.md b/zh/option/component/data-transform-external.md
new file mode 100644
index 0000000..b9775db
--- /dev/null
+++ b/zh/option/component/data-transform-external.md
@@ -0,0 +1,21 @@
+{{ target: component-data-transform-external }}
+
+## transform.xxx:xxx(Object)
+
+
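+Besides the built-in data transforms above, external data transforms can also be used. External data transforms can provide, or be customized to provide, richer functionality.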
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+### type(string) = 'xxx:xxx'
+
+Built-in data transforms have no namespace (such as `type: 'filter'`, `type: 'sort'`).
+
+External data transforms must have a namespace (such as `type: 'ecStat:regression'`).
+
+### config
+
+The parameters required by each data transform are set here. Each type of data transform defines its own parameter format.
+
+
+{{ use: partial-data-transform-print }}
diff --git a/zh/option/component/data-transform-filter.md b/zh/option/component/data-transform-filter.md
new file mode 100644
index 0000000..6b69a71
--- /dev/null
+++ b/zh/option/component/data-transform-filter.md
@@ -0,0 +1,14 @@
+{{ target: component-data-transform-filter }}
+
+## transform.filter(Object)
+
+### type(string) = 'filter'
+
+### config
+
+The "condition" of the "filter" data transform.
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+
+{{ use: partial-data-transform-print }}
diff --git a/zh/option/component/data-transform-sort.md b/zh/option/component/data-transform-sort.md
new file mode 100644
index 0000000..f73e23f
--- /dev/null
+++ b/zh/option/component/data-transform-sort.md
@@ -0,0 +1,14 @@
+{{ target: component-data-transform-sort }}
+
+## transform.sort(Object)
+
+### type(string) = 'sort'
+
+### config
+
+The "condition" of the "sort" data transform.
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+
+{{ use: partial-data-transform-print }}
diff --git a/zh/option/component/dataset.md b/zh/option/component/dataset.md
index b1973e7..2e22d77 100644
--- a/zh/option/component/dataset.md
+++ b/zh/option/component/dataset.md
@@ -8,7 +8,6 @@ ECharts 4 开始支持了 `数据集`（`dataset`）组件用于单独的数据
 For details about `dataset`, see the [tutorial](tutorial.html#%E4%BD%BF%E7%94%A8%20dataset%20%E7%AE%A1%E7%90%86%E6%95%B0%E6%8D%AE).
 
 
----
 
 {{use: partial-component-id(prefix="#")}}
 
@@ -55,6 +54,7 @@ ECharts 4 开始支持了 `数据集`（`dataset`）组件用于单独的数据
     prefix="#"
 )}}
 
+
 ## sourceHeader(boolean)
 
 Whether the first row/column of `dataset.source` holds the [dimension names](dataset.dimensions). Optional values:
@@ -64,3 +64,13 @@ ECharts 4 开始支持了 `数据集`（`dataset`）组件用于单独的数据
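 + `false`: the data starts directly from the first row/column.
 
 Note: "the first row/column" means that if [series.seriesLayoutBy](~series.seriesLayoutBy) is set to `'column'` (the default), the first row is used; if `series.seriesLayoutBy` is set to `'row'`, the first column is used.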
+
+
+## transform(Object)
+
+{{ use: partial-data-transform-tutorial-ref }}
+
+{{ import: component-data-transform-filter }}
+{{ import: component-data-transform-sort }}
+{{ import: component-data-transform-external }}
+
diff --git a/zh/option/partial/data-transform.md b/zh/option/partial/data-transform.md
new file mode 100644
index 0000000..80ca540
--- /dev/null
+++ b/zh/option/partial/data-transform.md
@@ -0,0 +1,27 @@
+{{ target: partial-data-transform-tutorial-ref }}
+See this tutorial: [data transform](tutorial.html#%E4%BD%BF%E7%94%A8%20transform%20%E8%BF%9B%E8%A1%8C%E6%95%B0%E6%8D%AE%E8%BD%AC%E6%8D%A2).
+
+
+
+{{ target: partial-data-transform-print }}
+
+### print(boolean) = false
+
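+When using a transform, a misconfiguration can leave the chart empty without any hint about what went wrong. The `transform.print` option is provided to help with debugging. It only takes effect in the development environment. For example: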
+
+```js
+option = {
+    dataset: [{
+        source: [ ... ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { ... },
+            // When set to `true`, the result of this transform
+            // is printed via console.log.
+            print: true
+        }
+    }],
+    ...
+}
+```
\ No newline at end of file
diff --git a/zh/tutorial/data-transform.md b/zh/tutorial/data-transform.md
new file mode 100644
index 0000000..2e8d29d
--- /dev/null
+++ b/zh/tutorial/data-transform.md
@@ -0,0 +1,545 @@
+{{target: data-transform}}
+
+# 使用 transform 进行数据转换
+
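+Apache ECharts (incubating)<sup>TM</sup> has supported "data transform" since version 5. In echarts, the term "data transform" means that, given an existing dataset ([dataset](option.html#dataset)) and a transform method ([transform](option.html#dataset.transform)), echarts generates a new dataset, which can then be used to draw charts. All of this can be done declaratively.
+
+Abstractly, a data transform follows the formula `outData = f(inputData)`, where `f` is the transform method, such as `filter`, `sort`, `regression`, `boxplot`, `cluster`, `aggregate` (todo), and so on. With data transforms we can at least do the following:
++ Split the data into several parts and show each part in a different pie chart.
++ Run statistical computations on the data and show the results.
++ Process the data with a data visualization algorithm and show the results.
++ Sort the data.
++ Remove or select only certain data items.
++ ...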
+
+
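+## Basic usage of data transform
+
+In echarts, data transform is built on top of the dataset component ([dataset](~dataset)). Setting [dataset.transform](option.html#dataset.transform) on a dataset declares that the data of this dataset comes from the result of that transform. For example: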
+
+```js
+var option = {
+    dataset: [{
+        // The index of this dataset is `0`.
+        source: [
+            ['Product', 'Sales', 'Price', 'Year'],
+            ['Cake', 123, 32, 2011],
+            ['Cereal', 231, 14, 2011],
+            ['Tofu', 235, 5, 2011],
+            ['Dumpling', 341, 25, 2011],
+            ['Biscuit', 122, 29, 2011],
+            ['Cake', 143, 30, 2012],
+            ['Cereal', 201, 19, 2012],
+            ['Tofu', 255, 7, 2012],
+            ['Dumpling', 241, 27, 2012],
+            ['Biscuit', 102, 34, 2012],
+            ['Cake', 153, 28, 2013],
+            ['Cereal', 181, 21, 2013],
+            ['Tofu', 395, 4, 2013],
+            ['Dumpling', 281, 31, 2013],
+            ['Biscuit', 92, 39, 2013],
+            ['Cake', 223, 29, 2014],
+            ['Cereal', 211, 17, 2014],
+            ['Tofu', 345, 3, 2014],
+            ['Dumpling', 211, 35, 2014],
+            ['Biscuit', 72, 24, 2014],
+        ],
+        // id: 'a'
+    }, {
+        // The index of this dataset is `1`.
+        // This `transform` declares that the data of this dataset comes from
+        // the result of the transform.
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Year', value: 2011 }
+        },
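+        // The optional properties `fromDatasetIndex` or `fromDatasetId` can also
+        // be set. They specify which dataset the input of the transform comes
+        // from. For example, `fromDatasetIndex: 0` means the input comes from
+        // the dataset with index `0`, and `fromDatasetId: 'a'` means the input
+        // comes from the dataset with `id: 'a'`.
+        // When neither is specified, the input defaults to the dataset with
+        // index `0`.
+    }, {
+        // The index of this dataset is `2`.
+        // Again, since neither `fromDatasetIndex` nor `fromDatasetId` is
+        // specified, the input defaults to the dataset with index `0`.
+        transform: {
+            // A transform of type "filter" iterates over the data items and
+            // selects those that satisfy the condition.
+            type: 'filter',
+            // Any parameters a transform needs must be placed in `config`.
+            // For this "filter" transform, `config` specifies the condition.
+            // The condition below selects all data items whose value on
+            // dimension 'Year' is 2012.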
+            config: { dimension: 'Year', value: 2012 }
+        }
+    }, {
+        // The index of this dataset is `3`.
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Year', value: 2013 }
+        }
+    }],
+    series: [{
+        type: 'pie', radius: 50, center: ['25%', '50%'],
+        // This pie series references the dataset with index `1`, that is,
+        // the result of the "filter" transform for 2011 above.
+        datasetIndex: 1
+    }, {
+        type: 'pie', radius: 50, center: ['50%', '50%'],
+        datasetIndex: 2
+    }, {
+        type: 'pie', radius: 50, center: ['75%', '50%'],
+        datasetIndex: 3
+    }]
+};
+```
+
+The result of the example above is shown below: three pie charts display the data of 2011, 2012 and 2013 respectively.
+~[800x300](${galleryViewPath}data-transform-multiple-pie&reset=1&edit=1)
+
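+To summarize, the key points of using a transform are:
++ Declare `transform` (and optionally `fromDatasetIndex`/`fromDatasetId`) in an otherwise empty dataset to state that new data is to be generated.
++ Have a series reference this dataset.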
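
As a rough mental model of the pie chart example above, the three "filter" datasets can be pictured as three filtered copies of the source table. The following is a minimal sketch in plain JavaScript, using a shortened copy of the data; `filterByDimension` is a hypothetical helper, and this is not how echarts implements transforms internally.

```js
// Illustrative only: plain JavaScript that mimics what the three
// "filter" transforms in the example produce.
const source = [
    ['Product', 'Sales', 'Price', 'Year'],
    ['Cake', 123, 32, 2011],
    ['Tofu', 235, 5, 2011],
    ['Cake', 143, 30, 2012],
    ['Tofu', 255, 7, 2012],
    ['Cake', 153, 28, 2013],
    ['Tofu', 395, 4, 2013]
];

// Hypothetical helper: keep the header row plus the rows whose value
// on the named dimension equals `value`.
function filterByDimension(table, dimension, value) {
    const [header, ...rows] = table;
    const index = header.indexOf(dimension);
    if (index < 0) {
        throw new Error('Unknown dimension: ' + dimension);
    }
    return [header, ...rows.filter(row => row[index] === value)];
}

// One derived table per year, like the datasets with index 1, 2 and 3.
const derived = [2011, 2012, 2013].map(
    year => filterByDimension(source, 'Year', year)
);
console.log(derived[0]);
```

Each derived table plays the role of one transformed dataset that a pie series then references through `datasetIndex`.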
+
+
+
+## Advanced usage of data transform
+
+### Chained transforms
+
+`transform` can be declared as a chain, which is syntactic sugar.
+```js
+option = {
+    dataset: [{
+        source: [ ... ] // the raw data
+    }, {
+        // Several transforms are declared in an array and form a chain:
+        // the output of one transform is the input of the next.
+        transform: [{
+            type: 'filter',
+            config: { dimension: 'Product', value: 'Tofu' }
+        }, {
+            type: 'sort',
+            config: { dimension: 'Year', order: 'desc' }
+        }]
+    }],
+    series: {
+        type: 'pie',
+        // This series references the result of the transforms above.
+        datasetIndex: 1
+    }
+}
+```
+
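+> Note: in theory, any transform may have multiple inputs or multiple outputs. However, when a transform is declared in a chain, it can only take the first output of the previous transform as its input (except for the first transform), and it can only pass its own first output to the next transform (except for the last transform).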
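
The chaining rule can be sketched as a left-to-right fold over the declared steps. The snippet below is only an illustration in plain JavaScript: `filterStep`, `sortStep` and `runChain` are hypothetical stand-ins, not the echarts built-ins.

```js
// Illustrative sketch of chaining: each step receives the previous
// step's output as its input.
const rows = [
    { Product: 'Tofu', Sales: 235, Year: 2011 },
    { Product: 'Cake', Sales: 143, Year: 2012 },
    { Product: 'Tofu', Sales: 395, Year: 2013 },
    { Product: 'Tofu', Sales: 255, Year: 2012 }
];

// Hypothetical step: keep the items whose `dimension` equals `value`.
const filterStep = (dimension, value) =>
    data => data.filter(item => item[dimension] === value);

// Hypothetical step: sort numerically by `dimension`.
const sortStep = (dimension, order) => data => {
    const sign = order === 'desc' ? -1 : 1;
    // Copy before sorting so the input of the step stays untouched.
    return [...data].sort((a, b) => sign * (a[dimension] - b[dimension]));
};

// Run the steps left to right, feeding each output into the next step.
function runChain(steps, input) {
    return steps.reduce((data, step) => step(data), input);
}

const result = runChain(
    [filterStep('Product', 'Tofu'), sortStep('Year', 'desc')],
    rows
);
console.log(result.map(item => item.Year));
```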
+
+
+
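+### Outputting multiple data from one transform
+
+In most cases a transform only needs to output one data. In some cases, however, it needs to output multiple data, each of which can be used by a different series or dataset.
+
+For example, the built-in "boxplot" transform generates not only the data needed by the boxplot series but also the outliers, which can be displayed with, say, a scatter series. See this [example](${galleryEditorPath}boxplot-light-velocity&edit=1&reset=1).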
+
+
+```js
+option = {
+    dataset: [{
+        // The index of this dataset is `0`.
+        source: [...] // the raw data
+    }, {
+        // The index of this dataset is `1`.
+        transform: {
+            type: 'boxplot'
+        }
+        // This "boxplot" transform generates two data:
+        // result[0]: the data needed by the boxplot series.
+        // result[1]: the outlier data.
+        // When other series or datasets reference this dataset, they only
+        // get result[0] by default.
+        // To let them get result[1], an extra dataset must be declared as follows:
+    }, {
+        // The index of this dataset is `2`.
+        // This extra dataset takes its data from the dataset with index `1`,
+        fromDatasetIndex: 1,
+        // and picks result[1] of the transform.
+        fromTransformResult: 1
+    }],
+    xAxis: {
+        type: 'category'
+    },
+    yAxis: {
+    },
+    series: [{
+        name: 'boxplot',
+        type: 'boxplot',
+        // This series references the dataset with index `1`,
+        // and thus gets result[0] of the transform.
+        datasetIndex: 1
+    }, {
+        name: 'outlier',
+        type: 'scatter',
+        // This series references the dataset with index `2`, and thus gets
+        // result[1] of the transform above (the outlier data).
+        datasetIndex: 2
+    }]
+};
+```
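
The relationship between a multi-output transform and `fromTransformResult` can be pictured with a toy example. This is a minimal sketch under stated assumptions: `splitByRange` and `pickResult` are hypothetical names, the sample numbers are made up, and the split rule is deliberately simplistic (it is not the "boxplot" algorithm).

```js
// Illustrative only: a hypothetical transform with two outputs and a
// hypothetical helper that picks one of them.
function splitByRange(values, low, high) {
    const inside = values.filter(v => v >= low && v <= high);
    const outside = values.filter(v => v < low || v > high);
    // result[0] is the main output, result[1] the secondary output.
    return [inside, outside];
}

// A consumer gets result[0] unless it asks for another index,
// mirroring the role of `fromTransformResult`.
function pickResult(results, fromTransformResult = 0) {
    if (fromTransformResult < 0 || fromTransformResult >= results.length) {
        throw new Error('No such transform result: ' + fromTransformResult);
    }
    return results[fromTransformResult];
}

const results = splitByRange([850, 740, 900, 1070, 930, 620], 700, 1000);
const mainData = pickResult(results);        // like `datasetIndex: 1`
const outlierData = pickResult(results, 1);  // like `datasetIndex: 2`
console.log(mainData, outlierData);
```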
+
+
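+### Debugging in the development environment
+
+When using a transform, a misconfiguration can leave the chart empty without any hint about what went wrong. The `transform.print` option is provided to help with debugging. It only takes effect in the development environment. For example: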
+
+```js
+option = {
+    dataset: [{
+        source: [ ... ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { ... },
+            // When set to `true`, the result of this transform
+            // is printed via console.log.
+            print: true
+        }
+    }],
+    ...
+}
+```
+
+
+## The "filter" data transform
+
+echarts provides a built-in data transform that filters data. We only need to declare `transform.type: "filter"` and give the filter condition. For example:
+
+```js
+option = {
+    dataset: [{
+        source: [
+            ['Product', 'Sales', 'Price', 'Year'],
+            ['Cake', 123, 32, 2011],
+            ['Latte', 231, 14, 2011],
+            ['Tofu', 235, 5, 2011],
+            ['Milk Tee', 341, 25, 2011],
+            ['Porridge', 122, 29, 2011],
+            ['Cake', 143, 30, 2012],
+            ['Latte', 201, 19, 2012],
+            ['Tofu', 255, 7, 2012],
+            ['Milk Tee', 241, 27, 2012],
+            ['Porridge', 102, 34, 2012],
+            ['Cake', 153, 28, 2013],
+            ['Latte', 181, 21, 2013],
+            ['Tofu', 395, 4, 2013],
+            ['Milk Tee', 281, 31, 2013],
+            ['Porridge', 92, 39, 2013],
+            ['Cake', 223, 29, 2014],
+            ['Latte', 211, 17, 2014],
+            ['Tofu', 345, 3, 2014],
+            ['Milk Tee', 211, 35, 2014],
+            ['Porridge', 72, 24, 2014]
+        ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Year', '=': 2011 }
+            // This condition iterates over the data and selects all data
+            // items whose value on dimension 'Year' is 2011.
+        }
+    }],
+    series: {
+        type: 'pie',
+        datasetIndex: 1
+    }
+};
+```
+
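+The "filter" transform involves the following elements:
+
+**About dimension:**
+
+`config.dimension` specifies the dimension and can be set to:
++ The name of a dimension declared in the dataset, e.g. `config: { dimension: 'Year', '=': 2011 }`.
++ The index of a dimension in the dataset (starting from 0), e.g. `config: { dimension: 3, '=': 2011 }`. This form is needed when the dataset declares no dimension names, since declaring them is not mandatory.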
+
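+**About relational operators:**
+
+The following relational operators are available:
+`>` (`gt`), `>=` (`gte`), `<` (`lt`), `<=` (`lte`), `=` (`eq`), `!=` (`ne`, `<>`), `reg`. (The symbols or names in parentheses are aliases and behave identically.) They primarily compare by numeric value, and also provide some extra features:
++ Multiple relational operators can be declared in one {}, e.g. `{ dimension: 'Price', '>=': 20, '<': 30 }`. This means "and": select the data items whose price is greater than or equal to 20 and less than 30.
++ Values in the data can be numbers or "numeric strings". A numeric string is a string that can be converted to the number it literally describes, e.g. `' 123 '`. Whitespace (both full-width and half-width spaces) and line breaks are trimmed during the conversion.
++ To compare date objects (JS `Date`) or date strings (e.g. '2012-05-12'), `parser: 'time'` must be specified manually, e.g. `config: { dimension: 3, lt: '2012-05-12', parser: 'time' }`.
++ Pure string comparison is also supported, but only with `=` and `!=`. `>`, `>=`, `<` and `<=` do not support pure string comparison, that is, the right-hand value of these four operators cannot be a pure string.
++ The `reg` operator provides regular expression matching. For example, `{ dimension: 'Name', reg: /\s+Müller\s*$/ }` selects the data items whose surname is `'Müller'` on the `'Name'` dimension.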
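
The rules in the list above can be approximated in a few lines of plain JavaScript. This is a minimal sketch, not the echarts implementation: `OPERATORS`, `ALIASES`, `normalize` and `matches` are hypothetical names, data items are plain objects, and `parser` is left out.

```js
// Illustrative sketch of the relational operators.
const isNumber = v => typeof v === 'number' && !Number.isNaN(v);

const OPERATORS = {
    // Ordering operators only compare numbers in this sketch.
    gt: (a, b) => isNumber(a) && isNumber(b) && a > b,
    gte: (a, b) => isNumber(a) && isNumber(b) && a >= b,
    lt: (a, b) => isNumber(a) && isNumber(b) && a < b,
    lte: (a, b) => isNumber(a) && isNumber(b) && a <= b,
    // Equality also works on pure strings.
    eq: (a, b) => a === b,
    ne: (a, b) => a !== b
};
const ALIASES = {
    '>': 'gt', '>=': 'gte', '<': 'lt', '<=': 'lte',
    '=': 'eq', '!=': 'ne', '<>': 'ne'
};

// Convert numeric strings such as ' 123 ' to numbers; keep the rest.
function normalize(value) {
    if (typeof value === 'string') {
        const trimmed = value.trim();
        if (trimmed !== '' && !Number.isNaN(Number(trimmed))) {
            return Number(trimmed);
        }
    }
    return value;
}

// All operators declared in one condition must hold ("and").
function matches(item, condition) {
    const { dimension, ...operators } = condition;
    const left = normalize(item[dimension]);
    return Object.entries(operators).every(([key, right]) => {
        if (key === 'reg') {
            return new RegExp(right).test(String(item[dimension]));
        }
        const operator = OPERATORS[ALIASES[key] || key];
        if (!operator) {
            throw new Error('Unknown operator: ' + key);
        }
        return operator(left, normalize(right));
    });
}

console.log(matches({ Price: ' 25 ' }, { dimension: 'Price', '>=': 20, '<': 30 }));
console.log(matches({ Name: 'Anna Müller' }, { dimension: 'Name', reg: /\s+Müller\s*$/ }));
```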
+
+**About logical operators:**
+
+The logical operators **and, or, not** ( `and` | `or` | `not` ) are also supported:
+```js
+option = {
+    dataset: [{
+        source: [...]
+    }, {
+        transform: {
+            type: 'filter',
+            config: {
+                // Use the "and" operator.
+                // "or" or "not" can be used in the same position.
+                // Note that "not" must be followed by a {...} rather than a [...].
+                and: [
+                    { dimension: 'Year', '=': 2011 },
+                    { dimension: 'Price', '>=': 20, '<': 30 }
+                ]
+            }
+            // This selects the data items of 2011 whose price is at least 20 but less than 30.
+        }
+    }],
+    series: {
+        type: 'pie',
+        datasetIndex: 1
+    }
+};
+```
+`and`/`or`/`not` can of course be nested, for example:
+```js
+transform: {
+    type: 'filter',
+    config: {
+        or: [{
+            and: [{
+                dimension: 'Price', '>=': 10, '<': 20
+            }, {
+                dimension: 'Sales', '<': 100
+            }, {
+                not: { dimension: 'Product', '=': 'Tofu' }
+            }]
+        }, {
+            and: [{
+                dimension: 'Price', '>=': 10, '<': 20
+            }, {
+                dimension: 'Sales', '<': 100
+            }, {
+                not: { dimension: 'Product', '=': 'Cake' }
+            }]
+        }]
+    }
+}
+```
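
Nested conditions like the one above lend themselves to a recursive evaluation. The sketch below is an assumption about how such a condition tree could be walked, not the echarts code; `COMPARE`, `relational` and `evaluate` are hypothetical names, and only the symbolic operators are handled.

```js
// Illustrative recursive evaluator for `and` / `or` / `not`.
const COMPARE = {
    '>': (a, b) => a > b,
    '>=': (a, b) => a >= b,
    '<': (a, b) => a < b,
    '<=': (a, b) => a <= b,
    '=': (a, b) => a === b,
    '!=': (a, b) => a !== b
};

// Leaf condition: every declared operator must hold.
function relational(item, { dimension, ...operators }) {
    return Object.entries(operators).every(
        ([symbol, right]) => COMPARE[symbol](item[dimension], right)
    );
}

function evaluate(item, condition) {
    if (typeof condition === 'boolean') return condition;
    if ('and' in condition) return condition.and.every(c => evaluate(item, c));
    if ('or' in condition) return condition.or.some(c => evaluate(item, c));
    if ('not' in condition) return !evaluate(item, condition.not);
    return relational(item, condition);
}

// The nested condition from the example above.
const condition = {
    or: [{
        and: [
            { dimension: 'Price', '>=': 10, '<': 20 },
            { dimension: 'Sales', '<': 100 },
            { not: { dimension: 'Product', '=': 'Tofu' } }
        ]
    }, {
        and: [
            { dimension: 'Price', '>=': 10, '<': 20 },
            { dimension: 'Sales', '<': 100 },
            { not: { dimension: 'Product', '=': 'Cake' } }
        ]
    }]
};

console.log(evaluate({ Product: 'Tofu', Price: 15, Sales: 80 }, condition));
console.log(evaluate({ Product: 'Tofu', Price: 25, Sales: 80 }, condition));
```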
+
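+**About parser:**
+
+A "parser" can also be specified to parse values before they are compared. The parsers currently supported are:
++ `parser: 'time'`: parses the original value into a timestamp before comparison. This parser behaves the same as `echarts.time.parse`: an original value that is a JS `Date` instance, a timestamp, or a string describing a time (e.g. `'2012-05-12 03:11:22'`) is parsed into a timestamp and then compared numerically. Any other value that cannot be parsed into a timestamp is parsed into NaN.
++ `parser: 'trim'`: if the original value is a string, the whitespace (full-width and half-width) and line breaks at both ends are removed. If it is not a string, the original value is kept.
++ `parser: 'number'`: forces the original value into a number, or `NaN` if it cannot be converted into a meaningful number. In most cases this parser is unnecessary because, by default, numeric strings are already converted to numbers. The default strategy is strict, though, whereas this parser is lenient: for strings with a suffix (e.g. `'33%'`, `'12px'`), `parser: 'number'` must be specified manually so that the suffix is stripped and the value can be compared as a number.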
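
The three parsers can be approximated with built-in JavaScript functions. This is only a sketch under assumptions: the `parsers` object is hypothetical, and `time` is approximated with the built-in `Date`, whose accepted formats and time zone handling may differ from `echarts.time.parse`.

```js
// Illustrative approximations of the three parsers.
const parsers = {
    // Date instance, timestamp or time string -> timestamp, else NaN.
    time(value) {
        if (value instanceof Date) return value.getTime();
        if (typeof value === 'number') return value;
        if (typeof value === 'string') return new Date(value).getTime();
        return NaN;
    },
    // Strings lose surrounding whitespace and line breaks; other
    // values are returned unchanged.
    trim(value) {
        return typeof value === 'string' ? value.trim() : value;
    },
    // Lenient conversion that drops a trailing suffix such as '%'.
    number(value) {
        return parseFloat(value);
    }
};

console.log(parsers.number('33%'), parsers.number('12px'));
console.log(parsers.trim('  Cake \n'));
console.log(parsers.time('2012-05-12') < parsers.time('2012-06-02'));
```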
+
+This example shows how to use `parser: 'time'`:
+```js
+option = {
+    dataset: [{
+        source: [
+            ['Product', 'Sales', 'Price', 'Date'],
+            ['Milk Tee', 311, 21, '2012-05-12'],
+            ['Cake', 135, 28, '2012-05-22'],
+            ['Latte', 262, 36, '2012-06-02'],
+            ['Milk Tee', 359, 21, '2012-06-22'],
+            ['Cake', 121, 28, '2012-07-02'],
+            ['Latte', 271, 36, '2012-06-22'],
+            ...
+        ]
+    }, {
+        transform: {
+            type: 'filter',
+            config: { dimension: 'Date', '>=': '2012-05', '<': '2012-06', parser: 'time' }
+        }
+    }]
+}
+```
+
+**Formal definition:**
+
+Finally, here is the formal definition of the config of the "filter" data transform:
+```ts
+type FilterTransform = {
+    type: 'filter';
+    config: ConditionalExpressionOption;
+};
+type ConditionalExpressionOption =
+    true | false | RelationalExpressionOption | LogicalExpressionOption;
+type RelationalExpressionOption = {
+    dimension: DimensionName | DimensionIndex;
+    parser?: 'time' | 'trim' | 'number';
+    lt?: DataValue; // less than
+    lte?: DataValue; // less than or equal
+    gt?: DataValue; // greater than
+    gte?: DataValue; // greater than or equal
+    eq?: DataValue; // equal
+    ne?: DataValue; // not equal
+    '<'?: DataValue; // lt
+    '<='?: DataValue; // lte
+    '>'?: DataValue; // gt
+    '>='?: DataValue; // gte
+    '='?: DataValue; // eq
+    '!='?: DataValue; // ne
+    '<>'?: DataValue; // ne (SQL style)
+    reg?: RegExp | string; // RegExp
+};
+type LogicalExpressionOption = {
+    and?: ConditionalExpressionOption[];
+    or?: ConditionalExpressionOption[];
+    not?: ConditionalExpressionOption;
+};
+type DataValue = string | number | Date;
+type DimensionName = string;
+type DimensionIndex = number;
+```
+
+
+
+
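+## The "sort" data transform
+
+"sort" is another built-in data transform, used to sort data. At present it is mainly useful for displaying sorted data on a category axis ( `axis.type: 'category'` ). For example: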
+
+```js
+option = {
+    dataset: [{
+        dimensions: ['name', 'age', 'profession', 'score', 'date'],
+        source: [
+            [' Hannah Krause ', 41, 'Engineer', 314, '2011-02-12'],
+            ['Zhao Qian ', 20, 'Teacher', 351, '2011-03-01'],
+            [' Jasmin Krause ', 52, 'Musician', 287, '2011-02-14'],
+            ['Li Lei', 37, 'Teacher', 219, '2011-02-18'],
+            [' Karle Neumann ', 25, 'Engineer', 253, '2011-04-02'],
+            [' Adrian Groß', 19, 'Teacher', null, '2011-01-16'],
+            ['Mia Neumann', 71, 'Engineer', 165, '2011-03-19'],
+            [' Böhm Fuchs', 36, 'Musician', 318, '2011-02-24'],
+            ['Han Meimei ', 67, 'Engineer', 366, '2011-03-12'],
+        ]
+    }, {
+        transform: {
+            type: 'sort',
+            // Sort by score.
+            config: { dimension: 'score', order: 'asc' }
+        }
+    }],
+    series: {
+        type: 'bar',
+        datasetIndex: 1
+    },
+    ...
+};
+```
+
+~[600x350](${galleryViewPath}data-transform-sort-bar&reset=1&edit=1)
+
+
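+The "sort" data transform has some additional features:
++ Sorting by multiple dimensions is supported. See the example below.
++ The sorting rules are:
+  + By default, values are sorted numerically. "Numeric strings" are converted to numbers and sorted together with the other numbers.
+  + Other "non-numeric strings" can be sorted among themselves as strings. This helps when grouping data items with the same label together, especially when sorting by multiple dimensions. See the example below.
+  + When "numbers and numeric strings" are compared with "non-numeric strings", or when either is compared with "values of other types", there is no inherent way to compare them. The latter is then called "incomparable", and `incomparable: 'min' | 'max'` can be set to specify whether an "incomparable" is treated as the minimum or the maximum in the comparison, so that a result can be produced. This setting can be used, for example, to decide whether empty values (such as `null`, `undefined`, `NaN`, `''`, `'-'`) go to the head or the tail of the sorted data.
++ A parser `parser: 'time' | 'trim' | 'number'` can be used, as in the "filter" data transform.
+  + To sort time values (e.g. JS `Date` instances or time strings such as `'2012-03-12 11:13:54'`), declare `parser: 'time'`.
+  + To sort numbers with a suffix (e.g. `'33%'`, `'16px'`), declare `parser: 'number'`.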
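
The ordering rules above, including `incomparable` and sorting by several dimensions, can be sketched as a comparator in plain JavaScript. This is an illustration under assumptions rather than the echarts implementation: `toNumber`, `compareValues` and `sortRows` are hypothetical names, data items are plain objects, parsers are omitted, and treating an omitted `incomparable` as `'max'` is a choice made for this sketch only.

```js
// Numbers and numeric strings become numbers; everything else is NaN.
function toNumber(value) {
    if (typeof value === 'number') return value;
    if (typeof value === 'string' && value.trim() !== '') {
        return Number(value);
    }
    return NaN;
}

function compareValues(a, b, incomparable) {
    const na = toNumber(a);
    const nb = toNumber(b);
    const aIsNumeric = !Number.isNaN(na);
    const bIsNumeric = !Number.isNaN(nb);
    if (aIsNumeric && bIsNumeric) return na - nb;
    // Two non-numeric strings are compared as strings.
    if (typeof a === 'string' && !aIsNumeric &&
        typeof b === 'string' && !bIsNumeric) {
        return a < b ? -1 : a > b ? 1 : 0;
    }
    // Otherwise the non-numeric side is "incomparable" and is treated
    // as the minimum or the maximum (assumed 'max' when omitted).
    const extreme = incomparable === 'min' ? -Infinity : Infinity;
    const ra = aIsNumeric ? na : extreme;
    const rb = bIsNumeric ? nb : extreme;
    return ra < rb ? -1 : ra > rb ? 1 : 0;
}

// `config` is one order expression or an array of them; later
// expressions only break ties left by earlier ones.
function sortRows(data, config) {
    const expressions = Array.isArray(config) ? config : [config];
    return [...data].sort((x, y) => {
        for (const { dimension, order, incomparable } of expressions) {
            const sign = order === 'desc' ? -1 : 1;
            const result = sign *
                compareValues(x[dimension], y[dimension], incomparable);
            if (result !== 0) return result;
        }
        return 0;
    });
}

const people = [
    { name: 'A', profession: 'Engineer', score: 314 },
    { name: 'B', profession: 'Teacher', score: 351 },
    { name: 'C', profession: 'Teacher', score: null },
    { name: 'D', profession: 'Engineer', score: '165' },
    { name: 'E', profession: 'Teacher', score: 219 }
];

const byScore = sortRows(
    people, { dimension: 'score', order: 'asc', incomparable: 'min' }
);
const byProfessionThenScore = sortRows(people, [
    { dimension: 'profession', order: 'desc' },
    { dimension: 'score', order: 'desc', incomparable: 'min' }
]);
console.log(byScore.map(p => p.name).join(''));
console.log(byProfessionThenScore.map(p => p.name).join(''));
```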
+
+
+Here is an example of sorting by multiple dimensions.
+```js
+option = {
+    dataset: [{
+        dimensions: ['name', 'age', 'profession', 'score', 'date'],
+        source: [
+            [' Hannah Krause ', 41, 'Engineer', 314, '2011-02-12'],
+            ['Zhao Qian ', 20, 'Teacher', 351, '2011-03-01'],
+            [' Jasmin Krause ', 52, 'Musician', 287, '2011-02-14'],
+            ['Li Lei', 37, 'Teacher', 219, '2011-02-18'],
+            [' Karle Neumann ', 25, 'Engineer', 253, '2011-04-02'],
+            [' Adrian Groß', 19, 'Teacher', null, '2011-01-16'],
+            ['Mia Neumann', 71, 'Engineer', 165, '2011-03-19'],
+            [' Böhm Fuchs', 36, 'Musician', 318, '2011-02-24'],
+            ['Han Meimei ', 67, 'Engineer', 366, '2011-03-12'],
+        ]
+    }, {
+        transform: {
+            type: 'sort',
+            config: [
+                // Sort by the two dimensions in the declared order of priority.
+                { dimension: 'profession', order: 'desc' },
+                { dimension: 'score', order: 'desc' }
+            ]
+        }
+    }],
+    series: {
+        type: 'bar',
+        datasetIndex: 1
+    },
+    ...
+};
+```
+~[600x350](${galleryViewPath}doc-example/data-transform-multiple-sort-bar&reset=1&edit=1)
+
+
+Finally, here is the formal definition of the config of the "sort" data transform.
+```ts
+type SortTransform = {
+    type: 'sort';
+    config: OrderExpression | OrderExpression[];
+};
+type OrderExpression = {
+    dimension: DimensionName | DimensionIndex;
+    order: 'asc' | 'desc';
+    incomparable?: 'min' | 'max';
+    parser?: 'time' | 'trim' | 'number';
+};
+type DimensionName = string;
+type DimensionIndex = number;
+```
+
+
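+## Using external data transforms
+
+In addition to the built-in data transforms described above, external data transforms can also be used. External data transforms can offer richer functionality, whether supplied by a third-party library or written by yourself. The example below uses a data transform provided by the third-party library [ecStat](https://github.com/ecomfe/echarts-stat).
+
+Generating the regression line of the data: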
+```js
+// The external data transform must be registered first.
+echarts.registerTransform(ecStatTransform(ecStat).regression);
+```
+```js
+option = {
+    dataset: [{
+        source: rawData
+    }, {
+        transform: {
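+            // Reference the registered data transform.
+            // Note that every external data transform has a namespace
+            // (e.g. 'ecStat:xxx', where 'ecStat' is the namespace),
+            // whereas built-in transforms (e.g. 'filter', 'sort') have none.
+            type: 'ecStat:regression',
+            config: {
+                // The parameters required by this external data transform.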
+                method: 'exponential'
+            }
+        }
+    }, {
+        fromDatasetIndex: 1,
+        fromTransformResult: 1
+    }],
+    xAxis: { type: 'category' },
+    yAxis: {},
+    series: [{
+        name: 'scatter',
+        type: 'scatter',
+        datasetIndex: 0
+    }, {
+        name: 'regression',
+        type: 'line',
+        symbol: 'none',
+        datasetIndex: 1
+    }]
+};
+```
+
+example: ecStat regression
+
diff --git a/zh/tutorial/dataset.md b/zh/tutorial/dataset.md
index 23efb7d..5c5af50 100644
--- a/zh/tutorial/dataset.md
+++ b/zh/tutorial/dataset.md
@@ -534,6 +534,10 @@ ECharts 4 之前一直以来的数据声明方式仍然被正常支持，如果
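 In fact, [series.data](option.html#series.data) remains an important way of setting data and will continue to exist. Some special charts with non-table data formats, such as [treemap](option.html#series-treemap), [graph](option.html#series-graph) and [lines](option.html#series-lines), still cannot be set through dataset and must use [series.data](option.html#series.data). In addition, rendering huge amounts of data (e.g. more than a million items) requires incremental loading with [appendData](api.html#echartsInstance.appendData), which does not support `dataset`.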
 
 
+## Data transform
+
+See [data transform](~%E4%BD%BF%E7%94%A8%20transform%20%E8%BF%9B%E8%A1%8C%E6%95%B0%E6%8D%AE%E8%BD%AC%E6%8D%A2).
+
 ## Others
 
 Not all charts support dataset at present. The charts that support dataset are:
diff --git a/zh/tutorial/tutorial.md b/zh/tutorial/tutorial.md
index 008acc8..b0b1252 100644
--- a/zh/tutorial/tutorial.md
+++ b/zh/tutorial/tutorial.md
@@ -8,6 +8,7 @@
 {{ import: style-overview }}
 {{ import: dynamic-data }}
 {{ import: dataset }}
+{{ import: data-transform }}
 {{ import: data-zoom }}
 {{ import: media-query }}
 {{ import: visual-map }}


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@echarts.apache.org
For additional commands, e-mail: commits-h...@echarts.apache.org
