100pah edited a comment on pull request #13358: URL: https://github.com/apache/incubator-echarts/pull/13358#issuecomment-704408035
# How does a user express data mapping for transition?

## Issues

First and foremost, we need to consider the issues below:

### ISSUE_I: If we need to "auto detect the change of dimensions" between old data and new data, how do we implement it?

We should consider:
+ We have never forced users to specify dimension names. Users can specify dimensions by dimension index only, which is probably convenient in some scenarios in practice.
+ If we implement "data mapping for transition animation" via "auto detection of the change of dimensions", we can probably require users to specify dimension names if they want a "correct transition animation", and perform mapping by the rule `MAPPING_ON_THE_SAME_DIMENSION_NAME`: if `oldData.dimensions[i].name` equals `newData.dimensions[j].name`, we map data items by their values on `oldData.dimensions[i]` and `newData.dimensions[j]`. **Is there any flaw in applying that rule**?

### ISSUE_II: The issues of "mapping by index":

The default data mapping implementation is provided by `List['diff']`, where data items are mapped by data index if their names are not specified.
"Mapping by index" is not a big deal in scenarios where the meaning of the transition goes unnoticed. But in scenarios where the meaning of the transition needs to be noticed, like storytelling, any incorrect data mapping is probably inappropriate.

For example:
`dataA` is the raw data, where the dimensions are `['Year', 'Income', 'Population', 'Sex', 'Country']`.
`dataB` is calculated by:
```sql
select avg(`Population`), avg(`Income`) from `dataA` group by `Sex`;
```
`dataC` is calculated by:
```sql
select avg(`Population`), avg(`Income`) from `dataA` group by `Country`;
```
Suppose there are only two values in dimension `Country` (`'France'`, `'Germany'`), which is the same as the number of values in dimension `Sex` (`'Woman'`, `'Man'`).
Consequently the item counts of `dataB` and `dataC` are exactly the same.

Given the data above, when `dataB` is switched to `dataC` via `setOption`, the data mapping should not be performed by index. Otherwise there will be misleading mappings from `'Man'` to `'France'` or from `'Woman'` to `'Germany'`.
In this case, no transition animation is probably better than a misleading transition animation.

### ISSUE_III: The issues of "when dimensions are not changed":

Suppose the dimensions do not change before and after `setOption` is called:
The dimensions of `dataA` are `['Income', 'Population', 'Country']`.
The dimensions of `dataB` are `['Income', 'Population', 'Country']`, exactly the same.
But `dataB` is calculated by:
```sql
select sum(`Income`), avg(`Population`) from `dataA` group by `Country`;
```
Given the data above, the dimensions are not changed, but obviously the data should be mapped neither by index nor by the first shared dimension (`Income`). The appropriate mapping should be performed on dimension `Country`, which, nevertheless, cannot be auto-detected.
That is, even when the dimensions are not changed, a totally correct data mapping can hardly be auto-detected. User input about the transition is still needed in this case.

### ISSUE_IV: Issues about "user specifies a dimension (also called `key` below) to perform mapping":

Suppose there are these requirements:
1. `dataB`(`seriesB`) ---transition1(on `'Country'`)---> `dataA`(`seriesA`)
2. `dataC`(`seriesC`) ---transition2(on `'Income'`)---> `dataA`(`seriesA`)

We call the data before the "transition arrow" `from`, and the data after the arrow `to`.
`transition1` needs the user to input a key `'Country'`, and `transition2` needs the user to input a key `'Income'`.
That is, the "user specified key" is related not only to `to` but also to `from`.
That is, the "user specified key" only works for this call of `setOption`, and should be discarded after `setOption` is called.
That is, the "user specified key" had better be set in the params of `setOption` rather than in the series option.

If we intend to put the "user specified key" in the series option, we probably need to lift the concept of that "key", so that it describes not something about this transition but a feature of the data itself. For example, it could describe which dimension is the unique key of the data, and echarts could subsequently derive an auto-mapping rule from that unique key. We will discuss it below in detail.

### ISSUE_V: Issues about "data totally not changed but need transition animation".

For example, users need a transition from a bar chart to a pie chart with the same data.
Suppose there is `dataA`, whose dimensions are `['Income', 'Population', 'Country', 'Sex']`, and no dimension is suitable as a unique key.
Echarts needs to be able to perform data mapping here. Obviously, the current default mapping rule (`List['diff']`), which maps by index, can handle that. But if we disable the "mapping by index" rule to avoid unexpected transition animations in other scenarios, how do we handle this case instead?
A possible solution can be:
```js
option = {
    dataset: [{
        dimensions: ['Income', 'Population', 'Country', 'Sex'],
        source: dataA
    }, {
        // Generate an extra dimension as id.
        transform: { type: 'id', dimensionIndex: 4, dimensionName: 'Id' }
    }],
    series: {
        type: 'custom',
        encode: { itemName: 'Id' },
        datasetIndex: 1
    }
};
```

<br>

## Solutions

Based on the scenarios listed above, I have summarized two designs for **how a user expresses data mapping for transition**.

### SOLUTION_A: The dimension key for data mapping is set in the parameters of `setOption`.

That is, the user is responsible for setting the "from dimension" and "to dimension" of the data mapping when intending to have a transition animation.

The advantages:
+ The API is relatively more "atomic". Users can control everything about the transition, which might avoid some bad cases that we have not thought of.
+ It's not hard for users to configure it in "linear scene changing" (that is, optionA -> optionB -> optionC, a linked list rather than a directed graph).

The disadvantages:
+ It's not easy for users to configure it in "directed-graph scene changing", where users might need an upper layer to manage the transition settings.

### SOLUTION_B: The "key" for transition is set in the series option.

The key points of this strategy:
+ Apply `MAPPING_ON_THE_SAME_DIMENSION_NAME`.
+ The user is responsible for specifying the "unique key" of the data, which is subsequently used to select the transition key.
    + The term "unique key" follows the same concept as a unique key in a database.
    + `PENDING_I`: how to specify the unique key?
        + We can use the existing setting `series.encode.itemId` to specify the unique key. Its only difference from `series.encode.itemName` is that it will not be displayed in the default tooltip.

Considering compatibility with the current mapping strategy, when `setOption` happens, we have the following rules:
+ Get `UNIQUE_KEY_DIMENSION_NAME`: if `series.encode.itemId` is specified and the dimension it refers to has a name, we have `UNIQUE_KEY_DIMENSION_NAME`.
+ If there is `newData`.`UNIQUE_KEY_DIMENSION_NAME`, check it in `oldData`. If any dimension there has the same name, we have the transition mapping dimensions `from` and `to`.
+ Else if there is `oldData`.`UNIQUE_KEY_DIMENSION_NAME`, check it in `newData`. If any dimension there has the same name, we have the transition mapping dimensions `from` and `to`.
+ Else if there is `newData`.`UNIQUE_KEY_DIMENSION_NAME`, do not apply transition animation.
    + This provides a way to disable unexpected transitions.
+ Else apply the existing mapping rule (`List['diff']`).

**User usage hints of SOLUTION_B:**

Scenario in (ISSUE_II): Expect no transition animation.
```js
chart.setOption({
    series: {
        encode: { itemId: 'Sex' },
        dimensions: ['Population', 'Income'],
        data: dataB_aggregate_by_Sex_from_dataA
    }
});
chart.setOption({
    series: {
        encode: { itemId: 'Country' },
        dimensions: ['Population', 'Income'],
        data: dataC_aggregate_by_Country_from_dataA
    }
});
```

Scenario in (ISSUE_III): Expect mapping by dimension `Country`.
```js
chart.setOption({
    series: {
        encode: { itemId: -1 }, // Means no item name.
        dimensions: ['Income', 'Population', 'Country'],
        data: dataA
    }
});
chart.setOption({
    series: {
        encode: { itemId: 'Country' },
        dimensions: ['Income', 'Population', 'Country'],
        data: dataB_aggregate_by_Country_from_dataA
    }
});
```

Scenario in (ISSUE_IV):
```js
chart.setOption({
    series: {
        encode: { itemId: -1 }, // Means no item name.
        dimensions: ['Income', 'Population', 'Country', 'Sex'],
        data: dataA
    }
});
chart.setOption({
    series: {
        encode: { itemId: 'Country' },
        dimensions: ['Income', 'Population', 'Country'],
        data: dataB_aggregate_by_Country_from_dataA
    }
});
chart.setOption({
    series: {
        encode: { itemId: 'Sex' },
        dimensions: ['Income', 'Population', 'Sex'],
        data: dataC_aggregate_by_Sex_from_dataA
    }
});
```

Scenario in (ISSUE_V): Expect a transition between bar and pie with the same data.
```js
chart.setOption({
    dataset: [{
        dimensions: ['Income', 'Population', 'Country', 'Sex'],
        source: dataA
    }, {
        // Generate an extra dimension as id.
        transform: { type: 'id', dimensionIndex: 4, dimensionName: 'Id' }
    }]
}, { lazyUpdate: true });
chart.setOption({
    series: {
        // render bar
        type: 'custom',
        renderItem: renderBar,
        encode: { itemId: 'Id' },
        datasetIndex: 1
    }
});
chart.setOption({
    series: {
        // render pie
        type: 'custom',
        renderItem: renderPie,
        encode: { itemId: 'Id' },
        datasetIndex: 1
    }
});
```

<br>

## Summary

At present I think `SOLUTION_B` is probably better. But it might offer less capability than `SOLUTION_A`. I am not sure whether there is any meaningful scenario that `SOLUTION_B` does not cover.

What's your opinion, @pissang?
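
As a reading aid, the rule list under SOLUTION_B could be sketched roughly as below. This is only a sketch under stated assumptions: `resolveTransitionMapping` and the `{ dimensions, uniqueKey }` shape are hypothetical names made up for illustration, not existing echarts APIs, and `uniqueKey` stands for the dimension name resolved from `series.encode.itemId` (or `null` if none is given). In the second example the group-by dimension is assumed to be kept in the aggregated data.

```javascript
// Hypothetical helper (not an echarts API): decide how old and new data
// should be mapped for a transition, following the SOLUTION_B rule order.
// Assumed input shape: { dimensions: string[], uniqueKey: string | null }.
function resolveTransitionMapping(oldInfo, newInfo) {
    const oldKey = oldInfo.uniqueKey;
    const newKey = newInfo.uniqueKey;

    // Rule 1: new data declares a unique key and old data has a dimension
    // with the same name.
    if (newKey != null && oldInfo.dimensions.includes(newKey)) {
        return { type: 'byDimension', from: newKey, to: newKey };
    }
    // Rule 2: old data declares a unique key and new data has a dimension
    // with the same name.
    if (oldKey != null && newInfo.dimensions.includes(oldKey)) {
        return { type: 'byDimension', from: oldKey, to: oldKey };
    }
    // Rule 3: new data declares a unique key but nothing matched, so the
    // transition is disabled rather than risking a misleading mapping.
    if (newKey != null) {
        return { type: 'none' };
    }
    // Rule 4: fall back to the existing behavior (mapping by index).
    return { type: 'byIndex' };
}

// ISSUE_III: same dimensions, new data keyed by 'Country'.
const issue3 = resolveTransitionMapping(
    { dimensions: ['Income', 'Population', 'Country'], uniqueKey: null },
    { dimensions: ['Income', 'Population', 'Country'], uniqueKey: 'Country' }
);
console.log(issue3.type, issue3.from); // byDimension Country

// ISSUE_II: grouped by 'Sex' before, grouped by 'Country' after.
const issue2 = resolveTransitionMapping(
    { dimensions: ['Population', 'Income', 'Sex'], uniqueKey: 'Sex' },
    { dimensions: ['Population', 'Income', 'Country'], uniqueKey: 'Country' }
);
console.log(issue2.type); // none
```

With no unique key on either side, the sketch falls through to `byIndex`, which matches the compatibility goal of keeping the current `List['diff']` behavior as the default.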