Hi all:

I would like IoTDB to support similarity search by adding an indexing mechanism.


Similarity search is one of the most important directions in time series
research, and some users have asked for it. Unlike the existing query
conditions in IoTDB, similarity search takes a sequence as input and aims to
efficiently find a list of similar sequences in the database. The similarity
between two sequences is measured by their Euclidean distance or another
distance function. Similarity indexes are widely used to speed up similarity
search.
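
To make the distance-based definition above concrete, here is a minimal Python
sketch of similarity search without any index: a linear scan that ranks stored
sequences by Euclidean distance to the query. The function names and the sample
data are made up for illustration and are not part of IoTDB.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k_similar(query, database, k):
    """Return the k closest (distance, name) pairs, found by linear scan."""
    scored = [(euclidean(query, seq), name) for name, seq in database.items()]
    return sorted(scored)[:k]

# Hypothetical stored sequences, keyed by made-up names.
database = {
    "batch_a": [1.0, 2.0, 3.0, 4.0],
    "batch_b": [1.0, 2.0, 3.0, 5.0],
    "batch_c": [4.0, 3.0, 2.0, 1.0],
}
result = top_k_similar([1.0, 2.0, 3.0, 4.0], database, k=2)
print(result)
```

A scan like this touches every stored sequence on every query, which is the
cost a similarity index is meant to avoid.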


Below I describe two industrial scenarios. The proposed SQL and the query
result format for each are detailed in [1].


Case 1: Erythromycin Fermentation
A pharmaceutical factory has many fermenters, each of which produces
erythromycin batch by batch. For each batch, the factory monitors several
measurements during fermentation, such as glucose feeding rate (Glu),
carbon dioxide exit rate (CER), and pH value. Researchers have found that the
glucose feeding rate is critical to the final erythromycin output.
Therefore, analysts want to build a similarity index (e.g. RTree+PAA [2]) over
the Glu sequences of all fermenters and all batches. After building the index,
an analyst inputs a Glu sequence, finds the batches with similar Glu curves,
and analyzes them further.
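
As a rough illustration of the PAA part of such an index, the Python sketch
below reduces a sequence to per-segment means and computes the standard PAA
lower bound on the Euclidean distance. In an RTree+PAA setup the reduced
vectors would be the points stored in the R-tree; the R-tree itself is not
shown. The function names and the Glu values are invented for this example,
and the sketch assumes the sequence length is divisible by the segment count.

```python
import math

def paa(seq, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-size segment."""
    n = len(seq)
    if n % segments != 0:
        raise ValueError("this sketch assumes len(seq) is divisible by segments")
    size = n // segments
    return [sum(seq[i * size:(i + 1) * size]) / size for i in range(segments)]

def paa_lower_bound(a, b, segments):
    """Lower bound on the Euclidean distance between a and b, from their PAAs."""
    pa, pb = paa(a, segments), paa(b, segments)
    reduced = math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
    return math.sqrt(len(a) / segments) * reduced

# Invented Glu curves for two batches.
glu_a = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0]
glu_b = [2.0, 2.0, 5.0, 5.0, 7.0, 7.0]

reduced_a = paa(glu_a, 3)
lower = paa_lower_bound(glu_a, glu_b, 3)
exact = math.sqrt(sum((x - y) ** 2 for x, y in zip(glu_a, glu_b)))
print(reduced_a, lower, exact)
```

Because the lower bound never exceeds the exact distance, a candidate whose
lower bound is already above the search threshold can be discarded without
reading the full sequence.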


Case 2: Extreme Operating Gust
A manufacturer has many wind turbines, each of which continuously monitors the
state of the turbine itself and of the surrounding environment, such as wind
speed, wind direction, and generator power. The analyst wants to build an
index (e.g. ELB index [3]) on the speed series of a certain turbine. After
building the index, the analyst inputs an Extreme Operating Gust (EOG) pattern
and finds all matching subsequences in the speed series, for use in process
control, fault diagnosis, and predictive maintenance.
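
For comparison, here is a minimal Python sketch of the unindexed baseline for
this kind of query: slide a window over one long series and report every
offset where the window is within a distance threshold of the pattern. This is
not the ELB index from [3]. The function name, the threshold, and the data are
invented, and the three-point pattern is a toy shape, not a real EOG profile.

```python
import math

def find_matches(series, pattern, threshold):
    """Return (start, distance) for every window within threshold of pattern."""
    m = len(pattern)
    matches = []
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(window, pattern)))
        if dist <= threshold:
            matches.append((start, dist))
    return matches

# Invented speed readings containing the toy pattern twice.
speed = [5.0, 5.0, 4.0, 9.0, 4.0, 5.0, 5.0, 4.0, 9.0, 4.0]
pattern = [4.0, 9.0, 4.0]
matches = find_matches(speed, pattern, threshold=0.5)
print(matches)
```

The scan costs one distance computation per offset, so its running time grows
with the series length times the pattern length.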


Any suggestions are welcome. If this makes sense, we could add the SQL syntax
to IoTDB as the first step toward an IoTDB index mechanism.


[1] https://cwiki.apache.org/confluence/display/IOTDB/Supporting+Similarity+Search%3A+Scenarios%2C+SQL+and+Result+Format
[2] R-trees: a dynamic index structure for spatial searching
[3] Matching Consecutive Subpatterns over Streaming Time Series


Best,


康荣 Kang Rong                                                      
Ph.D. Student


School of Software, Tsinghua University, Beijing, China
(+86) 188-1030-5401
kang...@mails.tsinghua.edu.cn
