[ https://issues.apache.org/jira/browse/IOTDB-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jialin Qiao reopened IOTDB-842:
-------------------------------

> Better Export/Import-CSV Tool
> -----------------------------
>
>                 Key: IOTDB-842
>                 URL: https://issues.apache.org/jira/browse/IOTDB-842
>             Project: Apache IoTDB
>          Issue Type: Task
>          Components: Tools/Others
>            Reporter: Xiangdong Huang
>            Assignee: Xuan Ronaldo
>            Priority: Major
>              Labels: pull-request-available
>
> Hi, our import-csv tool is currently implemented with JDBC and accepts only
> a single legacy format, e.g.:
> {code:java}
> Time,root.sg.d1.s1,root.sg.d1.s2,root.sg.d2.s1,root.sg.d2.s2,root.sg.d2.s3
> 2020-08-18T10:22:31.603+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:35.631+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:41.093+08:00,null,null,1,2.0,null
> 2020-08-18T10:22:52.603+08:00,null,null,1,2.0,true
> {code}
>
> Requirement 1:
> Since we support three output formats (align all series, which is the
> default; align by device; and without alignment), the import-csv tool
> should accept the same three formats:
> a. Align all series:
> {code:java}
> Time,root.sg.d1.s1,root.sg.d1.s2,root.sg.d2.s1,root.sg.d2.s2,root.sg.d2.s3
> 2020-08-18T10:22:31.603+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:35.631+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:41.093+08:00,null,null,1,2.0,null
> 2020-08-18T10:22:52.603+08:00,null,null,1,2.0,true
> {code}
> b. Align by device:
> {code:java}
> Time,Device,s1,s2,s3
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,2.0,null
> 2020-08-18T10:22:35.631+08:00,root.sg.d1,1,2.0,null
> 2020-08-18T10:22:41.093+08:00,root.sg.d2,1,2.0,null
> 2020-08-18T10:22:52.603+08:00,root.sg.d2,1,2.0,true
> {code}
> c. Without alignment:
> (This format is strange; I would prefer not to support it.)
>
> Requirement 2:
> Different users may use different time formats in the first column, so we
> had better support several time formats, e.g., let users define how their
> timestamps are parsed: yyyy-MM-ddHH:mm:ss.SSS etc.
>
> Requirement 3:
> Support both NULL and an empty or blank cell to denote a null data point.
> For example, the following 3 lines are equivalent:
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,null,null
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,,
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1, ,
>
> Requirement 4:
> Support declaring the storage group name once rather than repeating it on
> every line. E.g., for format b, we can tell the tool that the sg is
> `root.sg`, and then each row looks like:
> 2020-08-18T10:22:35.631+08:00,d1,1,2.0,null
> Another option is to add a new column called storage_group to each row.
>
> For UT:
> 1. all data types should be covered;
> 2. incorrect CSV formats should be covered.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
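
To make Requirements 2 to 4 concrete, here is a minimal Java sketch of how a single row in format b might be interpreted. It is not the IoTDB implementation: the class `CsvRowSketch` and its methods `isNullCell`, `parseTime`, `devicePath`, and `parseValues` are hypothetical names introduced only for illustration. It also assumes a plain comma split with no quoted fields, and that a user-supplied time pattern carries no zone information, so an offset has to be passed in separately.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of IoTDB.
class CsvRowSketch {

    // Requirement 3: "null" in any letter case, an empty cell, and a blank cell
    // are all treated as a missing data point.
    static boolean isNullCell(String cell) {
        String trimmed = cell.trim();
        return trimmed.isEmpty() || trimmed.equalsIgnoreCase("null");
    }

    // Requirement 2: with a null pattern the text is parsed as ISO-8601 with an
    // offset; otherwise the user's pattern is applied and the resulting local
    // time is interpreted in the given offset (an assumption of this sketch).
    static long parseTime(String text, String pattern, ZoneOffset offset) {
        if (pattern == null) {
            return OffsetDateTime.parse(text).toInstant().toEpochMilli();
        }
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        return LocalDateTime.parse(text, formatter).toInstant(offset).toEpochMilli();
    }

    // Requirement 4: prepend a storage group that was declared once, if any.
    static String devicePath(String storageGroup, String device) {
        if (storageGroup == null || storageGroup.isEmpty()) {
            return device;
        }
        return storageGroup + "." + device;
    }

    // Returns the measurement cells of a format-b row (everything after the
    // Time and Device columns), with null markers mapped to Java null.
    static List<String> parseValues(String line) {
        // The -1 limit keeps trailing empty cells such as "1,,".
        String[] cells = line.split(",", -1);
        List<String> values = new ArrayList<>();
        for (int i = 2; i < cells.length; i++) {
            values.add(isNullCell(cells[i]) ? null : cells[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        String row = "2020-08-18T10:22:35.631+08:00,d1,1,,NULL";
        String[] cells = row.split(",", -1);
        System.out.println(parseTime(cells[0], null, null));
        System.out.println(devicePath("root.sg", cells[1]));
        System.out.println(parseValues(row));
    }
}
```

Under these assumptions the three rows listed in Requirement 3 yield the same value list, and a row written as `d1,...` with the storage group `root.sg` declared once resolves to the device `root.sg.d1`. A real tool would likely also need a proper CSV parser (for quoted fields) and per-column data type handling, which this sketch leaves out.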