[jira] [Created] (IOTDB-5747) UDF SlidingTimeWindow slidingStep bug
Lei Rui created IOTDB-5747:
--
Summary: UDF SlidingTimeWindow slidingStep bug
Key: IOTDB-5747
URL: https://issues.apache.org/jira/browse/IOTDB-5747
Project: Apache IoTDB
Issue Type: Bug
Reporter: Lei Rui

To reproduce the bug in IoTDB v1.0.1:
```
insert into root.sg1.d1(timestamp,s1) values(1,1)
insert into root.sg1.d1(timestamp,s1) values(2,2)
insert into root.sg1.d1(timestamp,s1) values(3,3)
insert into root.sg1.d1(timestamp,s1) values(4,4)
insert into root.sg1.d1(timestamp,s1) values(5,5)
insert into root.sg1.d1(timestamp,s1) values(6,6)
insert into root.sg1.d1(timestamp,s1) values(7,7)
insert into root.sg1.d1(timestamp,s1) values(8,8)
select M4(s1,'timeInterval'='3','slidingStep'='2') from root.sg1.d1
```
The query result is:
```
+-----------------------------+---------------------------------------------------------+
|                         Time|M4(root.sg1.d1.s1, "timeInterval"="3", "slidingStep"="2")|
+-----------------------------+---------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                                      1.0|
|1970-01-01T08:00:00.003+08:00|                                                      3.0|
+-----------------------------+---------------------------------------------------------+
Total line number = 2
```
which is wrong, as the sliding time windows and the M4 samples of each window should be:
- [1,4): samples (1,1), (3,3)
- [3,6): samples (3,3), (5,5)
- [5,8): samples (5,5), (7,7)
- [7,10): samples (7,7), (8,8)

From my observation, the bug tends to happen when slidingStep equals timeInterval-1. I think there are bugs in the sliding time window code, but I couldn't locate them because I am not familiar with the UDF module.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
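The expected windows and samples listed in IOTDB-5747 can be checked with a small standalone sketch. It is not IoTDB's sliding time window code: the class and method names are hypothetical, the data is assumed to be sorted by time, and M4 is taken to mean the first, last, bottom (minimum value) and top (maximum value) points of each window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration, not the IoTDB UDF implementation.
class SlidingWindowM4Sketch {
  /** Returns the [start, end) bounds of every window whose start lies in [begin, end). */
  static List<long[]> windows(long begin, long end, long timeInterval, long slidingStep) {
    List<long[]> result = new ArrayList<>();
    for (long start = begin; start < end; start += slidingStep) {
      result.add(new long[] {start, start + timeInterval});
    }
    return result;
  }

  /** Returns the distinct first/last/bottom/top points in [start, end), in time order. */
  static List<String> m4(long[] times, double[] values, long start, long end) {
    int first = -1, last = -1, bottom = -1, top = -1;
    for (int i = 0; i < times.length; i++) {
      if (times[i] < start || times[i] >= end) continue;
      if (first < 0) first = i;
      last = i;
      if (bottom < 0 || values[i] < values[bottom]) bottom = i;
      if (top < 0 || values[i] > values[top]) top = i;
    }
    List<String> samples = new ArrayList<>();
    if (first < 0) return samples; // an empty window yields no samples
    // A sorted set removes duplicates (e.g. when the first point is also the bottom point).
    for (int i : new TreeSet<>(Arrays.asList(first, last, bottom, top))) {
      samples.add("(" + times[i] + "," + values[i] + ")");
    }
    return samples;
  }

  public static void main(String[] args) {
    long[] times = {1, 2, 3, 4, 5, 6, 7, 8};
    double[] values = {1, 2, 3, 4, 5, 6, 7, 8};
    // timeInterval = 3, slidingStep = 2, data covers [1, 9)
    for (long[] w : windows(1, 9, 3, 2)) {
      System.out.println("[" + w[0] + "," + w[1] + "): " + m4(times, values, w[0], w[1]));
    }
  }
}
```

With the eight inserted points this produces four windows, [1,4), [3,6), [5,8) and [7,10), which matches the list in the report rather than the two rows the query returned.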
[jira] [Created] (IOTDB-5433) IndexOutOfBoundsException: ExactOrderStatistics constructs an empty ArrayList and then sets it
Lei Rui created IOTDB-5433:
--
Summary: IndexOutOfBoundsException: ExactOrderStatistics constructs an empty ArrayList and then sets it
Key: IOTDB-5433
URL: https://issues.apache.org/jira/browse/IOTDB-5433
Project: Apache IoTDB
Issue Type: Bug
Reporter: Lei Rui

For example:
{code:java}
public static double getMad(LongArrayList nums) {
  if (nums.isEmpty()) {
    throw new NoSuchElementException();
  } else {
    double median = getMedian(nums);
    DoubleArrayList dal = new DoubleArrayList();
    for (int i = 0; i < nums.size(); ++i) {
      dal.set(i, Math.abs(nums.get(i) - median));
    }
    return getMedian(dal);
  }
}
{code}
"DoubleArrayList dal = new DoubleArrayList();" constructs an empty list, so dal.set(...) throws IndexOutOfBoundsException.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
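One way to avoid the exception described in IOTDB-5433 is to append to the new list instead of calling set on it. The sketch below shows the idea with java.util.ArrayList as a stand-in, since an empty ArrayList rejects set(0, ...) in the same way. MadSketch, median and mad are hypothetical names, and this is not the actual patch to ExactOrderStatistics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in using java.util collections, not the IoTDB classes.
class MadSketch {
  /** Median of a non-empty list, computed on a sorted copy. */
  static double median(List<Double> values) {
    List<Double> sorted = new ArrayList<>(values);
    Collections.sort(sorted);
    int n = sorted.size();
    return n % 2 == 1 ? sorted.get(n / 2) : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
  }

  /** Median absolute deviation of the input. */
  static double mad(List<Long> nums) {
    if (nums.isEmpty()) {
      throw new NoSuchElementException();
    }
    List<Double> asDouble = new ArrayList<>(nums.size());
    for (long v : nums) {
      asDouble.add((double) v);
    }
    double median = median(asDouble);
    List<Double> deviations = new ArrayList<>(nums.size());
    for (double v : asDouble) {
      // add() appends to the list; set() on an empty list would throw IndexOutOfBoundsException.
      deviations.add(Math.abs(v - median));
    }
    return median(deviations);
  }

  public static void main(String[] args) {
    List<Long> nums = new ArrayList<>();
    Collections.addAll(nums, 1L, 2L, 3L, 4L, 100L);
    System.out.println(mad(nums));
  }
}
```

Passing an initial capacity to the constructor only reserves memory; the list size is still 0, so the switch from set to add is the part that matters.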
[jira] [Assigned] (IOTDB-5421) Add sampling attributes for M4
[ https://issues.apache.org/jira/browse/IOTDB-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-5421:
--
Assignee: Lei Rui

> Add sampling attributes for M4
> --
>
> Key: IOTDB-5421
> URL: https://issues.apache.org/jira/browse/IOTDB-5421
> Project: Apache IoTDB
> Issue Type: New Feature
> Reporter: Lei Rui
> Assignee: Lei Rui
> Priority: Major
>
> Previously, the M4 function supported two types of attribute inputs (in effect, two types of sliding windows):
> (1) A sliding *size* window, controlled by the attributes {{windowSize}} and {{slidingStep}}. For example: {{select M4(s1,'windowSize'='10','slidingStep'='10') as samples from root.vehicle.d1}}.
> (2) A sliding *time* window, controlled by the attributes {{windowInterval}}, {{slidingStep}}, {{displayWindowBegin}} and {{displayWindowEnd}}. For example: {{select M4(s1,'windowInterval'='25','slidingStep'='25','displayWindowBegin'='0','displayWindowEnd'='100') as samples from root.vehicle.d1}}.
> As proposed in a real use case (ZhongHe&DWF), the user wants to control M4 behavior using the following sampling attributes: {{samplingInterval}}, {{samplingThreshold}}, {{displayWindowBegin}}, {{displayWindowEnd}}. For example: {{select M4(s1,'samplingInterval'='5','samplingThreshold'='100','displayWindowBegin'='0','displayWindowEnd'='150') as samples from root.vehicle.d1}}.
> * {{samplingInterval}}: The length of the sampling time interval. Long data type. *Required*.
> * {{samplingThreshold}}: The upper limit of the number of sampling points. Long data type. Optional. Defaults to 1 if not set.
> * {{displayWindowBegin}}: The start of the window (inclusive). Long data type. *Required*.
> * {{displayWindowEnd}}: The end time limit (exclusive, essentially playing the same role as {{WHERE time < displayWindowEnd}}). Long data type. *Required*.
> The user-defined sampling time window is a special kind of sliding time window, in that:
> # There is a conversion relationship between the length of the sliding time window {{windowInterval}} and the sampling time interval {{samplingInterval}}. Note that here the user controls the window length {{windowInterval}} only *indirectly*.
> # The sliding step {{slidingStep}} is fixed to be equal to the window length {{windowInterval}}, so the user does not need to supply the {{slidingStep}} parameter.
> # {{displayWindowBegin}} and {{displayWindowEnd}} are required parameters here.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IOTDB-5421) Add sampling attributes for M4
Lei Rui created IOTDB-5421:
--
Summary: Add sampling attributes for M4
Key: IOTDB-5421
URL: https://issues.apache.org/jira/browse/IOTDB-5421
Project: Apache IoTDB
Issue Type: New Feature
Reporter: Lei Rui

Previously, the M4 function supported two types of attribute inputs (in effect, two types of sliding windows):
(1) A sliding *size* window, controlled by the attributes {{windowSize}} and {{slidingStep}}. For example: {{select M4(s1,'windowSize'='10','slidingStep'='10') as samples from root.vehicle.d1}}.
(2) A sliding *time* window, controlled by the attributes {{windowInterval}}, {{slidingStep}}, {{displayWindowBegin}} and {{displayWindowEnd}}. For example: {{select M4(s1,'windowInterval'='25','slidingStep'='25','displayWindowBegin'='0','displayWindowEnd'='100') as samples from root.vehicle.d1}}.

As proposed in a real use case (ZhongHe&DWF), the user wants to control M4 behavior using the following sampling attributes: {{samplingInterval}}, {{samplingThreshold}}, {{displayWindowBegin}}, {{displayWindowEnd}}. For example: {{select M4(s1,'samplingInterval'='5','samplingThreshold'='100','displayWindowBegin'='0','displayWindowEnd'='150') as samples from root.vehicle.d1}}.
* {{samplingInterval}}: The length of the sampling time interval. Long data type. *Required*.
* {{samplingThreshold}}: The upper limit of the number of sampling points. Long data type. Optional. Defaults to 1 if not set.
* {{displayWindowBegin}}: The start of the window (inclusive). Long data type. *Required*.
* {{displayWindowEnd}}: The end time limit (exclusive, essentially playing the same role as {{WHERE time < displayWindowEnd}}). Long data type. *Required*.

The user-defined sampling time window is a special kind of sliding time window, in that:
# There is a conversion relationship between the length of the sliding time window {{windowInterval}} and the sampling time interval {{samplingInterval}}. Note that here the user controls the window length {{windowInterval}} only *indirectly*.
# The sliding step {{slidingStep}} is fixed to be equal to the window length {{windowInterval}}, so the user does not need to supply the {{slidingStep}} parameter.
# {{displayWindowBegin}} and {{displayWindowEnd}} are required parameters here.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IOTDB-5316) Session.setFetchSize is not used in the following fetch requests
[ https://issues.apache.org/jira/browse/IOTDB-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-5316:
--
Assignee: Lei Rui

> Session.setFetchSize is not used in the following fetch requests
> --
>
> Key: IOTDB-5316
> URL: https://issues.apache.org/jira/browse/IOTDB-5316
> Project: Apache IoTDB
> Issue Type: Bug
> Reporter: Lei Rui
> Assignee: Lei Rui
> Priority: Major
>
> Session.setFetchSize applies only to the first query execution request; the subsequent fetch requests use the default fetch size of 5000.
> The reason is that the SessionDataSet constructor call in `SessionConnection.executeQueryStatement` passes `SessionConfig.DEFAULT_FETCH_SIZE` instead of the actual `session.fetchSize`.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IOTDB-5316) Session.setFetchSize is not used in the following fetch requests
Lei Rui created IOTDB-5316:
--
Summary: Session.setFetchSize is not used in the following fetch requests
Key: IOTDB-5316
URL: https://issues.apache.org/jira/browse/IOTDB-5316
Project: Apache IoTDB
Issue Type: Bug
Reporter: Lei Rui

Session.setFetchSize applies only to the first query execution request; the subsequent fetch requests use the default fetch size of 5000.
The reason is that the SessionDataSet constructor call in `SessionConnection.executeQueryStatement` passes `SessionConfig.DEFAULT_FETCH_SIZE` instead of the actual `session.fetchSize`.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
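The pattern reported in IOTDB-5316 can be reduced to a few lines. The sketch below is a simplified model, not the IoTDB session code: FetchSizeSketch, Session, DataSet and both executeQuery methods are hypothetical stand-ins, and only the constant value 5000 comes from the report.

```java
// Hypothetical model of the reported behaviour, not the real Session/SessionDataSet classes.
class FetchSizeSketch {
  static final int DEFAULT_FETCH_SIZE = 5000;

  /** Stand-in for a result set that remembers which fetch size to use for follow-up fetches. */
  static final class DataSet {
    final int fetchSize;

    DataSet(int fetchSize) {
      this.fetchSize = fetchSize;
    }
  }

  /** Stand-in for a session with a user-configurable fetch size. */
  static final class Session {
    private int fetchSize = DEFAULT_FETCH_SIZE;

    void setFetchSize(int fetchSize) {
      this.fetchSize = fetchSize;
    }

    /** Reported behaviour: the result set is built with the default, ignoring the setter. */
    DataSet executeQueryBuggy() {
      return new DataSet(DEFAULT_FETCH_SIZE);
    }

    /** One possible fix: hand the configured value to the result set. */
    DataSet executeQueryFixed() {
      return new DataSet(fetchSize);
    }
  }

  public static void main(String[] args) {
    Session session = new Session();
    session.setFetchSize(100);
    System.out.println(session.executeQueryBuggy().fetchSize);
    System.out.println(session.executeQueryFixed().fetchSize);
  }
}
```

In this model the follow-up fetches read the size from the result set, which is why a value set on the session never reaches them unless it is passed through at construction time.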
[jira] [Created] (IOTDB-5187) Import-csv tool runs very slowly
Lei Rui created IOTDB-5187:
--
Summary: Import-csv tool runs very slowly
Key: IOTDB-5187
URL: https://issues.apache.org/jira/browse/IOTDB-5187
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Lei Rui

import-csv needed far more than 10 minutes to import a CSV file containing one time series with 12780287 points (I killed the import process after waiting 10 minutes, so the import never finished).
* CommitID used: e68f560463c7c31ef0bb0fbaf74335cf6386ff5a (Mon Dec 12 15:15:48 2022)
* Import-csv command: ~/iotdb/cli/target/iotdb-cli-1.0.1-SNAPSHOT/tools$ ./import-csv.sh -f ~/ZT11529.csv -h 127.0.0.1 -p 6667 -u root -pw root
* The ZT11529.csv file looks like this:
{code:java}
Time,root.group_69.`1701`.ZT11529
1591717867194,68.11
1591717867705,68.2
1591717868201,68.2
1591717868711,68.11
1591717869202,68.11
1591717869712,68.11
1591717870201,68.11
1591717870724,68.3
1591717871209,68.3
1591717871713,68.3
1591717872202,68.3
1591717872711,68.2
1591717873200,68.2
1591717873714,68.3
1591717874194,68.3
1591717874715,68.11
1591717875200,68.11
1591717875710,68.11
1591717876199,68.11
1591717876729,68.0
1591717877259,68.3
1591717877709,68.3
1591717878208,68.3
1591717878708,68.3
1591717879197,68.3
1591717879708,68.3
1591717880202,68.3
1591717880716,68.3
1591717881198,68.3
1591717881712,68.3
1591717882201,68.2
1591717882723,68.2
1591717883209,68.2
1591717883709,68.2
1591717884199,68.3
..
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IOTDB-5119) [M4]There is an exception in the M4 function query
[ https://issues.apache.org/jira/browse/IOTDB-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-5119:
--
Assignee: Lei Rui

> [M4]There is an exception in the M4 function query
> --
>
> Key: IOTDB-5119
> URL: https://issues.apache.org/jira/browse/IOTDB-5119
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Query, mpp-cluster
> Affects Versions: master branch, 1.0.0
> Reporter: xiaozhihong
> Assignee: Lei Rui
> Priority: Major
> Attachments: image-2022-12-05-17-15-28-373.png
>
> In V1.0, I started 3C3D, entered the CLI, and executed an M4 query, which reported an exception:
> {code:java}
> IoTDB> select temperature from root.ln.wf01.wt01;
> +-----------------------------+-----------------------------+
> |                         Time|root.ln.wf01.wt01.temperature|
> +-----------------------------+-----------------------------+
> |2017-11-05T10:20:00.000+08:00|                     25.57756|
> |2017-11-05T10:21:00.000+08:00|                     23.97946|
> |2017-11-05T10:22:00.000+08:00|                    22.720444|
> +-----------------------------+-----------------------------+
> Total line number = 3
> It costs 0.007s
> IoTDB> select M4(temperature,'timeInterval'='1000','displayWindowBegin'='1','displayWindowEnd'='100') from root.ln.wf01.wt01;
> Msg: 301: Error occurred during executing UDTF#transform(RowWindow, PointCollector):
> java.lang.IndexOutOfBoundsException: Size is 0{code}
> The log is:
> !image-2022-12-05-17-15-28-373.png|width=542,height=225!

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IOTDB-5119) [M4]There is an exception in the M4 function query
[ https://issues.apache.org/jira/browse/IOTDB-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-5119:
--
Assignee: (was: Lei Rui)

> [M4]There is an exception in the M4 function query
> --
>
> Key: IOTDB-5119
> URL: https://issues.apache.org/jira/browse/IOTDB-5119
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Query, mpp-cluster
> Affects Versions: master branch, 1.0.0
> Reporter: xiaozhihong
> Priority: Major
> Attachments: image-2022-12-05-17-15-28-373.png
>
> In V1.0, I started 3C3D, entered the CLI, and executed an M4 query, which reported an exception:
> {code:java}
> IoTDB> select temperature from root.ln.wf01.wt01;
> +-----------------------------+-----------------------------+
> |                         Time|root.ln.wf01.wt01.temperature|
> +-----------------------------+-----------------------------+
> |2017-11-05T10:20:00.000+08:00|                     25.57756|
> |2017-11-05T10:21:00.000+08:00|                     23.97946|
> |2017-11-05T10:22:00.000+08:00|                    22.720444|
> +-----------------------------+-----------------------------+
> Total line number = 3
> It costs 0.007s
> IoTDB> select M4(temperature,'timeInterval'='1000','displayWindowBegin'='1','displayWindowEnd'='100') from root.ln.wf01.wt01;
> Msg: 301: Error occurred during executing UDTF#transform(RowWindow, PointCollector):
> java.lang.IndexOutOfBoundsException: Size is 0{code}
> The log is:
> !image-2022-12-05-17-15-28-373.png|width=542,height=225!

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IOTDB-5119) [M4]There is an exception in the M4 function query
[ https://issues.apache.org/jira/browse/IOTDB-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-5119:
--
Assignee: Lei Rui

> [M4]There is an exception in the M4 function query
> --
>
> Key: IOTDB-5119
> URL: https://issues.apache.org/jira/browse/IOTDB-5119
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Query, mpp-cluster
> Affects Versions: master branch, 1.0.0
> Reporter: xiaozhihong
> Assignee: Lei Rui
> Priority: Major
> Attachments: image-2022-12-05-17-15-28-373.png
>
> In V1.0, I started 3C3D, entered the CLI, and executed an M4 query, which reported an exception:
> {code:java}
> IoTDB> select temperature from root.ln.wf01.wt01;
> +-----------------------------+-----------------------------+
> |                         Time|root.ln.wf01.wt01.temperature|
> +-----------------------------+-----------------------------+
> |2017-11-05T10:20:00.000+08:00|                     25.57756|
> |2017-11-05T10:21:00.000+08:00|                     23.97946|
> |2017-11-05T10:22:00.000+08:00|                    22.720444|
> +-----------------------------+-----------------------------+
> Total line number = 3
> It costs 0.007s
> IoTDB> select M4(temperature,'timeInterval'='1000','displayWindowBegin'='1','displayWindowEnd'='100') from root.ln.wf01.wt01;
> Msg: 301: Error occurred during executing UDTF#transform(RowWindow, PointCollector):
> java.lang.IndexOutOfBoundsException: Size is 0{code}
> The log is:
> !image-2022-12-05-17-15-28-373.png|width=542,height=225!

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IOTDB-4994) Micrometer and DropWizard behave differently
Lei Rui created IOTDB-4994:
--
Summary: Micrometer and DropWizard behave differently
Key: IOTDB-4994
URL: https://issues.apache.org/jira/browse/IOTDB-4994
Project: Apache IoTDB
Issue Type: Wish
Reporter: Lei Rui

Micrometer: The value appears in only one window.
DropWizard: The value is retained in every window.

Here is an example of Micrometer:
{code:java}
IoTDB> select DCP_SeriesScanOperator_hasNext_count.`name=DCP_A_GET_CHUNK_METADATAS`.value as cnt from root.__system.metric.`0.0.0.0:6667`
+-----------------------------+---+
|                         Time|cnt|
+-----------------------------+---+
|2022-11-19T20:12:09.306+08:00|2.0|
|2022-11-19T20:12:24.305+08:00|0.0|
|2022-11-19T20:12:39.306+08:00|0.0|
|2022-11-19T20:12:54.306+08:00|0.0|
+-----------------------------+---+
Total line number = 4
It costs 0.022s
IoTDB> select sum(DCP_SeriesScanOperator_hasNext_count.`name=DCP_A_GET_CHUNK_METADATAS`.value) as sum_cnt from root.__system.metric.`0.0.0.0:6667`
+-------+
|sum_cnt|
+-------+
|    2.0|
+-------+
Total line number = 1
It costs 0.320s
{code}
Here is an example of DropWizard:
{code:java}
IoTDB> select `dropwizard:DCP_SeriesScanOperator_hasNext_count`.`name=DCP_A_GET_CHUNK_METADATAS`.value as cnt from root.__system.metric.`0.0.0.0:6667`
+-----------------------------+---+
|                         Time|cnt|
+-----------------------------+---+
|2022-11-19T20:09:17.090+08:00|  2|
|2022-11-19T20:09:31.881+08:00|  2|
|2022-11-19T20:09:46.847+08:00|  2|
|2022-11-19T20:10:01.867+08:00|  2|
|2022-11-19T20:10:16.868+08:00|  2|
|2022-11-19T20:10:31.852+08:00|  2|
+-----------------------------+---+
Total line number = 6
It costs 0.035s
IoTDB> select sum(`dropwizard:DCP_SeriesScanOperator_hasNext_count`.`name=DCP_A_GET_CHUNK_METADATAS`.value) as sum_cnt from root.__system.metric.`0.0.0.0:6667`
+-------+
|sum_cnt|
+-------+
|   18.0|
+-------+
Total line number = 1
It costs 0.010s
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
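One reading of the two outputs in IOTDB-4994 is that the DropWizard series holds a cumulative count while the Micrometer series holds the increment per reporting interval. That interpretation is an assumption, not something the report states. Under it, the two views are related by a simple difference, which the following sketch illustrates with hypothetical names.

```java
import java.util.Arrays;

// Hypothetical illustration of cumulative versus per-interval counter readings.
class CounterViewSketch {
  /** Converts cumulative readings into per-interval increments (the first interval starts from 0). */
  static long[] toDeltas(long[] cumulative) {
    long[] deltas = new long[cumulative.length];
    long previous = 0;
    for (int i = 0; i < cumulative.length; i++) {
      deltas[i] = cumulative[i] - previous;
      previous = cumulative[i];
    }
    return deltas;
  }

  /** Sums all readings, as sum(...) does in the queries of the report. */
  static long sum(long[] readings) {
    long total = 0;
    for (long r : readings) {
      total += r;
    }
    return total;
  }

  public static void main(String[] args) {
    long[] cumulative = {2, 2, 2, 2};
    long[] deltas = toDeltas(cumulative);
    System.out.println(Arrays.toString(deltas));
    System.out.println(sum(deltas));     // equals the true event count
    System.out.println(sum(cumulative)); // grows with every extra reporting interval
  }
}
```

Summing increments gives the number of events, whereas summing cumulative readings depends on how many intervals were reported, which would explain why sum_cnt differs between the two registries.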
[jira] [Assigned] (IOTDB-4894) TsFileSketchTool prints only the first page info when there are multiple pages in a chunk
[ https://issues.apache.org/jira/browse/IOTDB-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-4894:
--
Assignee: Lei Rui

> TsFileSketchTool prints only the first page info when there are multiple pages in a chunk
> -
>
> Key: IOTDB-4894
> URL: https://issues.apache.org/jira/browse/IOTDB-4894
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Server
> Reporter: Lei Rui
> Assignee: Lei Rui
> Priority: Minor
>
> An example is as follows. This tsfile has one chunk with 100 pages, but the sketch tool prints only the first page info: "[page] CompressedSize:1887, UncompressedSize:2606".
> {code:java}
> TsFile Sketch
> file path: D:\github\1667985904766-1-0-0.tsfile
> file length: 205798
>
> POSITION| CONTENT
>      ---
>        0| [magic head] TsFile
>        6| [version number] 3
>         | [Chunk Group] of root.sg1.d1, num of Chunks:1
>        7| [Chunk Group Header]
>         |   [marker] 0
>         |   [deviceID] root.sg1.d1
>       20| [Chunk] of s1, numOfPoints:10, time range:[1591717867194,1591768035853], tsDataType:DOUBLE,
>           startTime: 1591717867194 endTime: 1591768035853 count: 10 [minValue:0.0,maxValue:88.5,firstValue:68.11,lastValue:64.2,sumValue:7098754.5999]
>         |   [chunk header] marker=1, measurementId=s1, dataSize=205580, serializedSize=10
>         |   [chunk] java.nio.HeapByteBuffer[pos=0 lim=205580 cap=205580]
>         |   [page] CompressedSize:1887, UncompressedSize:2606
>         | [Chunk Group] of root.sg1.d1 ends
>   205627| [marker] 2
>   205628| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:DOUBLE
>         |   [ChunkIndex] s1, offset=20
>         |   [startTime: 1591717867194 endTime: 1591768035853 count: 10 [minValue:0.0,maxValue:88.5,firstValue:68.11,lastValue:64.2,sumValue:7098754.5999]]
>         |
>   205701| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT
>         |
>         |
>   205722| [TsFileMetadata]
>         |   [meta offset] 205627
>         |   [num of devices] 1
>         |   1 key&TsMetadataIndex
>         |   [bloom filter bit vector byte array length] 24
>         |   [bloom filter bit vector byte array]
>         |   [bloom filter number of bits] 256
>         |   [bloom filter number of hash functions] 5
>   205788| [TsFileMetadataSize] 66
>   205792| [magic tail] TsFile
>   205798| END of TsFile
>
> IndexOfTimerseriesIndex Tree
> -
> [MetadataIndex:LEAF_DEVICE]
> └──[root.sg1.d1,205701]
> [MetadataIndex:LEAF_MEASUREMENT]
> └──[s1,205628]
> -- TsFile Sketch End --
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IOTDB-4894) TsFileSketchTool prints only the first page info when there are multiple pages in a chunk
Lei Rui created IOTDB-4894:
--
Summary: TsFileSketchTool prints only the first page info when there are multiple pages in a chunk
Key: IOTDB-4894
URL: https://issues.apache.org/jira/browse/IOTDB-4894
Project: Apache IoTDB
Issue Type: Bug
Components: Core/Server
Reporter: Lei Rui

An example is as follows. This tsfile has one chunk with 100 pages, but the sketch tool prints only the first page info: "[page] CompressedSize:1887, UncompressedSize:2606".
{code:java}
TsFile Sketch
file path: D:\github\1667985904766-1-0-0.tsfile
file length: 205798

POSITION| CONTENT
     ---
       0| [magic head] TsFile
       6| [version number] 3
        | [Chunk Group] of root.sg1.d1, num of Chunks:1
       7| [Chunk Group Header]
        |   [marker] 0
        |   [deviceID] root.sg1.d1
      20| [Chunk] of s1, numOfPoints:10, time range:[1591717867194,1591768035853], tsDataType:DOUBLE,
          startTime: 1591717867194 endTime: 1591768035853 count: 10 [minValue:0.0,maxValue:88.5,firstValue:68.11,lastValue:64.2,sumValue:7098754.5999]
        |   [chunk header] marker=1, measurementId=s1, dataSize=205580, serializedSize=10
        |   [chunk] java.nio.HeapByteBuffer[pos=0 lim=205580 cap=205580]
        |   [page] CompressedSize:1887, UncompressedSize:2606
        | [Chunk Group] of root.sg1.d1 ends
  205627| [marker] 2
  205628| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:DOUBLE
        |   [ChunkIndex] s1, offset=20
        |   [startTime: 1591717867194 endTime: 1591768035853 count: 10 [minValue:0.0,maxValue:88.5,firstValue:68.11,lastValue:64.2,sumValue:7098754.5999]]
        |
  205701| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT
        |
        |
  205722| [TsFileMetadata]
        |   [meta offset] 205627
        |   [num of devices] 1
        |   1 key&TsMetadataIndex
        |   [bloom filter bit vector byte array length] 24
        |   [bloom filter bit vector byte array]
        |   [bloom filter number of bits] 256
        |   [bloom filter number of hash functions] 5
  205788| [TsFileMetadataSize] 66
  205792| [magic tail] TsFile
  205798| END of TsFile

IndexOfTimerseriesIndex Tree
-
[MetadataIndex:LEAF_DEVICE]
└──[root.sg1.d1,205701]
[MetadataIndex:LEAF_MEASUREMENT]
└──[s1,205628]
-- TsFile Sketch End --
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IOTDB-4810) print-tsfile-sketch.bat went wrong when reading measurementID with Chinese characters
[ https://issues.apache.org/jira/browse/IOTDB-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Rui reassigned IOTDB-4810:
--
Assignee: Lei Rui

> print-tsfile-sketch.bat went wrong when reading measurementID with Chinese characters
> -
>
> Key: IOTDB-4810
> URL: https://issues.apache.org/jira/browse/IOTDB-4810
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Server
> Reporter: Lei Rui
> Assignee: Lei Rui
> Priority: Minor
> Fix For: master branch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> `print-tsfile-sketch.bat` went wrong when reading a measurementID with Chinese characters.
> Specifically, it is the following code that does not return correct results:
> {code:java}
> int measurementIdLength = measurementID.getBytes(TSFileConfig.STRING_CHARSET).length; {code}
> For example, if measurementID="电机绕组温度1", the correct measurementIdLength is 20, while `print-tsfile-sketch.bat` assigns measurementIdLength as 30, which is wrong.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IOTDB-4810) print-tsfile-sketch.bat went wrong when reading measurementID with Chinese characters
Lei Rui created IOTDB-4810:
--
Summary: print-tsfile-sketch.bat went wrong when reading measurementID with Chinese characters
Key: IOTDB-4810
URL: https://issues.apache.org/jira/browse/IOTDB-4810
Project: Apache IoTDB
Issue Type: Bug
Components: Core/Server
Reporter: Lei Rui
Fix For: master branch

`print-tsfile-sketch.bat` went wrong when reading a measurementID with Chinese characters.
Specifically, it is the following code that does not return correct results:
{code:java}
int measurementIdLength = measurementID.getBytes(TSFileConfig.STRING_CHARSET).length; {code}
For example, if measurementID="电机绕组温度1", the correct measurementIdLength is 20, while `print-tsfile-sketch.bat` assigns measurementIdLength as 30, which is wrong.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
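IOTDB-4810 does not state the root cause. One common way for a byte length to grow is a charset mismatch: bytes written in one charset are decoded with another and then re-encoded. The sketch below demonstrates that mechanism only, as an assumption about the kind of failure involved. CharsetLengthSketch and its methods are hypothetical names, and the test string is a shorter example than the one in the report.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of how a charset mismatch changes a byte length.
class CharsetLengthSketch {
  /** Number of bytes the string occupies when encoded with the given charset. */
  static int encodedLength(String s, Charset charset) {
    return s.getBytes(charset).length;
  }

  /** Encodes as UTF-8, decodes those bytes with another charset, then re-encodes as UTF-8. */
  static int lengthAfterWrongDecode(String s, Charset wrongCharset) {
    byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
    String misread = new String(utf8, wrongCharset);
    return misread.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    // Two CJK characters followed by the ASCII digit 1, written with Unicode escapes
    // so the source compiles regardless of the file encoding.
    String id = "\u6e29\u5ea61";
    System.out.println(encodedLength(id, StandardCharsets.UTF_8));               // 7
    System.out.println(encodedLength(id, StandardCharsets.UTF_16BE));            // 6
    System.out.println(lengthAfterWrongDecode(id, StandardCharsets.ISO_8859_1)); // 13
  }
}
```

Each CJK character takes three bytes in UTF-8; misreading those bytes as ISO-8859-1 turns every one of them into a separate character that needs two bytes when re-encoded, so the length nearly doubles. Whether the .bat script hits this exact path is not confirmed by the report.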
[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query
[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465573#comment-17465573 ]

Lei Rui commented on IOTDB-2091:
--
Thank you for your explanations. I pictured `s1+sum(s1)` being used without a group-by-time clause, just `select s1+sum(s1) from root.sg.d1 where ...`, because I assume a group-by-time clause has to be used together with an aggregation in the select clause. `select s1 from root.sg.d1 group by ([1000,7000),3)` is not allowed grammatically.

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine
> Reporter: Eric Pai
> Assignee: Eric Pai
> Priority: Major
> Labels: features
> Fix For: master branch
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we can't run queries like these:
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.* GROUP BY LEVEL=1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s);
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> The goal of this task is to implement the above queries by reusing the existing calculation logic.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query
[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465559#comment-17465559 ]

Lei Rui commented on IOTDB-2091:
--
# Subquery... yeah, I think it would be possible in the future to let the user explicitly define sum(s1) as a constant using a subquery.
# No, I don't have any use case that requires calculating raw data mixed with aggregations. It's just a thought. But according to your discussion, I think it is fine to leave s0+sum(s1) unsupported for now.

> "For me it's a little confused that when I execute a GROUP BY query, I get the dataset with the timestamps of original timeseries, not the timestamps of grouped windows. If we support group by query with overlapped windows in the future (Of course it's forbidden currently as we should let slide_step >= interval, but it doesn't mean that overlapped window aggregations are meaningless.), what should the result be in the overlapped timestamps?"

Sorry, I didn't get the point of the above paragraph. Did you raise another question about overlapped window aggregation? And what does "when I execute a GROUP BY query, I get the dataset with the timestamps of original timeseries, not the timestamps of grouped windows" mean? As far as I know, a GROUP BY time query returns the timestamps of the grouped windows, not the timestamps of the original timeseries.

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine
> Reporter: Eric Pai
> Assignee: Eric Pai
> Priority: Major
> Labels: features
> Fix For: master branch
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we can't run queries like these:
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.* GROUP BY LEVEL=1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s);
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> The goal of this task is to implement the above queries by reusing the existing calculation logic.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query
[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465219#comment-17465219 ]

Lei Rui commented on IOTDB-2091:
--
(1) Some thoughts about case 3.

The well-implemented IoTDB native aggregations (count, sum, avg, max_value, min_value, first_value, last_value, min_time, max_time) utilize the precomputed statistics of the single input time series. An aggregation over multiple input time series loses this advantage whenever it cannot be calculated by first aggregating each time series on its own.
* For example, max_value(s1+s2) doesn't equal max_value(s1)+max_value(s2), so max_value(s1+s2) has to add up s1 and s2 point by point and cannot utilize the precomputed statistics of s1 or s2.

If a customer really needs a function like max_value(s1+s2), I guess they can just write a UDF, for example a function named udf_max_value_after_add(s1,s2), to achieve their goal.

So, if we explicitly restrict IoTDB native aggregations to accept only one input time series and let UDFs take over the responsibility of supporting functions like max_value(s1+s2) by implementing udf_max_value_after_add(s1,s2), then I think case 3 can drop off the agenda.

(2) Question about s0+sum(s1)
# *s0+s1*: UDTF
# *s0+1*: UDTF
# *s0+sum(s1)*: Treated as UDAF and reports the error: "Common queries and aggregated queries are not allowed to appear at the same time."
# *sum(s0)+sum(s1)*: UDAF

My intuition is that sum(s1) is a constant and therefore 3 should be treated as UDTF, just like 2. I'm not saying this is a bug; I just think there is some room for discussion here. Please let me know if this question has been discussed before.

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine
> Reporter: Eric Pai
> Assignee: Eric Pai
> Priority: Major
> Labels: features
> Fix For: master branch
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we can't run queries like these:
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.* GROUP BY LEVEL=1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s);
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> The goal of this task is to implement the above queries by reusing the existing calculation logic.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
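The point in the comment above, that max_value(s1+s2) cannot be derived from the per-series maxima, can be seen with a three-point counterexample. The sketch assumes both series share the same timestamps and uses plain arrays. MaxOfSumSketch and its methods are hypothetical and unrelated to IoTDB's aggregation code.

```java
// Hypothetical counterexample for max_value(s1+s2) versus max_value(s1)+max_value(s2).
class MaxOfSumSketch {
  /** Maximum of a non-empty array; this is what a precomputed per-series statistic provides. */
  static double max(double[] xs) {
    double best = xs[0];
    for (double x : xs) {
      best = Math.max(best, x);
    }
    return best;
  }

  /** Maximum of the point-by-point sum of two aligned series of equal length. */
  static double maxOfSum(double[] s1, double[] s2) {
    double best = s1[0] + s2[0];
    for (int i = 1; i < s1.length; i++) {
      best = Math.max(best, s1[i] + s2[i]);
    }
    return best;
  }

  public static void main(String[] args) {
    double[] s1 = {1, 5, 2};
    double[] s2 = {4, 0, 3};
    System.out.println(max(s1) + max(s2)); // 9.0, combining the two per-series maxima
    System.out.println(maxOfSum(s1, s2));  // 5.0, the true maximum of s1+s2
  }
}
```

The per-series maxima occur at different timestamps, so their sum only gives an upper bound; the exact answer needs a pass over the raw points, which is the cost the comment attributes to case 3.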
[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query
[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459694#comment-17459694 ]

Lei Rui commented on IOTDB-2091:
--
[~ericpai] 👍

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine
> Reporter: Eric Pai
> Assignee: Eric Pai
> Priority: Major
> Labels: features
> Fix For: master branch
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we can't run queries like these:
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.* GROUP BY LEVEL=1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s);
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> The goal of this task is to implement the above queries by reusing the existing calculation logic.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query
[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459275#comment-17459275 ]

Lei Rui commented on IOTDB-2091:
--
> Currently we have already support udf nested query , i.e. f(g(a)).

|| ||f||g(a)||f(g(a))||example||supported||
|1|single-point calculation|time series|time series|sin(cos(s1)), sin(top_k(s1,'k'='5'))|yes|
|2|single-point calculation|numerical value|numerical value|sin(sum(s1)), cos(max_value(s1))|yes, but sin(extreme(s1)) is not supported|
|3|aggregation|time series|numerical value|sum(sin(s1)), max_value(s1-s2)|not yet|
|4|aggregation|numerical value| |/|/|

* Is the support of case 3 discussed or on the agenda?
* I found that sin(*extreme*(s1)) is not supported in case 2.

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine
> Reporter: Eric Pai
> Assignee: Eric Pai
> Priority: Major
> Labels: features
> Fix For: master branch
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we can't run queries like these:
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.* GROUP BY LEVEL=1
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s);
> * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> The goal of this task is to implement the above queries by reusing the existing calculation logic.

--
This message was sent by Atlassian Jira (v8.20.1#820001)