[jira] [Created] (IOTDB-5747) UDF SlidingTimeWindow slidingStep bug

2023-03-30 Thread Lei Rui (Jira)
Lei Rui created IOTDB-5747:
--

 Summary: UDF SlidingTimeWindow slidingStep bug
 Key: IOTDB-5747
 URL: https://issues.apache.org/jira/browse/IOTDB-5747
 Project: Apache IoTDB
  Issue Type: Bug
Reporter: Lei Rui


To reproduce the bug in IoTDB v1.0.1:
```
insert into root.sg1.d1(timestamp,s1) values(1,1)
insert into root.sg1.d1(timestamp,s1) values(2,2)
insert into root.sg1.d1(timestamp,s1) values(3,3)
insert into root.sg1.d1(timestamp,s1) values(4,4)
insert into root.sg1.d1(timestamp,s1) values(5,5)
insert into root.sg1.d1(timestamp,s1) values(6,6)
insert into root.sg1.d1(timestamp,s1) values(7,7)
insert into root.sg1.d1(timestamp,s1) values(8,8)
select M4(s1,'timeInterval'='3','slidingStep'='2') from root.sg1.d1
```
The query result is:
```
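+-----------------------------+---------------------------------------------------------+
|                         Time|M4(root.sg1.d1.s1, "timeInterval"="3", "slidingStep"="2")|
+-----------------------------+---------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                                      1.0|
|1970-01-01T08:00:00.003+08:00|                                                      3.0|
+-----------------------------+---------------------------------------------------------+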
Total line number = 2
```
which is wrong, as the sliding time windows and the M4 samples of each window 
should be:
- [1,4): samples (1,1),(3,3)
- [3,6): samples (3,3),(5,5)
- [5,8): samples (5,5), (7,7)
- [7,10): samples (7,7), (8,8)

From my observation, the bug tends to occur when slidingStep equals 
timeInterval-1. I suspect the bug is in the sliding time window code, but I 
could not locate it because I am not familiar with the UDF module.
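
The expected windows listed above can be cross-checked with a small standalone 
sketch. It is not the IoTDB UDF code: the class and method names are 
hypothetical, and it assumes that windows start at the first timestamp, 
advance by slidingStep, cover [start, start+timeInterval), and that M4 keeps 
the first, last, bottom and top point of each window (duplicates merged).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class SlidingTimeWindowSketch {
  /** Returns one line per non-empty window, e.g. "[1,4): (1,1),(3,3)". Times must be ascending. */
  static List<String> expectedM4(long[] times, long[] values, long timeInterval, long slidingStep) {
    List<String> result = new ArrayList<>();
    long firstTime = times[0];
    long lastTime = times[times.length - 1];
    // Assumption: the first window starts at the first timestamp.
    for (long start = firstTime; start <= lastTime; start += slidingStep) {
      long end = start + timeInterval; // the window is [start, end)
      int first = -1, last = -1, bottom = -1, top = -1;
      for (int i = 0; i < times.length; i++) {
        if (times[i] < start || times[i] >= end) continue;
        if (first < 0) first = i;
        last = i;
        if (bottom < 0 || values[i] < values[bottom]) bottom = i;
        if (top < 0 || values[i] > values[top]) top = i;
      }
      if (first < 0) continue; // empty window: nothing to emit
      // Merge duplicate picks and keep them in time order.
      TreeSet<Integer> picked = new TreeSet<>(List.of(first, last, bottom, top));
      StringBuilder sb = new StringBuilder("[" + start + "," + end + "): ");
      String sep = "";
      for (int i : picked) {
        sb.append(sep).append("(").append(times[i]).append(",").append(values[i]).append(")");
        sep = ",";
      }
      result.add(sb.toString());
    }
    return result;
  }
}
```

Calling expectedM4 with the eight inserted points, timeInterval=3 and 
slidingStep=2 yields the four windows listed in this report.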



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IOTDB-5433) IndexOutOfBoundsException: ExactOrderStatistics constructs an empty ArrayList and then sets it

2023-01-21 Thread Lei Rui (Jira)
Lei Rui created IOTDB-5433:
--

 Summary: IndexOutOfBoundsException: ExactOrderStatistics 
constructs an empty ArrayList and then sets it
 Key: IOTDB-5433
 URL: https://issues.apache.org/jira/browse/IOTDB-5433
 Project: Apache IoTDB
  Issue Type: Bug
Reporter: Lei Rui


For example:
{code:java}
  public static double getMad(LongArrayList nums) {
    if (nums.isEmpty()) {
      throw new NoSuchElementException();
    } else {
      double median = getMedian(nums);
      DoubleArrayList dal = new DoubleArrayList();
      for (int i = 0; i < nums.size(); ++i) {
        dal.set(i, Math.abs(nums.get(i) - median));
      }
      return getMedian(dal);
    }
  }
{code}

"DoubleArrayList dal = new DoubleArrayList();" constructs an empty list, so 
the first call to dal.set(i, ...) throws IndexOutOfBoundsException.
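
One possible direction for a fix is to append with add() instead of 
overwriting with set(). The sketch below is only an illustration: it uses 
java.util lists as a stand-in for the primitive list classes in the report, 
and MadSketch and its getMedian (sort, then take the middle or the mean of 
the two middle values) are hypothetical, not the ExactOrderStatistics code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

class MadSketch {
  /** Median by sorting a copy; assumed behaviour, not the library's implementation. */
  static double getMedian(List<Double> nums) {
    if (nums.isEmpty()) {
      throw new NoSuchElementException();
    }
    List<Double> sorted = new ArrayList<>(nums);
    Collections.sort(sorted);
    int n = sorted.size();
    return n % 2 == 1 ? sorted.get(n / 2) : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
  }

  /** Median absolute deviation. */
  static double getMad(List<Double> nums) {
    double median = getMedian(nums); // throws NoSuchElementException on empty input
    // The argument is only an initial capacity; the list size is still 0.
    List<Double> deviations = new ArrayList<>(nums.size());
    for (double x : nums) {
      deviations.add(Math.abs(x - median)); // add() appends; set(i, ...) would throw here
    }
    return getMedian(deviations);
  }
}
```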





[jira] [Assigned] (IOTDB-5421) Add sampling attributes for M4

2023-01-15 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-5421:
--

Assignee: Lei Rui

> Add sampling attributes for M4
> --
>
> Key: IOTDB-5421
> URL: https://issues.apache.org/jira/browse/IOTDB-5421
> Project: Apache IoTDB
>  Issue Type: New Feature
>Reporter: Lei Rui
>Assignee: Lei Rui
>Priority: Major





[jira] [Created] (IOTDB-5421) Add sampling attributes for M4

2023-01-15 Thread Lei Rui (Jira)
Lei Rui created IOTDB-5421:
--

 Summary: Add sampling attributes for M4
 Key: IOTDB-5421
 URL: https://issues.apache.org/jira/browse/IOTDB-5421
 Project: Apache IoTDB
  Issue Type: New Feature
Reporter: Lei Rui


Previously, the M4 function supported two types of attribute inputs (that is, 
two types of sliding windows):
(1) Control the sliding *size* window using the attributes {{windowSize}} and 
{{slidingStep}}. For example: {{select 
M4(s1,'windowSize'='10','slidingStep'='10') as samples from root.vehicle.d1}}.
(2) Control the sliding *time* window using the attributes {{windowInterval}}, 
{{slidingStep}}, {{displayWindowBegin}} and {{displayWindowEnd}}. For 
example: {{select 
M4(s1,'windowInterval'='25','slidingStep'='25','displayWindowBegin'='0','displayWindowEnd'='100')
 as samples from root.vehicle.d1}}.

As proposed in a real use case (ZhongHe&DWF), the user wants to control M4 
behavior using the following sampling attributes: {{samplingInterval}}, 
{{samplingThreshold}}, {{displayWindowBegin}}, 
{{displayWindowEnd}}. For example: {{select 
M4(s1,'samplingInterval'='5','samplingThreshold'='100','displayWindowBegin'='0','displayWindowEnd'='150')
 as samples from root.vehicle.d1}}.
 * {{samplingInterval}}: The length of the sampling time interval. Long data 
type. *Required*.
 * {{samplingThreshold}}: The upper limit on the number of sampling points. 
Long data type. Optional. If not set, defaults to 1.
 * {{displayWindowBegin}}: The start time of the window (inclusive). 
Long data type. *Required*.
 * {{displayWindowEnd}}: The end time limit (exclusive, essentially playing the 
same role as {{WHERE time < displayWindowEnd}}). Long data type. 
*Required*.

The user-defined sampling time window is a special kind of sliding time window. 
It is special in that:
 # There is a conversion relationship between the length of the sliding time 
window {{windowInterval}} and the sampling time interval {{samplingInterval}}. 
Note that here the user only *indirectly* controls the window time length 
{{windowInterval}}.
 # The sliding step of the sliding time window {{slidingStep}} is fixed to be 
equal to the window length {{windowInterval}} here, so there is no need for the 
user to input the {{slidingStep}} parameter.
 # {{displayWindowBegin}} and {{displayWindowEnd}} are required parameters here.
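
The required/optional rules above could be validated roughly as follows. This 
is a minimal sketch under stated assumptions: the class 
SamplingAttributesSketch is hypothetical and only mirrors the attribute names 
and the default of 1 given in this issue; it does not use the IoTDB UDF 
parameter API and does not model the conversion to windowInterval.

```java
import java.util.Map;

class SamplingAttributesSketch {
  final long samplingInterval;
  final long samplingThreshold;
  final long displayWindowBegin; // inclusive
  final long displayWindowEnd;   // exclusive

  SamplingAttributesSketch(Map<String, String> attrs) {
    samplingInterval = required(attrs, "samplingInterval");
    displayWindowBegin = required(attrs, "displayWindowBegin");
    displayWindowEnd = required(attrs, "displayWindowEnd");
    // Optional attribute: defaults to 1 when not set.
    String threshold = attrs.get("samplingThreshold");
    samplingThreshold = threshold == null ? 1L : Long.parseLong(threshold);
  }

  /** Parses a required Long attribute or rejects the query. */
  private static long required(Map<String, String> attrs, String key) {
    String value = attrs.get(key);
    if (value == null) {
      throw new IllegalArgumentException("Missing required attribute: " + key);
    }
    return Long.parseLong(value);
  }
}
```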





[jira] [Assigned] (IOTDB-5316) Session.setFetchSize is not used in the following fetch requests

2022-12-29 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-5316:
--

Assignee: Lei Rui

> Session.setFetchSize is not used in the following fetch requests
> 
>
> Key: IOTDB-5316
> URL: https://issues.apache.org/jira/browse/IOTDB-5316
> Project: Apache IoTDB
>  Issue Type: Bug
>Reporter: Lei Rui
>Assignee: Lei Rui
>Priority: Major





[jira] [Created] (IOTDB-5316) Session.setFetchSize is not used in the following fetch requests

2022-12-29 Thread Lei Rui (Jira)
Lei Rui created IOTDB-5316:
--

 Summary: Session.setFetchSize is not used in the following fetch 
requests
 Key: IOTDB-5316
 URL: https://issues.apache.org/jira/browse/IOTDB-5316
 Project: Apache IoTDB
  Issue Type: Bug
Reporter: Lei Rui


Session.setFetchSize applies only to the first query execution request; 
subsequent fetch requests use the default fetch size of 5000.

The reason is that the SessionDataSet constructor call in 
`SessionConnection.executeQueryStatement` passes 
`SessionConfig.DEFAULT_FETCH_SIZE` instead of the actual `session.fetchSize`.
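
The pattern can be illustrated with a toy model. All classes below are 
hypothetical stand-ins, not the real Session, SessionConnection or 
SessionDataSet; the only facts taken from this report are the default of 5000 
and that the data set is built with the default instead of the configured 
value.

```java
class FetchSizeSketch {
  static final int DEFAULT_FETCH_SIZE = 5000; // default stated in the report

  /** Stand-in for a result set that remembers the size used for later fetches. */
  static class DataSet {
    final int fetchSize;

    DataSet(int fetchSize) {
      this.fetchSize = fetchSize;
    }
  }

  /** Stand-in for the session and its connection. */
  static class Session {
    int fetchSize = DEFAULT_FETCH_SIZE;

    void setFetchSize(int n) {
      fetchSize = n;
    }

    // Reported behaviour: the data set ignores the configured value.
    DataSet executeQueryBuggy() {
      return new DataSet(DEFAULT_FETCH_SIZE);
    }

    // Possible fix: pass the session's own fetch size through.
    DataSet executeQueryFixed() {
      return new DataSet(fetchSize);
    }
  }
}
```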

 





[jira] [Created] (IOTDB-5187) Import-csv tool runs very slowly

2022-12-12 Thread Lei Rui (Jira)
Lei Rui created IOTDB-5187:
--

 Summary: Import-csv tool runs very slowly
 Key: IOTDB-5187
 URL: https://issues.apache.org/jira/browse/IOTDB-5187
 Project: Apache IoTDB
  Issue Type: Improvement
Reporter: Lei Rui


It took far more than 10 minutes for import-csv to import a csv file containing 
one time series with 12780287 points (I waited 10 minutes before killing the 
import process, so the import task didn't finish).
 * CommitID used: e68f560463c7c31ef0bb0fbaf74335cf6386ff5a (Mon Dec 12 15:15:48 
2022)
 * Import-csv command: ~/iotdb/cli/target/iotdb-cli-1.0.1-SNAPSHOT/tools$ 
./import-csv.sh -f ~/ZT11529.csv -h 127.0.0.1 -p 6667 -u root -pw root
 * The ZT11529.csv file looks like this:

 
{code:java}
Time,root.group_69.`1701`.ZT11529
1591717867194,68.11
1591717867705,68.2
1591717868201,68.2
1591717868711,68.11
1591717869202,68.11
1591717869712,68.11
1591717870201,68.11
1591717870724,68.3
1591717871209,68.3
1591717871713,68.3
1591717872202,68.3
1591717872711,68.2
1591717873200,68.2
1591717873714,68.3
1591717874194,68.3
1591717874715,68.11
1591717875200,68.11
1591717875710,68.11
1591717876199,68.11
1591717876729,68.0
1591717877259,68.3
1591717877709,68.3
1591717878208,68.3
1591717878708,68.3
1591717879197,68.3
1591717879708,68.3
1591717880202,68.3
1591717880716,68.3
1591717881198,68.3
1591717881712,68.3
1591717882201,68.2
1591717882723,68.2
1591717883209,68.2
1591717883709,68.2
1591717884199,68.3
.. {code}
 

 





[jira] [Assigned] (IOTDB-5119) [M4]There is an exception in the M4 function query

2022-12-08 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-5119:
--

Assignee: Lei Rui

> [M4]There is an exception in the M4 function query
> --
>
> Key: IOTDB-5119
> URL: https://issues.apache.org/jira/browse/IOTDB-5119
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Core/Query, mpp-cluster
>Affects Versions: master branch, 1.0.0
>Reporter: xiaozhihong
>Assignee: Lei Rui
>Priority: Major
> Attachments: image-2022-12-05-17-15-28-373.png
>
>
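> In v1.0, I started a 3C3D cluster, entered the CLI, and executed an M4 query, 
> which reported an exception:
> {code:java}
> IoTDB> select temperature from root.ln.wf01.wt01;
> +-----------------------------+-----------------------------+
> |                         Time|root.ln.wf01.wt01.temperature|
> +-----------------------------+-----------------------------+
> |2017-11-05T10:20:00.000+08:00|                     25.57756|
> |2017-11-05T10:21:00.000+08:00|                     23.97946|
> |2017-11-05T10:22:00.000+08:00|                    22.720444|
> +-----------------------------+-----------------------------+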
> Total line number = 3
> It costs 0.007s
> IoTDB> select 
> M4(temperature,'timeInterval'='1000','displayWindowBegin'='1','displayWindowEnd'='100')
>  from root.ln.wf01.wt01;
> Msg: 301: Error occurred during executing UDTF#transform(RowWindow, 
> PointCollector): 
> java.lang.IndexOutOfBoundsException: Size is 0{code}
> The log is:
> !image-2022-12-05-17-15-28-373.png|width=542,height=225!





[jira] [Assigned] (IOTDB-5119) [M4]There is an exception in the M4 function query

2022-12-06 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-5119:
--

Assignee: (was: Lei Rui)

> [M4]There is an exception in the M4 function query
> --
>
> Key: IOTDB-5119
> URL: https://issues.apache.org/jira/browse/IOTDB-5119
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Core/Query, mpp-cluster
>Affects Versions: master branch, 1.0.0
>Reporter: xiaozhihong
>Priority: Major
> Attachments: image-2022-12-05-17-15-28-373.png





[jira] [Assigned] (IOTDB-5119) [M4]There is an exception in the M4 function query

2022-12-05 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-5119:
--

Assignee: Lei Rui

> [M4]There is an exception in the M4 function query
> --
>
> Key: IOTDB-5119
> URL: https://issues.apache.org/jira/browse/IOTDB-5119
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Core/Query, mpp-cluster
>Affects Versions: master branch, 1.0.0
>Reporter: xiaozhihong
>Assignee: Lei Rui
>Priority: Major
> Attachments: image-2022-12-05-17-15-28-373.png





[jira] [Created] (IOTDB-4994) Micrometer and DropWizard behave differently

2022-11-19 Thread Lei Rui (Jira)
Lei Rui created IOTDB-4994:
--

 Summary: Micrometer and DropWizard behave differently
 Key: IOTDB-4994
 URL: https://issues.apache.org/jira/browse/IOTDB-4994
 Project: Apache IoTDB
  Issue Type: Wish
Reporter: Lei Rui


Micrometer: a value appears in only one window.

DropWizard: the value is retained in every window.

Here is an example of Micrometer:
{code:java}
IoTDB> select 
DCP_SeriesScanOperator_hasNext_count.`name=DCP_A_GET_CHUNK_METADATAS`.value as 
cnt from root.__system.metric.`0.0.0.0:6667`
+-----------------------------+---+
|                         Time|cnt|
+-----------------------------+---+
|2022-11-19T20:12:09.306+08:00|2.0|
|2022-11-19T20:12:24.305+08:00|0.0|
|2022-11-19T20:12:39.306+08:00|0.0|
|2022-11-19T20:12:54.306+08:00|0.0|
+-----------------------------+---+
Total line number = 4
It costs 0.022s

IoTDB> select 
sum(DCP_SeriesScanOperator_hasNext_count.`name=DCP_A_GET_CHUNK_METADATAS`.value)
 as sum_cnt from root.__system.metric.`0.0.0.0:6667`
+-------+
|sum_cnt|
+-------+
|    2.0|
+-------+
Total line number = 1
It costs 0.320s {code}
Here is an example of DropWizard:
{code:java}
IoTDB> select 
`dropwizard:DCP_SeriesScanOperator_hasNext_count`.`name=DCP_A_GET_CHUNK_METADATAS`.value
 as cnt from root.__system.metric.`0.0.0.0:6667`
+-----------------------------+---+
|                         Time|cnt|
+-----------------------------+---+
|2022-11-19T20:09:17.090+08:00|  2|
|2022-11-19T20:09:31.881+08:00|  2|
|2022-11-19T20:09:46.847+08:00|  2|
|2022-11-19T20:10:01.867+08:00|  2|
|2022-11-19T20:10:16.868+08:00|  2|
|2022-11-19T20:10:31.852+08:00|  2|
+-----------------------------+---+
Total line number = 6
It costs 0.035s

IoTDB> select 
sum(`dropwizard:DCP_SeriesScanOperator_hasNext_count`.`name=DCP_A_GET_CHUNK_METADATAS`.value)
 as sum_cnt from root.__system.metric.`0.0.0.0:6667`
+-------+
|sum_cnt|
+-------+
|   18.0|
+-------+
Total line number = 1
It costs 0.010s {code}
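
The two outputs look like per-window increments (Micrometer) versus a running 
total (DropWizard). Under that assumption, which this report does not 
confirm, a running total can be turned into increments before summing, as in 
this sketch. The class and method names are hypothetical, and the first 
reading is assumed to count from zero.

```java
class CounterSemanticsSketch {
  /** Converts cumulative readings into per-window increments. */
  static long[] toDeltas(long[] cumulative) {
    long[] deltas = new long[cumulative.length];
    long previous = 0; // assumption: the counter starts at zero
    for (int i = 0; i < cumulative.length; i++) {
      deltas[i] = cumulative[i] - previous;
      previous = cumulative[i];
    }
    return deltas;
  }

  /** Plain sum, as sum() would compute over the stored series. */
  static long sum(long[] xs) {
    long total = 0;
    for (long x : xs) {
      total += x;
    }
    return total;
  }
}
```

With six cumulative readings of 2, summing the raw series over-counts, while 
summing the deltas gives 2, matching the Micrometer-style result.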





[jira] [Assigned] (IOTDB-4894) TsFileSketchTool prints only the first page info when there are multiple pages in a chunk

2022-11-09 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-4894:
--

Assignee: Lei Rui

> TsFileSketchTool prints only the first page info when there are multiple 
> pages in a chunk
> -
>
> Key: IOTDB-4894
> URL: https://issues.apache.org/jira/browse/IOTDB-4894
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Core/Server
>Reporter: Lei Rui
>Assignee: Lei Rui
>Priority: Minor





[jira] [Created] (IOTDB-4894) TsFileSketchTool prints only the first page info when there are multiple pages in a chunk

2022-11-09 Thread Lei Rui (Jira)
Lei Rui created IOTDB-4894:
--

 Summary: TsFileSketchTool prints only the first page info when 
there are multiple pages in a chunk
 Key: IOTDB-4894
 URL: https://issues.apache.org/jira/browse/IOTDB-4894
 Project: Apache IoTDB
  Issue Type: Bug
  Components: Core/Server
Reporter: Lei Rui


An example is as follows. This tsfile has one chunk with 100 pages, but the 
sketch tool prints only the first page info: "[page]  CompressedSize:1887, 
UncompressedSize:2606".
{code:java}
 TsFile Sketch 
file path: D:\github\1667985904766-1-0-0.tsfile
file length: 205798            POSITION|    CONTENT
                 ---
                   0|    [magic head] TsFile
                   6|    [version number] 3
|    [Chunk Group] of root.sg1.d1, num of Chunks:1
                   7|    [Chunk Group Header]
                    |        [marker] 0
                    |        [deviceID] root.sg1.d1
                  20|    [Chunk] of s1, numOfPoints:10, time 
range:[1591717867194,1591768035853], tsDataType:DOUBLE, 
                         startTime: 1591717867194 endTime: 1591768035853 count: 
10 
[minValue:0.0,maxValue:88.5,firstValue:68.11,lastValue:64.2,sumValue:7098754.5999]
                    |        [chunk header] marker=1, measurementId=s1, 
dataSize=205580, serializedSize=10
                    |        [chunk] java.nio.HeapByteBuffer[pos=0 lim=205580 
cap=205580]
                    |        [page]  CompressedSize:1887, UncompressedSize:2606
|    [Chunk Group] of root.sg1.d1 ends
              205627|    [marker] 2
              205628|    [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:DOUBLE
                    |        [ChunkIndex] s1, offset=20
                    |        [startTime: 1591717867194 endTime: 1591768035853 
count: 10 
[minValue:0.0,maxValue:88.5,firstValue:68.11,lastValue:64.2,sumValue:7098754.5999]]
 
|
              205701|    [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT
                    |        
                    |        
              205722|    [TsFileMetadata]
                    |        [meta offset] 205627
                    |        [num of devices] 1
                    |        1 key&TsMetadataIndex
                    |        [bloom filter bit vector byte array length] 24
                    |        [bloom filter bit vector byte array] 
                    |        [bloom filter number of bits] 256
                    |        [bloom filter number of hash functions] 5
              205788|    [TsFileMetadataSize] 66
              205792|    [magic tail] TsFile
              205798|    END of TsFile
 IndexOfTimerseriesIndex Tree 
-
    [MetadataIndex:LEAF_DEVICE]
    └──[root.sg1.d1,205701]
            [MetadataIndex:LEAF_MEASUREMENT]
            └──[s1,205628]
-- TsFile Sketch End 
--
 {code}





[jira] [Assigned] (IOTDB-4810) print-tsfile-sketch.bat went wrong when reading measurementID with Chinese characters

2022-10-31 Thread Lei Rui (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei Rui reassigned IOTDB-4810:
--

Assignee: Lei Rui

> print-tsfile-sketch.bat went wrong when reading measurementID with Chinese 
> characters
> -
>
> Key: IOTDB-4810
> URL: https://issues.apache.org/jira/browse/IOTDB-4810
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Core/Server
>Reporter: Lei Rui
>Assignee: Lei Rui
>Priority: Minor
> Fix For: master branch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h





[jira] [Created] (IOTDB-4810) print-tsfile-sketch.bat went wrong when reading measurementID with Chinese characters

2022-10-31 Thread Lei Rui (Jira)
Lei Rui created IOTDB-4810:
--

 Summary: print-tsfile-sketch.bat went wrong when reading 
measurementID with Chinese characters
 Key: IOTDB-4810
 URL: https://issues.apache.org/jira/browse/IOTDB-4810
 Project: Apache IoTDB
  Issue Type: Bug
  Components: Core/Server
Reporter: Lei Rui
 Fix For: master branch


`print-tsfile-sketch.bat` goes wrong when reading a measurementID that 
contains Chinese characters.

Specifically, the following code does not return the correct result:
{code:java}
int measurementIdLength = 
measurementID.getBytes(TSFileConfig.STRING_CHARSET).length; {code}
For example, if measurementID="电机绕组温度1", the correct measurementIdLength 
is 20, while `print-tsfile-sketch.bat` computes measurementIdLength as 30, 
which is wrong.





[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query

2021-12-27 Thread Lei Rui (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465573#comment-17465573
 ] 

Lei Rui commented on IOTDB-2091:


Thank you for your explanations.

I pictured `s1+sum(s1)` to be used without group-by-time clause. Just `select 
s1+sum(s1) from root.sg.d1 where ...`

Because I suppose group-by-time clause has to be used with select-aggregation 
clause. `select s1 from root.sg.d1 group by ([1000,7000),3)` is not allowed 
grammatically.

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
>  Issue Type: New Feature
>  Components: Core/Engine
>Reporter: Eric Pai
>Assignee: Eric Pai
>Priority: Major
>  Labels: features
> Fix For: master branch
>
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we 
> cannot run queries like these:
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.d1
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.* GROUP BY LEVEL=1
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.d1 GROUP BY([0, 9000), 1s);
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> The objective of this task is to implement the above queries by reusing the 
> existing calculation logic.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query

2021-12-26 Thread Lei Rui (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465559#comment-17465559
 ] 

Lei Rui commented on IOTDB-2091:


 # Subquery: yes, I think it will be possible in the future to let the user 
explicitly define sum(s1) as a constant using a subquery.
 # No, I don't have any use case that requires calculating raw data mixed 
with aggregations. It's just a thought. Based on your discussion, I 
think it is fine to leave s0+sum(s1) unsupported for now.

> "For me it's a little confused that when I execute a GROUP BY query, I get 
> the dataset with the timestamps of original timeseries, not the timestamps of 
> grouped windows. If we support group by query with overlapped windows in the 
> furture(Of course it's forbidden currently as we should let slide_step >= 
> interval, but it doesn't mean that overlapped window aggregations are 
> meaningless.), what should the result be in the overlapped timestamps?"

Sorry I didn't get the point of the above paragraph. Did you raise another 
question about the overlapped window aggregation? And what does "when I execute 
a GROUP BY query, I get the dataset with the timestamps of original timeseries, 
not the timestamps of grouped windows" mean? As far as I know, the query result 
of GROUP BY time query returns the timestamps of grouped windows, not the 
timestamps of original timeseries.

 

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
>  Issue Type: New Feature
>  Components: Core/Engine
>Reporter: Eric Pai
>Assignee: Eric Pai
>Priority: Major
>  Labels: features
> Fix For: master branch





[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query

2021-12-25 Thread Lei Rui (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465219#comment-17465219
 ] 

Lei Rui commented on IOTDB-2091:


(1) Some thoughts about case 3.

The well-implemented IoTDB native aggregations 
(count,sum,avg,max_value,min_value,first_value,last_value,min_time,max_time) 
utilize the precomputed statistics of the single input time series.

Aggregation on multiple input time series does not have this advantage 
when the aggregation cannot be calculated by first aggregating each single time 
series.
 * For example, max_value(s1+s2) does not in general equal 
max_value(s1)+max_value(s2), thus max_value(s1+s2) has to add up s1 and s2 
point by point, and cannot utilize the precomputed statistics of s1 or s2. 
If a user really needs a function like max_value(s1+s2), I 
guess they can write a UDF, for example a function named 
udf_max_value_after_add(s1,s2), to achieve their goal. 

So, if we explicitly restrict IoTDB native aggregations to only accept one 
input time series and let UDF take over the responsibility of supporting 
functions like max_value(s1+s2) by implementing udf_max_value_after_add(s1,s2), 
then I think case 3 can drop off the agenda.
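
A tiny numeric check of the max_value example, as a sketch only: the class and 
method names are hypothetical, and the two series are assumed to have aligned 
timestamps and equal length.

```java
class AggregationSketch {
  /** max_value of a single series; this is what a precomputed statistic could answer. */
  static double max(double[] xs) {
    double m = Double.NEGATIVE_INFINITY;
    for (double x : xs) {
      m = Math.max(m, x);
    }
    return m;
  }

  /** max_value(s1+s2): the aligned points must be added one by one. */
  static double maxOfSum(double[] s1, double[] s2) {
    double m = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < s1.length; i++) {
      m = Math.max(m, s1[i] + s2[i]);
    }
    return m;
  }
}
```

For s1 = {1, 5, 2} and s2 = {4, 0, 3}, maxOfSum gives 5 while max(s1)+max(s2) 
gives 9, so the per-series statistics alone cannot answer the query.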

(2) Question about s0+sum(s1)
 # {*}s0+s1{*}: UDTF
 # {*}s0+1{*}: UDTF
 # {*}s0+sum(s1){*}: Treated as UDAF and reports an error: "Common queries and 
aggregated queries are not allowed to appear at the same time."
 # {*}sum(s0)+sum(s1){*}: UDAF

My intuition is that sum(s1) is a constant, so item 3 should be treated as 
a UDTF, just like item 2. 

I'm not saying this is a bug; I just think there is some room for discussion 
here. Please let me know if this question has been discussed before.
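
A minimal Python sketch of the semantics I have in mind, assuming sum(s1) is evaluated once and then broadcast like any other constant. This is not how IoTDB currently behaves (it reports the error quoted above); the data and the helper name add_constant are made up for illustration.

```python
def add_constant(series, constant):
    """Add a constant to every (timestamp, value) point, like s0+1."""
    return [(t, v + constant) for t, v in series]


# Made-up (timestamp, value) points, for illustration only.
s0 = [(1, 10.0), (2, 20.0), (3, 30.0)]
s1 = [(1, 1.0), (2, 2.0), (3, 3.0)]

# Evaluate the aggregation once; the result is a single number.
sum_s1 = sum(v for _, v in s1)

# Under this reading, s0+sum(s1) behaves exactly like s0+<constant>
# and yields one output point per input point of s0.
result = add_constant(s0, sum_s1)
```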

> Support aggregation with UDF nested query
> -
>
> Key: IOTDB-2091
> URL: https://issues.apache.org/jira/browse/IOTDB-2091
> Project: Apache IoTDB
>  Issue Type: New Feature
>  Components: Core/Engine
>Reporter: Eric Pai
>Assignee: Eric Pai
>Priority: Major
>  Labels: features
> Fix For: master branch
>
>
> Currently we already support UDF nested queries, i.e. f(g(a)). But we 
> can't run queries like these:
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.d1
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.* GROUP BY LEVEL=1
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.d1 GROUP BY([0, 9000), 1s);
>  * select sum(s1) + sum(s2), -sum(s3), sum(s4) + 1, sin(cos(sum(s5))) from 
> root.sg.d1 GROUP BY([0, 9000), 1s) FILL(previous);
> This task's objective is to implement the above queries by reusing the existing 
> calculation logic.





[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query

2021-12-14 Thread Lei Rui (Jira)


[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459694#comment-17459694 ]

Lei Rui commented on IOTDB-2091:


[~ericpai] 👍



[jira] [Commented] (IOTDB-2091) Support aggregation with UDF nested query

2021-12-14 Thread Lei Rui (Jira)


[ https://issues.apache.org/jira/browse/IOTDB-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459275#comment-17459275 ]

Lei Rui commented on IOTDB-2091:


> Currently we have already support udf nested query , i.e. f(g(a)).
|| ||f||g(a)||f(g(a))||example||supported||
|1|single-point calculation|time series|time series|sin(cos(s1)), sin(top_k(s1,'k'='5'))|yes|
|2|single-point calculation|numerical value|numerical value|sin(sum(s1)), cos(max_value(s1))|yes, but sin(extreme(s1)) is not supported|
|3|aggregation|time series|numerical value|sum(sin(s1)), max_value(s1-s2)|not yet|
|4|aggregation|numerical value| |/|/|
 * Has support for case 3 been discussed, or is it on the agenda?
 * I found that sin({*}extreme{*}(s1)) is not supported in case 2.
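
To illustrate what cases 1 to 3 of the table compute, here is a rough Python sketch. It is plain Python rather than IoTDB code, and the values of s1 are made up.

```python
import math

# A made-up time series (values only), for illustration.
s1 = [0.0, 1.0, 2.0]

# Case 1: single-point f over a time series g(a) -> a time series,
# e.g. sin(cos(s1)).
case1 = [math.sin(math.cos(v)) for v in s1]

# Case 2: single-point f over an aggregated value g(a) -> a single value,
# e.g. sin(sum(s1)).
case2 = math.sin(sum(s1))

# Case 3: aggregation f over a time series g(a) -> a single value,
# e.g. sum(sin(s1)).
case3 = sum(math.sin(v) for v in s1)
```

Case 1 keeps one output per input point, while cases 2 and 3 both collapse to a single number; they differ only in whether the aggregation runs before or after the single-point function.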
