Re: [Proposal] Merge configuration files to iotdb-system.properties

2024-05-31 Thread Jialin Qiao
Hi,

Good feature! This will make it easier for users to set configs.

Jialin Qiao

On Fri, May 31, 2024 at 21:23, wenwei shu wrote:
>
> Hello everyone,
>
> I am Wenwei Shu, a new contributor to Apache IoTDB. We have recently been
> working on merging `iotdb-confignode.properties`,
> `iotdb-datanode.properties` and `iotdb-common.properties` into a new
> `iotdb-system.properties` file. For users upgrading from an older version,
> if they do not create the new configuration file themselves during the
> upgrade, IoTDB will generate it from the old configuration files after it
> starts. For more details, see this PR:
> https://github.com/apache/iotdb/pull/12570.
> Thank you for reading.
>
> Best regards,
> Wenwei Shu


[Proposal] Merge configuration files to iotdb-system.properties

2024-05-31 Thread wenwei shu
Hello everyone,

I am Wenwei Shu, a new contributor to Apache IoTDB. We have recently been
working on merging `iotdb-confignode.properties`,
`iotdb-datanode.properties` and `iotdb-common.properties` into a new
`iotdb-system.properties` file. For users upgrading from an older version,
if they do not create the new configuration file themselves during the
upgrade, IoTDB will generate it from the old configuration files after it
starts. For more details, see this PR:
https://github.com/apache/iotdb/pull/12570.
Thank you for reading.
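
For illustration only, the upgrade behavior could be sketched roughly as
follows. This is not the code from the PR: the class `ConfigMergeSketch`, its
method names, the merge order, and the rule that a later file wins on
duplicate keys are all assumptions. Only the four file names come from this
message.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical sketch; not the implementation in the PR.
final class ConfigMergeSketch {
  // Old files, in the (assumed) order in which they are merged.
  static final String[] OLD_FILES = {
    "iotdb-common.properties",
    "iotdb-confignode.properties",
    "iotdb-datanode.properties"
  };
  static final String NEW_FILE = "iotdb-system.properties";

  // Merge sources in order; on duplicate keys a later source wins (assumed rule).
  static Properties merge(Properties... sources) {
    Properties merged = new Properties();
    for (Properties source : sources) {
      merged.putAll(source);
    }
    return merged;
  }

  // Generate the new file only if the user has not already created it.
  // Returns true if a file was generated, false if one already existed.
  static boolean generateIfAbsent(Path confDir) {
    Path target = confDir.resolve(NEW_FILE);
    if (Files.exists(target)) {
      return false;
    }
    try {
      Properties[] loaded = new Properties[OLD_FILES.length];
      for (int i = 0; i < OLD_FILES.length; i++) {
        loaded[i] = new Properties();
        Path old = confDir.resolve(OLD_FILES[i]);
        if (Files.exists(old)) { // a missing old file is simply skipped
          try (InputStream in = Files.newInputStream(old)) {
            loaded[i].load(in);
          }
        }
      }
      try (OutputStream out = Files.newOutputStream(target)) {
        merge(loaded).store(out, "Generated from old configuration files");
      }
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling `ConfigMergeSketch.generateIfAbsent(confDir)` at startup would leave
an existing `iotdb-system.properties` untouched and only write a new one when
it is missing.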

Best regards,
Wenwei Shu


Re: [Proposal] Data Subscription Client on IoTDB

2024-05-31 Thread Christofer Dutz
Hi Yuchen,

Do I understand correctly that, with this, a client could subscribe to data
and receive it in TsFile format? This, plus the ability of an IoTDB instance
to subscribe to such a TsFile subscription, would align perfectly with my plans
for 2024: allowing TsFile libraries on PLCs to send data to IoTDB, as well as
simplifying data-collection gateways.

Chris

From: Yuchen Ding
Date: Friday, 31 May 2024 at 10:14
To: dev@iotdb.apache.org
Subject: Re: [Proposal] Data Subscription Client on IoTDB
Hello everyone,

I am VGalaxies. Recently, I have been working on TsFile subscription support
for the IoTDB data subscription client. The motivation for this feature is to
enable data file export and backup for multi-replica clusters through data
subscription. With the existing data subscription client, subscribed data is
delivered as a SessionDataSet: the server has to parse the TsFile, and the
client has to rewrite it. Raw data is transmitted over the network without
benefiting from TsFile's high compression. Therefore, we would like data
subscriptions to support exporting TsFiles using the TsFile client SDK.

In terms of functionality, using TsFile subscription with the data
subscription client involves three steps. First, create a topic whose data
presentation format is TsFile by setting the topic format to TsFileHandler.
Second, when creating a consumer, use the fileSaveDir parameter to specify the
directory where subscribed TsFiles will be saved. Third, obtain the handler
that corresponds to the type of the SubscriptionMessage, which here is
SubscriptionTsFileHandler. SubscriptionTsFileHandler wraps file operations
such as copy, move, and delete (cp, mv, rm) implemented with the Java standard
library, and can also iterate over the data through the TsFile SDK, i.e.,
TsFileReader. Users can set up TsFile subscription with minimal configuration.

Technically, we have restructured the message payload of the pipeSubscribe RPC
poll type so that it can carry data in both SessionDataSet and TsFile formats.
The transmission of a TsFile is split into multiple events, including the
tsfile init event, tsfile piece event, and tsfile seal event. Reliable
transmission of TsFiles is achieved through the exchange of these events
between the DataNode (DN) side and the client side.

Currently, TsFile subscription still has some limitations. For example, when
the topic format is TsFileHandler, there are certain constraints on the
topic's path and time configuration. We will continue to address these issues
in future iterations.

I have implemented initial support for TsFile subscription in this PR [1]. If
you are interested in this feature, you are welcome to participate in its
development and testing. You can also leave comments and suggestions in this
thread. Any suggestions, feedback, and contributions are appreciated.

Thank you for your attention and support.

Best regards,
VGalaxies

Reference:
1. https://github.com/apache/iotdb/pull/12326

On 2024/04/08 03:06:59 VGalaxies wrote:
> Hello everyone,
>
> I am VGalaxies, a new contributor to Apache IoTDB. I am excited to
> share with you a new feature that I have been working on for the past
> few months.
>
> The data subscription client is a new way to access data within IoTDB,
> distinct from the traditional method of querying data using SQL-like
> syntax. In scenarios where real-time data, quick response to data
> changes, and building highly event-driven systems are required, data
> subscription has greater advantages over data querying. For example,
> in the following two scenarios:
>
> 1. Replace extensive polling queries for large amounts of data: Avoid
> significant impact on the performance of existing systems when
> querying frequently or when there are many data points. Also, avoid
> the difficulty of determining the query scope, and ensure that
> downstream systems receive accurate and complete data.
> 2. Facilitate downstream system integration: It's easier to integrate
> with components such as Flink, Spark, Kafka/DataX, Camel/MySQL, etc.
> There's no need to customize the logic of IoTDB's data change capture
> for each big data component separately, simplifying integration
> component design and making it easier for users.
>
> The IoTDB subscription client borrows some concepts from message
> queue products such as Kafka. It is built around three core concepts:
> Topic, Consumer, and Consumer Group.
>
> - Topic is a logical concept used by the IoTDB subscription client to
> classify data, serving as a channel for data publication. Producers
> publish data to specific topics, while consumers subscribe to these
> topics to receive related data. In the IoTDB subscription client,
> topics describe the sequence characteristics, time characteristics,
> presentation forms, and optional custom processing logic of subscribed
> data.
> - Consumer is an application or service in the IoTDB subscription
> client responsible for receiving and processing data published to

Re: [Proposal] Data Subscription Client on IoTDB

2024-05-31 Thread Yuchen Ding
Hello everyone,

I am VGalaxies. Recently, I have been working on TsFile subscription support
for the IoTDB data subscription client. The motivation for this feature is to
enable data file export and backup for multi-replica clusters through data
subscription. With the existing data subscription client, subscribed data is
delivered as a SessionDataSet: the server has to parse the TsFile, and the
client has to rewrite it. Raw data is transmitted over the network without
benefiting from TsFile's high compression. Therefore, we would like data
subscriptions to support exporting TsFiles using the TsFile client SDK.

In terms of functionality, using TsFile subscription with the data
subscription client involves three steps. First, create a topic whose data
presentation format is TsFile by setting the topic format to TsFileHandler.
Second, when creating a consumer, use the fileSaveDir parameter to specify the
directory where subscribed TsFiles will be saved. Third, obtain the handler
that corresponds to the type of the SubscriptionMessage, which here is
SubscriptionTsFileHandler. SubscriptionTsFileHandler wraps file operations
such as copy, move, and delete (cp, mv, rm) implemented with the Java standard
library, and can also iterate over the data through the TsFile SDK, i.e.,
TsFileReader. Users can set up TsFile subscription with minimal configuration.
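
As a rough illustration of the third step, the sketch below shows how a
handler might wrap copy, move, and delete on a received file using
`java.nio.file.Files`. The class `TsFileHandlerSketch` and its method names are
hypothetical and are not the actual SubscriptionTsFileHandler API; please
refer to the PR for the real interface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for a handler over one received file.
final class TsFileHandlerSketch {
  private Path file; // current location of the received file

  TsFileHandlerSketch(Path file) {
    this.file = file;
  }

  Path getPath() {
    return file;
  }

  // "cp": copy the file to the target, leaving the original in place.
  Path copyTo(Path target) {
    try {
      return Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // "mv": move the file and remember its new location.
  Path moveTo(Path target) {
    try {
      file = Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // "rm": delete the file; returns false if it was already gone.
  boolean delete() {
    try {
      return Files.deleteIfExists(file);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In this sketch a backup job would call `copyTo` for each received file, while
an export job that no longer needs the local copy would call `moveTo`.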

Technically, we have restructured the message payload of the pipeSubscribe RPC
poll type so that it can carry data in both SessionDataSet and TsFile formats.
The transmission of a TsFile is split into multiple events, including the
tsfile init event, tsfile piece event, and tsfile seal event. Reliable
transmission of TsFiles is achieved through the exchange of these events
between the DataNode (DN) side and the client side.
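
To make the event sequence more concrete, here is a minimal in-memory sketch
of how a receiver might assemble a file from init, piece, and seal events.
Everything in it is hypothetical: the `TsFileReceiverSketch` class, the offset
check on each piece, and the length check at seal time are assumptions for
illustration and are not taken from the pipeSubscribe RPC definition.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical receiver; a byte buffer stands in for the file on disk.
final class TsFileReceiverSketch {
  private String fileName;
  private ByteArrayOutputStream buffer;
  private boolean sealed;

  // init event: start receiving a new file, discarding any unfinished one.
  void onInit(String fileName) {
    this.fileName = fileName;
    this.buffer = new ByteArrayOutputStream();
    this.sealed = false;
  }

  // piece event: append a chunk if it starts where the previous one ended.
  // Returns the offset expected for the next piece.
  long onPiece(long offset, byte[] chunk) {
    requireInProgress();
    if (offset != buffer.size()) {
      throw new IllegalStateException(
          "unexpected offset " + offset + ", expected " + buffer.size());
    }
    buffer.write(chunk, 0, chunk.length);
    return buffer.size();
  }

  // seal event: verify the total length and hand back the complete content.
  byte[] onSeal(long expectedLength) {
    requireInProgress();
    if (expectedLength != buffer.size()) {
      throw new IllegalStateException(
          "length mismatch for " + fileName + ": got " + buffer.size());
    }
    sealed = true;
    return buffer.toByteArray();
  }

  private void requireInProgress() {
    if (buffer == null || sealed) {
      throw new IllegalStateException("no file transfer in progress");
    }
  }
}
```

In this sketch, the offset returned by `onPiece` is what a client could echo
back when polling for the next piece, so that a lost or repeated piece is
detected before the file is sealed.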

Currently, TsFile subscription still has some limitations. For example, when
the topic format is TsFileHandler, there are certain constraints on the
topic's path and time configuration. We will continue to address these issues
in future iterations.

I have implemented initial support for TsFile subscription in this PR [1]. If
you are interested in this feature, you are welcome to participate in its
development and testing. You can also leave comments and suggestions in this
thread. Any suggestions, feedback, and contributions are appreciated.

Thank you for your attention and support.

Best regards,
VGalaxies

Reference:
1. https://github.com/apache/iotdb/pull/12326

On 2024/04/08 03:06:59 VGalaxies wrote:
> Hello everyone,
> 
> I am VGalaxies, a new contributor to Apache IoTDB. I am excited to
> share with you a new feature that I have been working on for the past
> few months.
> 
> The data subscription client is a new way to access data within IoTDB,
> distinct from the traditional method of querying data using SQL-like
> syntax. In scenarios where real-time data, quick response to data
> changes, and building highly event-driven systems are required, data
> subscription has greater advantages over data querying. For example,
> in the following two scenarios:
> 
> 1. Replace extensive polling queries for large amounts of data: Avoid
> significant impact on the performance of existing systems when
> querying frequently or when there are many data points. Also, avoid
> the difficulty of determining the query scope, and ensure that
> downstream systems receive accurate and complete data.
> 2. Facilitate downstream system integration: It's easier to integrate
> with components such as Flink, Spark, Kafka/DataX, Camel/MySQL, etc.
> There's no need to customize the logic of IoTDB's data change capture
> for each big data component separately, simplifying integration
> component design and making it easier for users.
> 
> The IoTDB subscription client borrows some concepts from message
> queue products such as Kafka. It is built around three core concepts:
> Topic, Consumer, and Consumer Group.
> 
> - Topic is a logical concept used by the IoTDB subscription client to
> classify data, serving as a channel for data publication. Producers
> publish data to specific topics, while consumers subscribe to these
> topics to receive related data. In the IoTDB subscription client,
> topics describe the sequence characteristics, time characteristics,
> presentation forms, and optional custom processing logic of subscribed
> data.
> - Consumer is an application or service in the IoTDB subscription
> client responsible for receiving and processing data published to
> specific topics. Consumers retrieve data from the queue and perform
> corresponding processing. The IoTDB subscription client provides two
> types of consumers: pull consumer and push consumer.
> - Consumer Group is a collection of consumers. When different
> consumers in the same consumer group subscribe to the same topic,
> these consumers share the processing progress of data under this
> topic. Each piece of data under this topic can only be processed by one
> consumer within the group, ensuring that d