Hi Vishal,

I got your point. I have changed the design accordingly and updated the document in Jira, please check.
Regards,
Akash R Nilugal

On 2019/09/30 17:09:44, Kumar Vishal <kumarvishal1...@gmail.com> wrote:
> Hi Akash,
>
> In this design document you haven't mentioned how to handle data loading
> for the timeseries datamap for older segments [existing tables].
> If the customer's main table data is also stored based on time [increasing
> time] in different segments, he can use this feature as well.
>
> We can discuss and finalize the solution.
>
> -Regards
> Kumar Vishal
>
> On Mon, Sep 30, 2019 at 2:42 PM Akash Nilugal <akashnilu...@gmail.com>
> wrote:
>
> > Hi Ajantha,
> >
> > Thanks for the queries and suggestions.
> >
> > 1. Yes, this is a good suggestion; I'll include this change. Both date
> > and timestamp columns are supported; the document will be updated.
> > 2. Yes, you are right.
> > 3. You are right. If the day level is not available, then we will try
> > to get the whole day's data from the hour level; if that is not
> > available either, then as explained in the design document we will get
> > the data from the datamap UNION data from the main table, based on the
> > user query.
> >
> > Regards,
> > Akash R Nilugal
> >
> > On 2019/09/30 06:56:45, Ajantha Bhat <ajanthab...@gmail.com> wrote:
> > > +1.
> > >
> > > I have some suggestions and questions.
> > >
> > > 1. In DMPROPERTIES, instead of 'timestamp_column' I suggest using
> > > 'timeseries_column', so that it won't give the impression that only
> > > the timestamp datatype is supported; also, please update the document
> > > with all the supported datatypes.
> > >
> > > 2. Querying this datamap table directly is also supported, right?
> > > Is rewriting the main table's plan to refer to the datamap table
> > > meant to spare the user from changing his query, or is there any
> > > other reason?
> > >
> > > 3. If the user has not created a day-granularity datamap but only an
> > > hour-granularity datamap, and a query has day granularity, will data
> > > be fetched from the hour-granularity datamap and aggregated, or will
> > > it be fetched from the main table?
> > >
> > > Thanks,
> > > Ajantha
> > >
> > > On Mon, Sep 30, 2019 at 11:46 AM Akash Nilugal
> > > <akashnilu...@gmail.com> wrote:
> > >
> > > > Hi xuchuanyin,
> > > >
> > > > Thanks for the comments/suggestions.
> > > >
> > > > 1. Preaggregate is productized, but not timeseries with
> > > > preaggregate; I think that is the source of confusion, if I'm right.
> > > > 2. Limitations like auto sampling and rollup, which we will be
> > > > supporting now; retention policies; etc.
> > > > 3. segmentTimestampMin: I will consider this in the design.
> > > > 4. RP is added as a separate task. I thought that instead of
> > > > maintaining two variables it is better to maintain one and parse
> > > > it, but I will consider your point based on feasibility during
> > > > implementation.
> > > > 5. We use an accumulator which takes a list: before writing the
> > > > index files we take the min/max of the timestamp column and fill
> > > > the accumulator, and then we can access accumulator.value in the
> > > > driver after the load is finished (see the sketch after this
> > > > message).
> > > >
> > > > Regards,
> > > > Akash R Nilugal
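A minimal sketch of the accumulator approach from point 5 above: each task
reports the (min, max) of the timestamp column through a Spark
CollectionAccumulator before its index files are written, and the driver
folds the per-task pairs into the segment-level range once the load
finishes. The object, dataset, and variable names are illustrative only,
not taken from the design document:

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.util.CollectionAccumulator
  import scala.collection.JavaConverters._

  object SegmentTimestampRangeSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("segment-timestamp-range-sketch")
        .master("local[*]")
        .getOrCreate()

      // Driver-registered accumulator collecting one (min, max) pair per task.
      val tsRange: CollectionAccumulator[(Long, Long)] =
        spark.sparkContext.collectionAccumulator[(Long, Long)]("segmentTimestampRange")

      // Stand-in for the rows of one load; `ts` plays the timeseries column.
      val ts = spark.sparkContext.parallelize(Seq(1000L, 2500L, 1700L, 3200L), 2)

      ts.foreachPartition { iter =>
        val values = iter.toSeq
        // Before the index files for this task are written, record the
        // min and max of the timestamp column seen by this task.
        if (values.nonEmpty) tsRange.add((values.min, values.max))
      }

      // Back on the driver, after the load finishes, fold the per-task
      // pairs into the segment-level min/max (segmentTimestampMin/Max).
      val perTask = tsRange.value.asScala
      println("segmentTimestampMin = " + perTask.map(_._1).min)
      println("segmentTimestampMax = " + perTask.map(_._2).max)

      spark.stop()
    }
  }

Nothing in this sketch depends on global sort: any load that runs as Spark
tasks can add to a driver-registered accumulator, which is presumably how
the loading-by-dataframe case raised in point 5 below would be covered.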
> > > > > "Currently carbondata supports timeseries on preaggregate datamap, > > but > > > > its > > > > > an alpha feature" > > > > > === > > > > > It has been some time since the preaggregate datamap was introduced > > and > > > > it > > > > > is still **alpha**, why it is still not product-ready? Will the new > > > > feature > > > > > also come into the similar situation? > > > > > > > > > > 2. > > > > > "there are so many limitations when we compare and analyze the > > existing > > > > > timeseries database or projects which supports time series like > > apache > > > > druid > > > > > or influxdb" > > > > > === > > > > > What are the actual limitations? Besides, please give an example of > > this. > > > > > > > > > > 3. > > > > > "Segment_Timestamp_Min" > > > > > === > > > > > Suggest using camel-case style like 'segmentTimestampMin' > > > > > > > > > > 4. > > > > > "RP is way of telling the system, for how long the data should be > > kept" > > > > > === > > > > > Since the function is simple, I'd suggest using 'retentionTime'=15 > > and > > > > > 'timeUnit'='day' instead of 'RP'='15_days' > > > > > > > > > > 5. > > > > > "When the data load is called for main table, use an spark > > accumulator to > > > > > get the maximum value of timestamp in that load and return to the > > load." > > > > > === > > > > > How can you get the spark accumulator? The load is launched using > > > > > loading-by-dataframe not using global-sort-by-spark. > > > > > > > > > > 6. > > > > > For the rest of the content, still reading. > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Sent from: > > > > > > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ > > > > > > > > > > > > > > >