[jira] [Updated] (CARBONDATA-2731) Timeseries datamap queries should fetch data from the Timeseries datamap.

Prasanna Ravichandran (JIRA) Wed, 11 Jul 2018 05:41:49 -0700


     [ 
https://issues.apache.org/jira/browse/CARBONDATA-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Prasanna Ravichandran updated CARBONDATA-2731:
----------------------------------------------
    Description: 
When a timeseries datamap is created, the query it applies to is defined along 
with it. So when the user runs that same query after the TS datamap has been 
created, the query should fetch its data from the TS datamap.

Test queries:

create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry 
string, Activecity string,gamePointId double,deviceInformationId 
double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) 
STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1');
 LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/vardhandaterestruct.csv' INTO 
TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');

CREATE DATAMAP agg0_time ON TABLE brinjal USING 'timeSeries' DMPROPERTIES 
('EVENT_TIME'='productionDate','SECOND_GRANULARITY'='1') AS SELECT 
productionDate, SUM(imei) FROM brinjal GROUP BY productionDate;

explain SELECT productionDate, SUM(imei) FROM brinjal GROUP BY productionDate; 

0: jdbc:hive2://10.18.98.136:23040/default> show datamap on table brinjal;
+-------------+--------------+------------------------+---------------------------------------------------------+
| DataMapName | ClassName    | Associated Table       | DataMap Properties                                      |
+-------------+--------------+------------------------+---------------------------------------------------------+
| agg0_time   | timeSeries   | *rp.brinjal_agg0_time* | 'event_time'='productionDate', 'second_granularity'='1' |
| sensor      | preaggregate | rp.brinjal_sensor      |                                                         |
+-------------+--------------+------------------------+---------------------------------------------------------+
 2 rows selected (0.042 seconds)
 0: jdbc:hive2://10.18.98.136:23040/default> explain SELECT productionDate, 
SUM(imei) FROM brinjal GROUP BY productionDate;
+------------------------------------------------------------------------------+
|plan|
+------------------------------------------------------------------------------+
|== CarbonData Profiler ==
 Table Scan on brinjal
 - total blocklets: 1
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0|
|== Physical Plan ==
 *HashAggregate(keys=[productionDate#155228], functions=[sum(cast(imei#155221 as double))])
 +- Exchange hashpartitioning(productionDate#155228, 200)
    +- *HashAggregate(keys=[productionDate#155228], functions=[partial_sum(cast(imei#155221 as double))])
       +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :rp, 
*Table name :brinjal,* Schema :Some(StructType(StructField(imei,StringType,true), 
StructField(amsize,StringType,true), StructField(channelsid,StringType,true), 
StructField(activecountry,StringType,true), 
StructField(activecity,StringType,true), 
StructField(gamepointid,DoubleType,true), 
StructField(deviceinformationid,DoubleType,true), 
StructField(productiondate,TimestampType,true), 
StructField(deliverydate,TimestampType,true), 
StructField(deliverycharge,DoubleType,true))) ] 
rp.brinjal[imei#155221,productiondate#155228]|
+------------------------------------------------------------------------------+
 2 rows selected (0.155 seconds)

 

Expected result: A query that matches the timeseries datamap should fetch its 
data from the timeseries datamap.

Actual result: The query fetches its data from the main table; the plan above 
scans brinjal rather than rp.brinjal_agg0_time.
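
The check described above (does the plan scan the datamap's child table or the 
main table?) can be sketched as a small script over the EXPLAIN text. This is 
only an illustration, not part of CarbonData: the helper names scanned_tables 
and uses_datamap are hypothetical, the child table name is assumed to follow 
the <main table>_<datamap name> pattern seen in the "Associated Table" column 
above, and a rewritten plan is assumed to print that name after "Table name :".

```python
import re
from typing import List


def scanned_tables(plan_text: str) -> List[str]:
    """Return every table name printed after 'Table name :' in a plan.

    Hypothetical helper: it only parses the EXPLAIN text as shown in this
    report and does not call any CarbonData or Spark API.
    """
    # Drop '*' markers and collapse line wrapping so that a label split
    # across lines ("Table\nname :brinjal") is matched as one piece.
    flat = " ".join(plan_text.replace("*", " ").split())
    return re.findall(r"Table name\s*:\s*(\w+)", flat)


def uses_datamap(plan_text: str, main_table: str, datamap_name: str) -> bool:
    """True if the plan scans the datamap's child table.

    Assumption: the child table is named <main_table>_<datamap_name>,
    as in rp.brinjal_agg0_time from 'show datamap on table brinjal'.
    """
    child = f"{main_table}_{datamap_name}".lower()
    return child in (name.lower() for name in scanned_tables(plan_text))


if __name__ == "__main__":
    # Fragment of the physical plan reported in this issue.
    reported = (
        "+- *BatchedScan CarbonDatasourceHadoopRelation "
        "[ Database name :rp, *Table \nname :brinjal,* Schema :Some(...) ]"
    )
    print(scanned_tables(reported))
    print(uses_datamap(reported, "brinjal", "agg0_time"))
```

Run against the plan in this report, the check finds only the main table 
brinjal, which is the behaviour reported here as the bug.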

 

  was:
When a timeseries datamap is created, the query it applies to is defined along 
with it. So when the user runs that same query after the TS datamap has been 
created, the query should fetch its data from the TS datamap.

Test queries:

create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry 
string, Activecity string,gamePointId double,deviceInformationId 
double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) 
STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1');
 LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/vardhandaterestruct.csv' INTO 
TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');

CREATE DATAMAP agg0_time ON TABLE brinjal USING 'timeSeries' DMPROPERTIES 
('EVENT_TIME'='productionDate','SECOND_GRANULARITY'='1') AS SELECT 
productionDate, SUM(imei) FROM brinjal GROUP BY productionDate;

explain SELECT productionDate, SUM(imei) FROM brinjal GROUP BY productionDate; 

0: jdbc:hive2://10.18.98.136:23040/default> show datamap on table brinjal;
+-------------+--------------+------------------------+---------------------------------------------------------+
| DataMapName | ClassName    | Associated Table       | DataMap Properties                                      |
+-------------+--------------+------------------------+---------------------------------------------------------+
| agg0_time   | timeSeries   | *rp.brinjal_agg0_time* | 'event_time'='productionDate', 'second_granularity'='1' |
| sensor      | preaggregate | rp.brinjal_sensor      |                                                         |
+-------------+--------------+------------------------+---------------------------------------------------------+
 2 rows selected (0.042 seconds)
 0: jdbc:hive2://10.18.98.136:23040/default> explain SELECT productionDate, 
SUM(imei) FROM brinjal GROUP BY productionDate;
+------------------------------------------------------------------------------+
|plan|
+------------------------------------------------------------------------------+
|== CarbonData Profiler ==
 Table Scan on brinjal
 - total blocklets: 1
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0|
|== Physical Plan ==
 *HashAggregate(keys=[productionDate#155228], functions=[sum(cast(imei#155221 as double))])
 +- Exchange hashpartitioning(productionDate#155228, 200)
    +- *HashAggregate(keys=[productionDate#155228], functions=[partial_sum(cast(imei#155221 as double))])
       +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :rp, 
*Table name :brinjal,* Schema :Some(StructType(StructField(imei,StringType,true), 
StructField(amsize,StringType,true), StructField(channelsid,StringType,true), 
StructField(activecountry,StringType,true), 
StructField(activecity,StringType,true), 
StructField(gamepointid,DoubleType,true), 
StructField(deviceinformationid,DoubleType,true), 
StructField(productiondate,TimestampType,true), 
StructField(deliverydate,TimestampType,true), 
StructField(deliverycharge,DoubleType,true))) ] 
rp.brinjal[imei#155221,productiondate#155228]|
+------------------------------------------------------------------------------+
 2 rows selected (0.155 seconds)

 

Expected result:

 


> Timeseries datamap queries should fetch data from the Timeseries datamap.
> --------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2731
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2731
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>    Affects Versions: 1.4.1
>         Environment: Spark 2.2
>            Reporter: Prasanna Ravichandran
>            Priority: Major
>
> When a timeseries datamap is created, the query it applies to is defined 
> along with it. So when the user runs that same query after the TS datamap has 
> been created, the query should fetch its data from the TS datamap.
> Test queries:
> create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='1');
>  LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/vardhandaterestruct.csv' 
> INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
> CREATE DATAMAP agg0_time ON TABLE brinjal USING 'timeSeries' DMPROPERTIES 
> ('EVENT_TIME'='productionDate','SECOND_GRANULARITY'='1') AS SELECT 
> productionDate, SUM(imei) FROM brinjal GROUP BY productionDate;
> explain SELECT productionDate, SUM(imei) FROM brinjal GROUP BY 
> productionDate; 
> 0: jdbc:hive2://10.18.98.136:23040/default> show datamap on table brinjal;
> +-------------+--------------+------------------------+---------------------------------------------------------+
> | DataMapName | ClassName    | Associated Table       | DataMap Properties                                      |
> +-------------+--------------+------------------------+---------------------------------------------------------+
> | agg0_time   | timeSeries   | *rp.brinjal_agg0_time* | 'event_time'='productionDate', 'second_granularity'='1' |
> | sensor      | preaggregate | rp.brinjal_sensor      |                                                         |
> +-------------+--------------+------------------------+---------------------------------------------------------+
>  2 rows selected (0.042 seconds)
>  0: jdbc:hive2://10.18.98.136:23040/default> explain SELECT productionDate, 
> SUM(imei) FROM brinjal GROUP BY productionDate;
> +------------------------------------------------------------------------------+
> |plan|
> +------------------------------------------------------------------------------+
> |== CarbonData Profiler ==
>  Table Scan on brinjal
>  - total blocklets: 1
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0|
> |== Physical Plan ==
>  *HashAggregate(keys=[productionDate#155228], functions=[sum(cast(imei#155221 as double))])
>  +- Exchange hashpartitioning(productionDate#155228, 200)
>     +- *HashAggregate(keys=[productionDate#155228], functions=[partial_sum(cast(imei#155221 as double))])
>        +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :rp, 
> *Table name :brinjal,* Schema :Some(StructType(StructField(imei,StringType,true), 
> StructField(amsize,StringType,true), StructField(channelsid,StringType,true), 
> StructField(activecountry,StringType,true), 
> StructField(activecity,StringType,true), 
> StructField(gamepointid,DoubleType,true), 
> StructField(deviceinformationid,DoubleType,true), 
> StructField(productiondate,TimestampType,true), 
> StructField(deliverydate,TimestampType,true), 
> StructField(deliverycharge,DoubleType,true))) ] 
> rp.brinjal[imei#155221,productiondate#155228]|
> +------------------------------------------------------------------------------+
>  2 rows selected (0.155 seconds)
>  
> Expected result: A query that matches the timeseries datamap should fetch its 
> data from the timeseries datamap.
> Actual result: The query fetches its data from the main table; the plan above 
> scans brinjal rather than rp.brinjal_agg0_time.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
