Hey All, I am using Phoenix Jdbc Api to insert raw data, do OLAP processing (mainly groupby aggregations) and insert the aggregated data back in HBase.
Sample phoenix table: CREATE TABLE ABCD (TXID INTEGER NOT NULL, INSTANCEID INTEGER NOT NULL, DATELOGGED TIMESTAMP NOT NULL, ORGNAME VARCHAR, MERCHANT_NAME VARCHAR, COUNTRY VARCHAR, AMOUNT FLOAT, CODE INTEGER, CONTYPE VARCHAR, IP_ROUTINGTYPE VARCHAR, SCORE INTEGER,RULE VARCHAR, STATUS INTEGER, ACTION VARCHAR CONSTRAINT pk PRIMARY KEY (DATELOGGED, ORGNAME, TXID, INSTANCEID )) I want to do aggregations on many dimensions over the data in this table and store the output in a Phoenix table. Sample aggregation query: 1. Select TRUNC(DATELOGGED, 'hour'), ORGNAME, 'fraudrep1' as reportid, count(*) from ABCD GROUP BY TRUNC(DATELOGGED, 'hour'), ORGNAME, STATUS WHERE DATELOGGED > X && <Y 2. Select TRUNC(DATELOGGED, 'minute'), ORGNAME, 'fraudrep2' ' as reportid, STATUS, count(*) from ABCD GROUP BY TRUNC(DATELOGGED, 'minute'), ORGNAME, STATUS WHERE DATELOGGED > X && <Y 3. Select TRUNC(DATELOGGED, 'minute'), ORGNAME, 'fraudrep3'' as reportid, CODE, count(*) from ABCD GROUP BY TRUNC(DATELOGGED, 'minute'), ORGNAME, CODE WHERE DATELOGGED > X && <Y; 4. Select TRUNC(DATELOGGED, 'hour'), ORGNAME, 'fraudrep4'' as reportid, STATUS, CODE, count(*) from ABCD GROUP BY TRUNC(DATELOGGED, 'hour'), ORGNAME, STATUS, CODE WHERE DATELOGGED > X && <Y; 5. Select TRUNC(DATELOGGED, 'hour'), ORGNAME, 'fraudrep5'' as reportid, CONTYPE, count(*) from ABCD GROUP BY TRUNC(DATELOGGED, 'hour'), ORGNAME, CONTYPE WHERE DATELOGGED > X && <Y; There can be many queries with 2-4 dimensions clubbed. *How should I design the schema of the table to store the aggregated data ?* FYI, When these processed data is queried then Datelogged, reportId, timedimension(hour mor minute or day) are known. --Unilocal
