I don't think the HBase TTL is the issue, because:

   1. I added the data only 1 day ago.
   2. I have a similar server handling 1.5 million events, each with 6k
   features, whose data is 10 days old, and it is working fine.
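
To narrow down where the events go missing, the count from the original message can be cross-checked in a couple of ways. The following is only a rough sketch: it assumes the same pio-shell --with-spark session quoted below (so `sc` is already defined) and the app name "test"; the value names `total`, `distinctIds` and `byName` are illustrative.

```scala
// Run inside `pio-shell --with-spark`, where `sc` is the SparkContext.
import org.apache.predictionio.data.store.PEventStore

// All events visible through the event store for the app named "test".
val eventsRDD = PEventStore.find(appName = "test")(sc)

// Total number of events returned by the store.
val total = eventsRDD.count()

// Number of distinct event IDs (eventId is an Option[String]).
// If this is much smaller than the number of uploads, events may have
// been written under duplicate IDs and overwritten each other.
val distinctIds = eventsRDD.flatMap(_.eventId.toSeq).distinct().count()

// Per-event-name breakdown, to see whether a whole event type is missing.
val byName = eventsRDD.map(e => (e.event, 1L)).reduceByKey(_ + _).collect()
byName.foreach { case (name, n) => println(s"$name: $n") }

println(s"total=$total distinctIds=$distinctIds")
```

Comparing these numbers with the count reported by the upload script should show whether the gap is in what was stored or in what the query returns.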


> My vague recollection is that HBase may mark things for removal but wait
> for certain operations before they are compacted. If this is the case I’m
> sure there is a way to get the correct count so this may be a question for
> the HBase list.
> I did the same as you mentioned, but the problem still persists.
>> Try setting a TTL for the rows in your HBase table.
>> It can be set in the HBase shell:
>> alter 'pio_event:events_?', NAME => 'e', TTL => <seconds to live>
>> Then run the following in the shell:
>> major_compact 'pio_event:events_?'
>> You can also configure automatic major compaction: it will delete all
>> the rows that are older than the TTL.
>> I am stuck at this point. How can I identify the problem?
>>> Hi, I am new to PredictionIO v0.12.0 (Elasticsearch 5.2.1, HBase
>>> 1.2.6, Spark 2.6.0). Hardware: 244 GB RAM and 32 cores. I have
>>> uploaded about 1 million events (each containing 30k features). While
>>> uploading, I could see the HBase disk usage increasing, and after all
>>> the events were uploaded the HBase disk size was 567 GB. To verify, I
>>> ran the following commands:
>>>  - pio-shell --with-spark --conf spark.network.timeout=10000000
>>> --driver-memory 30G --executor-memory 21G --num-executors 7
>>> --executor-cores 3 --conf spark.driver.maxResultSize=4g --conf
>>> spark.executor.heartbeatInterval=10000000
>>>  - import org.apache.predictionio.data.store.PEventStore
>>>  - val eventsRDD = PEventStore.find(appName="test")(sc)
>>>  - val c = eventsRDD.count()
>>> It shows the event count as 18944.
>>> After that, from the script I used to upload the events, I queried
>>> random events by their event IDs, and each query returned the event.
>>> I don't know how to make sure that all the events I uploaded are
>>> actually in the app. Any help is appreciated.
