[ https://issues.apache.org/jira/browse/PHOENIX-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463382#comment-16463382 ]
Ethan Wang commented on PHOENIX-4724:
-------------------------------------

[~vincentpoon] If I understand correctly, with this feature implemented, when you build an index table you will at the same time record some info into this histogram, so that at some point in the future you can conveniently get the distribution info of the index table. Is that correct? So do you store a histogram object for each index table, like a shadow object somewhere offline? Also, will there ever be a case where you need to mutate an index or remove an index from an existing index table? Cool idea!

> Efficient Equi-Depth histogram for streaming data
> -------------------------------------------------
>
>                 Key: PHOENIX-4724
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4724
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>            Priority: Major
>         Attachments: PHOENIX-4724.v1.patch
>
>
> Equi-Depth histogram from
> http://web.cs.ucla.edu/~zaniolo/papers/Histogram-EDBT2011-CamReady.pdf, but
> without the sliding window - we assume a single window over the entire data
> set.
> Used to generate the bucket boundaries of a histogram where each bucket has
> the same # of items.
> This is useful, for example, for pre-splitting an index table, by feeding in
> data from the indexed column.
> Works on streaming data - the histogram is dynamically updated for each new
> value.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
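
To make the "same # of items per bucket" idea from the issue description concrete, here is a minimal batch sketch in Python. It is not the streaming algorithm from the patch or the cited paper: it simply sorts all values up front and picks boundaries at equal rank intervals, which is what a streaming equi-depth histogram tries to approximate without holding the full data set. The function names `equi_depth_boundaries` and `bucket_counts` are hypothetical and do not come from Phoenix.

```python
from bisect import bisect_right


def equi_depth_boundaries(values, num_buckets):
    """Return num_buckets - 1 split points so that each bucket holds
    roughly the same number of items (batch version: sorts everything)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    # The i-th boundary is the value at rank i * n / num_buckets.
    return [ordered[i * n // num_buckets] for i in range(1, num_buckets)]


def bucket_counts(values, boundaries):
    """Count how many values fall into each bucket defined by the boundaries.
    A value equal to a boundary goes into the bucket to its right."""
    counts = [0] * (len(boundaries) + 1)
    for v in values:
        counts[bisect_right(boundaries, v)] += 1
    return counts


if __name__ == "__main__":
    data = list(range(100, 0, -1))          # 100, 99, ..., 1
    bounds = equi_depth_boundaries(data, 4)
    print(bounds)                           # [26, 51, 76]
    print(bucket_counts(data, bounds))      # [25, 25, 25, 25]
```

In the pre-splitting use case the description mentions, the returned boundaries would play the role of candidate split points for the index table, with the values drawn from the indexed column. With heavily duplicated values the buckets can come out uneven, since a boundary cannot separate equal values.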