Hi Seshu,

Yang's work is more of a framework. It reduces the effort a developer needs to add a new custom aggregation. Since some of the aggregation happens in coprocessors, we cannot completely get rid of recompiling and redeploying. If someone from the community is interested in crafting a new aggregation, they can take a look at how the HLL and TopN aggregations are implemented.
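To make the coprocessor point a bit more concrete, here is a rough sketch of what a mergeable custom aggregation looks like in general. Please note that the `Aggregator` interface, `HistogramAggregator`, and the `aggregate` helper are all hypothetical names made up for this mail; they are not the MeasureType / MeasureTypeFactory API, so check the javadoc on the 2.x branch for the real contract. The only idea it tries to show is that partial results computed in different places can be merged into the same answer, which I assume is the property any aggregation needs if part of it runs in coprocessors.

```java
import java.util.Arrays;
import java.util.List;

public class CustomAggregationSketch {

    /** Hypothetical contract for illustration only; NOT Kylin's MeasureType API. */
    public interface Aggregator<I, S> {
        S init();                 // empty aggregation state
        S add(S state, I input);  // fold one raw value into a state
        S merge(S a, S b);        // combine two partial states
    }

    /** Example custom aggregation: a fixed-width histogram over long values. */
    public static final class HistogramAggregator implements Aggregator<Long, long[]> {
        private final long width;
        private final int buckets;

        public HistogramAggregator(long width, int buckets) {
            this.width = width;
            this.buckets = buckets;
        }

        @Override
        public long[] init() {
            return new long[buckets];
        }

        @Override
        public long[] add(long[] state, Long input) {
            // Clamp out-of-range values into the first or last bucket.
            int idx = (int) Math.max(0, Math.min(buckets - 1, input / width));
            state[idx]++;
            return state;
        }

        @Override
        public long[] merge(long[] a, long[] b) {
            // Bucket counts are additive, so merging is an element-wise sum.
            long[] out = new long[buckets];
            for (int i = 0; i < buckets; i++) {
                out[i] = a[i] + b[i];
            }
            return out;
        }
    }

    /** Folds a list of raw values into a single state. */
    public static <I, S> S aggregate(Aggregator<I, S> agg, List<I> inputs) {
        S state = agg.init();
        for (I in : inputs) {
            state = agg.add(state, in);
        }
        return state;
    }

    public static void main(String[] args) {
        HistogramAggregator h = new HistogramAggregator(10, 3);
        // Two partitions aggregated separately, standing in for two servers.
        long[] part1 = aggregate(h, Arrays.asList(1L, 12L));
        long[] part2 = aggregate(h, Arrays.asList(25L, 99L));
        System.out.println(Arrays.toString(h.merge(part1, part2))); // [1, 1, 2]
    }
}
```

Merging the two partial histograms gives the same counts as aggregating all four values in one pass. In this sketch the per-partition `add` step stands in for work done close to the data, and `merge` for the final combination.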
On Wed, Dec 9, 2015 at 9:43 PM, Adunuthula, Seshu <[email protected]> wrote:

> Yang,
>
> Would it be possible to create a how-to guide on adding custom
> aggregates to Kylin? Javadocs are good, but to encourage community
> participation we should make it more easily consumable.
>
> Where are the custom aggregates computed: on the Kylin service or in
> HBase coprocessors?
>
> Regards
> Seshu Adunuthula.
>
> On 12/8/15, 6:18 AM, "Adunuthula, Seshu" <[email protected]> wrote:
>
> >This is awesome!
> >
> >On 12/8/15, 6:05 AM, "Shi, Shaofeng" <[email protected]> wrote:
> >
> >>This is another important refactor since making the build/query
> >>engines pluggable. Thanks Yang!
> >>
> >>On 12/8/15, 5:47 PM, "Li Yang" <[email protected]> wrote:
> >>
> >>>This is a bump of KYLIN-976 in case you are not yet aware...
> >>>
> >>>KYLIN-976 is a refactoring of how Kylin works with aggregation and
> >>>aims to allow adding custom aggregation types easily.
> >>>
> >>>Kylin started with basic support of SUM, COUNT, MAX, MIN, AVG (from
> >>>sum and count), and COUNT_DISTINCT (based on HyperLogLog). Later TopN
> >>>was added in the 2.x branch. And the list is growing for sure. Xiaoyu
> >>>is working on storing raw records as a special type of measure
> >>>(KYLIN-1122), and Yerui is working on precise count distinct using
> >>>bitmap (KYLIN-1186).
> >>>
> >>>The possibility is unlimited. Implementing a domain-specific
> >>>aggregation is now quite easy. E.g. aggregate user events to detect
> >>>time series or access patterns. Or draw a sketch of certain user
> >>>groups. Or pre-calculate clusters of data points. Or histogram...
> >>>Use your imagination.
> >>>
> >>>Whoever is interested can peek at MeasureTypeFactory and MeasureType
> >>>on the 2.x branch. The API may still change, but at the same time it
> >>>is stable enough for pilots. The javadoc should get you started.
> >>>HLLCMeasureType and TopNMeasureType are two good examples.
> >>>
> >>>Cheers
> >>>Yang

--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
