Reposting.

Thanks & Regards
Umesh Prasad
On Thu, Jul 21, 2016 at 8:04 AM, Umesh Prasad <[email protected]> wrote:
> Hi All,
>
> Does Hive have an automated database designer, or has anyone tried
> building one? I am looking for something equivalent to Vertica's
> DBDesigner and Microsoft SQL Server's automated partitioning design for
> parallel databases.
>
> References:
> 1. Automated Partitioning Design in Parallel Database Systems
>    (https://cs.brown.edu/courses/cs227/archives/2012/papers/partitioning/p1137-nehme.pdf)
>
> 2. DBDesigner: A Customizable Physical Design Tool for Vertica Analytic
>    Database
>    (http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6816725)
>
> Hive tuning tips mention the need for pre-sorting tables on filter columns
> (for better predicate pushdown and joins), partitioning/clustering on
> join/group-by columns, having a higher replication factor for dimension
> tables, etc. However, I couldn't find any tool or library that suggests a
> physical layout given a set of Hive queries.
>
> Manually designing the physical layout doesn't scale, especially when the
> producers and consumers of the tables (data) are multiple different teams.
> There are conflicting requirements for optimizing different queries, and a
> globally optimal design can be very different from a locally optimal one.
>
> If someone in the community has worked on this or can give pointers, it
> would be extremely useful for us.
>
> Thanks & Regards
> Umesh Prasad
>
> Team Lead, Flipkart
