Thanks for clarifying Adam.

I am interested in learning more about the hash-join strategy, in case
you're familiar with them.  Suppose we want to join the PartSupp and
Supplier table on ps_suppkey = s_suppkey.  The s_suppkey is a primary key
of the Supplier table and it is stored in the Accumulo row.  The ps_suppkey
is neither a key nor stored in the row of the PartSupp table.  (The
PartSupp table's row is a UUID.)

Is the hash-join strategy to (1) scan tuples (whole rows) from PartSupp to
a Presto worker, (2) for a batch of PartSupp tuples fetch the
matching Supplier tuples, (3) repeat until all tuples are read from
PartSupp?

Regards, Dylan

On Mon, Jun 13, 2016 at 8:24 AM, Adam J. Shook <adamjsh...@gmail.com> wrote:

> A few clarifications:
>
> - Presto supports hash-based distributed joins as well as broadcast joins
>
> - Presto metadata is stored in ZooKeeper, but metadata storage is pluggable
> and could be stored in Accumulo instead
>
> - The connector does use tablet locality when scanning Accumulo, but our
> testing has shown you get better performance by giving Accumulo and Presto
> their own dedicated machines, making locality a moot point.  This will
> certainly change based on types of queries, data sizes, network quality,
> etc.
>
> - You can insert the results of a query into a Presto table using INSERT
> INTO foo SELECT ..., as well as create a table from the results of a query
> (CTAS).  Though, for large inserts, it is typically best to bypass the
> Presto layer and insert directly into the Accumulo tables using the
> PrestoBatchWriter API
>
> Cheers,
> --Adam
>
> On Mon, Jun 13, 2016 at 7:20 AM, Christopher <ctubb...@apache.org> wrote:
>
> > Thanks for that summary, Dylan! Very helpful.
> >
> > On Mon, Jun 13, 2016, 01:36 Dylan Hutchison <dhutc...@cs.washington.edu>
> > wrote:
> >
> > > Thanks for sharing Sean.  Here are some notes I wrote after reading the
> > > article on Presto-Accumulo design.  I have a research interest in the
> > > relationship between relational (SQL) and non-relational (Accumulo)
> > > systems, so I couldn't resist reading the post in detail.
> > >
> > >    - Places the primary key in the Accumulo row.
> > >    - Performs row-at-a-time processing (each tuple is one row in
> > >    Accumulo) using WholeRowIterator behavior.
> > >    - Relational table metadata is stored in the Presto infrastructure
> (as
> > >    opposed to an Accumulo table).
> > >    - Supports the creation of index tables for any attributes. These
> > >    index tables speed up queries that filter on indexed attributes.  It
> > is
> > >    standard secondary indexing, which provides speedups when the
> > selectivity
> > >    of the query is roughly <10% of the original table.
> > >    - Only database->client querying is supported.  You cannot run
> "select
> > >    ... into result_table".
> > >    - As far as I can see, Presto only has one join strategy: *broadcast
> > >    join*.  The right table of every join is scanned into one of the
> > >    Presto worker's memory.  Subsequently the size of the right table is
> > >    limited by worker memory.
> > >    - There is one Presto worker for each Accumulo tablet, which enables
> > >    good scaling.
> > >    - The Presto bridge classes track internal Accumulo information such
> > >    as the assignment of tablets to tablet servers by reading Accumulo's
> > >    Metadata table. Presto uses tablet locations to provide better
> > locality.
> > >    - The Presto bridge comes with several Accumulo server-side
> iterators
> > >    for filtering and aggregating.
> > >    - The code is quite nice and clean.
> > >
> > > This image below gives Presto's architecture.  Accumulo takes the role
> of
> > > the DB icon in the bottom-right corner.
> > >
> > > [image: Inline image 2]
> > >
> > > Bloomberg ran 13 out of the 22 TPC-H queries.  There is no fundamental
> > > reason why they cannot run all the queries; they just have not
> > implemented
> > > everything required ('exists' clauses, non-equi join, etc.).
> > >
> > > The interface looks like this, though they use a compiled java jar to
> > > insert entries from a csv file (it wraps around a BatchWriter).
> > >
> > > [image: Inline image 3]
> > >
> > > Here are performance results.  They don't say what hardware or data
> sizes
> > > they use.  Whatever it is, they must have the ability to fit the
> smaller
> > > table of any join into memory as a result of Presto's broadcast join
> > > strategy.  The strong scaling looks very nice.
> > >
> > > [image: Inline image 4]
> > >
> > > They have one other plot that shows how secondary indexing speeds up
> some
> > > queries with low selectivity.
> > >
> > > Cheers, Dylan
> > >
> > >
> > >
> > > On Sun, Jun 12, 2016 at 7:06 PM, Sean Busbey <bus...@cloudera.com>
> > wrote:
> > >
> > >> Bloomberg have a post about a connector they made to query Accumulo
> from
> > >> Presto:
> > >>
> > >>
> > >>
> >
> http://www.bloomberg.com/company/announcements/open-source-at-bloomberg-reducing-application-development-time-via-presto-accumulo/
> > >>
> > >> --
> > >> Sean Busbey
> > >>
> > >
> > >
> >
>

Reply via email to