Re: "broadcast" tablet replication for kudu?

Todd Lipcon Wed, 24 Apr 2019 09:36:22 -0700

Hey Boris,

Sorry to say that the situation is still the same.


-Todd

On Wed, Apr 24, 2019 at 9:02 AM Boris Tyukin <bo...@boristyukin.com> wrote:

> Sorry to revive the old thread, but I'm curious whether there is a better
> solution a year later. We have a few small tables (under 300k rows) which are
> used in practically every query and, to make things worse, joined more than
> once in the same query.
>
> Is there a way to replicate these tables on every node to improve
> performance and avoid broadcasting them every time?
>
> On Mon, Jul 23, 2018 at 10:52 AM Todd Lipcon <t...@cloudera.com> wrote:
>
>>
>>
>> On Mon, Jul 23, 2018, 7:21 AM Boris Tyukin <bo...@boristyukin.com> wrote:
>>
>>> Hi Todd,
>>>
>>> Are you saying that your earlier comment below is no longer valid with
>>> Impala 2.11 and if I replicate a table to all our Kudu nodes Impala can
>>> benefit from this?
>>>
>>
>> No, the earlier comment is still valid. Just saying that in some cases
>> exchange can be faster in the new Impala version.
>>
>>
>>> "
>>> *It's worth noting that, even if your table is replicated, Impala's
>>> planner is unaware of this fact and it will give the same plan regardless.
>>> That is to say, rather than every node scanning its local copy, instead a
>>> single node will perform the whole scan (assuming it's a small table) and
>>> broadcast it from there within the scope of a single query. So, I don't
>>> think you'll see any performance improvements on Impala queries by
>>> attempting something like an extremely high replication count.*
>>>
>>> *I could see bumping the replication count to 5 for these tables since
>>> the extra storage cost is low and it will ensure higher availability of the
>>> important central tables, but I'd be surprised if there is any measurable
>>> perf impact.*
>>> "
>>>
>>> On Mon, Jul 23, 2018 at 9:46 AM Todd Lipcon <t...@cloudera.com> wrote:
>>>
>>>> Are you on the latest release of Impala? It switched from using Thrift
>>>> for RPC to a new implementation (actually borrowed from kudu) which might
>>>> help broadcast performance a bit.
>>>>
>>>> Todd
>>>>
>>>> On Mon, Jul 23, 2018, 6:43 AM Boris Tyukin <bo...@boristyukin.com>
>>>> wrote:
>>>>
>>>>> sorry to revive the old thread but I am curious if there is a good way
>>>>> to speed up requests to frequently used tables in Kudu.
>>>>>
>>>>> On Thu, Apr 12, 2018 at 8:19 AM Boris Tyukin <bo...@boristyukin.com>
>>>>> wrote:
>>>>>
>>>>>> Bummer. After reading your conversation, I wish there was an easier
>>>>>> way. We will have the same issue, as we have a few dozen tables
>>>>>> which are used very frequently in joins, and I was hoping there was an
>>>>>> easy way to replicate them on most of the nodes to avoid broadcasts
>>>>>> every time.
>>>>>>
>>>>>> On Thu, Apr 12, 2018 at 7:26 AM, Clifford Resnick <
>>>>>> cresn...@mediamath.com> wrote:
>>>>>>
>>>>>>> The table in our case is 12x hashed and ranged by month, so the
>>>>>>> broadcasts were often to all (12) nodes.
>>>>>>>
>>>>>>> On Apr 12, 2018 12:58 AM, Mauricio Aristizabal <mauri...@impact.com>
>>>>>>> wrote:
>>>>>>> Sorry I left that out Cliff, FWIW it does seem to have been
>>>>>>> broadcast..
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Not sure though how a shuffle would be much different from a
>>>>>>> broadcast if entire table is 1 file/block in 1 node.
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018 at 8:52 PM, Cliff Resnick <cre...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> From the screenshot it does not look like there was a broadcast of
>>>>>>>> the dimension table(s), so it could be the case here that the multiple
>>>>>>>> smaller sends help. Our dim tables are generally in the single-digit
>>>>>>>> millions and Impala chooses to broadcast them. Since the fact result
>>>>>>>> cardinality is always much smaller, we've found that forcing a
>>>>>>>> [shuffle] dimension join is actually faster, since it sends the dims
>>>>>>>> only once rather than sending all of them to all nodes. The degraded
>>>>>>>> performance of broadcast is especially obvious when the query returns
>>>>>>>> zero results. I don't have much experience here, but it does seem that
>>>>>>>> Kudu's efficient predicate scans can sometimes "break" Impala's query
>>>>>>>> plan.
>>>>>>>>
>>>>>>>> -Cliff
>>>>>>>>
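
A rough way to see the trade-off described above is to count the rows sent over the network under each strategy. The sketch below is a deliberately simplified model and not Impala's actual cost model: it assumes a broadcast sends every dimension row to every node, while a shuffle sends each dimension row and each (already filtered) fact row once. The function names and example numbers are hypothetical.

```python
def broadcast_rows_sent(dim_rows: int, num_nodes: int) -> int:
    """Rows moved when the whole dimension table is copied to every node."""
    return dim_rows * num_nodes


def shuffle_rows_sent(dim_rows: int, fact_rows: int) -> int:
    """Rows moved when both join inputs are hash-partitioned exactly once."""
    return dim_rows + fact_rows


def cheaper_strategy(dim_rows: int, fact_rows: int, num_nodes: int) -> str:
    """Pick the strategy that moves fewer rows under this toy model."""
    if shuffle_rows_sent(dim_rows, fact_rows) < broadcast_rows_sent(dim_rows, num_nodes):
        return "shuffle"
    return "broadcast"


# Hypothetical numbers: a 5M-row dimension, a fact side filtered down to
# 10k rows, and a 12-node cluster.
print(cheaper_strategy(5_000_000, 10_000, 12))
```

Under these assumptions, a large dimension joined to a small filtered fact side favors a shuffle, while a tiny dimension joined to a large fact side favors a broadcast.
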
>>>>>>>> On Wed, Apr 11, 2018 at 5:41 PM, Mauricio Aristizabal <
>>>>>>>> mauri...@impact.com> wrote:
>>>>>>>>
>>>>>>>>> @Todd not to belabor the point, but when I suggested breaking up
>>>>>>>>> small dim tables into multiple parquet files (and in this thread's
>>>>>>>>> context perhaps partitioning the kudu table, even if small, into
>>>>>>>>> multiple tablets), it was to speed up joins/exchanges, not to
>>>>>>>>> parallelize the scan.
>>>>>>>>>
>>>>>>>>> For example, recently we ran into this slow query where the 14M
>>>>>>>>> record dimension fit into a single file & block, so it got scanned on
>>>>>>>>> a single node, though still pretty quickly (300ms). However, it caused
>>>>>>>>> the join to take 25+ seconds and bogged down the entire query. See
>>>>>>>>> highlighted fragment and its parent.
>>>>>>>>>
>>>>>>>>> So we broke it into several small files the way I described in my
>>>>>>>>> previous post, and now join and query are fast (6s).
>>>>>>>>>
>>>>>>>>> -m
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Mar 16, 2018 at 3:55 PM, Todd Lipcon <t...@cloudera.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I suppose in the case that the dimension table scan makes up a
>>>>>>>>>> non-trivial portion of your workload time, then yea, parallelizing
>>>>>>>>>> the scan as you suggest would be beneficial. That said, in typical
>>>>>>>>>> analytic queries, scanning the dimension tables is very quick
>>>>>>>>>> compared to scanning the much-larger fact tables, so the extra
>>>>>>>>>> parallelism on the dim table scan isn't worth too much.
>>>>>>>>>>
>>>>>>>>>> -Todd
>>>>>>>>>>
>>>>>>>>>> On Fri, Mar 16, 2018 at 2:56 PM, Mauricio Aristizabal <
>>>>>>>>>> mauri...@impactradius.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> @Todd I know working with parquet in the past I've seen small
>>>>>>>>>>> dimensions that fit in 1 single file/block limit parallelism of
>>>>>>>>>>> join/exchange/aggregation nodes, and I've forced those dims to
>>>>>>>>>>> spread across 20 or so blocks by leveraging
>>>>>>>>>>> SET PARQUET_FILE_SIZE=8m; or similar when doing INSERT OVERWRITE to
>>>>>>>>>>> load them, which then allows these operations to parallelize across
>>>>>>>>>>> that many nodes.
>>>>>>>>>>>
>>>>>>>>>>> Wouldn't it be useful here for Cliff's small dims to be
>>>>>>>>>>> partitioned into a couple tablets to similarly improve parallelism?
>>>>>>>>>>>
>>>>>>>>>>> -m
>>>>>>>>>>>
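
The effect of a smaller target file size on parallelism can be estimated with simple arithmetic. This is only a sketch under stated assumptions: the 160 MB table size is hypothetical, real file sizes depend on encoding and compression, and the idea that parallelism is capped by the smaller of the file count and the node count is a simplification of how the planner schedules work.

```python
import math

MB = 1024 * 1024


def estimated_file_count(table_bytes: int, file_bytes: int) -> int:
    """Approximate number of files written for a given target file size."""
    return max(1, math.ceil(table_bytes / file_bytes))


def max_parallelism(table_bytes: int, file_bytes: int, num_nodes: int) -> int:
    """Simplified upper bound: one unit of work per file, at most one per node."""
    return min(estimated_file_count(table_bytes, file_bytes), num_nodes)


# Hypothetical 160 MB dimension on a 12-node cluster.
print(estimated_file_count(160 * MB, 8 * MB))    # files with an 8 MB target
print(max_parallelism(160 * MB, 256 * MB, 12))   # one large file
print(max_parallelism(160 * MB, 8 * MB, 12))     # many small files
```
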
>>>>>>>>>>> On Fri, Mar 16, 2018 at 2:29 PM, Todd Lipcon <t...@cloudera.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 16, 2018 at 2:19 PM, Cliff Resnick <
>>>>>>>>>>>> cre...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Todd,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for that explanation, as well as all the great work
>>>>>>>>>>>>> you're doing -- it's much appreciated! I just have one last
>>>>>>>>>>>>> follow-up question. Reading about BROADCAST operations (Kudu,
>>>>>>>>>>>>> Spark, Flink, etc.) it seems the smaller table is always copied in
>>>>>>>>>>>>> its entirety BEFORE the predicate is evaluated.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That's not quite true. If you have a predicate on a joined
>>>>>>>>>>>> column, or on one of the columns in the joined table, it will be
>>>>>>>>>>>> pushed down to the "scan" operator, which happens before the
>>>>>>>>>>>> "exchange". In addition, there is a feature called "runtime
>>>>>>>>>>>> filters" that can push dynamically-generated filters from one side
>>>>>>>>>>>> of the exchange to the other.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> But since the Kudu client provides a serialized scanner as
>>>>>>>>>>>>> part of the ScanToken API, why wouldn't Impala use that instead if
>>>>>>>>>>>>> it knows that the table is Kudu and the query has any type of
>>>>>>>>>>>>> predicate? Perhaps if I hash-partition the table I could force
>>>>>>>>>>>>> this (because that complicates a BROADCAST)? I guess this is really
>>>>>>>>>>>>> a question for Impala but perhaps there is a more basic reason.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Impala could definitely be smarter, just a matter of
>>>>>>>>>>>> programming Kudu-specific join strategies into the optimizer.
>>>>>>>>>>>> Today, the optimizer isn't aware of the unique properties of Kudu
>>>>>>>>>>>> scans vs other storage mechanisms.
>>>>>>>>>>>>
>>>>>>>>>>>> -Todd
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Cliff
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 4:10 PM, Todd Lipcon <
>>>>>>>>>>>>> t...@cloudera.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 12:30 PM, Clifford Resnick <
>>>>>>>>>>>>>> cresn...@mediamath.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I thought I had read that the Kudu client can configure a
>>>>>>>>>>>>>>> scan for CLOSEST_REPLICA and assumed this was a way to take
>>>>>>>>>>>>>>> advantage of data collocation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yea, when a client uses CLOSEST_REPLICA it will read a local
>>>>>>>>>>>>>> one if available. However, that doesn't influence the higher
>>>>>>>>>>>>>> level operation of the Impala (or Spark) planner. The planner
>>>>>>>>>>>>>> isn't aware of the replication policy, so it will use one of the
>>>>>>>>>>>>>> existing supported JOIN strategies. Given statistics, it will
>>>>>>>>>>>>>> choose to broadcast the small table, which means that it will
>>>>>>>>>>>>>> create a plan that looks like:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    +-------------------------+
>>>>>>>>>>>>>>                                    |                         |
>>>>>>>>>>>>>>                         +---------->build      JOIN          |
>>>>>>>>>>>>>>                         |          |                         |
>>>>>>>>>>>>>>                         |          |              probe      |
>>>>>>>>>>>>>>                  +--------------+  +-------------------------+
>>>>>>>>>>>>>>                  |              |                  |
>>>>>>>>>>>>>>                  | Exchange     |                  |
>>>>>>>>>>>>>>             +----+ (broadcast   |                  |
>>>>>>>>>>>>>>             |    |              |                  |
>>>>>>>>>>>>>>             |    +--------------+                  |
>>>>>>>>>>>>>>             |                                      |
>>>>>>>>>>>>>>       +---------+                                  |
>>>>>>>>>>>>>>       |         |                        +-----------------------+
>>>>>>>>>>>>>>       |  SCAN   |                        |                       |
>>>>>>>>>>>>>>       |  KUDU   |                        |   SCAN (other side)   |
>>>>>>>>>>>>>>       |         |                        |                       |
>>>>>>>>>>>>>>       +---------+                        +-----------------------+
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (hopefully the ASCII art comes through)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In other words, the "scan kudu" operator scans the table
>>>>>>>>>>>>>> once, and then replicates the results of that scan into the JOIN
>>>>>>>>>>>>>> operator. The "scan kudu" operator of course will read its local
>>>>>>>>>>>>>> copy, but it will still go through the exchange process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For the use case you're talking about, where the join is just
>>>>>>>>>>>>>> looking up a single row by PK in a dimension table, ideally we'd
>>>>>>>>>>>>>> be using an altogether different join strategy such as
>>>>>>>>>>>>>> nested-loop join, with the inner "loop" actually being a Kudu PK
>>>>>>>>>>>>>> lookup, but that strategy isn't implemented by Impala.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>
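
The difference between the two join strategies mentioned above can be illustrated in plain Python. This is a conceptual sketch only and uses neither Impala nor the Kudu client: `hash_join` stands for the build/probe plan that needs the whole dimension input, while `lookup_join` stands for a nested-loop join whose inner step is a primary-key lookup. The `pk_lookup` callable is a hypothetical stand-in for such a lookup, backed here by a dictionary.

```python
from typing import Callable, Iterable, List, Optional, Tuple

FactRow = Tuple[int, float]   # (dimension key, measure)
DimRow = Tuple[int, str]      # (primary key, name)
Joined = Tuple[int, float, str]


def hash_join(facts: Iterable[FactRow], dims: Iterable[DimRow]) -> List[Joined]:
    """Build a hash table from the entire dimension input, then probe it."""
    build = {pk: name for pk, name in dims}   # requires a full dimension scan
    return [(key, value, build[key]) for key, value in facts if key in build]


def lookup_join(facts: Iterable[FactRow],
                pk_lookup: Callable[[int], Optional[str]]) -> List[Joined]:
    """For each fact row, fetch only the matching dimension row by key."""
    out: List[Joined] = []
    for key, value in facts:
        name = pk_lookup(key)   # one point lookup per fact row
        if name is not None:
            out.append((key, value, name))
    return out


dims = [(1, "US"), (2, "DE")]
facts = [(1, 9.5), (2, 3.0), (3, 1.0)]
dim_by_pk = dict(dims)   # stand-in for a table indexed by primary key

print(hash_join(facts, dims))
print(lookup_join(facts, dim_by_pk.get))
```

Both functions return the same rows; the lookup variant never materializes or ships the full dimension table, which is why it is attractive when only a handful of keys are needed.
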
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If this exists then how far out of context is my
>>>>>>>>>>>>>>> understanding of it? Reading about HDFS cache replication, I do
>>>>>>>>>>>>>>> know that Impala will choose a random replica there to more evenly
>>>>>>>>>>>>>>> distribute load. But especially compared to Kudu upsert, managing
>>>>>>>>>>>>>>> mutable data using Parquet is painful. So, perhaps to sum things
>>>>>>>>>>>>>>> up, if nearly 100% of my metadata scans are single Primary Key
>>>>>>>>>>>>>>> lookups followed by a tiny broadcast, then am I really just
>>>>>>>>>>>>>>> splitting hairs performance-wise between Kudu and HDFS-cached
>>>>>>>>>>>>>>> parquet?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From:  Todd Lipcon <t...@cloudera.com>
>>>>>>>>>>>>>>> Reply-To: "user@kudu.apache.org" <user@kudu.apache.org>
>>>>>>>>>>>>>>> Date: Friday, March 16, 2018 at 2:51 PM
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To: "user@kudu.apache.org" <user@kudu.apache.org>
>>>>>>>>>>>>>>> Subject: Re: "broadcast" tablet replication for kudu?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's worth noting that, even if your table is replicated,
>>>>>>>>>>>>>>> Impala's planner is unaware of this fact and it will give the
>>>>>>>>>>>>>>> same plan regardless. That is to say, rather than every node
>>>>>>>>>>>>>>> scanning its local copy, instead a single node will perform the
>>>>>>>>>>>>>>> whole scan (assuming it's a small table) and broadcast it from
>>>>>>>>>>>>>>> there within the scope of a single query. So, I don't think you'll
>>>>>>>>>>>>>>> see any performance improvements on Impala queries by attempting
>>>>>>>>>>>>>>> something like an extremely high replication count.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I could see bumping the replication count to 5 for these
>>>>>>>>>>>>>>> tables since the extra storage cost is low and it will ensure
>>>>>>>>>>>>>>> higher availability of the important central tables, but I'd be
>>>>>>>>>>>>>>> surprised if there is any measurable perf impact.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 11:35 AM, Clifford Resnick <
>>>>>>>>>>>>>>> cresn...@mediamath.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for that, glad I was wrong there! Aside from
>>>>>>>>>>>>>>>> replication considerations, is it also recommended that the
>>>>>>>>>>>>>>>> number of tablet servers be odd?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I will check forums as you suggested, but from what I read
>>>>>>>>>>>>>>>> after searching, Impala relies on user-configured caching
>>>>>>>>>>>>>>>> strategies using HDFS cache. The workload for these tables is
>>>>>>>>>>>>>>>> very light on writes, maybe a dozen or so records per hour across
>>>>>>>>>>>>>>>> 6 or 7 tables. The size of the tables ranges from thousands to
>>>>>>>>>>>>>>>> low millions of rows, so sub-partitioning would not be required.
>>>>>>>>>>>>>>>> So perhaps this is not a typical use-case but I think it could
>>>>>>>>>>>>>>>> work quite well with kudu.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> From: Dan Burkert <danburk...@apache.org>
>>>>>>>>>>>>>>>> Reply-To: "user@kudu.apache.org" <user@kudu.apache.org>
>>>>>>>>>>>>>>>> Date: Friday, March 16, 2018 at 2:09 PM
>>>>>>>>>>>>>>>> To: "user@kudu.apache.org" <user@kudu.apache.org>
>>>>>>>>>>>>>>>> Subject: Re: "broadcast" tablet replication for kudu?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The replication count is the number of tablet servers which
>>>>>>>>>>>>>>>> Kudu will host copies on. So if you set the replication level to
>>>>>>>>>>>>>>>> 5, Kudu will put the data on 5 separate tablet servers. There's
>>>>>>>>>>>>>>>> no built-in broadcast table feature; upping the replication
>>>>>>>>>>>>>>>> factor is the closest thing. A couple of things to keep in mind:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Always use an odd replication count. This is important
>>>>>>>>>>>>>>>> due to how the Raft algorithm works. Recent versions of Kudu
>>>>>>>>>>>>>>>> won't even let you specify an even number without flipping some
>>>>>>>>>>>>>>>> flags.
>>>>>>>>>>>>>>>> - We don't test much beyond 5 replicas. It *should*
>>>>>>>>>>>>>>>> work, but you may run into issues since it's a relatively rare
>>>>>>>>>>>>>>>> configuration. With a heavy write workload and many replicas you
>>>>>>>>>>>>>>>> are even more likely to encounter issues.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's also worth checking in an Impala forum whether it has
>>>>>>>>>>>>>>>> features that make joins against small broadcast tables better.
>>>>>>>>>>>>>>>> Perhaps Impala can cache small tables locally when doing joins.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Dan
>>>>>>>>>>>>>>>>
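
The advice above about odd replication counts follows from majority-quorum arithmetic. The sketch below shows that general arithmetic only; it is not Kudu code, and the function names are made up for illustration. A write needs a majority of replicas, so an even count tolerates no more failures than the odd count just below it.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return replicas - majority(replicas)


for n in (3, 4, 5, 6):
    print(f"replicas={n} majority={majority(n)} tolerated={tolerated_failures(n)}")
```

Going from 3 to 4 replicas, or from 5 to 6, adds storage and write traffic without increasing the number of failures that can be survived.
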
>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 10:55 AM, Clifford Resnick <
>>>>>>>>>>>>>>>> cresn...@mediamath.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The problem is, AFAIK, that replication count is not
>>>>>>>>>>>>>>>>> necessarily the distribution count, so you can't guarantee all
>>>>>>>>>>>>>>>>> tablet servers will have a copy.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mar 16, 2018 1:41 PM, Boris Tyukin <
>>>>>>>>>>>>>>>>> bo...@boristyukin.com> wrote:
>>>>>>>>>>>>>>>>> I'm new to Kudu but we are also going to use Impala mostly
>>>>>>>>>>>>>>>>> with Kudu. We have a few tables that are small but used a lot.
>>>>>>>>>>>>>>>>> My plan is to replicate them more than 3 times. When you create
>>>>>>>>>>>>>>>>> a kudu table, you can specify the number of replicated copies (3
>>>>>>>>>>>>>>>>> by default) and I guess you can put there a number corresponding
>>>>>>>>>>>>>>>>> to the node count in your cluster. The downside: you cannot
>>>>>>>>>>>>>>>>> change that number unless you recreate the table.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 10:42 AM, Cliff Resnick <
>>>>>>>>>>>>>>>>> cre...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We will soon be moving our analytics from AWS Redshift to
>>>>>>>>>>>>>>>>>> Impala/Kudu. One Redshift feature that we will miss is its ALL
>>>>>>>>>>>>>>>>>> Distribution, where a copy of a table is maintained on each
>>>>>>>>>>>>>>>>>> server. We define a number of metadata tables this way since
>>>>>>>>>>>>>>>>>> they are used in nearly every query. We are considering using
>>>>>>>>>>>>>>>>>> parquet in HDFS cache for these, and Kudu would be a much better
>>>>>>>>>>>>>>>>>> fit for the update semantics but we are worried about the
>>>>>>>>>>>>>>>>>> additional contention. I'm wondering if having a Broadcast, or
>>>>>>>>>>>>>>>>>> ALL, tablet replication might be an easy feature to add to Kudu?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Cliff
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Todd Lipcon
>>>>>>>>>>>>>>> Software Engineer, Cloudera
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Todd Lipcon
>>>>>>>>>>>>>> Software Engineer, Cloudera
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Todd Lipcon
>>>>>>>>>>>> Software Engineer, Cloudera
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *MAURICIO ARISTIZABAL*
>>>>>>>>>>> Architect - Business Intelligence + Data Science
>>>>>>>>>>> mauri...@impactradius.com (m) +1 323 309 4260
>>>>>>>>>>> 223 E. De La Guerra St. | Santa Barbara, CA 93101
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Todd Lipcon
>>>>>>>>>> Software Engineer, Cloudera
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Mauricio Aristizabal
>>>>>>>>> Architect - Data Pipeline
>>>>>>>>> *M * 323 309 4260
>>>>>>>>> *E  *mauri...@impact.com  |  *W * https://impact.com
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Mauricio Aristizabal
>>>>>>> Architect - Data Pipeline
>>>>>>> *M * 323 309 4260
>>>>>>> *E  *mauri...@impact.com  |  *W * https://impact.com
>>>>>>>
>>>>>>
>>>>>>

-- 
Todd Lipcon
Software Engineer, Cloudera
