Re: [ANNOUNCE] Welcoming Yingchun Lai as a Kudu committer and PMC member

2019-06-05 Thread Mike Percy
Congrats Yingchun and welcome aboard! Regards, Mike On Wed, Jun 5, 2019 at 11:25 AM Todd Lipcon wrote: > Hi Kudu community, > > I'm happy to announce that the Kudu PMC has voted to add Yingchun Lai as a > new committer and PMC member. > > Yingchun has been contributing to Kudu for the last 6-7

Re: close Kudu client on timeout

2019-01-17 Thread Mike Percy
> Thanks, > > Alexey > > On Wed, Jan 16, 2019 at 1:31 PM Boris Tyukin > wrote: > >> sorry it is Java >> >> On Wed, Jan 16, 2019 at 3:32 PM Mike Percy wrote: >> >>> Java or C++ / Python client? >>> >>> Mike >>> >>> Se

Re: close Kudu client on timeout

2019-01-16 Thread Mike Percy
Java or C++ / Python client? Mike Sent from my iPhone > On Jan 16, 2019, at 12:27 PM, Boris Tyukin wrote: > > Hi guys, > > is there a setting on Kudu server to close/clean-up inactive Kudu clients? > > we just found some rogue code that did not close client on code completion > and

Re: kudu-client dependencies

2019-01-02 Thread Mike Percy
Hi Boris, kudu-client is a client API library designed to be embedded in a client application, and it specifies its dependencies via a Maven pom. Typically one would only want one version of a given dep on the classpath at runtime and so shipping a fat jar usually isn't done for client libraries.

Community chat on Slack on Tue Nov 13 @ 10am PDT

2018-10-24 Thread Mike Percy
Hi Kudu dev community, I'm posting this to dev@ and BCC'ing user@ -- let's follow up on the Kudu dev@ list. Following up on some previous email threads on the topic of growing the Kudu community, I would like to know if Kudu developers / interested community members would be interested in having

Re: Locks are acquired to cost much time in transactions

2018-09-18 Thread Mike Percy
Why do you think you are spending a lot of time contending on row locks? Have you tried configuring your clients to send smaller batches? This may decrease throughput on a per-client basis but will likely improve latency and reduce the likelihood of row lock contention. If you are really

Re: poor performance on insert into range partitions and scaling

2018-07-31 Thread Mike Percy
Can you post a query profile from Impala for one of the slow insert jobs? Mike On Tue, Jul 31, 2018 at 12:56 PM Tomas Farkas wrote: > Hi, > wanted share with you the preliminary results of my Kudu testing on AWS > Created a set of performance tests for evaluation of different instance > types

Re: Growing the Kudu community

2018-07-23 Thread Mike Percy
On Mon, Jul 23, 2018 at 10:46 AM Sailesh Mukil wrote: > On Tue, Jul 17, 2018 at 7:37 PM, Mike Percy wrote: > > On Tue, Jul 17, 2018 at 2:59 PM Sailesh Mukil > wrote: > > > > > A suggestion to add on to the easily downloadable pre-built packages, > is t

Re: Growing the Kudu community

2018-07-18 Thread Mike Percy
On Wed, Jul 18, 2018 at 8:52 AM Tim Robertson wrote: > Perhaps we should continue this on the dev@ list discussion I started a > few weeks back [2]? [2] > https://lists.apache.org/thread.html/ee697a022b72bbca2761b1af0581773d8fb708f701fc969bc259fc2d@%3Cdev.kudu.apache.org%3E > Sure, let's

Growing the Kudu community

2018-07-17 Thread Mike Percy
Hi Apache Kudu community, Apologies for cross-posting, we just wanted to reach a broad audience for this topic. Grant and I have been brainstorming about what we can do to grow the community of Kudu developers and users. We think Kudu has a lot going for it, but not everybody knows what it is

Re: WAL directory is full

2018-05-14 Thread Mike Percy
Hi Saeid, What version of Kudu are you running? Do you see any errors when you run "sudo -u kudu kudu cluster ksck" on the cluster? Mike On Fri, May 11, 2018 at 5:12 AM, Saeid Sattari wrote: > Hi all, > > I assigned a 100GB SSD disks to WAL on each node in my cluster.

Re: Spark Streaming + Kudu

2018-03-06 Thread Mike Percy
ess? I launched > my Spark application and tried to get the pid using which I thought I can > grab jstack trace during hang. Unfortunately, I am not able to figure out > grabbing pid for Spark application. > > Thanks, > Ravi > > On 6 March 2018 at 18:36, Mike Percy <mpe...@

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy
>> } >> >> session.apply(upsert); >> } catch (Exception e) { >> logger.error("Exception during upsert:", e); >> } >> } >> } >> } >> class KuduConnection { >> private static Logger logger = LoggerFactory.getLogger(KuduCo >> nnection.cla

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy
> in the KuduClient when retrying to connect to Tablet Server. My > getPendingErrors is not getting ahold of these exceptions. > > Let me know if you need more clarification. I can post some Snippets. > > Thanks, > Ravi > > On 5 March 2018 at 13:18, Mike Percy <mpe.

Re: Spark Streaming + Kudu

2018-03-05 Thread Mike Percy
Hi Ravi, are you using AUTO_FLUSH_BACKGROUND ? You mention that you are trying to use getPendingErrors()

Re: swap data in Kudu table

2018-02-23 Thread Mike Percy
Hi Boris, those are good ideas. Currently Kudu does not have atomic bulk load capabilities or staging abilities. Theoretically renaming a partition atomically shouldn't be that hard to implement, since it's just a master metadata operation which can be done atomically, but it's not yet

Re: [ANNOUNCE] New committers over past several months

2017-12-18 Thread Mike Percy
Well deserved for all! Congratulations belated and otherwise to Andrew, Grant, and Hao! Mike > On Dec 18, 2017, at 9:00 PM, Todd Lipcon wrote: > > Hi Kudu community, > > I'm pleased to announce that the Kudu PMC has voted to add Andrew Wong, > Grant Henke, and Hao Hao as

[ANNOUNCE] Apache Kudu 1.6.0 released

2017-12-07 Thread Mike Percy
The Apache Kudu team is happy to announce the release of Kudu 1.6.0. Kudu is an open source storage engine for structured data that supports low-latency random access together with efficient analytical access patterns. It is designed within the context of the Apache Hadoop ecosystem and supports

Re: Confused where to post user type questions

2017-11-29 Thread Mike Percy
asts featuring you :) you did a really great job > explaining Kudu's role in Big Data ecosystem, I enjoyed both episodes and > they were one year apart I think so it was interesting to see how Kudu had > been evolving over the past year. > > Thanks, > Boris > > On Wed

Re: [DISCUSS] Move Slack discussions to ASF official slack?

2017-10-23 Thread Mike Percy
Users will likely be confused if they have to switch Slack instances. We switched over to ASF mailing lists over a year ago and we still get requests to join the old pre-ASF user mailing list sometimes. Unfortunately the Slack-In inviter bot doesn’t allow you to invite people to a particular

Re: Change Data Capture (CDC) with Kudu

2017-09-22 Thread Mike Percy
Franco, I just realized that I suggested something you mentioned in your initial email. My mistake for not reading through to the end. It is probably the least-worst approach right now and it's probably what I would do if I were you. Mike On Fri, Sep 22, 2017 at 2:29 PM, Mike Percy <

Re: Change Data Capture (CDC) with Kudu

2017-09-22 Thread Mike Percy
CDC is something that I would like to see in Kudu but we aren't there yet with the underlying support in the Raft Consensus implementation. Once we have higher availability re-replication support (KUDU-1097) we will be a bit closer for a solution involving traditional WAL streaming to an external

Re: Table size is not decreasing after large amount of rows deleted.

2017-04-24 Thread Mike Percy
Yep, that's right -- currently the only thing that reclaims space taken by deleted rows is a RowSet merge compaction. We haven't added any logic to trigger those based on the number of deleted rows in a RowSet; they are currently only triggered by logic which tries to merge RowSets with

Re: Number of data files and opened file descriptors are not decreasing after DROP TABLE.

2017-04-24 Thread Mike Percy
Hi Jason, I would strongly recommend upgrading to Kudu 1.3.1 as 1.3.0 has a serious data-loss bug related to re-replication. Please see https://kudu.apache.org/releases/1.3.1/docs/release_notes.html (if you are using the Cloudera version of 1.3.0, no need to worry because it includes the fix for

Re: Kudu on top of Alluxio

2017-03-27 Thread Mike Percy
+1 thanks for adding that Todd. Mike On Mon, Mar 27, 2017 at 9:55 AM, Todd Lipcon <t...@cloudera.com> wrote: > On Sat, Mar 25, 2017 at 2:54 PM, Mike Percy <mpe...@apache.org> wrote: > >> Kudu currently relies on local storage on a POSIX file system. Right now >>

Re: Spark on Kudu Roadmap

2017-03-27 Thread Mike Percy
Hi Ben, Is there anything in particular you are looking for? Thanks, Mike On Mon, Mar 27, 2017 at 9:48 AM, Benjamin Kim wrote: > Hi, > > Are there any plans for deeper integration with Spark especially Spark > SQL? Is there a roadmap to look at, so I can know what to expect

Re: Kudu on top of Alluxio

2017-03-25 Thread Mike Percy
keep our > cluster size to a minimum and not need to add more nodes based on storage > capacity. We would only need to size our clusters based on load (cores, > memory, bandwidth) instead. > > Cheers, > Ben > > >> On Mar 25, 2017, at 2:54 PM, Mike Percy <mpe...

Re: [ANNOUNCE] Two new Kudu committer/PMC members

2016-09-12 Thread Mike Percy
Congrats Alexey and Will! Great work. Best, Mike On Mon, Sep 12, 2016 at 3:55 PM, Todd Lipcon wrote: > Hi Kudu community, > > It's my great pleasure to announce that the PMC has voted to add both > Alexey Serbin and Will Berkeley as committers and PMC members. > > Alexey has

Re: Where can we Use Apache Kudu?

2016-08-05 Thread Mike Percy
Hi Darshan, You should be able to use Kudu as an additional store alongside HDFS and Phoenix. Your data scientists should be able to do joins across HDFS, HBase, and Kudu using Spark. You could also use Apache Impala (incubating) to do those joins, however Impala does not support accessing