Re: Kudu 1.17.1 release proposal

2024-07-11 Thread Alexey Serbin
Hi Abhishek,

Thank you very much for taking care of the maintenance release!

Indeed -- it's been long overdue, and I think we need to release
Kudu 1.17.1 soon.  There were a few inquiries about releasing
Kudu 1.17.1 in the #kudu-general Slack channel, so at least
that is the confirmation that the maintenance release is needed.


Kind regards,

Alexey

On Thu, Jul 11, 2024 at 2:15 PM Abhishek Chennaka 
wrote:

> Hi All,
>
> It's been close to ten months since the last Kudu release and we have
> accumulated several maintenance patches. Hence I'm proposing a new release -
> Kudu 1.17.1 - and planning to handle the release process. Let me know your
> thoughts.
>
> Thanks,
> Abhishek
>


Re: [ANNOUNCE] Welcoming Márton Greber as Kudu committer and PMC member

2023-11-14 Thread Alexey Serbin
Congratulations, Márton!

On Tue, Nov 14, 2023 at 9:37 AM Andrew Wong  wrote:

> Hi Kudu community,
>
> I'm happy to announce that the Kudu PMC has voted to add Márton Greber as a
> new committer and PMC member.
>
> Some of Márton's contributions include:
> - Getting Kudu to build and run on Apple silicon
> - Improving feature parity of the Python client with a number of features
> - Various bug fixes around the codebase
>
> Please join me in congratulating Márton!
>


Re: [VOTE] Apache Kudu 1.17.0-RC2

2023-08-29 Thread Alexey Serbin
Thanks a lot for taking care of release management, Yingchun!

+1 for releasing Kudu 1.17.0 from the 1.17.0-RC2 tag in the git repository.

I did the following to verify the functionality of the RC2 candidate on
CentOS 7.9:
  * built C++ (DEBUG and RELEASE configurations) and Java parts of the
project: success
  * ran the C++ tests: see below for details
  * ran a smoke test scenario using the 'kudu perf loadgen' CLI tool against
a small POC cluster (RELEASE binaries): success

The only failed test scenarios were in the test suites listed below, and
all of them seem to be test-only issues so far (long startup times for the
Ranger process, etc.):
  161:master_authz-itest.4
  162:master_authz-itest.5
  163:master_authz-itest.6
  164:master_authz-itest.7
  194:security-itest

I did the following to verify the functionality of the RC2 candidate on
RHEL 8.8:
  * built C++ (DEBUG and RELEASE configurations) and Java parts of the
project: success
  * ran the C++ tests (DEBUG): see below for details
  * ran a smoke test scenario using the 'kudu perf loadgen' CLI tool against
a small POC cluster (RELEASE binaries): success

The only failed test scenarios were in the test suites listed below, and
all of them seem to be test-only issues so far (long startup times for the
Ranger process, etc.):
  161:master_authz-itest.4
  162:master_authz-itest.5
  163:master_authz-itest.6
  164:master_authz-itest.7
  173:master-stress-test.0
  194:security-itest
  286:subprocess_server-test
  453:threadpool-test

I did the following to verify the functionality of the RC2 candidate on
Ubuntu 20.04.6 LTS (focal):
  * built C++ (DEBUG and RELEASE configurations) and Java parts of the
project: success
  * ran the C++ tests: see below for details
  * ran a smoke test scenario using the 'kudu perf loadgen' CLI tool against
a small POC cluster (RELEASE binaries): success

The only failed test scenarios were in the test suites listed below, and
all of them seem to be test-only issues so far (Ranger KMS-related issues):
  194:security-itest

The test scenarios that failed on all the platforms I tested are the recently
introduced ones involving Ranger KMS.  That seems to be a test-only issue
as well,
and I filed https://issues.apache.org/jira/browse/KUDU-3507 to track that.
I'll try
to report and probably also fix the other test-only issues reported above
when I have a chance.
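
For reference, the per-platform verification above boils down to roughly the
following sketch (the RC tag is real, but the build parallelism, the build
directory layout, and the master address of the small POC cluster are
placeholders):

  git checkout 1.17.0-RC2
  ./thirdparty/build-if-necessary.sh
  mkdir -p build/release && cd build/release
  ../../thirdparty/installed/common/bin/cmake -DCMAKE_BUILD_TYPE=release ../..
  make -j8
  ctest -j8
  # Smoke test against a small POC cluster (placeholder master address):
  ./bin/kudu perf loadgen master-1.example.com:7051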


Kind regards,

Alexey

On Sun, Aug 27, 2023 at 8:00 AM Yingchun Lai  wrote:

> Hello Kudu devs!
>
> The Apache Kudu team is happy to announce the second release candidate
> for Apache
> Kudu 1.17.0.
>
> Apache Kudu 1.17.0 is a minor release that offers many improvements and
> fixes
> since Apache Kudu 1.16.0.
>
> This is a source-only release. The artifacts have been staged here:
> https://dist.apache.org/repos/dist/dev/kudu/1.17.0-RC2
>
> Java convenience binaries in the form of a Maven repository are staged
> here:
> https://repository.apache.org/content/repositories/orgapachekudu-1113
>
> Linux (built on CentOS 7) and macOS (built on Ventura) test-only Kudu
> binary
> JAR artifacts are staged here:
> https://repository.apache.org/content/repositories/orgapachekudu-1112
> https://repository.apache.org/content/repositories/orgapachekudu-1114
>
> It is tagged in Git as 1.17.0-RC2 and signed with my key (found in the
> KEYS file
> below). Its commit hash is 4a7700bdcff2bf2afe4e46efdb8874484fd66b86; you can
> check it out from ASF Gitbox or the official GitHub mirror:
>
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=4a7700bdcff2bf2afe4e46efdb8874484fd66b86
> https://github.com/apache/kudu/releases/tag/1.17.0-RC2
>
> The KEYS file to verify the artifact and tag signatures can be found here:
> https://dist.apache.org/repos/dist/release/kudu/KEYS
>
> The release notes can be found here:
> https://github.com/apache/kudu/blob/1.17.0-RC2/docs/release_notes.adoc
>
> I'd suggest going through the release notes, building Kudu, and running
> the unit
> tests. Testing out the Maven repo would also be appreciated. Also, it's
> worth
> running Kudu Java tests against kudu-binary JAR artifact as described in
> the
> commit message here:
>
> https://github.com/apache/kudu/commit/8a6faaa93f3e206ac75e8087731daccaf7ab646a
>
> The vote will run until a majority[1] is achieved, but at least until
> Wednesday
> Aug 30th 20:00:00 UTC 2023.
>
> Thank You,
> Yingchun Lai
>
> [1] https://www.apache.org/foundation/voting.html#ReleaseVotes
>


Re: [VOTE] Apache Kudu 1.17.0-RC1

2023-08-03 Thread Alexey Serbin
Thank you for taking care of the release management for Kudu 1.17.0,
Yingchun!

+1 for releasing Kudu 1.17.0 as per the information above.

I verified the 1.17.0-RC1 tag is set on the 34243b3d0 changelist in the
upstream git repo.

I did the following to verify the functionality of the new 1.17.0 release:
* checked out the source from the github.com git repo mirror
* built Kudu's C++ components and tests in the RELEASE configuration on a
CentOS Linux release 7.9.2009 (Core) node
* ran the Kudu tests using the 'ctest -j2' command
* all tests passed except for a few scenarios involving Ranger and Ranger KMS
in the following tests, but those seem to be test-only issues with the
mini_ranger_kms and mini_ranger wrappers (those scenarios pass in gerrit
pre-commit tests):
** master_authz-itest
** security-itest

I also ran a smoke test, starting a small Kudu cluster and executing `kudu
perf loadgen` against it,
and it completed successfully.


Kind regards,

Alexey

On Wed, Aug 2, 2023 at 5:54 AM Yingchun Lai  wrote:

> Hello Kudu devs!
>
> The Apache Kudu team is happy to announce the first release candidate for
> Apache
> Kudu 1.17.0.
>
> Apache Kudu 1.17.0 is a minor release that offers many improvements and
> fixes
> since Apache Kudu 1.16.0.
>
> This is a source-only release. The artifacts have been staged here:
> https://dist.apache.org/repos/dist/dev/kudu/1.17.0-RC1
>
> Java convenience binaries in the form of a Maven repository are staged
> here:
> https://repository.apache.org/content/repositories/orgapachekudu-1104
>
> Linux (built on CentOS 7) and macOS (built on Ventura) test-only Kudu
> binary
> JAR artifacts are staged here:
> https://repository.apache.org/content/repositories/orgapachekudu-1105
> https://repository.apache.org/content/repositories/orgapachekudu-1106
>
> It is tagged in Git as 1.17.0-RC1 and signed with my key (found in the
> KEYS file
> below). Its commit hash is 34243b3d0777597862aa6d3b51fd401f72d8bbf2; you can
> check it out from ASF Gitbox or the official GitHub mirror:
>
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=34243b3d0777597862aa6d3b51fd401f72d8bbf2
> https://github.com/apache/kudu/releases/tag/1.17.0-RC1
>
> The KEYS file to verify the artifact and tag signatures can be found here:
> https://dist.apache.org/repos/dist/release/kudu/KEYS
>
> The release notes can be found here:
> https://github.com/apache/kudu/blob/1.17.0-RC1/docs/release_notes.adoc
>
> I'd suggest going through the release notes, building Kudu, and running
> the unit
> tests. Testing out the Maven repo would also be appreciated. Also, it's
> worth
> running Kudu Java tests against kudu-binary JAR artifact as described in
> the
> commit message here:
>
> https://github.com/apache/kudu/commit/8a6faaa93f3e206ac75e8087731daccaf7ab646a
>
> The vote will run until a majority[1] is achieved, but at least until
> Saturday
> Aug 5th 23:00:00 UTC 2023.
>
> Thank You,
> Yingchun Lai
>
> [1] https://www.apache.org/foundation/voting.html#ReleaseVotes
>


Re: [DISCUSS] Kudu 1.17.0 Release Proposal

2023-07-27 Thread Alexey Serbin
Hi Yingchun,

This sounds good to me.
Thank you for taking care of Kudu 1.17.0 release management!


Kind regards,

Alexey



On Thu, Jul 27, 2023 at 8:55 AM Yingchun Lai 
wrote:

> Hello Kudu developers,
>
> It seems the main features that needed to squeeze into this
> release have been merged.
> I'm planning to create the 1.17.0-RC1 tag on the branch-1.17.x around
> 4pm Aug 1st, UTC time, leaving a few extra days because of the coming
> weekend. Please don't hesitate to cherry-pick bug fixes onto this branch.
>
> Best regards,
> Yingchun Lai
>
>
> On Fri, Mar 3, 2023 at 2:29 PM Abhishek Chennaka
>  wrote:
> >
> > Hi Yingchun,
> >
> > As discussed over slack we would ideally want to have
> > https://gerrit.cloudera.org/#/c/19445/ for a stable release. I’m
> working on
> > getting this merged as soon as possible (~2 weeks).
> >
> > Thanks,
> > Abhishek
> >
> > On Sun, Feb 19, 2023 at 7:58 AM Yingchun Lai 
> wrote:
> >
> > > Hello Kudu developers,
> > >
> > > The new 1.17.x branch is available for use (it's named branch-1.17.x).
> > > Let me know if there are any pending CRs that are hoping to land in
> > > the 1.17.0 release, and when ready, please cherry-pick onto
> > > branch-1.17.x in Gerrit.
> > > I have put up a review for the release notes
> > > (https://gerrit.cloudera.org/c/19512/) and will put together an RC1 ASAP.
> > > Thanks to everyone who contributed!
> > >
> > > As always, if you have questions or concerns, don't hesitate to reach
> out.
> > >
> > >
> > > Best regards,
> > > Yingchun
> > >
>


Re: Welcoming Yuqi Du as Kudu committer and PMC member

2023-06-06 Thread Alexey Serbin
Congratulations, Yuqi!

Thank you for your contribution to the project!
I look forward to seeing more of that coming from you as a committer and
PMC member :)


Kind regards,

Alexey

On Tue, Jun 6, 2023 at 10:19 AM Marton Greber  wrote:

> Congrats Yuqi!
>
> Andrew Wong  ezt írta (időpont: 2023. jún. 6., K,
> 19:11):
>
> > Hi Kudu community,
> >
> > I'm happy to announce that the Kudu PMC has voted to add Yuqi Du as a
> > new committer and PMC member.
> >
> > Just some of Yuqi's contributions include:
> > - Designing and implementing automatic partition leader rebalancing
> > - Adding several bug fixes, performance improvements, and tooling all
> > across the codebase
> > - Driving a design for replication to external systems from Kudu
> >
> > Please join me in congratulating Yuqi!
> >
>


Re: [design doc] KUDU-3342: Add an implementation of the block cache on high bandwidth memory

2023-03-07 Thread Alexey Serbin
Hi Sammy,

Thank you very much for working on this!

I took a quick look at the design draft and the patches.
I left a few comments on the design draft doc.

Implementation-wise, the Option 1 approach looks reasonable to me.
There are a few options for how that might be done, and I guess we are about
to converge on the details over a few review/comment iterations.

Just curious: did you happen to run any benchmarks on real hardware
with HBW memory using the POC patch you posted at
https://gerrit.cloudera.org/#/c/19498/ ?

Thanks!


Kind regards,

Alexey

On Fri, Feb 24, 2023 at 5:21 PM Nah, Sammy  wrote:

> Hi all,
>
> I'm working on a new block cache implementation (KUDU-3342<
> https://issues.apache.org/jira/browse/KUDU-3342>) and I drafted a design
> doc[1] including motivation and background.
>
> The patch that I am working on is WIP[2] but there is a PoC patch[3] that
> I've tested locally which shows that the implementation is feasible.
>
> Please review and any feedback would be appreciated.
>
> -sammy
>
> [1]
> https://docs.google.com/document/d/12zzk7clZpJZtfKoPDL6LB13jRuBSq5LZP2rVfx9T-oc/edit?usp=sharing
> [2] https://gerrit.cloudera.org/#/c/18686/
> [3] https://gerrit.cloudera.org/#/c/19498/
>
>


Re: [ANNOUNCE] Welcoming Abhishek Chennaka as Kudu committer and PMC member

2023-02-27 Thread Alexey Serbin
Congrats, Abhishek!

I'm happy to know you've accepted the invitation, and I look forward
to your continued contributions to the project.


Kind regards,

Alexey

On Mon, Feb 27, 2023 at 9:18 AM Abhishek Chennaka 
wrote:

> Thank you all a ton for your appreciation. I'll try to keep contributing
> more and more.
>
> Regards,
> Abhishek
>
> On Mon, Feb 27, 2023 at 5:27 AM Zoltan Chovan 
> wrote:
>
>> Congrats Abhishek!
>>
>> On Mon, Feb 27, 2023 at 1:57 PM Attila Bukor  wrote:
>>
>> > Congrats Abhishek, well deserved!
>> >
>>
>


Re: gerrit.cloudera.org brief restart tonight @ 19:00 PST

2023-02-16 Thread Alexey Serbin
The restart is complete.

On Thu, Feb 16, 2023 at 11:35 AM Alexey Serbin  wrote:

> Hi devs,
>
> We need to update Gerrit replication configuration for Kudu to account
> for the new 1.17.x branch. The downtime shouldn't be more
> than 10 minutes. This is just a small addition to a Gerrit configuration
> file and if there are any problems the changes will be reverted to the
> current version of the config file.
>
> Please reach out if you have any concerns about this.
>
> If I don't hear any protests then I won't bother sending out another email
> about this restart.
>
> Kind regards,
>
> Alexey
>


gerrit.cloudera.org brief restart tonight @ 19:00 PST

2023-02-16 Thread Alexey Serbin
Hi devs,

We need to update Gerrit replication configuration for Kudu to account
for the new 1.17.x branch. The downtime shouldn't be more
than 10 minutes. This is just a small addition to a Gerrit configuration
file and if there are any problems the changes will be reverted to the
current version of the config file.

Please reach out if you have any concerns about this.

If I don't hear any protests then I won't bother sending out another email
about this restart.

Kind regards,

Alexey


Re: [DISCUSS] Kudu 1.17.0 Release Proposal

2023-01-05 Thread Alexey Serbin
Thanks a lot, Yingchun!  Feel free to ping me as well on Slack or via
e-mail if you need any help.

For starters, you can read
https://github.com/apache/kudu/blob/master/RELEASING.adoc
to get a better understanding of the release process.  As already
mentioned, feel free
to ping me and/or Attila if you have questions.


Kind regards,

Alexey

On Thu, Jan 5, 2023 at 8:51 AM Attila Bukor  wrote:

> Thanks for volunteering, Yingchun, I’m happy to help with the release, feel
> free to DM me on Slack if you have any questions.


Re: [DISCUSS] Kudu 1.17.0 Release Proposal

2023-01-04 Thread Alexey Serbin
Any volunteers to take care of Kudu 1.17.0 release management?

With this upcoming release, people who have committership/PMC
status and haven't yet participated in Kudu release management
have a chance to do so :)

Thanks!


Kind regards,

Alexey



On Fri, Dec 9, 2022 at 11:58 AM Alexey Serbin  wrote:

> Hi Yingchun,
>
> That's a good point: it's time to plan for releasing Kudu 1.17.
> Do you know of volunteers who would be interested in running release
> management duties for 1.17.0 release?
>
>
> Thanks,
>
> Alexey
>
> On Wed, Dec 7, 2022 at 6:32 PM Yingchun Lai 
> wrote:
>
>> Hi,
>>
>> It’s been about 6 months since Kudu 1.16.0 was released on Jun 17, 2022.
>>
>> Would it be a good time to discuss and prepare for the release of Kudu
>> 1.17?
>>
>>
>> Best regards,
>> Yingchun Lai
>>
>


Re: [DISCUSS] Kudu 1.17.0 Release Proposal

2022-12-09 Thread Alexey Serbin
Hi Yingchun,

That's a good point: it's time to plan for releasing Kudu 1.17.
Do you know of volunteers who would be interested in running release
management duties for 1.17.0 release?


Thanks,

Alexey

On Wed, Dec 7, 2022 at 6:32 PM Yingchun Lai  wrote:

> Hi,
>
> It’s been about 6 months since Kudu 1.16.0 was released on Jun 17, 2022.
>
> Would it be a good time to discuss and prepare for the release of Kudu
> 1.17?
>
>
> Best regards,
> Yingchun Lai
>


Re: Issue

2022-05-25 Thread Alexey Serbin
Hi,

Scanner objects live on the server side and expire if there isn't any
activity related to those scanners
for some time (that's controlled by the --scanner_ttl_ms flag).  If you
want to keep them open/alive,
you need to issue keep-alive requests from time to time.  See [1] and [2]
for the corresponding
references in the Kudu C++ and Java APIs.


HTH,

Alexey


[1]
https://kudu.apache.org/releases/1.15.0/cpp-client-api/classkudu_1_1client_1_1KuduScanner.html#aa4a0caf7142880255d7aac1d75f33d21
[2]
https://kudu.apache.org/releases/1.15.0/apidocs/org/apache/kudu/client/KuduScanner.html#keepAlive--
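
As a server-side mitigation one can also raise the scanner TTL on the tablet
servers.  This is a sketch only: the --scanner_ttl_ms flag is real, but the
value below is just an illustration, and the client-side keep-alive calls
from [1] and [2] are the proper fix:

  # Raise the scanner TTL above its default (60 seconds, IIRC); pick a value
  # that fits the workload -- 300000 here is illustrative only:
  kudu-tserver --scanner_ttl_ms=300000 <other flags ...>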

On Thu, May 5, 2022 at 10:56 PM Shanmugabavan Shanmugakumar <
shanmugabavan.develo...@gmail.com> wrote:

> -- Forwarded message -
> From: Shanmugabavan Shanmugakumar 
> Date: Fri, May 6, 2022 at 9:44 AM
> Subject: Issue
> To: 
>
>
> I developed a Kudu CSV export tool from the 1.15 tag branch. It uses a
> command path similar to that of a table scan. When I export a lot of data
> through multiple threads, a 'scanner expired' error occurs. Can anyone
> help sort out this issue?
>


Re: [VOTE] Apache Kudu 1.16.0-RC1

2022-03-29 Thread Alexey Serbin
+1

I checked out the code for 1.16.x RC1, built the C++ code in RELEASE
and DEBUG configurations, then ran the C++ tests on CentOS Linux 7.6
(release 7.6.1810).  I also built the Java side of the project.

All builds succeeded.

All C++ tests passed except for
  MasterTest.TestStartupWebPage

That doesn't look like a show stopper to me -- after a quick look,
it seemed to be a test-only issue.


Kind regards,

Alexey

On Tue, Mar 22, 2022 at 10:51 AM Attila Bukor  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Hello Kudu devs!
>
> The Apache Kudu team is happy to announce the first release candidate for
> Apache
> Kudu 1.16.0.
>
> Apache Kudu 1.16.0 is a minor release that offers many improvements and
> fixes
> since Apache Kudu 1.15.0.
>
> This is a source-only release. The artifacts have been staged here:
> https://dist.apache.org/repos/dist/dev/kudu/1.16.0-RC1/
>
> Java convenience binaries in the form of a Maven repository are staged
> here:
> https://repository.apache.org/content/repositories/orgapachekudu-1100
>
> Linux (built on CentOS 7) and macOS (built on Catalina) test-only Kudu
> binary
> JAR artifacts are staged here:
> https://repository.apache.org/content/repositories/orgapachekudu-1095
>
> It is tagged in Git as 1.16.0-RC1 and signed with my key (found in the
> KEYS file
> below). Its commit hash is 5cd6779a073ce02e19a3794dd19657df6aeea86c; you can
> check it out from ASF Gitbox or the official GitHub mirror:
>
>
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=5cd6779a073ce02e19a3794dd19657df6aeea86c
> https://github.com/apache/kudu/releases/tag/1.16.0-RC1
>
> The KEYS file to verify the artifact and tag signatures can be found here:
> https://dist.apache.org/repos/dist/release/kudu/KEYS
>
> The release notes can be found here:
> https://github.com/apache/kudu/blob/1.16.0-RC1/docs/release_notes.adoc
>
> I'd suggest going through the release notes, building Kudu, and running
> the unit
> tests. Testing out the Maven repo would also be appreciated. Also, it's
> worth
> running Kudu Java tests against kudu-binary JAR artifact as described in
> the
> commit message here:
>
>
> https://github.com/apache/kudu/commit/8a6faaa93f3e206ac75e8087731daccaf7ab646a
>
> The vote will run until a majority[1] is achieved, but at least until
> Friday
> Mar 25th 20:00:00 CEST 2022.
>
> Thank You,
> Attila
>
> [1] https://www.apache.org/foundation/voting.html#ReleaseVotes
> -BEGIN PGP SIGNATURE-
>
> iQIzBAEBCAAdFiEEeovxOccqrV1d+vZcTUsRD2bLrfAFAmI6DCYACgkQTUsRD2bL
> rfACoRAAswVBH1cMaE7mLWzY5bt0tn02oJN6LpSGUKasVnES5VQR2TlAmVv5WVEj
> QUNztmfHZNgMd3/asR18ZIi85U2ftpM+U/GCrBvSBc8zFuRWNdCZ3i4W2i2htbee
> uJMtO6XYkvqHPnGZkAYykFQ8FiWjMIkFLatR3XL9XWJ730dboivteeOoh4AuZVdJ
> Q3kx+F1Y6Z7UwAs47Y+6YeQL9VPKHkJJm+gIqAm7oITaNl49jcFYeolN78a5ahcb
> C9J1+882SoRyQM7KiPCLM6HLyI1j4cedcSshAWQQiNT3p0tWFBtZcVlu6q5n0XRI
> lOskCssz3RTWVA8zO4lItclAh3moAEMZNwNpHeqzkN/GiAhbfjaQCrS4FJ0VVzM1
> b6KM408Ytr2I1PyGgijx1nUVzptVYtKV233SeL270pjJ1a60qy3RXaJyFZ6mdRFG
> EUkdOyG9qN96fhu/lg8oNvN3LLw6xodiRHa9a3LqUARN1d0QD5VNuZfW2DJkEPrk
> uxlxTeuwl464QV88FNcgAWZv760/22jduvTNkH3DznawQm1QgxaoeZ6g9n+Rnayk
> JsL9lWmwZ1zYg9dyb2VuwPPjIXXzWZo3gW9fe6MoLFXT8eWewg6jZUjx+CtV21Un
> qVGjoknZY/FydsNnLIWPeZqc6emAN9w4t0k7JMir0jK2WcDOGlY=
> =T5EH
> -END PGP SIGNATURE-
>
>


Re: Kudu 1.16.0 Release Proposal

2021-12-07 Thread Alexey Serbin
Hi Attila,

Thank you very much for volunteering for 1.16.0 RM work!

So far I have nothing I'd like to squeeze into 1.16.0: I have some
scattered pieces of partition-related work I thought might be useful, but
the current state of the code looks good to go without those extras.  Also,
enabling the end-to-end functionality for that would require extra work on
the Impala side anyway, and I don't think it's viable to push those before
the end of this year.


Kind regards,

Alexey

On Thu, Dec 2, 2021 at 8:36 AM Attila Bukor  wrote:

> Hi,
>
> It’s been a while since our last release, so I’d like to volunteer to
> manage the release of Kudu 1.16.0.
>
> Please let me know if there’s any outstanding work you’d like to squeeze
> into this release. If there aren’t any, I’ll cut the branch on 12/10.
>
> Attila


Re: Apache Kudu 1.15.0-RC2

2021-06-15 Thread Alexey Serbin
+1

I built a DEBUG version of 1.15.0 RC2 from sources on Ubuntu 18.04 and
CentOS 8.2 and ran tests for the backend components using ctest.

Ubuntu 18.04: all passed except for:
   master_authz-itest:
AuthzProvidersWithOwner/MasterAuthzOwnerITest.TestMismatchedTable/Ranger_owner

CentOS 8.2: all passed except for:
  master_hms-itest: MasterHmsTest.TestAlterTableOwner, etc.
  kudu-tool-test: ToolTest.TestHmsIgnoresDifferentMasters (see below).

Those Ranger-related tests in master_authz-itest are known to be a bit
flaky; it's not a big deal:
http://dist-test.cloudera.org:8080/test_drilldown?test_name=master_authz-itest

The root cause of the test failures in the scenarios from master_hms-itest
and from kudu-tool-test on CentOS 8.2 is the use of a 768-bit RSA key for the
Kudu IPKI CA in tests -- I'll need to update Ranger's JVM settings to allow
for less secure cipher/algorithm constraints.  It's a test-only failure
related to tighter security settings on contemporary Linux distros, and so
not a big deal.  The funny thing is that not a single run of my RC
verification has gone without failing at least one of the HMS scenarios since
the HMS tests were introduced, and I see a similar pattern for this RC :)

Overall, those test failures are related to test-only defects and known
flaky tests, so this RC is good to go, IMO.


/Alexey

On Tue, Jun 15, 2021 at 8:33 PM Bankim Bhavsar  wrote:

> Gentle reminder for voting on 1.15.0 RC2.
>
> -Bankim
>
> On Thu, Jun 10, 2021 at 3:16 PM Bankim Bhavsar  wrote:
>
> > Hello Kudu devs!
> >
> > The Apache Kudu team is happy to announce the second release candidate
> for
> > Apache Kudu 1.15.0.
> >
> > Apache Kudu 1.15.0 is a minor release that offers many improvements and
> > fixes since Apache Kudu 1.14.0.
> >
> > This is a source-only release. The artifacts have been staged here:
> > https://dist.apache.org/repos/dist/dev/kudu/1.15.0-RC2/
> >
> > Branch: branch-1.15.x
> >
> > Java convenience binaries in the form of a Maven repository are staged
> > here:
> > https://repository.apache.org/content/repositories/orgapachekudu-1092/
> >
> > Linux test-only Kudu binary JAR artifacts are staged here:
> > https://repository.apache.org/content/repositories/orgapachekudu-1093/
> >
> > MacOS test-only Kudu binary JAR artifacts are staged here:
> > https://repository.apache.org/content/repositories/orgapachekudu-1094/
> >
> > It is tagged in Git as 1.15.0-RC2 and the corresponding hash is the
> > following:
> >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=64619147e7932518bb88b3f3f719d9c27cfdae68
> >
> > The Release Notes notes can be found here:
> >
> https://github.com/apache/kudu/blob/branch-1.15.x/docs/release_notes.adoc
> >
> > The KEYS file to verify the artifact signatures can be found here:
> > https://dist.apache.org/repos/dist/release/kudu/KEYS
> >
> > Some common release validations include building Kudu, and running the
> unit
> > tests on your platforms and environments. Additionally it is worth
> running
> > Kudu Java tests against kudu-binary JAR artifact using
> > `-PuseBinJar=`
> > as described in the commit message here:
> >
> >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a
> >
> > The vote will run until a majority[1] is achieved, but at least until
> > Tuesday
> > June 15th 2021, to give everyone a chance to review this release
> candidate
> > and
> > vote.
> >
> > Thank You,
> > Bankim.
> >
> > [1] https://www.apache.org/foundation/voting.html#ReleaseVotes
> >
>


Re: Kudu 1.15 Release Proposal

2021-05-04 Thread Alexey Serbin
Thank you for taking care of this upcoming release, Bankim!

Branching on Friday May 14th looks good to me.


Kind regards,

Alexey


On Tue, May 4, 2021 at 10:11 AM Bankim Bhavsar  wrote:

> Hello Kudu devs,
>
> It has come to my attention that many members of the dev@kudu.apache.org
> mailing list didn't receive the first communication of Kudu 1.15 release
> proposal.
> Hence I'm postponing the branching date for 1.15.0 from Monday May 10th
> 2021 to Friday May 14th 2021 and resending the email from my apache email
> address.
>
> Let me know if you've received this email.
>
> Thanks,
> -Bankim.
>
> On Mon, May 3, 2021 at 11:23 AM Andrew Wong 
> wrote:
>
> > Thanks for volunteering Bankim!
> >
> > +1, branching next Monday sounds good to me.
> >
> > On Tue, Apr 27, 2021 at 10:02 AM Bankim Bhavsar
> > 
> > wrote:
> >
> > > Hello Kudu devs,
> > >
> > > It's been almost 3 months since Kudu 1.14.0 was released on Jan 29th
> > 2021,
> > > I
> > > would like to volunteer to manage the Kudu 1.15.0 release.
> > >
> > > Based on previous experience, I'd like the window to be a bit longer
> > > to allow finishing any in-progress work that you'd like to squeeze into
> > > this
> > > release. I propose cutting the branch on Monday May 10th 2021.
> > >
> > > I will send follow up emails with release notes details.
> > >
> > > Thanks,
> > > Bankim.
> > >
> >
> >
> > --
> > Andrew Wong
> >
>


Re: [VOTE] Apache Kudu 1.14.0-RC1

2021-01-25 Thread Alexey Serbin
+1

I checked out 1.14.0-RC1 and built Kudu from source in the DEBUG configuration
on a CentOS 7.4 machine, as documented at
https://kudu.apache.org/docs/installation.html#rhel_from_source

I ran the tests using "ctest -j4", and all tests passed:

100% tests passed, 0 tests failed out of 446

Label Time Summary:
no_dist_test=  31.15 sec
no_tsan =   9.86 sec

Total Test time (real) = 5015.88 sec


I also ran "./gradlew assemble" and "./gradlew test" in $KUDU_ROOT/java
subdirectory.  Everything passed except
for org.apache.kudu.client.ITClientStress.  The latter failed with
OutOfMemoryError: Java heap space error since I had limited amount of RAM
on the server.

BUILD SUCCESSFUL in 1m 36s
139 actionable tasks: 124 executed, 15 up-to-date



org.apache.kudu.client.ITClientStress >
testManyShortClientsGeneratingScanTokens FAILED
java.lang.AssertionError: java.lang.OutOfMemoryError: Java heap space
at
org.apache.kudu.client.ITClientStress.runTasks(ITClientStress.java:94)
at
org.apache.kudu.client.ITClientStress.testManyShortClientsGeneratingScanTokens(ITClientSt
ress.java:116)

Caused by:
java.lang.OutOfMemoryError: Java heap space


I guess the latter isn't a release blocker since my server wasn't provisioned
with enough memory for the stress test, where the JVM is expected to use a
lot of memory.

I agree with Grant that the issue pointed out by Greg isn't a regression and
does not look like a release stopper, but it would be nice to update the
README if it's decided to cut RC2 due to some other issue.


Kind regards,

Alexey

On Thu, Jan 21, 2021 at 9:11 AM Greg Solovyev
 wrote:

> Should we remove this from the README until it works?
> Greg
>
> On Thu, Jan 21, 2021 at 6:19 AM Grant Henke 
> wrote:
>
> > Yeah, I also was unable to build any coverage report using the --html
> > option.
> > I am not sure the HTML report has worked in recent releases and would not
> > consider
> > it a regression for this release but instead a "build only" thing to fix
> > going forward.
> >
> > On Wed, Jan 20, 2021 at 5:43 PM Greg Solovyev
> >  wrote:
> >
> > > I built in debug mode on Ubuntu 18, and all C++ tests passed. I generated
> > > a coverage report, but following the instructions here
> > > https://github.com/apache/kudu/tree/1.14.0-RC1/Readme.adoc to convert
> > > the coverage report into HTML resulted in the following error:
> > >
> > > gsolovyev@greg-laptop:~/git/kudu/build/coverage$
> > > ../../thirdparty/installed/common/bin/gcovr
> > > --gcov-executable=$(pwd)/../../build-support/llvm-gcov-wrapper --html
> > > --html-details -o cov_html/coverage.html
> > > Traceback (most recent call last):
> > >   File "../../thirdparty/installed/common/bin/gcovr", line 1767, in
> > > <module>
> > > print_html_report(covdata, options.html_details)
> > >   File "../../thirdparty/installed/common/bin/gcovr", line 1311, in
> > > print_html_report
> > > INPUT = open(data['FILENAME'], 'r')
> > > IOError: [Errno 2] No such file or directory:
> > > '/home/gsolovyev/git/kudu/build/coverage/FacebookService.cpp'
> > >
> > >
> > > Greg
> > >
> > >
> > > On Wed, Jan 20, 2021 at 10:15 AM Grant Henke 
> > > wrote:
> > >
> > > > Hello Kudu devs!
> > > >
> > > > The Apache Kudu team is happy to announce the first release candidate
> > for
> > > > Apache
> > > > Kudu 1.14.0.
> > > >
> > > > Apache Kudu 1.14.0 is a minor release that offers many improvements
> and
> > > > fixes
> > > > since Apache Kudu 1.13.0.
> > > >
> > > > This is a source-only release. The artifacts have been staged here:
> > > > https://dist.apache.org/repos/dist/dev/kudu/1.14.0-RC1/
> > > > 
> > > >
> > > > Java convenience binaries in the form of a Maven repository are
> staged
> > > > here:
> > > >
> https://repository.apache.org/content/repositories/orgapachekudu-1088
> > > >
> > > > Linux and macOS test-only Kudu binary JAR artifacts are staged here:
> > > >
> https://repository.apache.org/content/repositories/orgapachekudu-1089
> > > >
> > > > It is tagged in Git as 1.14.0-RC1 and the corresponding hash is the
> > > > following:
> > > >
> > > >
> > >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=f9a1c3b2bae482ec1f44f78eea7c96c01455c20a
> > > >
> > > > The WIP release notes can be found here:
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1GFBOfPYW_fx2PfUES5NskFPd3Q6PiNoHj44u82b7EBI/edit#
> > > >
> > > > The KEYS file to verify the artifact signatures can be found here:
> > > > https://dist.apache.org/repos/dist/release/kudu/KEYS
> > > >
> > > > Some common release validations include building Kudu, and running
> the
> > > unit
> > > > tests on your platforms and environments. Additionally it is worth
> > > running
> > > > Kudu
> > > > Java tests against kudu-binary JAR artifact as described in the
> commit
> > > > message here:
> > > >
> > > >
> > > >
> > >
> >
> 

Re: Kudu 1.14.0 Release Proposal

2020-12-14 Thread Alexey Serbin
+1

Thank you for taking care of this!


Kind regards,

Alexey

On Mon, Dec 14, 2020 at 1:41 PM Greg Solovyev
 wrote:

> +1. And, thanks for volunteering!
> Greg
>
> On Mon, Dec 14, 2020 at 11:43 AM Andrew Wong 
> wrote:
>
> > +1
> >
> > Thanks for volunteering! And thanks for accommodating the upcoming
> holidays
> > :)
> >
> > On Wed, Dec 9, 2020 at 11:26 AM Grant Henke 
> wrote:
> >
> > > Hi Kudu devs,
> > >
> > > It's been almost 3 months since Kudu 1.13.0 was released on Sep 17,
> 2020
> > > and I
> > > would like to volunteer to manage the Kudu 1.14.0 release.
> > >
> > > Based on previous experience, I'd like to have a window a bit longer
> than
> > > usual
> > > to allow finishing any in-progress work that you'd like to squeeze into
> > > this
> > > release. I also know many people like to take vacation near the end of
> > the
> > > year,
> > > so I propose cutting the branch on Monday Jan 4th, 2021. This gives
> about
> > > a month before branching and doesn't branch until after most are back
> > from
> > > vacation.
> > >
> > > I will send follow up emails with release notes details.
> > >
> > > Thank you,
> > > Grant
> > >
> >
> >
> > --
> > Andrew Wong
> >
>


Re: [proposal] Kudu operating system requirements changes

2020-11-25 Thread Alexey Serbin
+1

This sounds good to me.  The only nit from my side (and I remember we
discussed that offline) was the possibility of using devtoolset-8 instead of
devtoolset-7 to get a C++17-compatible compiler on RHEL/CentOS 7.

I also had some concerns about dropping support for Ubuntu 16, but so far
it seems that people run Kudu on Ubuntu mostly for development purposes
(like running on their own laptops), and those most likely have upgraded to
18.04 (if not to 20.04) already.  Anyway, I guess if we don't hear any
real concerns about dropping support for Ubuntu 16 in this thread, it means
it's safe to proceed with that :)


Kind regards,

Alexey

On Wed, Nov 25, 2020 at 10:56 AM Andrew Wong 
wrote:

> +1
>
> Thanks for proposing this Grant!
>
> I share Bankim's thoughts on Ubuntu 16, given it's not quite EOL yet, but
> I'll echo his curiosity to hear from anyone on that OS (or any others
> listed) that can't upgrade OSes but want to be on the latest version of
> Kudu.
>
> On Wed, Nov 25, 2020 at 10:00 AM Bankim Bhavsar 
> wrote:
>
> > It'd be good to post this on the Kudu Slack channel as well.
> >
> > >   - Drop Ubuntu 16 (Xenial) - EOL April 30th, 2021
> >
> > Only concern will be dropping support for Ubuntu 16. So I would be
> > interested in hearing from any Kudu users running on Ubuntu 16 if they
> are
> > waiting for new features in Kudu but can't upgrade host operating systems
> > in near future.
> >
> > Rest looks good to me and can't wait to start using the new C++ features
> > this will unlock.
> >
> > Thanks,
> > -Bankim.
> >
> >
> > On Mon, Nov 23, 2020 at 1:00 PM Grant Henke 
> wrote:
> >
> >> Hello Kudu developers and users!
> >>
> >> The purpose of this email is to propose and collect feedback on changes
> to
> >> the documented "Operating System Requirements"
> >> on https://kudu.apache.org/docs/installation.html for the next Kudu
> >> release
> >> (1.14.0).
> >>
> >> There are a few goals to updating the documented operating system
> >> requirements. Below is each goal and the suggested changes:
> >>
> >>1. Drop operating systems that are at or near EOL
> >>   - Drop CentOS 6/RHEL 6 - EOL November 30th, 2020
> >>   - Drop Ubuntu 14 (Trusty) - EOL April 30, 2019
> >>   - Drop Ubuntu 16 (Xenial) - EOL April 30th, 2021
> >>  - Note: The next Apache Kudu release would likely be early 2021
> >>   - Drop Debian 8 (Jessie) - EOL June 30, 2020
> >   - A deprecation was noted for all but Ubuntu 16 in the Kudu
> >>   1.12.0 release notes:
> >>
> >>
> https://kudu.apache.org/releases/1.12.0/docs/release_notes.html#rn_1.12.0_obsoletions
> >>   - We can and will still accept patches for fixes, but shouldn't
> >>   document/promise support.
> >>2. Drop operating systems that are not well tested by the community
> >>   - Drop SLES 12
> >>   - Drop OS X 10.10 Yosemite, OS X 10.11 El Capitan, macOS Sierra
> >>   - We can and will still accept patches for fixes, but shouldn't
> >>   document/promise support.
> >>3. Add new operating system versions
> >>   - Add Ubuntu 20.04 (Focal)
> >>   - Add macOS 10.14 (Mojave), macOS 10.15 (Catalina), macOS 11 (Big
> >> Sur)
> >>   4. Continued Innovation/Improvements
> >>   - Bump C++ language level to C++17 (gcc 7)
> >  - Similar to CentOS/RHEL 6 currently, devtoolset-7 will be used on
> >  CentOS/RHEL 7 to get gcc 7.3
> >>  - This is aligned with the Apache Impala community requirements
> >   - Upgrade dependencies that require C++14 and higher
> >>   - Introduce new dependencies that require or benefit from C++14
> and
> >>   higher
> >>   - Potential performance improvements
> >>
> >> If you have any concerns about these changes your feedback would be
> >> appreciated. If you are in support of these changes a response
> indicating
> >> your support is encouraged as well.
> >>
> >> Thank you,
> >> Grant
> >>
> >
>
> --
> Andrew Wong
>


Re: [VOTE] Apache Kudu 1.13.0-RC2

2020-09-16 Thread Alexey Serbin
+1

I checked out Kudu's source tagged with 1.13.0-RC2 from the
https://github.com/apache/kudu git mirror and built it on CentOS 6 in the
RELEASE configuration.  I installed the binaries on a 6-node cluster, and ran a
test workload of inserting 4B rows and then upserting them using tens of
Kudu C++ clients.  All worked as expected.

I checked out Kudu's source tagged with 1.13.0-RC2 from
https://github.com/apache/kudu git mirror and built it on CentOS Linux
release 8.2.2004 (Core) in DEBUG configuration.  Ran tests with ctest -j4,
and got the following stats:

99% tests passed, 6 tests failed out of 441

Label Time Summary:
no_dist_test=  46.06 sec*proc (3 tests)
no_tsan =  12.11 sec*proc (3 tests)

Total Test time (real) = 5067.18 sec

The following tests FAILED:
109 - client-stress-test (Failed)
111 - consistency-itest (Failed)
214 - webserver-crawl-itest (Failed)
217 - minidump_generation-itest (Failed)
223 - master-test (Failed)
231 - mini_ranger-test (Failed)

Out of the failed tests, mini_ranger-test failed due to the long startup
times of Ranger, and I posted a small patch to increase the Ranger start
timeout (the test passed with the patch).  The rest failed due to a timeout
while negotiating a connection because of DNS resolution failures:

  913 09:35:43.932912 2169799 net_util.cc:413] Time spent look up canonical
hostname for localhost 'xxx-wifi': real 10.014s  user 0.000s sys 0.000s

Once I fixed the name resolution issue, those tests passed cleanly.
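
For those hitting the same thing, the usual fix is to make the machine's
hostname resolvable locally, e.g. an /etc/hosts entry along these lines
(here 'xxx-wifi' stands in for the actual hostname from the log above):

  # /etc/hosts
  127.0.0.1   localhost xxx-wifi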

I also successfully built Kudu on macOS High Sierra in the debug configuration.
I ran 'ctest -j4': the majority of tests passed.  Some tests failed, but I
guess that's not a release stopper because macOS is a developer-only
platform for Kudu (IIRC, we haven't had all tests passing on macOS for a
long time).


Kind regards,

Alexey


On Fri, Sep 11, 2020 at 2:09 PM Attila Bukor  wrote:

> Hello Kudu devs!
>
> The Apache Kudu team is happy to announce the second release candidate for
> Apache Kudu 1.13.0.
>
> Apache Kudu 1.13.0 is a minor release that offers many improvements and
> fixes
> since Apache Kudu 1.12.0.
>
> This is a source-only release. The artifacts have been staged here:
> https://dist.apache.org/repos/dist/dev/kudu/1.13.0-RC2/
>
> Java convenience binaries in the form of a Maven repository are staged
> here:
> https://repository.apache.org/content/repositories/orgapachekudu-1085
>
> Linux (built on CentOS 6) and macOS (built on Catalina) test-only Kudu
> binary
> JAR artifacts are staged here:
> https://repository.apache.org/content/repositories/orgapachekudu-1087
>
> It is tagged in Git as 1.13.0-RC2 and signed with my key (found in the
> KEYS file
> below). Its commit hash is b4e0ad597fd9a88d63fa524e37141426a28c406a, you
> can
> check it out from ASF Gitbox or the official GitHub mirror:
>
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=tag;h=refs/tags/1.13.0-RC2
> https://github.com/apache/kudu/releases/tag/1.13.0-RC2
>
> The KEYS file to verify the artifact and tag signatures can be found here:
> https://dist.apache.org/repos/dist/release/kudu/KEYS
>
> The release notes can be found here:
> https://github.com/apache/kudu/blob/1.13.0-RC2/docs/release_notes.adoc
>
> I'd suggest going through the release notes, building Kudu, and running
> the unit
> tests. Testing out the Maven repo would also be appreciated. Also, it's
> worth
> running Kudu Java tests against kudu-binary JAR artifact as described in
> the
> commit message here:
>
>
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a
>
> The vote will run until a majority[1] is achieved, but at least until
> Wednesday
> Sep 16th 9:00:00 CEST 2020, which is a bit over the suggested 72 hours due
> to
> the weekend, to give everyone a chance to review this release candidate and
> vote.
>
> Thank You,
> Attila
>
> [1] https://www.apache.org/foundation/voting.html#ReleaseVotes
>
>


Re: Kudu 1.13

2020-08-24 Thread Alexey Serbin
Hi,

I'd like to include the fix for KUDU-1587 [1] in the upcoming 1.13
release.  With the fix for KUDU-2727 [2] already landed, KUDU-1587 might
become more pronounced.  It doesn't affect every workload, but rather
some specific ones, of course.  I think it's nice to have a way to mitigate
the issue if it appears in the field after upgrading to 1.13.

The change in the context of fixing KUDU-1587 consists of three patches
posted for review: [3], [4], [5].  I think it's realistic to expect those
patches to land this week, assuming I get review feedback on them.
At least Andrew Wong has already taken a look at [4] (thank you, Andrew!).

[1] https://issues.apache.org/jira/browse/KUDU-1587
[2] https://issues.apache.org/jira/browse/KUDU-2727
[3] https://gerrit.cloudera.org/#/c/16312/
[4] https://gerrit.cloudera.org/#/c/16332/
[5] https://gerrit.cloudera.org/#/c/16343/

Thanks!


Kind regards,

Alexey

On Tue, Aug 11, 2020 at 12:54 PM Attila Bukor  wrote:

> Hi,
>
> To make sure we don't break backward and/or forward compatibility, I
> changed the
> priority of KUDU-3176[1] to blocker and will fix it before I cut the
> release.
> I'll also be out of office next week, so I propose to postpone the cutting
> of
> the branch to 8/24 Monday.
>
> If you know of any other outstanding bugs that we should consider as
> blockers,
> please list them in this thread (and also change their priority on JIRA).
>
> Also, please don't forget to list your significant changes for the release
> notes
> in https://s.apache.org/kudu1.13rn
>
> Thanks,
> Attila
>
> [1] https://issues.apache.org/jira/browse/KUDU-3176
>
> On Thu, Jul 30, 2020 at 05:46:28PM +0200, Attila Bukor wrote:
> > Hi Kudu devs,
> >
> > It's been almost 3 months since Kudu 1.12.0 has been released on 5/19,
> so I'd
> > like to volunteer to RM Kudu 1.13.0.
> >
> > Based on previous experience, I'd like to have a window a bit longer
> than usual
> > to allow finishing any in-progress work that you'd like to squeeze into
> this
> > release so I propose cutting the branch on 8/14 Friday. This will give
> us a
> > little over two weeks.
> >
> > In turn, I'd like to get the release notes committed before we cut the
> RC1, this
> > way it would actually be part of the release instead of available only
> online.
> >
> > Please collect your significant changes for the release notes in this
> document:
> > https://s.apache.org/kudu1.13rn
> >
> > Thanks,
> > Attila
>
>
>


Re: [VOTE] Apache Kudu 1.12.0-RC2

2020-04-30 Thread Alexey Serbin
Ah, and I also verified the signature of the
staged apache-kudu-1.12.0.tar.gz source tarball.
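
For the record, the check amounts to roughly the following sketch; the names
of the .asc signature and .sha512 checksum files sitting next to the tarball
in the staging area are assumptions:

  curl -O https://dist.apache.org/repos/dist/release/kudu/KEYS
  gpg --import KEYS
  gpg --verify apache-kudu-1.12.0.tar.gz.asc apache-kudu-1.12.0.tar.gz
  sha512sum -c apache-kudu-1.12.0.tar.gz.sha512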


/Alexey

On Thu, Apr 30, 2020 at 5:56 PM Alexey Serbin  wrote:

> +1
>
> Verified SHA sum for the release tarball, successfully built RELEASE and
> DEBUG builds on CentOS6.6, CentOS8.1 and macOS.
>
> On CentOS8.1 all tests pass for DEBUG build except a few scenarios in
> ntp-test and auto_rebalancer-test, which appear to be test-only issues.
>
>
> Thanks,
>
> Alexey
>
> On Thu, Apr 30, 2020 at 5:10 PM Bankim Bhavsar 
> wrote:
>
>> +1
>>
>> I was able to successfully compile debug and release builds on Linux and
>> Mac OS.
>>
>> I ran C++ tests against debug and release builds on Linux using dist-test.
>> Only the following test failed on release build.
>>
>> [ RUN  ]
>> MaintenanceManagerTest.TestPrioritizeLogRetentionUnderMemoryPressure
>> /data/12/bankim/src/kudu/src/kudu/util/maintenance_manager-test.cc:394:
>> Failure
>>   Expected: op_and_why.second
>>   Which is: "under memory pressure (2.00% used), 100 bytes log
>> retention, and flush 100 bytes memory"
>> To be equal to: "under memory pressure (0.00% used), 100 bytes log
>> retention, and " "flush 100 bytes memory"
>>   Which is: "under memory pressure (0.00% used), 100 bytes log
>> retention, and flush 100 bytes memory"
>> [  FAILED  ]
>> MaintenanceManagerTest.TestPrioritizeLogRetentionUnderMemoryPressure (1
>> ms)
>>
>>
>>
>> On Thu, Apr 30, 2020 at 3:51 PM Attila Bukor  wrote:
>>
>> > +1
>> >
>> > Verified the checksum and signature, diffed against the git tag with no
>> > differences in source code, built in release mode on el7. Unfortunately
>> I
>> > couldn't get all tests to pass due to disk space issues though.
>> >
>> > Attila
>> >
>> > On Wed, Apr 29, 2020 at 02:17:36PM -0500, Grant Henke wrote:
>> > > +1 I verified the kudu-binary jar issue I saw in the last RC is
>> resolved.
>> > >
>> > > On Mon, Apr 27, 2020 at 6:32 PM Hao Hao > >
>> > > wrote:
>> > >
>> > > > Hello Kudu devs!
>> > > >
>> > > > The Apache Kudu team is happy to announce the second release
>> candidate
>> > for
>> > > > Apache Kudu 1.12.0.
>> > > >
>> > > > Apache Kudu 1.12.0 is a minor release that offers many improvements
>> and
>> > > > fixes since the prior release.
>> > > >
>> > > > This is a source-only release. The artifacts have been staged here:
>> > > > https://dist.apache.org/repos/dist/dev/kudu/1.12.0-RC2/
>> > > >
>> > > > Java convenience binaries in the form of a Maven repository are
>> staged
>> > > > here:
>> > > >
>> https://repository.apache.org/content/repositories/orgapachekudu-1062
>> > > >
>> > > > Linux and macOS kudu-binary JAR artifacts are staged here
>> > correspondingly:
>> > > >
>> > https://repository.apache.org/content/repositories/orgapachekudu-1063/
>> > > >
>> > https://repository.apache.org/content/repositories/orgapachekudu-1064/
>> > > >
>> > > > It is tagged in Git as 1.12.0-RC2 and the corresponding hash is the
>> > > > following:
>> > > >
>> > > >
>> >
>> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=80e35e5f9d6a02010ff65ab0e5bd1fc115d190df
>> > > >
>> > > > The WIP release notes can be found here:
>> > > > https://gerrit.cloudera.org/c/15685/
>> > > >
>> > > > The KEYS file to verify the artifact signatures can be found here:
>> > > > https://dist.apache.org/repos/dist/release/kudu/KEYS
>> > > >
>> > > > I'd suggest going through the README and the release notes, building
>> > Kudu,
>> > > > and running the unit tests. Testing out the Maven repo would also be
>> > > > appreciated.
>> > > > Also, it's worth running Kudu Java tests against kudu-binary JAR
>> > artifact
>> > > > as described in the commit message here:
>> > > >
>> > > >
>> > > >
>> >
>> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a
>> > > >
>> > > > The vote will run until Thursday Apr 30 18:00:00 PST 2020.
>> > > >
>> > > > Best,
>> > > > Hao
>> > > >
>> > >
>> > >
>> > > --
>> > > Grant Henke
>> > > Software Engineer | Cloudera
>> > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>> >
>>
>


Re: [VOTE] Apache Kudu 1.12.0-RC2

2020-04-30 Thread Alexey Serbin
+1

Verified SHA sum for the release tarball, successfully built RELEASE and
DEBUG builds on CentOS6.6, CentOS8.1 and macOS.

On CentOS8.1 all tests pass for DEBUG build except a few scenarios in
ntp-test and auto_rebalancer-test, which appear to be test-only issues.


Thanks,

Alexey

On Thu, Apr 30, 2020 at 5:10 PM Bankim Bhavsar 
wrote:

> +1
>
> I was able to successfully compile debug and release builds on Linux and
> Mac OS.
>
> I ran C++ tests against debug and release builds on Linux using dist-test.
> Only the following test failed on release build.
>
> [ RUN  ]
> MaintenanceManagerTest.TestPrioritizeLogRetentionUnderMemoryPressure
> /data/12/bankim/src/kudu/src/kudu/util/maintenance_manager-test.cc:394:
> Failure
>   Expected: op_and_why.second
>   Which is: "under memory pressure (2.00% used), 100 bytes log
> retention, and flush 100 bytes memory"
> To be equal to: "under memory pressure (0.00% used), 100 bytes log
> retention, and " "flush 100 bytes memory"
>   Which is: "under memory pressure (0.00% used), 100 bytes log
> retention, and flush 100 bytes memory"
> [  FAILED  ]
> MaintenanceManagerTest.TestPrioritizeLogRetentionUnderMemoryPressure (1 ms)
>
>
>
> On Thu, Apr 30, 2020 at 3:51 PM Attila Bukor  wrote:
>
> > +1
> >
> > Verified the checksum and signature, diffed against the git tag with no
> > differences in source code, built in release mode on el7. Unfortunately I
> > couldn't get all tests to pass due to disk space issues though.
> >
> > Attila
> >
> > On Wed, Apr 29, 2020 at 02:17:36PM -0500, Grant Henke wrote:
> > > +1 I verified the kudu-binary jar issue I saw in the last RC is
> resolved.
> > >
> > > On Mon, Apr 27, 2020 at 6:32 PM Hao Hao 
> > > wrote:
> > >
> > > > Hello Kudu devs!
> > > >
> > > > The Apache Kudu team is happy to announce the second release
> candidate
> > for
> > > > Apache Kudu 1.12.0.
> > > >
> > > > Apache Kudu 1.12.0 is a minor release that offers many improvements
> and
> > > > fixes since the prior release.
> > > >
> > > > This is a source-only release. The artifacts have been staged here:
> > > > https://dist.apache.org/repos/dist/dev/kudu/1.12.0-RC2/
> > > >
> > > > Java convenience binaries in the form of a Maven repository are
> staged
> > > > here:
> > > >
> https://repository.apache.org/content/repositories/orgapachekudu-1062
> > > >
> > > > Linux and macOS kudu-binary JAR artifacts are staged here
> > correspondingly:
> > > >
> > https://repository.apache.org/content/repositories/orgapachekudu-1063/
> > > >
> > https://repository.apache.org/content/repositories/orgapachekudu-1064/
> > > >
> > > > It is tagged in Git as 1.12.0-RC2 and the corresponding hash is the
> > > > following:
> > > >
> > > >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=80e35e5f9d6a02010ff65ab0e5bd1fc115d190df
> > > >
> > > > The WIP release notes can be found here:
> > > > https://gerrit.cloudera.org/c/15685/
> > > >
> > > > The KEYS file to verify the artifact signatures can be found here:
> > > > https://dist.apache.org/repos/dist/release/kudu/KEYS
> > > >
> > > > I'd suggest going through the README and the release notes, building
> > Kudu,
> > > > and running the unit tests. Testing out the Maven repo would also be
> > > > appreciated.
> > > > Also, it's worth running Kudu Java tests against kudu-binary JAR
> > artifact
> > > > as described in the commit message here:
> > > >
> > > >
> > > >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a
> > > >
> > > > The vote will run until Thursday Apr 30 18:00:00 PST 2020.
> > > >
> > > > Best,
> > > > Hao
> > > >
> > >
> > >
> > > --
> > > Grant Henke
> > > Software Engineer | Cloudera
> > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >
>


Re: [VOTE] Apache Kudu 1.12.0-RC1

2020-04-21 Thread Alexey Serbin
+0

Checked out branch 1.12.x at the 1.12.0-RC1 tag from the git repo, compiled
on macOS High Sierra (10.13), and ran the tests.

The following tests FAILED:
12 - client-test.0 (Failed)
47 - types-test (Failed)
77 - fs_manager-test (Failed)
145 - master_authz-itest.0 (Failed)
146 - master_authz-itest.1 (Failed)
147 - master_authz-itest.2 (Failed)
148 - master_authz-itest.3 (Failed)
149 - master_authz-itest.4 (Failed)
150 - master_authz-itest.5 (Failed)
151 - master_authz-itest.6 (Failed)
152 - master_authz-itest.7 (Failed)
154 - master_failover-itest.0 (Failed)
155 - master_failover-itest.1 (Failed)
156 - master_failover-itest.2 (Failed)
181 - security-itest (Failed)
218 - sentry_authz_provider-test.0 (Failed)
219 - sentry_authz_provider-test.1 (Failed)
220 - sentry_authz_provider-test.2 (Failed)
221 - sentry_authz_provider-test.3 (Failed)
222 - sentry_authz_provider-test.4 (Failed)
223 - sentry_authz_provider-test.5 (Failed)
224 - sentry_authz_provider-test.6 (Failed)
225 - sentry_authz_provider-test.7 (Failed)
255 - sentry_client-test (Failed)
258 - webserver-test (Failed)
270 - compaction-test (Failed)
323 - kudu-tool-test.0 (Failed)
343 - tablet_server-test.2 (Failed)
371 - file_cache-stress-test (Failed)
423 - trace-test (Failed)

Most of the failures were due to HMS/Sentry issues, and that's nothing new;
some of the tests were fixed just today (2020-04-21).  Not a big deal since macOS is
a development platform for Kudu.

Checked out branch 1.12.x at the 1.12.0-RC1 tag from the git repo, compiled
on CentOS 6.6, and ran the tests.
The following tests failed:

  master_authz-itest.0
  master_authz-itest.1
  master_authz-itest.2
  master_authz-itest.3
  master_authz-itest.4
  master_authz-itest.5
  master_authz-itest.6
  master_authz-itest.7
  security-itest

The failures in master_authz-itest are due to a failure to start Ranger, and
it looks like that's a test-only issue related to my environment.  As for the
failure in security-itest, I didn't get to the bottom of it yet, but it seems
to be a test-only issue as well:

src/kudu/integration-tests/security-itest.cc:432: Failure
Value of: s.ToString()

Expected: has substring "server requires authentication, but client does
not have Kerberos credentials available"
  Actual: "OK"

Given that we have a few patches with various fixes coming in after tagging
RC1, it would not hurt to cut RC2 to incorporate those, but since those
fixes are test-only, going with RC1 might be an option as well.


Thanks,

Alexey


On Tue, Apr 21, 2020 at 4:00 PM Attila Bukor  wrote:

> +1, but if this vote fails (deadline is close and we don't have 3 +1s
> yet), I
> think it would be best to backport this change and the release notes
> before we
> cut RC2.
>
> Grant, did you mean to vote on the release or only share your testing
> results?
>
> Attila
>
> On Mon, Apr 20, 2020 at 12:36:15PM -0500, Grant Henke wrote:
> > I forgot to mention that this patch was required to run some of the tests
> > that use min-ranger in Docker:
> > https://gerrit.cloudera.org/#/c/15756/
> >
> > It shouldn't block the release, but we can backport it to branch-1.12.x
> to
> > facilitate future testing.
> >
> > On Mon, Apr 20, 2020 at 12:22 PM Grant Henke 
> wrote:
> >
> > > I ran the following to test various OS versions in Docker:
> > >
> > > # Build all the images
> > > export BASES="centos:7,centos:8,ubuntu:xenial,ubuntu:bionic,debian:stretch"
> > > export TARGETS="build"
> > > ./docker/docker-build.sh
> > >
> > > # For each image, run the tests
> > > docker run -it --rm apache/kudu:build-latest- /bin/bash
> > > mkdir build/debug
> > > cd build/debug
> > > export NO_REBUILD_THIRDPARTY=1
> > > ../../thirdparty/installed/common/bin/cmake \
> > >   -DCMAKE_BUILD_TYPE=debug ../..
> > > make -j6
> > > ctest -j6
> > > cd /kudu/java/
> > > ./gradlew test
> > > cd /kudu/python
> > > export KUDU_HOME="/kudu"
> > > python setup.py build_ext
> > > python setup.py test
> > >
> > > Below are the flaky and failing tests I observed:
> > >
> > > centos:7 saw flakes in
> > >   TestKuduTable.testFormatRangePartitionsStringColum
> > > centos:8 saw flakes in
> > >   auto_rebalancer-test
> > >   mini_ranger-test
> > >   DefaultSourceTest.testInsertionMultiple
> > > ubuntu:xenial saw failures in
> > >   client-stress-test (Failed)
> > >   memory_gc-itest (Failed)
> > >   raft_consensus-itest.2 (Failed)
> > > ubuntu:xenial saw flakes in
> > >   auto_rebalancer-test
> > > ubuntu:bionic saw flakes in
> > >   mini_ranger-test
> > > debian:stretch saw flakes in
> > >   kudu-tool-test.1
> > >
> > > ITClientStress.testManyShortClientsGeneratingScanTokens
> > > TestMiniKuduCluster.testHiveMetastoreIntegration
> > > TestMiniKuduCluster.testKerberos
> > >
> > >
> > > I seem to remember that those tests that are failing in xenial have
> > > always been an issue in Docker and are likely a Docker image/setup
> > > issue. I don't think they are a concern 

Re: Proposal for supported OS version changes

2020-04-13 Thread Alexey Serbin
Looks good to me.

BTW, do we plan to make sure 1.12 runs on the recently released [1] Debian 10?
Or should we declare it supported de facto (if that's the case) after releasing 1.12?

[1]  https://www.debian.org/News/2020/20200208


Thanks,

Alexey

On Mon, Apr 13, 2020 at 1:57 PM Todd Lipcon 
wrote:

> Sounds good. I've struggled with GCC 4 compatibility workarounds lately as
> well, so would be nice to cut out that surface area.
>
> -Todd
>
> On Mon, Apr 13, 2020 at 11:39 AM Bankim Bhavsar
> 
> wrote:
>
> > LGTM.
> >
> > Bankim
> >
> > On Sun, Apr 12, 2020 at 10:48 AM Grant Henke 
> > wrote:
> >
> > > Hello Kudu Developers,
> > >
> > > As we approach the 1.12.0 release I think now is a good opportunity to
> > > evaluate the platform versions
> > > we support. Doing so before the 1.12.0 release is useful because it
> will
> > > allow us to mark any OS we
> > > intend to drop as deprecated. This is similar to how we have handled
> Java
> > > version support in the past.
> > >
> > > The primary goal of dropping older OS versions is to remove OS versions
> > > that are nearing EOL and
> > > to bump the minimum required gcc version allowing the use of C++ 14
> > > features. However, some
> > > additional benefits include:
> > >
> > >- Updating the list of OS versions we test and support (including
> > newer
> > >OS versions)
> > >- Aligning more closely with Impala given they are the main consumer
> > of
> > >the C++ client (thread
> > ><
> > >
> >
> https://lists.apache.org/thread.html/r19b95826b59486d61c1f8f5c1edd93adb8ec9925f66ff42c0c103d66%40%3Cdev.impala.apache.org%3E
> > > >
> > >)
> > >- Allow the use of Abseil  which various Kudu
> > >contributors have suggested bringing into Kudu
> > >- A first step towards supporting C++17
> > >
> > > With that said I propose we mark the following OS versions as
> deprecated
> > in
> > > the 1.12.0 release
> > > and consider dropping them in the next minor release:
> > >
> > >- CentOS/RHEL 6 - EOL November 30, 2020
> > ><
> > >
> >
> https://wiki.centos.org/FAQ/General#What_is_the_support_.27.27end_of_life.27.27_for_each_CentOS_release.3F
> > > >
> > >- Debian 8 - EOL June 30, 2020 
> > >- Ubuntu 14 - EOL April 30, 2020 
> > >
> > > This means that  our minimum gcc version could be 5.3+ and the list of
> > > supported OS versions would be:
> > >
> > >- CentOS 6 (deprecated)
> > >- CentOS 7 (with devtoolset gcc)
> > >- CentOS 8
> > >- RHEL 6 (deprecated)
> > >- RHEL 7 (with devtoolset gcc)
> > >- RHEL 8
> > >- Ubuntu 14.04 (deprecated)
> > >- Ubuntu 16.04
> > >- Ubuntu 18.04
> > >- Debian 8 (deprecated)
> > >- Debian 9
> > >- SLES 12 (with toolchain gcc)
> > >
> > > Please provide your feedback, suggestions, or agreement on this
> proposal.
> > >
> > > Thank you,
> > > Grant
> > >
> >
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Re: Heads-up: change in include-what-you-use behavior

2020-03-31 Thread Alexey Serbin
Adar,

Thank you very much for making this possible!

I hope we can address the macOS-related wrinkles soon.
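
For reference, the local IWYU workflow described below boils down to a couple
of build targets (a sketch, assuming a CMake build directory already
configured as per the project README):

  cd build/debug
  make iwyu      # report IWYU recommendations without changing any files
  make iwyu-fix  # apply the recommended include fixes in place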


Kind regards,

Alexey



On Fri, Mar 27, 2020 at 1:11 PM Adar Lieber-Dembo 
wrote:

> I just merged some changes that modify include-what-you-use (IWYU) to
> run against the libc++ in thirdparty rather than against the system's
> libstdc++. Coupled with some other changes, I expect that running IWYU
> on any Linux system should yield the exact same set of recommendations
> as when IWYU is run during a precommit build. Ideally, we'll achieve
> this determinism on macOS too, but for now I don't expect that to be
> the case.
>
> As a reminder, you can run IWYU locally via "make iwyu" (or "ninja
> iwyu"), and you can have it automatically perform fixes via "make
> iwyu-fix" (or "ninja iwyu-fix"). If you do see any deviation between
> local and precommit IWYU behavior, please reply to this thread, or let
> me know directly.
>


Re: Kudu 1.12.0 Release

2020-03-24 Thread Alexey Serbin
Hi,

Thank you Hao for taking care of the release management for 1.12 release!

The new timeline looks good to me.  Thinking about April 1st reminds
me of April Fools' Day, but since it's not a release date requiring a
public announcement, I think it's OK :)


Thanks

Alexey

On Tue, Mar 24, 2020 at 4:01 PM Hao Hao 
wrote:

> After recent discussions with members of the dev community, I think it
> makes sense to push out branching further, since the in-progress work still
> needs some time to finish. So I'm now proposing a new target branch date,
> which is Wednesday, *Apr. 1*, and we can re-evaluate the date if needed.
>
> Again, please let me know if you agree/disagree with the new plan. Thanks
> a lot!
>
> Best.
> Hao
>
> On Tue, Mar 17, 2020 at 1:34 PM Hao Hao  wrote:
>
> > Thanks Grant!
> >
> > I created the shared document to serve as a scratchpad for Kudu 1.12
> > release
> > notes:
> >
> >
> https://docs.google.com/document/d/1TwxsNz9vD6u6i1ihfmbHdvwQjhIiTOXABFxKnrbH08A/edit?usp=sharing
> > It contains a full list of commits since 1.11 release. If you have
> > context, please fill
> > out any notable changes that are worth adding release notes for.  Also
> > feel free to
> > add anything else that you think is worth documenting.
> >
> > Also I've been discussing with members of the dev community and I think
> it
> > makes sense to push out branching, to let some in-progress work such as
> > Ranger integration to finish and bake. So I'm now proposing we branch
> > Wednesday, *March. 25*, and we'll vote a couple days after.
> >
> > Please let me know if you agree/disagree with the new plan.
> >
> > Best,
> > Hao
> >
> > On Thu, Mar 12, 2020 at 8:09 AM Grant Henke  >
> > wrote:
> >
> >> The plan sounds good to me. Thank you for volunteering to RM.
> >>
> >> Could you create and share a google document similar to ones from past
> >> releases
> >> where we can start to compile the release notes?
> >>
> >> Here is an example:
> >>
> >>
> https://docs.google.com/document/d/1CnGvoOob9H8BcY5dPciYCW5RTj9J--wXPmeMDQzqH6Y/edit?usp=sharing
> >>
> >> Thanks,
> >> Grant
> >>
> >> On Wed, Mar 11, 2020 at 11:16 PM Hao Hao 
> >> wrote:
> >>
> >> > Hello Kudu developers!
> >> >
> >> > It's been a bit more than three months since we released 1.11.1. In
> that
> >> > time, we've accrued some important improvements and bug fixes (around
> >> > 438 commits since Kudu 1.11.0). By our usual three-month release
> >> cadence,
> >> > now seems like as good a time as any to start thinking about a Kudu
> >> 1.12.0
> >> > release, and I'm volunteering to RM it.
> >> >
> >> > I'm proposing we cut the branch for 1.12.x on Wednesday, *March 18*,
> >> > which gives us a week to prepare, and then we'll start a vote a couple
> >> of
> >> > days after that.
> >> >
> >> > If this sound good to you all, please start thinking about and writing
> >> up
> >> > release notes for notable changes that have landed in this release,
> and
> >> get
> >> > them checked in before March 18 2020.
> >> >
> >> > Here's a command you can run to check out roughly what you've been up
> to
> >> > since Kudu 1.11.0 release:
> >> >
> >> > $ git log 08db97c591d9131cbef9b8e5b4f44a6e854c25f0..master --oneline \
> >> > --graph --no-merges --first-parent --author=
> >> >
> >> > Please let me know if you agree with this plan, have comments,
> >> questions,
> >> > concerns, etc.
> >> >
> >> > Thanks!
> >> > Hao
> >> >
> >>
> >>
> >> --
> >> Grant Henke
> >> Software Engineer | Cloudera
> >> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >>
> >
>


Re: Switching time source to the built-in NTP client

2020-03-19 Thread Alexey Serbin
Great, thank you Todd!

I assume nobody has any concerns with keeping the time source 'system' by
default in Kudu 1.12 :)  Otherwise, please let me know.
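
For anyone who'd like to opt into the built-in NTP client explicitly, a
minimal sketch of the server-side gflags (here --builtin_ntp_servers is
assumed to be the flag for pointing the built-in client at specific, e.g.
internal, NTP servers -- please check the docs for the exact name):

  # sketch: flags for kudu-master / kudu-tserver
  --time_source=builtin
  --builtin_ntp_servers=ntp1.internal.example.com,ntp2.internal.example.com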


Kind regards,

Alexey

On Wed, Mar 18, 2020 at 10:10 AM Todd Lipcon 
wrote:

> Sounds fine to me.
>
> -Todd
>
> On Wed, Mar 18, 2020 at 1:18 AM Alexey Serbin  >
> wrote:
>
> > Thank you all for the feedback!
> >
> > As Grant mentioned, parsing the configuration files of local NTP servers
> or
> > getting the list of their peers via appropriate CLI was considered at
> some
> > point in the scope of the 'auto' time source.  However, this option
> didn't
> > look robust enough and no progress was made in that direction.  Among the
> > concerns were the following corner cases: (a) the configuration file
> might
> > be in a non-default location (b) certain CLI commands might be prohibited
> > by custom security policy (c) there might be both ntpd and chronyd
> > installed, and it's necessary to consult systemd/chkconfig (which is
> > version/platform dependent) to resolve the ambiguity if neither of the
> > NTP daemons is running during Kudu's startup time
> >
> > I don't think we evaluated the idea of multiple time sources described by
> > Todd, at least as it is articulated in this e-mail thread (i.e. using
> > multiple time sources with a fallback behavior between them).  We looked
> at
> > auto-configuring the time source at startup based on the system clock’s
> NTP
> > synchronization status.  The latter was considered to be not very robust
> > because it might (a) mask configuration issues (b) result in different
> time
> > source within the same Kudu cluster due to transient issues (c) introduce
> > extra startup delay.  The preferred choice was having something more
> > deterministic and static, and --time_source=auto currently means using
> the
> > built-in NTP client configured with the internal NTP server for AWS and
> GCE
> > instances and using the system clock synchronized by NTP in all other
> > cases.
> >
> > It's true that the built-in NTP client has numerous TODOs.  We have some
> > test coverage based on chronyd's as part of our external mini-cluster
> test
> > harness, but the built-in NTP client is not battle-tested at this
> point.  I
> > agree with Adar that it's not clear how many existing and new Kudu users
> > would benefit from switching to the built-in NTP client by default.  From
> > that perspective, keeping the default time source as 'system' and
> allowing
> > to switch to the built-in NTP client is a conservative, but very
> reasonable
> > approach.
> >
> > OK, at this point I can see the following options for the default time
> > source:
> >
> > 1. Keep the default clock source as 'system' and make it possible to
> switch
> > to the built-in NTP client when --time_source=builtin is set explicitly
> > (that's already how it is now).
> > 2. Switch the default clock source to 'builtin' and mention in the
> release
> > notes that it's not backwards-compatible change and might require
> updating
> > Kudu's configuration after upgrading to 1.12.
> > 3. Switch the default clock source to 'builtin', set its list of NTP
> > servers empty by default, and introduce a parser for chronyd/ntpd
> > configuration files.  This way upgraded Kudu masters and tablet servers
> > would seamlessly switch to the built-in NTP client working with the same
> > set of NTP servers as local NTP daemons (assuming they are using either
> > chronyd or ntpd at their Kudu nodes).
> > 4. Implement a new mode with multiple time sources with a fallback
> behavior
> > between them.  Make the new time source the default one.  This way
> existing
> > users will not need to change anything unless they want to stick with the
> > 'system' time source.
> >
> > It's clear that option 2 brings usability issues, so it's not a good one.
> > Options 3 and 4 require some extra functionality: it's not too
> cumbersome,
> > but it requires some time to implement and test.  However, it's necessary
> > to re-evaluate the decision to allow inherent 'dynamicity' of the time
> > source for a Kudu cluster with option 4.  Option 1 looks like the safest
> > bet at this point.
> >
> > So, here is the proposal: let's keep 'system' as the default time source
> > for 1.12 release.  This automatically removes upgrade-related risks for
> > existing Kudu clusters. It's always possible to switch to the built-in
> NTP
> > client with --time_source=builtin, of course.
> >
> > I'll document options 3 and 4

Switching time source to the built-in NTP client

2020-03-16 Thread Alexey Serbin
Hi,

I'd like to get feedback on the subj, please.

The built-in NTP client for Kudu masters and tablet servers was introduced
in Kudu 1.11.0.  Back then, there were thoughts of switching to the
built-in client by default starting Kudu 1.12.

Since it's time for cutting the 1.12 release branch pretty soon, I think it's a
good opportunity to clarify whether we want to make that change or we
want to keep the time source as is (i.e. 'system') in the 1.12 release.

For more context, the built-in NTP client has been used to run external
mini-cluster-based test scenarios in every gerrit pre-commit build since the
1.11.0 release.  In addition, I ran a couple of 6-node clusters in public
cloud for a few weeks with a basic write/read workload ('kudu
perf loadgen' with the --run_scan option).  So far I've seen no issues
there.  As for the use in a production environment, at this point I'm not
aware of any Kudu clusters running in production using the built-in NTP
client.

The benefit of the built-in NTP client is that it allows running
Kudu without the requirement of having the local machines' clocks
synchronized by the kernel NTP discipline.  That might benefit newer Kudu
installations where machines' clocks are not synchronized out-of-the-box
and users are not keen on the extra step of deploying NTP servers (and
configuring them appropriately if the default configuration is not good
enough -- e.g., in case of fire-walled internal clusters).

If we switch to the 'builtin' time source by default (i.e. use the built-in
NTP client), existing installations running with the 'system' time source
will need to add an extra flag if it's desired to stay with the 'system'
time source after the upgrade to 1.12.  In that regard, the update would
not be backwards-compatible, but Kudu users should not care much about the
clock source assuming the built-in NTP client is reliable enough.  Also, in
case of Kudu clusters running without access to the internet, it will be
necessary to point the built-in NTP client to some internal NTP servers
since pool.ntp.org servers (the default servers for the built-in NTP
client) might not be accessible.

So, it seems enabling the built-in NTP client by default could benefit
newer installations, but might require extra configuration steps for
existing Kudu deployments where pool.ntp.org NTP servers are not
accessible.  The latter step should be described in the release notes for
1.12 release, of course.  Also, there is some risk of hitting a not-yet
detected bug in the built-in NTP client.

Do you think the benefits of removing the requirement to have the local
clock synchronized by a local NTP server outweigh the drawbacks of adding an
extra configuration step during the 1.12 upgrade for Kudu clusters isolated
from the Internet?

Your feedback is highly appreciated!


Thanks,

Alexey


P.S. I sent the original message one week ago, but it seems it went into
spam box or alike, so I'm re-sending it.


Switching time source to the built-in NTP client

2020-03-09 Thread Alexey Serbin
Hi,

I'd like to get feedback on the subj.

The built-in NTP client for Kudu masters and tablet servers was introduced
in Kudu 1.11.0.  Back then, there were thoughts of switching to the
built-in client by default starting Kudu 1.12.

Since it's time for cutting the 1.12 release branch pretty soon, I think it's a
good opportunity to clarify whether we want to make that change or we
want to keep the time source as is in the 1.12 release.

For more context, the built-in NTP client has been used to run external
mini-cluster-based test scenarios since 1.11.0 release in gerrit pre-commit
builds.  In addition, I ran a 6 node cluster for a few weeks in public
cloud with basic write/read workload ('kudu perf loadgen' with the
--run_scan option).  So far I've seen no issues there.  As for the use in a
production environment, at this point I'm not aware of any Kudu clusters
running in production using the built-in NTP client.

The benefit of the built-in NTP client is that it allows running
Kudu without the requirement of having the local machines' clocks
synchronized by the kernel NTP discipline.  That might benefit newer Kudu
installations where machines' clocks are not synchronized out-of-the-box
and users struggle to deploy NTP servers (and configure them appropriately
if the default configuration is not good enough -- e.g., in case of
firewalled internal clusters).

If we switch to the 'builtin' time source by default (i.e. use the built-in
NTP client), existing installations running with the 'system' time source
will need to add an extra flag if it's desired to stay with the 'system'
time source after the upgrade to 1.12.  In that regard, the update would
not be backwards-compatible, but Kudu users should not care much about the
clock source assuming the built-in NTP client is reliable enough.  Also, in
case of Kudu clusters running without access to the internet, it will be
necessary to point the built-in NTP client to some internal NTP servers
since pool.ntp.org servers (the default servers for the built-in NTP
client) might not be accessible.

So, it seems enabling the built-in NTP client by default could benefit
newer installations, but might require extra configuration steps for
existing Kudu deployments where pool.ntp.org NTP servers are not
accessible.  The latter step should be described in the release notes for
1.12 release, of course.  Also, there is some risk of hitting a not-yet
detected bug in the built-in NTP client.

Do you think the benefits of removing the requirement to have the local
clock synchronized outweigh the drawbacks of adding an extra configuration
step during 1.12 upgrade for Kudu clusters isolated from the Internet?

Your feedback is highly appreciated!


Thanks,

Alexey


[ANNOUNCE] Apache Kudu 1.10.1 Released

2019-11-20 Thread Alexey Serbin
The Apache Kudu team is happy to announce the release of Kudu 1.10.1!

Kudu is an open source storage engine for structured data which
supports low-latency random access together with efficient analytical
access patterns. It supports many integrations with other data analytics
projects both inside and outside of the Apache Software Foundation.

Apache Kudu 1.10.1 is a bug fix release. Please see the release notes
for details:
  https://kudu.apache.org/releases/1.10.1/docs/release_notes.html

The Apache Kudu project only publishes source code releases. To build
Kudu 1.10.1, follow these steps:
  - Download the Kudu 1.10.1 source release:
  https://kudu.apache.org/releases/1.10.1
  - Follow the instructions in the documentation to build Kudu 1.10.1
from source:

https://kudu.apache.org/releases/1.10.1/docs/installation.html#build_from_source

For your convenience, binary JAR files for the Kudu Java client library,
Spark DataSource, Flume sink, and other Java integrations are published
to the ASF Maven repository and are now available:
https://search.maven.org/search?q=g:org.apache.kudu%20AND%20v:1.10.1

The Python client source is also available on PyPI:
  https://pypi.org/project/kudu-python/

Regards,
The Apache Kudu team


[RESULT] [VOTE] Apache Kudu 1.11.1-RC2

2019-11-19 Thread Alexey Serbin
Hello Kudu devs!

The vote for Apache Kudu 1.11.1-RC2 has closed.  The release candidate has
passed.

The votes received so far by Tue Nov 19 19:03:26 PST 2019 (with
deadline Tue Nov 19 19:00:00 PST 2019):

+1s:
  Attila Bukor
  Adar Lieber-Dembo
  Andrew Wong
  Grant Henke
  Hao Hao
  Alexey Serbin
  Greg Solovyev (non-binding)

+0s:
  Greg Solovyev (non-binding); this +0 was amended to be +1 after
re-running a few tests

No -1s of any kind.

Thank you for voting.  I'll run the rest of the release machinery to
release the candidate as Apache Kudu 1.11.1.


Thanks,

Alexey


Re: [VOTE] Apache Kudu 1.10.1-RC3

2019-11-19 Thread Alexey Serbin
+1

I built the C++ and Java code out of the 1.11.1-RC2 tag on CentOS6.6 x86_64
in RELEASE mode, ran C++ tests using dist-test and Java tests using
'./gradlew test'.

On Tue, Nov 19, 2019 at 3:17 PM Hao Hao 
wrote:

> +1 built in DEBUG mode and ran on CentOS 7.3; all the C++ and Java tests
> passed.
>
> On Tue, Nov 19, 2019 at 2:26 PM Andrew Wong 
> wrote:
>
> > +1
> >
> > - I built in RELEASE mode on Centos 7.3
> > - Looked at the updated release notes
> > - Looked through the source release for references to libnuma, and didn't
> > see anything unexpected
> >
> >
> > On Mon, Nov 18, 2019 at 1:47 AM Adar Lieber-Dembo
> > 
> > wrote:
> >
> > > +1
> > >
> > > I ran DEBUG tests in slow mode on Ubuntu 18.04 and CentOS 6.6. All
> > passed.
> > >
> > > On Sat, Nov 16, 2019 at 1:36 PM Attila Bukor 
> wrote:
> > > >
> > > > +1
> > > >
> > > > - Verified checksum, the signature and that the archive was created
> > from
> > > the tag
> > > > - Successfully built Kudu in release mode on CentOS 7.3 and macOS
> > Mojave
> > > 10.14.6
> > > > - Ran Java and C++ tests in slow mode on CentOS 7.3, only one failure
> > in
> > > each,
> > > >   both environmental (address already in use, too many open files)
> > > > - Ran Java tests on macOS Mojave with one failure (lsof missing).
> > > > - Successfully built Java and Spark examples using the staged JARs on
> > > CentOS and
> > > >   the Java examples also on macOS. Spark examples build failed on
> > macOS,
> > > but it
> > > >   seems it's not specific to this version. Opened a JIRA: KUDU-2997
> > > > - Read the release notes
> > > >
> > > > Attila
> > > >
> > > > On Fri, Nov 15, 2019 at 08:41:35PM -0800, Alexey Serbin wrote:
> > > > > Hello Kudu devs!
> > > > >
> > > > > The Apache Kudu team is happy to announce the third release
> candidate
> > > (RC3)
> > > > > for Apache Kudu 1.10.1.
> > > > >
> > > > > Apache Kudu 1.10.1 is a bug-fix release which fixes one critical
> > > licensing
> > > > > and few other issues in Kudu 1.10.0.
> > > > >
> > > > > This is a source-only release. The artifacts have been staged here:
> > > > >   https://dist.apache.org/repos/dist/dev/kudu/1.10.1-RC3/
> > > > >
> > > > > Java convenience binaries in the form of a Maven repository are
> > staged
> > > here:
> > > > >
> > > https://repository.apache.org/content/repositories/orgapachekudu-1053
> > > > >
> > > > > Linux and macOS kudu-binary JAR artifacts are staged here
> > > correspondingly:
> > > > >
> > > https://repository.apache.org/content/repositories/orgapachekudu-1054
> > > > >
> > > https://repository.apache.org/content/repositories/orgapachekudu-1055
> > > > >
> > > > > It is tagged in Git as 1.10.1-RC3 and the corresponding hash is the
> > > > > following:
> > > > >
> > > > >
> > >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=d7e2e998cf963dc98c568f264cc09c7c320cae0a
> > > > >
> > > > > The release notes can be found here:
> > > > >
> > >
> >
> https://github.com/apache/kudu/blob/branch-1.10.x/docs/release_notes.adoc
> > > > >
> > > > > The KEYS file to verify the artifact signatures can be found here:
> > > > >   https://dist.apache.org/repos/dist/release/kudu/KEYS
> > > > >
> > > > > I'd suggest going through the README and the release notes,
> building
> > > Kudu,
> > > > > and
> > > > > running the unit tests. Testing out the Maven repo would also be
> > > > > appreciated.
> > > > > Also, it's worth running Kudu Java tests against kudu-binary JAR
> > > artifact
> > > > > as described in the commit message here:
> > > > >
> > > > >
> > >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a
> > > > >
> > > > > This vote runs until Tue Nov 19 15:00:00 PST 2019.
> > > > > That's over 72 hours from the time of sending out this e-mail
> message
> > > due
> > > > > to the weekend.
> > > > >
> > > > >
> > > > > Kind regards,
> > > > >
> > > > > Alexey
> > >
> >
> >
> > --
> > Andrew Wong
> >
>


[VOTE] Apache Kudu 1.10.1-RC3

2019-11-15 Thread Alexey Serbin
Hello Kudu devs!

The Apache Kudu team is happy to announce the third release candidate (RC3)
for Apache Kudu 1.10.1.

Apache Kudu 1.10.1 is a bug-fix release which fixes one critical licensing
issue and a few other issues in Kudu 1.10.0.

This is a source-only release. The artifacts have been staged here:
  https://dist.apache.org/repos/dist/dev/kudu/1.10.1-RC3/

Java convenience binaries in the form of a Maven repository are staged here:
  https://repository.apache.org/content/repositories/orgapachekudu-1053

Linux and macOS kudu-binary JAR artifacts are staged here correspondingly:
  https://repository.apache.org/content/repositories/orgapachekudu-1054
  https://repository.apache.org/content/repositories/orgapachekudu-1055

It is tagged in Git as 1.10.1-RC3 and the corresponding hash is the
following:

https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=d7e2e998cf963dc98c568f264cc09c7c320cae0a

The release notes can be found here:
  https://github.com/apache/kudu/blob/branch-1.10.x/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
  https://dist.apache.org/repos/dist/release/kudu/KEYS
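
A typical verification flow looks like the following (a sketch; substitute
the actual artifact names from the staging directory above):

  wget https://dist.apache.org/repos/dist/release/kudu/KEYS
  gpg --import KEYS
  gpg --verify apache-kudu-1.10.1.tar.gz.asc apache-kudu-1.10.1.tar.gz
  sha512sum --check apache-kudu-1.10.1.tar.gz.sha512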

I'd suggest going through the README and the release notes, building Kudu,
and
running the unit tests. Testing out the Maven repo would also be
appreciated.
Also, it's worth running Kudu Java tests against kudu-binary JAR artifact
as described in the commit message here:

https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a

This vote runs until Tue Nov 19 15:00:00 PST 2019.
That's over 72 hours from the time of sending out this e-mail message due
to the weekend.


Kind regards,

Alexey


[RESULT] [VOTE] Apache Kudu 1.11.1-RC1

2019-11-15 Thread Alexey Serbin
Hello Kudu devs!

The vote for Apache Kudu 1.11.1-RC1 has closed.  The release candidate
didn't pass.

The votes received so far by Fri Nov 15 18:30:00 PST 2019 (with
deadline Fri Nov 15 18:00:00 PST 2019):

+1s:
  Greg Solovyev (non-binding)
  Adar Lieber-Dembo

No 0 or -1s of any kind.

Thank you for voting.  I'll prepare 1.11.1-RC2 and send out an announcement
soon.


Thanks,

Alexey


[RESULT] [VOTE] Apache Kudu 1.10.1-RC2

2019-11-14 Thread Alexey Serbin
Hello Kudu devs!

The vote for Apache Kudu 1.10.1-RC2 has closed.  The release candidate
didn't pass.

The votes received so far by Thu Nov 14 11:30:00 PST 2019 (with
deadline Wed Nov 13 23:00:00 PST 2019):

+1s:
  Adar Lieber-Dembo

No 0 or -1s of any kind.

Thank you for voting.  I'll prepare 1.10.1-RC3 and send out an announcement
soon.


Thanks,

Alexey


[VOTE] Apache Kudu 1.11.1-RC1

2019-11-12 Thread Alexey Serbin
Hello Kudu devs!

The Apache Kudu team is happy to announce the first release candidate (RC1)
for Apache Kudu 1.11.1.

Apache Kudu 1.11.1 is a bug-fix release which fixes one critical licensing
issue and a few other issues in Kudu 1.11.0.

This is a source-only release. The artifacts have been staged here:
  https://dist.apache.org/repos/dist/dev/kudu/1.11.1-RC1/

Java convenience binaries in the form of a Maven repository are staged here:
  https://repository.apache.org/content/repositories/orgapachekudu-1048

Linux and macOS kudu-binary JAR artifacts are staged here correspondingly:
  https://repository.apache.org/content/repositories/orgapachekudu-1049
  https://repository.apache.org/content/repositories/orgapachekudu-1052

It is tagged in Git as 1.11.1-RC1 and the corresponding hash is the
following:

https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=c42b88906a6261c4c1de281eab8afc82674ecafd

The release notes can be found here:
  https://github.com/apache/kudu/blob/branch-1.11.x/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
  https://dist.apache.org/repos/dist/release/kudu/KEYS

I'd suggest going through the README and the release notes, building Kudu,
and
running the unit tests. Testing out the Maven repo would also be
appreciated.
Also, it's worth running Kudu Java tests against kudu-binary JAR artifact
as described in the commit message here:

https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=8a6faaa93f3e206ac75e8087731daccaf7ab646a

The vote will run until Fri Nov 15 18:00:00 PST 2019.
This is a bit over 72 hours from the time of sending out this e-mail
message.


Thanks,

Alexey


Re: [VOTE] Apache Kudu 1.10.1-RC2

2019-11-12 Thread Alexey Serbin
Thank you for the verification of 1.10.1-RC2.

The missing and required piece between 1.10.1-RC1 and 1.10.1-RC2 was
990f612bb.  Without that changelist, invoking the
build_mini_cluster_binaries.sh script returned an error because of missing
licensing info for libyaml-cpp: the kudu-master and kudu-tserver binaries
are dynamically linked with that thirdparty library.

On Tue, Nov 12, 2019 at 1:14 AM Adar Lieber-Dembo 
wrote:

> +1
>
> On Ubuntu 18.04 I built RELEASE with NO_TESTS=1 and verified that
> there were no numa or memkind dependencies or symbols in any binaries.
> I also grepped for numa and memkind across the codebase and didn't see
> any unexpected dependencies.
>
> Also on Ubuntu 18.04 I built DEBUG and verified that all tests passed
> in slow mode.
>
> > P.S. RC1 vote wasn't run since the 1.10.1-RC1 tag was placed on a
> snapshot
> >  that failed to build kudu-binaries JAR artifact).
>
> What caused the build to fail? I see these commits in RC2 but not in RC1:
> * e437f5add - (tag: 1.10.1-RC2, origin-internal/branch-1.10.x,
> gerrit/branch-1.10.x, apache/branch-1.10.x) [build-support] fix on
> enable_devtoolset_inner.sh (2 days ago) 
> * 98bb8d2fc - [thirdparty] add license info about yaml-cpp (2 days
> ago) 
> * 990f612bb - [build-support] add info on libyaml-cpp (2 days ago)
> 
>


Re: JIRA integration with gitbox

2019-11-08 Thread Alexey Serbin
+1 since it will save me some time, at least.

I usually post a comment on a JIRA ticket once it's resolved, adding
information about git commit hash of the change that contains the fix.
With this integration in place, there will be no need to do that.


/Alexey

On Fri, Nov 8, 2019 at 8:07 PM Attila Bukor  wrote:

> I’m +1 on this as well. It would be nice to be able to find the commit
> right from the JIRA.
>
> Sent from my iPhone
>
> > On Nov 8, 2019, at 7:57 PM, Grant Henke 
> wrote:
> >
> > +1 This would be great
> >
> >> On Fri, Nov 8, 2019 at 12:53 PM Andrew Wong  wrote:
> >>
> >> SGTM. My first thought was that the issues mailing list might get more
> >> traffic, but I'm ok with that, since contributors today often try to
> post
> >> an equivalent comment anyways. This just makes that more consistent.
> >>
> >> On Fri, Nov 8, 2019, 10:45 AM Todd Lipcon 
> >> wrote:
> >>
> >>> Sounds nice to me
> >>>
> >>> On Fri, Nov 8, 2019 at 10:44 AM Adar Lieber-Dembo
> >>> 
> >>> wrote:
> >>>
>  Hi devs,
> 
>  The Apache Impala project has this neat integration set up between
> >> gitbox
>  and JIRA where if someone pushes a new commit to gitbox and that
> commit
> >>> has
>  a JIRA associated with it, the push generates a JIRA comment
> describing
> >>> the
>  commit that was just pushed. You can see an example of this here:
>  https://issues.apache.org/jira/browse/IMPALA-6478
> 
>  I think this is pretty useful and I’d get it enabled for Kudu too.
> Does
>  anyone object? If I don’t hear any objections by next Wednesday I’ll
> go
>  ahead and file an INFRA ticket.
> 
> >>>
> >>>
> >>> --
> >>> Todd Lipcon
> >>> Software Engineer, Cloudera
> >>>
> >>
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>
>


Re: Licensing issue in Kudu 1.10.0 and 1.11.0

2019-11-06 Thread Alexey Serbin
Thank you for addressing KUDU-2990, Adar.

I can run the RM machinery for 1.11.1 and 1.10.1.

BTW, what should the 1.11.1 and 1.10.1 release notes include besides a note on
KUDU-2990?  Should they contain the note on KUDU-2990 plus the rest of the
notes from 1.11.0 and 1.10.0, respectively?


Thanks,

Alexey

On Thu, Nov 7, 2019 at 5:07 AM Adar Lieber-Dembo 
wrote:

> KUDU-2990 has been fixed on master, so 1.12.0 will be conformant when
> it is released in several months.
>
> I also cherry-picked the fix into branch-1.10.x and branch-1.11.x.
> Would anyone like to volunteer to RM 1.10.1 and 1.11.1? Besides the
> usual RM machinery, we need new release notes that document the
> effects of KUDU-2990.
>
> On Sun, Nov 3, 2019 at 8:21 PM Grant Henke 
> wrote:
> >
> > +1 I agree with all of Adars suggestions.
> >
> > On Fri, Nov 1, 2019 at 8:34 PM Alexey Serbin
> 
> > wrote:
> >
> > > Thank you for the feedback, Adar.
> > >
> > > I'll add the information on the licensing issue into the 1.11.0 release
> > > announcement I'm about to send.
> > >
> > > I asked a question about the proper way of communicating of the issue
> on
> > > the LEGAL-487's comment thread, mentioning that we are about to add a
> > > notice into
> > > https://kudu.apache.org/docs/known_issues.html#_other_known_issues
> > >
> > >
> > > Best regards,
> > >
> > > Alexey
> > >
> > > On Fri, Nov 1, 2019 at 4:20 PM Adar Lieber-Dembo
>  > > >
> > > wrote:
> > >
> > > > My two cents:
> > > > - The presence of 1.11.0 on the download page means that 1.11.0 has
> > > > been de facto released, announcement or no announcement. The
> > > > announcement doesn't add any additional hurt, so I think we should
> > > > move forward with it.
> > > > - Separately, let's also announce the licensing issue and say that
> > > > we're working to rectify it in all affected release lines. To that
> > > > end, we will release 1.10.1 and 1.11.1 with the fix ASAP. The
> guidance
> > > > offered in LEGAL-487 so far seems to corroborate this.
> > > > - When 1.12.0 is released several months hence, it will be de facto
> > > > compliant by virtue of whatever fix first landing in master and then
> > > > being backported to branch-1.10.x and branch-1.11.x.
> > > > - I don't know whether we should call this out as a "known issue", as
> > > > that's typically been used for technical issues rather than legal
> > > > ones. Would be curious to hear what others think, and maybe you can
> > > > solicit further feedback in LEGAL-487?
> > > >
> > > > On Fri, Nov 1, 2019 at 4:08 PM Alexey Serbin
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > As Adar recently found, both in Kudu 1.10.0 and Kudu 1.11.0 (due
> to be
> > > > > announced today) the kudu-binary artifact contains libnuma library
> > > which
> > > > is
> > > > > under LGPL v.2.1, but it's against the ASF 3rd-party license
> policy:
> > > > >   https://www.apache.org/legal/resolved.html#category-x
> > > > >
> > > > > See https://issues.apache.org/jira/browse/KUDU-2990 for details.
> > > > >
> > > > > Apart from the technical discussion on how to resolve that, there
> are
> > > few
> > > > > process-related questions like:
> > > > >   1. How to address the issue in Kudu 1.11.0, which is de facto
> already
> > > > out
> > > > > of the door?
> > > > >   2. Should we address the issue in upcoming Kudu 1.12.0 release
> (about
> > > > 3-4
> > > > > month in the future) or implement the solution and release it with
> Kudu
> > > > > 1.11.1 ASAP?
> > > > >   3. If choosing the latter option from the previous item, should
> the
> > > > > announcement of the new Kudu 1.11.0 release be postponed/muted, so
> we
> > > > > announce only when Kudu 1.11.1 is out with KUDU-2990 addressed?
> > > > >
> > > > > Given the timing and the fact that Kudu 1.11.0 artifacts are
> already
> > > > > published, I think one of the possible paths forward is to proceed
> with
> > > > the
> > > > > announcement of Kudu 1.11.0 release as planned, but add an item
> about
> > > > > KUDU-2990 into the 'known issues' document, so it will be
> available at
> > > > the
> > > > > Apache Kudu website:
> > > > > https://kudu.apache.org/docs/known_issues.html#_other_known_issues
> > > > >
> > > > > What do you think?  Your feedback is appreciated.
> > > > >
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Alexey
> > > >
> > >
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


[ANNOUNCE] Apache Kudu 1.11.0 Released

2019-11-01 Thread Alexey Serbin
The Apache Kudu team is happy to announce the release
of Kudu 1.11.0!

Kudu is an open source storage engine for structured data which
supports low-latency random access together with efficient analytical
access patterns. It is designed within the context of the Apache Hadoop
ecosystem and supports many integrations with other data analytics
projects both inside and outside of the Apache Software Foundation.

Apache Kudu 1.11.0 is a minor release that offers several new features,
improvements, optimizations, and bug fixes. Please see the release notes
for details:
  https://kudu.apache.org/releases/1.11.0/docs/release_notes.html

The Apache Kudu project only publishes source code releases. To build
Kudu 1.11.0, follow these steps:
  - Download the Kudu 1.11.0 source release:
  https://kudu.apache.org/releases/1.11.0
  - Follow the instructions in the documentation to build Kudu 1.11.0
from source:

https://kudu.apache.org/releases/1.11.0/docs/installation.html#build_from_source

For your convenience, binary JAR files for the Kudu Java client library,
Spark DataSource, Flume sink, and other Java integrations are published
to the ASF Maven repository and are now available:
https://search.maven.org/search?q=g:org.apache.kudu%20AND%20v:1.11.0

The Python client source is also available on PyPI:
  https://pypi.org/project/kudu-python/

Additionally, experimental Docker images are published to Docker Hub:
  https://hub.docker.com/r/apache/kudu
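
Pulling the image for a quick try-out might look like the following (a
sketch; check the Docker Hub page above for the tags actually published):

  docker pull apache/kudu:1.11.0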

NOTE: as it was found after the release artifacts had already been
  published, the kudu-binary JAR artifact in Kudu 1.11.0 doesn't
  comply with the ASF 3rd-party license policy [1] since it includes
  the libnuma dynamic library which is licensed under LGPL v2.1.
  As it turned out, the same is true for the kudu-binary JAR
  artifact released in July with Kudu 1.10.0. See [2] and [3] for
details.

  The inadvertent inclusion of an LGPL library will be addressed
  ASAP by releasing Kudu 1.10.1 and Kudu 1.11.1 patch releases
  adhering to the ASF 3rd party license policy.

References:
  [1] https://www.apache.org/legal/resolved.html
  [2] https://issues.apache.org/jira/browse/KUDU-2990
  [3] https://issues.apache.org/jira/browse/LEGAL-487


Regards,
The Apache Kudu team


Re: Licensing issue in Kudu 1.10.0 and 1.11.0

2019-11-01 Thread Alexey Serbin
Thank you for the feedback, Adar.

I'll add the information on the licensing issue into the 1.11.0 release
announcement I'm about to send.

I asked a question about the proper way of communicating of the issue on
the LEGAL-487's comment thread, mentioning that we are about to add a
notice into
https://kudu.apache.org/docs/known_issues.html#_other_known_issues


Best regards,

Alexey

On Fri, Nov 1, 2019 at 4:20 PM Adar Lieber-Dembo 
wrote:

> My two cents:
> - The presence of 1.11.0 on the download page means that 1.11.0 has
> been de facto released, announcement or no announcement. The
> announcement doesn't add any additional hurt, so I think we should
> move forward with it.
> - Separately, let's also announce the licensing issue and say that
> we're working to rectify it in all affected release lines. To that
> end, we will release 1.10.1 and 1.11.1 with the fix ASAP. The guidance
> offered in LEGAL-487 so far seems to corroborate this.
> - When 1.12.0 is released several months hence, it will be de facto
> compliant by virtue of whatever fix first landing in master and then
> being backported to branch-1.10.x and branch-1.11.x.
> - I don't know whether we should call this out as a "known issue", as
> that's typically been used for technical issues rather than legal
> ones. Would be curious to hear what others think, and maybe you can
> solicit further feedback in LEGAL-487?
>
> On Fri, Nov 1, 2019 at 4:08 PM Alexey Serbin
>  wrote:
> >
> > Hi,
> >
> > As Adar recently found, both in Kudu 1.10.0 and Kudu 1.11.0 (due to be
> > announced today) the kudu-binary artifact contains libnuma library which
> is
> > under LGPL v.2.1, but it's against the ASF 3rd-party license policy:
> >   https://www.apache.org/legal/resolved.html#category-x
> >
> > See https://issues.apache.org/jira/browse/KUDU-2990 for details.
> >
> > Apart from the technical discussion on how to resolve that, there are few
> > process-related questions like:
> >   1. How to address the issue in Kudu 1.11.0, which is de facto already
> out
> > of the door?
> >   2. Should we address the issue in upcoming Kudu 1.12.0 release (about
> 3-4
> > month in the future) or implement the solution and release it with Kudu
> > 1.11.1 ASAP?
> >   3. If choosing the latter option from the previous item, should the
> > announcement of the new Kudu 1.11.0 release be postponed/muted, so we
> > announce only when Kudu 1.11.1 is out with KUDU-2990 addressed?
> >
> > Given the timing and the fact that Kudu 1.11.0 artifacts are already
> > published, I think one of the possible paths forward is to proceed with
> the
> > announcement of Kudu 1.11.0 release as planned, but add an item about
> > KUDU-2990 into the 'known issues' document, so it will be available at
> the
> > Apache Kudu website:
> > https://kudu.apache.org/docs/known_issues.html#_other_known_issues
> >
> > What do you think?  Your feedback is appreciated.
> >
> >
> > Best regards,
> >
> > Alexey
>


Licensing issue in Kudu 1.10.0 and 1.11.0

2019-11-01 Thread Alexey Serbin
Hi,

As Adar recently found, both in Kudu 1.10.0 and in Kudu 1.11.0 (due to be
announced today) the kudu-binary artifact contains the libnuma library, which
is licensed under LGPL v2.1; that's against the ASF 3rd-party license policy:
  https://www.apache.org/legal/resolved.html#category-x

See https://issues.apache.org/jira/browse/KUDU-2990 for details.

Apart from the technical discussion on how to resolve that, there are a few
process-related questions:
  1. How to address the issue in Kudu 1.11.0, which is de facto already out
of the door?
  2. Should we address the issue in the upcoming Kudu 1.12.0 release (about 3-4
months in the future) or implement the solution and release it with Kudu
1.11.1 ASAP?
  3. If choosing the latter option from the previous item, should the
announcement of the new Kudu 1.11.0 release be postponed/muted, so we
announce only when Kudu 1.11.1 is out with KUDU-2990 addressed?

Given the timing and the fact that Kudu 1.11.0 artifacts are already
published, I think one of the possible paths forward is to proceed with the
announcement of Kudu 1.11.0 release as planned, but add an item about
KUDU-2990 into the 'known issues' document, so it will be available at the
Apache Kudu website:
https://kudu.apache.org/docs/known_issues.html#_other_known_issues

What do you think?  Your feedback is appreciated.


Best regards,

Alexey


Re: Re: [VOTE] Apache Kudu 1.11.0-RC3

2019-10-28 Thread Alexey Serbin
> > > >
> > > > > +1
> > > > > I ran dev docker build per
> > > > > https://github.com/apache/kudu/tree/1.11.0-RC3/docker - no errors.
> > > > > Greg
> > > > >
> > > > >
> > > > > On Fri, Oct 25, 2019 at 5:27 AM Attila Bukor 
> > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > - Successfully built in RELEASE mode on macOS Mojave and CentOS
> > 7.3.
> > > > > >
> > > > > > - Ran tests in slow mode on CentOS, memory_gc-itest seems to be
> > flaky
> > > > as it
> > > > > >   failed at first but I re-ran it several times and it passed.
> > There
> > > > was
> > > > > > another
> > > > > >   failure which was environmental (address already in use).
> > > > > >
> > > > > > - Built and tested the Java clients, there was one environmental
> > > > failure
> > > > > > (too
> > > > > >   many open files).
> > > > > >
> > > > > > - Successfully built examples/java/insert-loadgen and
> > > > > >   examples/scala/spark-example using the 1.11.0 convenience JARs
> > from
> > > > the
> > > > > >   staging repo.
> > > > > >
> > > > > > Attila
> > > > > >
> > > > > > On Fri, Oct 25, 2019 at 07:34:45PM +0800,
> > hzhel...@corp.netease.com
> > > > wrote:
> > > > > > > Hi Yifan,
> > > > > > >
> > > > > > > Thanks for your report, but I guess that's a known problem. In
> > the
> > > > > > previous
> > > > > > > patch submitted by Yingchun, it has improved the output of the
> > > > defective
> > > > > > > value, especially for the historical range partitioned tables.
> At
> > > the
> > > > > > same
> > > > > > > time, the problems you found in the master's metrics and kudu
> CLI
> > > > tool
> > > > > > are
> > > > > > > real. But both of them have a lower priority in my opinion, and
> > > > shouldn't
> > > > > > > block the release. As your said we can fix them in the next
> > > version.
> > > > > > >
> > > > > > > +1
> > > > > > > Ran both debug and release mode on debian8.9, all c++ tests
> > passed
> > > > except
> > > > > > > client_symbol-test as I made a soft link on directory "bin".
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 何李夫
> > > > > > > 2019-10-25 19:34:19
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: dev-return-6342-hzhelifu=corp.netease@kudu.apache.org
> > > > > > >  on behalf of Zhang Yifan
> > > > > > > Sent: October 25, 2019 15:08
> > > > > > > To: dev@kudu.apache.org; aser...@cloudera.com
> > > > > > > Subject: Re: [VOTE] Apache Kudu 1.11.0-RC3
> > > > > > >
> > > > > > > +0
> > > > > > > Built on CentOS 7.3. Ran DEBUG tests in slow mode. No failures.
> > > > > > > It seems the table metric 'live_row_count' still has some problems.
> > > > > > >
> > > > > > >
> > > > > > > When master aggregate all tablets' live_row_count, the table
> > metric
> > > > > > > 'live_row_count' is 0 for legacy tables, and for legacy tables
> > with
> > > > newly
> > > > > > > added partitions, the value is live row count of new
> partitions.
> > I
> > > > could
> > > > > > get
> > > > > > > the metric value via master’s Web UI at master:8051/metrics.
> > > > > > > I also tried to get this metric via `kudu table statistics` CLI
> > > > tool, the
> > > > > > > value is 0 for both legacy tables and partly legacy tables.
> > > > > > > I think this new table metric should be invalid if the table
> > > contains
> > > > > > some

Re: Re: [VOTE] Apache Kudu 1.11.0-RC3

2019-10-28 Thread Alexey Serbin
I'm surprised we got a non-0 count for a table with a mix of legacy and new
tablets.  I hope that doesn't bring any incompatibilities, so that we can
address it in the future 1.12 release.  It would help if we had a JIRA
with documented reproduction scenarios.

I'll update the release notes in branch-1.11.x w.r.t. the behavior of the
'live_row_count' metric. If this RC passes, the note will get into the
branch, but not into the released source tarball.

Let's extend the vote deadline till 6 pm PDT (Mon Oct 28 18:00:00 PDT 2019).


Thanks,

Alexey

On Mon, Oct 28, 2019 at 8:45 AM Adar Lieber-Dembo  wrote:

> +1
>
> I ran all tests in slow DEBUG mode on CentOS 6.6 and Ubuntu 18.04. All
> of them passed.
>
> I'm surprised that live row count still has issues given the late
> breaking fixes that Yingchun Lai merged into 1.11.0. His last fix was
> expressly intended to address the problem where a table with a mix of
> legacy and new tablets misreports the count: it should report a live
> row count of 0. Zhang Yifan, could you file a JIRA with your findings?
>
> On Sun, Oct 27, 2019 at 4:06 PM Greg Solovyev
>  wrote:
> >
> > +1
> > I ran dev docker build per
> > https://github.com/apache/kudu/tree/1.11.0-RC3/docker - no errors.
> > Greg
> >
> >
> > On Fri, Oct 25, 2019 at 5:27 AM Attila Bukor  wrote:
> >
> > > +1
> > >
> > > - Successfully built in RELEASE mode on macOS Mojave and CentOS 7.3.
> > >
> > > - Ran tests in slow mode on CentOS, memory_gc-itest seems to be flaky
> as it
> > >   failed at first but I re-ran it several times and it passed. There
> was
> > > another
> > >   failure which was environmental (address already in use).
> > >
> > > - Built and tested the Java clients, there was one environmental
> failure
> > > (too
> > >   many open files).
> > >
> > > - Successfully built examples/java/insert-loadgen and
> > >   examples/scala/spark-example using the 1.11.0 convenience JARs from
> the
> > >   staging repo.
> > >
> > > Attila
> > >
> > > On Fri, Oct 25, 2019 at 07:34:45PM +0800, hzhel...@corp.netease.com
> wrote:
> > > > Hi Yifan,
> > > >
> > > > Thanks for your report, but I guess that's a known problem. The
> > > > previous patch submitted by Yingchun improved the output of the
> > > > defective value, especially for the historical range-partitioned
> > > > tables. At the same time, the problems you found in the master's
> > > > metrics and the kudu CLI tool are real. But both of them have a
> > > > lower priority in my opinion, and shouldn't block the release. As
> > > > you said, we can fix them in the next version.
> > > >
> > > > +1
> > > > Ran both debug and release mode on debian8.9, all c++ tests passed
> except
> > > > client_symbol-test as I made a soft link on directory "bin".
> > > >
> > > >
> > > >
> > > > 何李夫
> > > > 2019-10-25 19:34:19
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: dev-return-6342-hzhelifu=corp.netease@kudu.apache.org
> > > >  on behalf of Zhang Yifan
> > > > Sent: October 25, 2019 15:08
> > > > To: dev@kudu.apache.org; aser...@cloudera.com
> > > > Subject: Re: [VOTE] Apache Kudu 1.11.0-RC3
> > > >
> > > > +0
> > > > Built on CentOS 7.3. Ran DEBUG tests in slow mode. No failures.
> > > > It seems the table metric 'live_row_count' still has some problems.
> > > >
> > > >
> > > > When master aggregate all tablets' live_row_count, the table metric
> > > > 'live_row_count' is 0 for legacy tables, and for legacy tables with
> newly
> > > > added partitions, the value is live row count of new partitions. I
> could
> > > get
> > > > the metric value via master’s Web UI at master:8051/metrics.
> > > > I also tried to get this metric via `kudu table statistics` CLI
> tool, the
> > > > value is 0 for both legacy tables and partly legacy tables.
> > > > I think this new table metric should be invalid if the table contains
> > > some
> > > > legacy tablets, and we can't get the metric via clients and Web UI.
> > > >
> > > >
> > > > We should at least document the meaning of the table metric in the
> > > release
> > > > notes and fix it in 1.12.0.
> > > >

Re: Time source for Kudu tests

2019-10-25 Thread Alexey Serbin
Thank you Adar and Andrew for the thoughtful feedback.

I think switching to the 'system_unsync' clock for non-EMC (i.e. not based
on External Mini-Cluster) tests is safe enough, even if it might be
affected by NTP and manual time changes.

I guess the majority of Kudu tests which are now run with the 'system' clock
source would not even notice if time traveled back and/or forth during their
run.  Those that do notice should be set apart into a separate
time-sensitive category and run only
  a. using 'system' time source when machine clock is synchronized with NTP
  b. using the built-in NTP client when it's synchronized with a reliable
NTP server

So far I like the idea of switching non-EMC tests to 'system_unsync' time
source and keeping EMC tests on the 'builtin' source.  Meanwhile, I can
find out which tests fall into the time-sensitive category: that would
require an environment where the system time jumps back and forth while
tests are running.

Please let me know what you think about this.


Thanks,

Alexey

On Mon, Oct 7, 2019 at 6:46 PM Andrew Wong 
wrote:

> I had a chat with Alexey about this that I think helped clarify some things
> for
> me, so thought I'd share (though feel free to correct me if I'm wrong about
> anything):
> - The current state of the master branch is that:
>   - Our external mini cluster tests initialize MiniChronyd, and because of
> this, when used with the builtin client, the clock is considered
> synchronized almost immediately.
>   - All our other tests don't do this. They continue to use the system
> clock,
> and thus, may be susceptible to the "slow NTP" startup issues we've
> seen
> before.
> - This thread is suggesting that we converge on using a single time piece:
>   the `system_unsync` aka "local clock", which may not necessarily be
>   synchronized by the Kernel with time servers, but would be good enough
> for
>   all of our existing tests. This seems desirable because it:
>   - Means that all our tests are run in the same way using the same clock
> semantics.
>   - Should be sufficiently fast to "synchronize" because there's no actual
> NTP
> synchronization involved. And that's OK. Our tests don't care.
>
> I agree with this approach, but I also wouldn't be opposed to keeping our
> existing external mini cluster tests using the builtin client (and using
> the
> local clock for other tests), because:
> - It is closer to what we might expect in the real world because we expect
> to
>   eventually default to using the builtin client. While not truly "real",
> since
>   we're still synchronizing with something local and not an external set of
>   time servers, it still seems like better coverage than none.
> - It should still net us fewer NTP-related "random" test failures.
> - The line between tests that use external mini clusters and tests that
> don't
>   is very clear. And so the argument that it might be confusing to have
>   different time semantics across tests isn't as compelling to me.
>
> I don't feel particularly strongly for this approach vs the one laid out by
> Alexey, but thought I'd bring it up since in general it seems like external
> mini cluster tests are meant to simulate as real of an environment as
> possible (modulo a lot of effort). Continuing to use the builtin client
> seems
> like a step in that direction since we hope it will be the default one day.
>
> On Mon, Oct 7, 2019 at 5:49 PM Adar Lieber-Dembo  >
> wrote:
>
> > Thanks for the detailed summary.
> >
> > I agree with the goal of using the same time source across all tests
> > if possible. Originally, I was hoping this would be 'builtin', but
> > that'd require deploying MiniChronyd too, both to avoid a dependency
> > on Internet connectivity and to ensure that clock synchronization at
> > test startup is sufficiently fast. And, like you said, not only is
> > that a good chunk of work, it's also somewhat antithetical to those
> > pure "in-proc" test setups.
> >
> > So I reluctantly agree that 'system_unsync' may be the next best
> > option, with targeted uses of 'system' and 'builtin' for testing
> > specific functionality. That said, is it safe that 'system_unsync'
> > uses CLOCK_REALTIME under the hood, which is affected by NTP and admin
> > time changes?
> >
> > On Sun, Oct 6, 2019 at 11:27 PM Alexey Serbin
> >  wrote:
> > >
> > > Hi,
> > >
> > > Recently, the built-in NTP client was introduced in Kudu.  Todd Lipcon
> > > put together the original WIP patch a couple of years ago [1], and
> about
> > a
> > > week
> > > ago the built-in NTP client was merged into t

[VOTE] Apache Kudu 1.11.0-RC3

2019-10-23 Thread Alexey Serbin
Hello Kudu devs!

The Apache Kudu team is happy to announce the third release candidate (RC3)
for Apache Kudu 1.11.0.

Apache Kudu 1.11.0 is a minor release that offers many improvements and
fixes since the prior release.

This is a source-only release. The artifacts have been staged here:
  https://dist.apache.org/repos/dist/dev/kudu/1.11.0-RC3/

Java convenience binaries in the form of a Maven repository are staged here:
  https://repository.apache.org/content/repositories/orgapachekudu-1041

It is tagged in Git as 1.11.0-RC3 and the corresponding hash is the
following:

https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=08db97c591d9131cbef9b8e5b4f44a6e854c25f0

The release notes can be found here:
  https://github.com/apache/kudu/blob/branch-1.11.x/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
  https://dist.apache.org/repos/dist/release/kudu/KEYS

I'd suggest going through the README and the release notes, building Kudu,
and
running the unit tests. Testing out the Maven repo would also be
appreciated.
Another test is to check the behavior of the newly introduced
table live rows metric when upgrading from Kudu 1.10 with pre-existing
tables, making sure it behaves the expected way for legacy tables and
also for legacy tables with newly added partitions while running Kudu 1.11.
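
One way to eyeball that metric is via the CLI tool and the master's Web UI
metrics endpoint mentioned earlier in this thread (a sketch; substitute real
master addresses, hostnames, and table names for the placeholders):

  kudu table statistics <master-addresses> <table-name>
  curl -s http://<master-host>:8051/metrics | grep live_row_count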

The vote will run until Mon Oct 28 11:00:00 PDT 2019. This is more than the
usual 72 hours from the time of sending out this e-mail message because
of the weekend.


Thanks,
Alexey


Re: [VOTE] Apache Kudu 1.11.0-RC2

2019-10-23 Thread Alexey Serbin
The vote is closed, 1.11.0-RC2 didn't pass.

As Adar mentioned, we needed http://gerrit.cloudera.org:8080/14507 to make
live row count work properly and to avoid the incompatible changes that
fixing it in a later release would introduce.  Also, having KUDU-2980
fixed would make the release more robust.

I'll cut RC3 as soon as all the patches we want are in the branch-1.11.x.


On Tue, Oct 22, 2019 at 2:22 PM Attila Bukor  wrote:

> I agree with Adar regarding the fix, we should avoid knowingly releasing a
> version with an issue we can't fix down the road due to
> backwards-compatibility
> issues.
>
> I also agree with Alexey that we can revert the change introducing
> live_row_count instead of waiting for the fix, at least if it takes too
> long. I
> think we can revert the commit in branch-1.11.x only though, that way we
> don't
> need to re-add it in master.
>
> Either way, we'll need to cut an RC3 if we want to avoid a breaking change
> in
> 1.12, so in case KUDU-2980 and gerrit/14507 are close to being merged, we
> might as
> well wait for them to be merged before cutting RC3.
>
> I would let Alexey decide which approach to take, though, as he's the RM.
>
> On Tue, Oct 22, 2019 at 01:30:41PM -0700, Alexey Serbin wrote:
> > I think we can cut another RC if we really think that incorporating these
> > fixes makes the release candidate more robust and sound.  I'm not sure
> > what the time-based limit is there, but it would be great if we could
> > wrap it up in one week or so.
> >
> > On the other hand, as Plan B we could roll back the changes that
> > http://gerrit.cloudera.org:8080/14507 follows up on, and update the
> > release notes regarding the live_row_count metric to make sure we don't
> > introduce any incompatibility.  We can re-apply the reverted patches in
> > the master branch and continue developing and fixing related bugs (if
> > any), scheduling their release in Kudu 1.12 with less hassle.  We can
> > also schedule fixes for KUDU-2980 to be in 1.12 since it doesn't look
> > like a release blocker (1.10 is also affected by the issue, so it's not
> > a regression introduced in 1.11 anyway).
> >
> > Which path is more preferable?
> >
> > We can think of other options as well.
> >
> >
> > Thanks,
> >
> > Alexey
> >
> > On Tue, Oct 22, 2019 at 10:46 AM Adar Lieber-Dembo
> >  wrote:
> >
> > > +0
> > >
> > > Built on Ubuntu 18.04 and CentOS 6.6. Ran DEBUG tests in slow mode. No
> > > failures.
> > >
> > > I do wish the release incorporated the fix for KUDU-2980, but that
> > > hasn't even been published to CR yet. Plus it won't break backwards
> > > compatibility if we fix it later.
> > >
> > > On the other hand, http://gerrit.cloudera.org:8080/14507 must get in
> > > now; otherwise there will be backwards compat issues if it is merged
> > > for 1.12.0. What do other people think?
> > >
> > >
> > >
> > >
> > > On Fri, Oct 18, 2019 at 12:53 PM Alexey Serbin
> > >  wrote:
> > > >
> > > > Hello Kudu devs!
> > > >
> > > > The Apache Kudu team is happy to announce the second release
> candidate
> > > for
> > > > Apache Kudu 1.11.0.
> > > >
> > > > Apache Kudu 1.11.0 is a minor release that offers many improvements
> and
> > > > fixes
> > > > since the prior release.
> > > >
> > > > This is a source-only release. The artifacts have been staged here:
> > > > https://dist.apache.org/repos/dist/dev/kudu/1.11.0-RC2/
> > > >
> > > > Java convenience binaries in the form of a Maven repository are
> staged
> > > here:
> > > >
> https://repository.apache.org/content/repositories/orgapachekudu-1040/
> > > >
> > > > It is tagged in Git as 1.11.0-RC2 and the corresponding hash is the
> > > > following:
> > > >
> > >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=f6623f527b6c7e95b84d6f719fa1cb9eff4ebd29
> > > >
> > > > The release notes can be found here:
> > > >
> > >
> https://github.com/apache/kudu/blob/branch-1.11.x/docs/release_notes.adoc
> > > >
> > > > The KEYS file to verify the artifact signatures can be found here:
> > > > https://dist.apache.org/repos/dist/release/kudu/KEYS
> > > >
> > > > I'd suggest going through the README and the release notes, building
> > > Kudu,
> > > > and
> > > > running the unit tests. Testing out the Maven repo would also be
> > > > appreciated.
> > > >
> > > > The vote will run until Wed Oct 23 11:00:00 PDT 2019.  This is more
> > > > than the usual 72 hours from the time of sending out this e-mail
> > > > message because of the weekend.
> > > >
> > > >
> > > > Thanks,
> > > > Alexey
> > >
>


Re: [VOTE] Apache Kudu 1.11.0-RC2

2019-10-22 Thread Alexey Serbin
I think we can cut another RC if we really think that incorporating these
fixes makes the release candidate more robust and sound.  I'm not sure what
the time-based limit is there, but it would be great if we could wrap it up
in one week or so.

On the other hand, as Plan B we could roll back the changes that
http://gerrit.cloudera.org:8080/14507 follows up on, and update the release
notes regarding the live_row_count metric to make sure we don't introduce
any incompatibility.  We can re-apply the reverted patches in the master
branch and continue developing and fixing related bugs (if any), scheduling
their release in Kudu 1.12 with less hassle.  We can also schedule fixes
for KUDU-2980 to be in 1.12 since it doesn't look like a release blocker
(1.10 is also affected by the issue, so it's not a regression introduced in
1.11 anyway).

Which path is more preferable?

We can think of other options as well.


Thanks,

Alexey

On Tue, Oct 22, 2019 at 10:46 AM Adar Lieber-Dembo
 wrote:

> +0
>
> Built on Ubuntu 18.04 and CentOS 6.6. Ran DEBUG tests in slow mode. No
> failures.
>
> I do wish the release incorporated the fix for KUDU-2980, but that
> hasn't even been published to CR yet. Plus it won't break backwards
> compatibility if we fix it later.
>
> On the other hand, http://gerrit.cloudera.org:8080/14507 must get in
> now; otherwise there will be backwards compat issues if it is merged
> for 1.12.0. What do other people think?
>
>
>
>
> On Fri, Oct 18, 2019 at 12:53 PM Alexey Serbin
>  wrote:
> >
> > Hello Kudu devs!
> >
> > The Apache Kudu team is happy to announce the second release candidate
> for
> > Apache Kudu 1.11.0.
> >
> > Apache Kudu 1.11.0 is a minor release that offers many improvements and
> > fixes
> > since the prior release.
> >
> > This is a source-only release. The artifacts have been staged here:
> > https://dist.apache.org/repos/dist/dev/kudu/1.11.0-RC2/
> >
> > Java convenience binaries in the form of a Maven repository are staged
> here:
> > https://repository.apache.org/content/repositories/orgapachekudu-1040/
> >
> > It is tagged in Git as 1.11.0-RC2 and the corresponding hash is the
> > following:
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=f6623f527b6c7e95b84d6f719fa1cb9eff4ebd29
> >
> > The release notes can be found here:
> >
> https://github.com/apache/kudu/blob/branch-1.11.x/docs/release_notes.adoc
> >
> > The KEYS file to verify the artifact signatures can be found here:
> > https://dist.apache.org/repos/dist/release/kudu/KEYS
> >
> > I'd suggest going through the README and the release notes, building
> Kudu,
> > and
> > running the unit tests. Testing out the Maven repo would also be
> > appreciated.
> >
> > The vote will run until Wed Oct 23 11:00:00 PDT 2019.  This is more than
> > the usual 72 hours from the time of sending out this e-mail message
> > because of the weekend.
> >
> >
> > Thanks,
> > Alexey
>


[VOTE] Apache Kudu 1.11.0-RC1

2019-10-15 Thread Alexey Serbin
Hello Kudu devs!

The Apache Kudu team is happy to announce the first release candidate for
Apache Kudu 1.11.0.

Apache Kudu 1.11.0 is a minor release that offers many improvements and
fixes since the prior release.

This is a source-only release. The artifacts have been staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.11.0-RC1/

Java convenience binaries in the form of a Maven repository are staged here:
https://repository.apache.org/content/repositories/orgapachekudu-1039/

It is tagged in Git as 1.11.0-RC1 and the corresponding hash is the
following:
https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=0d952614e8bbfc20d0f3f3bb23d794343e4d0e44

The release notes can be found here:
https://github.com/apache/kudu/blob/branch-1.11.x/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
https://dist.apache.org/repos/dist/release/kudu/KEYS

I'd suggest going through the README and the release notes, building Kudu,
and running the unit tests. Testing out the Maven repo would also be
appreciated.

The vote will run until Fri Oct 18 16:30:00 PDT 2019.  This is about 72
hours from the time of sending out this e-mail message.


Thanks,

Alexey


Re: Kudu 1.11.0 Release

2019-10-11 Thread Alexey Serbin
Hello Kudu developers,

The new 1.11.x branch is available for use (it's named branch-1.11.x).

I will put the release notes up for review in the 1.11.x branch (thanks to
everyone who contributed and reviewed the draft at
https://gerrit.cloudera.org/#/c/14404) and start putting together an RC1.

As always, if you have questions or concerns, don't hesitate to reach out.


Thanks,

Alexey

On Tue, Oct 8, 2019 at 3:50 PM Alexey Serbin  wrote:

> Hi,
>
> Just a reminder: I'm planning to create 1.11.x branch in the Kudu
> repository tomorrow, Wednesday, October 9th 2019, at about 19:00 PDT.
>
> If you haven't yet, please add high-level description of new features and
> improvements that you introduced since Kudu 1.10.  Add them into the
> following shared document:
>
> https://docs.google.com/document/d/1X730M_8lbvLuFNL_n5mKHTeqcdv6eoIOX-pthgLiuks/edit#heading=h.yu409rl84ire
>
> It will help to prepare 1.11 release notes.
>
> Thank you!
>
>
> Kind regards,
>
> Alexey
>
>
> On Mon, Oct 7, 2019 at 11:05 AM Alexey Serbin 
> wrote:
>
>> Hi,
>> Thank you everyone for your feedback on preparing for Kudu 1.11 release.
>> FYI, there is a shared document to serve as a scratchpad for Kudu 1.11
>> release notes:
>> https://docs.google.com/document/d/1X730M_8lbvLuFNL_n5mKHTeqcdv6eoIOX-pthgLiuks/edit#heading=h.yu409rl84ire
>> It already contains some notable changes that are worth adding release
>> notes for.  If you have context, please fill them out.  Also feel free
>> to add anything else that you think is worth documenting.  It will help
>> to put together release notes for 1.11 release.
>>
>> Thank you!
>>
>>
>> Kind regards,
>>
>> Alexey
>>
>> On Tue, Oct 1, 2019 at 11:18 AM Alexey Serbin 
>> wrote:
>>
>>> Hello Kudu developers!
>>>
>>> It's been a bit more than three months since we released 1.10.0. In that
>>> time,
>>> we've accrued some important improvements and bug fixes (around 189
>>> commits
>>> since Kudu 1.10.0). By our usual three-month release cadence, now seems
>>> like as good a time as any to start thinking about a Kudu 1.11.0 release,
>>> and I'm volunteering to RM it.
>>>
>>> I'm proposing we cut the branch for 1.11.x on Wednesday, *October 9*,
>>> which gives us a week to prepare, and then we'll start a vote a couple
>>> of days
>>> after that.
>>>
>>> If this sounds good to you all, please start thinking about and writing up
>>> release notes for notable changes that have landed in this release, and
>>> get
>>> them checked in before October 9 2019.
>>>
>>> Here's a command you can run to check out roughly what you've been up to
>>> since Kudu 1.10.0 release:
>>>
>>> $ git log d75a77081326d2f2e7c4d10bc06afb0b39d2c0a2..master --oneline \
>>> --graph --no-merges --first-parent --author=
>>>
>>> Please let me know if you agree with this plan, have comments, questions,
>>> concerns, etc.
>>>
>>> Thanks!
>>>
>>> --
>>> Alexey Serbin
>>> Software Engineer | Cloudera
>>>
>>


gerrit.cloudera.org brief restart tonight @ 22:00 PDT

2019-10-10 Thread Alexey Serbin
Hi devs,

Kudu is branching and we need to update our Gerrit replication
configuration to account for the new branch. The downtime shouldn't be more
than 10 minutes. This is just a small addition to a Gerrit configuration
file and if there are any problems the changes will be reverted to the
current version of the config file.

Please reach out if you have any concerns about this.

If I don't hear any protests then I won't bother sending out another email
about this restart.


Thanks,

Alexey


Re: Kudu 1.11.0 Release

2019-10-08 Thread Alexey Serbin
Hi,

Just a reminder: I'm planning to create 1.11.x branch in the Kudu
repository tomorrow, Wednesday, October 9th 2019, at about 19:00 PDT.

If you haven't yet, please add high-level description of new features and
improvements that you introduced since Kudu 1.10.  Add them into the
following shared document:

https://docs.google.com/document/d/1X730M_8lbvLuFNL_n5mKHTeqcdv6eoIOX-pthgLiuks/edit#heading=h.yu409rl84ire

It will help to prepare 1.11 release notes.

Thank you!


Kind regards,

Alexey


On Mon, Oct 7, 2019 at 11:05 AM Alexey Serbin  wrote:

> Hi,
> Thank you everyone for your feedback on preparing for Kudu 1.11 release.
> FYI, there is a shared document to serve as a scratchpad for Kudu 1.11
> release notes:
> https://docs.google.com/document/d/1X730M_8lbvLuFNL_n5mKHTeqcdv6eoIOX-pthgLiuks/edit#heading=h.yu409rl84ire
> It already contains some notable changes that are worth adding release
> notes for.  If you have context, please fill them out.  Also feel free to
> add anything else that you think is worth documenting.  It will help to
> put together release notes for 1.11 release.
>
> Thank you!
>
>
> Kind regards,
>
> Alexey
>
> On Tue, Oct 1, 2019 at 11:18 AM Alexey Serbin 
> wrote:
>
>> Hello Kudu developers!
>>
>> It's been a bit more than three months since we released 1.10.0. In that
>> time,
>> we've accrued some important improvements and bug fixes (around 189
>> commits
>> since Kudu 1.10.0). By our usual three-month release cadence, now seems
>> like as good a time as any to start thinking about a Kudu 1.11.0 release,
>> and I'm volunteering to RM it.
>>
>> I'm proposing we cut the branch for 1.11.x on Wednesday, *October 9*,
>> which gives us a week to prepare, and then we'll start a vote a couple of
>> days
>> after that.
>>
>> If this sounds good to you all, please start thinking about and writing up
>> release notes for notable changes that have landed in this release, and
>> get
>> them checked in before October 9 2019.
>>
>> Here's a command you can run to check out roughly what you've been up to
>> since Kudu 1.10.0 release:
>>
>> $ git log d75a77081326d2f2e7c4d10bc06afb0b39d2c0a2..master --oneline \
>> --graph --no-merges --first-parent --author=
>>
>> Please let me know if you agree with this plan, have comments, questions,
>> concerns, etc.
>>
>> Thanks!
>>
>> --
>> Alexey Serbin
>> Software Engineer | Cloudera
>>
>


Re: Kudu 1.11.0 Release

2019-10-07 Thread Alexey Serbin
Hi,
Thank you everyone for your feedback on preparing for Kudu 1.11 release.
FYI, there is a shared document to serve as a scratchpad for Kudu 1.11
release notes:
https://docs.google.com/document/d/1X730M_8lbvLuFNL_n5mKHTeqcdv6eoIOX-pthgLiuks/edit#heading=h.yu409rl84ire
It already contains some notable changes that are worth adding release
notes for.  If you have context, please fill them out.  Also feel free to
add anything else that you think is worth documenting.  It will help to put
together release notes for 1.11 release.

Thank you!


Kind regards,

Alexey

On Tue, Oct 1, 2019 at 11:18 AM Alexey Serbin  wrote:

> Hello Kudu developers!
>
> It's been a bit more than three months since we released 1.10.0. In that
> time,
> we've accrued some important improvements and bug fixes (around 189 commits
> since Kudu 1.10.0). By our usual three-month release cadence, now seems
> like as good a time as any to start thinking about a Kudu 1.11.0 release,
> and I'm volunteering to RM it.
>
> I'm proposing we cut the branch for 1.11.x on Wednesday, *October 9*,
> which gives us a week to prepare, and then we'll start a vote a couple of
> days
> after that.
>
> If this sounds good to you all, please start thinking about and writing up
> release notes for notable changes that have landed in this release, and get
> them checked in before October 9 2019.
>
> Here's a command you can run to check out roughly what you've been up to
> since Kudu 1.10.0 release:
>
> $ git log d75a77081326d2f2e7c4d10bc06afb0b39d2c0a2..master --oneline \
> --graph --no-merges --first-parent --author=
>
> Please let me know if you agree with this plan, have comments, questions,
> concerns, etc.
>
> Thanks!
>
> --
> Alexey Serbin
> Software Engineer | Cloudera
>


Time source for Kudu tests

2019-10-07 Thread Alexey Serbin
Hi,

Recently, a built-in NTP client has been introduced in Kudu.  Todd Lipcon
put together the original WIP patch a couple of years ago [1], and about a
week ago the built-in NTP client was merged into the master branch of the
Kudu repo.
With that, a new time source is now available for Kudu masters and tablet
servers: the built-in NTP client.

With the introduction of the new time source for Kudu components, we have
had a few offline discussions about using different time sources in Kudu
tests.  That's not only about providing test coverage for the newly
introduced built-in NTP client, but about all the other tests as well.
Last week Adar and I talked about that offline one more time.  I'll try to
summarize a few key points in this e-mail message.

With the introduction of the built-in NTP client, a significant portion of
the tests was switched to run with it as the time source.  In particular,
all tests based on ExternalMiniCluster now run with the built-in NTP client
as of commit 03e2ada69 [2].  The idea is to have more coverage for the
newly added functionality, especially given the fact that at some point we
might switch to using the built-in NTP client by default instead of relying
on the local machine clock synchronized by the system NTP service.

There are many other Kudu tests (e.g., tests based on InternalMiniCluster)
which still require the machine clock to be synchronized with NTP to run.
Ideally, it would be great to run all Kudu tests using the same time source
unless they are specifically targeted at verifying the functionality of a
particular time source.  An example of the latter are the tests which cover
the functionality of the built-in NTP client itself.

Prior to the introduction of the built-in NTP client, almost all Kudu tests
were running against the local system clock synchronized with NTP, at least
on Linux.  The only exceptions were tests for read-your-writes and other
consistency guarantees: those use a mock time source to control the time
in a very specific way.

In retrospect, it was great to have a single time source because we always
knew that the same time source was used for almost all of our tests.  Also,
the time source used in Kudu tests was the same as in Kudu masters and
tablet servers running in real distributed clusters.  On the other hand,
that required the local machine's clock to be synchronized with NTP to run
those tests, and the tests would fail if the time was not synchronized with
NTP.  In addition, as far as I know, there are no Kudu tests which are
targeted at detecting inconsistencies due to irregularities in the time
source or non-synchronized clocks between different Kudu nodes.  Of course,
we have Jepsen-based tests, but that's another story: we are not talking
about Jepsen tests here, only C++ and Java tests based on the gtest and
JUnit frameworks.

Now, here comes one observation: all the components of the above-mentioned
tests run on the same machine during the execution of the corresponding
scenarios.  If all of them are using the local machine's clock, then there
is no need to use NTP or other synchronization techniques designed to
synchronize multiple clocks.

So, here comes the question: why require the local clock to be synchronized
with NTP for tests?  Yes, we need to verify that the 'system' time source
works for HybridTime timestamps, but for a generic Kudu test which simply
relies on HybridTime timestamps without any knowledge of the underlying
time source, why is it important?

It seems we can safely change the time source in those generic tests to be
'system_unsync': the local machine clock, not necessarily synchronized by
the kernel NTP discipline (see SystemUnsyncTime in
$KUDU_ROOT/src/kudu/clock/system_unsync_time.{h,cc} for details).
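
To illustrate, here's a minimal sketch of what that switch could look like
for a gtest-based scenario (an illustration only, not the actual patch: the
fixture name is made up, and I'm assuming the time source is controlled by
a --time_source gflag):

  #include <gflags/gflags_declare.h>
  #include <gtest/gtest.h>

  DECLARE_string(time_source);  // assumption: the flag selecting the source

  class GenericKuduTest : public ::testing::Test {
   public:
    void SetUp() override {
      // Use the local clock directly, without requiring it to be
      // synchronized by the kernel NTP discipline.
      FLAGS_time_source = "system_unsync";
    }
  };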

Once switched to the 'system_unsync' time source for tests, it will
additionally be necessary to add new dedicated test scenarios such as:
  * Smoke scenarios: using the system NTP time source for Kudu clusters.
  * Smoke scenarios: using the built-in NTP time source for Kudu clusters.
  * Advanced scenarios to detect issues due to irregularities in the time
    source (like time jumping back and forth, etc.).

What do you think?  Your feedback is appreciated.

Thank you!



Kind regards,

Alexey

[1] http://gerrit.cloudera.org:8080/7477
[2]
https://github.com/apache/kudu/commit/03e2ada694290cafce0bea6ebbf092709aa64a2a


Kudu 1.11.0 Release

2019-10-01 Thread Alexey Serbin
Hello Kudu developers!

It's been a bit more than three months since we released 1.10.0. In that
time, we've accrued some important improvements and bug fixes (around 189
commits since Kudu 1.10.0). By our usual three-month release cadence, now
seems like as good a time as any to start thinking about a Kudu 1.11.0
release, and I'm volunteering to RM it.

I'm proposing we cut the branch for 1.11.x on Wednesday, *October 9*,
which gives us a week to prepare, and then we'll start a vote a couple of
days after that.

If this sounds good to you all, please start thinking about and writing up
release notes for notable changes that have landed in this release, and get
them checked in before October 9, 2019.

Here's a command you can run to check out roughly what you've been up to
since Kudu 1.10.0 release:

$ git log d75a77081326d2f2e7c4d10bc06afb0b39d2c0a2..master --oneline \
--graph --no-merges --first-parent --author=

Please let me know if you agree with this plan, have comments, questions,
concerns, etc.

Thanks!

--
Alexey Serbin
Software Engineer | Cloudera


Re: [VOTE] Apache Kudu 1.10.0-RC3

2019-06-27 Thread Alexey Serbin
+1

I built the C++ and Java parts of the project on Ubuntu 18.04.2 LTS
(release configuration), all C++ and Java tests passed.

I built the C++ and the Java part of the project on Mac OS X 10.11.6, build
15G22010.  All Java tests passed, several C++ tests failed:

The following tests FAILED:
17 - client-test.5 (Failed)
23 - hybrid_clock-test (Failed)
125 - master_failover-itest.0 (Failed)
126 - master_failover-itest.1 (Failed)
127 - master_failover-itest.2 (Failed)
132 - master_sentry-itest.0 (Failed)
133 - master_sentry-itest.1 (Failed)
134 - master_sentry-itest.2 (Failed)
135 - master_sentry-itest.3 (Failed)
136 - master_sentry-itest.4 (Failed)
137 - master_sentry-itest.5 (Failed)
138 - master_sentry-itest.6 (Failed)
139 - master_sentry-itest.7 (Failed)
188 - location_cache-test (Failed)
192 - sentry_authz_provider-test.0 (Failed)
193 - sentry_authz_provider-test.1 (Failed)
194 - sentry_authz_provider-test.2 (Failed)
195 - sentry_authz_provider-test.3 (Failed)
196 - sentry_authz_provider-test.4 (Failed)
197 - sentry_authz_provider-test.5 (Failed)
198 - sentry_authz_provider-test.6 (Failed)
199 - sentry_authz_provider-test.7 (Failed)
222 - sentry_client-test (Failed)
225 - webserver-test (Timeout)
287 - kudu-tool-test.3 (Failed)
379 - subprocess-test (Failed)
383 - trace-test (Failed)

It seems it's expected for the Sentry-related tests to fail on OS X
(master_sentry-itest, sentry_authz_provider-test).  The rest failing is not
expected, but given Mac OS X is not a production platform, that seems to be
OK.


/Alexey

On Thu, Jun 27, 2019 at 8:22 AM Grant Henke 
wrote:

> Hello Kudu devs!
>
> The Apache Kudu team is happy to announce the third release candidate for
> Apache Kudu 1.10.0.
>
> Apache Kudu 1.10.0 is a minor release that offers many improvements and
> fixes since the prior release.
>
> This is a source-only release. The artifacts have been staged here:
> https://dist.apache.org/repos/dist/dev/kudu/1.10.0-RC3/
>
> Java convenience binaries in the form of a Maven repository are staged
> here:
> https://repository.apache.org/content/repositories/orgapachekudu-1037/
>
> It is tagged in Git as 1.10.0-RC3 and the corresponding hash is the
> following:
>
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=d75a77081326d2f2e7c4d10bc06afb0b39d2c0a2
>
> The release notes can be found here:
> https://github.com/apache/kudu/blob/branch-1.10.x/docs/release_notes.adoc
>
> The KEYS file to verify the artifact signatures can be found here:
> https://dist.apache.org/repos/dist/release/kudu/KEYS
>
> I'd suggest going through the README and the release notes, building Kudu,
> and
> running the unit tests. Testing out the Maven repo would also be
> appreciated.
>
> The vote will run until Monday, July 1st at 11:59AM PST. This is a bit
> over the suggested 72 hours due to the weekend.
>
> Thank you,
> Grant
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


Re: java unit tests failing when starting MiniKuduCluster

2019-06-20 Thread Alexey Serbin
Hi Clemens,

I apologize for dropping the ball on this.

I tried to look at the issue some time ago, but failed to provision a
Fedora 29 VM, and it slipped away.

Just recently, Todd and I looked at the issue in the context of failing
Kudu tests on Ubuntu 18.04 LTS (bionic).  It turned out the issue is
related to TLSv1.3 handshake changes.  The Kudu code didn't take into
account the specifics of the TLSv1.3 handshake, and failed to handle it
properly.  The problem appears as soon as both the client and the server
support TLSv1.3 (which starts with OpenSSL 1.1.1).

Long story short:
 * the issue is tracked by https://issues.apache.org/jira/browse/KUDU-2871
 * the workaround to use TLS not higher than v1.2 is in the master branch
of the Kudu repo already, see changelist efc3f372e
 * most likely, the workaround to use TLS not higher than v1.2 will be in
the upcoming release Kudu 1.10
 * hopefully, there will soon be a fix to adapt the Kudu RPC code to work
   with TLSv1.3
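
As a side note, here's my own back-of-the-envelope decoding of the
"received invalid message of size 386073344" errors reported earlier in
this thread (my reading of the numbers, not something taken from the JIRA);
it is consistent with the diagnosis above, since that number is exactly the
first four bytes of a TLS record header read as a big-endian RPC frame
length:

  // 386073344 == 0x17030300:
  //   0x17      -- TLS record type 23 (application data)
  //   0x03 0x03 -- record version TLS 1.2 (also used on the wire
  //                by TLSv1.3 records)
  //   0x00      -- the high byte of the record length
  static_assert(386073344 == 0x17030300,
                "a TLS record header misread as an RPC frame length");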


Kind regards,

Alexey


On Fri, May 3, 2019 at 12:59 AM Clemens Valiente <
clemens.valie...@trivago.com> wrote:

> Hi Alexey,
>
> thanks for looking into it.
>
> I'd like to share the result from tls_handshake-test since that seems to
> be getting closer to the root cause:
>
> [ RUN  ] TestTlsHandshake.TestHandshakeSequence
> /home/cvaliente/git/kudu/src/kudu/security/tls_handshake-test.cc:206:
> Failure
> Value of: client.Continue(buf1, &buf2).IsIncomplete()
>   Actual: false
> Expected: true
>
> it fails at this position:
>
>   // Client receives server Hello and sends client Finished
>   ASSERT_TRUE(client.Continue(buf1, &buf2).IsIncomplete());
>
>
> So for some reason the client believes it has not finished yet.
>
>
> Cheers
>
> Clemens
>
> 
> From: Alexey Serbin 
> Sent: 19 April 2019 18:10:51
> To: dev
> Subject: Re: java unit tests failing when starting MiniKuduCluster
>
> Hi Clemens,
>
> Thank you for the information.  Yes, to me it looks like an issue with IO
> over TLS-encrypted channels.  I don't have access to Fedora29 and newer, so
> I didn't do any troubleshooting/debugging this time.  However, if you
> really want to run on Fedora29, you can disable authentication and
> encryption (--rpc_authentication=disabled --rpc_encryption=disabled) and it
> seems that should work fine on Fedora29.  Other options are running on
> Linux platforms that are currently listed at
>
> https://kudu.apache.org/docs/installation.html#_prerequisites_and_requirements
> or going forward with troubleshooting on your own, where good starting
> points might be tls_handshake-test and tls_socket-test in addition to
> negotiation-test.
>
> I'm also planning to take a look at the issue soon.  I'll let you know once
> I find anything of particular interest there.
>
>
> Kind regards,
>
> Alexey
>
> On Thu, Apr 18, 2019 at 2:14 AM Clemens Valiente <
> clemens.valie...@trivago.com> wrote:
>
> > Hi Alexey,
> >
> >
> > find the tests attached.
> >
> > The pattern that I can see is that the successful tests are those without
> > a signed certificate (server: {pki: NONE) and the ones with disabled
> > encryption on client- or serverside (encryption: DISABLED) - but those
> > last ones are not supposed to establish a successful connection anyway.
> >
> > The failed tests all require some signed key on the server's side and
> > encryption supported by both sides. So it might hint at an SSL issue.
> >
> > I am using a FIPS enabled version of openssl:
> >
> > OpenSSL 1.1.1b FIPS  26 Feb 2019
> >
> >
> > Cheers
> >
> > Clemens
> > --
> > *From:* Alexey Serbin 
> > *Sent:* 17 April 2019 20:49:29
> > *To:* dev
> > *Subject:* Re: java unit tests failing when starting MiniKuduCluster
> >
> > Hi Clemens,
> >
> > Could you run the negotiation-test (built without your patch) and attach
> > the output?  The test loops through various combinations of client/server
> > configurations, so it would be much easier to see what works and what
> not.
> > I suspect there might be some issues related to the combination of newer
> > glibc and OpenSSL libraries.
> >
> >
> > Kind regards,
> >
> > Alexey
> >
> > On Wed, Apr 17, 2019 at 3:07 AM Clemens Valiente <
> > clemens.valie...@trivago.com> wrote:
> >
> > > Hi Adar,
> > >
> > > thanks for looking into it.
> > >
> > > I had to apply the patches mentioned in KUDU-2770 to be able to build
> > > kudu at all. As mentioned in the ticket, I am using fedora 29 which
> > > comes with glibc 2.28.

Re: [VOTE] Apache Kudu 1.10.0-RC1

2019-06-20 Thread Alexey Serbin
Another one from the same category is
https://issues.apache.org/jira/browse/KUDU-2871

On Wed, Jun 19, 2019 at 6:57 PM Grant Henke 
wrote:

> It looks like KUDU-2870 
> is
> a bug that should be fixed and included in the release. There are a few
> other small bugs that can be backported as well. I will cut a new RC and
> call a vote after that issue is resolved.
>
> Thank you,
> Grant
>
> On Fri, Jun 14, 2019 at 5:20 PM Grant Henke  wrote:
>
> > Hello Kudu devs!
> >
> > The Apache Kudu team is happy to announce the first release candidate for
> > Apache Kudu 1.10.0.
> >
> > Apache Kudu 1.10.0 is a minor release that offers many improvements and
> > fixes since the prior release.
> >
> > This is a source-only release. The artifacts have been staged here:
> > https://dist.apache.org/repos/dist/dev/kudu/1.10.0-RC1/
> >
> > Java convenience binaries in the form of a Maven repository are staged
> > here:
> > https://repository.apache.org/content/repositories/orgapachekudu-1035/
> >
> > It is tagged in Git as 1.10.0-RC1 and the corresponding hash is the
> > following:
> >
> >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=ce322bdfc979f0f3f5d3881affcdd22e0c5b6ac2
> >
> > The WIP release notes can be found here:
> >
> >
> https://docs.google.com/document/d/1CnGvoOob9H8BcY5dPciYCW5RTj9J--wXPmeMDQzqH6Y/edit?usp=sharing
> >
> > The KEYS file to verify the artifact signatures can be found here:
> > https://dist.apache.org/repos/dist/release/kudu/KEYS
> >
> > I'd suggest going through the README and the release notes, building
> Kudu,
> > and
> > running the unit tests. Testing out the Maven repo would also be
> > appreciated.
> >
> > The vote will run until Wednesday, June 19th at 11:59AM PST. This is a
> bit
> > over the suggested 72 hours due to the weekend.
> >
> > Thank you,
> > Grant
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


Re: java unit tests failing when starting MiniKuduCluster

2019-04-19 Thread Alexey Serbin
Hi Clemens,

Thank you for the information.  Yes, to me it looks like an issue with IO
over TLS-encrypted channels.  I don't have access to Fedora29 and newer, so
I didn't do any troubleshooting/debugging this time.  However, if you
really want to run on Fedora29, you can disable authentication and
encryption (--rpc_authentication=disabled --rpc_encryption=disabled) and it
seems that should work fine on Fedora29.  Other options are running on
Linux platforms that are currently listed at
https://kudu.apache.org/docs/installation.html#_prerequisites_and_requirements
or going forward with troubleshooting on your own, where good starting
points might be tls_handshake-test and tls_socket-test in addition to
negotiation-test.

I'm also planning to take a look at the issue soon.  I'll let you know once
I find anything of particular interest there.


Kind regards,

Alexey

On Thu, Apr 18, 2019 at 2:14 AM Clemens Valiente <
clemens.valie...@trivago.com> wrote:

> Hi Alexey,
>
>
> find the tests attached.
>
> The pattern that I can see is that the successful tests are those without
> a signed certificate (server: {pki: NONE) and the ones with disabled
> encryption on client- or serverside (encryption: DISABLED) - but those
> last ones are not supposed to establish a successful connection anyway.
>
> The failed tests all require some signed key on the server's side and
> encryption supported by both sides. So it might hint at an SSL issue.
>
> I am using a FIPS enabled version of openssl:
>
> OpenSSL 1.1.1b FIPS  26 Feb 2019
>
>
> Cheers
>
> Clemens
> --
> *From:* Alexey Serbin 
> *Sent:* 17 April 2019 20:49:29
> *To:* dev
> *Subject:* Re: java unit tests failing when starting MiniKuduCluster
>
> Hi Clemens,
>
> Could you run the negotiation-test (built without your patch) and attach
> the output?  The test loops through various combinations of client/server
> configurations, so it would be much easier to see what works and what not.
> I suspect there might be some issues related to the combination of newer
> glibc and OpenSSL libraries.
>
>
> Kind regards,
>
> Alexey
>
> On Wed, Apr 17, 2019 at 3:07 AM Clemens Valiente <
> clemens.valie...@trivago.com> wrote:
>
> > Hi Adar,
> >
> > thanks for looking into it.
> >
> > I had to apply the patches mentioned in KUDU-2770 to be able to build
> kudu
> > at all. As mentioned in the ticket, I am using fedora 29 which comes with
> > glibc 2.28.
> >
> > I made sure that everything built correctly and ran the c++ tests and got
> > many failing tests. They all seem to occur when trying to connect, the
> > first stacktrace usually leads to
> >
> >
> > 0417 12:01:53.737010 (+  3212us) negotiation.cc:304] Negotiation
> complete:
> > IO error: Server connection negotiation failed: server connection from
> > 127.4.196.1:38781: received invalid message of size 386073344 which
> > exceeds the rpc_max_message_size of 52428800 bytes
> >
> >
> > This message size of 386073344 bytes is the same across all failed tests
> > so I suspect that to be part of the issue, but I haven't been able to
> > figure out what is inside that message. Is there any debug logging I
> could
> > activate on the tests? I can now easily reproduce a failure with
> >
> >
> > bin/rpc_line_item_dao-test --gtest_filter=RpcLineItemDAOTest.TestInsert
> >
> >
> > Thanks a lot for the help
> >
> > Clemens
> >
> >
> > --
> > *From:* Adar Lieber-Dembo 
> > *Sent:* 16 April 2019 21:15:23
> > *To:* dev@kudu.apache.org
> > *Subject:* Re: java unit tests failing when starting MiniKuduCluster
> >
> > I'm guessing it's something to do with your local system. You had
> > previously filed KUDU-2770 regarding incompatibilities with newer
> > versions of glibc; what system/distro are you running on? Are you
> > using _any_ Kudu patches whatsoever, even just to get the build (or
> > thirdparty build) working?
> >
> > FWIW, these tests pass locally for me (running against master) and
> > they're passing in Kudu precommit tests regularly. Could you try to
> > run the C++ test suite? They're easier to debug when they fail.
> >
> > On Tue, Apr 16, 2019 at 12:53 AM Clemens Valiente
> >  wrote:
> > >
> > > Hi,
> > >
> > >
> > > I added these flags after the initial runs that complained about
> > rpc_max_message_size exceeded in the tests. That got me one step further
> to
> > the connection failed error.
> > >
> > >
> > > I tried running 

Re: java unit tests failing when starting MiniKuduCluster

2019-04-17 Thread Alexey Serbin
Hi Clemens,

Could you run the negotiation-test (built without your patch) and attach
the output?  The test loops through various combinations of client/server
configurations, so it would be much easier to see what works and what not.
I suspect there might be some issues related to the combination of newer
glibc and OpenSSL libraries.


Kind regards,

Alexey

On Wed, Apr 17, 2019 at 3:07 AM Clemens Valiente <
clemens.valie...@trivago.com> wrote:

> Hi Adar,
>
> thanks for looking into it.
>
> I had to apply the patches mentioned in KUDU-2770 to be able to build kudu
> at all. As mentioned in the ticket, I am using fedora 29 which comes with
> glibc 2.28.
>
> I made sure that everything built correctly and ran the c++ tests and got
> many failing tests. They all seem to occur when trying to connect, the
> first stacktrace usually leads to
>
>
> 0417 12:01:53.737010 (+  3212us) negotiation.cc:304] Negotiation complete:
> IO error: Server connection negotiation failed: server connection from
> 127.4.196.1:38781: received invalid message of size 386073344 which
> exceeds the rpc_max_message_size of 52428800 bytes
>
>
> This message size of 386073344 bytes is the same across all failed tests
> so I suspect that to be part of the issue, but I haven't been able to
> figure out what is inside that message. Is there any debug logging I could
> activate on the tests? I can now easily reproduce a failure with
>
>
> bin/rpc_line_item_dao-test --gtest_filter=RpcLineItemDAOTest.TestInsert
>
>
> Thanks a lot for the help
>
> Clemens
>
>
> --
> *From:* Adar Lieber-Dembo 
> *Sent:* 16 April 2019 21:15:23
> *To:* dev@kudu.apache.org
> *Subject:* Re: java unit tests failing when starting MiniKuduCluster
>
> I'm guessing it's something to do with your local system. You had
> previously filed KUDU-2770 regarding incompatibilities with newer
> versions of glibc; what system/distro are you running on? Are you
> using _any_ Kudu patches whatsoever, even just to get the build (or
> thirdparty build) working?
>
> FWIW, these tests pass locally for me (running against master) and
> they're passing in Kudu precommit tests regularly. Could you try to
> run the C++ test suite? They're easier to debug when they fail.
>
> On Tue, Apr 16, 2019 at 12:53 AM Clemens Valiente
>  wrote:
> >
> > Hi,
> >
> >
> > I added these flags after the initial runs that complained about
> rpc_max_message_size exceeded in the tests. That got me one step further to
> the connection failed error.
> >
> >
> > I tried running the tests both on master and on the kudu 1.9.0 release
> (with the appropriate kudu build), with and without my patch.
> >
> > 
> > From: Adar Lieber-Dembo 
> > Sent: 14 April 2019 19:59:45
> > To: dev@kudu.apache.org
> > Subject: Re: java unit tests failing when starting MiniKuduCluster
> >
> > It might be related to your local changes. This caught my eye:
> >
> >   extra_master_flags: "--rpc_max_message_size=3860733440"
> >   extra_tserver_flags: "--rpc_max_message_size=3860733440"
> >
> > Why did you have to add this configuration? What happens if you remove
> > it? On a related note, do the tests still fail if you rebuild from a
> > clean working tree (i.e. without your patches)?
> >
> > On Sat, Apr 13, 2019 at 12:52 PM Clemens Valiente
> >  wrote:
> > >
> > > Hi,
> > >
> > > I have a few fixes for the kudu-mapreduce package and wanted to add
> unit tests but neither the existing nor the newly added unit tests can be
> run on my system.
> > >
> > > I managed to build kudu and the MiniKuduCluster starts up but then
> fails with connection/timeout issues.
> > >
> > > I think this might point at the problem:
> > >
> > > 16:42:05.780 [INFO - cluster stderr printer]
> (MiniKuduCluster.java:543) 0410 16:42:05.779084 (+3053866us)
> negotiation.cc:304] Negotiation complete: Timed out: Server connection
> negotiation failed: server connection from 127.0.0.1:60698
> > >
> > > Though I have no explanation as to why kudu would not be able to make
> local connections within my machine. (fedora 29, OpenSSL 1.1.1b FIPS  26
> Feb 2019)
> > > I attached the full test report. Can someone help me figure out how to
> further debug this, or is there any additional information I can provide?
> > >
> > > Best Regards
> > > Clemens Valiente
>


Re: [VOTE] Apache Kudu 1.9.0-RC2

2019-03-01 Thread Alexey Serbin
+1

Built and ran C++ tests on Mac OS X 10.11.6.  All passed except for the
following, which I think isn't crucial with MacOS being a development-only
platform:

sentry_authz_provider-test (Failed)
sentry_client-test (Failed)
kudu-tool-test.2 (Failed)
trace-test (Failed)

On Fri, Mar 1, 2019 at 6:08 PM Mike Percy  wrote:

> I'm also seeing the jsonreader-test crash in RELEASE mode on Ubuntu bionic.
> The symptom is a nullptr dereference but really it's caused by an assertion
> failure.
>
> I modified the test to get a proper failure and it fails here:
>
>   // Min signed 32-bit integer.
>   const char* const signed_small32 = "signed_small32";
>   Status s = r.ExtractUint32(r.root(), signed_small32, &out_u32);
>   ASSERT_TRUE(s.IsInvalidArgument()) << s.ToString();
>
> That status object is OK, which means jsonreader incorrectly detected the
> negative integer as a uint in the following piece of code:
>
> Status JsonReader::ExtractUint32(const Value* object,
>  const char* field,
>  uint32_t* result) const {
>   const Value* val;
>   RETURN_NOT_OK(ExtractField(object, field, &val));
>   if (PREDICT_FALSE(!val->IsUint())) {
>     return Status::InvalidArgument(Substitute(
>         "wrong type during field extraction: expected uint32 but got $0",
>         TypeToString(val->GetType())));
>   }
>   *result = val->GetUint();
>   return Status::OK();
> }
>
> Note that this only happens in RELEASE mode for me. In this test, there is
> a comment preventing UB testing that looks like this:
>
>   // The rapidjson code has some improper handling of the min int32 and min
>   // int64 that exposes UB.
>   #if defined(ADDRESS_SANITIZER)
>     LOG(WARNING) << "this test is skipped in ASAN builds";
>     return;
>   #endif
>
> It's surprising we have never noticed this before. Maybe only the newer
> compilers are causing problems due to the undefined behavior.
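>
> For the record, here's a minimal standalone repro of the same UB,
> independent of rapidjson (the variable names are mine):
>
> #include <climits>
> #include <cstdio>
>
> int main() {
>   int i = INT_MIN;
>   // UB: -INT_MIN is not representable in 'int'; this is exactly what
>   // UBSAN flags inside rapidjson's ParseNumber below.
>   int bad = -i;
>   // The well-defined alternative the UBSAN message hints at: negate in
>   // unsigned arithmetic instead.
>   unsigned int good = ~static_cast<unsigned int>(i) + 1u;  // 2147483648
>   printf("%d %u\n", bad, good);
>   return 0;
> }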
>
> When I comment out that #ifdef and run this under ASAN with UBSAN enabled,
> I get the following warning:
>
> [ RUN  ] JsonReaderTest.SignedAndUnsignedInts
> thirdparty/installed/common/include/rapidjson/reader.h:644:18: runtime
> error: negation of -2147483648 cannot be represented in type 'int'; cast to
> an unsigned type to negate this value to itself
> #0 0x8b15e4 in void rapidjson::GenericReader,
> rapidjson::MemoryPoolAllocator >::ParseNumber<0u,
> rapidjson::GenericStringStream >,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >
> >(rapidjson::GenericStringStream >&,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >&)
> ../thirdparty/installed/common/include/rapidjson/reader.h:644:18
> #1 0x8aaf00 in void rapidjson::GenericReader,
> rapidjson::MemoryPoolAllocator >::ParseValue<0u,
> rapidjson::GenericStringStream >,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >
> >(rapidjson::GenericStringStream >&,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >&)
> ../thirdparty/installed/common/include/rapidjson/reader.h:663:14
> #2 0x8a8b96 in void rapidjson::GenericReader,
> rapidjson::MemoryPoolAllocator >::ParseObject<0u,
> rapidjson::GenericStringStream >,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >
> >(rapidjson::GenericStringStream >&,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >&)
> ../thirdparty/installed/common/include/rapidjson/reader.h:290:4
> #3 0x8a7829 in bool rapidjson::GenericReader,
> rapidjson::MemoryPoolAllocator >::Parse<0u,
> rapidjson::GenericStringStream >,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >
> >(rapidjson::GenericStringStream >&,
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >&)
> ../thirdparty/installed/common/include/rapidjson/reader.h:243:15
> #4 0x8a6f10 in rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >&
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >::ParseStream<0u,
> rapidjson::GenericStringStream >
> >(rapidjson::GenericStringStream >&)
> ../thirdparty/installed/common/include/rapidjson/document.h:712:23
> #5 0x8a34f2 in rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >&
> rapidjson::GenericDocument,
> rapidjson::MemoryPoolAllocator >::Parse<0u>(char
> const*) ../thirdparty/installed/common/include/rapidjson/document.h:745:10
> #6 0x89d804 in kudu::JsonReader::Init()
> ../src/kudu/util/jsonreader.cc:65:13
> #7 0x7754f7 in
> kudu::JsonReaderTest_SignedAndUnsignedInts_Test::TestBody()
> ../src/kudu/util/jsonreader-test.cc:172:3
> #8 0xe69c6c in void
> testing::internal::HandleSehExceptionsInMethodIfSupported void>(testing::Test*, void (testing::Test::*)(), char const*)
>
> /home/mpercy/src/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402
> #9 0xe69c6c in void
> testing::internal::HandleExceptionsInMethodIfSupported void>(testing::Test*, void (testing::Test::*)(), char const*)
>
> 

Re: [VOTE] Apache Kudu 1.9.0-RC1

2019-02-25 Thread Alexey Serbin
+1

I built and ran C++ tests on Mac OS X 10.11.6; some of the tests failed, but
I don't think it's a blocker for the release, especially given the fact
that macOS is a development-only platform.

98% tests passed, 6 tests failed out of 361

23 - hybrid_clock-test (Failed)
175 - sentry_authz_provider-test (Failed)
197 - sentry_client-test (Failed)
250 - ksck_remote-test (Timeout)
261 - kudu-tool-test.2 (Failed)
355 - trace-test (Failed)


On Mon, Feb 25, 2019 at 2:19 PM Attila Bukor  wrote:

> Just a small correction, the MacOS tests didn't fail due to lsof, I just
> got confused. I had failures on CentOS due to lsof not being installed
> which were fixed after it was. On MacOS I had the below failures,
> neither connected to lsof:
>
> ---
> 96% tests passed, 14 tests failed out of 361
>
> Label Time Summary:
> no_dist_test=  13.13 sec*proc (1 test)
> no_tsan =   5.61 sec*proc (3 tests)
>
> Total Test time (real) = 8620.63 sec
>
> The following tests FAILED:
> 127 - master_sentry-itest (Failed)
> 134 - raft_consensus_nonvoter-itest (Failed)
> 156 - tablet_copy_client_session-itest (Failed)
> 175 - sentry_authz_provider-test (Failed)
> 197 - sentry_client-test (Failed)
> 261 - kudu-tool-test.2 (Failed)
> 267 - rebalancer_tool-test.0 (Failed)
> 268 - rebalancer_tool-test.1 (Failed)
> 269 - rebalancer_tool-test.2 (Failed)
> 270 - rebalancer_tool-test.3 (Failed)
> 271 - rebalancer_tool-test.4 (Failed)
> 280 - tablet_server-test (Failed)
> 327 - net_util-test (Failed)
> 355 - trace-test (Failed)
> Errors while running CTest
> ---
>
> This doesn't change my vote of course.
>
> Attila
>
> On Mon, Feb 25, 2019 at 07:49:17PM +0100, Attila Bukor wrote:
> > +1
> >
> > I ran all C++ tests with KUDU_ALLOW_SLOW_TESTS=1 both in DEBUG and
> > RELEASE on CentOS 7.6. master_hms-itest failed in both build types for
> > me, but as HMS integration is still not finished, I think that's fine -
> > plus it might be an environmental issue if it didn't fail for others.
> >
> > I also ran the tests for the Java kudu-client where two stress tests
> > failed:
> >
> > ITClientStress. testManyShortClientsGeneratingScanTokens
> > ITClientStress. testMultipleSessions
> >
> > I believe this should be fine too.
> >
> > The build succeeded on MacOS, but a bunch of tests failed complaining
> > about lsof - as MacOS is not officially supported, this also shouldn't
> > be a problem.
> >
> > Docker build failed from the extracted build directory due to not being
> > in a Git working directory and git rev-parse failing. I'll fix this
> > later, but of course this is also not a blocker as Docker is still
> > experimental and last time I checked it worked fine from the Git working
> > directory.
> >
> > Long story short, I've ran into some failures, but none of them should
> > be blockers.
> >
> > If the comments on the release notes draft are addressed, I'm happy with
> > releasing this RC.
> >
> > Attila
> >
> > On Thu, Feb 21, 2019 at 10:00:23PM -0800, Adar Lieber-Dembo wrote:
> > > +1
> > >
> > > I ran all C++ tests with KUDU_ALLOW_SLOW_TESTS=1 in DEBUG mode on
> > > Ubuntu 18.04 and CentOS 6.6. All passed except for a known Ubuntu 18
> > > failure (a variant of KUDU-2641).
> > >
> > > On Thu, Feb 21, 2019 at 5:54 PM helifu 
> wrote:
> > > >
> > > > +1
> > > > * All C++ tests passed in debug/release/tsan mode on Debian8.9,
> except asan;
> > > >   However, I think it's acceptable because the asan mode never
> worked for me;
> > > > * All Java tests passed on Debian8.9 using gradlew;
> > > >
> > > >
> > > > 何李夫
> > > > 2019-02-22 09:54:53
> > > >
> > > > -邮件原件-
> > > > 发件人: dev-return-6186-hzhelifu=corp.netease@kudu.apache.org
>  代表 Andrew Wong
> > > > 发送时间: 2019年2月21日 8:42
> > > > 收件人: dev 
> > > > 主题: [VOTE] Apache Kudu 1.9.0-RC1
> > > >
> > > > Hello Kudu devs!
> > > >
> > > > The Apache Kudu team is happy to announce the first release
> candidate for Apache Kudu 1.9.0.
> > > >
> > > > Apache Kudu 1.9.0 is a minor release that offers many improvements
> and fixes since the prior release.
> > > >
> > > > This is a source-only release. The artifacts have been staged here:
> > > > https://dist.apache.org/repos/dist/dev/kudu/1.9.0-RC1/
> > > >
> > > > Java convenience binaries in the form of a Maven repository are
> staged here:
> > > >
> https://repository.apache.org/content/repositories/orgapachekudu-1028/
> > > >
> > > > It is tagged in Git as 1.9.0-RC1 and the corresponding hash is the
> > > > following:
> > > >
> https://gitbox.apache.org/repos/asf?p=kudu.git;a=commit;h=76e8af74e151c018de7e3d6aa34fadd49bf41601
> > > > <
> https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=76e8af74e151c018de7e3d6aa34fadd49bf41601
> >
> > > >
> > > > A draft of the release notes can be found here:
> > > > https://gerrit.cloudera.org/c/12389/
> > > >
> > > > The KEYS file to verify the artifact signatures can be found 

Proposed change for the kudu-tidy-bot at jenkins.kudu.apache.org

2019-02-13 Thread Alexey Serbin
Hi,

In the context of fixing https://issues.apache.org/jira/browse/KUDU-2699,
if going with the approach posted for review at
https://gerrit.cloudera.org/#/c/12471/, it's necessary to adapt the command
for the kudu-tidy-bot job accordingly.

One alternative is to keep the old command for the branches without that
patch, but for master and newer branches use new, updated command.
Probably, we would want to back-port it into 1.9.x to be along with
commit 861ecc12f.  The command would look like in
https://gerrit.cloudera.org/#/c/12471/4//COMMIT_MSG

The other alternative is suggested by Adar: let's just run the
kudu-tidy-bot job against the changes in the master branch, since our
development process is making the changes in the master branch and
cherry-picking them into the release/maintenance branches.

Personally, I prefer the latter option.

What do you think?  Should I go ahead with the second option or there might
be some objections?


Thanks,

Alexey


Re: [VOTE] Apache Kudu 1.8.0-RC2

2018-10-23 Thread Alexey Serbin
+1

I built and ran C++ tests under CentOS 6.6 (debug configuration).  All
passed.
I also ran Java tests on CentOS 6.6 using gradlew, and everything passed
except for
  org.apache.kudu.client.TestKuduClient >
testReadYourWritesSyncLeaderReplica.
I think the failure of one of the inherently flaky Java tests isn't an
issue.

I built and ran C++ tests under macOS 10.11.6 (debug configuration).  All
passed except for:

client_examples-test
fs_manager-test
sentry_client-test
kudu-tool-test.2
tablet_server-test
trace-test

However, that does not look like a blocker given the fact that macOS is a
development-only platform.


On Tue, Oct 23, 2018 at 11:57 AM Grant Henke 
wrote:

> +1
>
> Ran all of the tests on MacOs and Verified the Maven poms and artifacts
> visually.
>
> On Tue, Oct 23, 2018 at 1:30 PM Adar Lieber-Dembo
> 
> wrote:
>
> > +1
> >
> > Built and ran DEBUG C++ tests in slow mode on CentOS 6.6. All passed.
> > Built and ran Java tests using DEBUG Kudu binaries on CentOS 6.6. All
> > passed.
> > Reviewed the release notes.
> > On Mon, Oct 22, 2018 at 5:17 PM Dan Burkert 
> wrote:
> > >
> > > +1
> > >
> > > Built and ran C++ tests on MacOS.  Had three failures (fs_manager-test,
> > > kudu-tool-test, and trace-test), but I believe they are all
> > MacOS-specific,
> > > and shouldn't be release blockers.
> > >
> > > - Dan
> > >
> > > On Fri, Oct 19, 2018 at 11:48 PM Andrew Wong  wrote:
> > >
> > > > +1
> > > >
> > > > Built and ran all C++ tests on Centos 6.6 in debug and release
> modes. A
> > > > couple failed but, upon retrying, passed.
> > > > Ran all Java tests on Centos 6.6 via maven and gradle in release
> mode.
> > All
> > > > passed.
> > > >
> > > > Built and ran all C++ tests on Centos 7.3 in release mode. All
> passed.
> > > > Ran all Java tests on Centos 7.3 via gradle in release mode. A number
> > of
> > > > tests failed at first:
> > > >
> > > >- TestKuduBackup.testSimpleBackupAndRestore and
> > > >TestKuduBackup.testRandomBackupAndRestore (both passed on retry,
> > > > flakiness
> > > >is tracked as KUDU-2548 <
> > > > https://issues.apache.org/jira/browse/KUDU-2584>
> > > >)
> > > >- ITClientStressTest.testManyShortClientsGeneratingScanTokens (too
> > many
> > > >open files led to ClassNotFoundException on my shared machine that
> > is
> > > >running Kudu and Impala)
> > > >- TestKuduClient.testGetAuthnToken (passed on retry, some
> flakiness
> > of
> > > >TestKuduClient tracked in KUDU-2236
> > > >)
> > > >- TestMasterFailover.testKillLeaderBeforeOpenTable (passed on
> retry,
> > > >flakiness tracked as KUDU-2592
> > > >)
> > > >- DefaultSourceTest.testTableScanWithProjectionAndPredicateLong
> > (passed
> > > >on retry, flakiness of DefaultSourceTest is tracked in KUDU-2029
> > > > and KUDU-2599
> > > >)
> > > >- DefaultSourceTest.testBasicSparkSQLWithInListPredicate (passed
> on
> > > >retry, ditto)
> > > >
> > > > The takeaway for me is that while we've been OK about tracking
> > > > flakiness in our C++ tests (there exists a flaky test tracking server
> > > > for them), we should do a better job at this for Java tests. Within
> > > > the last release, the pre-commit gerrit job was updated to retry
> > > > failed Java tests up to 3x and, for the sake of pre-commits passing,
> > > > that has allowed the introduction of a number of new flakes. This
> > > > doesn't seem release-blocking, but we should be mindful to address
> > > > this moving forward.
> > > >
> > > > On Tue, Oct 16, 2018 at 2:47 AM Attila Bukor 
> > wrote:
> > > >
> > > > > Hi Devs,
> > > > >
> > > > > As suggested on the RC1 vote thread, I've included the complete fix
> > for
> > > > > KUDU-2463 and created a new release candidate for Apache Kudu
> 1.8.0.
> > > > >
> > > > > Apache Kudu 1.8.0 is a minor release that offers many improvements
> > and
> > > > > fixes
> > > > > since the prior release.
> > > > >
> > > > > This is a source-only release. The artifacts have been staged here:
> > > > > https://dist.apache.org/repos/dist/dev/kudu/1.8.0-RC2/
> > > > >
> > > > > Java convenience binaries in the form of a Maven repository are
> > staged
> > > > > here:
> > > > >
> > https://repository.apache.org/content/repositories/orgapachekudu-1027
> > > > >
> > > > > It is tagged in Git as 1.8.0-RC2 and the corresponding hash is the
> > > > > following:
> > > > >
> > > > >
> > > >
> >
> https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=cbbf7b580c4ab4fdf6621e4ee5ab1ddc5f03cb4e
> > > > >
> > > > > The release notes can be found here:
> > > > >
> > https://github.com/apache/kudu/blob/1.8.0-RC2/docs/release_notes.adoc
> > > > >
> > > > > The KEYS file to verify the artifact signatures can be found here:
> > > > > 

Re: [VOTE] Apache Kudu 1.7.1 RC2

2018-06-06 Thread Alexey Serbin

+1

Built on CentOS release 6.6 (Final) in release and debug modes.

Ran C++ tests for the release build.  All passed but a few scenarios from
kudu-tool-test, which failed due to GLOG_colorlogtostderr being set.  Ran
the failed tests after unsetting the GLOG_colorlogtostderr environment
variable: no failures.


As for the release notes, I think it would be nice to update those at 
least including note about fix for KUDU-2443, if possible.  I think it's 
not a big deal, so even with current release for RC2 it looks good to me.



Thanks,

Alexey

On 6/5/18 5:21 PM, William Berkeley wrote:

+1

Built on el7.
Ran tests. Passed.
Put on a 4-node cluster with some existing data written by 1.5. Was able to
scan the data back (same results as before).

-Will

On Wed, May 30, 2018 at 8:24 PM, Grant Henke  wrote:


Thank you for all the validation Attila.

Good reminder on the release notes. I will update those. That doesn't
require a new release candidate.

Thank you,
Grant

On Wed, May 30, 2018 at 4:44 PM, Attila Bukor  wrote:


+1 (non-binding)

* The artifact's sha1 checksum and gpg signature are valid.
* C++ release build on el7 succeeded, all tests passed.
* Java build also succeeded on el7; 2 tests failed in kudu-client, both
due to environmental issues (stress tests maxed out the fd count, and I
don't have permissions to change it on this machine). Removing these two
tests allowed all other tests to pass on the other Java projects.
* C++ release build fails on MacOS 10.13.2 with Apple LLVM version 9.1.0
(clang-902.0.39.1), but as MacOS support is experimental, I believe this is
not an issue; the build fails on 1.7.0 and 1.7.1-RC1 too, so this is not a
new regression. After backporting 61d3fff, the build succeeds and all tests
pass. I cherry-picked this commit on branch-1.7.x in case there will be a
1.7.2 or a 1.7.1-RC3.
* Java build on MacOS succeeds, all tests passed.
* Tested Maven repo, works fine.
* Skimmed through the README, looks fine to me, but I'm not sure what to
check here.
* The release notes haven't been updated since RC1; I believe at least
KUDU-2443 and the ColumnSchema.toString NPE fix should be mentioned, but I
also think this doesn't warrant a new release candidate. I'm not sure about
the policy on the release notes though.

Thanks for this RC, Grant!

Attila



On 2018. May 30., at 15:44, Grant Henke  wrote:

Hi,

The Apache Kudu team is happy to announce the second release candidate
for Apache Kudu 1.7.1.

Apache Kudu 1.7.1 is a bug-fix release which fixes critical issues in
Kudu 1.7.0.

This is a source-only release. The artifacts have been staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.7.1-RC2/

Java convenience binaries in the form of a Maven repository are staged
here:
https://repository.apache.org/content/repositories/orgapachekudu-1021/

It is tagged in Git as 1.7.1-RC2 and the corresponding git hash is the
following:
https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=5418bfcbbfc6c1809cc869e0119f003e8fb66e37

The release notes can be found here:
https://github.com/apache/kudu/blob/1.7.1-RC2/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
https://dist.apache.org/repos/dist/release/kudu/KEYS

I'd suggest going through the README and the release notes, building Kudu,
and running the unit tests. Testing out the Maven repo would also be
appreciated.

The vote will run until Monday, June 4th at 11PM PDT.

Thank you,
Grant

--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke





--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke





Re-balancing algorithm draft

2018-04-19 Thread Alexey Serbin

Hi,

Please take a look at the re-balancing algorithm draft:
https://docs.google.com/document/d/1jEoruO6NF_YykHsqt_GKZR-ExRkYjVE6SZq2McyKYjg

The algorithm has a simplistic implementation in Python (see the gist
referenced at the end of the document).  There are some open points for
discussion at this point, but I think it's time to start collecting feedback
from a broader audience.

Your feedback and suggestions are appreciated.


Thanks,

Alexey



Re: [VOTE] Apache Kudu 1.7.0 RC2

2018-03-20 Thread Alexey Serbin

+1

-- built on OS X El Capitan (v 10.11.6) in DEBUG, ran tests using ctest -j2;
all but 3 passed (the fixes for the failed ones are already committed into
the main trunk)
-- built on CentOS 6.6 in DEBUG with devtoolset, ran tests using ctest -j4.
All passed except for the client_samples-test, which failed because of some
port conflicts



/Alexey

On 3/20/18 10:09 AM, William Berkeley wrote:

+1

- Built in debug mode on macOS and el7.
- Ran tests, all passed except the macOS issues Alexey has addressed. I
don't think the macOS test-only issues are a big deal.
- Did some basic insert/update/scan operations on a loadgen table.

-Will

On Tue, Mar 20, 2018 at 9:24 AM, Alexey Serbin <aser...@cloudera.com> wrote:


Yep, it seems the failure of those 3 tests should not be a show-stopper for
the 1.7 release since macOS is a development-only platform:

   delete_table-itest
   master-test
   rpc-test


Thanks,

Alexey



On 3/20/18 8:52 AM, Grant Henke wrote:


Thanks for the correction Alexey.

For anyone reading: *today (March 20th) is the intended last day of the
vote*. Given I listed Tuesday, I hope that date error wasn't too confusing.

I am not sure a Mac OSX test issue is enough that it warrants a new RC,
given it only affects development on an experimental system. We can cherry
pick them to 1.7.x to ensure they don't impact future
development/cherry-pick work in the branch. Does that seem reasonable?

Thank you,
Grant

On Tue, Mar 20, 2018 at 1:01 AM, Todd Lipcon <t...@cloudera.com> wrote:

+1

- Downloaded and verified signature and sha1. My signature is attached
below
- built on el7 DEBUG using gcc 4.8.5. ran tests.
- built on el7 using devtoolset-7 (gcc 7). ran tests.
- got a few failures in both configs, but they passed upon re-running. I
was running them in parallel with each other on the same temp disk so I
think it was just various timeouts due to the disk being overloaded.
- did various failure testing of the new "3-4-3" replication scheme on a
6-node cluster
- did basic validation on a 125-node cluster running Kudu with Impala
including some 100B-row inserts (which discovered some bugs earlier in the
1.7 release cycle and now seem OK). A few nodes went down during the
testing and everything seems to have recovered fine.




-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEABECAAYFAlqweKUACgkQXkPKua7Hfq+7DQCg4t6llqevULFCmEIzGHrh7YSv
nXoAoPvJUrYpP46KC1TQP/exD57qS3TO
=+kkn
-END PGP SIGNATURE-



On Fri, Mar 16, 2018 at 6:28 PM, Grant Henke <ghe...@cloudera.com>
wrote:

Hi,

The Apache Kudu team is happy to announce the second release candidate
for Apache Kudu 1.7.0.

Apache Kudu 1.7.0 is a minor release that offers many improvements and
fixes since the prior release.

This is a source-only release. The artifacts have been staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.7.0-RC2/

Java convenience binaries in the form of a Maven repository are staged
here:
https://repository.apache.org/content/repositories/orgapachekudu-1019/

It is tagged in Git as 1.7.0-RC2 and the corresponding git hash is the
following:
https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=472ae66565d9a8efdc81f16568249d7c4b108251

The release notes can be found here:
https://github.com/apache/kudu/blob/1.7.0-RC2/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
https://dist.apache.org/repos/dist/release/kudu/KEYS

I'd suggest going through the README and the release notes, building Kudu,
and running the unit tests. Testing out the Maven repo would also be
appreciated.

The vote will run until Tuesday, March 6th at 11PM PDT. We'd normally
run a vote for 3 full days but since we're headed into the weekend now
let's do 4 days instead.

Thank you,
Grant


--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke



--
Todd Lipcon
Software Engineer, Cloudera








Re: [VOTE] Apache Kudu 1.7.0 RC2

2018-03-20 Thread Alexey Serbin

Yep, it seems the failure of those 3 tests should not be a show-stopper for
the 1.7 release since macOS is a development-only platform:

  delete_table-itest
  master-test
  rpc-test


Thanks,

Alexey


On 3/20/18 8:52 AM, Grant Henke wrote:

Thanks for the correction Alexey.

For anyone reading: *today (March 20th) is the intended last day of the
vote*. Given I listed Tuesday, I hope that date error wasn't too confusing.

I am not sure a Mac OSX test issue is enough that it warrants a new RC,
given it only affects development on an experimental system. We can cherry
pick them to 1.7.x to ensure they don't impact future
development/cherry-pick work in the branch. Does that seem reasonable?

Thank you,
Grant

On Tue, Mar 20, 2018 at 1:01 AM, Todd Lipcon  wrote:


+1

- Downloaded and verified signature and sha1. My signature is attached
below
- built on el7 DEBUG using gcc 4.8.5. ran tests.
- built on el7 using devtoolset-7 (gcc 7). ran tests.
- got a few failures in both configs, but they passed upon re-running. I
was running them in parallel with each other on the same temp disk so I
think it was just various timeouts due to the disk being overloaded.
- did various failure testing of the new "3-4-3" replication scheme on a
6-node cluster
- did basic validation on a 125-node cluster running Kudu with Impala
including some 100B-row inserts (which discovered some bugs earlier in the
1.7 release cycle and now seem OK). A few nodes went down during the
testing and everything seems to have recovered fine.




-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEABECAAYFAlqweKUACgkQXkPKua7Hfq+7DQCg4t6llqevULFCmEIzGHrh7YSv
nXoAoPvJUrYpP46KC1TQP/exD57qS3TO
=+kkn
-END PGP SIGNATURE-



On Fri, Mar 16, 2018 at 6:28 PM, Grant Henke  wrote:


Hi,

The Apache Kudu team is happy to announce the second release candidate
for Apache Kudu 1.7.0.

Apache Kudu 1.7.0 is a minor release that offers many improvements and
fixes since the prior release.

This is a source-only release. The artifacts have been staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.7.0-RC2/

Java convenience binaries in the form of a Maven repository are staged
here:
https://repository.apache.org/content/repositories/orgapachekudu-1019/

It is tagged in Git as 1.7.0-RC2 and the corresponding git hash is the
following:
https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=472ae66565d9a8efdc81f16568249d7c4b108251

The release notes can be found here:
https://github.com/apache/kudu/blob/1.7.0-RC2/docs/release_notes.adoc

The KEYS file to verify the artifact signatures can be found here:
https://dist.apache.org/repos/dist/release/kudu/KEYS

I'd suggest going through the README and the release notes, building Kudu,
and running the unit tests. Testing out the Maven repo would also be
appreciated.

The vote will run until Tuesday, March 6th at 11PM PDT. We'd normally
run a vote for 3 full days but since we're headed into the weekend now
let's do 4 days instead.

Thank you,
Grant


--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke




--
Todd Lipcon
Software Engineer, Cloudera








Re: Flaky tests?

2017-11-29 Thread Alexey Serbin

An update: the flakiness in raft_consensus_nonvoter-itest has been fixed.

On 11/27/17 6:55 PM, Alexey Serbin wrote:
Yep, that CatalogManagerAddsNonVoter is the new one which was 
committed just yesterday.


On 11/27/17 6:53 PM, Alexey Serbin wrote:
The raft_consensus_nonvoter-itest is the set of tests added for 3-4-3 
re-replication improvements.  I'm adding more scenarios there right 
now, and I'll take care of the current flaky ones from there as well.



Thanks,

Alexey

On 11/27/17 6:38 PM, Andrew Wong wrote:
N/w! I should have checked with you beforehand given you were already in
the area (per your response last week). Seems the double-effort was fairly
minimal anyway.

With the fixes for tablet_copy-itest and delete_table-itest checked in, the
next-highest offenders on the dashboard
<http://dist-test.cloudera.org:8080/> are:

- raft_consensus_nonvoter-itest (9.62%)
- linked_list-test (8.45%)

From a quick glance I'm not sure I have a grasp on what's going on in
either test. Would anyone like to volunteer?

On Mon, Nov 27, 2017 at 6:27 PM, Alexey Serbin 
<aser...@cloudera.com> wrote:



I just realized after re-reading this message that Andrew was about to
look at the flake in delete_table-itest as well.  I'm sorry for the
double-effort here, if any.  I read this message after posting the 
patch.




On 11/27/17 12:09 PM, Andrew Wong wrote:


I'm taking a look at tablet_copy-itest and the flakiness in
delete_table-itest beyond Alexey's outstanding patch.

On Tue, Nov 21, 2017 at 10:17 AM, Todd Lipcon <t...@cloudera.com> 
wrote:


On Tue, Nov 21, 2017 at 10:13 AM, Alexey Serbin 
<aser...@cloudera.com>

wrote:

I'll take a look at delete_table-itest (at least I have had a patch in
review for one flake there for a long time).

BTW, it would be much better if it were possible to see the type of failed
build in the dashboard (as it was prior to quasar).  Is the type of a build
something inherently impossible to expose from quasar?

I think it should be possible by just setting the BUILD_ID environment
variable appropriately before reporting the test result. That information
should be available in the environment as $BUILD_TYPE or somesuch. I think
Ed is out this week but maybe he can take a look at this when he gets back?

-Todd




Best regards,

Alexey


On 11/20/17 11:50 AM, Todd Lipcon wrote:

Hey folks,

It seems some of our tests have gotten pretty flaky lately again. Some of
it is likely due to churn in test infrastructure (running on a different VM
type now I think) but it makes me a little nervous to go into the 1.6
release with some tests at 5%+ flaky.

Can we get some volunteers to triage the top couple most flaky? Note that
"triage" doesn't necessarily mean "fix" -- just want to investigate to the
point that we can decide it's likely to be a test issue or known existing
issue rather than a regression before the release.

I'll volunteer to look at consensus_peers-itests (the top most flaky one).

-Todd



--
Todd Lipcon
Software Engineer, Cloudera














Re: Flaky tests?

2017-11-27 Thread Alexey Serbin
Yep, that CatalogManagerAddsNonVoter is the new one which was committed 
just yesterday.


On 11/27/17 6:53 PM, Alexey Serbin wrote:
The raft_consensus_nonvoter-itest is the set of tests added for 3-4-3 
re-replication improvements.  I'm adding more scenarios there right 
now, and I'll take care of the current flaky ones from there as well.



Thanks,

Alexey

On 11/27/17 6:38 PM, Andrew Wong wrote:

N/w! I should have checked with you beforehand given you were already in
the area (per your response last week). Seems the double-effort was fairly
minimal anyway.

With the fixes for tablet_copy-itest and delete_table-itest checked in, the
next-highest offenders on the dashboard
<http://dist-test.cloudera.org:8080/> are:

- raft_consensus_nonvoter-itest (9.62%)
- linked_list-test (8.45%)

From a quick glance I'm not sure I have a grasp on what's going on in
either test. Would anyone like to volunteer?

On Mon, Nov 27, 2017 at 6:27 PM, Alexey Serbin <aser...@cloudera.com> 
wrote:



I just realized after re-reading this message that Andrew was about to
look at the flake in delete_table-itest as well.  I'm sorry for the
double-effort here, if any.  I read this message after posting the 
patch.




On 11/27/17 12:09 PM, Andrew Wong wrote:


I'm taking a look at tablet_copy-itest and the flakiness in
delete_table-itest beyond Alexey's outstanding patch.

On Tue, Nov 21, 2017 at 10:17 AM, Todd Lipcon <t...@cloudera.com> 
wrote:


On Tue, Nov 21, 2017 at 10:13 AM, Alexey Serbin <aser...@cloudera.com>

wrote:

I'll take a look at delete_table-itest (at least I have had a patch in
review for one flake there for a long time).

BTW, it would be much better if it were possible to see the type of failed
build in the dashboard (as it was prior to quasar).  Is the type of a build
something inherently impossible to expose from quasar?

I think it should be possible by just setting the BUILD_ID environment
variable appropriately before reporting the test result. That information
should be available in the environment as $BUILD_TYPE or somesuch. I think
Ed is out this week but maybe he can take a look at this when he gets back?

-Todd




Best regards,

Alexey


On 11/20/17 11:50 AM, Todd Lipcon wrote:

Hey folks,

It seems some of our tests have gotten pretty flaky lately again. Some of
it is likely due to churn in test infrastructure (running on a different VM
type now I think) but it makes me a little nervous to go into the 1.6
release with some tests at 5%+ flaky.

Can we get some volunteers to triage the top couple most flaky? Note that
"triage" doesn't necessarily mean "fix" -- just want to investigate to the
point that we can decide it's likely to be a test issue or known existing
issue rather than a regression before the release.

I'll volunteer to look at consensus_peers-itests (the top most flaky one).

-Todd



--
Todd Lipcon
Software Engineer, Cloudera












Re: Flaky tests?

2017-11-27 Thread Alexey Serbin
The raft_consensus_nonvoter-itest is the set of tests added for 3-4-3 
re-replication improvements.  I'm adding more scenarios there right now, 
and I'll take care of the current flaky ones from there as well.



Thanks,

Alexey

On 11/27/17 6:38 PM, Andrew Wong wrote:

N/w! I should have checked with you beforehand given you were already in
the area (per your response last week). Seems the double-effort was fairly
minimal anyway.

With the fixes for tablet_copy-itest and delete_table-itest checked in, the
next-highest offenders on the dashboard
<http://dist-test.cloudera.org:8080/> are:

- raft_consensus_nonvoter-itest (9.62%)
- linked_list-test (8.45%)

From a quick glance I'm not sure I have a grasp on what's going on in
either test. Would anyone like to volunteer? 

On Mon, Nov 27, 2017 at 6:27 PM, Alexey Serbin <aser...@cloudera.com> wrote:


I just realized after re-reading this message that Andrew was about to
look at the flake in delete_table-itest as well.  I'm sorry for the
double-effort here, if any.  I read this message after posting the patch.



On 11/27/17 12:09 PM, Andrew Wong wrote:


I'm taking a look at tablet_copy-itest and the flakiness in
delete_table-itest beyond Alexey's outstanding patch.

On Tue, Nov 21, 2017 at 10:17 AM, Todd Lipcon <t...@cloudera.com> wrote:

On Tue, Nov 21, 2017 at 10:13 AM, Alexey Serbin <aser...@cloudera.com>

wrote:

I'll take a look at delete_table-itest (at least I have had a patch in
review for one flake there for a long time).

BTW, it would be much better if it were possible to see the type of failed
build in the dashboard (as it was prior to quasar).  Is the type of a build
something inherently impossible to expose from quasar?

I think it should be possible by just setting the BUILD_ID environment
variable appropriately before reporting the test result. That information
should be available in the environment as $BUILD_TYPE or somesuch. I think
Ed is out this week but maybe he can take a look at this when he gets back?

-Todd




Best regards,

Alexey


On 11/20/17 11:50 AM, Todd Lipcon wrote:

Hey folks,

It seems some of our tests have gotten pretty flaky lately again. Some of
it is likely due to churn in test infrastructure (running on a different VM
type now I think) but it makes me a little nervous to go into the 1.6
release with some tests at 5%+ flaky.

Can we get some volunteers to triage the top couple most flaky? Note that
"triage" doesn't necessarily mean "fix" -- just want to investigate to the
point that we can decide it's likely to be a test issue or known existing
issue rather than a regression before the release.

I'll volunteer to look at consensus_peers-itests (the top most flaky one).

-Todd



--
Todd Lipcon
Software Engineer, Cloudera










Re: Flaky tests?

2017-11-27 Thread Alexey Serbin
I just realized after re-reading this message that Andrew was about to 
look at the flake in delete_table-itest as well.  I'm sorry for the 
double-effort here, if any.  I read this message after posting the patch.



On 11/27/17 12:09 PM, Andrew Wong wrote:

I'm taking a look at tablet_copy-itest and the flakiness in
delete_table-itest beyond Alexey's outstanding patch.

On Tue, Nov 21, 2017 at 10:17 AM, Todd Lipcon <t...@cloudera.com> wrote:


On Tue, Nov 21, 2017 at 10:13 AM, Alexey Serbin <aser...@cloudera.com>
wrote:


I'll take a look at delete_table-itest (at least I have had a patch in
review for one flake there for a long time).

BTW, it would be much better if it were possible to see the type of failed
build in the dashboard (as it was prior to quasar).  Is the type of a build
something inherently impossible to expose from quasar?


I think it should be possible by just setting the BUILD_ID environment
variable appropriately before reporting the test result. That information
should be available in the environment as $BUILD_TYPE or somesuch. I think
Ed is out this week but maybe he can take a look at this when he gets back?

-Todd




Best regards,

Alexey


On 11/20/17 11:50 AM, Todd Lipcon wrote:


Hey folks,

It seems some of our tests have gotten pretty flaky lately again. Some of
it is likely due to churn in test infrastructure (running on a different VM
type now I think) but it makes me a little nervous to go into the 1.6
release with some tests at 5%+ flaky.

Can we get some volunteers to triage the top couple most flaky? Note that
"triage" doesn't necessarily mean "fix" -- just want to investigate to the
point that we can decide it's likely to be a test issue or known existing
issue rather than a regression before the release.

I'll volunteer to look at consensus_peers-itests (the top most flaky one).

-Todd





--
Todd Lipcon
Software Engineer, Cloudera








Re: Flaky tests?

2017-11-21 Thread Alexey Serbin
I'll take a look at delete_table-itest (at least I have had a patch in 
review for one flake there for a long time).


BTW, it would be much better if it were possible to see the type of 
failed build in the dashboard (as it was prior to quasar).  Is the type 
of a build something inherently impossible to expose from quasar?



Best regards,

Alexey

On 11/20/17 11:50 AM, Todd Lipcon wrote:

Hey folks,

It seems some of our tests have gotten pretty flaky lately again. Some of
it is likely due to churn in test infrastructure (running on a different VM
type now I think) but it makes me a little nervous to go into the 1.6
release with some tests at 5%+ flaky.

Can we get some volunteers to triage the top couple most flaky? Note that
"triage" doesn't necessarily mean "fix" -- just want to investigate to the
point that we can decide it's likely to be a test issue or known existing
issue rather than a regression before the release.

I'll volunteer to look at consensus_peers-itests (the top most flaky one).

-Todd




Re: Kudu 1.6 release

2017-11-17 Thread Alexey Serbin

Hi Mike,

Thank you for taking care of the process for 1.6.0 release.
The plan you described looks good to me.


Best regards,

Alexey

On 11/16/17 1:51 AM, Mike Percy wrote:

Hi Kudu dev community,

It's been 2 months since release 1.5.0 and we've got a bunch of valuable
improvements and bug fixes waiting in the wings. Based on our usual 2-month
cadence, now looks like a good time to start thinking about a Kudu 1.6.0
release.

I'll volunteer to RM this one, unless someone else has a burning desire to
do it.

I'll also propose to cut the branch for 1.6.x early in the week *after* the
Thanksgiving holiday in the US, and to start a vote on RC1 a couple of days
after that.

Devs: That means release notes for notable changes in 1.6 should be up for
review and ready to go by Monday, November 27 (the Monday after
Thanksgiving) to ensure their inclusion.

Please let me know your thoughts on the above plan.

Thanks!
Mike





Obsolete sections from: design for Kudu re-replication improvements

2017-11-14 Thread Alexey Serbin (via Google Docs)

I've shared an item with you:

Obsolete sections from: design for Kudu re-replication improvements
https://docs.google.com/document/d/15KQp_6yB0fd7cRn-5vsU9C_x5cTBbVqWhAuDNAqwtF0/edit?usp=sharing

It's not an attachment -- it's stored online. To open this item, just click
the link above.


Obsoleted sections from 'design for Kudu re-replication improvements'


Replication improvement design doc

2017-09-26 Thread Alexey Serbin

Hi,

The subj is available at
https://docs.google.com/document/d/1_gQ3BONhKVR2hFlDTShtdx1vHkbVTcLE3b-nrNFwGDI

The work on the document is still in progress, but one can get an idea
what it's about from the sections already available.

As usual, your feedback is highly appreciated!


Best regards,

Alexey



Re: [VOTE] Apache Kudu 1.5.0 RC3

2017-09-06 Thread Alexey Serbin

+1

Built Kudu on Mac OS X (ver 10.11.6) from source in debug mode,
ran tests with ctest -j4.  3 tests failed, but those are test-only issues
which are either already fixed or have fixes pending in the main trunk:

  encoding-test (fix is pending, see KUDU-2119)
  delete_table-itest (fix is out for review:
    https://gerrit.cloudera.org/#/c/7972/)
  kudu-tool-test (fixed with 5583d620 in the main trunk already)


Thanks,

Alexey


On 9/5/17 10:40 AM, Dan Burkert wrote:

Hi,

The Apache Kudu team is happy to announce the third release candidate for
Apache Kudu 1.5.0.

Apache Kudu 1.5.0 is a minor release which offers many improvements and
fixes since the prior release.

This is a source-only release. The artifacts are staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.5.0-RC3

Java convenience binaries in the form of a Maven repository are staged
here: https://repository.apache.org/content/repositories/orgapachekudu-1015.

It is built from this tag:
https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=9101f85fa4ec465b490720629eb5a0caba2563dd

KEYS file: http://www.apache.org/dist/kudu/KEYS

I suggest going through the README, building Kudu, and running the unit
tests, and testing the Maven artifacts.

The vote will run until Friday, September 8th at 12pm PDT.

Thanks,
Dan





IWYU configuration for gerrit pre-commit checks

2017-08-24 Thread Alexey Serbin

Hi,

Today I enabled the IWYU (include-what-you-use) configuration
for the pre-commit Jenkins job at https://gerrit.cloudera.org

The newly introduced automated check runs the IWYU tool [1]
(include-what-you-use is the name of the binary)
to help us keep the Kudu source code cleaner [2].

As you would expect, Jenkins automatically starts a job for
the IWYU configuration along with jobs for other configs
(DEBUG, RELEASE, ..., LINT).  The IWYU job runs the tool
against the files modified by the changelists in question.
If the IWYU job fails, you should look at the job's console
output and update your code in accordance with IWYU recommendations.
I assume your changelists are already synchronized with
the trunk to include a couple of IWYU-related updates
that the IWYU Jenkins job depends on.

As a side note, I want to mention that the include-what-you-use
tool is still in alpha quality phase, so there might be some quirks;
e.g. the tool might suggest something that breaks compilation, etc.
I put some effort into minimizing such mishaps, but if you hit any of
those, please let me know -- I'll help you resolve them.
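
To give a flavor of what the tool typically asks for, here is an
illustrative before/after fix (made up for this note, not from the Kudu
tree): include directly what the file uses, and forward-declare types
that are only used by pointer or reference.

#include <map>     // for std::map: used directly below, so include it
#include <string>  // for std::string: don't rely on transitive includes

namespace kudu {

class TabletServer;  // forward declaration suffices: used only as a pointer

struct ServerRegistry {
  std::map<std::string, TabletServer*> servers_by_uuid;
};

}  // namespace kudu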

If you want to run the verification locally
before submitting your patch for review
and you are using GNU make, just run

  make iwyu

(the same as you would do to run the lint: 'make lint')

If you have any questions or concerns, please let me know.


Kind regards,

Alexey


References:
  [1] https://github.com/include-what-you-use/include-what-you-use
  [2] 
https://github.com/include-what-you-use/include-what-you-use/blob/master/docs/WhyIWYU.md





Re: Load Data Question

2017-07-12 Thread Alexey Serbin
If you use the Kudu API and set the flush mode for a session to anything but
AUTO_FLUSH_SYNC, those inserts will be accumulated into batches at the
client side and sent to the corresponding tablet servers in chunks.
Consider using the AUTO_FLUSH_BACKGROUND mode while working with the
KuduSession API (using MANUAL_FLUSH would require you to flush those
batches manually before the size of the accumulated data reaches the max
allowed size, which is configurable).


Also, if the lines in your file(s) contain data for independent rows
(i.e. you are not expecting to perform upserts for some lines), you
could split those lines into ranges (e.g., 0 -- 99, 100 -- 199, etc.)
and run multiple Kudu sessions (one per line range in the file) in
parallel; a rough sketch of the single-session loop is below.
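
For illustration, here is a minimal, untested C++ sketch of such a loading
loop using the public client API; the master address ("master-host:7051"),
table name, column names, and row count are made up for the example:

#include <memory>

#include "kudu/client/client.h"
#include "kudu/client/stubs.h"

using kudu::client::KuduClient;
using kudu::client::KuduClientBuilder;
using kudu::client::KuduInsert;
using kudu::client::KuduSession;
using kudu::client::KuduTable;
using kudu::client::sp::shared_ptr;

int main() {
  shared_ptr<KuduClient> client;
  KUDU_CHECK_OK(KuduClientBuilder()
                    .add_master_server_addr("master-host:7051")
                    .Build(&client));
  shared_ptr<KuduTable> table;
  KUDU_CHECK_OK(client->OpenTable("my_table", &table));

  shared_ptr<KuduSession> session = client->NewSession();
  // Let the client accumulate operations and flush them in the background.
  KUDU_CHECK_OK(session->SetFlushMode(KuduSession::AUTO_FLUSH_BACKGROUND));

  for (int i = 0; i < 1000000; ++i) {  // one iteration per parsed CSV line
    std::unique_ptr<KuduInsert> insert(table->NewInsert());
    KUDU_CHECK_OK(insert->mutable_row()->SetInt32("key", i));
    KUDU_CHECK_OK(insert->mutable_row()->SetString("val", "..."));
    // Apply() takes ownership; the operation just joins the current batch.
    KUDU_CHECK_OK(session->Apply(insert.release()));
  }
  // Make sure everything accumulated so far reaches the tablet servers.
  KUDU_CHECK_OK(session->Flush());
  return 0;
}

To parallelize over line ranges as described above, run one such session
per range, each in its own thread or process.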


Hope this helps.


Best regards,

Alexey



On 7/10/17 7:54 PM, sky wrote:

Hi,
 If I load data from a csv file, can I only traverse the file and insert
rows one by one through the API?







At 2017-07-10 22:40:05, "Jean-Daniel Cryans"  wrote:

(sending to user@ and putting dev@ in bcc)

Hi,

Kudu by itself doesn't really have file loading capabilities, you'd have to
write your own code that reads a file and then uses either the Java or C++
API to insert the data.

Hope this helps,

J-D

On Mon, Jul 10, 2017 at 1:55 AM, sky  wrote:


Hi all,
 How does Kudu load data from a file?  I know that Kudu can insert data
from Impala, but is there any other way?  Not through Impala, executed by
Kudu alone.
 Thanks.




Preconditions vs assert in the Kudu Java client

2017-07-11 Thread Alexey Serbin

Hi,

While working on a small refactoring in the Kudu Java client, I found that
sometimes we use asserts [1] and sometimes Preconditions [2] from the guava
library to assert on the consistency of the code.


As Todd suggested, I'm starting this thread to clarify the best way to
perform consistency checks in our Kudu Java client code.  Ideally, the
desired outcome from the anticipated discussion would be some sort of
guidance on asserting consistency constraints and invariants in our Java
client code.  Once we have the guidance, we could put it up as an
additional section of our 'Contributing' page.

The issue with the Java asserts is that they are not enabled unless the JVM
is run with the '-ea' flag, so it's not possible to reliably stop processing
further instructions in the context where an inconsistency is detected.

To me, we have more clarity with the consistency checks and invariant
assertions in our C++ code.  We have CHECK- and DCHECK-derived macros to
perform various types of consistency checks.  In short, the CHECK-related
ones are always there, and the DCHECK-related ones are present only in
debug builds.  We don't turn any of those on/off dynamically at runtime.
Also, those C++ asserts are 'hard' ones -- the whole program is terminated,
so you don't need to think about how to deal with a mess that isn't
possible to recover from.  The latter is a luxury which is not available in
Java (correct me if I'm wrong).
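
To make the C++ distinction above concrete, a tiny glog-style sketch (the
function itself is hypothetical):

#include <cstring>

#include <glog/logging.h>

void TransferBytes(const char* src, char* dst, int n) {
  // CHECK is compiled into every build type: on failure it logs FATAL and
  // aborts the process.
  CHECK(src != nullptr && dst != nullptr) << "null buffer";
  // DCHECK is compiled out of release builds: it enforces a
  // programmer-level invariant during development and testing only.
  DCHECK_GE(n, 0) << "negative length: " << n;
  memcpy(dst, src, static_cast<size_t>(n));
}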

Putting the explicit termination of the whole program (with an optional
coredump) aside, Java's assert looks like DCHECK(), but with the ability to
turn it on/off using a run-time switch.  Using guava's Preconditions gives
us an option to address that and have a way to assert consistency in our
code even if the code is run without the '-ea' JVM flag.


So, to me it looks natural to use the Preconditions-derived checks for
asserting on invariants even while running in a production environment,
while using asserts only in debug/test mode, asserting on programmer
errors and other non-crucial invariants which can be handled by the code
even if those checks are removed.

However, this simple approach contradicts some points of [1] (which, by my
understanding, is contradictory and very confusing as is).  Also, I'm not a
Java programmer, so I might be missing some crucial points here.

What do you think?  Your feedback is highly appreciated.


Thanks,

Alexey


References:
  [1] 
http://docs.oracle.com/javase/8/docs/technotes/guides/language/assert.html

  [2] https://github.com/google/guava/wiki/PreconditionsExplained





Re: [VOTE] Apache Kudu 1.4.0 RC1

2017-06-06 Thread Alexey Serbin

+1

Checked out source from the branch-1.4.x git branch, built the DEBUG 
configuration and ran tests using 'ctest -j4' on


ProductName:Mac OS X
ProductVersion: 10.11.6
BuildVersion:   15G1421

100% tests passed, 0 tests failed out of 224


Thanks,

Alexey

On 6/4/17 10:39 PM, Todd Lipcon wrote:

Hi,

The Apache Kudu team is happy to announce the first release candidate for
Apache Kudu 1.4.0.

Apache Kudu 1.4.0 is a minor release which offers many improvements and
fixes since the prior release. This first release candidate does not
contain full release notes -- I am hoping that we can start voting while
preparing release notes, and then have an abbreviated RC2 vote if the only
changes between RC1 and RC2 are the incorporation of docs.

This is a source-only release. The artifacts were staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.4.0-RC1/

Java convenience binaries in the form of a Maven repository are staged here:
https://repository.apache.org/content/repositories/orgapachekudu-1010/

It was built from this tag:
https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=afcd412baedcd153d152bd43128ca82886630200

As noted above, the release notes are work-in-progress.

KEYS file:
http://www.apache.org/dist/kudu/KEYS

I'd suggest going through the README, building Kudu, and running the
unit tests. Testing out the Maven repo would also be appreciated.

The vote will run until Wednesday, June 7th at 11PM PDT. As noted above,
we'll need to do an RC2 to incorporate release notes, so please focus your
voting and testing on the non-doc contents here.

Thanks,
Todd





Re: [VOTE] Apache Kudu 1.3.1 RC1

2017-04-17 Thread Alexey Serbin

+1

Built and ran tests on OS X 10.11.6 using a custom version of the Kerberos5
suite (1.15.1) and a small patch to adapt the krb5 configuration to work on
krb5 1.15.1 (available in the main trunk in
b89dc9a28e7f49a1b429ca0b976643f3b16d1563).


All tests passed except for the delete_tablet-itest, which I think isn't 
a blocker.



Best regards,

Alexey


On 4/17/17 4:52 PM, Dan Burkert wrote:

+1

Built and ran tests on OS X 10.10.  delete_tablet-itest fails, but it
appears to fail on master as well, all the way back to when it was
introduced about a month ago.  Not a blocker IMO.

- Dan

On Mon, Apr 17, 2017 at 4:32 PM, Todd Lipcon  wrote:


+1

Built the release build bits on CentOS 6.8 and CentOS 7.3 (from stock VM
images, following the docs).

Since no read/write path or consensus changes were made since 1.3.0 I
didn't bother rerunning any larger cluster tests or benchmarks. Instead I
focused on more targeted testing of the changes.

Ran release mode tests on both, no unexpected failures (a couple of timing
flakes due to overloaded VMs, but they succeeded on retry).

I also ran through the repro steps from KUDU-1968 a couple of times and
verified that no blocks were lost, servers could restart correctly, and
ksck with checksum_scan mode passed.

I also successfully ran RAT as part of building the release tarball.

-Todd

On Thu, Apr 13, 2017 at 2:35 PM, Adar Dembo  wrote:


+1

I downloaded, built on Ubuntu 16.04 for TSAN, and all the C++ tests

passed.

I recently filed KUDU-1975 which is (IMHO) an annoying regression, but
I don't think it should hold up the release.


On Thu, Apr 13, 2017 at 9:09 AM, Jean-Daniel Cryans  wrote:


Hi,

The Apache Kudu team is happy to announce the first release candidate

for

Apache Kudu 1.3.1.

Apache Kudu 1.3.1 is a bug fix release which fixes critical issues
discovered in Apache Kudu 1.3.0. Please see the release notes for details.

This is a source-only release. The artifacts were staged here:
https://dist.apache.org/repos/dist/dev/kudu/1.3.1-RC1/

Java convenience binaries in the form of a Maven repository are staged
here:
https://repository.apache.org/content/repositories/orgapachekudu-1009/

It was built from this tag:
https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=afc0f479ba6e04ec84c3c10c55be940c0784c8e1

The release notes can be found here:
https://github.com/apache/kudu/blob/branch-1.3.x/docs/release_notes.adoc



Re: Prepping for 1.3.1 release

2017-04-12 Thread Alexey Serbin

On 4/12/17 2:56 PM, Todd Lipcon wrote:


On Wed, Apr 12, 2017 at 2:54 PM, Alexey Serbin <aser...@cloudera.com> wrote:


There is a patch to fix shortened TSK lifetime:
795f5ee948e525941c575b231e2c1f9456c160ac

However, that does not sound like a critical fix to me.  If you think it
would be nice to pick it up anyway, please let me know -- I'll put up a
patch for 1.3.x branch.


Agreed it doesn't seem quite critical enough -- the outcome is just a 6-day
expiration instead of 7-day, right?

Exactly.  It does not seem to be a big deal, IMO.







Thank you for finding and fixing KUDU-1968!


Best regards,

Alexey


On 4/12/17 2:43 PM, Jean-Daniel Cryans wrote:


+1

Nothing else comes to mind. Kudos for finding the bug.

J-D

On Wed, Apr 12, 2017 at 2:42 PM, Todd Lipcon <t...@cloudera.com> wrote:

This morning we found a serious issue (KUDU-1968) which can eat data at a

relatively high rate on clusters after a failure. We need to get 1.3.1 out
ASAP.  I'd like to build a release candidate tonight or tomorrow.

Are there any other patches that we should be cherry-picking into 1.3.1?
Reminder that point-releases like this should be critical fixes (data
loss,
incorrect results, crashes, etc) only.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera








Re: Prepping for 1.3.1 release

2017-04-12 Thread Alexey Serbin
There is a patch to fix shortened TSK lifetime: 
795f5ee948e525941c575b231e2c1f9456c160ac


However, that does not sound like a critical fix to me.  If you think it 
would be nice to pick it up anyway, please let me know -- I'll put up a 
patch for 1.3.x branch.


Thank you for finding and fixing KUDU-1968!


Best regards,

Alexey

On 4/12/17 2:43 PM, Jean-Daniel Cryans wrote:

+1

Nothing else comes to mind. Kudos for finding the bug.

J-D

On Wed, Apr 12, 2017 at 2:42 PM, Todd Lipcon  wrote:


This morning we found a serious issue (KUDU-1968) which can eat data at a
relatively high rate on clusters after a failure. We need to get 1.3.1 out
ASAP.  I'd like to build a release candidate tonight or tomorrow.

Are there any other patches that we should be cherry-picking into 1.3.1?
Reminder that point-releases like this should be critical fixes (data loss,
incorrect results, crashes, etc) only.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera





Re: Multi-word Flag Style

2017-04-10 Thread Alexey Serbin
Oops, sure -- we cannot use dashes when referring to the variables in 
our C++ code.


On 4/10/17 12:40 PM, Dan Burkert wrote:

On Mon, Apr 10, 2017 at 11:56 AM, Adar Dembo <a...@cloudera.com> wrote:

But, I don't know whether gflags can be coerced to programmatically
emit flags with dashes (i.e. when invoked with --help) without a patch
or two.


Yah, I was thinking the same.  Gflags just landed support for dashes
upstream in 2.2, so I imagine they would be open to a patch to configure
dashes in help output.



Certainly in the code we would want to retain the use of
underscores when referring to flag variables; FLAGS_foo_bar conforms
to our coding style more than something like FLAGS-foo-bar.


I agree - I don't think dashes are valid identifier characters in C++
anyway.

- Dan




On Mon, Apr 10, 2017 at 11:00 AM, Alexey Serbin <aser...@cloudera.com>
wrote:

I think it's a good move.  It would be nice to add a notice about that in
the user-facing docs.

Also, I think it would be more consistent to convert those flags altogether
at some point to be in dash-ish form, both the code and the docs.  Maybe,
1.4 is a good point to do that.


Kind regards,

Alexey



On 4/10/17 10:42 AM, William Berkeley wrote:

I agree, for the reason you gave: dashes are the norm in Unix, so they
"feel right" for flag names.

-Will

On Mon, Apr 10, 2017 at 1:38 PM, Dan Burkert <danburk...@apache.org>
wrote:


Hi all,

As of Kudu 1.3, multi-word flags can use a dash '-' separator in lieu of
the underscore '_' separator.  For example, --memory_limit_hard_bytes can
now be specified as --memory-limit-hard-bytes, or even
--memory_limit-hard_bytes.  Of the people I've talked to, most seem to
prefer dashes to underscores in flag names, since that's been the Unix norm
for a long time.

Going forward, I'd like to propose that we document flag names using dashes
wherever possible.  We would continue accepting underscores indefinitely,
since to stop doing so would break compatibility. For the most part, this
means incrementally switching the documentation to use dashes, and getting
gflags to output dashes in --help output.

Any thoughts?

- Dan





Re: Multi-word Flag Style

2017-04-10 Thread Alexey Serbin
I think it's a good move.  It would be nice to add a notice about that 
in the user-facing docs.


Also, I think it would be more consistent to convert those flags 
altogether at some point to be in dash-ish form, both the code and the 
docs.  Maybe, 1.4 is a good point to do that.



Kind regards,

Alexey


On 4/10/17 10:42 AM, William Berkeley wrote:

I agree, for the reason you gave: dashes are the norm in Unix, so they
"feel right" for flag names.

-Will

On Mon, Apr 10, 2017 at 1:38 PM, Dan Burkert  wrote:


Hi all,

As of Kudu 1.3, multi-word flags can use a dash '-' separator in lieu of
the underscore '_' separator.  For example,  --memory_limit_hard_bytes can
now be specified as --memory-limit-hard-bytes, or even
--memory_limit-hard_bytes.  Of the people I've talked to, most seem to
prefer dashes to underscores in flag names, since that's been the Unix norm
for a long time.

Going forward, I'd like to propose that we document flag names using dashes
wherever possible.  We would continue accepting underscores indefinitely,
since to stop doing so would break compatibility. For the most part, this
means incrementally switching the documentation to use dashes, and getting
gflags to output dashes in --help output.

Any thoughts?

- Dan
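
As an aside, one way this kind of dash/underscore aliasing can be
implemented (a sketch only -- not necessarily how Kudu actually does it) is
to normalize the name part of each --flag argument before the flags
library parses argv:

#include <cstring>

// Rewrite '-' to '_' in the name of every "--flag[=value]" argument,
// leaving the value (everything after '=') untouched.
static void NormalizeFlagDashes(int argc, char** argv) {
  for (int i = 1; i < argc; ++i) {
    char* arg = argv[i];
    if (strncmp(arg, "--", 2) != 0) continue;
    char* end = strchr(arg, '=');
    for (char* p = arg + 2; *p != '\0' && (end == nullptr || p < end); ++p) {
      if (*p == '-') *p = '_';
    }
  }
}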





Re: TLS for localhost connections

2017-02-10 Thread Alexey Serbin
ok, thanks.

Yep -- the realization of the simple fact that we should not protect
against a super-user on a local machine came to me after I sent that e-mail.
Sorry for the noise.

On Fri, Feb 10, 2017 at 10:30 AM, Todd Lipcon <t...@cloudera.com> wrote:

> On Fri, Feb 10, 2017 at 10:14 AM, Alexey Serbin <aser...@cloudera.com>
> wrote:
>
> > Hi Todd,
> >
> > Thank you for sharing the perf stats you observed.  I'm curious: during
> > those s_client/s_server tests, was the TLS/SSL compression on or off?  I
> > don't think it would change the result a lot but it's interesting to
> know.
> >
>
> Compression was off:
>
> todd@todd-ThinkPad-T540p:~/sw/openssl-1.0.1f$ perf stat bash -c 'dd
> if=/dev/zero bs=1M count=5000 | openssl s_client -cipher ADH-AES128-SHA'
> CONNECTED(0003)
> ---
> no peer certificate available
> ---
> No client certificate CA names sent
> Server Temp Key: DH, 2048 bits
> ---
> SSL handshake has read 850 bytes and written 441 bytes
> ---
> New, TLSv1/SSLv3, Cipher is ADH-AES128-SHA
> Secure Renegotiation IS supported
> Compression: NONE
> Expansion: NONE
> No ALPN negotiated
> SSL-Session:
> Protocol  : TLSv1.2
> Cipher: ADH-AES128-SHA
> Session-ID:
> 5FE2AA31BC78C5578DE5FE95D3380E4D7094B1040A7D6E9C6A5EC15929F04564
> Session-ID-ctx:
> Master-Key:
> AE0FA5291957492495B7B3424CD4283FEA113727919D393AA19318516827
> E9AB074BCBC2A445584FE5C01DC59424B6F3
> Key-Arg   : None
> PSK identity: None
> PSK identity hint: None
> SRP username: None
> TLS session ticket lifetime hint: 300 (seconds)
> TLS session ticket:
>  - 8f 7e 92 27 06 5f 24 7c-3c a0 20 5d 7e a3 f8 d1   .~.'._$|<.
> ]~...
> 0010 - 4f 49 ad fc 52 30 e3 89-e0 a8 3a 53 29 e1 07 d4
> OI..R0:S)...
> 0020 - 22 01 4b 95 40 5d 27 77-cf 6c b5 77 41 97 3a 88   ".K.@
> ]'w.l.wA.:.
> 0030 - 35 23 6e c4 c7 66 36 0b-aa b5 ef d5 eb d8 3e cf
> 5#n..f6...>.
> 0040 - 34 c3 38 2a 0d b3 f9 26-1c a2 49 fe bc 27 b1 74
> 4.8*...&..I..'.t
> 0050 - 89 96 42 69 af 11 c9 6c-da 3d 65 bc 85 dd 64 d7
> ..Bi...l.=e...d.
> 0060 - 39 0f 78 34 6a c6 27 7e-57 37 b3 eb 60 cc c0 2d
> 9.x4j.'~W7..`..-
> 0070 - 3a a2 12 bc e6 d6 85 8e-ba 9d 7a 9e e2 e7 a0 ab
> :.z.
> 0080 - 47 1a d9 67 ec be 78 2a-d4 91 57 75 93 e1 28 a3
> G..g..x*..Wu..(.
> 0090 - 30 24 c9 8f d1 37 bd e1-69 4b 18 43 85 f6 7e 63
> 0$...7..iK.C..~c
>
> Start Time: 1486707067
> Timeout   : 300 (sec)
> Verify return code: 0 (ok)
>
>
>
> >
> > I think that from performance perspective dropping TLS wrapping around
> the
> > connection just after authentication is the best solution.
> >
> > From the other side, I think dropping TLS opens a door for localhost MITM
> > attacks if an attacker can control access to ipfilter (fiddling with data
> > like rewriting traffic?).
> >
>
> I think the assumption we're going on is that we can't protect against root
> on the same machine. (if you're root you could also just read the process's
> memory, or edit the process, or dump the WAL, etc)
>
>
> >
> > BTW, if dropping encryption, are we concerned about leaking authz tokens
> > when they are introduced?
> >
> >
> Same answer as above -- I don't think we're attempting to protect against
> local root in our threat model.
>
> -Todd
>
>
> >
> > On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > Hey folks,
> > >
> > > For those not following along, we're very close to the point where
> we'll
> > be
> > > enabling TLS for all wire communication done by a Kudu cluster (at
> least
> > > when security features are enabled). One thing we've decided is
> important
> > > is to preserve good performance for applications like Spark and Impala
> > > which typically schedule tasks local to the data on the tablet servers,
> > and
> > > we think that enabling TLS for these localhost connections will have an
> > > unacceptable performance hit.
> > >
> > > Our thinking was to continue to use TLS *authentication* to prevent
> MITM
> > > attacks (possible because we typically don't bind to low ports). But,
> we
> > > don't need TLS *encryption*.
> > >
> > > This is possible using the various TLS "NULL" ciphers -- we can have
> both
> > > the client and server notice that the remote peer is local and enable
> the
> > > NULL cipher suite. However, I did some research this evening and it
> looks
> >

Re: TLS for localhost connections

2017-02-10 Thread Alexey Serbin
Hi Todd,

Thank you for sharing the perf stats you observed.  I'm curious: during
those s_client/s_server tests, was the TLS/SSL compression on or off?  I
don't think it would change the result a lot but it's interesting to know.

I think that from performance perspective dropping TLS wrapping around the
connection just after authentication is the best solution.

From the other side, I think dropping TLS opens a door for localhost MITM
attacks if an attacker can control access to ipfilter (fiddling with data
like rewriting traffic?).

BTW, if dropping encryption, are we concerned about leaking authz tokens
when they are introduced?


Best regards,

Alexey


On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon  wrote:

> Hey folks,
>
> For those not following along, we're very close to the point where we'll be
> enabling TLS for all wire communication done by a Kudu cluster (at least
> when security features are enabled). One thing we've decided is important
> is to preserve good performance for applications like Spark and Impala
> which typically schedule tasks local to the data on the tablet servers, and
> we think that enabling TLS for these localhost connections will have an
> unacceptable performance hit.
>
> Our thinking was to continue to use TLS *authentication* to prevent MITM
> attacks (possible because we typically don't bind to low ports). But, we
> don't need TLS *encryption*.
>
> This is possible using the various TLS "NULL" ciphers -- we can have both
> the client and server notice that the remote peer is local and enable the
> NULL cipher suite. However, I did some research this evening and it looks
> like the NULL ciphers disable encryption but don't disable the MAC
> integrity portion of TLS. Best I can tell, there is no API to do so.
>
> I did some brief checks using openssl s_client and s_server on my laptop
> (openssl 1.0.2g, haswell), and got the following numbers for transferring
> 5GB:
>
> ADH-AES128-SHA
> Client: 42.2M cycles
> Server: 35.3M cycles
>
> AECDH-NULL-SHA: (closest NULL I could find to the above)
> Client: 36.2M cycles
> Server: 28.6M cycles
>
> no TLS at all (using netcat to a local TCP port):
> Client: 20.8M cycles
> Server: 10.0M cycles
>
> baseline: iperf -n 5000M localhost
> Client: 2.3M cycles
> Server: 1.8M cycles
> [not sure why this is so much faster than netcat - I guess because with
> netcat I was piping to /dev/null which still requires more syscalls?]
>
> (note that the client in all of these cases includes the 'dd' command to
> generate the data, which probably explains why it's 7-10M cycles more than
> the server in every case)
>
> To summarize, just disabling encryption has not much improvement, given
> that Intel chips now optimize AES. The checksumming itself adds more
> significant overhead than the encryption. This agrees with numbers I've
> seen around the web that crypto-strength checksums only go 1GB/sec or so
> max, typically much slower.
>
> Thinking about the best solution here, I think we should consider using TLS
> during negotiation, and then just completely dropping the TLS (i.e not
> wrapping the sockets in TlsSockets). I think this still gives us the
> protection against the localhost MITM (because the handshake would fail)
> and be trivially zero-overhead. Am I missing any big issues with this idea?
> Anyone got a better one?
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Re: TokenPB contents

2017-01-25 Thread Alexey Serbin
Probably we could, but I haven't done the research on that yet.  I will need
to tinker a bit with it to get a better understanding.  At least I know that
in proto3 the unknown fields are no longer present when serializing a
previously de-serialized message:

  https://developers.google.com/protocol-buffers/docs/proto3#updating

Also, I'm not sure whether it would be possible to differentiate between
compatible and non-compatible extensions.  Or is it proposed that if new
fields are found in a token, the token should not be accepted at all?  In
that case it might be an issue during a rolling upgrade, I think.
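
For reference, a small sketch of the unknown-fields check mentioned above,
using the proto2 C++ reflection API (the helper itself is hypothetical):

#include <google/protobuf/message.h>
#include <google/protobuf/unknown_field_set.h>

// Returns true if 'msg' carried any fields this build doesn't know about.
bool HasUnknownFields(const google::protobuf::Message& msg) {
  const google::protobuf::Reflection* refl = msg.GetReflection();
  return !refl->GetUnknownFields(msg).empty();
}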


Best regards,

Alexey

On Wed, Jan 25, 2017 at 3:14 PM, Dan Burkert <danburk...@apache.org> wrote:

> That's an interesting idea - say if the format evolved to have an
> additional field which restricts use of the token?  Could we use the
> protobuf unknown fields API to recognize if this happened?
>
> - Dan
>
> On Wed, Jan 25, 2017 at 3:03 PM, Alexey Serbin <aser...@cloudera.com>
> wrote:
>
>> I like this idea.
>>
>> Probably, that's too early at this point, but consider adding a notion of
>> compat/non-compat feature flags into tokens.  Imagine the new version of
>> the token has some additional restriction field, which older code does not
>> understand.  In that case, even if the token signature is correct, the
>> older code should not accept the new token since unsupported non-compat
>> feature flags are present.
>>
>> But in the essence this looks great, IMO.
>>
>>
>> Best regards,
>>
>> Alexey
>>
>> On Wed, Jan 25, 2017 at 12:52 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>>> Actually had one more idea... how about:
>>> message AuthenticationToken {
>>> };
>>>
>>> message AuthorizationToken {
>>> };
>>>
>>> message TokenPB {
>>>   // The time at which this token expires, in seconds since the
>>>   // unix epoch.
>>>   optional int64 expire_unix_epoch_seconds = 1;
>>>
>>>   oneof token {
>>> AuthenticationToken authn = 2;
>>> AuthorizationToken authz = 3;
>>>   }
>>> };
>>>
>>> message SignedTokenPB {
>>>   // The actual token contents. This is a serialized TokenPB protobuf.
>>> However, we use a
>>>   // 'bytes' field, since protobuf doesn't guarantee that if two
>>> implementations serialize
>>>   // a protobuf, they'll necessary get bytewise identical results,
>>> particularly in the
>>>   // presence of unknown fields.
>>>   optional bytes token_contents = 2;
>>>
>>>   // The cryptographic signature of 'token_contents'.
>>>   optional bytes signature = 3;
>>>
>>>   // The sequence number of the key which produced 'signature'.
>>>   optional int64 signing_key_seq_num = 4;
>>> };
>>>
>>>
>>>
>>> On Wed, Jan 25, 2017 at 12:44 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>>
>>>> On Wed, Jan 25, 2017 at 12:40 PM, Dan Burkert <d...@cloudera.com> wrote:
>>>>
>>>>> I think it must go in the 'token_contents' itself, otherwise it can be
>>>>> modified by a malicious client.  Other than that, looks good.
>>>>>
>>>>>
>>>> well, if it went into a separate field, then we'd have something like:
>>>>
>>>> optional bytes token_contents = 1;
>>>> optional int64 expiration_timestamp = 2;
>>>>
>>>> // Signature of the string: '<32-bit big-endian length of
>>>> token_contents> token_contents <64-bit big-endian expiration>'
>>>> optional bytes signature = 3;
>>>>
>>>> so they could try to modify it, but the signature would fail.
>>>>
>>>>
>>>>
>>>>> On Wed, Jan 25, 2017 at 12:37 PM, Todd Lipcon <t...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Hey folks
>>>>>>
>>>>>> I'm working on the token signing/verification stuff at the moment.
>>>>>> Curious to solicit some opinions on this:
>>>>>>
>>>>>>
>>>>>> message TokenPB {
>>>>>>   // The actual token contents. This is typically a serialized
>>>>>>   // protobuf of its own. However, we use a 'bytes' field, since
>>>>>>   // protobuf doesn't guarantee that if two implementations serialize
>>>>>>   // a protobuf, they'll necessary get bytewise identical results,
>>>>&

Re: TokenPB contents

2017-01-25 Thread Alexey Serbin
I like this idea.

Probably, that's too early at this point, but consider adding a notion of
compat/non-compat feature flags into tokens.  Imagine the new version of
the token has some additional restriction field, which older code does not
understand.  In that case, even if the token signature is correct, the
older code should not accept the new token since unsupported non-compat
feature flags are present.

But in the essence this looks great, IMO.


Best regards,

Alexey

On Wed, Jan 25, 2017 at 12:52 PM, Todd Lipcon  wrote:

> Actually had one more idea... how about:
> message AuthenticationToken {
> };
>
> message AuthorizationToken {
> };
>
> message TokenPB {
>   // The time at which this token expires, in seconds since the
>   // unix epoch.
>   optional int64 expire_unix_epoch_seconds = 1;
>
>   oneof token {
> AuthenticationToken authn = 2;
> AuthorizationToken authz = 3;
>   }
> };
>
> message SignedTokenPB {
>   // The actual token contents. This is a serialized TokenPB protobuf.
> However, we use a
>   // 'bytes' field, since protobuf doesn't guarantee that if two
> implementations serialize
>   // a protobuf, they'll necessary get bytewise identical results,
> particularly in the
>   // presence of unknown fields.
>   optional bytes token_contents = 2;
>
>   // The cryptographic signature of 'token_contents'.
>   optional bytes signature = 3;
>
>   // The sequence number of the key which produced 'signature'.
>   optional int64 signing_key_seq_num = 4;
> };
>
>
>
> On Wed, Jan 25, 2017 at 12:44 PM, Todd Lipcon  wrote:
>
>> On Wed, Jan 25, 2017 at 12:40 PM, Dan Burkert  wrote:
>>
>>> I think it must go in the 'token_contents' itself, otherwise it can be
>>> modified by a malicious client.  Other than that, looks good.
>>>
>>>
>> well, if it went into a separate field, then we'd have something like:
>>
>> optional bytes token_contents = 1;
>> optional int64 expiration_timestamp = 2;
>>
>> // Signature of the string: '<32-bit big-endian length of token_contents>
>> token_contents <64-bit big-endian expiration>'
>> optional bytes signature = 3;
>>
>> so they could try to modify it, but the signature would fail.
>>
>>
>>
>>> On Wed, Jan 25, 2017 at 12:37 PM, Todd Lipcon  wrote:
>>>
 Hey folks

 I'm working on the token signing/verification stuff at the moment.
 Curious to solicit some opinions on this:


 message TokenPB {
   // The actual token contents. This is typically a serialized
   // protobuf of its own. However, we use a 'bytes' field, since
   // protobuf doesn't guarantee that if two implementations serialize
   // a protobuf, they'll necessary get bytewise identical results,
   // particularly in the presence of unknown fields.
   optional bytes token_contents = 1;

   // The cryptographic signature of 'token_contents'.
   optional bytes signature = 2;

   // The sequence number of the key which produced 'signature'.
   optional int64 signing_key_seq_num = 3;
 };

 The thing that's currently missing is an expiration timestamp of the
 signature. I have two options here:

 *Option A*) say that the TokenPB itself doesn't capture expiration,
 and if a particular type of token needs expiration, it would have to put an
 'expiration time' in its token contents itself.

 *pros:*
 - token signing/verification is just a simple operation on the
 'token_contents' string

 *Cons:*
 - would likely end up with redundant code between AuthN and AuthZ
 tokens, both of which need expiration. However, that code isn't very
 complicated (just a timestamp comparison) so maybe not a big deal?

 *Option B)* add an expiration timestamp field to TokenPB
 *pros:*
 - consolidate the expiration checking code into TokenVerifier
 *cons:*
 - now in order to sign/verify a token, we actually need to be signing
 something like a concatenation of 'token_contents + expiration'. Not too
 hard to construct this concatenation, but it does add some complexity.

 Any strong opinions either way?

 -Todd
 --
 Todd Lipcon
 Software Engineer, Cloudera

>>>
>>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
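
For reference, a sketch of assembling the byte string that the Option B
variant above signs -- '<32-bit big-endian length of token_contents>
token_contents <64-bit big-endian expiration>'. This is illustrative, not
Kudu's actual implementation; producing the signature itself (e.g. an HMAC
or RSA signature over the returned payload) is a separate step.

#include <cstdint>
#include <string>

// Builds the exact byte sequence to sign and, later, to verify against.
// Both sides must construct it identically, byte for byte.
std::string MakeSigningPayload(const std::string& token_contents,
                               int64_t expiration_unix_seconds) {
  std::string payload;
  payload.reserve(token_contents.size() + 12);
  // 32-bit big-endian length prefix: removes any ambiguity about where
  // the contents end and the expiration begins.
  const uint32_t len = static_cast<uint32_t>(token_contents.size());
  for (int shift = 24; shift >= 0; shift -= 8) {
    payload.push_back(static_cast<char>((len >> shift) & 0xff));
  }
  payload += token_contents;
  // 64-bit big-endian expiration suffix.
  const uint64_t exp = static_cast<uint64_t>(expiration_unix_seconds);
  for (int shift = 56; shift >= 0; shift -= 8) {
    payload.push_back(static_cast<char>((exp >> shift) & 0xff));
  }
  return payload;
}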


Re: C++ api docs including some unexpected classes

2017-01-05 Thread Alexey Serbin
Good catch!

Yep, it seems that's due to the forward declarations of those classes in
the files processed by doxygen.  There is at least one way to omit generating
docs for those -- using the @cond/@endcond commands.  I'll send a patch for
review shortly.


Best regards,

Alexey

On Thu, Jan 5, 2017 at 6:37 AM, Todd Lipcon  wrote:

> Was just browsing the published API docs, and found it odd that it
> generates a page for the following:
>
> https://kudu.apache.org/cpp-client-api/structkudu_1_1client_1_1SliceKeysTestSetup.html
>
> It seems to be due to a forward declaration of a template specialization.
> Is there any way we can configure doxygen or annotate the declaration to
> avoid generating this doc?
>
> Another odd example is the following:
> https://kudu.apache.org/cpp-client-api/structStubsCompileAssert.html
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
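
For illustration, the @cond/@endcond approach looks roughly like this in a
header processed by doxygen (the section label INTERNAL is arbitrary, and
the declaration below is just an example of the kind to hide):

/// @cond INTERNAL

// Everything between the two markers is skipped by doxygen, so no
// documentation page is generated for this test-only declaration.
namespace kudu {
namespace client {
struct SliceKeysTestSetup;
} // namespace client
} // namespace kudu

/// @endcond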


Re: Maven 3.3.9 and Java8 now available on Jenkins slaves

2016-12-21 Thread Alexey Serbin
Great, thank you for updating the machines!


Best regards,

Alexey

On Wed, Dec 14, 2016 at 9:56 PM, Todd Lipcon  wrote:

> Hey folks,
>
> By popular demand, I added Maven 3.3.9 and Java 8 on the Jenkins slave
> boxes.
>
> To avoid disrupting current builds, for now I've not put them on the path
> by default. But, if we want to add a new job for Jepsen, we can explicitly
> request those. Or, if we think that it's time to move to Java8 by default,
> I can adjust the kudu-gerrit job config to put them on the path.
>
> The paths are:
> /opt/apache-maven-3.3.9/bin
> and:
> JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
> (bin/ directory inside)
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Re: Flaky tablet_history_gc-itest

2016-11-28 Thread Alexey Serbin
Mike,

Thank you for clarifying this and opening the JIRA item that tracks the
issue.

I'm not an expert on exactly-once semantics, but it seems David provided
more information on KUDU-1761.

Indeed, when using the auto background flush, the client library does not
provide a means to guarantee the order of operation completion on the
server side unless the number of batches is limited to 1 (by default it's
2).  From the client's perspective, the operations are sent sequentially,
but due to the background flushing they might be sent to the server in
parallel and complete at the server side even in reverse order.  So, if
exactly-once semantics does not come into play here, we should address
this in a different way.


Best regards,

Alexey

On Thu, Nov 24, 2016 at 7:32 AM, Mike Percy  wrote:

> I investigated this further today and I believe this has to do with
> concurrent client flushes interleaving, resulting in out-of-order writes on
> the server side.
>
> More details in https://issues.apache.org/jira/browse/KUDU-1761
>
> David / Alexey, based on your knowledge of exactly-once semantics and the
> client flush path, does my explanation make sense? I'll dig into it further
> but based on my hazy memory I think we allow 2 outstanding flushes at a
> time and do not enforce ordering with our exactly-once mechanism, only
> idempotence of individual operations (IIRC we do not do flow control if a
> client sequence number is skipped).
>
> Mike
>
> On Tue, Nov 22, 2016 at 5:10 PM, Mike Percy  wrote:
>
> > Alexey says that KuduSession::Flush() already blocks on all outstanding
> > async flushes, and apparently that's true, so either there is a
> > non-obvious bug there or it's caused by something else.
> >
> > Mike
> >
> > On Tue, Nov 22, 2016 at 4:46 PM, Mike Percy  wrote:
> >
> >> Hi Alexey,
> >> Thank you very much for investigating this test and providing an easy
> >> repro case. I just got around to spending some time on this due to
> >> David's read-your-writes (RYW) patches adding to this test. I think we
> >> should try to make this test reliable to be sure RYW doesn't have
> >> anomalies we aren't aware of. I applied your (second) patch mentioned
> >> earlier in this thread and was able to quickly reproduce the issues you
> >> described.
> >>
> >> I suspect that the problem stems from the fact that in the C++ client,
> >> session->Flush() is a no-op if an async flush is already in progress. At
> >> least, it looks to me like those are the current semantics. If I remove
> >> the session->FlushAsync(null) from your patch then the test passes for me.
> >>
> >> The problem with a non-blocking Flush() while an async flush is
> >> outstanding is that tablet_history_gc-itest relies on the following
> >> series of operations in order to ensure it knows what the consistent
> set of
> >> values should be for a particular timestamp:
> >>
> >>   FlushSessionOrDie(session);
> >> }
> >> SaveSnapshot(std::move(snapshot), clock_->Now());
> >>
> >> So it waits until the data is flushed to the server and then captures
> >> the current server timestamp as a key for its record of expected data
> >> values at that timestamp. The above appears in all of the write code
> >> paths in the test. If FlushSessionOrDie() is an async no-op, and the
> >> client has not truly flushed outstanding data at the current timestamp,
> >> then the "SaveSnapshot()" code path will think the latest writes are
> >> included in the snapshot for the current time, but in actuality the old
> >> value will be stored for that timestamp on the server.
> >>
> >> This matches the observed behavior based on running the (patched)
> >> test with --vmodule=tablet_history_gc-itest=2 and seeing the following
> >> in the log:
> >>
> >> I1122 15:42:48.355686 15303 tablet_history_gc-itest.cc:166] Saving
> >> snapshot at ts = P: 10 usec, L: 1 (40961)
> >> I1122 15:42:48.355768 15303 tablet_history_gc-itest.cc:380] Starting
> >> round 1
> >> ...
> >> I1122 15:42:48.355917 15303 tablet_history_gc-itest.cc:463] Updating
> >> row to { 511, 1535296252, 1535296251, NOT_DELETED }
> >> I1122 15:42:48.355926 15303 tablet_history_gc-itest.cc:463] Updating
> >> row to { 511, 1708088659, 1708088658, NOT_DELETED }
> >> ...
> >> I1122 15:42:48.975769 15303 tablet_history_gc-itest.cc:166] Saving
> >> snapshot at ts = P: 10 usec, L: 514 (409600514)
> >> I1122 15:42:48.975796 15303 tablet_history_gc-itest.cc:496] Updated 512
> >> rows
> >> I1122 15:42:48.975807 15303 tablet_history_gc-itest.cc:380] Starting
> >> round 2
> >> ...
> >> I1122 15:42:54.547406 15303 tablet_history_gc-itest.cc:213] Round 26:
> >> Verifying snapshot scan for timestamp P: 10 usec, L: 1559
> >> (409601559)
> >> ../../src/kudu/integration-tests/tablet_history_gc-itest.cc:242: Failure
> >> Value of: int_val
> >>   Actual: 1535296252
> >> Expected: 
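
A condensed sketch of the flush-then-snapshot ordering discussed above
(assuming the Kudu C++ client API and a valid KuduSession*; this is
illustrative, not a verbatim excerpt from the test):

#include "kudu/client/client.h"

using kudu::Status;
using kudu::client::KuduSession;

// Flush() must block on *all* outstanding work, including data picked up
// by an earlier FlushAsync(); only then is it safe to capture the current
// timestamp and record the expected values under it.
Status FlushBeforeSnapshot(KuduSession* session) {
  Status s = session->Flush();
  if (!s.ok()) {
    return s;
  }
  // ... now capture the current timestamp and save the snapshot of
  // expected values keyed by that timestamp ...
  return Status::OK();
}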

Re: Flaky tablet_history_gc-itest

2016-10-12 Thread Alexey Serbin
One small update: the issue might not be in the GC logic, but some other
flakiness related to reading data at a snapshot.

I updated the patch so the only operations the test now does are inserts,
updates and scans. No tablet merge compactions, redo delta compactions,
forced re-updates of missing deltas, or moving time forward.  The updated
patch can be found at:
  https://gist.github.com/alexeyserbin/06ed8dbdb0e8e9abcbde2991c6615660

The test reliably fails when run as described in the previous message in
this thread; just use the updated patch location.

David, maybe you can take a quick look at that as well?


Thanks,

Alexey

On Wed, Oct 12, 2016 at 2:01 AM, Alexey Serbin <aser...@cloudera.com> wrote:

> Hi,
>
> I played with the test (mostly in the background), making the failure
> almost 100% reproducible.
>
> After collecting some evidence, I can say it's a server-side bug.  I think
> so because the reproduction scenario I'm talking about uses good old
> MANUAL_FLUSH mode, not AUTO_FLUSH_BACKGROUND mode.  Yes, I've modified the
> test slightly to achieve a higher reproduction ratio and to settle the
> question of whether it's an AUTO_FLUSH_BACKGROUND-specific bug.
>
> Here's what I found:
>   1. The problem occurs when updating rows with the same primary keys
> multiple times.
>   2. It's crucial to flush (i.e. call KuduSession::Flush() or
> KuduSession::FlushAsync()) freshly applied update operations not just once
> at the very end of a client session, but multiple times while adding those
> operations.  If flushing just once at the very end, the issue becomes 0%
> reproducible.
>   3. The more updates for different rows we have, the more likely we hit
> the issue (but there should be at least a couple of updates for every row).
>   4. The problem persists in all types of Kudu builds: debug, TSAN,
> release, ASAN (in decreasing order of reproduction ratio).
>   5. The problem is also highly reproducible if running the test via the
> dist_test.py utility (check for 256 out of 256 failure ratio at
> http://dist-test.cloudera.org//job?job_id=aserbin.1476258983.2603 )
>
> To build the modified test and run the reproduction scenario:
>   1. Get the patch from
> https://gist.github.com/alexeyserbin/7c885148dadff8705912f6cc513108d0
>   2. Apply the patch to the latest Kudu source from the master branch.
>   3. Build debug, TSAN, release or ASAN configuration and run with the
> command (the random seed is not really crucial, but this gives better
> results):
> ../../build-support/run-test.sh ./bin/tablet_history_gc-itest \
>   --gtest_filter=RandomizedTabletHistoryGcITest.TestRandomHistoryGCWorkload \
>   --stress_cpu_threads=64 --test_random_seed=1213726993
>
> 4. If running via dist_test.py, run the following instead:
>
> ../../build-support/dist_test.py loop -n 256 -- \
>   ./bin/tablet_history_gc-itest \
>   --gtest_filter=RandomizedTabletHistoryGcITest.TestRandomHistoryGCWorkload \
>   --stress_cpu_threads=8 --test_random_seed=1213726993
>
> Mike, it seems I'll need your help to troubleshoot/debug this issue
> further.
>
>
> Best regards,
>
> Alexey
>
>
> On Mon, Oct 3, 2016 at 9:48 AM, Alexey Serbin <aser...@cloudera.com>
> wrote:
>
>> Todd,
>>
>> I apologize for the late response -- somehow my inbox is messed up.
>> Probably I need to switch to using a stand-alone mail application (such
>> as iMail) instead of a browser-based one.
>>
>> Yes, I'll take a look at that.
>>
>>
>> Best regards,
>>
>> Alexey
>>
>> On Mon, Sep 26, 2016 at 12:58 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>>> This test has gotten flaky with a concerning failure mode (seeing
>>> "wrong" results, not just a timeout or something):
>>>
>>> http://dist-test.cloudera.org:8080/test_drilldown?test_name=tablet_history_gc-itest
>>>
>>> It seems like it got flaky starting with Alexey's
>>> commit bc14b2f9d775c9f27f2e2be36d4b03080977e8fa which switched it to
>>> use AUTO_FLUSH_BACKGROUND. So perhaps the bug is actually a client bug and
>>> not anything to do with GC.
>>>
>>> Alexey, do you have time to take a look, and perhaps consult with Mike
>>> if you think it's actually a server-side bug?
>>>
>>> -Todd
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>>
>
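
A condensed sketch of the MANUAL_FLUSH reproduction pattern described above
(assuming a valid KuduTable* and KuduSession*, with illustrative column
names); the essential ingredient is flushing repeatedly while updates
accumulate, not just once at the very end:

#include <memory>

#include "kudu/client/client.h"

using kudu::Status;
using kudu::client::KuduSession;
using kudu::client::KuduTable;
using kudu::client::KuduUpdate;

Status UpdateWithIntermediateFlushes(KuduTable* table, KuduSession* session,
                                     int num_rounds, int num_rows) {
  Status s = session->SetFlushMode(KuduSession::MANUAL_FLUSH);
  if (!s.ok()) return s;
  for (int round = 0; round < num_rounds; ++round) {
    for (int key = 0; key < num_rows; ++key) {
      // Update the same primary keys over and over across rounds.
      std::unique_ptr<KuduUpdate> update(table->NewUpdate());
      s = update->mutable_row()->SetInt32("key", key);
      if (!s.ok()) return s;
      s = update->mutable_row()->SetInt32("int_val", round);
      if (!s.ok()) return s;
      s = session->Apply(update.release());
      if (!s.ok()) return s;
    }
    // Flushing every round, rather than once after all the updates, is
    // what made the wrong-results scenario reproducible.
    s = session->Flush();
    if (!s.ok()) return s;
  }
  return Status::OK();
}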


Re: Flaky tablet_history_gc-itest

2016-10-12 Thread Alexey Serbin
Hi,

I played with the test (mostly in the background), making the failure
almost 100% reproducible.

After collecting some evidence, I can say it's a server-side bug.  I think
so because the reproduction scenario I'm talking about uses good old
MANUAL_FLUSH mode, not AUTO_FLUSH_BACKGROUND mode.  Yes, I've modified the
test slightly to achieve a higher reproduction ratio and to settle the
question of whether it's an AUTO_FLUSH_BACKGROUND-specific bug.

Here's what I found:
  1. The problem occurs when updating rows with the same primary keys
multiple times.
  2. It's crucial to flush (i.e. call KuduSession::Flush() or
KuduSession::FlushAsync()) freshly applied update operations not just once
at the very end of a client session, but multiple times while adding those
operations.  If flushing just once at the very end, the issue becomes 0%
reproducible.
  3. The more updates for different rows we have, the more likely we hit
the issue (but there should be at least a couple of updates for every row).
  4. The problem persists in all types of Kudu builds: debug, TSAN,
release, ASAN (in decreasing order of reproduction ratio).
  5. The problem is also highly reproducible if running the test via the
dist_test.py utility (check for 256 out of 256 failure ratio at
http://dist-test.cloudera.org//job?job_id=aserbin.1476258983.2603 )

To build the modified test and run the reproduction scenario:
  1. Get the patch from
https://gist.github.com/alexeyserbin/7c885148dadff8705912f6cc513108d0
  2. Apply the patch to the latest Kudu source from the master branch.
  3. Build debug, TSAN, release or ASAN configuration and run with the
command (the random seed is not really crucial, but this gives better
results):
../../build-support/run-test.sh ./bin/tablet_history_gc-itest \
  --gtest_filter=RandomizedTabletHistoryGcITest.TestRandomHistoryGCWorkload \
  --stress_cpu_threads=64 --test_random_seed=1213726993

4. If running via dist_test.py, run the following instead:

../../build-support/dist_test.py loop -n 256 -- \
  ./bin/tablet_history_gc-itest \
  --gtest_filter=RandomizedTabletHistoryGcITest.TestRandomHistoryGCWorkload \
  --stress_cpu_threads=8 --test_random_seed=1213726993

Mike, it seems I'll need your help to troubleshoot/debug this issue further.


Best regards,

Alexey


On Mon, Oct 3, 2016 at 9:48 AM, Alexey Serbin <aser...@cloudera.com> wrote:

> Todd,
>
> I apologize for the late response -- somehow my inbox is messed up.
> Probably I need to switch to using a stand-alone mail application (such
> as iMail) instead of a browser-based one.
>
> Yes, I'll take a look at that.
>
>
> Best regards,
>
> Alexey
>
> On Mon, Sep 26, 2016 at 12:58 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
>> This test has gotten flaky with a concerning failure mode (seeing "wrong"
>> results, not just a timeout or something):
>>
>> http://dist-test.cloudera.org:8080/test_drilldown?test_name=tablet_history_gc-itest
>>
>> It seems like it got flaky starting with Alexey's
>> commit bc14b2f9d775c9f27f2e2be36d4b03080977e8fa which switched it to use
>> AUTO_FLUSH_BACKGROUND. So perhaps the bug is actually a client bug and not
>> anything to do with GC.
>>
>> Alexey, do you have time to take a look, and perhaps consult with Mike if
>> you think it's actually a server-side bug?
>>
>> -Todd
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
>


Scoping document on client timeout diagnosis

2016-10-11 Thread Alexey Serbin
Hello all,

I have a draft of the high-level scoping document on client timeout
diagnosis.  Please take a look and feel free to provide comments in the
Google Doc located here:
  https://s.apache.org/SM6V


Regards,

Alexey


Re: [VOTE] Apache Kudu 1.0.1 RC1

2016-10-10 Thread Alexey Serbin
+1

I compiled the project from source using the 1.0.1-RC1 tag (release and
debug modes) and ran the test suite using 'ctest -j4' on CentOS 6.6 (kernel
2.6.32-504.30.3.el6.x86_64).  All tests passed.

Also, I compiled the project from source using the 1.0.1-RC1 tag (release
and debug modes) and ran the tests using 'ctest -j2' on Mac OS X 10.11.5.
All tests passed except for pstack_watcher-test, which is a known issue for
Kudu on Mac OS X.

Besides, I ran the 'kudu test loadgen' tool against the newly built binaries
multiple times, inserting several million rows, and it passed.


Best regards,

Alexey


On Fri, Oct 7, 2016 at 2:00 PM, Dan Burkert  wrote:

> Hi,
>
> We're happy to announce the first release candidate for Apache Kudu 1.0.1.
>
> This release includes bug fixes and documentation updates since the 1.0.0
> release.
>
> This is a source-only release. The artifacts were staged here:
> https://dist.apache.org/repos/dist/dev/kudu/1.0.1-RC1/
>
> It was built from this tag:
> https://git-wip-us.apache.org/repos/asf?p=kudu.git;a=commit;h=e60b610253f4303b24d41575f7bafbc5d69edddb
>
> The release notes can be found here (the release notes on kudu.apache.org
> will be updated with the release):
> https://github.com/apache/kudu/blob/branch-1.0.x/docs/release_notes.adoc
>
> KEYS file:
> https://www.apache.org/dist/kudu/KEYS
>
> I'd suggest going through the README, building Kudu, and running the
> unit tests.
>
> Please try the release and vote; the vote will be open for at least 72 hours.
>
> Thanks,
>
> - Dan
>

