Tests that don't run impalad

2016-11-29 Thread Jim Apple
Should we add to our pre-merge testing (aka GVM, aka GVO) some tests
that don't run impalad, but only build it or check for correctness?

For instance:

1. bin/run_clang_tidy.sh - should we force our code to always be clang-tidy?

2. bin/check-rat-report.py - uses Apache RAT to check that our code
has proper license headers

3. Other buildall.sh options - in the past we have broken -asan,
-release, or -so without breaking the pre-commit test.

4. Docs build

I think I can do these without increasing the end-to-end time it takes
to run the tests, by doing them in parallel. One issue I see is that
committers who see their change as minor and merge it manually,
without pre-merge testing, might break clang-tidy or RAT tests.
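
A minimal sketch of the "run them in parallel" idea above, assuming each check is an independent command whose exit code signals pass or fail. The helper names (run_check, run_checks_in_parallel, failed_checks) are made up for illustration, and the arguments that the real scripts need are not shown in this thread, so the invocation in the trailing comment is only a guess.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_check(name, argv):
    """Run one check to completion and return (name, exit code)."""
    completed = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if completed.returncode != 0:
        # Only failing checks have their captured output echoed.
        sys.stderr.write("[%s] failed:\n" % name)
        sys.stderr.write(completed.stdout.decode(errors="replace"))
    return name, completed.returncode


def run_checks_in_parallel(checks):
    """Run every check concurrently; return {name: exit_code}."""
    if not checks:
        return {}
    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = [pool.submit(run_check, name, argv) for name, argv in checks.items()]
        return dict(future.result() for future in futures)


def failed_checks(results):
    """Names of the checks that exited non-zero, sorted for stable reporting."""
    return sorted(name for name, code in results.items() if code != 0)


# Hypothetical usage (script arguments are assumptions, not taken from the repo):
#   results = run_checks_in_parallel({
#       "clang-tidy": ["bin/run_clang_tidy.sh"],
#       "rat": ["bin/check-rat-report.py"],
#   })
#   sys.exit(1 if failed_checks(results) else 0)
```

With this shape the wall-clock cost is roughly that of the slowest check rather than the sum, provided the build machine has spare cores.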


Is anyone regularly running e2e tests with the "pairwise" exploration strategy?

2016-11-29 Thread Jim Apple
In the "pairwise" exploration strategy, I have a test error
(https://issues.cloudera.org/browse/IMPALA-4545), a crash, and a
22-hour runtime. Is anyone running this regularly?


Re: Kudu start error with low ntpdate "maximum error"

2016-11-29 Thread Jim Apple
That didn't do it:

http://35.164.73.121:8080/job/ubuntu-14.04-from-scratch/184/consoleFull


The setup of the machine (which runs before the job starts and is not
visible at that page) includes:

+ sudo sed -i s/ubuntu\.pool/amazon\.pool/ /etc/ntp.conf
+ grep amazon /etc/ntp.conf
server 0.amazon.pool.ntp.org
server 1.amazon.pool.ntp.org
server 2.amazon.pool.ntp.org
server 3.amazon.pool.ntp.org
+ grep ubuntu /etc/ntp.conf
server ntp.ubuntu.com

As before, it looks like ntptime is reporting a max error of 16
seconds. I'm going to try and add ntp-wait to our cluster start
scripts to make sure that ntp is in a good state the moment before we
try to start Kudu.
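
A rough sketch of the kind of pre-start check described above, assuming ntptime prints lines of the form "maximum error N us" and "status 0x... (...)", as in the output quoted in this thread. The function names and the error limit passed by the caller are hypothetical; in particular the limit is not taken from Kudu's documentation. STA_UNSYNC (0x0040) is the Linux timex status bit for an unsynchronized clock.

```python
import re

STA_UNSYNC = 0x0040  # Linux timex status bit: clock is not synchronized

_MAX_ERROR_RE = re.compile(r"maximum error (\d+) us")
_STATUS_RE = re.compile(r"status (0x[0-9a-fA-F]+)")


def parse_ntptime(output):
    """Return (max_error_us, status) from ntptime text; None for a missing field."""
    errors = [int(value) for value in _MAX_ERROR_RE.findall(output)]
    status_match = _STATUS_RE.search(output)
    # ntptime can report the maximum error more than once; take the worst value.
    max_error = max(errors) if errors else None
    status = int(status_match.group(1), 16) if status_match else None
    return max_error, status


def clock_looks_ok(output, max_error_limit_us):
    """True only if both fields were found, the error is in bounds, and UNSYNC is clear."""
    max_error, status = parse_ntptime(output)
    if max_error is None or status is None:
        return False
    return max_error <= max_error_limit_us and not (status & STA_UNSYNC)
```

A start script could poll this in a loop (or simply call ntp-wait) and refuse to start the minicluster until it returns True.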

On Sat, Nov 26, 2016 at 1:51 PM, Jim Apple  wrote:
> OK, I changed those lines in /etc/ntp.conf; let's see how that goes.
>
> On Mon, Nov 21, 2016 at 7:19 AM, Todd Lipcon  wrote:
>> On Sat, Nov 19, 2016 at 10:50 AM, Jim Apple  wrote:
>>
>>> Can you do so?
>>>
>>> This job used ntpwait before starting and has syslog sent to the
>>> output; perhaps that will help debug:
>>>
>>> http://ec2-35-161-220-160.us-west-2.compute.amazonaws.com:80
>>> 80/job/ubuntu-14.04-from-scratch/147/
>>>
>>>
>> Yea, looks from the last ntptime output that it lost its NTP
>> synchronization. Looking at the startup log it appears that it's just
>> configured with the default Ubuntu NTP servers rather than using multiple
>> hosts inside EC2. Is it possible for you to adjust your AMI or change the
>> startup script to use the following lines in /etc/ntp.conf?
>>
>> server 0.amazon.pool.ntp.org
>> server 1.amazon.pool.ntp.org
>> server 2.amazon.pool.ntp.org
>> server 3.amazon.pool.ntp.org
>>
>> We could add a flag to ignore NTP sync, but it looks like once NTP goes out
>> of sync it's also reporting a maxerror of 16 seconds, which would also
>> cause an issue.
>>
>> -Todd
>>
>>
>>> In the meantime, it seems we could use a workaround in order to get
>>> this working in ec2 more consistently.
>>>
>>> On Thu, Nov 17, 2016 at 12:21 PM, Todd Lipcon  wrote:
>>> > Nothing else comes to mind. Let me know if you think the solution of
>>> > allowing Kudu to ignore NTP status is preferable and I'll prioritize a
>>> > patch to unblock you guys.
>>> >
>>> > -Todd
>>> >
>>> > On Thu, Nov 17, 2016 at 11:43 AM, Jim Apple 
>>> wrote:
>>> >
>>> >> This is a normal functional test. The node is gone now, but I will add
>>> >> "cat /var/log/syslog" to the logging. Anything else I should add?
>>> >>
>>> >> On Thu, Nov 17, 2016 at 10:55 AM, Todd Lipcon 
>>> wrote:
>>> >> > On Thu, Nov 17, 2016 at 10:52 AM, Jim Apple 
>>> >> wrote:
>>> >> >
>>> >> >> The kudu master failure was at 16:36:28 UTC, but at 13:57:55 and
>>> >> >> 15:58:41 we have ntptime printing "status 0x2001 (PLL,NANO),".
>>> >> >>
>>> >> >
>>> >> > Do you see anything from ntp in /var/log/messages or syslog, etc?
>>> Usually
>>> >> > once NTP is synchronized it's fairly hard to knock it out of sync
>>> unless
>>> >> it
>>> >> > loses internet access. Is this a stress test or normal functional
>>> tests?
>>> >> >
>>> >> >
>>> >> >>
>>> >> >> On Thu, Nov 17, 2016 at 10:14 AM, Todd Lipcon 
>>> >> wrote:
>>> >> >> > On Thu, Nov 17, 2016 at 10:00 AM, Jim Apple 
>>> >> >> wrote:
>>> >> >> >
>>> >> >> >> I now have a Kudu master failing with "F1117 16:36:26.940562
>>> 113049
>>> >> >> >> hybrid_clock.cc:227] Couldn't get the current time: Clock
>>> >> >> >> unsynchronized. Status: Service unavailable: Error reading clock.
>>> >> >> >> Clock considered unsynchronized"
>>> >> >> >>
>>> >> >> >> Todd, when you say "we need to wait for sync before running tests
>>> or
>>> >> >> >> allow an unsafe flag to disable the check," do you mean "we"
>>> Impala,
>>> >> >> >> "we" Kudu, or both? If Kudu or both, is there a Kudu bug I should
>>> >> >> >> follow? I don't see a likely candidate at
>>> >> >> >>
>>> >> >> >> https://issues.apache.org/jira/browse/KUDU-1202?jql=
>>> >> >> >> project%20%3D%20KUDU%20AND%20text%20~%20%22%5C%22Clock%
>>> >> >> >> 20considered%20unsynchronized%5C%22%22
>>> >> >> >
>>> >> >> >
>>> >> >> > Sorry, I was on my phone so wasn't as precise as I should have
>>> been. I
>>> >> >> > think that either:
>>> >> >> >
>>> >> >> > (a) Impala could add a script which runs prior to tests which waits
>>> >> for
>>> >> >> > synchronization. Apparently there is a builtin "ntp-wait" command
>>> in
>>> >> some
>>> >> >> > distros. We use the following which also includes restarting NTP a
>>> few
>>> >> >> > times which seems to speed things up:
>>> >> >> > https://gist.github.com/toddlipcon/97f7c8a4cf1d9c2551bd4289b
>>> 97dfe48
>>> >> >> >
>>> >> >> > (b) Kudu could add a flag like --ignore_bad_ntp_sync_status which
>>> you
>>> >> >> could
>>> >> >> > use in minicluster tests. In a minicluster where all the daemons
>>> run
>>> >> on
>>> >> >> one
>>> >> >> > node, it's sort of guaranteed that the clocks are in sync, so maybe
>>> >> it's
>>> >> >> > not too bad of an idea. The only risk I can see is that the clock
>>> >> might
>>> >> >> be
>>> >

Re: Tests that don't run impalad

2016-11-29 Thread Henry Robinson
On 29 November 2016 at 08:06, Jim Apple  wrote:

> Should we add to our pre-merge testing (aka GVM, aka GVO) some tests
> that don't run impalad, but only build it or check for correctness?
>
> For instance:
>
> 1. bin/run_clang_tidy.sh - should we force our code to always be
> clang-tidy?
>

I don't have enough experience of the tool to know if this is likely to be a
help or hindrance.


>
> 2. bin/check-rat-report.py - uses Apache RAT to check that our code
> has proper license headers
>

+1


>
> 3. Other buildall.sh options - in the past we have broken -asan,
> -release, or -so without breaking the pre-commit test.
>

If all can be tested for 'free' wrt wall-clock time, then sure. But if
that's not possible, I'd only consider building -release, and maybe not
even that. -asan failures are infrequent enough that I don't expect it to
be worth the extra time it would add to the pre-commit build.

-so is less important to me.


>
> 4. Docs build
>
> I think I can do these without increasing the end-to-end time it takes
> to run the tests, by doing them in parallel. One issue I see is that
> committers who see their change as minor and merge it manually,
> without pre-merge testing, might break clang-tidy or RAT tests.
>

For that reason, perhaps a separate docs build makes the most sense.


Re: Tests that don't run impalad

2016-11-29 Thread Sailesh Mukil
On Tue, Nov 29, 2016 at 9:50 AM, Henry Robinson  wrote:

> On 29 November 2016 at 08:06, Jim Apple  wrote:
>
> > Should we add to our pre-merge testing (aka GVM, aka GVO) some tests
> > that don't run impalad, but only build it or check for correctness?
> >
> > For instance:
> >
> > 1. bin/run_clang_tidy.sh - should we force our code to always be
> > clang-tidy?
> >
>
> I don't have enough experience of the tool to know if this likely to be a
> help or hindrance.
>
>
>
+1 for this. My opinion is that unless we foresee some patches that would
fail clang-tidy but still be considered valid by us, we should have this
as a pre-commit test.

>
> > 2. bin/check-rat-report.py - uses Apache RAT to check that our code
> > has proper license headers
> >
>
> +1
>
>
> >
> > 3. Other buildall.sh options - in the past we have broken -asan,
> > -release, or -so without breaking the pre-commit test.
> >
>
> If all can be tested for 'free' wrt to wall-clock-time, then sure. But if
> that's not possible, I'd only consider building -release, and maybe not
> even that. -asan failures are infrequent enough that I don't expect it to
> be worth the extra time it would add to the pre-commit build.
>
> -so is less important to me.
>
>
> >
> > 4. Docs build
> >
> > I think I can do these without increasing the end-to-end time it takes
> > to run the tests, by doing them in parallel. One issue I see is that
> > committers who see their change as minor and merge it manually,
> > without pre-merge testing, might break clang-tidy or RAT tests.
> >
>
> For that reason, perhaps a separate docs build makes the most sense.
>


Re: Tests that don't run impalad

2016-11-29 Thread Jim Apple
Also, individual warnings can be suppressed with "// NOLINT" (or with
"#pragma clang diagnostic ignored" for tidy checks that are also
compiler warnings).

On Tue, Nov 29, 2016 at 10:01 AM, Sailesh Mukil  wrote:
> On Tue, Nov 29, 2016 at 9:50 AM, Henry Robinson  wrote:
>
>> On 29 November 2016 at 08:06, Jim Apple  wrote:
>>
>> > Should we add to our pre-merge testing (aka GVM, aka GVO) some tests
>> > that don't run impalad, but only build it or check for correctness?
>> >
>> > For instance:
>> >
>> > 1. bin/run_clang_tidy.sh - should we force our code to always be
>> > clang-tidy?
>> >
>>
>> I don't have enough experience of the tool to know if this likely to be a
>> help or hindrance.
>>
>>
>>
> +1 for this. My opinion is unless we foresee some patches that would fail
> clang-tidy but still be considered a valid patch by us, we should have this
> as a pre-commit test.
>
>>
>> > 2. bin/check-rat-report.py - uses Apache RAT to check that our code
>> > has proper license headers
>> >
>>
>> +1
>>
>>
>> >
>> > 3. Other buildall.sh options - in the past we have broken -asan,
>> > -release, or -so without breaking the pre-commit test.
>> >
>>
>> If all can be tested for 'free' wrt to wall-clock-time, then sure. But if
>> that's not possible, I'd only consider building -release, and maybe not
>> even that. -asan failures are infrequent enough that I don't expect it to
>> be worth the extra time it would add to the pre-commit build.
>>
>> -so is less important to me.
>>
>>
>> >
>> > 4. Docs build
>> >
>> > I think I can do these without increasing the end-to-end time it takes
>> > to run the tests, by doing them in parallel. One issue I see is that
>> > committers who see their change as minor and merge it manually,
>> > without pre-merge testing, might break clang-tidy or RAT tests.
>> >
>>
>> For that reason, perhaps a separate docs build makes the most sense.
>>


Re: Tests that don't run impalad

2016-11-29 Thread Todd Lipcon
On Kudu we've hooked it up as a precommit, but without any "vote". That is
to say, it will add a gerrit comment with any warnings/diagnostics, but
doesn't add a '-1' which prevents merge. This has been useful for pointing
out style issues or missed optimizations, but we've been able to adopt it
gradually without a big-bang "tidy everything" kind of commit which can be
disruptive to in-flight patches.

Typically we expect patch contributors to address the review comments from
clang-tidy just like they'd address review comments from a human reviewer
(which might include ignoring the comment if it's just some code that was
moved from one place to another rather than new code)

Here's an example of the type of comments it leaves:
https://gerrit.cloudera.org/#/c/5252/1/src/kudu/master/catalog_manager.cc@3296
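
For illustration only, here is a sketch of how tidy output could become non-voting review comments. It is not Kudu's actual script. It assumes the usual "path:line:col: warning: message [check-name]" diagnostic lines and builds a dict shaped like Gerrit's ReviewInput with only a "comments" map and no "labels", which is what keeps it from voting. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed diagnostic shape: path:line:col: warning: message [check-name]
_DIAG_RE = re.compile(
    r"^(?P<path>[^:\n]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<severity>warning|error): (?P<message>.*?)(?: \[(?P<check>[^\]]+)\])?$"
)


def tidy_output_to_review(output):
    """Build a comments-only review payload; no 'labels' key means no vote."""
    comments = defaultdict(list)
    for raw in output.splitlines():
        match = _DIAG_RE.match(raw.strip())
        if not match:
            # Source snippets, caret lines and "note:" lines are skipped.
            continue
        text = match.group("message")
        if match.group("check"):
            text += " [" + match.group("check") + "]"
        comments[match.group("path")].append({
            "line": int(match.group("line")),
            "message": match.group("severity") + ": " + text,
        })
    return {"comments": dict(comments)}
```

Posting the payload to Gerrit is left out on purpose, since the endpoint and credentials depend on the installation.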

-Todd

On Tue, Nov 29, 2016 at 10:06 AM, Jim Apple  wrote:

> Also, individual warning can be suppressed with "// NOLINT" (or with
> "#pragma clang diagnostic ignored" for tidy checks that are also
> compiler warnings)
>
> On Tue, Nov 29, 2016 at 10:01 AM, Sailesh Mukil 
> wrote:
> > On Tue, Nov 29, 2016 at 9:50 AM, Henry Robinson 
> wrote:
> >
> >> On 29 November 2016 at 08:06, Jim Apple  wrote:
> >>
> >> > Should we add to our pre-merge testing (aka GVM, aka GVO) some tests
> >> > that don't run impalad, but only build it or check for correctness?
> >> >
> >> > For instance:
> >> >
> >> > 1. bin/run_clang_tidy.sh - should we force our code to always be
> >> > clang-tidy?
> >> >
> >>
> >> I don't have enough experience of the tool to know if this likely to be
> a
> >> help or hindrance.
> >>
> >>
> >>
> > +1 for this. My opinion is unless we foresee some patches that would fail
> > clang-tidy but still be considered a valid patch by us, we should have
> this
> > as a pre-commit test.
> >
> >>
> >> > 2. bin/check-rat-report.py - uses Apache RAT to check that our code
> >> > has proper license headers
> >> >
> >>
> >> +1
> >>
> >>
> >> >
> >> > 3. Other buildall.sh options - in the past we have broken -asan,
> >> > -release, or -so without breaking the pre-commit test.
> >> >
> >>
> >> If all can be tested for 'free' wrt to wall-clock-time, then sure. But
> if
> >> that's not possible, I'd only consider building -release, and maybe not
> >> even that. -asan failures are infrequent enough that I don't expect it
> to
> >> be worth the extra time it would add to the pre-commit build.
> >>
> >> -so is less important to me.
> >>
> >>
> >> >
> >> > 4. Docs build
> >> >
> >> > I think I can do these without increasing the end-to-end time it takes
> >> > to run the tests, by doing them in parallel. One issue I see is that
> >> > committers who see their change as minor and merge it manually,
> >> > without pre-merge testing, might break clang-tidy or RAT tests.
> >> >
> >>
> >> For that reason, perhaps a separate docs build makes the most sense.
> >>
>



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: Tests that don't run impalad

2016-11-29 Thread Jim Apple
At the moment, we have not added anything to our .clang-tidy config
file that doesn't apply to the whole codebase. That is to say, we are
clang-tidy green (but for a recent errant semicolon after a member fn
decl).

On Tue, Nov 29, 2016 at 11:18 AM, Todd Lipcon  wrote:
> On Kudu we've hooked it up as a precommit, but without any "vote". That is
> to say, it will add a gerrit comment with any warnings/diagnostics, but
> doesn't add a '-1' which prevents merge. This has been useful for pointing
> out style issues or missed optimizations, but we've been able to adopt it
> gradually without a big-bang "tidy everything" kind of commit which can be
> disruptive to in-flight patches.
>
> Typically we expect patch contributors to address the review comments from
> clang-tidy just like they'd address review comments from a human reviewer
> (which might include ignoring the comment if it's just some code that was
> moved from one place to another rather than new code)
>
> Here's an example of the type of comments it leaves:
> https://gerrit.cloudera.org/#/c/5252/1/src/kudu/master/catalog_manager.cc@3296
>
> -Todd
>
> On Tue, Nov 29, 2016 at 10:06 AM, Jim Apple  wrote:
>
>> Also, individual warning can be suppressed with "// NOLINT" (or with
>> "#pragma clang diagnostic ignored" for tidy checks that are also
>> compiler warnings)
>>
>> On Tue, Nov 29, 2016 at 10:01 AM, Sailesh Mukil 
>> wrote:
>> > On Tue, Nov 29, 2016 at 9:50 AM, Henry Robinson 
>> wrote:
>> >
>> >> On 29 November 2016 at 08:06, Jim Apple  wrote:
>> >>
>> >> > Should we add to our pre-merge testing (aka GVM, aka GVO) some tests
>> >> > that don't run impalad, but only build it or check for correctness?
>> >> >
>> >> > For instance:
>> >> >
>> >> > 1. bin/run_clang_tidy.sh - should we force our code to always be
>> >> > clang-tidy?
>> >> >
>> >>
>> >> I don't have enough experience of the tool to know if this likely to be
>> a
>> >> help or hindrance.
>> >>
>> >>
>> >>
>> > +1 for this. My opinion is unless we foresee some patches that would fail
>> > clang-tidy but still be considered a valid patch by us, we should have
>> this
>> > as a pre-commit test.
>> >
>> >>
>> >> > 2. bin/check-rat-report.py - uses Apache RAT to check that our code
>> >> > has proper license headers
>> >> >
>> >>
>> >> +1
>> >>
>> >>
>> >> >
>> >> > 3. Other buildall.sh options - in the past we have broken -asan,
>> >> > -release, or -so without breaking the pre-commit test.
>> >> >
>> >>
>> >> If all can be tested for 'free' wrt to wall-clock-time, then sure. But
>> if
>> >> that's not possible, I'd only consider building -release, and maybe not
>> >> even that. -asan failures are infrequent enough that I don't expect it
>> to
>> >> be worth the extra time it would add to the pre-commit build.
>> >>
>> >> -so is less important to me.
>> >>
>> >>
>> >> >
>> >> > 4. Docs build
>> >> >
>> >> > I think I can do these without increasing the end-to-end time it takes
>> >> > to run the tests, by doing them in parallel. One issue I see is that
>> >> > committers who see their change as minor and merge it manually,
>> >> > without pre-merge testing, might break clang-tidy or RAT tests.
>> >> >
>> >>
>> >> For that reason, perhaps a separate docs build makes the most sense.
>> >>
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera


Re: Tests that don't run impalad

2016-11-29 Thread Tim Armstrong
> 1. bin/run_clang_tidy.sh - should we force our code to always be
> clang-tidy?

I think so (at least if we have a subset of rules that the codebase is
already clean for). It's hard to avoid noise creeping in otherwise. Kudu's
approach would also work here.

> 2. bin/check-rat-report.py - uses Apache RAT to check that our code
> has proper license headers

Makes sense - we have no reason to check in code that doesn't pass RAT.

> 3. Other buildall.sh options - in the past we have broken -asan,
> -release, or -so without breaking the pre-commit test.

I'm going to echo other people's comments - I think this would be great
aside from the build latency problem. If we do this, we could also enable
-Werror for release and ASAN, which would cut down on warning noise. Can we
fork off separate jenkins jobs to compile the code under different
settings? Or somehow use ccache?

> 4. Docs build

It would be great to get John's input on what the docs review/merge process
should look like and whether pre-commit testing for doc changes makes sense.
I suspect something lighter-weight than the code-review/test process would
be appropriate here, and, as others have said, we could have a separate,
faster, doc build rather than bundling it all into a mega-build.

On Tue, Nov 29, 2016 at 11:21 AM, Jim Apple  wrote:

> At the moment, we have not added anything to our .clang-tidy config
> file that doesn't apply to the whole codebase. That is to say, we are
> clang-tidy green (but for a recent errant semicolon after a member fn
> decl).
>
> On Tue, Nov 29, 2016 at 11:18 AM, Todd Lipcon  wrote:
> > On Kudu we've hooked it up as a precommit, but without any "vote". That
> is
> > to say, it will add a gerrit comment with any warnings/diagnostics, but
> > doesn't add a '-1' which prevents merge. This has been useful for
> pointing
> > out style issues or missed optimizations, but we've been able to adopt it
> > gradually without a big-bang "tidy everything" kind of commit which can
> be
> > disruptive to in-flight patches.
> >
> > Typically we expect patch contributors to address the review comments
> from
> > clang-tidy just like they'd address review comments from a human reviewer
> > (which might include ignoring the comment if it's just some code that was
> > moved from one place to another rather than new code)
> >
> > Here's an example of the type of comments it leaves:
> > https://gerrit.cloudera.org/#/c/5252/1/src/kudu/master/
> catalog_manager.cc@3296
> >
> > -Todd
> >
> > On Tue, Nov 29, 2016 at 10:06 AM, Jim Apple 
> wrote:
> >
> >> Also, individual warning can be suppressed with "// NOLINT" (or with
> >> "#pragma clang diagnostic ignored" for tidy checks that are also
> >> compiler warnings)
> >>
> >> On Tue, Nov 29, 2016 at 10:01 AM, Sailesh Mukil 
> >> wrote:
> >> > On Tue, Nov 29, 2016 at 9:50 AM, Henry Robinson 
> >> wrote:
> >> >
> >> >> On 29 November 2016 at 08:06, Jim Apple 
> wrote:
> >> >>
> >> >> > Should we add to our pre-merge testing (aka GVM, aka GVO) some
> tests
> >> >> > that don't run impalad, but only build it or check for correctness?
> >> >> >
> >> >> > For instance:
> >> >> >
> >> >> > 1. bin/run_clang_tidy.sh - should we force our code to always be
> >> >> > clang-tidy?
> >> >> >
> >> >>
> >> >> I don't have enough experience of the tool to know if this likely to
> be
> >> a
> >> >> help or hindrance.
> >> >>
> >> >>
> >> >>
> >> > +1 for this. My opinion is unless we foresee some patches that would
> fail
> >> > clang-tidy but still be considered a valid patch by us, we should have
> >> this
> >> > as a pre-commit test.
> >> >
> >> >>
> >> >> > 2. bin/check-rat-report.py - uses Apache RAT to check that our code
> >> >> > has proper license headers
> >> >> >
> >> >>
> >> >> +1
> >> >>
> >> >>
> >> >> >
> >> >> > 3. Other buildall.sh options - in the past we have broken -asan,
> >> >> > -release, or -so without breaking the pre-commit test.
> >> >> >
> >> >>
> >> >> If all can be tested for 'free' wrt to wall-clock-time, then sure.
> But
> >> if
> >> >> that's not possible, I'd only consider building -release, and maybe
> not
> >> >> even that. -asan failures are infrequent enough that I don't expect
> it
> >> to
> >> >> be worth the extra time it would add to the pre-commit build.
> >> >>
> >> >> -so is less important to me.
> >> >>
> >> >>
> >> >> >
> >> >> > 4. Docs build
> >> >> >
> >> >> > I think I can do these without increasing the end-to-end time it
> takes
> >> >> > to run the tests, by doing them in parallel. One issue I see is
> that
> >> >> > committers who see their change as minor and merge it manually,
> >> >> > without pre-merge testing, might break clang-tidy or RAT tests.
> >> >> >
> >> >>
> >> >> For that reason, perhaps a separate docs build makes the most sense.
> >> >>
> >>
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>


Re: Tests that don't run impalad

2016-11-29 Thread Jim Apple
> Can we
> fork off separate jenkins jobs to compile the code under different
> settings?

I think so. I haven't tried yet, but the Jenkins docs seem to suggest
it is possible.


Re: Is anyone regularly running e2e tests with the "pairwise" exploration strategy?

2016-11-29 Thread Tim Armstrong
I didn't even know the strategy existed (so no). It seems strange that it's
taking longer than the exhaustive strategy. What workload_exploration_strategy
are you using in conjunction with it?

On Tue, Nov 29, 2016 at 8:24 AM, Jim Apple  wrote:

> In the "pairwise" exploration strategy, I have a test error
> (https://issues.cloudera.org/browse/IMPALA-4545), a crash, and a
> 22-hour runtime. Is anyone running this regularly?
>


Re: Is anyone regularly running e2e tests with the "pairwise" exploration strategy?

2016-11-29 Thread Jim Apple
Whatever the default value is

On Tue, Nov 29, 2016 at 11:57 AM, Tim Armstrong  wrote:
> I didn't even know the strategy existed (so no). It seems strange that it's
> taking longer than exhaustive strategy. What workload_exploration_strategy
> are you using in conjunction with it?
>
> On Tue, Nov 29, 2016 at 8:24 AM, Jim Apple  wrote:
>
>> In the "pairwise" exploration strategy, I have a test error
>> (https://issues.cloudera.org/browse/IMPALA-4545), a crash, and a
>> 22-hour runtime. Is anyone running this regularly?
>>


Re: Is anyone regularly running e2e tests with the "pairwise" exploration strategy?

2016-11-29 Thread Tim Armstrong
It looks like our "exhaustive" build runs with --exploration_strategy=core
--workload_exploration_strategy=functional-query:exhaustive, so the problem
may be that a lot of non-functional-query (e.g. tpch) tests run in
configurations that don't make sense.
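
To make the difference concrete, here is a small sketch of how a per-workload override could resolve against the global strategy. It assumes the override value is a comma-separated list of workload:strategy pairs and that any workload without an override falls back to --exploration_strategy. The function names are made up and this is not the test framework's actual code.

```python
def parse_workload_overrides(spec):
    """Parse 'workload:strategy,workload:strategy' into a dict."""
    overrides = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        workload, sep, strategy = item.partition(":")
        if not sep or not workload.strip() or not strategy.strip():
            raise ValueError("expected workload:strategy, got %r" % item)
        overrides[workload.strip()] = strategy.strip()
    return overrides


def strategy_for(workload, default_strategy, overrides):
    """Per-workload override wins; otherwise use the global default."""
    return overrides.get(workload, default_strategy)
```

Under these assumptions, the "exhaustive" build keeps tpch on core, while a run with only --exploration_strategy=pairwise would apply pairwise to every workload, tpch included.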

On Tue, Nov 29, 2016 at 12:18 PM, Jim Apple  wrote:

> Whatever the default value is
>
> On Tue, Nov 29, 2016 at 11:57 AM, Tim Armstrong 
> wrote:
> > I didn't even know the strategy existed (so no). It seems strange that
> it's
> > taking longer than exhaustive strategy. What
> workload_exploration_strategy
> > are you using in conjunction with it?
> >
> > On Tue, Nov 29, 2016 at 8:24 AM, Jim Apple  wrote:
> >
> >> In the "pairwise" exploration strategy, I have a test error
> >> (https://issues.cloudera.org/browse/IMPALA-4545), a crash, and a
> >> 22-hour runtime. Is anyone running this regularly?
> >>
>


StorageHandler related query

2016-11-29 Thread Deepak Dixit
Hello Impala Team,

I am building an application that stores data in an in-memory key-value
store, and I would like to query it through Impala.
I have found that Impala works well with HBase via the
HBaseStorageHandler, which is essentially a connector for Hive.
I have written a similar connector for our key-value store, and Hive works
well with it. But when I try to run a SELECT query on a table created
through Hive, I get the following error:

AnalysisException: Failed to load metadata for table: 'table_1'
CAUSED BY: TableLoadingException: Unrecognized table type for table:
default.table_1

Can you please help me understand the issue and a possible resolution?

Thank You

-- 
From:

Deepak D Dixit
deepakdixit2...@gmail.com
+919028507537


[Toolchain-CR] IMPALA-4549: consistently enforce that boost gregorian year <= 9999

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5264

Change subject: IMPALA-4549: consistently enforce that boost gregorian year <= 9999

..

IMPALA-4549: consistently enforce that boost gregorian year <= 9999

The documentation and code are inconsistent about whether the max is
9999 or 10000. The documentation and the max_date value are both 9999,
so fix cases where it is 10000.

I also filed a bug against boost https://svn.boost.org/trac/boost/ticket/12630

Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
---
M buildall.sh
A source/boost/boost-1.57.0-patches/0001-greg-year-range.patch
2 files changed, 31 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/64/5264/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5264
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Toolchain-CR] IMPALA-4549: consistently enforce that boost gregorian year <= 9999

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4549: consistently enforce that boost gregorian year <= 9999

..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5264/1//COMMIT_MSG
Commit Message:

Line 9: The documentation and code are inconsistent about whether the max is
Maybe indicate in this message that this commit adds a custom boost patch?


-- 
To view, visit http://gerrit.cloudera.org:8080/5264
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


Re: StorageHandler related query

2016-11-29 Thread Tim Armstrong
Hi Deepak,
  We don't support Hive's storage handler API - we have our own HBase
support that is used for HBase tables. We also have similar support for
Kudu tables. To add equivalent support for your system you'd need to add
frontend Java support for planning queries against that table type and
backend C++ support for scanning the data. This kind of thing would be a
pretty big change so there would need to be buy-in from the community.

We also have a Java data source API that lets you write a Java class that
scans an arbitrary data source. That is easier to use but has limitations
(it only executes on a single node, and performance is worse than other
table types). E.g. see this thread:
https://groups.google.com/a/cloudera.org/forum/#!topic/impala-user/egcflD8XkHc

What key-value store are you working with, out of curiosity?

- Tim

On Tue, Nov 29, 2016 at 8:41 AM, Deepak Dixit 
wrote:

> Hello Impala Team,
>
> I am working on building application which stores data in in memory key
> value store and would like to query though impala.
> I have found that Impala works well with Hbase with the
> HBaseStoreageHandler which is essentially connector for Hive.
> I have written similar connector for our key value store. And hive works
> well with that. But when I tries to run select query on table created
> through hive it gives me following error
>
> AnalysisException: Failed to load metadata for table: 'table_1'
> CAUSED BY: TableLoadingException: Unrecognized table type for table:
> default.table_1
>
> Can you please help in understanding the issue and possible resolution?
>
> Thank You
>
> --
> From:
>
> Deepak D Dixit
> deepakdixit2...@gmail.com
> +919028507537
>


[Toolchain-CR] IMPALA-4549: consistently enforce that boost gregorian year <= 9999

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4549: consistently enforce that boost gregorian year <= 9999

..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5264/1//COMMIT_MSG
Commit Message:

Line 9: The documentation and code are inconsistent about whether the max is
> Maybe indicate in this message that this commit adds a custom boost patch?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5264
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Toolchain-CR] IMPALA-4549: consistently enforce that boost gregorian year <= 9999

2016-11-29 Thread Tim Armstrong (Code Review)
Hello Taras Bobrovytsky,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5264

to look at the new patch set (#2).

Change subject: IMPALA-4549: consistently enforce that boost gregorian year <= 9999

..

IMPALA-4549: consistently enforce that boost gregorian year <= 9999

Adds a Boost patch that makes the documentation and code consistent on
whether the max supported year is 9999 or 10000. The documentation and
the max_date value are both 9999, so fix cases where it is 10000.

I also filed a bug against boost https://svn.boost.org/trac/boost/ticket/12630

Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
---
M buildall.sh
A source/boost/boost-1.57.0-patches/0001-greg-year-range.patch
2 files changed, 31 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/64/5264/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5264
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
Gerrit-PatchSet: 2
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Taras Bobrovytsky 


[Toolchain-CR] IMPALA-4549: consistently enforce that boost gregorian year <= 9999

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4549: consistently enforce that boost gregorian year <= 9999

..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5264
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65e4912b407ae43e281e8fb8153c89cd12e2e237
Gerrit-PatchSet: 2
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


Re: StorageHandler related query

2016-11-29 Thread Tim Armstrong
On Tue, Nov 29, 2016 at 2:34 PM, Tim Armstrong 
wrote:

> Hi Deepak,
>   We don't support Hive's storage handler API. We have our own HBase
> support that is used for HBase tables, and similar support for Kudu
> tables. To add equivalent support for your system you'd need to add
> frontend Java support for planning queries against that table type and
> backend C++ support for scanning the data. This kind of thing would be a
> pretty big change, so there would need to be buy-in from the community.
>
> We also have a Java data source API that lets you write a Java class that
> scans an arbitrary data source. That is easier to use but has limitations
> (it only executes on a single node, and performance is worse than for
> other table types). E.g. see this thread:
> https://groups.google.com/a/cloudera.org/forum/#!topic/impala-user/egcflD8XkHc
>
> What key-value store are you working with, out of curiosity?
>
> - Tim
>
> On Tue, Nov 29, 2016 at 8:41 AM, Deepak Dixit 
> wrote:
>
>> Hello Impala Team,
>>
>> I am building an application that stores data in an in-memory key-value
>> store, and I would like to query it through Impala.
>> I have found that Impala works well with HBase with the
>> HBaseStorageHandler, which is essentially a connector for Hive.
>> I have written a similar connector for our key-value store, and Hive works
>> well with it. But when I try to run a select query on a table created
>> through Hive, it gives me the following error:
>>
>> AnalysisException: Failed to load metadata for table: 'table_1'
>> CAUSED BY: TableLoadingException: Unrecognized table type for table:
>> default.table_1
>>
>> Can you please help in understanding the issue and possible resolution?
>>
>> Thank You
>>
>> --
>> From:
>>
>> Deepak D Dixit
>> deepakdixit2...@gmail.com
>> +919028507537
>>
>
>


Re: StorageHandler related query

2016-11-29 Thread Deepak Dixit
Thanks Tim.
I understand that the Java frontend work seems to be the major hurdle. If
there is an ongoing thread about this, I would certainly like to be part of it.

We are working with Apache Geode and are seeking support for it.

Thanks,
Deepak



-- 
From:

Deepak D Dixit
deepakdixit2...@gmail.com
+919028507537