Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Adrian Prantl via lldb-dev


> On Feb 6, 2018, at 9:29 AM, Davide Italiano via lldb-dev 
>  wrote:
> 
> On Tue, Feb 6, 2018 at 8:18 AM, Pavel Labath  wrote:
>> On 6 February 2018 at 15:41, Davide Italiano  wrote:
>>> On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
 On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
 
 So, I guess my question is: are you guys looking into making sure that
 others are also able to reproduce the 0-fail+0-xpass state? I would
 love to run the mac test suite locally, as I tend to touch a lot of
 stuff that impacts all targets, but as it stands now, I have very
 little confidence that the tests I am running reflect in any way the
 results you will get when you run the test on your end.
 
 I am ready to supply any test logs or information you need if you want
 to try to tackle this.
 
>>> 
>>> Yes, I'm definitely interested in making the testsuite
>>> work reliably on any configuration.
>>> I was afraid there were a lot of latent issues; that's why I sent this
>>> mail in the first place.
>>> It's also the reason why I started thinking about `lldb-test` as a
>>> driver for testing: I found the testsuite to be a little
>>> inconsistent/brittle depending on the environment it's run on (which,
>>> FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
>>> in lldb). I'm not claiming that switching to a different method
>>> would improve the situation, but it's worth a shot.
>>> 
>> 
>> Despite Zachary's claims, I do not believe this is caused by the test
>> driver (dotest). It's definitely not beautiful, but I haven't seen an
>> issue that would be caused by this in a long time. The issue is that
>> the tests are doing too much -- even the simplest involves compiling a
>> fully working executable, which pulls in a lot of stuff from the
>> environment (runtime libraries, dynamic linker, ...) that we have no
>> control of. And of course it makes it impossible to test the debugging
>> functionality of any other platform than what you currently have in
>> front of you.
>> 
>> In this sense, the current setup makes an excellent integration test
>> suite -- if you run the tests and they pass, you can be fairly
>> confident that the debugging on your system is set up correctly.
>> However, it makes a very bad regression test suite, as the tests will
>> be checking something different on each machine.
>> 
> 
> Yes, I wasn't complaining about "dotest" in general but, as you say, about
> the fact that it pulls in lots of stuff we don't really have control over.
> Also, most of the time I actually found that we'd been sloppy: watching
> bots fail for a while, or XFAILing tests instead of fixing them, which
> resulted in issues piling up. This is a more general problem, not
> necessarily tied to `dotest` as a driver.
> 
>> So I believe we need more lightweight tests, and lldb-test can provide
>> us with that. The main question for me (and that's something I don't
> 
> +1.
> 
>> really have an answer to) is how to make writing tests like that easy.
>> E.g. for these "foreign" language plugins, the only way to make a
>> self-contained regression test would be to check-in some dwarf which
>> mimics what the compiler in question would produce. But doing that is
>> extremely tedious as we don't have any tooling for that. Since debug
>> info is very central to what we do, having something like that would
>> go a long way towards improving the testing situation, and it would be
>> useful for C/C++ as well, as we generally need to make sure that we
>> work with a wide range of compiler versions, not just accept what ToT
>> clang happens to produce.
>> 
> 
> I think the plan here (and I'd love to spend some time on this once we
> have stability, which seems we're slowly getting) is that of enhancing
> `yaml2*` to do the work for us.
> I do agree it's a major undertaking, but even spending a month on it will
> go a long way, IMHO. I will try to come up with a plan after discussing
> with folks on my team (I'd also really love to get input from the DWARF
> people in llvm, e.g. Eric or David Blake).

The last time I looked into yaml2obj was to use it for testing llvm-dwarfdump 
and back then I concluded that it needs a lot of work to be useful even for 
testing dwarfdump. In the current state it is both too low-level (e.g., you 
need to manually specify all Mach-O load commands, you have to manually compute 
and specify the size of each debug info section) and too high-level (it can 
only auto-generate one exact version of .debug_info headers) to be useful.

If we could make a tool whose input roughly looks like the output of dwarfdump, 
then this might be a viable option. Note that I'm not talking about syntax but 
about the abstraction level of the contents.

In summary, I think this is an interesting direction to explore, but we 
shouldn't underestimate the amount of work involved.

Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Davide Italiano via lldb-dev
On Wed, Feb 7, 2018 at 9:32 AM, Pavel Labath  wrote:
> On 6 February 2018 at 15:51, Davide Italiano  wrote:
>>
>> FWIW, I strongly believe we should all agree on a configuration to run
>> tests and standardize on that.
>> It's unfortunate that we have two build systems, but there are plans
>> to move away from manually generating xcodebuild, as many agree it's a
>> terrible maintenance burden.
>> So, FWIW, I'll share my conf (I'm on High Sierra):
>>
>>
>> git clone https://github.com/monorepo
>> symlink clang -> tools
>> symlink lldb -> tools
>> symlink libcxx -> projects (this particular one has caused lots of
>> trouble for me in the past, and I realized it's undocumented :()
>>
>> cmake -GNinja -DCMAKE_BUILD_TYPE=Release ../llvm
>> ninja check-lldb
>>
> Right, so I tried following these instructions as precisely as I could.
>
> - The first thing that failed was the libc++ link step (missing
> -lcxxabi_shared).
>
> So, I added libcxxabi to the build, and tried again.
> Aaand, I have to say the situation is much better now: I got two
> unexpected successes and one timeout:
> UNEXPECTED SUCCESS: test_lldbmi_output_grammar
> (tools/lldb-mi/syntax/TestMiSyntax.py)
> UNEXPECTED SUCCESS: test_process_interrupt_dsym
> (functionalities/thread/state/TestThreadStates.py)
> TIMEOUT: test_breakpoint_doesnt_match_file_with_different_case_dwarf
> (functionalities/breakpoint/breakpoint_case_sensitivity/TestBreakpointCaseSensitivity.py)
>
> On the second run I got these results:
> FAIL: test_launch_in_terminal (functionalities/tty/TestTerminal.py)
> UNEXPECTED SUCCESS: test_lldbmi_output_grammar
> (tools/lldb-mi/syntax/TestMiSyntax.py)
> UNEXPECTED SUCCESS: test_process_interrupt_dwarf
> (functionalities/thread/state/TestThreadStates.py)
>
>
> So, checking out libc++ certainly helped a lot (this definitely needs to
> be documented somewhere). Of these, the MI test seems to be failing
> consistently. The rest appear to be flakes. I am attaching the logs
> from the second run, but there doesn't appear to be anything
> interesting there...

Terrific that we're making progress! I plan to take a look at the
`lldb-mi` failure soon, as I can reproduce it here fairly
consistently.

About the others, we've seen
functionalities/breakpoint/breakpoint_case_sensitivity/TestBreakpointCaseSensitivity.py
failing on the bots, and I think it might be due to a Spotlight issue
Adrian found (and fixed).
You might still have `.dSYM` bundles lying around from stale build directories.

To fix this, you need to wipe out all old build artifacts:

- Inside of the LLDB source tree:
 $ git clean -f -d

- Globally:
 $ find / -name a.out.dSYM -exec rm -rf \{} \;

This is a long shot, but it might help you.
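Scanning the whole filesystem from `/` is slow and fairly drastic. A narrower sketch, assuming all your stale bundles live under a single build tree (the throwaway directory below just stands in for it):

```shell
# Set up a throwaway directory standing in for a real build tree:
BUILD_DIR=$(mktemp -d)
mkdir -p "$BUILD_DIR/lldb/test/a.out.dSYM"
# Delete every stale .dSYM bundle beneath that tree only; -prune stops
# find from descending into bundles it is about to delete:
find "$BUILD_DIR" -name '*.dSYM' -type d -prune -exec rm -rf {} +
```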

--
Davide
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Pavel Labath via lldb-dev
On 6 February 2018 at 15:51, Davide Italiano  wrote:
>
> FWIW, I strongly believe we should all agree on a configuration to run
> tests and standardize on that.
> It's unfortunate that we have two build systems, but there are plans
> to move away from manually generating xcodebuild, as many agree it's a
> terrible maintenance burden.
> So, FWIW, I'll share my conf (I'm on High Sierra):
>
>
> git clone https://github.com/monorepo
> symlink clang -> tools
> symlink lldb -> tools
> symlink libcxx -> projects (this particular one has caused lots of
> trouble for me in the past, and I realized it's undocumented :()
>
> cmake -GNinja -DCMAKE_BUILD_TYPE=Release ../llvm
> ninja check-lldb
>
Right, so I tried following these instructions as precisely as I could.

- The first thing that failed was the libc++ link step (missing -lcxxabi_shared).

So, I added libcxxabi to the build, and tried again.
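For anyone reproducing this, the extra checkout mirrors the symlink scheme from the recipe quoted above; the exact paths are an assumption about your source layout:

```shell
# With libcxxabi checked out as a sibling of llvm, link it into
# llvm/projects alongside libcxx (the mkdir is a no-op in a real checkout):
mkdir -p llvm/projects
ln -sfn "$PWD/libcxxabi" llvm/projects/libcxxabi
```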
Aaand, I have to say the situation is much better now: I got two
unexpected successes and one timeout:
UNEXPECTED SUCCESS: test_lldbmi_output_grammar
(tools/lldb-mi/syntax/TestMiSyntax.py)
UNEXPECTED SUCCESS: test_process_interrupt_dsym
(functionalities/thread/state/TestThreadStates.py)
TIMEOUT: test_breakpoint_doesnt_match_file_with_different_case_dwarf
(functionalities/breakpoint/breakpoint_case_sensitivity/TestBreakpointCaseSensitivity.py)

On the second run I got these results:
FAIL: test_launch_in_terminal (functionalities/tty/TestTerminal.py)
UNEXPECTED SUCCESS: test_lldbmi_output_grammar
(tools/lldb-mi/syntax/TestMiSyntax.py)
UNEXPECTED SUCCESS: test_process_interrupt_dwarf
(functionalities/thread/state/TestThreadStates.py)


So, checking out libc++ certainly helped a lot (this definitely needs to
be documented somewhere). Of these, the MI test seems to be failing
consistently. The rest appear to be flakes. I am attaching the logs
from the second run, but there doesn't appear to be anything
interesting there...


Failure-LaunchInTerminalTestCase-test_launch_in_terminal.log
Description: Binary data


UnexpectedSuccess-MiSyntaxTestCase-test_lldbmi_output_grammar.log
Description: Binary data


UnexpectedSuccess-ThreadStateTestCase-test_process_interrupt_dwarf.log
Description: Binary data


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Davide Italiano via lldb-dev
On Wed, Feb 7, 2018 at 7:57 AM, Pavel Labath  wrote:
> On 7 February 2018 at 14:20, Zachary Turner  wrote:
>>
>> As someone who gave up on trying to set up a bot due to flakiness, I
>> have a different experience.
>
> I did not say it was easy to get to the present point, and I am
> certain that the situation is much harder on windows. But I believe
> this is due to reasons not related to the test runner (such as various
> posixisms spread throughout the codebase, and the fact that windows uses a
> completely different (i.e. less tested) code path for debugging).
>
> FWIW, we also have a windows bot running remote tests targeting
> android. It's not as stable as the one hosted on linux, but most of
> the issues I've seen there also do not point towards dotest.
>
>> Rust is based on llvm, so we have the tools necessary for that.  The
>> rest are still maybes and somedays, so we can cross that bridge when
>> (if) we come to it.
>
> I don't know enough about Rust to say whether that is true. If it uses
> llvm as a backend then I guess we could check in some Rust-generated
> IR to serve as a test case (but we'd still need to figure out what
> exactly to do with it).
>
> However, I would assert that even for C family languages a more
> low-level approach than "$CC -g" for generating debug info would be
> useful. People generally will not have their compiler and debugger
> versions in sync, so we need tests that check we handle debug info
> produced by older versions of clang (or gcc for that matter). And
> then, there are the tests to make sure we handle "almost valid" debug
> info gracefully...

This last category is really interesting (and, unfortunately, given our
current testing strategy, almost entirely untested).
I think the proper thing here is to have tooling that generates
broken debug info, much as yaml2obj can generate broken object files, and to
test with that.
lldb does a great deal of work trying to "recover", with a lot of heuristics,
when debug info is wrong but not too far off. To have better
control of this codepath, we need better testing for this case;
otherwise it will break (and we'll be forced to remove the codepath
entirely).
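As a sketch of what such tooling could start from: hand-written assembly whose `.debug_info` header lies about its own size, assembled with llvm-mc (the guard is there because llvm-mc may not be installed, and the lldb-test invocation at the end is a hypothetical usage, not a current interface):

```shell
# A .debug_info contribution whose unit_length is deliberately bogus:
cat > broken.s <<'EOF'
        .section .debug_info,"",@progbits
        .long   0xfffe    # unit_length: claims far more bytes than exist
        .short  4         # DWARF version
        .long   0         # debug_abbrev_offset
        .byte   8         # address_size
EOF
# Assemble it into an ELF object, if llvm-mc is available:
if command -v llvm-mc >/dev/null 2>&1; then
  llvm-mc -triple=x86_64-pc-linux -filetype=obj broken.s -o broken.o
fi
# A check of lldb's recovery heuristics might then look like:
#   ./bin/lldb-test symbols broken.o
```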

--
Davide


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Zachary Turner via lldb-dev
On Wed, Feb 7, 2018 at 2:38 AM Pavel Labath  wrote:

> On 6 February 2018 at 18:53, Zachary Turner  wrote:
> > I'm not claiming that it's definitely caused by dotest and that moving
> > away from dotest is going to fix all the problems.  Rather, I'm claiming
> > that dotest has an unknown amount of flakiness (which may be 0, but may be
> > large), and the alternative has a known amount of flakiness (which is very
> > close to, if not equal to 0).
>
> Well, it may be unknown to you, but as someone who has managed a bot
> running tests for a long time, I can tell you that it's pretty
> close to 0. Some tests still fail sometimes, but the failure rate is
> approximately at the same level as failures caused by the bot not
> being able to reach the svn server to fetch the sources.

As someone who gave up on trying to set up a bot due to flakiness, I have a
different experience.



>
> That said, I'm still in favor of replacing the test runner with lit. I
> just think it needs to be done with a steady hand.
>
>
> >> So I believe we need more lightweight tests, and lldb-test can provide
> >> us with that. The main question for me (and that's something I don't
> >> really have an answer to) is how to make writing tests like that easy.
> >> E.g. for these "foreign" language plugins, the only way to make a
> >> self-contained regression test would be to check-in some dwarf which
> >> mimics what the compiler in question would produce. But doing that is
> >> extremely tedious as we don't have any tooling for that.
> >
> >
> >  Most of these other language plugins are being removed anyway.  Which
> > language plugins are going to still remain that aren't some flavor of
> > C/C++?
>
> Well, right now we have another thread proposing the addition of a
> Rust plugin, and we will want to resurrect Java support sooner or
> later. Go/OCaml folks may want to do the same, if doing that will not
> involve them inventing a whole test framework.
>
> So, I'm not sure where you were heading with that question.


Rust is based on llvm, so we have the tools necessary for that.  The rest
are still maybes and somedays, so we can cross that bridge when (if) we
come to it.


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-07 Thread Pavel Labath via lldb-dev
On 6 February 2018 at 18:53, Zachary Turner  wrote:
> I'm not claiming that it's definitely caused by dotest and that moving away
> from dotest is going to fix all the problems.  Rather, I'm claiming that
> dotest has an unknown amount of flakiness (which may be 0, but may be
> large), and the alternative has a known amount of flakiness (which is very

Well, it may be unknown to you, but as someone who has managed a bot
running tests for a long time, I can tell you that it's pretty
close to 0. Some tests still fail sometimes, but the failure rate is
approximately at the same level as failures caused by the bot not
being able to reach the svn server to fetch the sources.

That said, I'm still in favor of replacing the test runner with lit. I
just think it needs to be done with a steady hand.


>> So I believe we need more lightweight tests, and lldb-test can provide
>> us with that. The main question for me (and that's something I don't
>> really have an answer to) is how to make writing tests like that easy.
>> E.g. for these "foreign" language plugins, the only way to make a
>> self-contained regression test would be to check-in some dwarf which
>> mimics what the compiler in question would produce. But doing that is
>> extremely tedious as we don't have any tooling for that.
>
>
>  Most of these other language plugins are being removed anyway.  Which
> language plugins are going to still remain that aren't some flavor of c/c++?

Well, right now we have another thread proposing the addition of a
Rust plugin, and we will want to resurrect Java support sooner or
later. Go/OCaml folks may want to do the same, if doing that will not
involve them inventing a whole test framework.

So, I'm not sure where you were heading with that question.

On 6 February 2018 at 18:53, Zachary Turner  wrote:
>
>
> On Tue, Feb 6, 2018 at 8:19 AM Pavel Labath via lldb-dev
>  wrote:
>>
>> On 6 February 2018 at 15:41, Davide Italiano 
>> wrote:
>> > On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
>> >> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
>> >>
>> >> So, I guess my question is: are you guys looking into making sure that
>> >> others are also able to reproduce the 0-fail+0-xpass state? I would
>> >> love to run the mac test suite locally, as I tend to touch a lot of
>> >> stuff that impacts all targets, but as it stands now, I have very
>> >> little confidence that the tests I am running reflect in any way the
>> >> results you will get when you run the test on your end.
>> >>
>> >> I am ready to supply any test logs or information you need if you want
>> >> to try to tackle this.
>> >>
>> >
>> > Yes, I'm definitely interested in making the testsuite
>> > work reliably on any configuration.
>> > I was afraid there were a lot of latent issues; that's why I sent this
>> > mail in the first place.
>> > It's also the reason why I started thinking about `lldb-test` as a
>> > driver for testing: I found the testsuite to be a little
>> > inconsistent/brittle depending on the environment it's run on (which,
>> > FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
>> > in lldb). I'm not claiming that switching to a different method
>> > would improve the situation, but it's worth a shot.
>> >
>>
>> Despite Zachary's claims, I do not believe this is caused by the test
>> driver (dotest). It's definitely not beautiful, but I haven't seen an
>> issue that would be caused by this in a long time. The issue is that
>> the tests are doing too much -- even the simplest involves compiling a
>> fully working executable, which pulls in a lot of stuff from the
>> environment (runtime libraries, dynamic linker, ...) that we have no
>> control of. And of course it makes it impossible to test the debugging
>> functionality of any other platform than what you currently have in
>> front of you.
>
> I'm not claiming that it's definitely caused by dotest and that moving away
> from dotest is going to fix all the problems.  Rather, I'm claiming that
> dotest has an unknown amount of flakiness (which may be 0, but may be
> large), and the alternative has a known amount of flakiness (which is very
> close to, if not equal to 0).  So we should do it because, among other
> benefits, it replaces an unknown with a known that is at least as good, if
> not better.
>
>
>>
>>
>> In this sense, the current setup makes an excellent integration test
>> suite -- if you run the tests and they pass, you can be fairly
>> confident that the debugging on your system is set up correctly.
>> However, it makes a very bad regression test suite, as the tests will
>> be checking something different on each machine.
>>
>> So I believe we need more lightweight tests, and lldb-test can provide
>> us with that. The main question for me (and that's something I don't
>> really have an answer to) is how to make writing tests like that easy.

Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Zachary Turner via lldb-dev
On Mon, Feb 5, 2018 at 8:12 PM Davide Italiano via lldb-dev <
lldb-dev@lists.llvm.org> wrote:

>
>
> Conclusions:
> The reliability of the suite (and the health of the codebase) is very
> important to us. If you have issues, let us know.
> In general, I'm looking for any kind of feedback, feel free to speak!
>
>
I think that the path forward is to massively expand test coverage in all
areas.  We need roughly 20x-30x the number of tests that we currently
have.  25,000-30,000 tests that run equally well on all platforms is a
good target to shoot for.

The goal of tests is, obviously, to increase test coverage by increasing
the amount of code that is tested.  Another way to increase test coverage
is to reduce the amount of code that isn't tested.  If you can delete an
untested branch then even if you don't add a test, you've increased test
coverage.  To that end, we should be looking to assert more liberally and
end the dubious practice of defensive programming.

On the subject of lldb-test.  I think the existing test suite serves its
purpose as an integration test suite well, and I would even say that it has
a reasonable amount of test coverage from what you could expect of an
integration test suite.  But what we need is a regression test suite.  I
don't think we should spend a significant amount of time adding new tests
to the integration test suite.  Its coverage is already decent.  I think
almost all new tests going forward should be very lightweight, targeted,
regression tests that do not depend on the host platform at all.  lldb-test
is the perfect vehicle for this kind of test.  It's still early and doesn't
do much, so we will need to continually add more functionality to lldb-test
as well to realize this goal, but I think we can rapidly expand test
coverage going this route.
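A regression test in that style could be a plain lit file driving lldb-test over a checked-in object; the subcommand and output lines below are assumptions about where lldb-test could go, not its current behavior:

```shell
# Hypothetical lit test: every input is checked in, nothing is compiled
# on the host, so the result is identical on every machine.
# RUN: lldb-test symbols %p/Inputs/foo.o | FileCheck %s
# CHECK: Module: foo.o
# CHECK: SymbolFile dwarf
```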

Finally, I think we should get buildbots running sanitized builds of LLDB
under the test suite.  For LLDB specifically I think TSan and UBSan would
add the most value, but long term I think we should get all sanitizers
enabled.
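Extending the cmake recipe shared earlier in the thread, a sanitized build is one flag away; LLVM_USE_SANITIZER is the standard LLVM CMake option (a configuration sketch, not something to paste blindly: sanitized builds are much slower):

```shell
cmake -GNinja -DCMAKE_BUILD_TYPE=Release \
      -DLLVM_USE_SANITIZER=Thread ../llvm   # or Undefined, Address
ninja check-lldb
```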


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Zachary Turner via lldb-dev
On Tue, Feb 6, 2018 at 8:19 AM Pavel Labath via lldb-dev <
lldb-dev@lists.llvm.org> wrote:

> On 6 February 2018 at 15:41, Davide Italiano 
> wrote:
> > On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
> >> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
> >>
> >> So, I guess my question is: are you guys looking into making sure that
> >> others are also able to reproduce the 0-fail+0-xpass state? I would
> >> love to run the mac test suite locally, as I tend to touch a lot of
> >> stuff that impacts all targets, but as it stands now, I have very
> >> little confidence that the tests I am running reflect in any way the
> >> results you will get when you run the test on your end.
> >>
> >> I am ready to supply any test logs or information you need if you want
> >> to try to tackle this.
> >>
> >
> > Yes, I'm definitely interested in making the testsuite
> > work reliably on any configuration.
> > I was afraid there were a lot of latent issues; that's why I sent this
> > mail in the first place.
> > It's also the reason why I started thinking about `lldb-test` as a
> > driver for testing: I found the testsuite to be a little
> > inconsistent/brittle depending on the environment it's run on (which,
> > FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
> > in lldb). I'm not claiming that switching to a different method
> > would improve the situation, but it's worth a shot.
> >
>
> Despite Zachary's claims, I do not believe this is caused by the test
> driver (dotest). It's definitely not beautiful, but I haven't seen an
> issue that would be caused by this in a long time. The issue is that
> the tests are doing too much -- even the simplest involves compiling a
> fully working executable, which pulls in a lot of stuff from the
> environment (runtime libraries, dynamic linker, ...) that we have no
> control of. And of course it makes it impossible to test the debugging
> functionality of any other platform than what you currently have in
> front of you.
>
I'm not claiming that it's definitely caused by dotest and that moving away
from dotest is going to fix all the problems.  Rather, I'm claiming that
dotest has an unknown amount of flakiness (which may be 0, but may be
large), and the alternative has a known amount of flakiness (which is very
close to, if not equal to 0).  So we should do it because, among other
benefits, it replaces an unknown with a known that is at least as good, if
not better.



>
> In this sense, the current setup makes an excellent integration test
> suite -- if you run the tests and they pass, you can be fairly
> confident that the debugging on your system is set up correctly.
> However, it makes a very bad regression test suite, as the tests will
> be checking something different on each machine.
>
> So I believe we need more lightweight tests, and lldb-test can provide
> us with that. The main question for me (and that's something I don't
> really have an answer to) is how to make writing tests like that easy.
> E.g. for these "foreign" language plugins, the only way to make a
> self-contained regression test would be to check-in some dwarf which
> mimics what the compiler in question would produce. But doing that is
> extremely tedious as we don't have any tooling for that.


 Most of these other language plugins are being removed anyway.  Which
language plugins are going to still remain that aren't some flavor of C/C++?


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Davide Italiano via lldb-dev
On Tue, Feb 6, 2018 at 8:18 AM, Pavel Labath  wrote:
> On 6 February 2018 at 15:41, Davide Italiano  wrote:
>> On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
>>> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
>>>
>>> So, I guess my question is: are you guys looking into making sure that
>>> others are also able to reproduce the 0-fail+0-xpass state? I would
>>> love to run the mac test suite locally, as I tend to touch a lot of
>>> stuff that impacts all targets, but as it stands now, I have very
>>> little confidence that the tests I am running reflect in any way the
>>> results you will get when you run the test on your end.
>>>
>>> I am ready to supply any test logs or information you need if you want
>>> to try to tackle this.
>>>
>>
>> Yes, I'm definitely interested in making the testsuite
>> work reliably on any configuration.
>> I was afraid there were a lot of latent issues; that's why I sent this
>> mail in the first place.
>> It's also the reason why I started thinking about `lldb-test` as a
>> driver for testing: I found the testsuite to be a little
>> inconsistent/brittle depending on the environment it's run on (which,
>> FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
>> in lldb). I'm not claiming that switching to a different method
>> would improve the situation, but it's worth a shot.
>>
>
> Despite Zachary's claims, I do not believe this is caused by the test
> driver (dotest). It's definitely not beautiful, but I haven't seen an
> issue that would be caused by this in a long time. The issue is that
> the tests are doing too much -- even the simplest involves compiling a
> fully working executable, which pulls in a lot of stuff from the
> environment (runtime libraries, dynamic linker, ...) that we have no
> control of. And of course it makes it impossible to test the debugging
> functionality of any other platform than what you currently have in
> front of you.
>
> In this sense, the current setup makes an excellent integration test
> suite -- if you run the tests and they pass, you can be fairly
> confident that the debugging on your system is set up correctly.
> However, it makes a very bad regression test suite, as the tests will
> be checking something different on each machine.
>

Yes, I wasn't complaining about "dotest" in general but, as you say, about
the fact that it pulls in lots of stuff we don't really have control over.
Also, most of the time I actually found that we'd been sloppy: watching
bots fail for a while, or XFAILing tests instead of fixing them, which
resulted in issues piling up. This is a more general problem, not
necessarily tied to `dotest` as a driver.

> So I believe we need more lightweight tests, and lldb-test can provide
> us with that. The main question for me (and that's something I don't

+1.

> really have an answer to) is how to make writing tests like that easy.
> E.g. for these "foreign" language plugins, the only way to make a
> self-contained regression test would be to check-in some dwarf which
> mimics what the compiler in question would produce. But doing that is
> extremely tedious as we don't have any tooling for that. Since debug
> info is very central to what we do, having something like that would
> go a long way towards improving the testing situation, and it would be
> useful for C/C++ as well, as we generally need to make sure that we
> work with a wide range of compiler versions, not just accept what ToT
> clang happens to produce.
>

I think the plan here (and I'd love to spend some time on this once we
have stability, which seems we're slowly getting) is that of enhancing
`yaml2*` to do the work for us.
I do agree it's a major undertaking, but even spending a month on it will
go a long way, IMHO. I will try to come up with a plan after discussing
with folks on my team (I'd also really love to get input from the DWARF
people in llvm, e.g. Eric or David Blake).
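For concreteness, here is roughly what driving yaml2obj looks like today for a debug-info section: the contents must be spelled as raw hex bytes, which is exactly the tedium an enhanced `yaml2*` would remove (guarded because yaml2obj may not be on PATH):

```shell
# Minimal ELF relocatable object with a hand-written .debug_info payload:
cat > min.yaml <<'EOF'
--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:
  - Name:    .debug_info
    Type:    SHT_PROGBITS
    Content: "DEADBEEF"    # raw bytes; a better tool would accept DIEs here
EOF
if command -v yaml2obj >/dev/null 2>&1; then
  yaml2obj min.yaml > min.o
fi
```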

>
> PS: I saw your second email as well. I'm going to try out what you
> propose, most likely tomorrow.

Thanks!

--
Davide


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Pavel Labath via lldb-dev
On 6 February 2018 at 15:41, Davide Italiano  wrote:
> On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
>> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
>>
>> So, I guess my question is: are you guys looking into making sure that
>> others are also able to reproduce the 0-fail+0-xpass state? I would
>> love to run the mac test suite locally, as I tend to touch a lot of
>> stuff that impacts all targets, but as it stands now, I have very
>> little confidence that the tests I am running reflect in any way the
>> results you will get when you run the test on your end.
>>
>> I am ready to supply any test logs or information you need if you want
>> to try to tackle this.
>>
>
> Yes, I'm definitely interested in making the testsuite
> work reliably on any configuration.
> I was afraid there were a lot of latent issues; that's why I sent this
> mail in the first place.
> It's also the reason why I started thinking about `lldb-test` as a
> driver for testing: I found the testsuite to be a little
> inconsistent/brittle depending on the environment it's run on (which,
> FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
> in lldb). I'm not claiming that switching to a different method
> would improve the situation, but it's worth a shot.
>

Despite Zachary's claims, I do not believe this is caused by the test
driver (dotest). It's definitely not beautiful, but I haven't seen an
issue caused by it in a long time. The issue is that the tests are
doing too much -- even the simplest test involves compiling a
fully working executable, which pulls in a lot of stuff from the
environment (runtime libraries, dynamic linker, ...) that we have no
control over. And of course it makes it impossible to test the debugging
functionality of any other platform than what you currently have in
front of you.

In this sense, the current setup makes an excellent integration test
suite -- if you run the tests and they pass, you can be fairly
confident that the debugging on your system is setup correctly.
However, it makes a very bad regression test suite, as the tests will
be checking something different on each machine.

So I believe we need more lightweight tests, and lldb-test can provide
us with that. The main question for me (and that's something I don't
really have an answer to) is how to make writing tests like that easy.
E.g. for these "foreign" language plugins, the only way to make a
self-contained regression test would be to check in some DWARF which
mimics what the compiler in question would produce. But doing that is
extremely tedious, as we don't have any tooling for it. Since debug
info is very central to what we do, having something like that would
go a long way towards improving the testing situation, and it would be
useful for C/C++ as well, as we generally need to make sure that we
work with a wide range of compiler versions, not just accept what ToT
clang happens to produce.
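To make that concrete, a checked-in DWARF test might look roughly like
the sketch below: hand-written assembly carrying the debug info,
assembled with llvm-mc and then inspected via lldb-test and FileCheck.
The exact lldb-test subcommand and its output format here are
hypothetical -- that is precisely the tooling we would still need to
build:

```asm
# RUN: llvm-mc -triple x86_64-pc-linux -filetype=obj %s -o %t.o
# RUN: lldb-test symbols %t.o | FileCheck %s
#
# CHECK: Variable{{.*}}name = "g"

        .section .debug_abbrev,"",@progbits
        # ... hand-written abbreviation table ...
        .section .debug_info,"",@progbits
        # ... hand-written DIEs describing a global variable "g", fixed
        # once and for all, independent of the host compiler ...
```

The point being that such a test would check the exact same DWARF bytes
on every machine, instead of whatever the local toolchain emits.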


PS: I saw your second email as well. I'm going to try out what you
propose, most likely tomorrow.


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Davide Italiano via lldb-dev
On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
>  wrote:
>> Hi,
>> in the last couple of months a lot of people have put a lot of attention
>> and energy into lldb, and we're starting to see the first results. I
>> decided to sit down and write this e-mail to state where we are and
>> what some possible future directions for the project are, in terms of
>> better quality/higher testability.
>>
>> Current state:
>>
>> 1) We got the testsuite on macOS to build with zero unexpected
>> successes and zero failures (modulo one change I'm going to push
>> tomorrow). This is a collaborative effort and it's very important
>> because it allows us to push for unexpected successes as failures on
>> the bots, allowing us to discover issues quicker. Other platforms, I
>> think, are improving their state as well, mainly thanks to the work of
>> Pavel and Jan.
>
> I don't mean to belittle this statement, as I think the situation has
> definitely improved a lot lately, but I feel I have to point out
> that I've never been able to get a clean test suite run on a mac
> (not even the "0 failures" kind of clean). I'm not sure what these are
> caused by, but I guess that's because the tests are still very much
> dependent on the environment. So, I have to ask: what kind of
> environment are you running those tests in?
>
> My machine is not a completely off-the-shelf mac, as it has some
> google-specific customizations. I don't really know what this
> encompasses, but I would be surprised if these impact the result of
> the test suite. If I had to bet, my money would be on your machines
> having some apple-specific stuff which is not available in regular
> macs.
>
> I tried an experiment today. I configured with: cmake ../../llvm
> -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -GNinja. First
> problem I ran into was that I couldn't run check-lldb, as the clang I
> have just built was unable to compile any of the test binaries due to
> missing headers (this could be a manifestation of the SDKROOT issue we
> ran into a couple of weeks ago). So, I tried running with the system
> compiler and I got this output:
>
> FAIL: test_c_global_variables_dwarf
> (lang/c/global_variables/TestGlobalVariables.py)
> FAIL: test_c_global_variables_gmodules
> (lang/c/global_variables/TestGlobalVariables.py)
> FAIL: test_dsym (functionalities/ubsan/basic/TestUbsanBasic.py)
> FAIL: test_dwarf (functionalities/ubsan/basic/TestUbsanBasic.py)
> FAIL: test_gmodules (functionalities/ubsan/basic/TestUbsanBasic.py)
> FAIL: test_with_python_api_dsym (lang/cpp/class_static/TestStaticVariables.py)
> FAIL: test_with_python_api_dwarf 
> (lang/cpp/class_static/TestStaticVariables.py)
> FAIL: test_with_python_api_gmodules
> (lang/cpp/class_static/TestStaticVariables.py)
> ERROR: test_debug_info_for_apple_types_dsym
> (macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
> ERROR: test_debug_info_for_apple_types_dwarf
> (macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
> ERROR: test_debug_info_for_apple_types_gmodules
> (macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
> UNEXPECTED SUCCESS: test_lldbmi_output_grammar
> (tools/lldb-mi/syntax/TestMiSyntax.py)
> UNEXPECTED SUCCESS: test_process_interrupt_dsym
> (functionalities/thread/state/TestThreadStates.py)
> UNEXPECTED SUCCESS: test_process_interrupt_gmodules
> (functionalities/thread/state/TestThreadStates.py)
>

FWIW, I strongly believe we should all agree on a configuration in
which to run the tests, and standardize on that.
It's unfortunate that we have two build systems, but there are plans
to move away from the manually maintained xcodebuild project, as many
agree it's a terrible maintenance burden.
So, FWIW, I'll share my conf (I'm on High Sierra):


git clone https://github.com/monorepo
symlink clang -> llvm/tools
symlink lldb -> llvm/tools
symlink libcxx -> llvm/projects (this particular one has caused lots of
trouble for me in the past, and I realized it's undocumented :()

cmake -GNinja -DCMAKE_BUILD_TYPE=Release ../llvm
ninja check-lldb

This *should* work just fine for every developer (and we should error
out if any of the dependencies are not in place). If it doesn't, well,
it's a bug.
Can you please try this and report all the bugs that you find?
I'll work with you to fix them, as I'm particularly interested in
making the lldb experience flawless out of the box for users (at least
on the platforms I work on :)

--
Davide


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Davide Italiano via lldb-dev
On Tue, Feb 6, 2018 at 7:09 AM, Pavel Labath  wrote:
> On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
>  wrote:
>> Hi,
>> in the last couple of months a lot of people have put a lot of attention
>> and energy into lldb, and we're starting to see the first results. I
>> decided to sit down and write this e-mail to state where we are and
>> what some possible future directions for the project are, in terms of
>> better quality/higher testability.
>>
>> Current state:
>>
>> 1) We got the testsuite on macOS to build with zero unexpected
>> successes and zero failures (modulo one change I'm going to push
>> tomorrow). This is a collaborative effort and it's very important
>> because it allows us to push for unexpected successes as failures on
>> the bots, allowing us to discover issues quicker. Other platforms, I
>> think, are improving their state as well, mainly thanks to the work of
>> Pavel and Jan.
>
> I don't mean to belittle this statement, as I think the situation has
> definitely improved a lot lately, but I feel I have to point out
> that I've never been able to get a clean test suite run on a mac
> (not even the "0 failures" kind of clean). I'm not sure what these are
> caused by, but I guess that's because the tests are still very much
> dependent on the environment. So, I have to ask: what kind of
> environment are you running those tests in?
>
> My machine is not a completely off-the-shelf mac, as it has some
> google-specific customizations. I don't really know what this
> encompasses, but I would be surprised if these impact the result of
> the test suite. If I had to bet, my money would be on your machines
> having some apple-specific stuff which is not available in regular
> macs.
>
> I tried an experiment today. I configured with: cmake ../../llvm
> -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -GNinja. First
> problem I ran into was that I couldn't run check-lldb, as the clang I
> have just built was unable to compile any of the test binaries due to
> missing headers (this could be a manifestation of the SDKROOT issue we
> ran into a couple of weeks ago). So, I tried running with the system
> compiler and I got this output:
>
> FAIL: test_c_global_variables_dwarf
> (lang/c/global_variables/TestGlobalVariables.py)
> FAIL: test_c_global_variables_gmodules
> (lang/c/global_variables/TestGlobalVariables.py)
> FAIL: test_dsym (functionalities/ubsan/basic/TestUbsanBasic.py)
> FAIL: test_dwarf (functionalities/ubsan/basic/TestUbsanBasic.py)
> FAIL: test_gmodules (functionalities/ubsan/basic/TestUbsanBasic.py)
> FAIL: test_with_python_api_dsym (lang/cpp/class_static/TestStaticVariables.py)
> FAIL: test_with_python_api_dwarf 
> (lang/cpp/class_static/TestStaticVariables.py)
> FAIL: test_with_python_api_gmodules
> (lang/cpp/class_static/TestStaticVariables.py)
> ERROR: test_debug_info_for_apple_types_dsym
> (macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
> ERROR: test_debug_info_for_apple_types_dwarf
> (macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
> ERROR: test_debug_info_for_apple_types_gmodules
> (macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
> UNEXPECTED SUCCESS: test_lldbmi_output_grammar
> (tools/lldb-mi/syntax/TestMiSyntax.py)
> UNEXPECTED SUCCESS: test_process_interrupt_dsym
> (functionalities/thread/state/TestThreadStates.py)
> UNEXPECTED SUCCESS: test_process_interrupt_gmodules
> (functionalities/thread/state/TestThreadStates.py)
>
> So, I guess my question is: are you guys looking into making sure that
> others are also able to reproduce the 0-fail+0-xpass state? I would
> love to run the mac test suite locally, as I tend to touch a lot of
> stuff that impacts all targets, but as it stands now, I have very
> little confidence that the tests I am running reflect in any way the
> results you will get when you run the test on your end.
>
> I am ready to supply any test logs or information you need if you want
> to try to tackle this.
>

Yes, I'm definitely interested in making the testsuite
work reliably in any configuration.
I was afraid there were a lot of latent issues, that's why I sent this
mail in the first place.
It's also the reason why I started thinking about `lldb-test` as a
driver for testing, because I found the testsuite to be a little
inconsistent/brittle depending on the environment it's run on (which,
FWIW, doesn't happen when you run lit/FileCheck or even the unit tests
in lldb). I'm not currently claiming switching to a different method
would improve the situation, but it's worth a shot.

>> 3) In the short term I plan to remove support for unmaintained
>> languages (Java/Go/Swift). This allows us to bring them back again (or
> I hope you meant OCaml instead of Swift. :P

Oh, yes, sigh.


Re: [lldb-dev] [RFC] Testsuite in lldb & possible future directions

2018-02-06 Thread Pavel Labath via lldb-dev
On 6 February 2018 at 04:11, Davide Italiano via lldb-dev
 wrote:
> Hi,
> in the last couple of months a lot of people have put a lot of attention
> and energy into lldb, and we're starting to see the first results. I
> decided to sit down and write this e-mail to state where we are and
> what some possible future directions for the project are, in terms of
> better quality/higher testability.
>
> Current state:
>
> 1) We got the testsuite on macOS to build with zero unexpected
> successes and zero failures (modulo one change I'm going to push
> tomorrow). This is a collaborative effort and it's very important
> because it allows us to push for unexpected successes as failures on
> the bots, allowing us to discover issues quicker. Other platforms, I
> think, are improving their state as well, mainly thanks to the work of
> Pavel and Jan.

I don't mean to belittle this statement, as I think the situation has
definitely improved a lot lately, but I feel I have to point out
that I've never been able to get a clean test suite run on a mac
(not even the "0 failures" kind of clean). I'm not sure what these are
caused by, but I guess that's because the tests are still very much
dependent on the environment. So, I have to ask: what kind of
environment are you running those tests in?

My machine is not a completely off-the-shelf mac, as it has some
google-specific customizations. I don't really know what this
encompasses, but I would be surprised if these impact the result of
the test suite. If I had to bet, my money would be on your machines
having some apple-specific stuff which is not available in regular
macs.

I tried an experiment today. I configured with: cmake ../../llvm
-DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -GNinja. First
problem I ran into was that I couldn't run check-lldb, as the clang I
have just built was unable to compile any of the test binaries due to
missing headers (this could be a manifestation of the SDKROOT issue we
ran into a couple of weeks ago). So, I tried running with the system
compiler and I got this output:

FAIL: test_c_global_variables_dwarf
(lang/c/global_variables/TestGlobalVariables.py)
FAIL: test_c_global_variables_gmodules
(lang/c/global_variables/TestGlobalVariables.py)
FAIL: test_dsym (functionalities/ubsan/basic/TestUbsanBasic.py)
FAIL: test_dwarf (functionalities/ubsan/basic/TestUbsanBasic.py)
FAIL: test_gmodules (functionalities/ubsan/basic/TestUbsanBasic.py)
FAIL: test_with_python_api_dsym (lang/cpp/class_static/TestStaticVariables.py)
FAIL: test_with_python_api_dwarf (lang/cpp/class_static/TestStaticVariables.py)
FAIL: test_with_python_api_gmodules
(lang/cpp/class_static/TestStaticVariables.py)
ERROR: test_debug_info_for_apple_types_dsym
(macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
ERROR: test_debug_info_for_apple_types_dwarf
(macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
ERROR: test_debug_info_for_apple_types_gmodules
(macosx/debug-info/apple_types/TestAppleTypesIsProduced.py)
UNEXPECTED SUCCESS: test_lldbmi_output_grammar
(tools/lldb-mi/syntax/TestMiSyntax.py)
UNEXPECTED SUCCESS: test_process_interrupt_dsym
(functionalities/thread/state/TestThreadStates.py)
UNEXPECTED SUCCESS: test_process_interrupt_gmodules
(functionalities/thread/state/TestThreadStates.py)

So, I guess my question is: are you guys looking into making sure that
others are also able to reproduce the 0-fail+0-xpass state? I would
love to run the mac test suite locally, as I tend to touch a lot of
stuff that impacts all targets, but as it stands now, I have very
little confidence that the tests I am running reflect in any way the
results you will get when you run the test on your end.

I am ready to supply any test logs or information you need if you want
to try to tackle this.

> 3) In the short term I plan to remove support for unmaintained
> languages (Java/Go/Swift). This allows us to bring them back again (or
I hope you meant OCaml instead of Swift. :P