Re: Calcite fuzz testing

2019-04-26 Thread Michael Mior
It would be interesting if the Tracehash author had a source of bug reports identified as duplicates, along with stack traces, to see how well it works in practice. At this point, it seems like it's just a heuristic based on an opinion of what's important. -- Michael Mior mm...@apache.org On Fri.

Re: Calcite fuzz testing

2019-04-26 Thread Julian Hyde
What is the evidence that Tracehash actually works? In GeoHash there is a notion of proximity, so it is clear that if two locations are within 10 miles then there will be a maximum distance between their hashes. When Tracehash removes part of the stack, is this based on a human expert’s

Re: Calcite fuzz testing

2019-04-26 Thread Michael Mior
I could see some might dismiss this as noise, but I really like the idea of tracehash and it would be nice to see that catch on. (I think it would be interesting if it could be structured something like a geohash so truncation would reduce specificity, but it's less obvious how to do this here.)
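
To make the geohash analogy concrete, here is a minimal sketch of one way a prefix-structured trace hash could work. It is not the Tracehash algorithm; the frame names, the `width` parameter, and every function name are hypothetical. Each frame contributes a fixed-width code, starting from the frame nearest the throw site, so cutting characters off the end of the hash drops the outermost callers first.

```python
import hashlib

def frame_code(frame, width=8):
    """Hash one stack frame to a fixed-width hex code (width is an arbitrary choice)."""
    return hashlib.sha1(frame.encode("utf-8")).hexdigest()[:width]

def prefix_trace_hash(frames, width=8):
    """Concatenate per-frame codes, innermost frame (nearest the throw site) first."""
    return "".join(frame_code(frame, width) for frame in frames)

def shared_frames(hash_a, hash_b, width=8):
    """Count how many leading frames two prefix hashes agree on."""
    count = 0
    for i in range(0, min(len(hash_a), len(hash_b)), width):
        if hash_a[i:i + width] != hash_b[i:i + width]:
            break
        count += 1
    return count

# Hypothetical stack traces: same failure site, different callers.
trace_a = ["RexSimplify.simplifyAnd", "RexSimplify.simplify", "FuzzTest.run"]
trace_b = ["RexSimplify.simplifyAnd", "RexSimplify.simplify", "OtherTest.run"]

hash_a = prefix_trace_hash(trace_a)
hash_b = prefix_trace_hash(trace_b)
print(hash_a, hash_b, shared_frames(hash_a, hash_b))
```

Truncating both hashes to the first two frame codes makes them equal, which is the "truncation reduces specificity" property. Whether the innermost frames are really the most significant ones for deduplication is exactly the kind of assumption that would need evidence.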

Re: Calcite fuzz testing

2019-04-26 Thread Vladimir Sitnikov
Let me post a couple of links I came across today (they come from this Twitter thread: https://twitter.com/backendsecret/status/1121290210464034816 ): https://github.com/alexknvl/fuzzball -- it is a machine-learning-driven fuzzer for Scala which identifies quite a few bugs in the Scala compiler.

Re: Calcite fuzz testing

2018-09-19 Thread Michael Mior
I was just suggesting that if there were a number of small bugs already discovered that maybe one JIRA would be enough to cover them. I'm certainly not suggesting that we have a JIRA that is continually updated. Perhaps there should just be a separate JIRA for each issue found. -- Michael Mior

Re: Calcite fuzz testing

2018-09-19 Thread Vladimir Sitnikov
Michael> Looks good to me! Should we maybe create a JIRA that we can point to for those interested in fixing some bugs? Frankly speaking, I have no idea how that should work. The identified expressions are new every time, so do you mean we should have "ever-opened JIRA issue that suggests

Re: Calcite fuzz testing

2018-09-17 Thread Michael Mior
Looks good to me! Should we maybe create a JIRA that we can point to for those interested in fixing some bugs? -- Michael Mior mm...@apache.org On Wed, Sept. 12, 2018 at 18:17, Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > Let the fuzzing begin:

Re: Calcite fuzz testing

2018-09-12 Thread Vladimir Sitnikov
Let the fuzzing begin: https://github.com/apache/calcite/pull/830 I have not added it to the CalciteSuite since otherwise it would fail each and every build. On the other hand, it might be a good source of inspiration for newbie contributions. RexProgrammFuzzyTest#testFuzzy produces lots of

Re: Calcite fuzz testing

2018-09-12 Thread Michael Mior
My own personal machine could work, although I've found in the past that it can be a pain, since I'm constantly installing and reconfiguring it for other tasks and it's more likely to cause other things to break. But it's certainly a possible option. -- Michael Mior mm...@apache.org On Wed, Sept. 12

Re: Calcite fuzz testing

2018-09-12 Thread Julian Hyde
My intuition about fuzz testing is that since we are searching an exponentially-sized search space in random order we will find 90% of the bugs with the first 10% of the effort (or some similar power law). We should burn a large amount of CPU on it when we first introduce it, and thereafter

Re: Calcite fuzz testing

2018-09-12 Thread Michael Mior
True, although 23 days over the lifetime of the project still isn't very much. Definitely better than nothing though. If we take a bit of a hit in CI runtime and catch some bugs, I'm for it :) > It would be great, however we need to have a fuzzer first :) My setup fuzzing the parser with afl

Re: Calcite fuzz testing

2018-09-12 Thread Vladimir Sitnikov
>My only concern is that may be too short and unlikely to find any bugs Remember: each time it starts from a random point. Apache Jenkins / Calcite-Master has 800+ builds now. Travis / Calcite has 2300+ builds now. Just to clarify: the current Travis configuration is 4 jobs (Java 8, 9, 10, 11). They

Re: Calcite fuzz testing

2018-09-12 Thread Michael Mior
I certainly wouldn't be opposed to having a little bit of fuzz testing incorporated into regular tests as a trial. My only concern is that it may be too short and unlikely to find any bugs. But we could always have it run only on CI in the future. It seems like we could also possibly request a VM

Re: Calcite fuzz testing

2018-09-12 Thread Vladimir Sitnikov
Michael>I wouldn't expect this to execute as part of the normal test suite Note: we can include limited fuzzing in the regular test suite. For instance: a single Travis job that performs fuzz-only tests for a couple of minutes. That would not increase test duration, however it could eventually
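
A time-boxed fuzz run of the kind described here might look like the sketch below. It is a toy under stated assumptions, not the Calcite fuzzer: the tiny boolean expression language, `simplify`, and `fuzz` are all invented for illustration. The loop stops at whichever comes first, the time budget or the case limit, and reports the seed of any failing case so it can be replayed.

```python
import itertools
import random
import time

VARS = ("x", "y")

def random_expr(rng, depth=3):
    """Generate a random boolean expression as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice([True, False, *VARS])
    op = rng.choice(["and", "or", "not"])
    if op == "not":
        return ("not", random_expr(rng, depth - 1))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, env):
    """Evaluate an expression for one assignment of the variables."""
    if isinstance(expr, bool):
        return expr
    if isinstance(expr, str):
        return env[expr]
    if expr[0] == "not":
        return not evaluate(expr[1], env)
    left, right = evaluate(expr[1], env), evaluate(expr[2], env)
    return (left and right) if expr[0] == "and" else (left or right)

def simplify(expr):
    """Toy simplifier under test: folds boolean constants."""
    if not isinstance(expr, tuple):
        return expr
    if expr[0] == "not":
        arg = simplify(expr[1])
        return (not arg) if isinstance(arg, bool) else ("not", arg)
    op, left, right = expr[0], simplify(expr[1]), simplify(expr[2])
    absorbing = (op == "or")  # TRUE absorbs OR, FALSE absorbs AND
    for a, b in ((left, right), (right, left)):
        if isinstance(a, bool):
            return a if a == absorbing else b
    return (op, left, right)

def fuzz(budget_seconds, max_cases, base_seed=None):
    """Check simplify() against evaluate() until the time or case budget runs out."""
    if base_seed is None:
        base_seed = random.randrange(2**32)  # a fresh starting point on every run
    deadline = time.monotonic() + budget_seconds
    envs = [dict(zip(VARS, vals))
            for vals in itertools.product([False, True], repeat=len(VARS))]
    cases = 0
    while cases < max_cases and time.monotonic() < deadline:
        seed = base_seed + cases
        expr = random_expr(random.Random(seed))
        simplified = simplify(expr)
        for env in envs:
            if evaluate(expr, env) != evaluate(simplified, env):
                return {"seed": seed, "expr": expr, "cases": cases + 1}
        cases += 1
    return {"seed": None, "expr": None, "cases": cases}

result = fuzz(budget_seconds=2.0, max_cases=200, base_seed=1)
print(result)
```

Leaving `base_seed` unset gives each build a different starting point, so many short runs cover different parts of the input space, while a reported seed lets a failure be reproduced with `fuzz(..., base_seed=seed)`.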

Re: Calcite fuzz testing

2018-09-10 Thread Michael Mior
Thanks for the pointer to sqlsmith. I had seen it in the past but its existence slipped from my memory. (I was also just looking for an excuse to play around with afl.) Sounds like it might be helpful to have a test harness to enable us to easily run sqlsmith (I wouldn't expect this to execute as

Re: Calcite fuzz testing

2018-09-10 Thread Vladimir Sitnikov
Michael>Tests are running quite slowly, but I left it overnight and haven't found any crashes There's https://github.com/anse1/sqlsmith which is supposed to be used against PostgreSQL. However, we could configure Calcite to be a proxy to PostgreSQL, so sqlsmith would use Calcite as if it were

Calcite fuzz testing

2018-09-10 Thread Michael Mior
In a separate thread, the idea of fuzz testing was brought up. I decided this could be a fun thing to play around with. I managed to get something simple running with Kelinci (https://github.com/isstac/kelinci) that tests for crashes in SqlParser using queries from the Quidem tests as the initial
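
As a rough illustration of how queries could be pulled out of Quidem scripts and written as seed files, here is a minimal sketch. It assumes a simplified script format (lines starting with `!` are commands, lines starting with `#` are comments, a statement ends at a line ending in `;`, and everything between a statement and the next command is expected output). The function names and the `seed_NNNN.sql` file naming are hypothetical, and this is not the setup used in the experiment.

```python
from pathlib import Path

def extract_queries(script_text):
    """Return the SQL statements found in a Quidem-style script (simplified format)."""
    queries, current, in_output = [], [], False
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("!"):
            in_output = False  # a command ends any expected-output section
            current = []
            continue
        if in_output or not stripped or stripped.startswith("#"):
            continue
        current.append(stripped)
        if stripped.endswith(";"):
            queries.append(" ".join(current).rstrip(";").strip())
            current = []
            in_output = True  # skip the expected result until the next command
    return queries

def write_corpus(queries, directory):
    """Write one seed file per query, the layout afl-style fuzzers read as input."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    for i, query in enumerate(queries):
        (out_dir / f"seed_{i:04d}.sql").write_text(query + "\n", encoding="utf-8")
    return len(queries)

SAMPLE = """\
!use scott
# simple filter
select ename
from emp
where deptno = 10;
+-------+
| ENAME |
+-------+
| CLARK |
+-------+
!ok
select count(*) from dept;
!ok
"""

print(extract_queries(SAMPLE))
```

Calling `write_corpus(extract_queries(text), "in_dir")` would then produce a directory that can be passed to the fuzzer as its initial input set.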