It would be interesting if the Tracehash author had a source of bug
reports identified as duplicates along with stack traces to see how
well it works in practice. At this point, it seems like it's just a
heuristic based on an opinion of what's important.
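One way such an evaluation could be run, as a toy sketch (the labeled pairs and the traceHash function here are hypothetical stand-ins, not the real Tracehash algorithm): given report pairs labeled as duplicates or not, check how often hash equality agrees with the label.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Toy evaluation harness for a trace-dedup hash: given pairs of stack
 * traces labeled as duplicates or not, measure how often hash equality
 * agrees with the label. traceHash is a crude stand-in (identity of the
 * top two frames), NOT the real Tracehash algorithm.
 */
public class DedupEval {
  /** Hypothetical stand-in hash: the top two frames, joined. */
  static String traceHash(List<String> frames) {
    return String.join("|", frames.subList(0, Math.min(2, frames.size())));
  }

  /** Fraction of labeled pairs where hash equality matches the label. */
  static double accuracy(List<String[]> pairs, List<Boolean> isDup) {
    int agree = 0;
    for (int i = 0; i < pairs.size(); i++) {
      boolean sameHash =
          traceHash(Arrays.asList(pairs.get(i)[0].split("\n")))
              .equals(traceHash(Arrays.asList(pairs.get(i)[1].split("\n"))));
      if (sameHash == isDup.get(i)) {
        agree++;
      }
    }
    return agree / (double) pairs.size();
  }
}
```

With a labeled corpus of duplicate bug reports, this would turn "an opinion of what's important" into a measurable number.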
--
Michael Mior
mm...@apache.org
On Fri.
What is the evidence that Tracehash actually works? In GeoHash there is a
notion of proximity, so it is clear that if two locations are within 10 miles
then there will be a maximum distance between their hashes. When Tracehash
removes part of the stack, is this based on a human expert’s
I could see how some might dismiss this as noise, but I really like the
idea of tracehash and it would be nice to see it catch on. (I think it
would be interesting if it could be structured something like a geohash
so truncation would reduce specificity, but it's less obvious how to do
this here.)
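For what it's worth, here is one hypothetical way such geohash-style truncation could be structured (not Tracehash's actual algorithm): let each stack frame, from the top down, contribute a short digest segment, so a prefix of the hash identifies only the topmost frames and truncating the hash reduces specificity.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * Sketch of a geohash-style trace hash (a hypothetical design, not the
 * real Tracehash): each frame, from the top of the stack down, appends a
 * two-hex-char digest of the frames seen so far. Traces sharing their top
 * frames therefore share a hash prefix, and truncating the hash keeps
 * only the topmost, least specific frames.
 */
public class PrefixTraceHash {
  static String hash(String[] frames) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      StringBuilder out = new StringBuilder();
      StringBuilder prefix = new StringBuilder();
      for (String frame : frames) {
        prefix.append(frame).append('\n');
        byte[] d = md5.digest(prefix.toString().getBytes(StandardCharsets.UTF_8));
        // Two hex chars per frame: coarse, but keeps the hash prefix-stable.
        out.append(String.format("%02x", d[0]));
      }
      return out.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // MD5 is always available in the JDK
    }
  }

  public static void main(String[] args) {
    String[] a = {"SqlParser.parseQuery", "Parser.parse", "Lexer.next"};
    String[] b = {"SqlParser.parseQuery", "Parser.parse", "Lexer.peek"};
    // Same first two frames => same first four hash characters.
    System.out.println(hash(a) + " vs " + hash(b));
  }
}
```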
Let me post a couple of links I've come across today (they come out of this
Twitter thread: https://twitter.com/backendsecret/status/1121290210464034816
):
https://github.com/alexknvl/fuzzball -- a machine-learning-driven fuzzer
for Scala which has identified quite a few bugs in the Scala compiler.
I was just suggesting that if there were a number of small bugs already
discovered, maybe one JIRA would be enough to cover them. I'm certainly
not suggesting that we have a JIRA that is continually updated. Perhaps
there should just be a separate JIRA for each issue found.
--
Michael Mior
Michael>Looks good to me! Should we maybe create a JIRA that we can point to
Michael>for those interested in fixing some bugs?
Frankly speaking I have no idea how that should work.
The identified expressions are new every time, so do you mean we should
have an "ever-opened JIRA issue that suggests
Looks good to me! Should we maybe create a JIRA that we can point to for
those interested in fixing some bugs?
--
Michael Mior
mm...@apache.org
On Wed, Sep 12, 2018 at 18:17, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:
> Let the fuzzing begin:
Let the fuzzing begin: https://github.com/apache/calcite/pull/830
I have not added it to the CalciteSuite since otherwise it would fail each
and every build.
On the other hand, it might be a good source of inspiration for newbie
contributions.
RexProgramFuzzyTest#testFuzzy produces lots of
My own personal machine could work, although I've found in the past it can
be a pain since I'm constantly installing and reconfiguring it for other
tasks, which makes it more likely that other things will break. But it's
certainly a possible option.
--
Michael Mior
mm...@apache.org
On Wed, Sep 12.
My intuition about fuzz testing is that since we are searching an
exponentially sized search space in random order, we will find 90% of the bugs
with the first 10% of the effort (or some similar power law). We should burn a
large amount of CPU on it when we first introduce it, and thereafter
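That intuition about diminishing returns can be illustrated with a toy model (the bug distribution is entirely hypothetical, with no connection to Calcite's actual bugs): suppose 1000 bugs with Zipf-like hit probabilities, hit by uniformly random trials, and count how many distinct bugs each budget finds.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/**
 * Toy model of diminishing returns in random fuzzing (illustration only):
 * BUGS hypothetical bugs where bug i is hit with weight 1/(i+1), i.e. a
 * Zipf-like distribution. Each trial hits one bug; we count how many
 * distinct bugs a given trial budget discovers.
 */
public class FuzzYield {
  static final int BUGS = 1000;

  static int distinctFound(int trials, long seed) {
    double[] cumulative = new double[BUGS];
    double total = 0;
    for (int i = 0; i < BUGS; i++) {
      total += 1.0 / (i + 1);        // bug i is hit with weight 1/(i+1)
      cumulative[i] = total;
    }
    Random rnd = new Random(seed);
    Set<Integer> found = new HashSet<>();
    for (int t = 0; t < trials; t++) {
      double r = rnd.nextDouble() * total;
      int lo = 0;
      int hi = BUGS - 1;
      while (lo < hi) {              // binary search: which bug did we hit?
        int mid = (lo + hi) >>> 1;
        if (cumulative[mid] < r) {
          lo = mid + 1;
        } else {
          hi = mid;
        }
      }
      found.add(lo);
    }
    return found.size();
  }

  public static void main(String[] args) {
    for (int trials : new int[] {100, 1000, 10000, 100000}) {
      System.out.println(trials + " trials -> "
          + distinctFound(trials, 42) + " distinct bugs");
    }
  }
}
```

Running it shows each 10x increase in trials yielding far less than 10x more bugs, which is the argument for a big up-front burn and a smaller steady-state budget.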
True, although 23 days over the lifetime of the project still isn't very
much. Definitely better than nothing though. If we take a bit of a hit in
CI runtime and catch some bugs, I'm for it :)
> It would be great, however we need to have a fuzzer first :)
My setup fuzzing the parser with afl
>My only concern is that it may be too short and unlikely to find any bugs
Remember: each time it starts from a random point.
Apache Jenkins / Calcite-Master has 800+ builds now.
Travis / Calcite has 2300+ builds now.
Just to clarify: the current Travis configuration is 4 jobs (Java 8, 9, 10, 11).
They
I certainly wouldn't be opposed to having a little bit of fuzz testing
incorporated into regular tests as a trial. My only concern is that it may
be too short and unlikely to find any bugs. But we could always have it run
only on CI in the future. It seems like we could also possibly request a VM
Michael>I wouldn't expect this to execute as part of the normal test suite
Note: we can include limited fuzzing in the regular test suite.
For instance: a single Travis job that performs fuzz-only tests for a
couple of minutes.
That would not increase test duration, however it could eventually
Thanks for the pointer to sqlsmith. I had seen it in the past but its
existence slipped from my memory. (I was also just looking for an excuse to
play around with afl.) Sounds like it might be helpful to have a test
harness to enable us to easily run sqlsmith (I wouldn't expect this to
execute as
Michael>Tests are running quite slowly, but I left it overnight and haven't
found any crashes
There's https://github.com/anse1/sqlsmith which is supposed to be used
against PostgreSQL.
However, we could configure Calcite to act as a proxy to PostgreSQL, so
sqlsmith would use Calcite as if it were
In a separate thread, the idea of fuzz testing was brought up. I decided
this could be a fun thing to play around with. I managed to get something
simple running with Kelinci (https://github.com/isstac/kelinci) that tests
for crashes in SqlParser using queries from the Quidem tests as the initial
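A minimal sketch of such a crash harness (all names here are hypothetical; Calcite's SqlParser is stubbed out to keep the sketch self-contained, and Kelinci's instrumentation is omitted): feed seed inputs to the parser, treat well-formed rejections of bad input as expected, and record anything else as a potential crash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Minimal crash harness in the spirit of the Kelinci/afl setup described
 * above. The parse method is a toy stand-in for Calcite's real
 * SqlParser; the real harness would call SqlParser instead.
 */
public class ParserFuzzHarness {
  /** Toy stand-in for the real parser. */
  static void parse(String sql) {
    if (sql == null || sql.isEmpty()) {
      throw new IllegalArgumentException("empty input"); // expected rejection
    }
    if (sql.contains("\0")) {
      throw new IllegalStateException("lexer confused"); // planted "crash"
    }
  }

  /** Returns inputs whose failures look like bugs rather than rejections. */
  static List<String> findCrashes(List<String> seeds) {
    List<String> crashes = new ArrayList<>();
    for (String seed : seeds) {
      try {
        parse(seed);
      } catch (IllegalArgumentException expected) {
        // Well-formed rejection of bad input: not a crash.
      } catch (RuntimeException crash) {
        crashes.add(seed);  // anything unexpected is a finding
      }
    }
    return crashes;
  }

  public static void main(String[] args) {
    List<String> seeds = Arrays.asList("select 1", "", "select \0 from t");
    System.out.println(findCrashes(seeds));
  }
}
```

In the afl/Kelinci setup, the fuzzer would mutate these seeds and keep the mutants that exercise new code paths; the harness's only job is to separate expected parse errors from genuine crashes.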