[
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-13741:
----------------------------
Attachment: SOLR-13741.patch
Status: Open (was: Open)
Attaching my patch, note that at the moment this patch only modifies
{{AuditLoggerIntegrationTest}} and does not yet address the '#1' comment I made
above regarding the 'delay' option on {{CallbackAuditLoggerPlugin}} – there are
additional nocommit comments regarding the planned changes for that, but I
didn't want to start on those changes until these existing uncertainties were
addressed.
[~janhoy] : I would greatly appreciate your review here to help clear up the
"correct test, bad behavior" vs "correct behavior, bad test" questions.
> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> ------------------------------------------------------------------------------
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Assignee: Hoss Man
> Priority: Major
> Attachments: SOLR-13741.patch
>
>
> A while back i saw a weird non-reproducible failure from
> AuditLoggerIntegrationTest. When i started reading through that code, 2
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about
> CPU scheduling that aren't neccessarily going to be true (and also suffers
> from the problem that Thread.sleep isn't garunteed to sleep as long as you
> ask it too)
> # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by
> checking the size of a (List) {{buffer}} of recieved events in a sleep/poll
> loop, until it contains at least N items -- but the code that adds items to
> that buffer in the async Callback thread async _before_ the code that updates
> other state variables (like the global {{count}} and the patch specific
> {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3
> events added to the buffer, but calling {{assertEquals(3,
> receiver.getTotalCount())}} could subsequently fail because that variable
> hadn't been udpated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for
> that specific problem would be to update all other state _before_ adding the
> event to the buffer, I set out to try and make more general improvements to
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data
> structures
> * harden the assertions made about the expected events recieved (updating
> some test methods that currently just assert the number of events recieved)
> * add new assertions that _only_ the expected events are recieved.
> In the process of doing this, I've found several oddities/descrepencies
> between things the test currently claims/asserts, and what *actually* happens
> under more rigerous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and inlcudes
> copious nocommits about things seem suspect. the summary of these concerns
> is:
> * SolrException status codes that do not match what the existing test says
> they should (but doesn't assert)
> * extra AuditEvents occuring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected
> AuditEvent in the existing test -- which the current test seems to miss in
> it's assertions because it's picking up some extra events from triggered by
> previuos requests earlier in the test that just happen to also match the
> asserctions.
> ...it's not clear to me if the test logic is correct and these are "code
> bugs" or if the test is faulty.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]