[ 
https://issues.apache.org/jira/browse/OAK-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535812#comment-16535812
 ] 

Vikas Saurabh commented on OAK-7495:
------------------------------------

I think I know what's going on (at least in one case). The issue seems to 
originate from a race: sometimes, between "observer adding a document to the 
queue" and "observer call stack updating and refreshing the index view", the 
queue processor gets called and writes the document to the index (and also 
marks it as processed, so the observer doesn't process the new doc anymore). 
But the queue processor then gets unscheduled (before marking the index as 
"requires refresh"), and the consumer hits the query while the doc is not yet 
visible in the reader.
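
To make the suspected interleaving concrete, here is a minimal single-threaded 
sketch that replays the steps in order. It is only a model under assumptions: 
the class, field and method names are hypothetical and are not Oak APIs, and it 
assumes the reader is reopened only when a "refresh required" flag is set.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Hypothetical model of the interleaving; none of these types are Oak APIs. */
final class SyncIndexRaceSketch {

    /** A queued document plus the "processed" marker mentioned above. */
    static final class Doc {
        final String id;
        boolean processed;
        Doc(String id) { this.id = id; }
    }

    /** Assumed index model: readers only see writes after a flagged refresh. */
    static final class Index {
        private final Set<String> written = new HashSet<>();
        private Set<String> readerView = new HashSet<>();
        private boolean refreshRequired;

        void write(String id) { written.add(id); }
        void markRefreshRequired() { refreshRequired = true; }

        boolean query(String id) {
            if (refreshRequired) {             // reopen the reader only when flagged
                readerView = new HashSet<>(written);
                refreshRequired = false;
            }
            return readerView.contains(id);
        }
    }

    /** Replays the suspected interleaving; returns what the consumer sees. */
    static boolean replayRace() {
        Index index = new Index();
        Queue<Doc> queue = new ArrayDeque<>();
        Doc doc = new Doc("job-1");

        queue.add(doc);                        // observer: add document to queue

        Doc polled = queue.poll();             // queue processor runs in between,
        index.write(polled.id);                // writes the document to the index
        polled.processed = true;               // and marks it as processed
        // queue processor is unscheduled here, before markRefreshRequired()

        if (!doc.processed) {                  // observer call stack: skipped,
            index.write(doc.id);               // since the doc is already processed
            index.markRefreshRequired();
        }

        boolean visible = index.query(doc.id); // consumer: queries a stale reader

        index.markRefreshRequired();           // queue processor resumes, too late
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("doc visible to consumer: " + replayRace());
    }
}
```

In this model the consumer's query returns false even though the document has 
already been written, because nobody has flagged the refresh yet.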

I've added a few attachments:
* [^OAK-7495-add-logs.patch] - adds a lot of logs which show how the calls happen
* [^OAK-7495-test.patch] - this is almost the same as the test in 
[^OAK-7495.demo.patch] with the deadlock fixed, removed sleeps and additional 
reference editor provider (to avoid continuous logs for the reference index)
* [^OAK-7495-potential-fix.patch] - a trivial fix which simply forces sync 
docs to not be pushed to the queue but instead to always be updated in sync 
with the save call stack (I think it can have an impact on save performance... 
but, as a counter-point, I think a "sync" index should do that anyway)
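
As a rough illustration of the idea behind the potential fix (not the contents 
of the attached patch), the sketch below updates the index and flags the 
refresh inside the save call itself, so no queue is involved. All names are 
hypothetical and the index is modelled as two in-memory sets.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model; not the contents of the attached patch. */
final class InlineSyncSketch {
    private final Set<String> written = new HashSet<>();
    private Set<String> readerView = new HashSet<>();
    private boolean refreshRequired;

    /** Save call stack: write and flag the refresh before save() returns. */
    void save(String docId) {
        written.add(docId);
        refreshRequired = true;
    }

    /** Consumer query: reopens the reader view only when a refresh is flagged. */
    boolean query(String docId) {
        if (refreshRequired) {
            readerView = new HashSet<>(written);
            refreshRequired = false;
        }
        return readerView.contains(docId);
    }

    /** Saves one document and queries for it right afterwards. */
    static boolean saveThenQuery() {
        InlineSyncSketch index = new InlineSyncSketch();
        index.save("job-1");
        return index.query("job-1");
    }

    public static void main(String[] args) {
        System.out.println("doc visible after save: " + saveThenQuery());
    }
}
```

The cost is that the index write now sits on the save path, which is the 
performance concern mentioned in the last bullet.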

All this said, I don't feel comfortable trying to maintain quite a complex 
sync implementation where hybridV2 does a repository-level sync view for 
specific properties (so forcing sync for only some properties and not all 
properties being handled by the index). I'd rather we deprecate the 
"async='sync'" type (of course, that's my personal view).

[~chetanm] would love to hear your thoughts on this.

[~egli], I'm still not completely sure how we might want to handle this 
issue, but maybe you can try [^OAK-7495-potential-fix.patch] to see if it 
meets the expectations of your use case better.

> async,sync index not synchronous
> --------------------------------
>
>                 Key: OAK-7495
>                 URL: https://issues.apache.org/jira/browse/OAK-7495
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.6.1
>            Reporter: Stefan Egli
>            Assignee: Vikas Saurabh
>            Priority: Major
>         Attachments: GetJobVerifier.java, OAK-7495-add-logs.patch, 
> OAK-7495-potential-fix.patch, OAK-7495-test.patch, OAK-7495.demo.patch, 
> slingeventJob.-1.tidy.json, unit-tests.log
>
>
> On an oak 1.6.1 (AEM 6.3) a suspicious behaviour was detected, where in Sling 
> an 
> [addJob|https://github.com/apache/sling-old-svn-mirror/blob/org.apache.sling.event-4.2.0/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L286]
>  followed by a 
> [getJobById|https://github.com/apache/sling-old-svn-mirror/blob/org.apache.sling.event-4.2.0/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L294]
>  (in a different thread though, but perhaps would also fail in same thread) 
> was not seeing the job that was just created.
> To give a bit more background, in Sling getJobById results in a query. That 
> query uses an index which is built using {{"async, sync"}}. So the assumption 
> is that the index is actually synchronous. But a test reproducing the 
> initially mentioned scenario showed the opposite.
> Attached:
> *  [^GetJobVerifier.java] a Sling job test case that has 2 threads: a thread 
> that does addJob and adds the resulting jobId to a list (synchronized), and a 
> second thread that reads the jobId off that list and does a getJobById. That 
> getJobById should find the job, as it was just created (how else could you 
> figure out the jobId) - but sometimes it FAILs (see system out FAIL)
> *  [^slingeventJob.-1.tidy.json] the index definition showing it is indeed 
> "async, sync"
> PS: Example query that is executed: 
> {{/jcr:root/var/eventing/jobs//element(*,slingevent:Job)[@slingevent:eventId 
> = '2018/5/11/2/12/bca505d9-3044-4de9-9732-056ab1b6c513_5569']}}
> /cc [~catholicon]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
