I'm interested in the time it is supposed to be processed, actually. I'm trying to recreate your example here to see if I can get more information.
Karl

On Thu, Sep 17, 2015 at 12:36 PM, Colreavy, Niall <[email protected]> wrote:

> The document is in a state of 'Processed' and the status is 'Ready for
> processing'
>
> -----Original Message-----
> From: Karl Wright [mailto:[email protected]]
> Sent: 17 September 2015 5:28
> To: dev
> Subject: Re: Potential Issue with pausing jobs
>
> When it is in the state after the job has resumed, can you do a Document
> Status report and tell me what that says for your document?
>
> Thanks,
> Karl
>
> On Thu, Sep 17, 2015 at 12:16 PM, Colreavy, Niall <[email protected]> wrote:
>
> > Hi Karl,
> >
> > Thanks for that. I think the problem might be more fundamental. When I
> > start my job and monitor the simple job history I can see the job doing
> > things like:
> >
> > Run the seed query
> > Run the data query
> > Run the seed query
> > Run the data query
> >
> > Etc.
> >
> > It continues to do this indefinitely from what I have observed. As soon
> > as I pause and resume the job, all I can see in the simple job history
> > is:
> >
> > Run the seed query
> > Run the seed query
> > Run the seed query
> >
> > It's like it's never going to run the data query again?
> >
> > Kind Regards,
> >
> > Niall
> >
> > -----Original Message-----
> > From: Karl Wright [mailto:[email protected]]
> > Sent: 17 September 2015 4:53
> > To: dev
> > Subject: Re: Potential Issue with pausing jobs
> >
> > Hi Niall,
> >
> > A continuous job reseeds on a schedule, which you set as part of the job
> > setup. For a continuous job, if the document has been crawled, it will
> > be recrawled again at a specific time in the future, and if at that time
> > it hasn't changed, it will be scheduled for checking again even further
> > out, up to a certain limit (also settable within the job).
> >
> > You can look at the document's schedule, by the way, using the "Document
> > Status" report, and it should be pretty clear from that what should
> > happen and when.
> >
> > When you abort the job and restart it, everything is reset, so the
> > document will be checked immediately at that point, and relatively
> > frequently for a while until the system figures out that the document
> > isn't changing very rapidly.
> >
> > Thanks,
> > Karl
> >
> > On Thu, Sep 17, 2015 at 11:38 AM, Colreavy, Niall <[email protected]> wrote:
> >
> > > Hi Karl,
> > >
> > > You'll have to forgive me if my answer is a bit uncertain but I am very
> > > new to MCF. Just to clarify, I have a very simple job. For the JDBC
> > > connector, I am literally just selecting 1 for the id, 'myurl' for the
> > > url and 'mydata' for the data. So there is only ever 1 document being
> > > processed.
> > >
> > > So to answer the questions:
> > >
> > > 1. There are 0 active documents on the queue.
> > > 2. Single process
> > > 3. Yes, this is a continuous crawl.
> > >
> > > Kind Regards,
> > >
> > > Niall
> > >
> > > -----Original Message-----
> > > From: Karl Wright [mailto:[email protected]]
> > > Sent: 17 September 2015 4:27
> > > To: dev
> > > Subject: Re: Potential Issue with pausing jobs
> > >
> > > Hi Niall,
> > >
> > > Pausing and resuming a job should have no effects *other* than
> > > reprioritization of the active documents on the queue, which if there
> > > are a lot of them, may take some time.
> > >
> > > So let's ask some basic questions. (1) How many active documents on
> > > your queue? (2) What kind of synchronization are you using? Is this
> > > single process, or multiprocess? (3) Is this a continuous crawl?
> > >
> > > >>>>>>
> > > And on a side note, what is the difference between pausing a job and
> > > aborting a job?
> > > <<<<<<
> > >
> > > I can't fully answer that unless I know the characteristics of your
> > > job, especially continuous crawl vs. crawl to completion.
> > >
> > > Karl
> > >
> > > On Thu, Sep 17, 2015 at 11:07 AM, Colreavy, Niall <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am experimenting with pausing a job. The job has a simple JDBC
> > > > connection and a null output connection. I notice that when I pause
> > > > the job, resume it, and monitor its progress in the simple history
> > > > report, the job never seems to run the data query any more. I can
> > > > see that it runs the seed query but it doesn't progress to the data
> > > > query. If I abort the job and restart it, it does seem to start
> > > > running the data query again.
> > > >
> > > > Can anyone explain this behaviour? And on a side note, what is the
> > > > difference between pausing a job and aborting a job?
> > > >
> > > > Thanks,
> > > >
> > > > Niall
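
For anyone reading this thread later: the rescheduling behaviour described above for continuous jobs (a crawled document is rechecked at a future time, and the gap widens each time it is found unchanged, up to a limit) can be illustrated with a small sketch. This is not ManifoldCF code. The function name, the doubling factor, and the interval values are all assumptions chosen for illustration; the real intervals come from the job's own settings.

```python
def reschedule(now, interval, changed,
               min_interval=1.0, max_interval=60.0, factor=2.0):
    """Return (new_interval, next_check_time) for one document.

    Hypothetical model only: times are in minutes, and the growth
    factor and bounds are made-up values, not ManifoldCF defaults.
    """
    if changed:
        # A changed document goes back to being checked frequently.
        new_interval = min_interval
    else:
        # An unchanged document is checked further out, up to a cap.
        new_interval = min(interval * factor, max_interval)
    return new_interval, now + new_interval


if __name__ == "__main__":
    # Simulate a document that never changes, as in the single-row
    # JDBC job from this thread.
    now, interval = 0.0, 1.0
    for _ in range(8):
        interval, next_check = reschedule(now, interval, changed=False)
        print(f"checked at {now:6.1f} min, next check in {interval:5.1f} min")
        now = next_check
```

Under this model a document that never changes is soon checked only at the capped interval, which would be consistent with seeing seed queries but no data queries for a while. Aborting and restarting corresponds to resetting the interval to its minimum.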
