Hi - the big source of bloat for hopcount processing is the delete
dependencies table, and the options provided allow you to not track those
at all.  The other tables (intrinsiclink and hopcount) are 1:1 with the
documents themselves, so these were not considered worth optimizing.
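
To check which of these tables is actually taking up space in your
PostgreSQL database, something like the following rough sketch may help.
It only builds a size query against pg_class. The names intrinsiclink and
hopcount are the ones used in this thread; hopdeletedeps is an assumed name
for the delete dependencies table, and table_size_query is a made-up helper
for illustration, not part of ManifoldCF.

```python
# Table names: "intrinsiclink" and "hopcount" come from this thread.
# "hopdeletedeps" is an ASSUMED name for the delete dependencies table;
# adjust it to whatever your schema actually uses.
HOP_TABLES = ("intrinsiclink", "hopcount", "hopdeletedeps")


def table_size_query(tables=HOP_TABLES):
    """Build a PostgreSQL query reporting total on-disk size per table.

    Returns (sql, params). The SQL uses %s placeholders, which is the
    parameter style of drivers such as psycopg2 (the driver choice is
    an assumption here).
    """
    placeholders = ", ".join(["%s"] * len(tables))
    sql = (
        "SELECT relname, pg_total_relation_size(oid) AS total_bytes "
        "FROM pg_class "
        "WHERE relkind = 'r' AND relname IN (" + placeholders + ") "
        "ORDER BY total_bytes DESC"
    )
    return sql, tuple(tables)


if __name__ == "__main__":
    sql, params = table_size_query()
    # With a live connection you would run: cursor.execute(sql, params)
    print(sql)
    print(params)
```

Running the resulting query before and after a crawl should show whether
the delete dependencies table or the two per-document tables dominate.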

It may be possible to introduce a fourth hopcount mode that does not record
any information in those tables - but since the mode can be changed on an
existing job, very careful analysis would be needed to figure out what
happens when someone flips that setting after a crawl has already been run.

Karl


On Thu, May 11, 2023 at 2:28 AM Mingchun Zhao <mingchun.zha...@gmail.com>
wrote:

> Hi Karl,
>
> Thank you for taking time out of your busy schedule to reply.
>
> > There is an option on the "hopcount" tab of your job to disable hopcount
>
> You mean setting "Hop count mode" to "keep unreachable documents,
> forever" in the "Hop Filters" tab?
> Yes, I did that; however, it seems that records were still inserted
> into the "intrinsiclink" and "hopcount" tables. Is there a way to tell
> MCF not to insert data into those tables, since operations on them can
> become a performance bottleneck when the tables bloat?
>
> Regards,
> Mingchun
>
> On Wed, May 10, 2023 at 19:53, Karl Wright <daddy...@gmail.com> wrote:
> >
> > There is an option on the "hopcount" tab of your job to disable hopcount
> > tracking entirely.
> > Karl
> >
> > On Tue, May 9, 2023 at 11:49 PM Mingchun Zhao <mingchun.zha...@gmail.com>
> > wrote:
> >
> > > Hi Karl,
> > >
> > > Could you please advise me on hopcount tracking?
> > > I'm using ManifoldCF 2.24 with PostgreSQL 12.14 as the database for
> > > now.
> > > In my case, I don't need to use the 'Hop Filters' feature so I'd like
> > > to disable tracking hopcount and reduce the insert/update/delete load
> > > on the 'intrinsiclink' and 'hopcount' tables. So I have two questions
> > > about this.
> > > First, is there an option to disable tracking hopcount?
> > > Second, if I disable tracking hopcount, can it affect other crawling
> > > processes?
> > >
> > > Thank you in advance.
> > > Kind regards,
> > > Mingchun
> > >
>
