Try increasing the verbosity level. Setting it to 3 might help you pin down exactly why only a subset of the records is getting indexed.
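As a rough sketch, something like the following could be used to launch the rebuild at verbosity 3 from a script, so that a single-worker run and a multi-worker run can be compared side by side. The helper names `rebuild_index_argv` and `run_rebuild` are made up for illustration, and it assumes `manage.py` is in the current directory. The `--workers` and `--batch-size` flags are the ones I recall from haystack's management commands; please confirm them with `python manage.py rebuild_index --help` on your version.

```python
import subprocess
import sys


def rebuild_index_argv(workers=1, batch_size=1000, verbosity=3):
    """Build the argument list for a verbose rebuild_index run.

    Hypothetical helper: the flag names are assumed to match the
    installed haystack version (check with --help).
    """
    return [
        sys.executable, "manage.py", "rebuild_index",
        "--noinput",                    # skip the confirmation prompt
        "--verbosity", str(verbosity),  # 3 is Django's most detailed level
        "--workers", str(workers),
        "--batch-size", str(batch_size),
    ]


def run_rebuild(workers=1, batch_size=1000, verbosity=3):
    """Run the command in the current directory and return its exit code.

    Output goes straight to the terminal, so redirect it to a file
    if you want to compare runs afterwards.
    """
    return subprocess.call(rebuild_index_argv(workers, batch_size, verbosity))
```

For example, calling `run_rebuild(workers=1)` and then `run_rebuild(workers=4)` on the 1200-record subset and comparing the batch ranges each run logs should show which ranges are missing in the multi-worker case.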
Also try checking the `index_queryset` method of your `SearchIndex` subclasses.
Ref: https://django-haystack.readthedocs.io/en/master/searchindex_api.html#index-queryset

On Mon, Jun 19, 2017 at 11:57 PM, Purbasha <[email protected]> wrote:

> Hi,
>
> Need help in understanding how rebuild_index works with multiple workers.
> I am using django 1.8.4, Whoosh 2.7.4 and django-haystack 2.6.0 to build
> out a search functionality on a database of 5 million records. My
> environment is Ubuntu and MacOSx.
>
> When I use the multiple workers option, I am not getting the total 5M
> records in the index. I have tested this with a smaller subset of 1200
> records and found that I can only get all 1200 records into the index when
> I have one worker. I have tried with several different batch sizes and
> different number of workers and it is always the case where only a subset
> of records get indexed.
>
> Is this a known problem? I saw some issues reported on this topic in the
> Github repository but not sure if they have been resolved or not. When I
> run with multiple workers, the logs look fine and there are no errors
> around files getting locked or file not accessible which is something I
> would expect if multiple workers are trying to write into the file. I have
> allocated 150GB of space to the volume where indexed data is being stored
> and my server has 64 GB memory. So I am sure that this not due to lack of
> storage or lack of memory.
>
> I would really like to use the multiple workers option to cut down the
> indexing time to a few hours instead of 12-14 hours.
>
> Thank you,
> Purbasha
>
> --
> You received this message because you are subscribed to the Google Groups
> "django-haystack" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

--
Thanks,
Subhranath Chunder.
