Hello,

We currently index our data through a SQL-based DIH setup, but because our data
model (and therefore the SQL query) has become complex, we need to index our
data programmatically. Since we never had to deal with commit/optimise before,
we are now wondering whether there is a recommended approach. Is there a batch
size after which we should issue a commit, or should we commit only once after
indexing all of our data? And when, if at all, should we optimise?
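To make the question concrete, here is roughly the loop we have in mind. It is
only a minimal sketch assuming a recent SolrJ client; the URL, core name, field
names, batch size, and the synthetic record loop are placeholders rather than
our actual code:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        // URL and core name are placeholders.
        SolrClient client =
            new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
        final int BATCH_SIZE = 1000; // illustrative, not a recommendation
        List<SolrInputDocument> batch = new ArrayList<>();
        for (int id = 0; id < 4_000_000; id++) { // stand-in for our real record source
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", String.valueOf(id));
            doc.addField("title", "document " + id); // real fields come from our model
            batch.add(doc);
            if (batch.size() >= BATCH_SIZE) {
                client.add(batch); // send the batch to Solr
                batch.clear();
                // should there be a client.commit() here, per batch?
            }
        }
        if (!batch.isEmpty()) {
            client.add(batch); // flush the final partial batch
        }
        client.commit();   // ...or is a single commit at the end enough?
        client.optimize(); // and should we optimise here at all?
        client.close();
    }
}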

Our document corpus is > 4m documents, and through DIH the resulting index is
around 1.5GB.

We have searched previous posts but couldn't find a definitive answer. Any
input would be much appreciated!

Regards,
-- Savvas
