My guess is that Solr isn't doing much and DIH is
taking much more time to get the data from
the database than Solr is using to index it.
DIH can have some tricky bits to get it to be
fast. Among them are:

1> caching certain table results
2> batching the rows returned from the DB
3> ??? I'll leave that part to DIH experts, I
tend to prefer using a JDBC driver and using
SolrJ...
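
A minimal sketch of points 1> and 2> above, written in Python only to keep it self-contained (it is not DIH configuration and not SolrJ). The standard-library sqlite3 module stands in for the real database, and send_batch is a hypothetical callable standing in for whatever client sends documents to Solr. The table and column names (category, item) are assumptions made up for illustration.

```python
import sqlite3
from typing import Callable, Dict, List


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical schema, demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT, category_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO category VALUES (?, ?)", [(1, "books"), (2, "music")]
    )
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?)",
        [(i, f"item-{i}", 1 if i % 2 else 2) for i in range(1, 6)],
    )
    return conn


def index_in_batches(
    conn: sqlite3.Connection,
    send_batch: Callable[[List[Dict[str, str]]], None],
    batch_size: int = 1000,
) -> int:
    """Read rows in batches and hand each batch of documents to send_batch."""
    # 1> Cache the small lookup table once, instead of joining or
    #    issuing one extra query per row.
    categories = dict(conn.execute("SELECT id, name FROM category"))

    # 2> Pull the main table in fixed-size batches rather than row by row.
    cursor = conn.execute("SELECT id, title, category_id FROM item ORDER BY id")
    total = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        docs = [
            {"id": str(item_id), "title": title, "category": categories.get(cat_id)}
            for item_id, title, cat_id in rows
        ]
        send_batch(docs)  # in real use: one add request to Solr per batch
        total += len(docs)
    return total


if __name__ == "__main__":
    sent: List[List[Dict[str, str]]] = []
    count = index_in_batches(build_demo_db(), sent.append, batch_size=2)
    print(count, [len(batch) for batch in sent])
```

Timing the database loop with send_batch replaced by a no-op is one way to check whether the time is going into data acquisition rather than into Solr.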

Best,
Erick

On Sun, Mar 20, 2016 at 5:11 PM, Amit Jha <shanuu....@gmail.com> wrote:

> Hi All,
>
> In my case I am using DIH to index the data, and the query has two join
> statements. Indexing 70K documents takes 3-4 hours. Document size
> is around 10-20 KB. The DB is MSSQL, and I am using solr4.2.10 in cloud mode.
>
> Rgds
> AJ
>
> > On 21-Mar-2016, at 05:23, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> > In my experience, a majority of the time the bottleneck is in
> > the data acquisition, not the Solr indexing per se. Take a look
> > at the CPU utilization on Solr; if it's not running very heavy,
> > then you need to look upstream.
> >
> > You haven't told us anything about _how_ you're indexing
> > (SolrJ? DIH? Something from some other party?), so it's hard to
> > say much useful.
> >
> > You might review:
> >
> > http://wiki.apache.org/solr/UsingMailingLists
> >
> > Best,
> > Erick
> >
> > On Sun, Mar 20, 2016 at 3:31 PM, Nick Vasilyev <nick.vasily...@gmail.com>
> > wrote:
> >
> >> There can be a lot of factors, can you provide a bit of additional
> >> information to get started?
> >>
> >> - How many items are you indexing per second?
> >> - What does the indexing process look like?
> >> - How large is each item?
> >> - What hardware are you using?
> >> - How is your Solr set up? JVM memory, collection layout, etc...
> >> - What is your current commit frequency?
> >> - What is the query volume while you are indexing?
> >>
> >> On Sun, Mar 20, 2016 at 6:25 PM, fabigol <fabien.stou...@vialtis.com>
> >> wrote:
> >>
> >>> Hi,
> >>> I have a Solr project where I index from a PostgreSQL database.
> >>> Indexing takes a very long time.
> >>> How can I speed it up?
> >>> Can I modify autocommit in the solrconfig.xml file?
> >>> Does anyone have any ideas? I searched on Google but found little.
> >>> Please help me.
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> View this message in context:
> >>> http://lucene.472066.n3.nabble.com/How-fast-indexing-tp4264994.html
> >>> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
>
