It should be supported in SolrJ; I'm surprised it's been lopped out.
Bulk indexing is extremely common.
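
For reference, the overwrite=false switch mentioned in the quoted message
below is an ordinary request parameter on the CSV update handler, so it can
still be reached over plain HTTP even when SolrJ does not expose it. A
minimal sketch of building such a request URL, using only the JDK. The
class name CsvUpdateUrl, the base URL, and the /update/csv path are
assumptions for illustration (the path depends on how the handler is
registered in solrconfig.xml), and this does not help with
EmbeddedSolrServer, which does not go through HTTP:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of SolrJ: builds the URL for posting CSV
// to Solr's CSV update handler with duplicate checking switched off.
public class CsvUpdateUrl {

    // solrBase is assumed to look like "http://localhost:8983/solr".
    // "/update/csv" is assumed to be where the CSV handler is registered.
    static String build(String solrBase, boolean overwrite, boolean commit) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        // overwrite=false tells Solr not to look up and replace existing
        // documents with the same unique key; only safe if keys are unique.
        params.put("overwrite", Boolean.toString(overwrite));
        params.put("commit", Boolean.toString(commit));

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + e.getValue());
        }

        // Drop one trailing slash so the path is not doubled up.
        String base = solrBase.endsWith("/")
                ? solrBase.substring(0, solrBase.length() - 1)
                : solrBase;
        return base + "/update/csv?" + query;
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", false, true));
    }
}
```

The CSV body would then be POSTed to that URL with any HTTP client. Whether
this is actually faster for a given index is exactly the open question in
the thread, so it would need to be measured.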

On Fri, Nov 4, 2011 at 1:16 PM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
> Hi list,
>
> I'm working on improving the performance of the Solr scheme for Cascading.
>
> This supports generating a Solr index as the output of a Hadoop job. We use 
> SolrJ to write the index locally (via EmbeddedSolrServer).
>
> There are mentions of using overwrite=false with the CSV request handler, as 
> a way of improving performance.
>
> I see that https://issues.apache.org/jira/browse/SOLR-653 removed this 
> support from SolrJ, because it was deemed too dangerous for mere mortals.
>
> My question is whether anyone knows just how much performance boost this 
> really provides.
>
> For Hadoop-based workflows, it's straightforward to ensure that the unique 
> key field is really unique. If the performance gain is significant, I 
> might look into figuring out some way (with a trigger lock) of re-enabling 
> this support in SolrJ.
>
> Thanks,
>
> -- Ken
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
