Are you suggesting we build a local mini-collection and then mirror the
final result to the real one?

What about deletions? Per-entry cleanup statements and so on? DIH does full
updates, not just additions.

Or did I miss the focus?

Regards,
    Alex
On 15 Dec 2015 11:46 pm, "Erik Hatcher" <erik.hatc...@gmail.com> wrote:

> With time shaken loose, IMO ideally what we do (under
> https://issues.apache.org/jira/browse/SOLR-7188 probably) is create an
> update processor that *forwards* to a _real_ Solr collection update
> handler, and fire up EmbeddedSolrServer in a client-side command-line tool
> that can run /update/extract, DIH stuff, etc - does what it does now to
> extract, parse, and build documents and then forwards them via javabin to a
> live Solr collection.   I’m not sure that SOLR-7188 currently spells it out
> like that, but it is a nice, clean, straightforward path from DIH and Tika
> embedded inside a real Solr cluster to leveraging and scaling it on its
> own.   We’d lose the DIH admin UI, but that’s ok by me.
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
>
>
>
> > On Dec 15, 2015, at 9:23 AM, Davis, Daniel (NIH/NLM) [C] <
> daniel.da...@nih.gov> wrote:
> >
> > I am aware of the problems with the implementation of DIH, but is there
> any problem with the XML driven data import capability?
> > Could it be rewritten (using modern XPath) to run as a part of SolrJ?
> >
> > I've been interested in that, but I just haven't been able to shake
> loose the time.
> >
> > -----Original Message-----
> > From: Upayavira [mailto:u...@odoko.co.uk]
> > Sent: Tuesday, December 15, 2015 5:04 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Is DIH going to be removed from Solr future versions?
> >
> > I doubt DIH will be "removed". It more likely will be relegated - still
> there, but emphasised less.
> >
> > Another possibility that has been mooted is to extract it, so that it
> can run outside of Solr. This strikes me as the best option. Having it run
> inside Solr strikes me as architecturally wrong, and also problematic in a
> SolrCloud world. Taking the DIH codebase and running it
> > *outside* Solr you get the best of DIH without the same set of issues.
> >
> > Upayavira
> >
> > On Tue, Dec 15, 2015, at 05:47 AM, Anil Cherian wrote:
> >> Dear Team,
> >>
> >> I use DIH extensively and even wrote my own custom transformers in
> >> some situations.
> >> Recently, during an architecture discussion, one of my team members told
> >> me that Solr is going to remove DIH from its future versions.
> >>
> >> Is that true?
> >>
> >> Also, is using DIH for, say, 2 or 3 million docs a good option for
> >> indexing an XML content data set? I am planning to use it either by
> >> calling separate entities in parallel or by defining multiple
> >> /dataimport handlers in solrconfig.xml.
> >>
> >> Could you please reply at your earliest convenience, as it is an
> >> important decision for us whether to continue with DIH or not!
> >>
> >> Thanks and Rgds,
> >> Anil.
>
>
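
For readers following the thread, here is a minimal Python sketch of the
client-side flow discussed above: parse XML outside Solr, build documents,
and forward them to a live collection's update handler. It is an
illustration under assumptions, not code from DIH or SOLR-7188. It swaps in
Solr's JSON update format over HTTP where Erik mentions javabin, and plain
Python where he mentions EmbeddedSolrServer. The function names, the XML
layout, the base URL, and the collection name are all hypothetical.
Deletions, which Alex asks about, appear only as an explicit delete-by-ID
command; per-entity cleanup logic is not modelled.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def extract_docs(xml_text, record_path, field_paths):
    """Build one dict per record element.

    field_paths maps a Solr field name to a path relative to the record
    (ElementTree supports only a limited XPath subset).
    """
    root = ET.fromstring(xml_text)
    docs = []
    for record in root.findall(record_path):
        doc = {}
        for field, path in field_paths.items():
            value = record.findtext(path)
            if value is not None:  # skip fields missing from this record
                doc[field] = value.strip()
        docs.append(doc)
    return docs


def add_payload(docs):
    """JSON body that adds (or overwrites) the given documents."""
    return json.dumps(docs).encode("utf-8")


def delete_payload(ids):
    """JSON body that deletes documents by unique key."""
    return json.dumps({"delete": list(ids)}).encode("utf-8")


def post_update(base_url, collection, payload, commit=False):
    """POST a payload to the collection's /update handler (needs a live Solr)."""
    url = f"{base_url}/solr/{collection}/update"
    if commit:
        url += "?commit=true"
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A run against a hypothetical collection might call
extract_docs(xml_text, "record", {"id": "id", "title": "title"}), send the
result with post_update("http://localhost:8983", "mycollection",
add_payload(docs)), and then send delete_payload(stale_ids) with
commit=True. How stale_ids gets computed is exactly the open question
raised at the top of this message.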
