You might want to consider just doing the whole thing in SolrJ with a JDBC connection. When things get complex, it's sometimes more straightforward.
Best
Erick

P.S. Yes, it's pretty standard to have a single field be the destination for several copyField directives.

On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty <g...@mimirtech.com> wrote:
> On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
> <brian.l...@journalexperts.com> wrote:
>> Hi all,
>>
>> I have a few questions about how the MySQL data import works. It seems it
>> creates a separate connection for each entity I create. Is there any way to
>> avoid this?
>
> Not sure, but I do not think that it is possible. However, from your
> description below, I think that you are unnecessarily multiplying entities.
>
>> By nature of my schema, I have several multivalued fields. Each one I
>> populate with a separate entity. Is there a better way to do it? For
>> example, could I pull in all the singular data in one sitting and then come
>> back in later and populate with the multivalued items.
>
> Not quite sure as to what you mean. Would it be possible for you
> to post your schema.xml, and the DIH configuration file? Preferably,
> put these on pastebin.com, and send us links. Also, you should
> obfuscate details like access passwords.
>
>> An alternate approach in some cases would be to do a GROUP_CONCAT and then
>> populate the multivalued column with some transformation. Is that possible?
> [...]
>
> This is how we have been handling it. A complete description would
> be long, but here is the gist of it:
> * A transformer will be needed. In this case, we found it easiest
>   to use a Java-based transformer. Thus, your entity should include
>   something like
>   <entity name="myname" dataSource="mysource"
>   transformer="com.mycompany.search.solr.handler.JobsNumericTransformer...>
>   ...
>   </entity>
>   Here, the class name to be used for the transformer attribute follows
>   the usual Java rules, and the .jar needs to be made available to Solr.
> * The SELECT statement for the entity looks something like
>   select group_concat( myfield SEPARATOR '@||@')...
>   The separator should be something that does not occur in your
>   normal data stream.
> * Within the entity, define
>   <field column="myfield"/>
> * There are complications involved if NULL values are allowed
>   for the field, in which case you would need to use COALESCE,
>   maybe along with CAST
> * The transformer would look up "myfield", split along the separator,
>   and populate the multi-valued field.
>
> This *is* a little complicated, so I would also like to hear about
> possible alternatives.
>
> Regards,
> Gora
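
For reference, the split-on-separator transformer Gora describes might look roughly like the sketch below. It is not the JobsNumericTransformer from the quoted message, whose source was not posted; the class name GroupConcatSplitTransformer is made up for illustration, and the column name "myfield" and separator "@||@" are taken from the example above. It relies on my understanding that DIH will accept a plain class exposing a public transformRow(Map<String, Object>) method, so it compiles without the Solr jars. The byte[] branch and the UTF-8 charset are assumptions about how the JDBC driver may return GROUP_CONCAT results.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical example class: splits a GROUP_CONCAT'ed column into a list
// so that DIH can populate a multivalued field from a single row.
public class GroupConcatSplitTransformer {

    // Column and separator from the quoted example; adjust to your own entity.
    static final String COLUMN = "myfield";
    static final String SEPARATOR = "@||@";

    // DIH passes each row as a map of column name to value.
    public Object transformRow(Map<String, Object> row) {
        Object raw = row.get(COLUMN);
        if (raw == null) {
            // NULL from the database: leave the row untouched.
            return row;
        }

        // Assumption: some drivers hand back GROUP_CONCAT output as bytes.
        String joined = (raw instanceof byte[])
                ? new String((byte[]) raw, StandardCharsets.UTF_8)
                : raw.toString();

        List<String> values = new ArrayList<>();
        // Pattern.quote is needed because '|' is a regex metacharacter.
        for (String part : joined.split(Pattern.quote(SEPARATOR))) {
            if (!part.isEmpty()) {
                values.add(part);
            }
        }

        // A List value is what lets one row feed several field values.
        row.put(COLUMN, values);
        return row;
    }
}
```

The entity would then name this class (fully qualified) in its transformer attribute, and the target field needs multiValued="true" in schema.xml.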