On 18 January 2013 15:04, ashimbose <ashimb...@gmail.com> wrote:
> Hi Gora,
>
> Thank you for your reply again,
>
> Joining is not possible in my case, because there is no relation between the
> tables. Is joining possible without any relation in Solr?

No, one needs some kind of a relationship to join.

> I am really going through a hard time. It is tough for me to work out an
> alternative process too, because I am very new to Solr.
>
> My requirement is that I need all text, varchar, or char type data from my
> data source sampleDB. It is a huge data source, possibly in the TB range, with
> more than 1000 tables.

This will be quite difficult to benchmark, architect, and set up,
especially if you are not experienced with Solr. Again, it might
be best to re-examine your assumptions. Could you describe
what you are trying to do, rather than jumping to a solution?
What kind of search requirements do you have that need data
from a thousand tables per Solr document? How is it that the
data tables have no relationships, where presumably the fields
put into each Solr document will have something in common?
What kind of queries are you planning to run?

Even if you are certain that you want to go down this route, it
would make sense to approach this in a phased manner. Become
familiar with Solr indexing using just a few tables, then extend
that to more tables. Benchmark, and use the results to plan the
sort of architecture you will need to build.

> Which one is the better solution?
> 1) Having multiple root entities and indexing each separately
> 2) Having multiple data-import requestHandlers
>
> Can you give me an example and the procedure to achieve those? What changes
> do I need to make?
[...]

You should really try and get familiar with the basics of Solr
indexing first, but here is a brief outline:
* Each Solr DIH configuration file has only one <document>
  tag, but can have multiple data sources, multiple root entities,
  and nested entities. In your case, you could use multiple
  data sources to spread the load over multiple databases,
  each holding a subset of the tables. It looks like you are
  already using multiple root entities, as all your entities are
  distinct. Thus, instead of importing all the entities at once,
  which is what /dataimport?command=full-import does, you
  could import them in batches, e.g.,
  /dataimport?command=full-import&entity=CUSTOMER
  would import only the CUSTOMER entity,
  /dataimport?command=full-import&entity=CUSTOMER&entity=SHOP
  would import the CUSTOMER and SHOP entities, and so on.

* I have never tried this, but one can set up multiple request handlers
  in solrconfig.xml, one for each DIH instance that one plans to run.
  These can run in parallel, unlike the root entities in a single DIH
  instance, which are indexed sequentially.
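
As a rough sketch of the batched approach in the first point, the
following Python builds one full-import URL per batch of root entities.
The base URL, the ORDERS entity, and the batch size are made-up values
for illustration; CUSTOMER and SHOP are the entities mentioned above.
One assumption to verify against your Solr version: as far as I know,
full-import defaults to clean=true, which empties the index before
importing, so every batch after the first is sent with clean=false.

```python
from urllib.parse import urlencode


def batches(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def full_import_url(base_url, entities, clean, handler="/dataimport"):
    """Build a DIH full-import URL restricted to the given root entities."""
    params = [("command", "full-import"),
              ("clean", "true" if clean else "false")]
    # Repeating the entity parameter selects several root entities at once.
    params += [("entity", name) for name in entities]
    return base_url.rstrip("/") + handler + "?" + urlencode(params)


def import_urls(base_url, entities, batch_size, handler="/dataimport"):
    """Return one URL per batch; only the first batch clears the index."""
    return [
        full_import_url(base_url, batch, clean=(index == 0), handler=handler)
        for index, batch in enumerate(batches(entities, batch_size))
    ]


if __name__ == "__main__":
    # Hypothetical core URL and entity list; substitute your own.
    for url in import_urls("http://localhost:8983/solr",
                           ["CUSTOMER", "SHOP", "ORDERS"], batch_size=2):
        print(url)
```

Each URL would be requested in turn, waiting for /dataimport?command=status
to report that the import has finished before sending the next one. For the
second point, the handler argument could name a different request handler
path, provided that handler is actually defined in solrconfig.xml.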

Regards,
Gora
