Really good point on the ids; I had completely overlooked that. I will give it a try. Thanks again.
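A rough sketch of the id collision described in the quoted reply below. It is a toy model, not Solr code: the index is represented as a dict keyed by the uniqueKey value, and the rows, the `s_`/`p_` prefixes, and the helper name `index_records` are all made up for illustration. It also assumes both databases number their rows from 1, which the thread does not confirm.

```python
# Toy model of an index: a dict keyed by the uniqueKey value.
# Adding a document whose key already exists overwrites the earlier one.

def index_records(index, records, prefix=""):
    """Add records to the index, optionally prefixing each id."""
    for record in records:
        doc_id = f"{prefix}{record['id']}"
        index[doc_id] = {**record, "id": doc_id}
    return index

# Hypothetical data: both databases number their rows from 1.
db_s = [{"id": i, "m_machine_serial": f"S-{i}"} for i in range(1, 6001)]
db_p = [{"id": i, "m_machine_serial": f"P-{i}"} for i in range(1, 1001)]

# Without prefixes, the 1k ids from the second database overwrite
# documents already indexed from the first.
plain = index_records(index_records({}, db_s), db_p)

# With a static per-source prefix, every id is globally unique.
prefixed = index_records(index_records({}, db_s, "s_"), db_p, "p_")

print(len(plain), len(prefixed))
```

Under these assumptions the unprefixed index ends up with 6000 documents and the prefixed one with 7000, so colliding ids alone would not account for the 2k total reported further down. Since the config uses the SQL Server driver, the prefix in the query would probably look like `'s_' + CAST(m.id AS VARCHAR(20)) AS id` rather than the MySQL-style `concat`/`CONVERT` call, and the uniqueKey field would need to be a string type in schema.xml.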

On Thu, Feb 16, 2012 at 5:00 PM, Dmitry Kan <dmitry....@gmail.com> wrote:

> Each document in SOLR will correspond to one db record, and since both
> databases have the same schema, you can't index two records from two
> databases into the same SOLR document.
>
> So after indexing, you should have 7k different documents, each of which
> holds data from a db record.
>
> Also, one problem I see here is that since the record id in each table is
> unique only within the table and (most probably) not globally, there will
> be collisions. To avoid this, I would prefix the record id with some static
> value, like: concat("t1", CONVERT(id, CHAR(8))).
>
> Dmitry
>
> On Thu, Feb 16, 2012 at 4:47 PM, Radu Toev <radut...@gmail.com> wrote:
>
> > I'm not sure I follow.
> > The idea is to have only one document. Do the multiple documents have the
> > same structure then (different datasources), and if so, how are they
> > actually indexed?
> >
> > Thanks.
> >
> > On Thu, Feb 16, 2012 at 4:40 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> >
> > > I think the problem here is that initially you are trying to create
> > > separate documents for two different tables, while your config is
> > > aiming to create only one document. Here is one solution (not tried
> > > by me):
> > >
> > > ------
> > > You can have multiple documents generated by the same data-config:
> > >
> > > <dataConfig>
> > > <dataSource name="ds1" .../>
> > > <dataSource name="ds2" .../>
> > > <dataSource name="ds3" .../>
> > > <document>
> > > <entity blah blah rootEntity="false">
> > > <entity blah blah this is a document>
> > > <entity sets unique id/>
> > > </document>
> > > <document blah blah this is another document>
> > > <entity sets unique id>
> > > </document>
> > > </document>
> > > </dataConfig>
> > >
> > > It's the 'rootEntity="false"' that makes the child entity a document.
> > > ------
> > >
> > > Dmitry
> > >
> > > On Thu, Feb 16, 2012 at 2:37 PM, Radu Toev <radut...@gmail.com> wrote:
> > >
> > > > <dataConfig>
> > > >   <dataSource
> > > >     name="s"
> > > >     driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
> > > >     url=""
> > > >     user=""
> > > >     password=""/>
> > > >   <dataSource
> > > >     name="p"
> > > >     driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
> > > >     url=""
> > > >     user=""
> > > >     password=""/>
> > > >   <document>
> > > >     <entity name="ms"
> > > >       datasource="s"
> > > >       query="SELECT m.id as id, m.serial as m_machine_serial,
> > > >         m.ivk as m_machine_ivk, m.sitename as m_sitename,
> > > >         m.deliveryDate as m_delivery_date, m.hotsite as m_hotsite,
> > > >         m.guardian as m_guardian, m.warranty as m_warranty,
> > > >         m.contract as m_contract,
> > > >         st.name as m_st_name, pm.name as m_pm_name, p.name as m_p_name,
> > > >         sv.shortName as m_sv_name, c.clusterMajor as m_c_cluster_major,
> > > >         c.clusterMinor as m_c_cluster_minor, c.country as m_c_country,
> > > >         c.code as m_c_code
> > > >         FROM Machine AS m
> > > >         LEFT JOIN SystemType AS st ON m.fk_systemType=st.id
> > > >         LEFT JOIN ProductModel AS pm ON fk_productModel = pm.id
> > > >         LEFT JOIN Platform AS p ON m.fk_platform = p.id
> > > >         LEFT JOIN SoftwareVersion AS sv ON fk_softwareVersion = sv.id
> > > >         LEFT JOIN Country AS c ON fk_country = c.id"
> > > >       readOnly="true"
> > > >       transformer="DateFormatTransformer">
> > > >       <field column="id" />
> > > >       <field column="m_machine_serial"/>
> > > >       <field column="m_machine_ivk"/>
> > > >       <field column="m_sitename"/>
> > > >       <filed column="m_delivery_date" dateTimeFormat="yyyy-MM-dd"/>
> > > >       <field column="m_hotsite"/>
> > > >       <field column="m_guardian"/>
> > > >       <field column="m_warranty"/>
> > > >       <field column="m_contract"/>
> > > >       <field column="m_st_name"/>
> > > >       <field column="m_pm_name"/>
> > > >       <field column="m_p_name"/>
> > > >       <field column="m_sv_name"/>
> > > >       <field column="m_c_cluster_major"/>
> > > >       <field column="m_c_cluster_minor"/>
> > > >       <field column="m_c_country"/>
> > > >       <field column="m_c_code"/>
> > > >     </entity>
> > > >
> > > >     <entity name="mp"
> > > >       datasource="p"
> > > >       query="SELECT m.id as id, m.serial as m_machine_serial,
> > > >         m.ivk as m_machine_ivk, m.sitename as m_sitename,
> > > >         m.deliveryDate as m_delivery_date, m.hotsite as m_hotsite,
> > > >         m.guardian as m_guardian, m.warranty as m_warranty,
> > > >         m.contract as m_contract,
> > > >         st.name as m_st_name, pm.name as m_pm_name, p.name as m_p_name,
> > > >         sv.shortName as m_sv_name, c.clusterMajor as m_c_cluster_major,
> > > >         c.clusterMinor as m_c_cluster_minor, c.country as m_c_country,
> > > >         c.code as m_c_code
> > > >         FROM Machine AS m
> > > >         LEFT JOIN SystemType AS st ON m.fk_systemType=st.id
> > > >         LEFT JOIN ProductModel AS pm ON fk_productModel = pm.id
> > > >         LEFT JOIN Platform AS p ON m.fk_platform = p.id
> > > >         LEFT JOIN SoftwareVersion AS sv ON fk_softwareVersion = sv.id
> > > >         LEFT JOIN Country AS c ON fk_country = c.id"
> > > >       readOnly="true"
> > > >       transformer="DateFormatTransformer">
> > > >       <field column="id" />
> > > >       <field column="m_machine_serial"/>
> > > >       <field column="m_machine_ivk"/>
> > > >       <field column="m_sitename"/>
> > > >       <filed column="m_delivery_date" dateTimeFormat="yyyy-MM-dd"/>
> > > >       <field column="m_hotsite"/>
> > > >       <field column="m_guardian"/>
> > > >       <field column="m_warranty"/>
> > > >       <field column="m_contract"/>
> > > >       <field column="m_st_name"/>
> > > >       <field column="m_pm_name"/>
> > > >       <field column="m_p_name"/>
> > > >       <field column="m_sv_name"/>
> > > >       <field column="m_c_cluster_major"/>
> > > >       <field column="m_c_cluster_minor"/>
> > > >       <field column="m_c_country"/>
> > > >       <field column="m_c_code"/>
> > > >     </entity>
> > > >   </document>
> > > > </dataConfig>
> > > >
> > > > I've removed the connection params.
> > > > The unique key is id.
> > > >
> > > > On Thu, Feb 16, 2012 at 2:27 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > >
> > > > > OK, maybe you can show the db-data-config.xml just in case?
> > > > > Also, in schema.xml, does your <uniqueKey> correspond to the unique
> > > > > field in the db?
> > > > >
> > > > > On Thu, Feb 16, 2012 at 2:13 PM, Radu Toev <radut...@gmail.com> wrote:
> > > > >
> > > > > > I tried running with just one datasource (the one that has 6k
> > > > > > entries) and it indexes them ok.
> > > > > > The same happens if I index the 1k database separately: it
> > > > > > indexes ok.
> > > > > >
> > > > > > On Thu, Feb 16, 2012 at 2:11 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > > > >
> > > > > > > It sounds a bit as if SOLR stopped processing data once it had
> > > > > > > queried everything from the smaller dataset. That's why you
> > > > > > > have 2000. If you just have a handler pointed to the bigger
> > > > > > > data set (6k), do you manage to get all 6k db entries into solr?
> > > > > > >
> > > > > > > On Thu, Feb 16, 2012 at 1:46 PM, Radu Toev <radut...@gmail.com> wrote:
> > > > > > >
> > > > > > > > 1. Nothing in the logs
> > > > > > > > 2. No.
> > > > > > > >
> > > > > > > > On Thu, Feb 16, 2012 at 12:44 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > 1. Do you see any errors / exceptions in the logs?
> > > > > > > > > 2. Could you have duplicates?
> > > > > > > > >
> > > > > > > > > On Thu, Feb 16, 2012 at 10:15 AM, Radu Toev <radut...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > I created a data-config.xml file where I define a
> > > > > > > > > > datasource and an entity with 12 fields.
> > > > > > > > > > In my use case I have 2 databases with the same schema,
> > > > > > > > > > so I want to combine the 2 databases in one index.
> > > > > > > > > > I defined a second dataSource tag and duplicated the
> > > > > > > > > > entity with its fields (changed the name and the
> > > > > > > > > > datasource).
> > > > > > > > > > What I'm expecting is to get around 7k results (I have
> > > > > > > > > > around 6k in the first db and 1k in the second). However,
> > > > > > > > > > I'm getting a total of 2k.
> > > > > > > > > > Where could the problem be?
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > Dmitry Kan
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > >
> > > > > > > Dmitry Kan
> > > > >
> > > > > --
> > > > > Regards,
> > > > >
> > > > > Dmitry Kan
> > >
> > > --
> > > Regards,
> > >
> > > Dmitry Kan
>
> --
> Regards,
>
> Dmitry Kan