RE: dataimport db-data-config.xml
Kishor,

Data Import Handler doesn't know how to randomly access rows from the CSV to "JOIN" them to rows from the MySQL table at indexing time. However, both MySQL and Solr know how to JOIN rows/documents from multiple tables/collections/cores. Data Import Handler could read the CSV first and query MySQL within that, but I don't think that's a great architecture, because it depends on the business requirements in a rather brittle way (more on this below). So, I see three basic architectures:

Use MySQL to do the JOIN:
--
- Your indexing isn't just DIH, but a script that first imports the CSV into a MySQL table, validating that each id in the CSV is found in the MySQL table.
- Your DIH config has either an <entity> for one SQL query that contains an <entity> for the other SQL query, or it has a single JOIN query / a query on a MySQL view.

This is ideal if:
- Your resources (including you) are more familiar with RDBMS technology than Solr.
- You have no business requirement to return rows from just the MySQL table or just the CSV as search results.
- The data is small enough that the processing time to import into MySQL each time you index is acceptable.

Use Solr to do the JOIN:
--
- Index all the rows from the CSV as documents within Solr.
- Index all the rows from the MySQL table as documents within Solr.
- Use JOIN queries to query them together.

This is ideal if:
- You don't control the MySQL database, and have no way at all to add a table to it.
- You have a business requirement to return either or both results from the MySQL table or the CSV.
- You want Solr JOIN queries on your Solr resume ;) Not a terribly good reason, I guess.

Use Data Import Handler to do the JOIN:
---
If you absolutely want to join the data using Data Import Handler, then:
- Have DIH loop through the CSV *first*, and then make queries based on the id into the MySQL table.
- In this case, the <entity> for the MySQL query will appear within the <entity> for the CSV row, which will appear within an <entity> for the CSV file within the filesystem.
- The <entity> for the CSV row would be the primary document entity.

This is only appropriate if:
- There is no business requirement to search for results directly from the MySQL table on its own.
- Your business requirements suggest one result for each row from the CSV, rather than from the MySQL table or either way.
- The CSV contains every id in the MySQL table, or the entries within the MySQL table that don't have anything from the CSV shouldn't appear in the results anyway.

-----Original Message-----
From: kishor [mailto:krajus...@gmail.com]
Sent: Friday, April 29, 2016 4:58 AM
To: solr-user@lucene.apache.org
Subject: dataimport db-data-config.xml

I want to import data from a MySQL table and a CSV file at the same time, because some data are in MySQL tables and some are in the CSV file. I want to match a specific id from the MySQL table in the CSV file and then add the data to Solr. That is what I want to do. Is this possible in Solr? Please suggest how to import data from CSV and a MySQL table at the same time.

--
View this message in context: http://lucene.472066.n3.nabble.com/dataimport-db-data-config-xml-tp4270673p4273614.html
Sent from the Solr - User mailing list archive at Nabble.com.
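For the third option, a minimal sketch of what such a config could look like (the file path, table name, column names, and regex below are hypothetical; the CSV is read line by line with LineEntityProcessor, and the inner entity queries MySQL once per CSV row):

```xml
<dataConfig>
  <!-- Two data sources: the CSV file on the filesystem and the MySQL database -->
  <dataSource name="fs" type="FileDataSource" encoding="UTF-8"/>
  <dataSource name="db" type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb" user="user" password="pass"/>
  <document>
    <!-- Outer entity: one document per CSV line; RegexTransformer splits the line -->
    <entity name="csv" processor="LineEntityProcessor" dataSource="fs"
            url="/path/to/data.csv" transformer="RegexTransformer">
      <!-- LineEntityProcessor emits each line in the rawLine column;
           the capture groups become the id and csv_value columns -->
      <field column="rawLine" regex="^([^,]+),(.*)$" groupNames="id,csv_value"/>
      <!-- Inner entity: look up the matching MySQL row for this CSV row's id -->
      <entity name="row" dataSource="db"
              query="SELECT name, price FROM products WHERE id = '${csv.id}'"/>
    </entity>
  </document>
</dataConfig>
```

Because the inner query runs once per CSV line, this scales poorly for large files, which is part of why the MySQL-side or Solr-side join architectures above are usually preferable.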
Re: dataimport db-data-config.xml
What are the errors reported? Errors can be seen either on the admin page's Logging tab or in the log file under solr_home. If you follow the steps mentioned in this blog precisely, it should almost work: http://solr.pl/en/2010/10/11/data-import-handler-%E2%80%93-how-to-import-data-from-sql-databases-part-1/ If you encounter errors at any step, let us know.

On Sat, Apr 16, 2016 at 10:49 AM, kishor <krajus...@gmail.com> wrote:
> I am trying to run two PostgreSQL queries on the same data source. Is this
> possible in db-data-config.xml?
>
> <dataSource url="jdbc:postgresql://0.0.0.0:5432/iboats"
>             user="iboats"
>             password="root" />
>
> <entity transformer="TemplateTransformer">
>   <field template="user1-${user1.id}"/>
> </entity>
>
> This code is not working. Please suggest more examples.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/dataimport-db-data-config-xml-tp4270673.html
> Sent from the Solr - User mailing list archive at Nabble.com.
dataimport db-data-config.xml
I am trying to run two PostgreSQL queries on the same data source. Is this possible in db-data-config.xml? This code is not working; please suggest more examples.

--
View this message in context: http://lucene.472066.n3.nabble.com/dataimport-db-data-config-xml-tp4270673.html
Sent from the Solr - User mailing list archive at Nabble.com.
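Two root-level entities sharing one data source is the usual way to index two queries from one config; a hedged sketch (the table names, fields, and id prefixes below are invented for illustration):

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
              url="jdbc:postgresql://0.0.0.0:5432/iboats"
              user="iboats" password="root"/>
  <document>
    <!-- Each root-level entity produces its own documents -->
    <entity name="user1" query="SELECT id, name FROM users"
            transformer="TemplateTransformer">
      <!-- Prefix the id so documents from the two entities cannot collide -->
      <field column="id" template="user1-${user1.id}"/>
    </entity>
    <entity name="boats" query="SELECT id, title FROM boats"
            transformer="TemplateTransformer">
      <field column="id" template="boats-${boats.id}"/>
    </entity>
  </document>
</dataConfig>
```

Prefixing the unique key matters because documents from both entities land in the same index, and a bare numeric id from one table would overwrite a document with the same id from the other.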
Re: db-data-config.xml ?
Did you find any other exceptions in the logs? When I pasted the script section of your data config into my test setup, I got an error saying that there is an unclosed string literal in line 6.

On Tue, Sep 3, 2013 at 12:23 AM, Kunzman, Doug <dkunz...@usgs.gov> wrote:

Hi - I'm new to Solr and am trying to combine a script: transformer and a RegexTransformer in a db-data-config.xml that is used to ingest data into Solr. Can anyone be of any help? There is definitely a comma between my script:add and RegexTransformer entries. Any help would be appreciated. My db-data-config.xml looks like this:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
      url="jdbc:postgresql://localhost:/test?netTimeoutForStreamingResults=24000"
      autoReconnect="true" user="postgres" password="" batchSize="10"
      responseBuffering="adaptive"/>
  <script><![CDATA[
    function add(row){
      var latlon_s = row.get('longitude')+','+row.get('latitude');
      var provider = row.get('provider');
      var pointPath = '/'+row.get('longitude')+','+row.get('latitude')+'/'+row.get('basis_of_record');
      if ('NatureServe'.equalsIgnoreCase(provider) || 'USDA PLANTS'.equalsIgnoreCase(provider)) {
        pointPath += '/centroid';
      }
      row.put('latlon_s', latlon_s);
      row.put('pointPath_s', pointPath);
      var provider_id = row.get('provider_id');
      var resource_id = row.get('resource_id_s');
      var hierarchy = row.get('hierarchy_string');
      row.put('hierarchy_homonym_string', '-' + hierarchy + '-');
      row.put('BISONResourceID', '/' + provider_id + '/' + resource_id + '/');
      return row;
    }
  ]]></script>
  <document name="itis_to_portal.occurence">
    <!-- entity name="occurrence" pk="id" transformer="script:add"
         query="select id, scientific_name, latitude, longitude, year, basis_of_record,
                provider_id, resource_id_s, occurrence_date, tsns, parent_tsn,
                hierarchy_string, collector, ambiguous, statecomputedfips,
                countycomputedfips from itis_to_portal.solr"
         transformer="RegexTransformer" -->
    <entity name="occurrence" pk="id"
        query="select id, scientific_name, latitude, longitude, year, basis_of_record,
               provider_id, resource_id_s, occurrence_date, tsns, parent_tsn,
               hierarchy_string, collector, ambiguous, statecomputedfips,
               countycomputedfips from itis_to_portal.solr"
        transformer="RegexTransformer,script:add"

and at runtime import I'm getting the following error message:

SEVERE: Full Import failed: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Could not invoke method :addRegexTransformer

Thanks, Doug

--
Regards,
Shalin Shekhar Mangar.
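A pared-down sketch of chaining the two transformers (the query and field are shortened from the config above; DIH applies transformers left to right, so script:add would run before RegexTransformer here). Note that the reported error string ":addRegexTransformer" looks like the two names ran together into one token, which is what happens when the comma between them is missing or swallowed:

```xml
<entity name="occurrence" pk="id"
        transformer="script:add,RegexTransformer"
        query="select id, latitude, longitude, provider from itis_to_portal.solr">
  <!-- latlon_s is produced by the script transformer's add() function -->
  <field column="latlon_s" name="latlon_s"/>
</entity>
```

Stripping the entity down to this minimal form, then re-adding pieces one at a time, is a quick way to isolate whether the failure comes from the transformer list or from the script body itself.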
Re: Configuring seperate db-data-config.xml per shard
Hi, we were able to accomplish this with a single collection.

ZooKeeper: create a separate node for each shard, and upload the dbconfig file under the shard's node, e.g.:
/config/config1/shard1
/config/config1/shard2
/config/config1/shard3

In solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">${dbconfig}</str>
  </lst>
</requestHandler>

In solr.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="localhost:2181">
  <cores defaultCoreName="core1" adminPath="/admin/cores"
         zkClientTimeout="${zkClientTimeout:15000}" host="${host:}"
         hostPort="9985" hostContext="${hostContext:}">
    <core loadOnStartup="true" instanceDir="core1" transient="false" name="core1">
      <property name="dbconfig" value="shard1/db-data-config.xml" />
    </core>
  </cores>
</solr>

This way you can configure the dbconfig file per shard.

Thanks, Sathish

--
View this message in context: http://lucene.472066.n3.nabble.com/Configuring-seperate-db-data-config-xml-per-shard-tp4068383p4068819.html
Sent from the Solr - User mailing list archive at Nabble.com.
Configuring seperate db-data-config.xml per shard
Hi, We have a setup with 3 shards in a collection, and each shard needs to load a different set of data. That is:
Shard1 will contain data only for Entity1
Shard2 will contain data only for Entity2
Shard3 will contain data only for Entity3
So in this case, the db-data-config.xml can't be the same for the three shards, so a single copy can't be uploaded to ZooKeeper. Is there any way we can maintain a db-data-config.xml inside each shard's folder and make our shards refer to this db-data-config.xml during data import, rather than looking for the file in ZooKeeper's repository?

Thanks in advance, Radha

--
View this message in context: http://lucene.472066.n3.nabble.com/Configuring-seperate-db-data-config-xml-per-shard-tp4068383.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Configuring seperate db-data-config.xml per shard
Might not be a solution, but I had asked a similar question before. Check out this thread: http://lucene.472066.n3.nabble.com/Is-there-a-way-to-load-multiple-schema-when-using-zookeeper-td4058358.html You can create multiple collections, and each collection can use a completely different set of configs. You can then have a cloud cluster that performs searches across multiple collections.

--
View this message in context: http://lucene.472066.n3.nabble.com/Configuring-seperate-db-data-config-xml-per-shard-tp4068383p4068466.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - db-data-config.xml general asking to entity
Two answers:

1) Do you have maybe user names or timestamps for the comments? Usually people want those also.

2) You can store the comments as one long string, or as multiple entries in a field. Your database should have a concatenate function that will take field X from multiple documents in a join and make one long string. I would concatenate the comments into one string with a magic separator, and then just split them up in my application. It is not simple to get multiple join results into one document. Here's how it works:

a) Collect all of the results of 'comment' into one long string and use a unique character to separate them. This is your one document from your query.
b) Use the RegexTransformer in the DIH to split the long string into several values. http://lucidworks.lucidimagination.com/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

On Sat, Oct 13, 2012 at 6:21 AM, Marvin <markus.pfeif...@ebcont-et.com> wrote:

Hi there! I have 2 tables, 'blog' and 'comment'. A blog can contain n comments (blog --1:n-- comment). Up to now I have used the following select to insert the data into the Solr index:

<entity name="blog" dataSource="mssqlDatasource" pk="id" transformer="ClobTransformer"
    query="SELECT b.id, b.market, b.title AS blogTitle, b.message AS blogMessage, c.message AS commentMessage
           FROM blog b LEFT JOIN comment c ON b.id = c.source_id AND c.source_type = 'blog'">
  <field column="blogMessage" name="blogMessage" clob="true" />
  <field column="commentMessage" name="commentMessage" clob="true" />
</entity>

The index result looks like:

<doc>
  <str name="id">1</str>
  <str name="market">12</str>
  <str name="title">blog of title 1</str>
  <str name="blogMessage">message of blog 1</str>
  <str name="commentMessage">message of comment</str>
</doc>
<doc>
  <str name="id">1</str>
  <str name="market">12</str>
  <str name="title">blog of title 1</str>
  <str name="blogMessage">message of blog 1</str>
  <str name="commentMessage">message of comment - Im the second comment</str>
</doc>

I would say this is stupid, because I get too much index data with the same blog; just the comments are different. Is it possible to set 'comments' as a 'subentity' like the following:

<entity name="blog" dataSource="mssqlDatasource" pk="id" transformer="ClobTransformer"
    query="SELECT b.id, b.market, b.title AS blogTitle, b.message AS blogMessage FROM blog b">
  <field column="blogMessage" name="blogMessage" clob="true" />
  <entity name="comment" dataSource="mssqlDatasource" pk="id" transformer="ClobTransformer"
      query="SELECT c.id, c.message AS commentMessage FROM comment c WHERE c.source_id = ${blog.id}">
    <field column="commentMessage" name="commentMessage" clob="true" />
  </entity>
</entity>

Is that possible? How would the result look (can't test it until Monday)? In all the examples I found, the sub-entity selects just 1 column, but I need at least 2.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-db-data-config-xml-general-asking-to-entity-tp4013533.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Lance Norskog
goks...@gmail.com
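Lance's concatenate-and-split suggestion could look roughly like this (SQL Server syntax is assumed since the thread uses an MSSQL data source; STRING_AGG needs SQL Server 2017+, older versions would need the FOR XML PATH trick, and the '|' separator is illustrative):

```xml
<entity name="blog" dataSource="mssqlDatasource" pk="id"
        transformer="RegexTransformer"
        query="SELECT b.id, b.title,
                      STRING_AGG(c.message, '|') AS commentMessage
               FROM blog b
               LEFT JOIN comment c ON b.id = c.source_id AND c.source_type = 'blog'
               GROUP BY b.id, b.title">
  <!-- splitBy turns the concatenated string into multiple values -->
  <field column="commentMessage" splitBy="\|"/>
</entity>
```

For this to work, commentMessage must be declared multiValued="true" in schema.xml, and the separator must be a character that cannot appear inside a comment.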
Re: Solr - db-data-config.xml general asking to entity
Thanks for the response! It's bad news that it isn't as simple as I hoped. Certainly I need names and a timestamp for the comment. Are there any problems if I add a timestamp to the one long string? Apart from this, can I add this one long string to the index?

Example:
table blog: id, author (foreign key: I want to join table user to get the first name and last name), title (needs to be in the index!), message (needs to be in the index), created (needs to be in the index)
table comment: id (int), author (foreign key: I want to join table user to get the first name and last name), message (needs to be in the index), created (timestamp)

The result should be:

<doc>
  <str name="id">1</str>
  <str name="authorFirstName">Jaime</str>
  <str name="authorLastName">Oliver</str>
  <str name="blogTitle">title of blog 1</str>
  <str name="blogMessage">message of blog 1</str>
  <str name="created">2007-11-09 T 11:10</str>
  <str name="comment">id|Walter|White|message of comment|2007-11-09 T 11:20</str>
</doc>

I really can do this with RegexTransformer? Can you give me a hint how that should look (my regex is really bad)?

Would the following be another choice: I add all blogs, and after that add blogs with comments, making the blog values not searchable in the comment documents. That would result in the following:

<doc>
  <str name="id">1</str>
  <str name="authorFirstName">Jaime</str>
  <str name="authorLastName">Oliver</str>
  <str name="blogTitle">title of blog 1</str>
  <str name="blogMessage">message of blog 1</str>
  <str name="created">2007-11-09 T 11:10</str>
</doc>
<doc>
  <str name="id">1</str>
  <str name="authorFirstName">Walter</str>
  <str name="authorLastName">White</str>
  <str name="commentMessage">message of comment</str>
  <str name="created">2007-11-09 T 11:20</str>
</doc>

So I can get blogs as results and comments (containing blog values). It's just important that the blog values in a comment document are not in the index, so I don't get a blog result twice. What's your opinion?
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-nested-entity-with-multiple-values-tp4013533p4013602.html Sent from the Solr - User mailing list archive at Nabble.com.
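The splitting step asked about above could be sketched with RegexTransformer capture groups (the separator, source column, and group names here are hypothetical; the regex assumes the concatenated field holds strings like 'id|first|last|message|created'):

```xml
<entity name="comment" dataSource="mssqlDatasource" transformer="RegexTransformer"
        query="SELECT comment_blob AS commentMessage FROM blog_comments">
  <!-- Capture groups map, in order, to the column names in groupNames -->
  <field column="commentMessage"
         regex="^([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|(.*)$"
         groupNames="commentId,authorFirstName,authorLastName,commentText,commentCreated"/>
</entity>
```

Embedding a timestamp in the string is fine as long as the timestamp format cannot contain the separator character; the fifth group simply captures whatever follows the last '|'.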
Re: JNDI in db-data-config.xml websphere
You have to use the exact JNDI name in db-data-config.xml, as unmanaged threads in WebSphere do not have access to the java:comp/env namespace. The resource name cannot be mapped to the WebSphere JDBC datasource name via a resource-reference definition in web.xml. Now using jndiName="jdbc/testdb" instead of jndiName="java:comp/env/jdbc/testdb", and also defining the WebSphere JDBC datasource as jdbc/testdb.

--
View this message in context: http://lucene.472066.n3.nabble.com/JNDI-in-db-data-config-xml-websphere-tp3884787p3896869.html
Sent from the Solr - User mailing list archive at Nabble.com.
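For reference, the working form from this resolution as a config fragment (the jdbc/testdb name comes from the message; the entity and its query are an invented placeholder):

```xml
<dataConfig>
  <!-- jndiName must match the WebSphere datasource name exactly,
       without the java:comp/env/ prefix -->
  <dataSource type="JdbcDataSource" jndiName="jdbc/testdb"/>
  <document>
    <entity name="item" query="SELECT id, name FROM item"/>
  </document>
</dataConfig>
```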
JNDI in db-data-config.xml websphere
I am trying to use the jndiName attribute in db-data-config.xml. This works great in Tomcat; however, I am having issues in WebSphere. The following exception is thrown:

Make sure that a J2EE application does not execute JNDI operations on java: names within static code blocks or in threads created by that J2EE application. Such code does not necessarily run on the thread of a server application request and therefore is not supported by JNDI operations on java: names. [Root exception is javax.naming.NameNotFoundException: Name comp/env/jdbc not found in context java:.]

It seems like WebSphere has issues accessing a JNDI resource from static code. Has anyone experienced this?

Thanks, Regards

--
View this message in context: http://lucene.472066.n3.nabble.com/JNDI-in-db-data-config-xml-websphere-tp3884787p3884787.html
Sent from the Solr - User mailing list archive at Nabble.com.
Specifing BatchSize parameter in db-data-config.xml will improve performance?
Hi, I am using Oracle Exadata as my DB. I want to index nearly 4 crore (40 million) rows. I have tried specifying batchSize as 1 and without specifying batchSize, but both tests take nearly the same time. Could anyone suggest the best way to index huge data quickly?

--
View this message in context: http://lucene.472066.n3.nabble.com/Specifing-BatchSize-parameter-in-db-data-config-xml-will-improve-performance-tp3588355p3588355.html
Sent from the Solr - User mailing list archive at Nabble.com.
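As a point of reference, batchSize on a JdbcDataSource is passed to the JDBC driver as the statement fetch size (rows fetched per network round trip), so a tiny value like 1 mainly risks making things slower, not faster; a sketch with a larger value (the host, SID, and credentials below are hypothetical):

```xml
<dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver"
            url="jdbc:oracle:thin:@dbhost:1521:ORCL"
            user="scott" password="tiger"
            batchSize="10000"/>
```

For tens of millions of rows, the fetch size is usually only one factor; flattening nested entities into a single JOIN query (to avoid one sub-query per parent row) tends to matter far more for total import time.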
Re: Problem using db-data-config.xml
On Thu, Jun 11, 2009 at 2:41 AM, jayakeerthi s <mail2keer...@gmail.com> wrote:

> As displayed above,
> <str name="Total Requests made to DataSource">3739</str>
> <str name="Total Rows Fetched">4135</str>
> <str name="Total Documents Processed">1402</str>
> differ. The requests made to the datasource keep increasing, and the
> documents processed are fewer than the rows fetched. Please advise if I am
> missing something here.

You have many entities, some of them nested within others. "Total Rows Fetched" is the count of rows retrieved for all entities, but only the root-level entities create documents.

--
Regards,
Shalin Shekhar Mangar.
Re: Problem using db-data-config.xml
Many thanks, Noble. The issue was with the case of the field names. After fixing that, I am getting the following response for the full data-import command:

<response>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">C:\apache-solr-nightly\example\example-DIH\solr\db\conf\db-data-config.xml</str>
    </lst>
  </lst>
  <str name="command">abort</str>
  <str name="status">busy</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Time Elapsed">0:3:55.861</str>
    <str name="Total Requests made to DataSource">3739</str>
    <str name="Total Rows Fetched">4135</str>
    <str name="Total Documents Processed">1402</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2009-06-10 13:54:22</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

As displayed above, "Total Requests made to DataSource" (3739), "Total Rows Fetched" (4135), and "Total Documents Processed" (1402) differ. The requests made to the datasource keep increasing, and the documents processed are fewer than the rows fetched. Please advise if I am missing something here. I have attached the db-data-config.xml after modifying it.

Thanks in advance, jayakeerthi

2009/6/9 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
> Are you sure prod_cd and reg_id are emitted by the respective entities
> under the same names? If not, you may need to alias those fields (using
> AS). Keep in mind, the field names are case sensitive. Just to see what
> values are emitted, use debug mode or use LogTransformer.
Problem using db-data-config.xml
Hi All, I am facing an issue while fetching records from the database when providing the value '${prod.prod_cd}' in the db-data-config.xml. It works fine if I provide the exact value of the product code, i.e. '302437-413'. Here is the db-data-config.xml I am using:

<dataConfig>
<dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver"
    url="jdbc:oracle:thin:@*:1521:" user="lslsls" password="***"/>
<document name="products">
<entity name="prod" pk="prod_id"
    query="SELECT p.prod_id, p.prod_cd, ps.styl_cd, p.colr_disp_cd, p.colr_comb_desc,
           p.div_id, p.po_grid_desc, p.silo_id, p.silh_id, psa.sport_acty_desc,
           pga.gndr_age_desc, psh.silh_desc, pso.silo_desc, od.org_lgcy_div_cd,
           greatest ( nvl(p.last_mod_dt,sysdate-), nvl(ps.last_mod_dt,sysdate-),
           nvl(od.last_mod_dt,sysdate-), nvl(psa.last_mod_dt,sysdate-),
           nvl(pga.last_mod_dt,sysdate-), nvl(psh.last_mod_dt,sysdate-),
           nvl(pso.last_mod_dt,sysdate-) ) last_mod_dt
           FROM prod p
           INNER JOIN prod_styl ps ON p.prod_styl_id = ps.prod_styl_id
           INNER JOIN org_div od ON p.div_id = od.div_id
           LEFT OUTER JOIN prod_sport_acty psa ON p.sport_acty_id = psa.sport_acty_id
           LEFT OUTER JOIN prod_gndr_age pga ON p.gndr_age_id = pga.gndr_age_id
           LEFT OUTER JOIN prod_silh psh ON p.silh_id = psh.silh_id
           LEFT OUTER JOIN prod_silo pso ON p.silo_id = pso.silo_id
           WHERE nvl(od.stat,'A') = 'A' AND nvl(psa.stat,'A') = 'A'
           AND nvl(pga.stat,'A') = 'A' AND nvl(psh.stat,'A') = 'A'
           AND nvl(pso.stat,'A') = 'A' AND p.prod_cd = '302437-413'">
<field column="prod_id" name="prod_id"/>
<field column="prod_cd" name="prod_cd"/>
<field column="styl_cd" name="styl_cd"/>
<field column="colr_disp_cd" name="colr_disp_cd"/>
<field column="colr_comb_desc" name="colr_comb_desc"/>
<field column="div_id" name="div_id"/>
<field column="po_grid_desc" name="po_grid_desc"/>
<field column="silo_id" name="silo_id"/>
<field column="sport_acty_desc" name="sport_acty_desc"/>
<field column="silh_id" name="silh_id"/>
<field column="gndr_age_desc" name="gndr_age_desc"/>
<field column="silh_desc" name="silh_desc"/>
<field column="silo_desc" name="silo_desc"/>
<field column="org_lgcy_div_cd" name="org_lgcy_div_cd"/>
<entity name="prod_reg"
    query="SELECT pr.prod_id, pr.prod_cd, pr.reg_id, pr.retl_pr_amt, pr.whsle_pr_amt,
           pr.retl_crcy_id, pr.whsle_crcy_id, pr.frst_prod_offr_dt, pr.end_ftr_offr_dt,
           pr.last_mod_dt last_mod_dt
           FROM prod_reg pr WHERE prod_cd = '${prod.prod_cd}'">
<field column="retl_pr_amt" name="retl_pr_amt"/>
<field column="whsle_pr_amt" name="whsle_pr_amt"/>
<field column="retl_crcy_id" name="retl_crcy_id"/>
<field column="whsle_crcy_id" name="whsle_crcy_id"/>
<field column="frst_prod_offr_dt" name="frst_prod_offr_dt"/>
<field column="end_ftr_offr_dt" name="end_ftr_offr_dt"/>
<field column="last_mod_dt" name="last_mod_dt"/>
<entity name="prod_reg_cmrc_styl"
    query="SELECT p.prod_id, p.prod_cd, pr.reg_id, prcs.sap_lang_id, prcs.reg_cmrc_styl_nm,
           prcs.insm_desc, prcs.otsm_desc, prcs.dim_desc, prcs.prfl_desc, prcs.upr_desc,
           prcs.mdsl_desc, prcs.outsl_desc, prcs.ctnt_desc, prcs.size_run_desc,
           greatest ( nvl(p.last_mod_dt,sysdate-), nvl(ps.last_mod_dt,sysdate-),
           nvl(pr.last_mod_dt,sysdate-), nvl(prcs.last_mod_dt,sysdate-) ) last_mod_dt
           FROM prod p
           INNER JOIN prod_styl ps ON p.prod_styl_id = ps.prod_styl_id
           INNER JOIN prod_reg pr ON p.prod_id = pr.prod_id
           INNER JOIN prod_reg_cmrc_styl prcs ON prcs.prod_styl_id = ps.prod_styl_id
           AND prcs.reg_id = pr.reg_id
           WHERE prcs.stat_cd = 'A' AND prod_cd = '${prod.prod_cd}'
           AND reg_id = '${prod_reg.reg_id'">
<field column="sap_lang_id" name="sap_lang_id"/>
<field column="reg_cmrc_styl_nm" name="reg_cmrc_styl_nm"/>
<field column="insm_desc" name="insm_desc"/>
<field column="otsm_desc" name="otsm_desc"/>
<field column="dim_desc" name="dim_desc"/>
<field column="prfl_desc" name="prfl_desc"/>
<field column="upr_desc" name="upr_desc"/>
<field column="mdsl_desc" name="mdsl_desc"/>
<field column="outsl_desc" name="outsl_desc"/>
<field column="ctnt_desc" name="ctnt_desc"/>
<field column="size_run_desc" name="size_run_desc"/>
</entity>
</entity>
</entity>
</document>
</dataConfig>
Re: Problem using db-data-config.xml
Are you sure prod_cd and reg_id are emitted by the respective entities under those exact names? If not, you may need to alias those fields (using AS). Keep in mind that field names are case-sensitive. To see which values are actually emitted, use debug mode or a LogTransformer.

On Wed, Jun 10, 2009 at 4:55 AM, jayakeerthi s mail2keer...@gmail.com wrote:

Hi All, I am facing an issue while fetching records from the database when I use a placeholder of the form '${prod.prod_cd}' in db-data-config.xml. It works fine if I provide the exact value of the product code, i.e. '302437-413'. Here is the db-data-config.xml I am using:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@*:1521:" user="lslsls" password="***"/>
  <document name="products">
    <entity name="prod" pk="prod_id" query="SELECT p.prod_id, p.prod_cd, ps.styl_cd,
        p.colr_disp_cd, p.colr_comb_desc, p.div_id, p.po_grid_desc, p.silo_id, p.silh_id,
        psa.sport_acty_desc, pga.gndr_age_desc, psh.silh_desc, pso.silo_desc, od.org_lgcy_div_cd,
        greatest( nvl(p.last_mod_dt,sysdate-), nvl(ps.last_mod_dt,sysdate-),
          nvl(od.last_mod_dt,sysdate-), nvl(psa.last_mod_dt,sysdate-),
          nvl(pga.last_mod_dt,sysdate-), nvl(psh.last_mod_dt,sysdate-),
          nvl(pso.last_mod_dt,sysdate-) ) last_mod_dt
        FROM prod p
        INNER JOIN prod_styl ps ON p.prod_styl_id = ps.prod_styl_id
        INNER JOIN org_div od ON p.div_id = od.div_id
        LEFT OUTER JOIN prod_sport_acty psa ON p.sport_acty_id = psa.sport_acty_id
        LEFT OUTER JOIN prod_gndr_age pga ON p.gndr_age_id = pga.gndr_age_id
        LEFT OUTER JOIN prod_silh psh ON p.silh_id = psh.silh_id
        LEFT OUTER JOIN prod_silo pso ON p.silo_id = pso.silo_id
        WHERE nvl(od.stat,'A') = 'A' AND nvl(psa.stat,'A') = 'A' AND nvl(pga.stat,'A') = 'A'
          AND nvl(psh.stat,'A') = 'A' AND nvl(pso.stat,'A') = 'A'
          AND p.prod_cd = '302437-413'">
      <field column="prod_id" name="prod_id"/>
      <field column="prod_cd" name="prod_cd"/>
      <field column="styl_cd" name="styl_cd"/>
      <field column="colr_disp_cd" name="colr_disp_cd"/>
      <field column="colr_comb_desc" name="colr_comb_desc"/>
      <field column="div_id" name="div_id"/>
      <field column="po_grid_desc" name="po_grid_desc"/>
      <field column="silo_id" name="silo_id"/>
      <field column="sport_acty_desc" name="sport_acty_desc"/>
      <field column="silh_id" name="silh_id"/>
      <field column="gndr_age_desc" name="gndr_age_desc"/>
      <field column="silh_desc" name="silh_desc"/>
      <field column="silo_desc" name="silo_desc"/>
      <field column="org_lgcy_div_cd" name="org_lgcy_div_cd"/>
      <entity name="prod_reg" query="SELECT pr.prod_id, pr.prod_cd, pr.reg_id, pr.retl_pr_amt,
          pr.whsle_pr_amt, pr.retl_crcy_id, pr.whsle_crcy_id, pr.frst_prod_offr_dt,
          pr.end_ftr_offr_dt, pr.last_mod_dt last_mod_dt
          FROM prod_reg pr WHERE prod_cd = '${prod.prod_cd}'">
        <field column="retl_pr_amt" name="retl_pr_amt"/>
        <field column="whsle_pr_amt" name="whsle_pr_amt"/>
        <field column="retl_crcy_id" name="retl_crcy_id"/>
        <field column="whsle_crcy_id" name="whsle_crcy_id"/>
        <field column="frst_prod_offr_dt" name="frst_prod_offr_dt"/>
        <field column="end_ftr_offr_dt" name="end_ftr_offr_dt"/>
        <field column="last_mod_dt" name="last_mod_dt"/>
        <entity name="prod_reg_cmrc_styl" query="SELECT p.prod_id, p.prod_cd, pr.reg_id,
            prcs.sap_lang_id, prcs.reg_cmrc_styl_nm, prcs.insm_desc, prcs.otsm_desc,
            prcs.dim_desc, prcs.prfl_desc, prcs.upr_desc, prcs.mdsl_desc, prcs.outsl_desc,
            prcs.ctnt_desc, prcs.size_run_desc,
            greatest( nvl(p.last_mod_dt,sysdate-), nvl(ps.last_mod_dt,sysdate-),
              nvl(pr.last_mod_dt,sysdate-), nvl(prcs.last_mod_dt,sysdate-) ) last_mod_dt
            FROM prod p
            INNER JOIN prod_styl ps ON p.prod_styl_id = ps.prod_styl_id
            INNER JOIN prod_reg pr ON p.prod_id = pr.prod_id
            INNER JOIN prod_reg_cmrc_styl prcs ON prcs.prod_styl_id = ps.prod_styl_id
              AND prcs.reg_id = pr.reg_id
            WHERE prcs.stat_cd = 'A' AND prod_cd = '${prod.prod_cd}'
              AND reg_id = '${prod_reg.reg_id'">
          <field column="sap_lang_id" name="sap_lang_id"/>
          <field column="reg_cmrc_styl_nm" name="reg_cmrc_styl_nm"/>
          <field column="insm_desc" name="insm_desc"/>
          <field column="otsm_desc" name="otsm_desc"/>
          <field column="dim_desc" name="dim_desc"/>
          <field column="prfl_desc" name
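The aliasing the reply suggests can be sketched as below. The column and field names are taken from the config above, but whether AS is actually needed depends on what names the Oracle JDBC driver reports at runtime, so treat this as illustrative only:

```xml
<!-- Alias the column in SQL so the entity emits exactly "prod_cd"
     (names used in ${...} placeholders are case-sensitive). -->
<entity name="prod" query="SELECT p.PROD_CD AS prod_cd, ... FROM prod p">
  <!-- the child entity can now reliably reference ${prod.prod_cd} -->
  <entity name="prod_reg"
          query="SELECT ... FROM prod_reg pr WHERE pr.prod_cd = '${prod.prod_cd}'"/>
</entity>
```

To confirm what each row actually emits, a LogTransformer on the parent entity (e.g. transformer="LogTransformer" logTemplate="prod_cd is ${prod.prod_cd}" logLevel="info") will print the resolved values during the import.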
Re: query regarding Indexing xml files -db-data-config.xml
Hi Noble, Thanks for the reply. As advised, I have changed the db-data-config.xml as below, but the status still says:

  <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>

<dataConfig>
  <dataSource type="FileDataSource" name="xmlindex"/>
  <document name="products">
    <entity name="xmlfile" processor="FileListEntityProcessor"
            fileName="c:\\test\\ipod_other.xml" recursive="true" rootEntity="false"
            dataSource="null" baseDir="${dataimporter.request.xmlDataDir}"
            useSolrAddSchema="true">
      <entity name="data" processor="XPathEntityProcessor"
              url="${xmlfile.fileAbsolutePath}">
        <field column="manu" name="manu"/>
      </entity>
    </entity>
  </document>
</dataConfig>

I got the error below when baseDir is removed:

INFO: last commit = 1242683454570
May 18, 2009 2:55:15 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' is a required attribute Processing Document # 1
        at org.apache.solr.handler.dataimport.FileListEntityProcessor.init(FileListEntityProcessor.java:76)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:299)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:324)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:363)
May 18, 2009 2:55:15 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback

Please advise. Thanks and regards, Jay

2009/5/17 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: Hi, you may not need that enclosing entity if you only wish to index one file. baseDir is not required if you give an absolute path in fileName. No need to mention forEach or fields if you set useSolrAddSchema=true. On Sat, May 16, 2009 at 1:23 AM, jayakeerthi s mail2keer...@gmail.com wrote: [...]

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: query regarding Indexing xml files -db-data-config.xml
Hi, you may not need that enclosing entity if you only wish to index one file. baseDir is not required if you give an absolute path in fileName. No need to mention forEach or fields if you set useSolrAddSchema=true.

On Sat, May 16, 2009 at 1:23 AM, jayakeerthi s mail2keer...@gmail.com wrote: Hi All, I am trying to index the fields from the XML files; here is the configuration that I am using. [...] Am I missing anything here, or is there a required format for the input XML? Please help resolve this. Thanks and regards, Jay

-- Noble Paul | Principal Engineer | AOL | http://aol.com
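Putting Noble's three suggestions together, a minimal sketch of the simplified config might look like this (the file path is the one from the thread; here useSolrAddSchema sits on the XPathEntityProcessor entity, which reads the file directly with no FileListEntityProcessor wrapper):

```xml
<dataConfig>
  <dataSource type="FileDataSource" name="xmlindex"/>
  <document name="products">
    <!-- single file, absolute path: no enclosing FileListEntityProcessor
         entity and no baseDir needed -->
    <entity name="data" processor="XPathEntityProcessor"
            url="c:\test\ipod_other.xml"
            useSolrAddSchema="true"/>
    <!-- with useSolrAddSchema=true, no forEach and no <field> mappings:
         the file is expected to be in Solr <add><doc> format already -->
  </document>
</dataConfig>
```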
Re: query regarding Indexing xml files -db-data-config.xml
Hmmm, I thought that if you are using the XPathEntityProcessor you have to specify an xpath for each of the fields you want to populate, unless you are using XPathEntityProcessor's useSolrAddSchema mode? Fergus.

If that is your complete input file, then it looks like you are missing the wrapping <add></add> element:

<add>
  <doc>
    <field name="id">F8V7067-APL-KIT</field>
    <field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
    <field name="manu">Belkin</field>
    <field name="cat">electronics</field>
    <field name="cat">connector</field>
    <field name="features">car power adapter, white</field>
    <field name="weight">4</field>
    <field name="price">19.95</field>
    <field name="popularity">1</field>
    <field name="inStock">false</field>
  </doc>
</add>

Is it possible you just forgot to include the <add>? -Jay

On Fri, May 15, 2009 at 12:53 PM, jayakeerthi s mail2keer...@gmail.com wrote: Hi All, I am trying to index the fields from the XML files; here is the configuration that I am using. [...] Am I missing anything here, or is there a required format for the input XML? Please help resolve this. Thanks and regards, Jay

--
===============================================================
Fergus McMenemie            Email: fer...@twig.me.uk
Techmore Ltd                Phone: (UK) 07721 376021
Unix/Mac/Intranets          Analyst Programmer
===============================================================
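For comparison, the explicit mode Fergus describes maps each field with its own xpath. A sketch against the <add><doc> file from this thread (the field names are assumptions taken from that example file):

```xml
<!-- one row per /add/doc; each column needs its own xpath
     when useSolrAddSchema is not set -->
<entity name="data" processor="XPathEntityProcessor"
        url="c:\test\ipod_other.xml"
        forEach="/add/doc">
  <field column="manu" xpath="/add/doc/field[@name='manu']"/>
  <field column="name" xpath="/add/doc/field[@name='name']"/>
</entity>
```

XPathEntityProcessor supports only a limited xpath subset, but attribute predicates of the form field[@name='...'] are part of it, which is what makes this per-field mapping workable against Solr-style input files.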
query regarding Indexing xml files -db-data-config.xml
Hi All, I am trying to index the fields from the XML files; here is the configuration that I am using.

db-data-config.xml:

<dataConfig>
  <dataSource type="FileDataSource" name="xmlindex"/>
  <document name="products">
    <entity name="xmlfile" processor="FileListEntityProcessor"
            fileName="c:\test\ipod_other.xml" recursive="true" rootEntity="false"
            dataSource="null" baseDir="${dataimporter.request.xmlDataDir}">
      <entity name="data" processor="XPathEntityProcessor"
              forEach="/record | /the/record/xpath"
              url="${xmlfile.fileAbsolutePath}">
        <field column="manu" name="manu"/>
      </entity>
    </entity>
  </document>
</dataConfig>

Schema.xml has the field manu. The input XML file used to import the field is:

<doc>
  <field name="id">F8V7067-APL-KIT</field>
  <field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
  <field name="manu">Belkin</field>
  <field name="cat">electronics</field>
  <field name="cat">connector</field>
  <field name="features">car power adapter, white</field>
  <field name="weight">4</field>
  <field name="price">19.95</field>
  <field name="popularity">1</field>
  <field name="inStock">false</field>
</doc>

Doing the full-import, this is the response I am getting:

<lst name="statusMessages">
  <str name="Total Requests made to DataSource">0</str>
  <str name="Total Rows Fetched">0</str>
  <str name="Total Documents Skipped">0</str>
  <str name="Full Dump Started">2009-05-15 11:58:00</str>
  <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
  <str name="Committed">2009-05-15 11:58:00</str>
  <str name="Optimized">2009-05-15 11:58:00</str>
  <str name="Time taken">0:0:0.172</str>
</lst>
<str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

Am I missing anything here, or is there a required format for the input XML? Please help resolve this. Thanks and regards, Jay
Re: query regarding Indexing xml files -db-data-config.xml
If that is your complete input file, then it looks like you are missing the wrapping <add></add> element:

<add>
  <doc>
    <field name="id">F8V7067-APL-KIT</field>
    <field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
    <field name="manu">Belkin</field>
    <field name="cat">electronics</field>
    <field name="cat">connector</field>
    <field name="features">car power adapter, white</field>
    <field name="weight">4</field>
    <field name="price">19.95</field>
    <field name="popularity">1</field>
    <field name="inStock">false</field>
  </doc>
</add>

Is it possible you just forgot to include the <add>? -Jay

On Fri, May 15, 2009 at 12:53 PM, jayakeerthi s mail2keer...@gmail.com wrote: Hi All, I am trying to index the fields from the XML files; here is the configuration that I am using. [...] Am I missing anything here, or is there a required format for the input XML? Please help resolve this. Thanks and regards, Jay
Re: query regarding Indexing xml files -db-data-config.xml
Many thanks for the reply. The complete input XML file is below; I missed including this earlier.

<add>
  <doc>
    <field name="id">F8V7067-APL-KIT</field>
    <field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
    <field name="manu">Belkin</field>
    <field name="cat">electronics</field>
    <field name="cat">connector</field>
    <field name="features">car power adapter, white</field>
    <field name="weight">4</field>
    <field name="price">19.95</field>
    <field name="popularity">1</field>
    <field name="inStock">false</field>
  </doc>
  <doc>
    <field name="id">IW-02</field>
    <field name="name">iPod &amp; iPod Mini USB 2.0 Cable</field>
    <field name="manu">Belkin</field>
    <field name="cat">electronics</field>
    <field name="cat">connector</field>
    <field name="features">car power adapter for iPod, white</field>
    <field name="weight">2</field>
    <field name="price">11.50</field>
    <field name="popularity">1</field>
    <field name="inStock">false</field>
  </doc>
</add>

regards, Jay

On Fri, May 15, 2009 at 1:14 PM, Jay Hill jayallenh...@gmail.com wrote: If that is your complete input file, then it looks like you are missing the wrapping <add></add> element. [...] Is it possible you just forgot to include the <add>? -Jay
Is there any better way to configure db-data-config.xml
Hi, I am using the following schema - http://www.nabble.com/file/p21332196/table_stuct.gif

1. INSTITUTION table is the main table that has information about all the institutions.
2. INSTITUTION_TYPE table has 'institute_type' and its 'description' for each 'institute_type_id' in the INSTITUTION table.
3. INSTITUTION_SOURCE_MAP table is a mapping table. This has the institution_id corresponding to a source_id from an external system.

NOTE - the INSTITUTION table is the union of institutions created internally AND institutions corresponding to source_ids from external systems.

Requirement -
1. Search institutions by 'institution_name' in the INSTITUTION table.
2. Display institution_type for institution_type_id.
3. Users should be able to search for an institution by 'source_id' and 'source_entity_name'.

My db-data-config.xml is the following -
===
<dataConfig>
  <dataSource driver="net.sourceforge.jtds.jdbc.Driver"
              url="jdbc:jtds:sqlserver://localhost:1433/dummy-master"
              user="dummy-master" password="dummy-master"/>
  <document name="institution">
    <entity name="INSTITUTION" pk="institution_id" query="select * from INSTITUTION"
            deltaQuery="select institution_id from INSTITUTION
                        where last_update_date > '${dataimporter.last_index_time}'">
      <field column="institution_id" name="id"/>
      <field column="institution_name" name="institutionName"/>
      <field column="description" name="description"/>
      <field column="institution_type_id" name="institutionTypeId"/>
      <entity name="INSTITUTION_TYPE" pk="institution_type_id"
              query="select institution_type from INSTITUTION_TYPE
                     where institution_type_id='${INSTITUTION.institution_type_id}'"
              parentDeltaQuery="select institution_type_id from INSTITUTION
                                where institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
        <field name="institutionType" column="institution_type"/>
      </entity>
    </entity>
    <entity name="INSTITUTION_SOURCE_MAP"
            pk="institution_id, source_id, source_entity_name, source_key, source_key_field"
            query="select * from INSTITUTION_SOURCE_MAP">
      <field column="source_id" name="sourceId"/>
      <field column="source_entity_name" name="sourceEntityName"/>
      <entity name="INSTITUTION" pk="institution_id"
              query="select * from INSTITUTION
                     where institution_id = '${INSTITUTION_SOURCE_MAP.institution_id}'">
        <field column="institution_id" name="id"/>
        <field column="institution_name" name="institutionName"/>
        <field column="description" name="description"/>
        <field column="institution_type_id" name="institutionTypeId"/>
        <entity name="INSTITUTION_TYPE" pk="institution_type_id"
                query="select institution_type from INSTITUTION_TYPE
                       where institution_type_id='${INSTITUTION.institution_type_id}'"
                parentDeltaQuery="select institution_type_id from INSTITUTION
                                  where institution_type_id=${INSTITUTION_TYPE.institution_type_id}">
          <field name="institutionType" column="institution_type"/>
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>
===

My configuration file is working perfectly fine. I have specified two entities inside one document, and both entities have further nested entity tags. Can anyone suggest whether there is any other/better way to configure the relationship? I have referred to http://wiki.apache.org/solr/DataImportHandler and http://download.boulder.ibm.com/ibmdl/pub/software/dw/java/j-solr-update-pdf.pdf. Is there any resource that has detailed information about the tags used in db-data-config.xml? Thanks, Manu -- View this message in context: http://www.nabble.com/Is-there-any-better-way-to-configure-db-data-config.xml-tp21332196p21332196.html Sent from the Solr - User mailing list archive at Nabble.com.
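One common simplification for configs like the one above (a sketch only, not necessarily better for this schema): fold the one-row INSTITUTION_TYPE lookup into the parent query with a JOIN, so each root entity has one fewer nested entity and one fewer query per row. Table and column names are taken from the config above; the deltaQuery/parentDeltaQuery pair would need the same rework:

```xml
<!-- one query instead of a nested per-row lookup -->
<entity name="INSTITUTION" pk="institution_id"
        query="select i.institution_id, i.institution_name, i.description,
                      i.institution_type_id, it.institution_type
               from INSTITUTION i
               left outer join INSTITUTION_TYPE it
                 on i.institution_type_id = it.institution_type_id">
  <field column="institution_id" name="id"/>
  <field column="institution_name" name="institutionName"/>
  <field column="description" name="description"/>
  <field column="institution_type" name="institutionType"/>
</entity>
```

The trade-off is the usual one for DIH: nested entities issue one SQL query per parent row, while a JOIN in the parent query does the work once on the database side.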
Re: Is there any better way to configure db-data-config.xml
Why do you have multiple root entities?

On Wed, Jan 7, 2009 at 7:48 PM, Manupriya manupriya.si...@gmail.com wrote: Hi, I am using the following schema - http://www.nabble.com/file/p21332196/table_stuct.gif [...] My configuration file is working perfectly fine. I have specified two entities inside one document, and both entities have further nested entity tags. Can anyone suggest whether there is any other/better way to configure the relationship? Thanks, Manu

-- --Noble Paul
Re: Is there any better way to configure db-data-config.xml
Hi Noble, In my case, institutions can be entered in two different ways.

1. Institution information is directly present in the INSTITUTION table. (Note - such institutions are NOT present in the INSTITUTION_SOURCE_MAP table.) In this case, I have INSTITUTION as the parent entity, and INSTITUTION_TYPE as a child entity in order to retrieve the institution_type.

2. Institutions are mapped to the INSTITUTION table through the INSTITUTION_SOURCE_MAP table. (In this case, the user can search institutions based on INSTITUTION_SOURCE_MAP fields.) So here INSTITUTION_SOURCE_MAP is the parent entity, and INSTITUTION is the child entity.

How else can I specify the relationship? Thanks, Manu

Noble Paul നോബിള് नोब्ळ् wrote: Why do you have multiple root entities? On Wed, Jan 7, 2009 at 7:48 PM, Manupriya manupriya.si...@gmail.com wrote: [...]

-- View this message in context: http://www.nabble.com/Is-there-any-better-way-to-configure-db-data-config.xml-tp21332196p21346513.html Sent from the Solr - User mailing list archive at Nabble.com.
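If the goal is a single root entity despite the two entry paths Manu describes, one sketch is to drive everything from INSTITUTION and LEFT OUTER JOIN to INSTITUTION_SOURCE_MAP, so internally-created institutions simply come back with NULL source fields. This assumes each institution has at most one map row; with multiple map rows per institution (and per-map-row documents wanted), the two-root-entity form in the original config remains the right tool. Table and column names are the ones from the thread:

```xml
<!-- single root entity covering both internally-created and
     externally-mapped institutions -->
<entity name="INSTITUTION" pk="institution_id"
        query="select i.institution_id, i.institution_name, i.description,
                      m.source_id, m.source_entity_name
               from INSTITUTION i
               left outer join INSTITUTION_SOURCE_MAP m
                 on i.institution_id = m.institution_id">
  <field column="institution_id" name="id"/>
  <field column="institution_name" name="institutionName"/>
  <field column="description" name="description"/>
  <field column="source_id" name="sourceId"/>
  <field column="source_entity_name" name="sourceEntityName"/>
</entity>
```

Searching by sourceId/sourceEntityName then works against the same documents as searching by institutionName, with no duplicate INSTITUTION documents to reconcile.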