Thanks! One more question. wget seems to be choking on my URL, in particular the # and & characters. What's the best method for escaping them?
http://<My Host>:8983/solr/#/articles/dataimport//dataimport?command=full-import&clean=true&optimize=true

--
Craig Hoffman
w: http://www.craighoffmanphotography.com
FB: www.facebook.com/CraigHoffmanPhotography
TW: https://twitter.com/craiglhoffman

> On Oct 30, 2014, at 12:30 PM, Ramzi Alqrainy <ramzi.alqra...@gmail.com> wrote:
>
> Simply add this line to your crontab with the crontab -e command:
>
> 0,30 * * * * /usr/bin/wget http://<solr_host>:8983/solr/<core_name>/dataimport?command=full-import
>
> This will run a full import every 30 minutes. Replace <solr_host> and
> <core_name> with your configuration.
>
> *Using delta-import command*
>
> A delta-import operation can be started by hitting the URL
> http://localhost:8983/solr/dataimport?command=delta-import. This operation
> will be started in a new thread, and the status attribute in the response
> should now show busy. Depending on the size of your data set, this
> operation may take some time. At any time, you can hit
> http://localhost:8983/solr/dataimport to see the status flag.
>
> When the delta-import command is executed, it reads the start time stored in
> conf/dataimport.properties. It uses that timestamp to run delta queries and,
> after completion, updates the timestamp in conf/dataimport.properties.
>
> Note: there is an alternative approach for updating documents in Solr, which
> is in many cases more efficient and also requires less configuration. It is
> explained on DataImportHandlerDeltaQueryViaFullImport.
>
> *Delta-Import Example*
>
> We will use the same example database used in the full import example. Note
> that the database schema has been updated and each table contains an
> additional column last_modified of timestamp type. You may want to download
> the database again since it has been updated recently. We use this timestamp
> field to determine what rows in each table have changed since the last
> indexed time.
>
> Take a look at the following data-config.xml:
>
> <dataConfig>
>     <dataSource driver="org.hsqldb.jdbcDriver"
>                 url="jdbc:hsqldb:/temp/example/ex" user="sa" />
>     <document name="products">
>         <entity name="item" pk="ID"
>                 query="select * from item"
>                 deltaImportQuery="select * from item where ID='${dih.delta.id}'"
>                 deltaQuery="select id from item where last_modified > '${dih.last_index_time}'">
>             <entity name="feature" pk="ITEM_ID"
>                     query="select description as features from feature where item_id='${item.ID}'">
>             </entity>
>             <entity name="item_category" pk="ITEM_ID, CATEGORY_ID"
>                     query="select CATEGORY_ID from item_category where ITEM_ID='${item.ID}'">
>                 <entity name="category" pk="ID"
>                         query="select description as cat from category where id = '${item_category.CATEGORY_ID}'">
>                 </entity>
>             </entity>
>         </entity>
>     </document>
> </dataConfig>
>
> Pay attention to the deltaQuery attribute, which has an SQL statement capable
> of detecting changes in the item table. Note the variable
> ${dataimporter.last_index_time}. The DataImportHandler exposes a variable
> called last_index_time, which is a timestamp value denoting the last time
> full-import 'or' delta-import was run. You can use this variable anywhere in
> the SQL you write in data-config.xml and it will be replaced by the value
> during processing.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Automating-Solr-tp4166696p4166707.html
> Sent from the Solr - User mailing list archive at Nabble.com.
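
On the escaping question at the top of this message, here is a minimal sketch, assuming Solr runs on localhost and the core is named articles (both are placeholders, not confirmed values). Two things are probably going on. An unquoted & tells the shell to run everything before it in the background, so the rest of the query string never reaches wget. The #/ part looks like it was copied from the admin UI address in the browser; everything after # is a fragment that is not sent to the server, so the sketch assumes the handler lives at /solr/<core_name>/dataimport, as in the quoted crontab line.

```shell
#!/bin/sh
# Placeholder host and core names (assumptions); substitute your own.
SOLR_HOST="localhost"
CORE="articles"

# Single quotes around the query string stop the shell from treating "&"
# as "run in background". The "#/" from the admin UI address is left out
# because a URL fragment is never sent to the server.
URL="http://${SOLR_HOST}:8983/solr/${CORE}/dataimport"'?command=full-import&clean=true&optimize=true'

# Show the URL that would be requested.
echo "$URL"

# Uncomment to trigger the import for real
# (-q: quiet, -O /dev/null: discard the response body):
# wget -q -O /dev/null "$URL"
```

In a crontab entry the same idea is just a matter of quoting the whole URL, for example:

0,30 * * * * /usr/bin/wget -q -O /dev/null 'http://<solr_host>:8983/solr/<core_name>/dataimport?command=full-import&clean=true&optimize=true'
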