Re: Syncing 'delta-import' with 'select' query
Oops! That seems to be the problem, since I am using 1.4.

Thanks!
Juan M.

On Tue, Dec 14, 2010 at 8:40 PM, Alexey Serba wrote:
> What Solr version do you use?
>
> It seems that sync flag has been added to 3.1 and 4.0 (trunk) branches
> and not to 1.4
> https://issues.apache.org/jira/browse/SOLR-1721
Re: Syncing 'delta-import' with 'select' query
What Solr version do you use?

It seems that the sync flag has been added to the 3.1 and 4.0 (trunk) branches and not to 1.4:
https://issues.apache.org/jira/browse/SOLR-1721

On Wed, Dec 8, 2010 at 11:21 PM, Juan Manuel Alvarez wrote:
> I have been doing some tests, but it seems I can't make the
> synchronous flag work.
>
> Am I using the "synchronous" flag right?
Re: Syncing 'delta-import' with 'select' query
Hello everyone!

I have been doing some tests, but it seems I can't make the synchronous flag work.

I have made two tests:
1) DIH with commit=false
2) DIH with commit=false + commit via Solr XML update protocol

And here are the log results:

For (1) the command is
"/solr/dataimport?command=delta-import&commit=false&synchronous=true"
and the first part of the output is:

Dec 8, 2010 4:42:51 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Dec 8, 2010 4:42:51 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={schema=testproject&dbHost=127.0.0.1&dbPassword=fuz10n!&dbName=fzm&commit=false&dbUser=fzm&command=delta-import&projectId=1&synchronous=true&dbPort=5432} status=0 QTime=4
Dec 8, 2010 4:42:51 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
INFO: Starting Delta Import
Dec 8, 2010 4:42:51 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Dec 8, 2010 4:42:51 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
Dec 8, 2010 4:42:51 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta

For (2) the commands are
"/solr/dataimport?command=delta-import&commit=false&synchronous=true"
and
"/solr/update?commit=true&waitFlush=true&waitSearcher=true"
and the first part of the output is:

Dec 8, 2010 4:22:50 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Dec 8, 2010 4:22:50 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={schema=testproject&dbHost=127.0.0.1&dbPassword=fuz10n!&dbName=fzm&commit=false&dbUser=fzm&command=delta-import&projectId=1&synchronous=true&dbPort=5432} status=0 QTime=1
Dec 8, 2010 4:22:50 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Dec 8, 2010 4:22:50 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
INFO: Starting Delta Import
Dec 8, 2010 4:22:50 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Dec 8, 2010 4:22:50 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)

In (2) it seems like the commit is being fired before the delta-import finishes.

Am I using the "synchronous" flag right?

Thanks in advance!
Juan M.

On Mon, Dec 6, 2010 at 6:46 PM, Juan Manuel Alvarez wrote:
> Thanks for all the help! It is really appreciated.
>
> For now, I can afford the parallel requests problem, but when I put
> synchronous=true in the delta import, the call still returns with
> outdated items.
> Examining the log, it seems that the commit operation is being
> executed after the operation returns, even when I am using
> commit=true.
> Is it possible to also execute the commit synchronously?
>
> Cheers!
> Juan M.
Re: Syncing 'delta-import' with 'select' query
Thanks for all the help! It is really appreciated.

For now, I can live with the parallel requests problem, but when I put synchronous=true in the delta import, the call still returns with outdated items.
Examining the log, it seems that the commit operation is being executed after the operation returns, even when I am using commit=true.
Is it possible to also execute the commit synchronously?

Cheers!
Juan M.

On Mon, Dec 6, 2010 at 4:29 PM, Alexey Serba wrote:
>> When you say "two parallel requests from two users to single DIH
>> request handler", what do you mean by "request handler"?
> I mean DIH.
>
>> Are you referring to the HTTP request? Would that mean that if I make the
>> request from different HTTP sessions it would work?
> No.
>
> It means that when you have two users that simultaneously changed two
> objects in the UI then you have two HTTP requests to DIH to pull
> changes from the db into Solr index. If the second request comes when
> the first is not fully processed then the second request will be
> rejected. As a result your index would be outdated (w/o the latest
> update) until the next update.
Re: Syncing 'delta-import' with 'select' query
> When you say "two parallel requests from two users to single DIH
> request handler", what do you mean by "request handler"?

I mean DIH.

> Are you referring to the HTTP request? Would that mean that if I make the
> request from different HTTP sessions it would work?

No.

It means that when two users simultaneously change two objects in the UI, you have two HTTP requests to DIH to pull changes from the db into the Solr index. If the second request arrives before the first is fully processed, the second request will be rejected. As a result, your index will be outdated (w/o the latest update) until the next update.
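One way to live with the rejection described above is to serialize import requests on the client side. The following is only a sketch under assumptions that are not in the thread: all import triggers go through a single application process, and `run_import` is a hypothetical callable that fires the delta-import and blocks until it has finished. A request that arrives while an import is running is not sent to DIH at all; it just marks the importer as dirty, and one extra import is run afterwards so the late change is still picked up.

```python
import threading
from typing import Callable


class CoalescingImporter:
    """Serialize delta-import triggers so DIH never sees two at once.

    run_import is a hypothetical blocking callable supplied by the caller.
    """

    def __init__(self, run_import: Callable[[], None]) -> None:
        self._run_import = run_import
        self._lock = threading.Lock()
        self._running = False   # an import is currently in flight
        self._pending = False   # a request arrived while one was in flight

    def request(self) -> bool:
        """Return True if this call ran the import(s) itself,
        False if it was folded into the import already in flight."""
        with self._lock:
            if self._running:
                self._pending = True
                return False
            self._running = True
        while True:
            try:
                self._run_import()
            except Exception:
                with self._lock:
                    self._running = False
                    self._pending = False
                raise
            with self._lock:
                if not self._pending:
                    # Clear the flag under the same lock as the check, so a
                    # late request cannot be lost between the two steps.
                    self._running = False
                    return True
                self._pending = False  # run once more for the late request
```

Note that a caller who gets `False` back returns before its own change has been indexed, so this keeps the index from staying outdated but does not by itself solve the stale GUI refresh.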
Re: Syncing 'delta-import' with 'select' query
Alex:

Thanks for the quick reply.

When you say "two parallel requests from two users to single DIH request handler", what do you mean by "request handler"? Are you referring to the HTTP request? Would that mean that if I make the request from different HTTP sessions it would work?

Cheers!
Juan M.

On Mon, Dec 6, 2010 at 1:12 PM, Alexey Serba wrote:
> Hey Juan,
>
> It seems that DataImportHandler is not a right tool for your scenario
> and you'd better use Solr XML update protocol.
> * http://wiki.apache.org/solr/UpdateXmlMessages
>
> You still can work around your outdated GUI view problem with calling
> DIH synchronously, by adding synchronous=true to your request. But it
> won't solve the problem with two parallel requests from two users to
> single DIH request handler, because DIH doesn't support that, and if
> previous request is still running it bounces the second request.
>
> HTH,
> Alex
Re: Syncing 'delta-import' with 'select' query
Hey Juan,

It seems that DataImportHandler is not the right tool for your scenario and you'd be better off using the Solr XML update protocol.
* http://wiki.apache.org/solr/UpdateXmlMessages

You can still work around your outdated GUI view problem by calling DIH synchronously, adding synchronous=true to your request. But that won't solve the problem of two parallel requests from two users to single DIH request handler, because DIH doesn't support that: if a previous request is still running, it bounces the second request.

HTH,
Alex

On Fri, Dec 3, 2010 at 10:33 PM, Juan Manuel Alvarez wrote:
> Am I pointing in the right direction or is there another way to
> achieve my goal?
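As a rough illustration of the XML update protocol route suggested above: the application pushes the changed item to Solr itself instead of asking DIH to pull it from the database. The field names (`id`, `state`) and the base URL used in the test are hypothetical and would have to match the real schema and deployment. The `commit=true` parameter on `/update` is the one already used elsewhere in this thread.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET


def build_add_message(doc: dict) -> str:
    """Build an <add><doc>...</doc></add> update message for one document.

    The keys of `doc` are assumed to be field names from the Solr schema.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_update_url(base_url: str) -> str:
    """Return the /update URL with commit=true so the change is committed
    as part of the same request."""
    return base_url.rstrip("/") + "/update?" + urlencode({"commit": "true"})
```

The message would then be sent as the body of an HTTP POST to that URL with an XML content type. Because the update and commit happen within that one request, the GUI can be refreshed once the call returns.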
Syncing 'delta-import' with 'select' query
Hello everyone! I would like to ask you a question about DIH.

I am using a database and DIH to sync against Solr, and a GUI to display and operate on the items retrieved from Solr.
When I change the state of an item through the GUI, the following happens:
a. The item is updated in the DB.
b. A delta-import command is fired to sync the DB with Solr.
c. The GUI is refreshed by making a query to Solr.

My problem comes between (b) and (c). The delta-import operation is executed in a new thread, so my call returns immediately and the GUI is refreshed before the Solr index is updated, causing the item state in the GUI to be outdated.

I had two ideas so far:
1. Query the status of the DIH after the delta-import operation and do not return until it is "idle". The problem I see with this is that if other users execute delta-imports, the status will be "busy" until all operations are finished.
2. Use Zoie. The first problem is that configuring it is not as straightforward as it seems, so I don't want to spend more time trying it until I am sure that this will solve my issue. On the other hand, I think that I may suffer the same problem since the delta-import is still firing in another thread, so I can't be sure it will be called fast enough.

Am I pointing in the right direction or is there another way to achieve my goal?

Thanks in advance!
Juan M.
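Idea 1 above could be sketched roughly as follows. This is an illustration, not a tested integration: `fetch_status` is a hypothetical callable that would request `/dataimport?command=status` and return the reported status string ("idle" or "busy"). The timeout is an addition that covers the concern raised in the message, where other users' imports keep the handler busy.

```python
import time
from typing import Callable


def wait_until_idle(
    fetch_status: Callable[[], str],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll fetch_status() until it reports "idle".

    Returns True once the handler is idle, or False if timeout_s elapses
    first. fetch_status is a hypothetical helper supplied by the caller.
    """
    deadline = clock() + timeout_s
    while True:
        if fetch_status() == "idle":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The GUI refresh in step (c) would run only after this returns `True`. On `False`, the caller has to decide whether to refresh anyway or report that the index may be stale.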