Re: Is there any other way to load the index beside using "http" connection?
Out of my head... but are you not supposed to active the stream-handler in SOLR ? Think it is documented... Cheers //Marcus On Mon, Jul 6, 2009 at 8:55 PM, Francis Yakin wrote: > Yes, I uploaded the CSV file that I get it from Database then I ran that > cmd and I have the error. > > Any suggestions? > > Thanks > > Francis > > -Original Message- > From: NitinMalik [mailto:malik.ni...@yahoo.com] > Sent: Monday, July 06, 2009 11:32 AM > To: solr-user@lucene.apache.org > Subject: RE: Is there any other way to load the index beside using "http" > connection? > > > Hi Francis, > > I have experienced that update stream handler (for a xml file in my case) > worked only for Solr running on the same machine. I also got same error > when > I tried to update the documents on a remote Solr instance. > > Regards > Nitin > > > Francis Yakin wrote: > > > > > > Ok, I have a CSV file(called it test.csv) from database. > > > > When I tried to upload this file to solr using this cmd, I got > > "stream.contentType=text/plain: No such file or directory" error > > > > curl > > > http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8 > > > > -bash: stream.contentType=text/plain: No such file or directory > > undefined field cat > > > > What did I do wrong? > > > > Francis > > > > -Original Message- > > From: Norberto Meijome [mailto:numard...@gmail.com] > > Sent: Monday, July 06, 2009 11:01 AM > > To: Francis Yakin > > Cc: solr-user@lucene.apache.org > > Subject: Re: Is there any other way to load the index beside using "http" > > connection? > > > > On Mon, 6 Jul 2009 09:56:03 -0700 > > Francis Yakin wrote: > > > >> Norberto, > >> > >> Thanks, I think my questions is: > >> > >> >>why not generate your SQL output directly into your oracle server as a > >> file > >> > >> What type of file is this? > >> > >> > > > > a file in a format that you can then import into SOLR. 
> > > > _ > > {Beto|Norberto|Numard} Meijome > > > > "Gravity cannot be blamed for people falling in love." > > Albert Einstein > > > > I speak for myself, not my employer. Contents may be hot. Slippery when > > wet. Reading disclaimers makes you go blind. Writing them is worse. You > > have been Warned. > > > > > > -- > View this message in context: > http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/
Tagging and searching on tagged indexes.
Hi, How do we tag Solr indexes and search on those indexes? There is not much information on the wiki; all I could find is this: http://wiki.apache.org/solr/UserTagDesign Has anyone tried it (using the Solr API)? One more question: can we change the schema dynamically at runtime (while the Solr instance is up)? Regards, Raakhi.
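On the schema side, one lightweight approach people use for tags (a sketch of an assumed design, not something taken from the UserTagDesign page) is a multivalued string field, optionally with a dynamic field for per-namespace tags:

```xml
<!-- sketch: a multivalued field for tags, plus a dynamic field for
     per-namespace tags; the names and types here are assumptions -->
<field name="tag" type="string" indexed="true" stored="true"
       multiValued="true"/>
<dynamicField name="tag_*" type="string" indexed="true" stored="true"
              multiValued="true"/>
```

Searching then reduces to ordinary field queries (tag:foo) and faceting on the tag field.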
Query on updating the synonym and stopword files.
Hello All, I was looking into an issue with the synonym.txt and stopword.txt files being updated at regular intervals. In my case I update the synonym.txt and stopword.txt files whenever the synonym and stop-word dictionaries are updated. The problem I am facing is that even after a core reload and re-indexing of the documents, the newly added synonyms and stop words are not loaded. It seems the filters are not aware that these files have been updated, so the only solution left to me is to restart the whole container in which I have embedded the Solr server, which is not feasible in production. I came across the discussion with subject "synonyms.txt file updated frequently", in which Grant's view was to write new logic in SynonymFilterFactory that would take care of this issue. Is there any other possible solution, or is that the only one? Thanks in advance! Regards, Sagar Khetkade
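For reference, the core reload mentioned above is the CoreAdmin RELOAD call; a sketch follows (the host and core name are placeholders, and whether a RELOAD re-reads analyzer resource files such as synonyms.txt depends on the Solr version — this is exactly the behavior being questioned in the thread):

```shell
# CoreAdmin RELOAD for a core named "core0" (placeholder name).
# The command is printed rather than executed here; drop the 'echo'
# to run it against a live multicore Solr server.
RELOAD_URL='http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0'
echo curl "$RELOAD_URL"
```

Note the quotes around the URL: without them the shell would split the command at the `&`.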
Re: Filtering MoreLikeThis results
I have been trying to restrict MoreLikeThis results without any luck either. In addition to restricting the results, I am also looking to influence the scores, similar to the way boost query (bq) works in the DisMaxRequestHandler. I think Solr's MoreLikeThis depends on the MoreLikeThis in Lucene's contrib/queries, or at least it used to. Has anyone looked into enhancing Solr's MoreLikeThis to support bq and restricting mlt results? Bill On Mon, Jul 6, 2009 at 2:16 PM, Yao Ge wrote: > > I could not find any support from http://wiki.apache.org/solr/MoreLikeThis on > how to restrict MLT results to certain subsets. I passed along a fq > parameter and it is ignored. Since we can not incorporate the filters in > the > query itself which is used to retrieve the target for similarity > comparison, > it appears there is no way to filter MLT results. BTW. I am using Solr 1.3. > Please let me know if there is way (other than hacking the source code) to > do this. Thanks! > -- > View this message in context: > http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html > Sent from the Solr - User mailing list archive at Nabble.com. > >
Creating DataSource for DIH to Oracle Database
Has anyone had experience creating a data source for DIH to an Oracle database? Also, on the Solr side we are running WebLogic and deploy the application using WebLogic. I know that in WebLogic we can create a data source that connects to an Oracle database; has anyone had experience with this? Thanks Francis
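For reference, a minimal DIH data-config.xml pointing at Oracle via the thin JDBC driver might look like the sketch below; the JDBC URL, credentials, table and column names are all placeholders, and the Oracle driver jar has to be on Solr's classpath:

```xml
<dataConfig>
  <!-- JdbcDataSource using the Oracle thin driver; all values below
       are placeholders, not taken from this thread -->
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@dbhost:1521:ORCL"
              user="db_user" password="db_pass"/>
  <document>
    <entity name="item" query="select ID, NAME from ITEM">
      <field column="ID" name="id"/>
      <field column="NAME" name="name"/>
    </entity>
  </document>
</dataConfig>
```

Whether DIH can instead use a container-managed (WebLogic JNDI) data source is a separate question from the plain JDBC setup sketched here.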
Re: Suggestions needed: Lots of updates for tiny changes
Hi Otis, Thanks for your reply and for giving it some thought. Actually we have considered using something that lives outside of the main index... We've looked into using the ExternalFileField, but abandoned that when it became clear we'd have to use a function to use it, and that limited how we could use the field in our searches. For another more-real-time data problem we're having, we've considered writing a search handler and search component to handle it as a filter-query. This is equivalent to the "data structure outside of the main index" that you have proposed. The problem with it is that getting it to be *part of the index* is difficult. Well... any more ideas would be appreciated. But thanks for your help so far. - Daryl. On Fri, Jul 3, 2009 at 9:34 PM, Otis Gospodnetic wrote: > > I don't have a very specific suggestion, but I wonder if you could have a > data structure that lives outside of the main index and keeps only these > dates. Presumably this smaller data structure would be simpler/faster to > update, and you'd just have to remain in sync with the main index > (document-document mapping). I think ParallelReader in Lucene is a similar > approach, as it Solr's ExternalFileField. > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message > > From: Development Team > > To: solr-user@lucene.apache.org > > Sent: Friday, July 3, 2009 4:46:37 PM > > Subject: Suggestions needed: Lots of updates for tiny changes > > > > Hi everybody, > > Let's say I had an index with 10M large-ish documents, and as people > > logged into a website and viewed them the "last viewed date" was updated > to > > the current time. We index a document's last-viewed-date because we allow > > users to a) search on this last-viewed-date alongside all other > searchable > > criteria, and b) we can order results of any search by the > last-viewed-date. 
> > The problem is that in a given 5-minute period, we may have many > > thousands of updated documents (due to this simple last-viewed-date). We > > have a task that looks for changed documents, loads the full documents, > and > > then feeds them into Solr to update the index, but unfortunately reading > > these changed documents and continually feeding them to Solr is > generating * > > far* more load on our system (both Solr and the database) than any of the > > searches. In a given day, *we may have more updates to documents than we > > have total documents indexed*. (Databases don't handle this well either, > the > > contention on rows for updates slows the database down significantly.) > > How should we approach this problem? It seems like such a waste of > > resources to be doing so much work in applications/database/solr only for > > last-viewed-dates. > > > > Solutions we've looked at include: > > 1) Update only partial document. --Apparently this isn't supported > in > > Solr yet (we're using nightly Solr 1.4 builds currently). > > 2) Use "near-real-time updates". --Not supported yet. Also, the > > "freshness" of the data isn't as much as concern as the sheer volume of > > changes that we have to make here. For example, we could update Solr > > less-fequently, but then we'd just have many more documents to update. > The > > data only has to be, say, fresh to within 30 minutes. > > 3) Use a separate index for the last-viewed-date. --This won't work > > because we need to search on the last-viewed-date alongside other > criteria, > > and we use it as scoring criteria for all our searches. > > > > Any suggestions? > > > > Sincerely, > > > > Daryl. > >
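For the record, the ExternalFileField setup discussed above looks roughly like the sketch below (field and type names are placeholders); the limitation the thread describes is that the value is only reachable through function queries:

```xml
<!-- schema.xml sketch: a last-viewed value kept outside the main index;
     keyField must be the unique key field of the index -->
<fieldType name="lastViewedType" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="pfloat"/>
<field name="last_viewed" type="lastViewedType"/>
```

The values then live in an external file in the data directory that can be swapped without re-indexing, but the field is only usable via function queries, which is exactly the restriction that led the poster to abandon this approach.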
Re: Problem in parsing non-string dynamic field by using IndexReader
that works perfectly! Thanks a lot! On Mon, Jul 6, 2009 at 2:12 PM, Chris Hostetter wrote: > : OK, here is my latest code to get the IndexReader from the solr core. > : However, it still printed out the non-string fields as special chars. I > do > : use the schema file here. Please help. > > you'll want to use the IndexSchema object to get the FieldType > object for your field name. then use the FieldType to convert the values > in the index to readable values. > > Take a look at the javadocs for IndexSearcher and FieldType for more > details. > > if you look at code like the XMLResponseWriter you'll see examples of > iterating over all the fields in a Document and using those methods. > > > > -Hoss > >
Re: Problem in parsing non-string dynamic field by using IndexReader
: OK, here is my latest code to get the IndexReader from the solr core. : However, it still printed out the non-string fields as special chars. I do : use the schema file here. Please help. you'll want to use the IndexSchema object to get the FieldType object for your field name. then use the FieldType to convert the values in the index to readable values. Take a look at the javadocs for IndexSearcher and FieldType for more details. if you look at code like the XMLResponseWriter you'll see examples of iterating over all the fields in a Document and using those methods. -Hoss
Re: reindexed data on master not replicated to slave
It looks like the problem is here, or just before it, in SnapPuller.fetchLatestIndex():

    terminateAndWaitFsyncService();
    LOG.info("Conf files are not downloaded or are in sync");
    if (isSnapNeeded) {
      modifyIndexProps(tmpIndexDir.getName());
    } else {
      successfulInstall = copyIndexFiles(tmpIndexDir, indexDir);
    }
    if (successfulInstall) {
      logReplicationTimeAndConfFiles(modifiedConfFiles);
      doCommit();
    }

I debugged into this code and noticed that isSnapNeeded is true, so modifyIndexProps(tmpIndexDir.getName()) is executed; but from the function names it looks like installing the index actually happens in successfulInstall = copyIndexFiles(tmpIndexDir, indexDir). That function returns false, but the caller (doSnapPull) never checks the return value. Thanks, J On Mon, Jul 6, 2009 at 8:02 AM, solr jay wrote: > There is only one index directory: index/ > > Here is the content of index.properties > > #index properties > #Fri Jul 03 14:17:12 PDT 2009 > index=index.20090703021705 > > > Thanks, > > J > > 2009/7/5 Noble Paul നോബിള് नोब्ळ् > > BTW , how many index dirs are there in the data dir ? what is there in >> the /index.properties ? >> >> On Sat, Jul 4, 2009 at 12:15 AM, solr jay wrote: >> > >> > >> > I tried it with the latest nightly build and got the same result. >> > >> > Actually that was the symptom and it made me looking at the index >> directory. >> > The same log messages repeated again and again, never end. >> > >> > >> > >> > 2009/7/2 Noble Paul നോബിള് नोब्ळ् >> >> >> >> jay , I see updating index properties... twice >> >> >> >> >> >> >> >> this should happen rarely. in your case it should have happened only >> >> once. because you cleaned up the master only once >> >> >> >> >> >> On Fri, Jul 3, 2009 at 6:09 AM, Otis >> >> Gospodnetic wrote: >> >> > >> >> > Jay, >> >> > >> >> > You didn't mention which version of Solr you are using. It looks >> like >> >> > some trunk or nightly version. Maybe you can try the latest nightly? 
>> >> > >> >> > Otis >> >> > -- >> >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch >> >> > >> >> > >> >> > >> >> > - Original Message >> >> >> From: solr jay >> >> >> To: solr-user@lucene.apache.org >> >> >> Sent: Thursday, July 2, 2009 9:14:48 PM >> >> >> Subject: reindexed data on master not replicated to slave >> >> >> >> >> >> Hi, >> >> >> >> >> >> When index data were corrupted on master instance, I wanted to wipe >> out >> >> >> all >> >> >> the index data and re-index everything. I was hoping the newly >> created >> >> >> index >> >> >> data would be replicated to slaves, but it wasn't. >> >> >> >> >> >> Here are the steps I performed: >> >> >> >> >> >> 1. stop master >> >> >> 2. delete the directory 'index' >> >> >> 3. start master >> >> >> 4. disable replication on master >> >> >> 5. index all data from scratch >> >> >> 6. enable replication on master >> >> >> >> >> >> It seemed from log file that the slave instances discovered that new >> >> >> index >> >> >> are available and claimed that new index installed, and then trying >> to >> >> >> update index properties, but looking into the index directory on >> >> >> slaves, you >> >> >> will find that no index data files were updated or added, plus >> slaves >> >> >> keep >> >> >> trying to get new index. 
Here are some from slave's log file: >> >> >> >> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Starting replication process >> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Number of files in latest snapshot in master: 69 >> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Total time taken for download : 0 secs >> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Conf files are not downloaded or are in sync >> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller >> >> >> modifyIndexProps >> >> >> INFO: New index installed. Updating index properties... >> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Master's version: 1246488421310, generation: 9 >> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Slave's version: 1246385166228, generation: 56 >> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Starting replication process >> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Number of files in latest snapshot in master: 69 >> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller >> >> >> fetchLatestIndex >> >> >> INFO: Total time taken for download : 0 secs >> >> >> Jul 1, 2009 4:00:33 PM org.apache.sol
RE: Is there any other way to load the index beside using "http" connection?
Yes, I uploaded the CSV file that I get it from Database then I ran that cmd and I have the error. Any suggestions? Thanks Francis -Original Message- From: NitinMalik [mailto:malik.ni...@yahoo.com] Sent: Monday, July 06, 2009 11:32 AM To: solr-user@lucene.apache.org Subject: RE: Is there any other way to load the index beside using "http" connection? Hi Francis, I have experienced that update stream handler (for a xml file in my case) worked only for Solr running on the same machine. I also got same error when I tried to update the documents on a remote Solr instance. Regards Nitin Francis Yakin wrote: > > > Ok, I have a CSV file(called it test.csv) from database. > > When I tried to upload this file to solr using this cmd, I got > "stream.contentType=text/plain: No such file or directory" error > > curl > http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8 > > -bash: stream.contentType=text/plain: No such file or directory > undefined field cat > > What did I do wrong? > > Francis > > -Original Message- > From: Norberto Meijome [mailto:numard...@gmail.com] > Sent: Monday, July 06, 2009 11:01 AM > To: Francis Yakin > Cc: solr-user@lucene.apache.org > Subject: Re: Is there any other way to load the index beside using "http" > connection? > > On Mon, 6 Jul 2009 09:56:03 -0700 > Francis Yakin wrote: > >> Norberto, >> >> Thanks, I think my questions is: >> >> >>why not generate your SQL output directly into your oracle server as a >> file >> >> What type of file is this? >> >> > > a file in a format that you can then import into SOLR. > > _ > {Beto|Norberto|Numard} Meijome > > "Gravity cannot be blamed for people falling in love." > Albert Einstein > > I speak for myself, not my employer. Contents may be hot. Slippery when > wet. Reading disclaimers makes you go blind. Writing them is worse. You > have been Warned. 
> > -- View this message in context: http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Is there any other way to load the index beside using "http" connection?
Hi Francis, I have experienced that update stream handler (for a xml file in my case) worked only for Solr running on the same machine. I also got same error when I tried to update the documents on a remote Solr instance. Regards Nitin Francis Yakin wrote: > > > Ok, I have a CSV file(called it test.csv) from database. > > When I tried to upload this file to solr using this cmd, I got > "stream.contentType=text/plain: No such file or directory" error > > curl > http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8 > > -bash: stream.contentType=text/plain: No such file or directory > undefined field cat > > What did I do wrong? > > Francis > > -Original Message- > From: Norberto Meijome [mailto:numard...@gmail.com] > Sent: Monday, July 06, 2009 11:01 AM > To: Francis Yakin > Cc: solr-user@lucene.apache.org > Subject: Re: Is there any other way to load the index beside using "http" > connection? > > On Mon, 6 Jul 2009 09:56:03 -0700 > Francis Yakin wrote: > >> Norberto, >> >> Thanks, I think my questions is: >> >> >>why not generate your SQL output directly into your oracle server as a >> file >> >> What type of file is this? >> >> > > a file in a format that you can then import into SOLR. > > _ > {Beto|Norberto|Numard} Meijome > > "Gravity cannot be blamed for people falling in love." > Albert Einstein > > I speak for myself, not my employer. Contents may be hot. Slippery when > wet. Reading disclaimers makes you go blind. Writing them is worse. You > have been Warned. > > -- View this message in context: http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html Sent from the Solr - User mailing list archive at Nabble.com.
Filtering MoreLikeThis results
I could not find any support from http://wiki.apache.org/solr/MoreLikeThis on how to restrict MLT results to certain subsets. I passed along a fq parameter and it is ignored. Since we can not incorporate the filters in the query itself which is used to retrieve the target for similarity comparison, it appears there is no way to filter MLT results. BTW. I am using Solr 1.3. Please let me know if there is way (other than hacking the source code) to do this. Thanks! -- View this message in context: http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html Sent from the Solr - User mailing list archive at Nabble.com.
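One thing that may be worth trying (an assumption on my part, not confirmed in this thread): the dedicated MoreLikeThisHandler, registered at a path such as /mlt, accepts the common query parameters, so a fq may be honored there even where the MLT search component ignores it. A sketch of such a request (handler path, field names and filter are placeholders):

```shell
# Hypothetical /mlt request restricted by a filter query; the handler
# path, fields and filter below are placeholders, not from this thread.
# Printed rather than executed; drop the 'echo' to run against live Solr.
MLT_URL='http://localhost:8983/solr/mlt?q=id:12345&mlt.fl=title,body&fq=category:news'
echo curl "$MLT_URL"
```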
RE: Is there any other way to load the index beside using "http" connection?
Ok, I have a CSV file (call it test.csv) from the database.

When I tried to upload this file to Solr using this cmd, I got a "stream.contentType=text/plain: No such file or directory" error:

    curl http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8

    -bash: stream.contentType=text/plain: No such file or directory
    undefined field cat

What did I do wrong?

Francis

-Original Message- From: Norberto Meijome [mailto:numard...@gmail.com] Sent: Monday, July 06, 2009 11:01 AM To: Francis Yakin Cc: solr-user@lucene.apache.org Subject: Re: Is there any other way to load the index beside using "http" connection? On Mon, 6 Jul 2009 09:56:03 -0700 Francis Yakin wrote: > Norberto, > > Thanks, I think my questions is: > > >>why not generate your SQL output directly into your oracle server as a file > > What type of file is this? > > a file in a format that you can then import into SOLR. _ {Beto|Norberto|Numard} Meijome "Gravity cannot be blamed for people falling in love." Albert Einstein I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
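The error above is a shell-quoting problem rather than a Solr one: without quotes, bash treats the `&` in the URL as its background operator, runs curl with only the first half of the URL, and then tries to execute `stream.contentType=text/plain` as a command, which produces exactly the "No such file or directory" message quoted. A sketch of the corrected invocation (host and file path are taken from the message; the command is printed so the block can be inspected without a running Solr):

```shell
# Quote the whole URL so the shell does not split the command at '&'.
URL='http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8'

# Printed rather than executed here; drop the 'echo' to actually run it
# against a live Solr instance.
echo curl "$URL"
```

The remaining "undefined field cat" message is a separate schema issue: the CSV apparently contains a column that the schema does not define.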
Re: Is there any other way to load the index beside using "http" connection?
On Mon, 6 Jul 2009 09:56:03 -0700 Francis Yakin wrote: > Norberto, > > Thanks, I think my questions is: > > >>why not generate your SQL output directly into your oracle server as a file > > What type of file is this? > > a file in a format that you can then import into SOLR. _ {Beto|Norberto|Numard} Meijome "Gravity cannot be blamed for people falling in love." Albert Einstein I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
RE: Is there any other way to load the index beside using "http" connection?
Norberto, Thanks, I think my questions is: >>why not generate your SQL output directly into your oracle server as a file What type of file is this? Thanks again for your help. Francis -Original Message- From: Norberto Meijome [mailto:numard...@gmail.com] Sent: Monday, July 06, 2009 4:33 AM To: Francis Yakin Cc: solr-user@lucene.apache.org Subject: Re: Is there any other way to load the index beside using "http" connection? On Sun, 5 Jul 2009 10:28:16 -0700 Francis Yakin wrote: [...]> > >upload the file to your SOLR server? Then the data file is local to your SOLR > >server , you will bypass any WAN and firewall you may be having. (or some > >variation of it, sql -> SOLR server as file, etc..) > > How we upload the file? Do we need to convert the data file to Lucene Index > first? And Documentation how we do this? pick your poison... rsync? ftp? scp ? B _ {Beto|Norberto|Numard} Meijome "The freethinking of one age is the common sense of the next." Matthew Arnold I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: Replication In 1.4
And if you don't mind using the nightly Solr build, the admin page caching has been fixed in the trunk, so this won't bite you again. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Lee Theobald > To: solr-user@lucene.apache.org > Sent: Monday, July 6, 2009 11:41:16 AM > Subject: Re: Replication In 1.4 > > > The problem turned out to be two fold. Firstly, Tomcat was caching an old > version on the admin page (hiding the replication link and some other info) > and secondly I had a mistake in some configuration meaning the indexes > weren't building in the correct places. But it's sorted now and I have some > lovely replication working with my two cores. > > Thanks for your help Mark. > -- > View this message in context: > http://www.nabble.com/Replication-In-1.4-tp24356158p24357821.html > Sent from the Solr - User mailing list archive at Nabble.com.
Re: Using MMapDirectory
Is there a benefit to using MMapDirectory instead of, say, tmpfs (RAM disk) under Linux? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Mark Miller > To: solr-user@lucene.apache.org > Sent: Monday, July 6, 2009 9:28:43 AM > Subject: Re: Using MMapDirectory > > Marc Sturlese wrote: > > Hey there, > > > > For testing purpose I am trying to use lucene's MMapDirectory in a recent > > Solr nightly instance. I have read in Lucene's documentation: > > "To use MMapDirectory, invoke Java with the System property > > org.apache.lucene.FSDirectory.class set to > > org.apache.lucene.store.MMapDirectory. This will cause > > FSDirectory.getDirectory(File,boolean) to return instances of this class. " > > > > Do I have to change something in solrconfig.xml or modifying system property > > is just enough? > > Thanks in advance. > > > The system property won't do it. You will have to try the custom > DirectoryFactory in solrconfig.xml. > > Report back how it goes - havn't tried Solr with MMapDirectory before myself. > > -- - Mark > > http://www.lucidimagination.com
Re: Multiple values for custom fields provided in SOLR query
I actually don't fully understand your question. q=+fileID:111+fileID:222+fileID:333+apple looks like a valid query to me. (not sure what that space encoded as + is, though) Also not sure what you mean by: > Basically the requirement is , if fileIDs are provided as search parameter > then search should happen on the basis of fileID. Do you mean "apple" should be ignored if a term (field name:field value) is provided? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Suryasnat Das > To: solr-user@lucene.apache.org > Sent: Monday, July 6, 2009 11:31:10 AM > Subject: Multiple values for custom fields provided in SOLR query > > Hi, > I have a requirement in which i need to have multiple values in my custom > fields while forming the search query to SOLR. For example, > fileID is my custom field. I have defined the fileID in schema.xml as > name="fileID" type="string" indexed="true" stored="true" required="true" > multiValued="true"/>. > Now fileID can have multiple values like 111,222,333 etc. So will my query > be of the form, > > q=+fileID:111+fileID:222+fileID:333+apple > > where apple is my search query string. I tried with the above query but it > did not work. SOLR gave invalid query error. > Basically the requirement is , if fileIDs are provided as search parameter > then search should happen on the basis of fileID. > > Is my approach correct or i need to do something else? Please, if immediate > help is provided then that would be great. > > Regards > Suryasnat Das > Infosys.
Re: Retrieve docs with > 1 multivalue field hits
I don't recall seeing that mentioned at all... but my memory fails me all the time. Who's Solr? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: A. Steven Anderson > To: solr-user@lucene.apache.org > Sent: Monday, July 6, 2009 12:25:48 PM > Subject: Re: Retrieve docs with > 1 multivalue field hits > > I thought this would be a quick yes or no answer and/or reference to another > thread, but alas, I got no replies. > > Is it safe to assume the answer is 'no' for both Solr 1.3 and 1.4? > > > On Thu, Jul 2, 2009 at 3:48 PM, A. Steven Anderson wrote: > > > Greetings! > > > > I thought I remembered seeing a thread related to retrieving only documents > > that had more than one hit in a particular multivalue field, but I cannot > > find it now. > > > > Regardless, is this possible in Solr 1.3? Solr 1.4? > > > > -- > A. Steven Anderson > Independent Consultant
Re: Retrieve docs with > 1 multivalue field hits
I thought this would be a quick yes or no answer and/or reference to another thread, but alas, I got no replies. Is it safe to assume the answer is 'no' for both Solr 1.3 and 1.4? On Thu, Jul 2, 2009 at 3:48 PM, A. Steven Anderson wrote: > Greetings! > > I thought I remembered seeing a thread related to retrieving only documents > that had more than one hit in a particular multivalue field, but I cannot > find it now. > > Regardless, is this possible in Solr 1.3? Solr 1.4? > -- A. Steven Anderson Independent Consultant
Re: Popular keywords statistics .
Indeed that was one of the first approaches... Thanks a lot!

Michael Ludwig wrote:
> Wallace schrieb:
>> I'd like to hear what approaches are being used by users to know what
>> people are searching for in their apps.
> You could process the access log. You could write a filter servlet logging
> the relevant part of the query string to a dedicated location.
> Michael Ludwig
grouping and sorting by facet?
Sorry if I am missing something obvious here. Is there a way to group and sort by facet count? I have a large set of images, each of which is part of a different "collection." I am performing a faceted search:

    /solr/select/?q=my+term&max=30&version=2.2&rows=30&start=0&facet=true&facet.field=collection&facet.sort=true

I would like to group the results by collection count, so that all of the images in the collection with the most image "hits" come first. Not sure how to do that. --Peter Keane
Re: Replication In 1.4
The problem turned out to be two fold. Firstly, Tomcat was caching an old version on the admin page (hiding the replication link and some other info) and secondly I had a mistake in some configuration meaning the indexes weren't building in the correct places. But it's sorted now and I have some lovely replication working with my two cores. Thanks for your help Mark. -- View this message in context: http://www.nabble.com/Replication-In-1.4-tp24356158p24357821.html Sent from the Solr - User mailing list archive at Nabble.com.
Multiple values for custom fields provided in SOLR query
Hi, I have a requirement in which I need to have multiple values in my custom fields while forming the search query to SOLR. For example, fileID is my custom field. I have defined fileID in schema.xml as <field name="fileID" type="string" indexed="true" stored="true" required="true" multiValued="true"/>. Now fileID can have multiple values like 111, 222, 333 etc. So will my query be of the form

    q=+fileID:111+fileID:222+fileID:333+apple

where apple is my search query string? I tried the above query but it did not work; SOLR gave an invalid query error. Basically the requirement is: if fileIDs are provided as a search parameter, then the search should happen on the basis of fileID. Is my approach correct or do I need to do something else? Please, if immediate help is provided then that would be great. Regards Suryasnat Das Infosys.
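For what it's worth, one common formulation for "match any of these IDs" (a sketch under the assumption that a doc carries only some of the IDs, so they must be ORed, not all required) puts the IDs in a single clause inside a filter query:

```shell
# Sketch: search for "apple" restricted to any of the listed fileIDs.
# Host, port and core path are placeholders; '+' in the URL encodes spaces.
# Printed rather than executed; drop the 'echo' to run against live Solr.
QUERY_URL='http://localhost:8983/solr/select?q=apple&fq=fileID:(111+OR+222+OR+333)'
echo curl "$QUERY_URL"
```

Using fq also keeps the relevance scoring driven purely by the q term while the fileID restriction is applied as a filter.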
Re: Replication In 1.4
On Mon, Jul 6, 2009 at 10:43 AM, Lee Theobald wrote: > > I think it's best if I just clear down and start again. I've got something > wrong somewhere. I've noticed that what I consider to be the slave does > have a "Replication" link in the admin area. My master doesn't. They both > seem to be reporting the same version from the info page > (1.3.0.2009.06.30.08.05.44, nightly exported - yonik - 2009-06-30 08:05:44) > but I'm guessing I may have something pointing to an old 1.3 solr war/jar. > -- > View this message in context: > http://www.nabble.com/Replication-In-1.4-tp24356158p24356750.html > Sent from the Solr - User mailing list archive at Nabble.com. > > You should see the link based on the replication RequestHandler being detected. Is it still commented out on the Master solrconfig? -- -- - Mark http://www.lucidimagination.com
Re: Replication In 1.4
I think it's best if I just clear down and start again. I've got something wrong somewhere. I've noticed that what I consider to be the slave does have a "Replication" link in the admin area. My master doesn't. They both seem to be reporting the same version from the info page (1.3.0.2009.06.30.08.05.44, nightly exported - yonik - 2009-06-30 08:05:44) but I'm guessing I may have something pointing to an old 1.3 solr war/jar. -- View this message in context: http://www.nabble.com/Replication-In-1.4-tp24356158p24356750.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Replication In 1.4
Cheers for the info Mark. That looks pretty similar to what I have. Slave is almost the same, master was slightly different but I don't think incorrect: optimize optimize solrconfig_slave.xml:solrconfig.xml,schema.xml,stopwords.txt,elevate.xml I'll keep looking as I'm bound to have missed something but I can't quite see what it is yet. Lee, markrmiller wrote: > > Have you uncommented the proper RequestHandlers in solrconfig.xml? > > > > > > > > -- > - Mark > > http://www.lucidimagination.com > > > > > -- View this message in context: http://www.nabble.com/Replication-In-1.4-tp24356158p24356526.html
Re: Replication In 1.4
Lee Theobald wrote: Hi all, I've been trying to get the replication working in my test version of Solr (1.4) but I don't think I've got it right. There doesn't seem to be any errors but it doesn't seem to be working either. I'm going to the distribution admin page [1] and just getting a bit of text telling me "No distribution info present". I've tried with multiple cores (what I want but perhaps isn't possible looking at the raised bugs) and a single core, both give the same results. Going to the replication status page [2] gives me an OK. I'm thinking I've missed a vital bit of configuration. Is there anything on this page [3] that is missing but I need? For example, listeners in the solrconfig.xml? If so, could someone please give me an example. Cheers for any input, Lee [1] http://localhost:8080/solr/admin/distributiondump.jsp [2] http://localhost:8080/solr/replication [3] http://wiki.apache.org/solr/SolrReplication Have you uncommented the proper RequestHandlers in solrconfig.xml? -- - Mark http://www.lucidimagination.com
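For reference, the master/slave handler pair described on the SolrReplication wiki page linked above looks roughly like the following. The masterUrl host/port and the confFiles list are placeholders to adapt, not values taken from this thread:

```xml
<!-- master side, in solrconfig.xml: -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- slave side, in solrconfig.xml: -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master_host:8080/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

With both handlers uncommented, the admin "Replication" link should appear on master and slave alike.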
Faceting with MoreLikeThis
Does Solr support faceting on MoreLikeThis search results? -- View this message in context: http://www.nabble.com/Faceting-with-MoreLikeThis-tp24356166p24356166.html
Replication In 1.4
Hi all, I've been trying to get the replication working in my test version of Solr (1.4) but I don't think I've got it right. There doesn't seem to be any errors but it doesn't seem to be working either. I'm going to the distribution admin page [1] and just getting a bit of text telling me "No distribution info present". I've tried with multiple cores (what I want but perhaps isn't possible looking at the raised bugs) and a single core, both give the same results. Going to the replication status page [2] gives me an OK. I'm thinking I've missed a vital bit of configuration. Is there anything on this page [3] that is missing but I need? For example, listeners in the solrconfig.xml? If so, could someone please give me an example. Cheers for any input, Lee [1] http://localhost:8080/solr/admin/distributiondump.jsp [2] http://localhost:8080/solr/replication [3] http://wiki.apache.org/solr/SolrReplication -- View this message in context: http://www.nabble.com/Replication-In-1.4-tp24356158p24356158.html
Re: Using MMapDirectory
Marc Sturlese wrote: Hey there, For testing purpose I am trying to use lucene's MMapDirectory in a recent Solr nightly instance. I have read in Lucene's documentation: "To use MMapDirectory, invoke Java with the System property org.apache.lucene.FSDirectory.class set to org.apache.lucene.store.MMapDirectory. This will cause FSDirectory.getDirectory(File,boolean) to return instances of this class. " Do I have to change something in solrconfig.xml or modifying system property is just enough? Thanks in advance. The system property won't do it. You will have to try the custom DirectoryFactory in solrconfig.xml. Report back how it goes - havn't tried Solr with MMapDirectory before myself. -- - Mark http://www.lucidimagination.com
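As Mark suggests, in Solr the directory implementation is selected in solrconfig.xml rather than via the Lucene system property. A sketch, assuming a custom factory class (the com.example.MMapDirectoryFactory name here is hypothetical) is available on the classpath:

```xml
<!-- com.example.MMapDirectoryFactory is a hypothetical custom
     DirectoryFactory implementation that returns MMapDirectory instances -->
<directoryFactory name="DirectoryFactory" class="com.example.MMapDirectoryFactory" />
```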
Re: Is there any other way to load the index beside using "http" connection?
Yes exactly just being friendly sharing a working routine. Took me some hours to figure out DIH myself at the time. //Marcus On Mon, Jul 6, 2009 at 1:32 PM, Norberto Meijome wrote: > On Sun, 5 Jul 2009 21:36:35 +0200 > Marcus Herou wrote: > > > Sharing some of our exports from DB to solr. Note: many of the statements > > below might not work due to clip-clip. > > thx Marcus - but that's a DIH config right? :) > b > _ > {Beto|Norberto|Numard} Meijome > > "I respect faith, but doubt is what gives you an education." > Wilson Mizner > > I speak for myself, not my employer. Contents may be hot. Slippery when > wet. Reading disclaimers makes you go blind. Writing them is worse. You have > been Warned. > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/
Re: Is there any other way to load the index beside using "http" connection?
On Sun, 5 Jul 2009 10:28:16 -0700 Francis Yakin wrote: [...]> > >upload the file to your SOLR server? Then the data file is local to your SOLR > >server , you will bypass any WAN and firewall you may be having. (or some > >variation of it, sql -> SOLR server as file, etc..) > > How we upload the file? Do we need to convert the data file to Lucene Index > first? And Documentation how we do this? pick your poison... rsync? ftp? scp ? B _ {Beto|Norberto|Numard} Meijome "The freethinking of one age is the common sense of the next." Matthew Arnold I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: Is there any other way to load the index beside using "http" connection?
On Sun, 5 Jul 2009 21:36:35 +0200 Marcus Herou wrote: > Sharing some of our exports from DB to solr. Note: many of the statements > below might not work due to clip-clip. thx Marcus - but that's a DIH config right? :) b _ {Beto|Norberto|Numard} Meijome "I respect faith, but doubt is what gives you an education." Wilson Mizner I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Using MMapDirectory
Hey there, For testing purpose I am trying to use lucene's MMapDirectory in a recent Solr nightly instance. I have read in Lucene's documentation: "To use MMapDirectory, invoke Java with the System property org.apache.lucene.FSDirectory.class set to org.apache.lucene.store.MMapDirectory. This will cause FSDirectory.getDirectory(File,boolean) to return instances of this class. " Do I have to change something in solrconfig.xml or modifying system property is just enough? Thanks in advance. -- View this message in context: http://www.nabble.com/Using-MMapDirectory-tp24353063p24353063.html
Re: Index partitioning with solr multiple core feature
Shalin, First of all, each entity's data is unrelated, so it makes sense to use the Solr core concept as per your suggestion. But since you are saying that putting each entity's index on the same box will contend for CPU, does it make sense to add boxes based on the number of entities, considering I will also have to add replication boxes, amounting to a huge cost? This is what I am thinking after your suggestion: have separate boxes for each entity, and then inside each entity do some partitioning based on round robin or some other strategy. With this, if I am searching on any entity's data, I will just need to reach the box for that entity. Now, since I am also partitioning inside an entity, how will I search so that I get merged results from each partition on a single entity box? If I do this type of partitioning, which functionality of Solr should I use... is it http://wiki.apache.org/solr/IndexPartitioning ? My actual concern is performance, irrespective of the implementation design, while also keeping good scaling logic for the future. On Mon, Jul 6, 2009 at 3:16 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Mon, Jul 6, 2009 at 3:05 PM, Sumit Aggarwal >wrote: > > > Hi Shalin, > > Yes i want to achieve a logical separation of indexes for performance > > reason > > also else index size will keep on growing as i have 8 different entities. > I > > am already partitioning all these entities to different servers also on > > which i will be doing search based on distributed search by solr using > > shards and collecting merged results from 3 different servers. You > > mentioned > > i wont achieve putting all partitions on the same box , why is that so? > > > This is because each shard will compete for CPU and disk if you put them on > the same box. Logical separation and partitioning for performance are two > different things. You should partition if one Solr instance is not able to > hold the complete index or if it is not giving you the desired performance.
> You can use multiple cores if the data is unrelated and you wouldn't need > to > search on all of them. > > In your case, the primary reason is performance, so it makes sense to put > each shard on a separate box. > > > > While reading solr core it says solr core is used for different > > applications > > only My search on different entities is also a type of different > > applications theoritically > > > > Does solr provides any good support for index partitioning. > > > No. Partitioning is not done by Solr. So you should decide your > partitioning > scheme: round robin, fixed hashing, random etc. Once you have partitioned > your data, a distributed search helps you search over all the shards in one > go. > > -- > Regards, > Shalin Shekhar Mangar. > -- Cheers Sumit
Re: Popular keywords statistics .
Wallace schrieb: I'd like to hear what approaches are being used by users to know what people is searching for in their apps. You could process the access log. You could write a filter servlet logging the relevant part of the query string to a dedicated location. Michael Ludwig
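The access-log approach can be sketched as a short script that pulls the q= parameter out of /select requests and tallies it. Everything below (log format, paths, sample lines) is illustrative, not taken from any real deployment:

```python
import re
from collections import Counter
from urllib.parse import parse_qs

def popular_keywords(log_lines):
    """Tally the q= parameter of /select requests found in access-log lines."""
    counts = Counter()
    for line in log_lines:
        m = re.search(r'GET\s+\S*/select\?(\S+)', line)
        if not m:
            continue
        for q in parse_qs(m.group(1)).get("q", []):  # parse_qs url-decodes
            counts[q.strip().lower()] += 1
    return counts

# Hypothetical common-log-format lines:
sample = [
    '10.0.0.1 - - [06/Jul/2009] "GET /solr/select?q=apple&wt=json HTTP/1.1" 200',
    '10.0.0.2 - - [06/Jul/2009] "GET /solr/select?q=apple HTTP/1.1" 200',
    '10.0.0.1 - - [06/Jul/2009] "GET /solr/select?q=solr HTTP/1.1" 200',
]
print(popular_keywords(sample).most_common(2))  # → [('apple', 2), ('solr', 1)]
```

A filter servlet would log the same query-string fragment at request time instead of parsing it back out of the access log.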
Re: Index partitioning with solr multiple core feature
On Mon, Jul 6, 2009 at 3:05 PM, Sumit Aggarwal wrote: > Hi Shalin, > Yes i want to achieve a logical separation of indexes for performance > reason > also else index size will keep on growing as i have 8 different entities. I > am already partitioning all these entities to different servers also on > which i will be doing search based on distributed search by solr using > shards and collecting merged results from 3 different servers. You > mentioned > i wont achieve putting all partitions on the same box , why is that so? This is because each shard will compete for CPU and disk if you put them on the same box. Logical separation and partitioning for performance are two different things. You should partition if one Solr instance is not able to hold the complete index or if it is not giving you the desired performance. You can use multiple cores if the data is unrelated and you wouldn't need to search on all of them. In your case, the primary reason is performance, so it makes sense to put each shard on a separate box. > While reading solr core it says solr core is used for different > applications > only My search on different entities is also a type of different > applications theoritically > > Does solr provides any good support for index partitioning. No. Partitioning is not done by Solr. So you should decide your partitioning scheme: round robin, fixed hashing, random etc. Once you have partitioned your data, a distributed search helps you search over all the shards in one go. -- Regards, Shalin Shekhar Mangar.
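The distributed search Shalin describes is driven entirely by the shards request parameter. A sketch of building such a request; the host names and core path are hypothetical placeholders for the three partition servers mentioned in the thread:

```python
from urllib.parse import urlencode

# Hypothetical partition hosts, each holding one shard of the entity's index.
shards = ["server1:8983/solr/User",
          "server2:8983/solr/User",
          "server3:8983/solr/User"]

# Any one shard can receive the query; the shards parameter makes it fan the
# request out to every partition and merge the results before responding.
params = urlencode({"q": "name:sumit", "shards": ",".join(shards)})
url = "http://" + shards[0] + "/select?" + params
print(url)
```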
Re: Index partitioning with solr multiple core feature
Shalin, at a time i will be doing search only on one entity... Also data will be indexed only to corresponding entity. Thanks, Sumit On Mon, Jul 6, 2009 at 3:05 PM, Sumit Aggarwal wrote: > Hi Shalin, > Yes i want to achieve a logical separation of indexes for performance > reason also else index size will keep on growing as i have 8 different > entities. I am already partitioning all these entities to different servers > also on which i will be doing search based on distributed search by solr > using shards and collecting merged results from 3 different servers. You > mentioned i wont achieve putting all partitions on the same box , why is > that so? > > While reading solr core it says solr core is used for different > applications only My search on different entities is also a type of > different applications theoritically > > Does solr provides any good support for index partitioning. > Thanks, > Sumit > > On Mon, Jul 6, 2009 at 2:43 PM, Shalin Shekhar Mangar < > shalinman...@gmail.com> wrote: > >> On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal > >wrote: >> >> > I was trying to implement entity based partitioning using multiple core >> > feature. >> > So my solr.xml is like : >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > Now using http://localhost:8983/solr/User/ or >> > http://localhost:8983/solr/Group/ i am able to reach seperate partition >> > for >> > entity based search. Now question arises for entity based indexing. I >> was >> > reading http://wiki.apache.org/solr/IndexPartitioning document but it >> does >> > not help much How can i do entity based indexing of document.. >> > I don't want to make http url based on entity for indexing purpose. >> >> >> Why not? You know which document belongs to which "entity" so you can >> select >> which core to post that document to. 
>> >> >> >> > Another requirement: Since i have entity based partitioning and each >> entity >> > can have total index size more than 10GB so i need another partitioning >> > inside entity like based on no of document in an index inside entity. >> How >> > can i do this? Unfortunately solr wiki does not says much on >> partitioning.. >> > >> >> What are you trying to achieve by partitioning your data? Is it just for >> logical separation? If it is for performance reasons, I don't think you'll >> gain much by putting all partitions on the same box. >> >> -- >> Regards, >> Shalin Shekhar Mangar. >> > > > > -- > Cheers > Sumit > 9818621804 > -- Cheers Sumit 9818621804
Re: Index partitioning with solr multiple core feature
Hi Shalin, Yes i want to achieve a logical separation of indexes for performance reason also else index size will keep on growing as i have 8 different entities. I am already partitioning all these entities to different servers also on which i will be doing search based on distributed search by solr using shards and collecting merged results from 3 different servers. You mentioned i wont achieve putting all partitions on the same box , why is that so? While reading solr core it says solr core is used for different applications only My search on different entities is also a type of different applications theoritically Does solr provides any good support for index partitioning. Thanks, Sumit On Mon, Jul 6, 2009 at 2:43 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal >wrote: > > > I was trying to implement entity based partitioning using multiple core > > feature. > > So my solr.xml is like : > > > > > > > > > > > > > > > > > > > > > > > > Now using http://localhost:8983/solr/User/ or > > http://localhost:8983/solr/Group/ i am able to reach seperate partition > > for > > entity based search. Now question arises for entity based indexing. I was > > reading http://wiki.apache.org/solr/IndexPartitioning document but it > does > > not help much How can i do entity based indexing of document.. > > I don't want to make http url based on entity for indexing purpose. > > > Why not? You know which document belongs to which "entity" so you can > select > which core to post that document to. > > > > > Another requirement: Since i have entity based partitioning and each > entity > > can have total index size more than 10GB so i need another partitioning > > inside entity like based on no of document in an index inside entity. How > > can i do this? Unfortunately solr wiki does not says much on > partitioning.. > > > > What are you trying to achieve by partitioning your data? Is it just for > logical separation? 
If it is for performance reasons, I don't think you'll > gain much by putting all partitions on the same box. > > -- > Regards, > Shalin Shekhar Mangar. > -- Cheers Sumit 9818621804
Re: Index partitioning with solr multiple core feature
On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal wrote: > I was trying to implement entity based partitioning using multiple core > feature. > So my solr.xml is like : > > > > > > > > > > > > Now using http://localhost:8983/solr/User/ or > http://localhost:8983/solr/Group/ i am able to reach seperate partition > for > entity based search. Now question arises for entity based indexing. I was > reading http://wiki.apache.org/solr/IndexPartitioning document but it does > not help much How can i do entity based indexing of document.. > I don't want to make http url based on entity for indexing purpose. Why not? You know which document belongs to which "entity" so you can select which core to post that document to. > Another requirement: Since i have entity based partitioning and each entity > can have total index size more than 10GB so i need another partitioning > inside entity like based on no of document in an index inside entity. How > can i do this? Unfortunately solr wiki does not says much on partitioning.. > What are you trying to achieve by partitioning your data? Is it just for logical separation? If it is for performance reasons, I don't think you'll gain much by putting all partitions on the same box. -- Regards, Shalin Shekhar Mangar.
Re: Index partitioning with solr multiple core feature
I forgot to mention i already have a partitioning to 3 different servers for each entity based on some unique int value. On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal wrote: > I was trying to implement entity based partitioning using multiple core > feature. > So my solr.xml is like : > > > > > > > > > > > > Now using http://localhost:8983/solr/User/ or > http://localhost:8983/solr/Group/ i am able to reach seperate partition > for entity based search. Now question arises for entity based indexing. I > was reading http://wiki.apache.org/solr/IndexPartitioning document but it > does not help much How can i do entity based indexing of document.. > I don't want to make http url based on entity for indexing purpose. Kindly > help me in this? > > Another requirement: Since i have entity based partitioning and each entity > can have total index size more than 10GB so i need another partitioning > inside entity like based on no of document in an index inside entity. How > can i do this? Unfortunately solr wiki does not says much on partitioning.. > -- > Cheers > Sumit > -- Cheers Sumit 9818621804
Index partitioning with solr multiple core feature
I was trying to implement entity-based partitioning using the multiple core feature. So my solr.xml is like : Now, using http://localhost:8983/solr/User/ or http://localhost:8983/solr/Group/, I am able to reach a separate partition for entity-based search. Now the question arises for entity-based indexing. I was reading the http://wiki.apache.org/solr/IndexPartitioning document, but it does not help much. How can I do entity-based indexing of a document? I don't want to build the HTTP URL based on the entity for indexing purposes. Kindly help me with this. Another requirement: since I have entity-based partitioning and each entity can have a total index size of more than 10GB, I need another level of partitioning inside each entity, e.g. based on the number of documents in the entity's index. How can I do this? Unfortunately, the Solr wiki does not say much on partitioning. -- Cheers Sumit
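A minimal multicore solr.xml consistent with the /solr/User and /solr/Group URLs in the message would look something like this. The core names come from those URLs; the persistent flag and instanceDir values are assumptions:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="User" instanceDir="User" />
    <core name="Group" instanceDir="Group" />
  </cores>
</solr>
```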