Query 2 Cores
Hey all, I have 2 cores that were indexed from files using Tika. I would like to run one query against both at once, as I will be searching the attr_content field. If I test each core separately I get 1 and 17 results, but with shards I just get 17 results. Here is my example query: http://localhost:8983/solr/core1/select?shards=localhost:8983/solr/core2&q=attr_content:test Is this the correct way to query 2 cores at once? Hope you can help. Lee
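A note on the query above: in Solr's distributed search, the shards parameter lists every core that should be searched, and the core that receives the request is not added automatically, which would fit seeing only core2's 17 hits. A minimal sketch of building such a URL, assuming the host and core names from the post; the helper name sharded_select_url is hypothetical.

```python
from urllib.parse import urlencode


def sharded_select_url(host, cores, query):
    """Build a /select URL on the first core that fans out to every core in `cores`."""
    # Each shard is written as host/solr/corename, comma-separated, without "http://".
    shards = ",".join(f"{host}/solr/{core}" for core in cores)
    # Keep ":", "/" and "," readable; urlencode inserts the "&" between parameters.
    params = urlencode({"shards": shards, "q": query}, safe=":/,")
    return f"http://{host}/solr/{cores[0]}/select?{params}"


url = sharded_select_url("localhost:8983", ["core1", "core2"], "attr_content:test")
print(url)
```

Distributed search also expects the unique key to be unique across the listed cores, so documents sharing an id in both cores may be counted once.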
Fwd: Query 2 Cores
Any ideas about my question below? Lee Begin forwarded message: From: Lee Smith l...@weblee.co.uk Date: 19 April 2010 11:19:45 GMT+01:00 To: solr-user@lucene.apache.org Subject: Query 2 Cores Reply-To: solr-user@lucene.apache.org Hey all, I have 2 cores that were indexed from files using Tika. I would like to run one query against both at once, as I will be searching the attr_content field. If I test each core separately I get 1 and 17 results, but with shards I just get 17 results. Here is my example query: http://localhost:8983/solr/core1/select?shards=localhost:8983/solr/core2&q=attr_content:test Is this the correct way to query 2 cores at once? Hope you can help. Lee
Delete id from a specific core
Hey all, from the docs, deleting from an index is pretty simple: java -Ddata=args -Dcommit=no -jar post.jar "<delete><id>SP2514N</id></delete>" How about from a specific core? Say I wanted to delete id=12344 from core 1. Hope this makes sense and is easy to answer! Regards, Lee
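In a multicore setup the core name is part of the URL path (for example http://localhost:8983/solr/core1/update, as used elsewhere in this archive), so the same delete message can be posted to that core's update handler. A minimal sketch under that assumption; the core name core1 and the helper delete_by_id_request are hypothetical, and the request is only built here, not sent.

```python
from urllib.request import Request
from xml.sax.saxutils import escape


def delete_by_id_request(base_url, core, doc_id):
    """Build (but do not send) a POST that deletes one document from one core."""
    # escape() protects ids containing characters such as "&" or "<".
    body = f"<delete><id>{escape(str(doc_id))}</id></delete>"
    return Request(
        f"{base_url}/{core}/update",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = delete_by_id_request("http://localhost:8983/solr", "core1", 12344)
print(req.full_url)
print(req.data.decode("utf-8"))
# To send it: urllib.request.urlopen(req). The delete becomes visible after a commit.
```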
Fwd: Highlighting Results
Can anyone help? Begin forwarded message: From: Lee Smith l...@weblee.co.uk Date: 11 March 2010 17:25:59 GMT To: solr-user@lucene.apache.org Subject: Highlighting Results Reply-To: solr-user@lucene.apache.org Hi all, I'm not sure where I'm going wrong, but highlighting does not seem to work for me. I have indexed around 5000 PDF documents, which went well. Running normal queries against attr_content works well. Adding any hl parameters does not seem to make a bit of difference. Here is an example query: ?q=attr_content:Some Name&hl=true&hl.fl=attr_content&hl.fragsize=50&rows=5 If I am correct, fragsize should limit the returned content for attr_content, and the keywords found in attr_content should be surrounded with <em> tags? The attr_content field is stored, if this helps. Hope someone can point me in the right direction. Thank you if you can!
Content Highlighting
With the highlighting options, will Solr highlight the found text the way Google search does? I can't seem to get this working. Hope someone can advise.
Highlighting Results
Hi all, I'm not sure where I'm going wrong, but highlighting does not seem to work for me. I have indexed around 5000 PDF documents, which went well. Running normal queries against attr_content works well. Adding any hl parameters does not seem to make a bit of difference. Here is an example query: ?q=attr_content:Some Name&hl=true&hl.fl=attr_content&hl.fragsize=50&rows=5 If I am correct, fragsize should limit the returned content for attr_content, and the keywords found in attr_content should be surrounded with <em> tags? The attr_content field is stored, if this helps. Hope someone can point me in the right direction. Thank you if you can!
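The example query contains a space, so it needs URL encoding, and the parameters need "&" separators. A minimal sketch of assembling it, assuming the intent is a phrase match on "Some Name" (the double quotes around the phrase are that assumption); the helper name highlight_query is hypothetical.

```python
from urllib.parse import urlencode


def highlight_query(field, phrase, fragsize=50, rows=5):
    """Return the query string for a phrase search with highlighting on `field`."""
    params = {
        "q": f'{field}:"{phrase}"',  # quoted so both words are searched in `field`
        "hl": "true",                # switch highlighting on
        "hl.fl": field,              # field to build snippets from
        "hl.fragsize": fragsize,     # approximate snippet length in characters
        "rows": rows,
    }
    return "?" + urlencode(params)


print(highlight_query("attr_content", "Some Name"))
```

Without the quotes, a query such as attr_content:Some Name would search only the first word in attr_content and the second word in the default search field.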
Re: Highlighting
Yes, the content is stored, and I get the same results when adding that parameter. Still not highlighting the content :-( Any other ideas? Lee On 9 Mar 2010, at 23:14, Ahmet Arslan wrote: Yes it shows when I run the debug: <lst name="org.apache.solr.handler.component.HighlightComponent"> <double name="time">0.0</double> </lst> Any other ideas? Is the field attr_content stored? Are you querying this field? What happens when you append hl.maxAnalyzedChars=-1 to your search URL?
Fwd: Highlighting
I am getting results with the query, no problem. But from what I understand, it should wrap <em> tags around the text in the result. So if I search for, e.g., Andrew, the returned content would contain the word as <em>Andrew</em>, with hl.fl=attr_content. Thank you for your help. Begin forwarded message: From: Joe Calderon calderon@gmail.com Date: 10 March 2010 15:37:35 GMT To: solr-user@lucene.apache.org Subject: Re: Highlighting Reply-To: solr-user@lucene.apache.org Just to make sure we're on the same page: you're saying that the highlight section of the response is empty, right? The results section is never highlighted; a separate section contains the highlighted fields specified in hl.fl= On Wed, Mar 10, 2010 at 5:23 AM, Ahmet Arslan iori...@yahoo.com wrote: Yes, the content is stored, and I get the same results when adding that parameter. Still not highlighting the content :-( Any other ideas? What is the field type of attr_content? And what is your query? Are you running your query on another field and then requesting snippets from attr_content? q=attr_content:somequery&hl=true&hl.fl=attr_content&hl.maxAnalyzedChars=-1 should return highlighting.
Re: Highlighting
I can't see why you would put highlighting in a separate section. Isn't the idea to highlight the content found in a search result, like Google would do? Lee On 10 Mar 2010, at 15:52, Joe Calderon wrote: No, that's not the case. See this example response in JSON format: { "responseHeader":{ "status":0, "QTime":0, "params":{ "indent":"on", "q":"title_edge:fami", "hl.fl":"title_edge", "wt":"json", "hl":"on", "rows":"1"}}, "response":{"numFound":18,"start":0,"docs":[ { "title_id":"1581", "title_edge":"Family", "num":4}] }, "highlighting":{ "1581":{ "title_edge":["<em>Fami</em>ly"]}}} See how the highlight info is separate from the results? On Wed, Mar 10, 2010 at 7:44 AM, Lee Smith l...@weblee.co.uk wrote: I am getting results with the query, no problem. But from what I understand, it should wrap <em> tags around the text in the result. So if I search for, e.g., Andrew, the returned content would contain the word as <em>Andrew</em>, with hl.fl=attr_content. Thank you for your help. Begin forwarded message: From: Joe Calderon calderon@gmail.com Date: 10 March 2010 15:37:35 GMT To: solr-user@lucene.apache.org Subject: Re: Highlighting Reply-To: solr-user@lucene.apache.org Just to make sure we're on the same page: you're saying that the highlight section of the response is empty, right? The results section is never highlighted; a separate section contains the highlighted fields specified in hl.fl= On Wed, Mar 10, 2010 at 5:23 AM, Ahmet Arslan iori...@yahoo.com wrote: Yes, the content is stored, and I get the same results when adding that parameter. Still not highlighting the content :-( Any other ideas? What is the field type of attr_content? And what is your query? Are you running your query on another field and then requesting snippets from attr_content? q=attr_content:somequery&hl=true&hl.fl=attr_content&hl.maxAnalyzedChars=-1 should return highlighting.
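Since the snippets come back in their own highlighting section keyed by document id, the client has to join them to the result documents itself. A minimal sketch of that join, using a response shaped like the JSON example in this thread; the helper attach_highlights and the "_snippets" key are hypothetical names, and title_id is assumed to be the unique key.

```python
import json

# A response shaped like the example quoted in the thread.
raw = '''{
  "responseHeader": {"status": 0, "QTime": 0},
  "response": {"numFound": 18, "start": 0, "docs": [
    {"title_id": "1581", "title_edge": "Family", "num": 4}]},
  "highlighting": {"1581": {"title_edge": ["<em>Fami</em>ly"]}}
}'''


def attach_highlights(response, id_field):
    """Return the result docs, each with its highlight snippets attached."""
    highlights = response.get("highlighting", {})
    merged = []
    for doc in response["response"]["docs"]:
        # The highlighting section is keyed by the unique key as a string.
        snippets = highlights.get(str(doc[id_field]), {})
        merged.append({**doc, "_snippets": snippets})
    return merged


docs = attach_highlights(json.loads(raw), "title_id")
for doc in docs:
    print(doc["title_id"], doc["_snippets"].get("title_edge", []))
```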
Highlighting
Hey all, I have indexed a whole bunch of documents and now I want to search against them. My search is going great, all but highlighting. I have these parameters set: hl=true, hl.snippets=2, hl.fl=attr_content, hl.fragsize=100. Everything works apart from the matched text not being surrounded with an <em> tag. Am I missing a setting? Lee
Re: Highlighting
Yes, it shows when I run the debug: <lst name="org.apache.solr.handler.component.HighlightComponent"> <double name="time">0.0</double> </lst> Any other ideas? On 9 Mar 2010, at 21:06, Joe Calderon wrote: Did you enable the highlighting component in solrconfig.xml? Try setting debugQuery=true to see if the highlighting component is even being called... On Tue, Mar 9, 2010 at 12:23 PM, Lee Smith l...@weblee.co.uk wrote: Hey all, I have indexed a whole bunch of documents and now I want to search against them. My search is going great, all but highlighting. I have these parameters set: hl=true, hl.snippets=2, hl.fl=attr_content, hl.fragsize=100. Everything works apart from the matched text not being surrounded with an <em> tag. Am I missing a setting? Lee
Re: Import database
I had the same issue with Jetty. Adding extra memory resolved it, i.e.: java -Xms512M -Xmx1024M -jar start.jar It's in the manual, but I can't seem to find the link. On 8 Mar 2010, at 14:09, Quan Nguyen Anh wrote: Hi, I have started using Solr. I had a problem when I insert a database with 2 million rows. The server encounters this error: java.lang.OutOfMemoryError: Java heap space I searched around but can't find the solution. Any help regarding this will be appreciated. Thanks in advance
Error on startup
Hi all. I have shut down Solr, removed the index so I can start over, then relaunched. I am getting this error: SEVERE: REFCOUNT ERROR: unreferenced org.apache.solr.solrc...@14db38a4 (core1) has a reference count of 1 Any idea what causes this? Hope you can advise. Lee
Formatting Results
Hey all, I am indexing around 10,000 documents with Solr Cell, which has gone superbly. I can of course search the content like the example given: http://localhost:8983/solr/select?q=attr_content:tutorial But what I would like is for Solr to return the document trimmed to a set number of words, with the matched content highlighted, a lot like Google does. How can I achieve such a result? I know I can use highlighting but can't seem to get it to work. Hope someone can put me on the right track. Thank you
Re: Formatting Results
Thanks Marc, I'll have a good look at that part now. And I managed to get it started again :-). Thank you again, Lee On 3 Mar 2010, at 18:52, Marc Sturlese wrote: I'll give you an example of how to configure your default SearchHandler to do highlighting, but I strongly recommend you check the wiki properly. Everything is really well explained there: http://wiki.apache.org/solr/HighlightingParameters <str name="hl">true</str> <str name="hl.fl">attr_content</str> <str name="f.attr_content.hl.fragsize">200</str> <str name="f.attr_content.hl.snippets">1</str> <str name="f.attr_content.hl.alternateField">f.attr_content</str> <str name="f.attr_content.hl.maxAlternateFieldLength">300</str> Lee Smith-6 wrote: Hey all, I am indexing around 10,000 documents with Solr Cell, which has gone superbly. I can of course search the content like the example given: http://localhost:8983/solr/select?q=attr_content:tutorial But what I would like is for Solr to return the document trimmed to a set number of words, with the matched content highlighted, a lot like Google does. How can I achieve such a result? I know I can use highlighting but can't seem to get it to work. Hope someone can put me on the right track. Thank you
Optimize Index
Hi all, is there a POST request method to clean the index? I have removed my index folder and restarted Solr, and it is still showing documents in the stats. I have run this POST request: http://localhost:8983/solr/core1/update?optimize=true I get no errors, but the stats still show my 4 documents. Hope you can advise. Thanks
Re: Optimize Index
Ha, now I feel stupid!! I had a misspelling in the data path and you were correct. Can I ask, Erick, was the command correct though? Thank you, Lee On 2 Mar 2010, at 13:54, Erick Erickson wrote: My very first guess would be that you're removing an index that isn't the one your SOLR configuration points at. Second guess would be that your browser is caching the results of your first query and not going to SOLR at all. Stranger things have happened <G>. Third guess is you've misidentified the core in your URL. Can you check those three things and let us know if you still have the problem? Erick On Tue, Mar 2, 2010 at 7:36 AM, Lee Smith l...@weblee.co.uk wrote: Hi all, is there a POST request method to clean the index? I have removed my index folder and restarted Solr, and it is still showing documents in the stats. I have run this POST request: http://localhost:8983/solr/core1/update?optimize=true I get no errors, but the stats still show my 4 documents. Hope you can advise. Thanks
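On whether the command was correct: optimize=true only merges index segments, so on its own it would not remove documents. A minimal sketch of the usual alternative, a delete-by-query on *:* followed by a commit, assuming the core1 update URL from the post; the helper names update_request and clear_index_requests are hypothetical, and the requests are only built here, not sent.

```python
from urllib.request import Request


def update_request(core_url, xml_body):
    """Build (but do not send) a POST to a core's /update handler."""
    return Request(
        f"{core_url}/update",
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def clear_index_requests(core_url):
    """Return the two requests that empty an index: delete everything, then commit."""
    return [
        update_request(core_url, "<delete><query>*:*</query></delete>"),
        update_request(core_url, "<commit/>"),
    ]


requests_to_send = clear_index_requests("http://localhost:8983/solr/core1")
for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# To send them in order: urllib.request.urlopen(req) for each request.
```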
Content Extraction
Hey all, hope someone can advise. I followed the example in the wiki on how to extract an HTML page, i.e. curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true' -F myfi...@tutorial.html It displayed an HTML page with a 404 and did not index the document. Any suggestions on how I can fix this? Thanks if you can advise. Lee
Re: Content Extraction
Hi Erick, I posted more details yesterday and got no response. I have a screenshot of what it does: http://screencast.com/t/MGRiZTU5M After running it I ran a query, which returned 0 results, and checked how many docs are indexed, which is also 0. Hope you can shed some more light for me. Lee On 26 Feb 2010, at 14:57, Erick Erickson wrote: You really have to provide more details of (a) what you did and (b) what the results were. Have you looked at your index with the admin page and/or Luke? Have you tried querying in the admin page? Have you examined the logs to see what they report? Best, Erick On Fri, Feb 26, 2010 at 7:54 AM, Lee Smith l...@weblee.co.uk wrote: Hey all, hope someone can advise. I followed the example in the wiki on how to extract an HTML page, i.e. curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true' -F myfi...@tutorial.html It displayed an HTML page with a 404 and did not index the document. Any suggestions on how I can fix this? Thanks if you can advise. Lee
Solr Extract
Hey all, I am having a go at extracting some files as per the wiki guide. I cd to the root directory of the folder and run the command, with no success apart from some broken HTML. The output is shown here: http://screencast.com/t/MGRiZTU5M It might help to understand what I'm doing wrong. Hope someone can advise. Thank you in advance
Re: Multicore Example
How can I find out? On 19 Feb 2010, at 19:26, Dave Searle wrote: Do you have something else using port 8983 or 8080? Sent from my iPhone On 19 Feb 2010, at 19:22, Lee Smith l...@weblee.co.uk wrote: Hey all, I'm trying to dip my feet into multicore and hoping someone can advise why the example is not working. I have been working with the single-core example fine, so I stopped the server and restarted it with the new command line for multicore, i.e. java -Dsolr.solr.home=multicore -jar start.jar When it launches I get this error: 2010-02-19 11:13:39.740::WARN: EXCEPTION java.net.BindException: Address already in use at java.net.PlainSocketImpl.socketBind(Native Method) at etc Any ideas what this can be, given that I have stopped the first one? Thank you if you can advise.
Re: Multicore Example
Thanks Shawn. I am actually running it on a Mac, and it does not seem to like those Unix commands. Any further advice? Lee On 19 Feb 2010, at 20:32, Shawn Heisey wrote: Assuming you are on a Unix variant with a working lsof, use this. This probably won't work correctly on Solaris 10: lsof -nPi | grep 8983 lsof -nPi | grep 8080 On Windows, you can do this in a command prompt. It requires elevation on Vista or later. The -b option was added in WinXP SP2 and Win2003 SP1; without it you can't see the name of the program that has the port open: netstat -b > ports.txt ports.txt Shawn On 2/19/2010 1:01 PM, Lee Smith wrote: How can I find out? On 19 Feb 2010, at 19:26, Dave Searle wrote: Do you have something else using port 8983 or 8080?
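Where lsof or netstat are awkward to use, a platform-neutral check is to try connecting to the port: if something accepts the connection, the port is taken. A minimal sketch using only the Python standard library; the helper name port_in_use is hypothetical, and it reports only whether a listener exists, not which program owns it.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


for port in (8983, 8080):
    print(port, "in use" if port_in_use(port) else "free")
```

If 8983 shows as in use after Solr was supposedly stopped, an earlier java process is probably still running and has to be ended before start.jar can bind the port again.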
Apache Tika/ Solar Cell
Hey all, hope someone can advise me on the way to go. I have my Solr setup working well, and I am using DIH to handle all my data input. Now I need to add content from Word docs, PDFs, metadata etc., and I am looking to use Solr Cell. A few questions regarding this: Would it be best to add these to a different core? How would I handle documents removed from the server, as I would want these removed from the index as well? Hope you can advise. Thank you in advance
Query 2 Cats
Sorry if this is a poor question, but I can't seem to get it to work. I have a field called cat set up so I can query against specific categories. It is fine if I search all or one, but I can't seem to make it search over multiple categories, i.e. q=string AND cat:name1 AND cat:name2 I have tried the following variations: cat:name1,name2 cat:name1+name2 I have also tried using instead of AND with still same results. Hope you can help!! Thank you in advance
Re: Query 2 Cats
Thank you Dave, Eric. Worked a charm. On 26 Jan 2010, at 18:58, Dave Searle wrote: Try q=string AND (cat:name1 OR cat:name2) On 26 Jan 2010, at 18:53, Lee Smith l...@weblee.co.uk wrote: Sorry if this is a poor question, but I can't seem to get it to work. I have a field called cat set up so I can query against specific categories. It is fine if I search all or one, but I can't seem to make it search over multiple categories, i.e. q=string AND cat:name1 AND cat:name2 I have tried the following variations: cat:name1,name2 cat:name1+name2 I have also tried using instead of AND with still same results. Hope you can help!! Thank you in advance
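The fix works because a document whose cat field holds one value cannot satisfy cat:name1 AND cat:name2 at once, while the OR group matches either category. A minimal sketch of building that query string for any number of categories; the helper name category_query is hypothetical.

```python
def category_query(text, categories, field="cat"):
    """Combine a search term with an OR group over the given categories."""
    if not categories:
        return text
    # Parentheses keep the OR group together when it is ANDed with the term.
    clause = " OR ".join(f"{field}:{name}" for name in categories)
    return f"{text} AND ({clause})"


print(category_query("string", ["name1", "name2"]))
```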
Data Full Import Error
Hi all, I am trying to do a data import but I am getting the following error: INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 QTime=405 2010-01-12 03:08:08.576::WARN: Error for /solr/dataimport java.lang.OutOfMemoryError: Java heap space Jan 12, 2010 3:08:05 AM org.apache.solr.handler.dataimport.DataImporter doFullImport SEVERE: Full Import failed java.lang.OutOfMemoryError: Java heap space Exception in thread "btpool0-2" java.lang.OutOfMemoryError: Java heap space Jan 12, 2010 3:08:14 AM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: start rollback Jan 12, 2010 3:08:21 AM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: end_rollback Jan 12, 2010 3:08:23 AM org.apache.solr.update.SolrIndexWriter finalize SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!! Any ideas what this can be? Hope you can help. Lee
Re: Data Full Import Error
Thank you for your response. Will I just need to adjust the allowed memory in a config file, or is this a server issue? Sorry, I know nothing about Java. Hope you can advise! On 12 Jan 2010, at 12:26, Noble Paul നോബിള് नोब्ळ् wrote: You need more memory to run dataimport. On Tue, Jan 12, 2010 at 4:46 PM, Lee Smith l...@weblee.co.uk wrote: Hi all, I am trying to do a data import but I am getting the following error: INFO: [] webapp=/solr path=/dataimport params={command=status} status=0 QTime=405 2010-01-12 03:08:08.576::WARN: Error for /solr/dataimport java.lang.OutOfMemoryError: Java heap space Jan 12, 2010 3:08:05 AM org.apache.solr.handler.dataimport.DataImporter doFullImport SEVERE: Full Import failed java.lang.OutOfMemoryError: Java heap space Exception in thread "btpool0-2" java.lang.OutOfMemoryError: Java heap space Jan 12, 2010 3:08:14 AM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: start rollback Jan 12, 2010 3:08:21 AM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: end_rollback Jan 12, 2010 3:08:23 AM org.apache.solr.update.SolrIndexWriter finalize SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!! This is OK. Don't bother. Any ideas what this can be? Hope you can help. Lee -- - Noble Paul | Systems Architect| AOL | http://aol.com
Deleting * and Re-index after schema change
Am I doing this right? I have made changes to my schema, so as per the guide I did the following: stopped the application, updated the schema, restarted, deleted the index folder, then ran a full import with optimize, i.e. /dataimport?command=full-import&optimize=true The status shows: Indexing Complete. Add/Updated 100800 documents. 0 Deleted So all good? But the stats page only shows numDocs: 1. The only odd thing I can see is that the stats page says segments=1 in the reader line, while in the index folder the file is named segments_6. Any ideas? Thank you
Re: Deleting * and Re-index after schema change
Hi Erik, done as suggested and it is still only showing 1 document. Doing a *:* search gives me 1 document. I can't understand why. On 12 Jan 2010, at 14:25, Erik Hatcher wrote: What does a search of *:* give you? As far as your steps, delete the index folder *before* restarting Solr, not after. That might be the issue. Erik On Jan 12, 2010, at 9:23 AM, Lee Smith wrote: Am I doing this right? I have made changes to my schema, so as per the guide I did the following: stopped the application, updated the schema, restarted, deleted the index folder, then ran a full import with optimize, i.e. /dataimport?command=full-import&optimize=true The status shows: Indexing Complete. Add/Updated 100800 documents. 0 Deleted So all good? But the stats page only shows numDocs: 1. The only odd thing I can see is that the stats page says segments=1 in the reader line, while in the index folder the file is named segments_6. Any ideas? Thank you
Re: Deleting * and Re-index after schema change
Don't worry, my bad. I made a mistake in my dataimport config that gave every row the same ID! All working now, thank you. On 12 Jan 2010, at 14:33, Lee Smith wrote: Hi Erik, done as suggested and it is still only showing 1 document. Doing a *:* search gives me 1 document. I can't understand why. On 12 Jan 2010, at 14:25, Erik Hatcher wrote: What does a search of *:* give you? As far as your steps, delete the index folder *before* restarting Solr, not after. That might be the issue. Erik On Jan 12, 2010, at 9:23 AM, Lee Smith wrote: Am I doing this right? I have made changes to my schema, so as per the guide I did the following: stopped the application, updated the schema, restarted, deleted the index folder, then ran a full import with optimize, i.e. /dataimport?command=full-import&optimize=true The status shows: Indexing Complete. Add/Updated 100800 documents. 0 Deleted So all good? But the stats page only shows numDocs: 1. The only odd thing I can see is that the stats page says segments=1 in the reader line, while in the index folder the file is named segments_6. Any ideas? Thank you
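The symptom follows from how the unique key works: adding a document whose ID already exists replaces the earlier one, so 100800 adds with one shared ID leave a single document. A toy illustration of that overwrite behaviour, using a plain dictionary as a stand-in for the index; this is not Solr code, and the row contents are made up.

```python
def index_rows(rows, key="id"):
    """Mimic unique-key semantics: a later row with the same key replaces the earlier one."""
    index = {}
    for row in rows:
        index[row[key]] = row
    return index


# Every row carries the same id, as in the faulty import.
rows = [{"id": 1, "title": f"row {n}"} for n in range(100800)]
index = index_rows(rows)
print(len(rows), "added/updated ->", len(index), "document(s) in the index")
```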
DIH solrconfig
Hi all, there seems to be a massive difference between the solrconfig in the DIH example and the one in the normal example. Would I be correct in saying that adding the dataimport request handler to solrconfig.xml is all I will need? I.e.: <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">db-data-config.xml</str> </lst> </requestHandler> Is this correct? Lee
Logging
I'm trying to import data with DIH (MySQL). All my SQL queries are good, having been tested manually. When I run a full import, i.e. http://localhost:8983/solr/dataimport?command=full-import I get my XML result, but nothing is imported and it rolls back. On the logging page I set DIH logging to FINE, applied the setting, then re-ran the import, but I can't seem to find detailed logs. I'm looking at the log in example/logs/ but it is still just giving basic logs. How can I find out what's going on? Thank you if you can advise. Lee
DIH Updating
Hello all, sorry, newbie question. I'm looking at using the Data Import Handler to add my data to Solr, but I am a little confused about how to go about updating the index. I understand there is no in-place update, just a delete and replace, but how will Solr know what to remove and add?
Also, I hope someone does not mind giving me advice on the schema I should use. I will be indexing multiple tables, as each table means a different type of search. Here are the tables and the columns I'm looking at adding to Solr.
Files:
- id
- display_name
- server_path
- file_type
- project_id
Folders:
- id
- folder_name
- fullpath
- project_id
Dailies:
- id
- scene
- take
- description
- filename (join)
- project_id
Assets:
- id
- title
- project_id
Calendar (Events):
- id
- title
- description
- project_id
On top of this, I will be looking at doing full indexing, using Solr Cell, of the documents held in the file data table. Hope someone can point me in the right direction, and thank you in advance. Regards, Lee
Stopping Starting
Hello all, I am just starting out today with Solr and looking for some advice, but I first have a problem. I ran the start command, i.e. user:~/solr/example$ java -jar start.jar which worked perfectly, and I started to explore the interface. But my terminal window dropped and it has stopped working. If I try to restart it I get errors and it still does not work. Errors like: 2009-12-03 21:55:41.785::WARN: EXCEPTION java.net.BindException: Address already in use So how can I stop and restart the service? Hope you can help get me going again. Thank you, Lee