Re: Searching problem
You should spend some time on http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters. The short answer: wildcard queries such as panasonic* are not analyzed, so the raw prefix has to match the terms as they were actually indexed. If your field type stems "panasonic" down to something like "panason" at index time, the plain query still matches (it is stemmed the same way at query time) but the prefix query does not. A SolrJ sketch illustrating the difference follows.

On Sat, Nov 13, 2010 at 10:42 AM, M.Rizwan griz...@gmail.com wrote:

Hi All,

Do you have any idea why a Solr search for panasonic* (without quotes) does not match panasonic? If we search for panasonic it matches a result, but if we search with panasonic* it finds nothing. What needs to be done here?

Thanks
Riz
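For illustration, a minimal SolrJ sketch comparing the two queries. The URL, the core location, and the "name" field are assumptions, not part of the original thread; adjust them to your setup.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class WildcardCheck {
    public static void main(String[] args) throws Exception {
        // URL and the "name" field are assumptions; adjust to your setup.
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // Plain term query: the query text goes through the field's analyzer,
        // so "panasonic" is stemmed/lowercased the same way it was at index time.
        QueryResponse plain = solr.query(new SolrQuery("name:panasonic"));
        System.out.println("panasonic  -> " + plain.getResults().getNumFound());

        // Wildcard query: not analyzed, so the raw prefix must match the
        // *indexed* terms. If "panasonic" was stemmed to "panason" at index
        // time, this query finds nothing.
        QueryResponse wild = solr.query(new SolrQuery("name:panasonic*"));
        System.out.println("panasonic* -> " + wild.getResults().getNumFound());
    }
}

If the counts differ, the field's analysis chain (see the wiki page above) is the place to look.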
Re: A Newbie Question
Another point of view you might want to think about: what kind of search do you want? Is it just plain full-text search, or is there something more to those text files? Are they grouped in folders? Do the folders imply some kind of grouping/hierarchy/tagging? I was recently helping somebody who had files across a lot of places, grouped by date/subject/author; he wanted to make sure those were fields that could also act as filters/navigators. Just an input; ignore it if you only want plain full-text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog goks...@gmail.com wrote:

About web servers: Solr is a servlet WAR file and needs a Java servlet container to run. The example/ folder in the Solr distribution uses Jetty, and this is fine for small production-quality projects. You can just copy the example/ directory somewhere to set up your own running Solr; that's what I always do.

About indexing programs: if you know Unix scripting, it may be easiest to walk the file system yourself with the 'find' program and create Solr input XML files. But yes, you definitely want the Solr 1.4 Enterprise manual. I spent months learning this stuff very slowly, and the book would have been great back then.

Lance

Erick Erickson wrote:

Think of the Data Import Handler (DIH) as Solr pulling data to index from some source, based on configuration. So once you set up your DIH config to point at your file system, you issue a command to Solr that says, in effect, "OK, do your data import thing." See the FileListEntityProcessor:
http://wiki.apache.org/solr/DataImportHandler

SolrJ is a client library you'd use to push data to Solr. Basically, you write a Java program that uses SolrJ to walk the file system, find documents, create a Solr document for each one, and send it to Solr. It's not nearly as complex as it sounds (a sketch is included after this thread). See:
http://wiki.apache.org/solr/Solrj

It's probably worth your while to get a copy of "Solr 1.4 Enterprise Search Server" by Eric Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer seshadri...@gmail.com wrote:

Hi Lance,

Thank you very much for responding (not sure how I reply to the group, so writing to you). Can you please expand on your suggestion? I am not a web guy and don't know where to start. What is the difference between SolrJ and the DataImportHandler? Do I need to set up web servers on all my storage boxes? Apologies for the basic level of the questions, but I hope I can get started and implement this before the year end (you know why :o)

Thanks,
Sesh

On 12 November 2010 13:31, Lance Norskog goks...@gmail.com wrote:

Using 'curl' is fine. There is a library called SolrJ for Java, and other libraries for other scripting languages, that let you upload with more control. There is also a thing in Solr called the DataImportHandler that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer seshadri...@gmail.com wrote:

Hi,

Pardon me if this sounds very elementary, but I have a very basic question regarding Solr search. I have about 10 storage devices running Solaris with hundreds of thousands of text files (there are other files as well, but my target is these text files). The directories on the Solaris boxes are exported and are available as NFS mounts. I have installed Solr 1.4 on a Linux box and have tested the installation, using curl to post documents. However, the manual says that curl is not the recommended way of posting documents to Solr.

Could someone please tell me what the preferred approach is in such an environment? I am not a programmer and would appreciate some hand-holding here :o)

Thanks in advance,
Sesh

--
Lance Norskog
goks...@gmail.com
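To make the SolrJ option Erick describes concrete, here is a minimal sketch (not production code) that walks an NFS-mounted directory tree and posts one document per .txt file. The SolrJ 1.4-era CommonsHttpSolrServer, the URL, the mount point, and the field names id, filename, and text are all assumptions; adjust them to your schema. Apache commons-io (shipped with Solr) is used to read the files.

import java.io.File;
import org.apache.commons.io.FileUtils;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class FileIndexer {

    public static void main(String[] args) throws Exception {
        // URL and mount point are assumptions; adjust to your environment.
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        walk(new File("/mnt/nfs/box1"), solr);
        solr.commit();
    }

    // Recursively walk a directory, posting one document per .txt file.
    static void walk(File dir, SolrServer solr) throws Exception {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a readable directory
        }
        for (File f : entries) {
            if (f.isDirectory()) {
                walk(f, solr);
            } else if (f.getName().endsWith(".txt")) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", f.getAbsolutePath()); // path as unique key
                doc.addField("filename", f.getName());
                doc.addField("text", FileUtils.readFileToString(f));
                solr.add(doc);
            }
        }
    }
}

The same walk could instead be configured declaratively with DIH's FileListEntityProcessor; the SolrJ version simply gives you more control over error handling and field construction.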
Re: How to Facet on a price range
Kudos to Jan's pre-compute option and gwk's range facet answer.

On Wed, Nov 10, 2010 at 2:52 PM, Geert-Jan Brits gbr...@gmail.com wrote:

Ah, I see: like you said, it's part of the facet range implementation. The frontend is already working; I just need the 'update-on-slide' behavior.

Thanks
Geert-Jan

2010/11/10 gwk g...@eyefi.nl

On 11/9/2010 7:32 PM, Geert-Jan Brits wrote:

"When you drag the sliders, an update of how many results would match is immediately shown. I really like this. How did you do this? Is this available out of the box with the suggested facet-by-range patch?"

Hi,

With range facets you get the facet counts for every discrete step of the slider. These counts are requested in the AJAX call whenever the search criteria change; then, when someone moves the sliders, we simply take the selected range and sum the counts of the discrete steps inside it to get the expected number of results. So yes, it is available, but since Solr is just the search backend, you'll have to write the frontend part yourself.

Regards,
gwk
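For reference, a SolrJ sketch of the kind of request involved. Range faceting was only available as a patch at the time of this thread (it was built into later Solr releases), and the "price" field, URL, and bucket bounds here are assumptions for illustration.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PriceRangeFacet {
    public static void main(String[] args) throws Exception {
        // URL, field name, and bucket bounds are all assumptions.
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        // One bucket per slider step: 0-10, 10-20, ..., 990-1000.
        q.set("facet.range", "price");
        q.set("facet.range.start", "0");
        q.set("facet.range.end", "1000");
        q.set("facet.range.gap", "10");

        QueryResponse rsp = solr.query(q);
        // The per-bucket counts come back in facet_counts; a frontend can sum
        // the buckets between the two slider positions to show the expected
        // number of matches without another round trip to Solr.
        System.out.println(rsp.getResponse().get("facet_counts"));
    }
}

This is exactly why the 'update-on-slide' count needs no extra request: all the per-step counts are already in the last response.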
Re: Color search for images
I'm not exactly sure how one would capture the context of which object is more dominant than another. Think of a landscape with snow, green mountains, and a set of flowers of varied colors, including a rose.

On Fri, Sep 17, 2010 at 8:43 PM, Shashi Kant sk...@sloan.mit.edu wrote:

"What I am envisioning (at least to start) is to have all this add two fields to the index. One would hold color information for the color similarity search. The other would be a simple multi-valued text field that we put keywords into based on what OpenCV can detect about the image. If it detects faces, we would put 'face' into this field; other things it can detect would result in other keywords. For the color search, I have a few inter-related hurdles. I've got to figure out what form the color data actually takes and how to represent it in Solr. I need Java code for Solr that can take an input color value and find similar values in the index. Then I need some code that can go in our feed-processing scripts for new content; that code would also go into a crawler script to handle existing images."

You are on the right track. You can create a set of representative keywords from the image. OpenCV gets a color histogram from the image; you can set the bin values to be as granular as you need, and create a lookup list of color names to generate a multi-valued field (MVF) representative of the image. If you want to get more sophisticated, represent the colors with payloads in correlation with the distribution of each color in the image. Another approach would be to segment the image and extract colors from each segment. So if you have a red rose against an all-white background, the textual representation would be something like: white, white ... red ... white, white. Play around and see which works best. A rough sketch of the keyword idea follows.

HTH
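A rough, hypothetical Java sketch of the histogram-to-keywords idea described above. The bin names, the 12-bin layout, and the 10% repeat granularity are all made up for illustration; the histogram itself would come from OpenCV.

import java.util.ArrayList;
import java.util.List;

public class ColorKeywords {

    // Hypothetical lookup table: one human-readable name per histogram bin.
    private static final String[] BIN_NAMES = {
        "black", "gray", "white", "red", "orange", "yellow",
        "green", "cyan", "blue", "purple", "pink", "brown"
    };

    // Turn a normalized histogram (one weight per bin, summing to 1.0) into
    // keywords for a multi-valued text field. Repeating a name in proportion
    // to its weight is a crude stand-in for payloads: "white" emitted nine
    // times will score higher than "red" emitted once.
    static List<String> keywords(double[] histogram) {
        List<String> result = new ArrayList<String>();
        for (int bin = 0; bin < histogram.length; bin++) {
            int repeats = (int) Math.round(histogram[bin] * 10); // 10% steps
            for (int i = 0; i < repeats; i++) {
                result.add(BIN_NAMES[bin]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A red rose on a mostly white background:
        double[] histogram = new double[BIN_NAMES.length];
        histogram[2] = 0.9; // white
        histogram[3] = 0.1; // red
        System.out.println(keywords(histogram)); // white x9, then red
    }
}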
Re: Adding new elements to index
Just for testing purposes, I would:
1. Use curl to create new docs.
2. Use SolrJ to go to the individual databases and collect docs.

On Wed, Jul 7, 2010 at 12:45 PM, Xavier Rodriguez xee...@gmail.com wrote:

Thanks for the quick reply! In fact it was a typo: the 200 rows I got were from postgres. I meant to say that the full-import was omitting the 100 oracle rows.

When I run the full import, I run it as a single job, using the command=full-import URL. I've tried to clear the index both with the clean command and by deleting it manually, but when I run the full-import, the indexed documents are only the ones coming from postgres. To be sure that the id field is unique, I build the id by prepending a letter to the id value. When indexed, the id looks like s_123, that is, id 123 for an entity identified as "s". Other entities use different prefixes, but never "s".

I used DIH to index the data. My configuration (db-data-config.xml) is the following:

<dataSource type="JdbcDataSource" name="ds_ora"
            driver="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@xxx.xxx.xxx.xxx:1521:SID"
            user="user" password="password"/>
<dataSource type="JdbcDataSource" name="ds_pg"
            driver="org.postgresql.Driver"
            url="jdbc:postgresql://xxx.xxx.xxx.yyy:5432/sid"
            user="user" password="password"/>

<entity name="carrers" dataSource="ds_ora"
        query="select 's_'||id as id_carrer, 'a' as tooltip from imi_carrers">
  <field column="id_carrer" name="identificador"/>
  <field column="tooltip" name="Nom"/>
</entity>

<entity name="hidrants" dataSource="ds_pg"
        query="select 'h_'||id as id_hidrant, parc as tooltip from hidrants">
  <field column="id_hidrant" name="identificador"/>
  <field column="tooltip" name="Nom"/>
</entity>

With that configuration, all the fields coming from ds_pg are indexed, and the fields coming from ds_ora are not. As I've said, the strange behaviour for me is that no error is logged in tomcat: the number of documents created is the number of rows returned by hidrants, while the number of rows fetched is the sum of the rows from hidrants and carrers.

Thanks in advance.

Xavi.

On 7 July 2010 02:46, Erick Erickson erickerick...@gmail.com wrote:

First, do you have a unique key defined in your schema.xml? If you do, some of those 300 rows could be replacing earlier rows. You say: "if I have 200 rows indexed from postgres and 100 rows from Oracle, the full-import process only indexes 200 documents from oracle, although it shows clearly that the query returned 300 rows." Which really looks like a typo: if you have 100 rows from Oracle, how did you get 200 rows from Oracle? Are you perhaps doing this in two different jobs and deleting the first import before running the second? And if this is irrelevant, could you provide more details, like how you're indexing things? (I'm assuming DIH, but you don't state that anywhere.) If it *is* DIH, providing that configuration would help.

Best
Erick

On Tue, Jul 6, 2010 at 11:19 AM, Xavier Rodriguez xee...@gmail.com wrote:

Hi,

I have a Solr instance installed on a Tomcat application server, with some data indexed from a postgres database. Now I need to add some entities from an Oracle database. When I run the full-import command, the documents indexed are only the documents from postgres. In fact, if I have 200 rows indexed from postgres and 100 rows from Oracle, the full-import process only indexes 200 documents, although it shows clearly that the query returned 300 rows. I'm not doing a delta-import, simply a full import.

I've tried to clean the index, reload the configuration, and manually remove dataimport.properties, because it's the only metadata file I found. Is there any other file to check or modify just to get all 300 rows indexed? Of course, I tried to search for one of the oracle fields, with no results.

Thanks a lot,

Xavier Rodriguez.
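One way to verify which entities actually made it into the index is to count documents by id prefix. This sketch assumes SolrJ 1.4, the default URL, and that identificador (the id field from the DIH config above) is a plain string field, so that prefix wildcards behave predictably; none of this is confirmed by the thread.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class EntityCounts {
    public static void main(String[] args) throws Exception {
        // URL is an assumption; "identificador" is the id field from the
        // DIH config above, prefixed s_ (oracle) or h_ (postgres).
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        for (String prefix : new String[] { "s_", "h_" }) {
            SolrQuery q = new SolrQuery("identificador:" + prefix + "*");
            q.setRows(0); // we only need the count, not the documents
            long found = solr.query(q).getResults().getNumFound();
            System.out.println(prefix + "* -> " + found + " documents");
        }
    }
}

If s_* comes back as zero while DIH reports 300 rows fetched, the oracle documents are being fetched but dropped or overwritten before commit, which narrows the search to the entity mapping rather than the JDBC connection.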
Re: Nested table support ability
Amit, unless you test, it would not be apparent. The key piece, as Otis mentioned, is to flatten everything. This requires effort on your side to actually create documents in a manner suitable for your searches; the relationships need to be merged into each document. To avoid storing text representations, you may want to store just an identifier and use the front end to translate between human-readable text and the stored identifier. Taking your case further: rather than storing "ADMIN", store just a representation, maybe a smallint, with the customer information. A sketch of the flattening idea follows this thread.

On Wed, Jun 23, 2010 at 11:30 AM, amit_ak amit...@mindtree.com wrote:

Hi Otis,

Thanks for the update. My parametric search has to span the customer table and 30 child tables. We have close to 1 million customers. Do you think Lucene/Solr is the right solution for such requirements, or would a database search be more optimal?

Regards,
Amit

--
View this message in context: http://lucene.472066.n3.nabble.com/Nested-table-support-ability-tp905253p916087.html
Sent from the Solr - User mailing list archive at Nabble.com.
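A minimal sketch of the flattening idea, with entirely hypothetical classes and field names: each customer becomes one Solr document, child-table rows become multi-valued fields, and the role is stored as a small integer code rather than the text "ADMIN".

import org.apache.solr.common.SolrInputDocument;

public class FlattenCustomer {

    // Hypothetical row shapes; the real data would come from the customer
    // table and its 30 child tables.
    static class Customer { long id; String name; int roleCode; }
    static class Order    { String product; }

    // One flattened document per customer: child-table rows become
    // multi-valued fields, and the role is stored as a small integer code
    // (the front end maps 1 -> "ADMIN", etc.) instead of repeated text.
    static SolrInputDocument flatten(Customer c, Iterable<Order> orders) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", c.id);
        doc.addField("name", c.name);
        doc.addField("role_code", c.roleCode); // e.g. 1 instead of "ADMIN"
        for (Order o : orders) {
            doc.addField("product", o.product); // multi-valued field
        }
        return doc;
    }
}

At 1 million customers this denormalization multiplies index size but keeps every parametric filter a single-index query, which is the trade Lucene/Solr is built for.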
Re: Field Collapsing SOLR-236
The error "fieldType: analyzer without class or tokenizer filter list" points at your config: somewhere in the Solr 1.4 schema.xml there is an analyzer element that neither names a class nor contains a tokenizer and filter list. You may want to correct that.

On Wed, Jun 23, 2010 at 3:09 PM, Rakhi Khatwani rkhatw...@gmail.com wrote:

Hi,

I checked out the modules and lucene directories from the trunk and performed a build using the following commands:

ant clean
ant compile
ant example

which compiled successfully. I then put my existing index (using schema.xml from solr1.4.0/conf/solr/) in the multicore folder, configured solr.xml, and started the server. When I go to http://localhost:8983/solr I get the following error:

org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType: analyzer without class or tokenizer filter list
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:122)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:286)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:198)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:123)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:662)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
    at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
    at org.mortbay.jetty.Server.doStart(Server.java:224)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.mortbay.start.Main.invokeMain(Main.java:194)
    at org.mortbay.start.Main.start(Main.java:534)
    at org.mortbay.start.Main.start(Main.java:441)
    at org.mortbay.start.Main.main(Main.java:119)
Caused by: org.apache.solr.common.SolrException: analyzer without class or tokenizer filter list
    at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:908)
    at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:60)
    at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
    at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:142)
    ... 32 more

Then I picked up an existing index (schema.xml from solr1.3/solr/conf), put it in the multicore folder, configured solr.xml, and restarted; collapsing worked fine. Any pointers as to which part of the Solr 1.4 schema.xml is causing this exception?

Regards,
Raakhi

On Wed, Jun 23, 2010 at 1:35 PM, Rakhi Khatwani rkhatw...@gmail.com wrote:

Oops, this is probably because I didn't check out the modules directory from the trunk. Doing that right now :)

Regards
Raakhi

On Wed, Jun 23, 2010 at 1:12 PM, Rakhi Khatwani rkhatw...@gmail.com wrote:

Hi,

Patching did work, but when I build the trunk I get the following exception:

[SolrTrunk]# ant compile
Buildfile: /testWorkspace/SolrTrunk/build.xml

init-forrest-entities:
    [mkdir] Created dir: /testWorkspace/SolrTrunk/build
    [mkdir] Created dir: /testWorkspace/SolrTrunk/build/web

compile-lucene:

BUILD FAILED
/testWorkspace/SolrTrunk/common-build.xml:207: /testWorkspace/modules/analysis/common does not exist.

Regards,
Raakhi

On Wed, Jun 23, 2010 at 2:39 AM, Martijn v Groningen martijn.is.h...@gmail.com wrote:

What exactly did not work? Patching, compiling or running it?

On 22 June 2010 16:06, Rakhi Khatwani rkhatw...@gmail.com wrote:

Hi,

I tried checking out the latest code (rev 956715); the patch did not work on it.