Re: UI
yes, I am using this library and it works perfectly so far. If something does not work you can just modify it: http://code.google.com/p/solr-php-client/

Johannes

2012/5/21 Tolga :
> Hi,
>
> Can you recommend a good PHP UI to search? Is SolrPHPClient good?
Re: summing facets on a specific field
I meant stats=true&stats.field=price&stats.facet=category

2012/2/6 Johannes Goll :
> you can use the StatsComponent
>
> http://wiki.apache.org/solr/StatsComponent
>
> with stats=true&stats.price=category&stats.facet=category
>
> and pull the sum fields from the resulting stats facets.
>
> Johannes
>
> 2012/2/5 Paul Kapla :
>> Hi everyone,
>> I'm pretty new to solr and I'm not sure if this can even be done. Is there
>> a way to sum a specific field per each item in a facet. For example, you
>> have an ecommerce site that has the following documents:
>>
>> id,category,name,price
>> 1,books,'solr book', $10.00
>> 2,books,'lucene in action', $12.00
>> 3,video, 'cool video', $20.00
>>
>> so instead of getting (when faceting on category)
>> books(2)
>> video(1)
>>
>> I'd like to get:
>> books ($22)
>> video ($20)
>>
>> Is this something that can even be done? Any feedback would be much
>> appreciated.
>
> --
> Dipl.-Ing.(FH)
> Johannes Goll
> 211 Curry Ford Lane
> Gaithersburg, Maryland 20878
> USA

--
Dipl.-Ing.(FH)
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
USA
Re: summing facets on a specific field
you can use the StatsComponent

http://wiki.apache.org/solr/StatsComponent

with stats=true&stats.price=category&stats.facet=category

and pull the sum fields from the resulting stats facets.

Johannes

2012/2/5 Paul Kapla :
> Hi everyone,
> I'm pretty new to solr and I'm not sure if this can even be done. Is there
> a way to sum a specific field per each item in a facet. For example, you
> have an ecommerce site that has the following documents:
>
> id,category,name,price
> 1,books,'solr book', $10.00
> 2,books,'lucene in action', $12.00
> 3,video, 'cool video', $20.00
>
> so instead of getting (when faceting on category)
> books(2)
> video(1)
>
> I'd like to get:
> books ($22)
> video ($20)
>
> Is this something that can even be done? Any feedback would be much
> appreciated.

--
Dipl.-Ing.(FH)
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
USA
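[Editor's sketch] Using the corrected parameters from the follow-up in this thread (stats.field=price, stats.facet=category), the request and the extraction of per-category sums can be sketched as follows. The host, port, and core name `products` are assumptions, and the `sample` dict only mimics the shape of Solr's JSON stats section:

```python
from urllib.parse import urlencode

# Build the stats request; wt=json is assumed for easy client-side parsing.
params = {
    "q": "*:*",
    "rows": 0,
    "stats": "true",
    "stats.field": "price",
    "stats.facet": "category",
    "wt": "json",
}
url = "http://localhost:8983/solr/products/select?" + urlencode(params)

def facet_sums(response):
    """Extract {facet value: sum} from a Solr stats response dict."""
    facets = response["stats"]["stats_fields"]["price"]["facets"]["category"]
    return {value: stats["sum"] for value, stats in facets.items()}

# Mimicked response for the thread's example documents.
sample = {"stats": {"stats_fields": {"price": {"sum": 42.0, "facets": {
    "category": {"books": {"sum": 22.0, "count": 2},
                 "video": {"sum": 20.0, "count": 1}}}}}}}
print(facet_sums(sample))  # {'books': 22.0, 'video': 20.0}
```

This yields exactly the "books ($22) / video ($20)" style of result the original question asked for.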
Update Solr Schema To Store Field
Hi,

I am running apache-solr-3.1.0 and would like to change a field attribute from stored="false" to stored="true". I have several hundred cores that have been indexed without storing the field, which is fine, as I only would like to retrieve the value for new data that I plan to index with the updated schema. My question is whether this change affects the query behavior for the existing indexed documents which were loaded with stored="false".

Thanks a lot,
Johannes
Re: Hierarchical faceting in UI
another way is to store the original hierarchy in a sql database (in the form: id, parent_id, name, level) and in the Lucene index store the complete hierarchy (from root to leaf node) for each document in one field using the ids of the sql database, e.g. "1 13 32 42 23 12". In that way you can get documents at any level of the hierarchy. You can use the sql database to dynamically expand the tree by building facet queries to fetch document collections of child-nodes.

Johannes

2012/1/23 :
>
> On Mon, 23 Jan 2012 14:33:00 -0800 (PST), Yuhao wrote:
>> Programmatically, something like this might work: for each facet field,
>> add another hidden field that identifies its parent. Then, program
>> additional logic in the UI to show only the facet terms at the currently
>> selected level. For example, if one filters on "cat:electronics", the new
>> UI logic would apply the additional filter "cat_parent:electronics". Can
>> this be done?
>
> Yes. This is how I do it.
>
>> Would it be a lot of work?
> No. It's not a lot of work, simply represent your hierarchy as parent/child
> relations in the document fields and in your UI drill down by issuing new
> faceted searches. Use the current facet (tree level) as the parent: in
> the next query. It's much easier than other suggestions for this.
>
>> Is there a better way?
> Not in my opinion, there isn't. This is the simplest to implement and
> understand.
>
>> By the way, Flamenco (another faceted browser) has built-in support for
>> hierarchies, and it has worked well for my data in this aspect (but less
>> well than Solr in others). I'm looking for the same kind of hierarchical
>> UI feature in Solr.
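[Editor's sketch] The id-path scheme described above can be illustrated in a few lines. The parent map stands in for the SQL table, and the field name `category_path` is an assumption:

```python
# Parent map as it might come from the SQL table (id -> parent_id).
parents = {1: None, 13: 1, 32: 13, 42: 32}

def path_from_root(node_id):
    """Return the id path from the root down to node_id, e.g. "1 13 32"."""
    path = []
    while node_id is not None:
        path.append(node_id)
        node_id = parents[node_id]
    return " ".join(str(i) for i in reversed(path))

# Index-time: each document stores the full id path of its leaf category.
doc = {"id": "doc-1", "category_path": path_from_root(42)}

# Query-time: because every ancestor id is a token in the field, a plain
# term filter on a node id matches that node and all of its descendants.
fq = "category_path:13"
print(doc["category_path"])  # 1 13 32 42
```

The design choice here is that descendant lookup needs no recursive SQL at query time; the database is only consulted to render and expand the tree in the UI.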
Re: Multi CPU Cores
Yes, same thing. This was for the Jetty servlet container, not Tomcat. I would refer to the Tomcat documentation on how to modify/configure the Java runtime environment (JRE) arguments for your running instance.

Johannes

On Oct 17, 2011, at 4:01 AM, Robert Brown wrote:
> Where exactly do you set this up? We're running Solr 3.4 under Tomcat,
> OpenJDK 1.6.0.20
>
> btw, is the JRE just a different name for the VM? Apologies for such a
> newbie Java question.
>
> On Sun, 16 Oct 2011 12:51:44 -0400, Johannes Goll wrote:
>> we use the following in production
>>
>> java -server -XX:+UseParallelGC -XX:+AggressiveOpts
>> -XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port=
>> -Dsolr.solr.home= -jar start.jar
>>
>> more information:
>> http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
>>
>> Johannes
Re: Multi CPU Cores
we use the following in production:

java -server -XX:+UseParallelGC -XX:+AggressiveOpts -XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port= -Dsolr.solr.home= -jar start.jar

more information:
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

Johannes
Re: Multi CPU Cores
Try using -XX:+UseParallelGC as a VM option.

Johannes

On Oct 16, 2011, at 7:51 AM, Ken Krugler wrote:
>
> On Oct 16, 2011, at 1:44pm, Rob Brown wrote:
>
>> Looks like I checked the load during a quiet period, ab -n 1 -c 1000
>> saw a decent 40% load on each core.
>>
>> Still a little confused as to why 1 core stays at 100% constantly - even
>> during the quiet periods?
>
> Could be background GC, depending on what you've got your JVM configured to
> use.
>
> Though that shouldn't stay at 100% for very long.
>
> -- Ken
>
>> -Original Message-
>> From: Johannes Goll
>> Reply-to: solr-user@lucene.apache.org
>> To: solr-user@lucene.apache.org
>> Subject: Re: Multi CPU Cores
>> Date: Sat, 15 Oct 2011 21:30:11 -0400
>>
>> Did you try to submit multiple search requests in parallel? The apache ab
>> tool is a great tool to simulate simultaneous load (using -n and -c).
>> Johannes
>>
>> On Oct 15, 2011, at 7:32 PM, Rob Brown wrote:
>>
>>> Hi,
>>>
>>> I'm running Solr on a machine with 16 CPU cores, yet watching "top"
>>> shows that java is only apparently using 1 and maxing it out.
>>>
>>> Is there anything that can be done to take advantage of more CPU cores?
>>>
>>> Solr 3.4 under Tomcat
>>>
>>> [root@solr01 ~]# java -version
>>> java version "1.6.0_20"
>>> OpenJDK Runtime Environment (IcedTea6 1.9.8) (rhel-1.22.1.9.8.el5_6-x86_64)
>>> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
>>>
>>> top - 14:36:18 up 22 days, 21:54, 4 users, load average: 1.89, 1.24, 1.08
>>> Tasks: 317 total, 1 running, 315 sleeping, 0 stopped, 1 zombie
>>> Cpu0  : 0.0%us, 0.0%sy, 0.0%ni, 99.6%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu1  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu2  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu3  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu4  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu5  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu6  : 99.6%us, 0.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu7  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu8  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu9  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu13 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Mem: 132088928k total, 23760584k used, 108328344k free, 318228k buffers
>>> Swap: 25920868k total, 0k used, 25920868k free, 18371128k cached
>>>
>>>   PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
>>>  4466 tomcat 20  0 31.2g 4.0g 171m S 101.0  3.2 2909:38 java
>>>  6495 root   15  0 42416 3892 1740 S   0.4  0.0  9:34.71 openvpn
>>> 11456 root   16  0 12892 1312  836 R   0.4  0.0  0:00.08 top
>>>     1 root   15  0 10368  632  536 S   0.0  0.0  0:04.69 init
>
> --
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
Re: Multi CPU Cores
Did you try to submit multiple search requests in parallel? The apache ab tool is a great tool to simulate simultaneous load (using -n and -c).

Johannes

On Oct 15, 2011, at 7:32 PM, Rob Brown wrote:
> Hi,
>
> I'm running Solr on a machine with 16 CPU cores, yet watching "top"
> shows that java is only apparently using 1 and maxing it out.
>
> Is there anything that can be done to take advantage of more CPU cores?
>
> Solr 3.4 under Tomcat
>
> [root@solr01 ~]# java -version
> java version "1.6.0_20"
> OpenJDK Runtime Environment (IcedTea6 1.9.8) (rhel-1.22.1.9.8.el5_6-x86_64)
> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
>
> top - 14:36:18 up 22 days, 21:54, 4 users, load average: 1.89, 1.24, 1.08
> Tasks: 317 total, 1 running, 315 sleeping, 0 stopped, 1 zombie
> Cpu0  : 0.0%us, 0.0%sy, 0.0%ni, 99.6%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6  : 99.6%us, 0.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu8  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu9  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu13 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 132088928k total, 23760584k used, 108328344k free, 318228k buffers
> Swap: 25920868k total, 0k used, 25920868k free, 18371128k cached
>
>   PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
>  4466 tomcat 20  0 31.2g 4.0g 171m S 101.0  3.2 2909:38 java
>  6495 root   15  0 42416 3892 1740 S   0.4  0.0  9:34.71 openvpn
> 11456 root   16  0 12892 1312  836 R   0.4  0.0  0:00.08 top
>     1 root   15  0 10368  632  536 S   0.0  0.0  0:04.69 init
Re: Huge performance drop in distributed search w/ shards on the same server/container
I increased the maximum POST size and headerBufferSize to 10MB; lowThreads to 50, maxThreads to 10 and lowResourceMaxIdleTime=15000. We tried Tomcat 6 using the following Connector settings: I am getting the same exception as for Jetty

SEVERE: org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset

This seems to point towards a Solr specific issue (solrj.SolrServerException during individual shard searches). I monitored the CPU utilization executing sequential distributed searches and noticed that in the beginning all CPUs are getting used for a short period of time (multiple lines for shard searches are shown in the log with isShard=true arguments), then all CPUs except one become idle and the request is being processed by this one CPU for the longest period of time.

I also noticed in the logs that while most of the individual shard searches (isShard=true) have low QTimes (5-10), a minority has extreme QTimes (104402-105126). All shards are fairly similar in size and content (1.2 M documents) and the StatsComponent is being used [stats=true&stats.field=weight&stats.facet=library_id]. Here library_id equals the shard/core name.

Is there an internal timeout for gathering shard results or other fixed resource limitation?

Johannes

2011/6/13 Yonik Seeley
> On Sun, Jun 12, 2011 at 9:10 PM, Johannes Goll wrote:
> > However, sporadically, Jetty 6.1.2X (shipped with Solr 3.1)
> > sporadically throws Socket connect exceptions when executing distributed
> > searches.
>
> Are you using the exact jetty.xml that shipped with the solr example server,
> or did you make any modifications?
>
> -Yonik
> http://www.lucidimagination.com

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
Re: Huge performance drop in distributed search w/ shards on the same server/container
Hi Fred,

we are having similar issues scaling Solr 3.1 distributed searches on a single box with 18 cores. We use the StatsComponent, which seems to be mainly CPU bound. Using distributed searches resulted in a 9-fold decrease in response time. However, Jetty 6.1.2X (shipped with Solr 3.1) sporadically throws Socket connect exceptions when executing distributed searches. Our next step is to switch from Jetty to Tomcat.

Did you find a solution for improving the CPU utilization and requests per second for your system?

Johannes

2011/5/26 pravesh
> Do you really require multi-shards? Single core/shard will do for even
> millions of documents and the search will be faster than searching on
> multi-shards.
>
> Consider multi-shard when you cannot scale-up on a single shard/machine
> (e.g., CPU, RAM etc. becomes major block).
>
> Also read through the SOLR distributed search wiki to check on all tuning up
> required at application server (Tomcat) end, like maxHTTP request settings.
> For a single request in a multi-shard setup internal HTTP requests are made
> through all queried shards, so, make sure you set this parameter higher.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Huge-performance-drop-in-distributed-search-w-shards-on-the-same-server-container-tp2938421p2988464.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
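[Editor's sketch] For reference, a distributed request against several cores on one box is an ordinary query plus a `shards` list naming every shard the coordinating core should fan out to. The host, port, and core names below are assumptions, not the poster's actual setup:

```python
from urllib.parse import urlencode

host = "localhost:8983"
# 18 cores on one box, as in the message above; names are hypothetical.
cores = [f"library{i:02d}" for i in range(1, 19)]

params = {
    "q": "*:*",
    "rows": 0,
    "stats": "true",
    "stats.field": "weight",
    "stats.facet": "library_id",
    # Comma-separated list of every shard to query.
    "shards": ",".join(f"{host}/solr/{core}" for core in cores),
}
url = f"http://{host}/solr/{cores[0]}/select?" + urlencode(params)
print(url)
```

The coordinating core issues one internal HTTP request per entry in `shards`, which is why the thread discusses servlet-container thread and connection limits.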
Re: apache-solr-3.1 slow stats component queries
Hi,

I bench-marked the slow stats queries (6 point estimate) using the same hardware on an index of size 104M. We use a Solr/Lucene 3.1-mod which returns only the sum and count for statistics component results. Solr/Lucene is run on Jetty.

The relationship between query time and the size of the set of found documents is linear when using the stats component (R^2 0.99). I guess this is expected, as the application needs to scan/sum-up the stat field for all matching documents?

Are there any plans for caching stats results for a certain stats field along with the documents that match a filter query? Any other ideas that could help to improve this (hardware/software configuration)? Even for a subset of 10M entries, the stats search takes on the order of 10 seconds.

Thanks in advance.
Johannes

2011/4/18 Johannes Goll
> any ideas why in this case the stats summaries are so slow ? Thank you
> very much in advance for any ideas/suggestions. Johannes
>
> 2011/4/5 Johannes Goll
>> Hi,
>>
>> thank you for making the new apache-solr-3.1 available.
>>
>> I have installed the version from
>>
>> http://apache.tradebit.com/pub//lucene/solr/3.1.0/
>>
>> and am running into very slow stats component queries (~ 1 minute)
>> for fetching the computed sum of the stats field
>>
>> url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight
>>
>> QTime: 52825
>>
>> #documents: 78,359,699
>> total RAM: 256G
>> vm arguments: -server -Xmx40G
>>
>> the stats.field specification is as follows:
>> stored="false" required="true" multiValued="false" default="1"/>
>>
>> filter queries that narrow down the #docs help to reduce it -
>> QTime seems to be proportional to the number of docs being returned
>> by a filter query.
>>
>> Is there any way to improve the performance of such stats queries ?
>> Caching only helped to improve the filter query performance but if
>> larger subsets are being returned, QTime increases unacceptably.
>>
>> Since I only need the sum and not the STD or sumsOfSquares/Min/Max,
>> I have created a custom 3.1 version that does only return the sum. But this
>> only slightly improved the performance. Of course I could somehow cache
>> the larger sum queries on the client side but I want to do this only as a
>> last resort.
>>
>> Thank you very much in advance for any ideas/suggestions.
>>
>> Johannes
>
> --
> Johannes Goll
> 211 Curry Ford Lane
> Gaithersburg, Maryland 20878
Re: apache-solr-3.1 slow stats component queries
any ideas why in this case the stats summaries are so slow? Thank you very much in advance for any ideas/suggestions.

Johannes

2011/4/5 Johannes Goll
> Hi,
>
> thank you for making the new apache-solr-3.1 available.
>
> I have installed the version from
>
> http://apache.tradebit.com/pub//lucene/solr/3.1.0/
>
> and am running into very slow stats component queries (~ 1 minute)
> for fetching the computed sum of the stats field
>
> url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight
>
> QTime: 52825
>
> #documents: 78,359,699
> total RAM: 256G
> vm arguments: -server -Xmx40G
>
> the stats.field specification is as follows:
> stored="false" required="true" multiValued="false" default="1"/>
>
> filter queries that narrow down the #docs help to reduce it -
> QTime seems to be proportional to the number of docs being returned
> by a filter query.
>
> Is there any way to improve the performance of such stats queries ?
> Caching only helped to improve the filter query performance but if
> larger subsets are being returned, QTime increases unacceptably.
>
> Since I only need the sum and not the STD or sumsOfSquares/Min/Max,
> I have created a custom 3.1 version that does only return the sum. But this
> only slightly improved the performance. Of course I could somehow cache
> the larger sum queries on the client side but I want to do this only as a
> last resort.
>
> Thank you very much in advance for any ideas/suggestions.
>
> Johannes

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
apache-solr-3.1 slow stats component queries
Hi,

thank you for making the new apache-solr-3.1 available.

I have installed the version from

http://apache.tradebit.com/pub//lucene/solr/3.1.0/

and am running into very slow stats component queries (~ 1 minute) for fetching the computed sum of the stats field

url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight

QTime: 52825

#documents: 78,359,699
total RAM: 256G
vm arguments: -server -Xmx40G

the stats.field specification is as follows:
stored="false" required="true" multiValued="false" default="1"/>

filter queries that narrow down the #docs help to reduce it - QTime seems to be proportional to the number of docs being returned by a filter query.

Is there any way to improve the performance of such stats queries? Caching only helped to improve the filter query performance but if larger subsets are being returned, QTime increases unacceptably.

Since I only need the sum and not the STD or sumsOfSquares/Min/Max, I have created a custom 3.1 version that does only return the sum. But this only slightly improved the performance. Of course I could somehow cache the larger sum queries on the client side but I want to do this only as a last resort.

Thank you very much in advance for any ideas/suggestions.

Johannes
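[Editor's sketch] The client-side caching mentioned above as a last resort can be a simple memoization keyed by the filter query. `fetch_sum` is a hypothetical stand-in for the actual Solr round trip, not part of any Solr API:

```python
import functools

expensive_calls = []  # records each simulated Solr round trip

def fetch_sum(fq):
    """Hypothetical stand-in for the request
    ?q=*:*&fq=...&rows=0&stats=true&stats.field=weight
    returning the stats sum for that filter query."""
    expensive_calls.append(fq)
    return 42.0  # placeholder value

@functools.lru_cache(maxsize=1024)
def cached_sum(fq):
    # Identical filter queries are served from the cache instead of Solr.
    # The cache must be cleared (cached_sum.cache_clear()) after commits,
    # since the sums change when documents are added or deleted.
    return fetch_sum(fq)

cached_sum("library_id:lib01")
cached_sum("library_id:lib01")  # second call: cache hit, no round trip
print(len(expensive_calls))  # 1
```

This only helps for repeated identical filter queries; it does nothing for the linear scan cost inside Solr that the benchmark above measures.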
Re: solr upgrade question
Hi Alexander,

I posted the same question a few months ago. The only solution that came up was to regenerate the index files using the new version. How did you do this exactly with Luke 1.0.1? Would you mind sharing some of that magic?

Best,
Johannes

2011/3/31 Alexander Aristov
> Didn't get any responses.
>
> But I tried luke 1.0.1 and it did the magic. I ran optimization and after
> that solr got up.
>
> Best Regards
> Alexander Aristov
>
> On 30 March 2011 15:47, Alexander Aristov wrote:
> > People
> >
> > Is there a way to upgrade an existing index from solr 1.4 to solr 4 (trunk)?
> > When I configured solr 4 and launched it, it complained about an incorrect
> > lucene file version (3 instead of old 2)
> >
> > Are there any procedures to convert the index?
> >
> > Best Regards
> > Alexander Aristov
Re: Adding weightage to the facets count
Hi Siva,

try using the Solr Stats Component

http://wiki.apache.org/solr/StatsComponent

similar to

select/?&q=*:*&stats=true&stats.field={your-weight-field}&stats.facet={your-facet-field}

and get the sum field from the response. You may need to re-sort the weighted facet counts to get a descending list of facet counts.

Note, there is a bug for using the Stats Component with multi-valued facet fields. For details see

https://issues.apache.org/jira/browse/SOLR-1782

Johannes

2011/1/24 Chris Hostetter
>
> : prod1 has tag called “Light Weight” with weightage 20,
> : prod2 has tag called “Light Weight” with weightage 100,
> :
> : If i get facet for “Light Weight” , i will get Light Weight (2) ,
> : here i need to consider the weightage in to account, and the result will be
> : Light Weight (120)
> :
> : How can we achieve this? Any ideas are really helpful.
>
> It's not really possible with Solr out of the box. Faceting is fast and
> efficient in Solr because it's all done using set intersections (and most
> of the sets can be kept in ram very compactly and reused). For what you
> are describing you'd need to not only associate a weighted payload with
> every TermPosition, but also factor that weight in when doing the
> faceting, which means efficient set operations are now out the window.
>
> If you know java it would probably be possible to write a custom
> SolrPlugin (a SearchComponent) to do this type of faceting in special
> cases (assuming you indexed in a particular way) but i'm not sure off the
> top of my head how well it would scale -- the basic algo i'm thinking of
> is (after indexing each facet term with a weight payload) to iterate over
> the DocSet of all matching documents in parallel with an iteration over
> a TermPositions, skipping ahead to only the docs that match the query, and
> recording the sum of the payloads for each term.
>
> Hmmm... except TermPositions iterates over tuples,
> so you would have to iterate over every term, and for every term then loop
> over all matching docs ... like i said, not sure how efficient it would
> wind up being.
>
> You might be happier all around if you just do some sampling -- store the
> tag+weight pairs so that they can be retrieved with each doc, and then
> when you get your top facet constraints back, look at the first page of
> results, and figure out what the sum "weight" is for each of those
> constraints based solely on the page#1 results.
>
> i've had happy users using a similar approach in the past.
>
> -Hoss

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
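[Editor's sketch] The re-sorting step suggested in the first reply — turning the stats facet sums into a descending weighted facet list — can look like this. The `sample` dict only mimics the shape of Solr's stats response; the field name `weight` and facet field `tag` are assumptions:

```python
def weighted_facets(response, stats_field, facet_field):
    """Return (facet value, sum) pairs sorted by descending weighted count."""
    facets = response["stats"]["stats_fields"][stats_field]["facets"][facet_field]
    return sorted(((value, stats["sum"]) for value, stats in facets.items()),
                  key=lambda pair: pair[1], reverse=True)

# Mimicked response for the example: "Light Weight" with weights 20 + 100.
sample = {"stats": {"stats_fields": {"weight": {"facets": {"tag": {
    "Light Weight": {"sum": 120.0, "count": 2},
    "Heavy Duty": {"sum": 35.0, "count": 1}}}}}}}
print(weighted_facets(sample, "weight", "tag"))
# [('Light Weight', 120.0), ('Heavy Duty', 35.0)]
```

This produces the "Light Weight (120)" style of listing the question asked for, with the ordering done client-side since Solr sorts facets by document count, not by the stats sum.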
Re: Tuning StatsComponent
What field type do you recommend for a float stats.field for optimal Solr 1.4.1 StatsComponent performance: float, pfloat or tfloat? Do you recommend indexing the field?

2011/1/12 stockii
>
> my field type is "double", maybe "sint" is better? but i need double ... =(
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Tuning-StatsComponent-tp2225809p2241903.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: solrconfig luceneMatchVersion 2.9.3
according to

http://www.mail-archive.com/solr-user@lucene.apache.org/msg40491.html

there is no more trunk support for 2.9 indexes. So I tried the suggested solution to execute an optimize to convert a 2.9.3 index to a 3.x index. However, when I tried to optimize a 2.9.3 index using the Solr 4.0 trunk version with luceneMatchVersion set to LUCENE_30 in the solrconfig.xml, I am getting

SimplePostTool: POSTing file optimize.xml
SimplePostTool: FATAL: Solr returned an error: Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. - java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported in file '_0.fdx': 1 (needs to be between 2 and 2). This version of Lucene only supports indexes created with release 3.0 and later.

Is there any other mechanism for converting index files to 3.x?

2011/1/6 Johannes Goll
> Hi,
>
> our index files have been created using Lucene 2.9.3 and solr 1.4.1.
>
> I am trying to use a patched version of the current trunk (solr 1.5.0 ?).
> The patched version works fine with newly generated index data but
> not with our existing data:
>
> After adjusting the solrconfig.xml - I added the line
>
> <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
>
> also tried
>
> <luceneMatchVersion>LUCENE_30</luceneMatchVersion>
>
> I am getting the following exception
>
> "java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException:
> Format version is not supported in file '_q.fdx': 1 (needs to be between 2 and 2)"
>
> When I try to change it to LUCENE_29 or 2.9 or 2.9.3, I am getting
>
> "SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion
> '2.9', valid values are: [LUCENE_30, LUCENE_31, LUCENE_40, LUCENE_CURRENT]
> or a string in format 'V.V'"
>
> Do you know a way to make this work with Lucene version 2.9.3 ?
>
> Thanks,
> Johannes

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
solrconfig luceneMatchVersion 2.9.3
Hi,

our index files have been created using Lucene 2.9.3 and solr 1.4.1.

I am trying to use a patched version of the current trunk (solr 1.5.0 ?). The patched version works fine with newly generated index data but not with our existing data:

After adjusting the solrconfig.xml - I added the line

<luceneMatchVersion>LUCENE_40</luceneMatchVersion>

also tried

<luceneMatchVersion>LUCENE_30</luceneMatchVersion>

I am getting the following exception

"java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported in file '_q.fdx': 1 (needs to be between 2 and 2)"

When I try to change it to LUCENE_29 or 2.9 or 2.9.3, I am getting

"SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion '2.9', valid values are: [LUCENE_30, LUCENE_31, LUCENE_40, LUCENE_CURRENT] or a string in format 'V.V'"

Do you know a way to make this work with Lucene version 2.9.3 ?

Thanks,
Johannes
Solr 1.4.1 stats component count not matching facet count for multi valued field
Hi,

I have a facet field called option, which may be multi-valued, and a weight field, which is single-valued. When I use the Solr 1.4.1 stats component with a facet field, i.e.

q=*:*&version=2.2&stats=true&stats.field=weight&stats.facet=option

I get conflicting results for the stats count when compared with the faceting counts obtained by

q=*:*&version=2.2&facet=true&facet.field=option

I would expect the same count for either method. This happens if multiple values are stored in the option field. It seems that for multiple values only the last entered value is being considered in the stats component. What am I doing wrong here?

Thanks,
Johannes