Re: UI

2012-05-21 Thread Johannes Goll
Yes, I am using this library and it has worked perfectly so far. If
something does not work, you can just modify it:
http://code.google.com/p/solr-php-client/
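
For illustration, a minimal sketch of the basic usage (host, port, core path,
the query and the field name below are assumptions, not anything specific to
your setup):

<?php
require_once('Apache/Solr/Service.php');

// connect to a local Solr instance (adjust host/port/path as needed)
$solr = new Apache_Solr_Service('localhost', 8983, '/solr/');

// query string, result offset, number of rows
$results = $solr->search('category:books', 0, 10);

// iterate over the matching documents; 'name' is a placeholder field
foreach ($results->response->docs as $doc) {
    echo $doc->name . "\n";
}
?>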

Johannes
2012/5/21 Tolga to...@ozses.net:
 Hi,

 Can you recommend a good PHP UI for search? Is SolrPHPClient good?


Re: summing facets on a specific field

2012-02-06 Thread Johannes Goll
you can use the StatsComponent

http://wiki.apache.org/solr/StatsComponent

with stats=true&stats.price=category&stats.facet=category

and pull the sum fields from the resulting stats facets.

Johannes

2012/2/5 Paul Kapla paul.ka...@gmail.com:
 Hi everyone,
 I'm pretty new to Solr and I'm not sure if this can even be done. Is there
 a way to sum a specific field for each item in a facet? For example, you
 have an ecommerce site that has the following documents:

 id,category,name,price
 1,books,'solr book',$10.00
 2,books,'lucene in action',$12.00
 3,video,'cool video',$20.00

 so instead of getting (when faceting on category)
 books(2)
 video(1)

 I'd like to get:
 books ($22)
 video ($20)

 Is this something that can even be done? Any feedback would be much
 appreciated.



-- 
Dipl.-Ing.(FH)
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
USA


Re: summing facets on a specific field

2012-02-06 Thread Johannes Goll
I meant

stats=true&stats.field=price&stats.facet=category
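
For illustration, with the example documents from the original post the
request and (abridged) response would look roughly like this; the exact XML
layout may differ slightly between versions:

/select?q=*:*&rows=0&stats=true&stats.field=price&stats.facet=category

<lst name="stats">
  <lst name="stats_fields">
    <lst name="price">
      <double name="sum">42.0</double>
      <lst name="facets">
        <lst name="category">
          <lst name="books">
            <double name="sum">22.0</double>
          </lst>
          <lst name="video">
            <double name="sum">20.0</double>
          </lst>
        </lst>
      </lst>
    </lst>
  </lst>
</lst>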

2012/2/6 Johannes Goll johannes.g...@gmail.com:
 you can use the StatsComponent

 http://wiki.apache.org/solr/StatsComponent

 with stats=true&stats.price=category&stats.facet=category

 and pull the sum fields from the resulting stats facets.

 Johannes

 2012/2/5 Paul Kapla paul.ka...@gmail.com:
 Hi everyone,
 I'm pretty new to Solr and I'm not sure if this can even be done. Is there
 a way to sum a specific field for each item in a facet? For example, you
 have an ecommerce site that has the following documents:

 id,category,name,price
 1,books,'solr book',$10.00
 2,books,'lucene in action',$12.00
 3,video,'cool video',$20.00

 so instead of getting (when faceting on category)
 books(2)
 video(1)

 I'd like to get:
 books ($22)
 video ($20)

 Is this something that can even be done? Any feedback would be much
 appreciated.



 --
 Dipl.-Ing.(FH)
 Johannes Goll
 211 Curry Ford Lane
 Gaithersburg, Maryland 20878
 USA



-- 
Dipl.-Ing.(FH)
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
USA


Update Solr Schema To Store Field

2012-02-01 Thread Johannes Goll
Hi,

I am running apache-solr-3.1.0 and would like to change a field
attribute from stored=false to
stored=true.
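
For concreteness, the change amounts to something like this in schema.xml
(the field name and type here are placeholders, not my actual schema):

before: <field name="myfield" type="string" indexed="true" stored="false"/>
after:  <field name="myfield" type="string" indexed="true" stored="true"/>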

I have several hundred cores that were indexed without storing the
field, which is fine, as I only need to retrieve the value for new data
that I plan to index with the updated schema.

My question is whether this change affects the query behavior for the
existing indexed documents, which were loaded with stored="false".

Thanks a lot,
Johannes


Re: Hierarchical faceting in UI

2012-01-23 Thread Johannes Goll
Another way is to store the original hierarchy in a SQL database (in
the form: id, parent_id, name, level) and, in the Lucene index, store the
complete hierarchy (from the root to the leaf node) for each document in
one field, using the ids from the SQL database. That way you can get
documents at any level of the hierarchy. You can use the SQL database to
dynamically expand the tree by building facet queries that fetch the
document collections of child nodes.

Johannes


 e.g., the path from the root level down to the leaf node stored in one field: 1 13 32 42 23 12
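
A rough sketch of what I mean (the field name category_path, the ids and the
names below are made up for illustration):

SQL rows (id, parent_id, name, level):   1, NULL, Products,    0
                                        13,    1, Electronics, 1
                                        32,   13, Cameras,     2

document field (whitespace-tokenized):  category_path: 1 13 32

To drill into the children of node 13, filter on its id and facet on the same
field, then use the SQL table to map the returned ids back to names and keep
only the ids whose parent_id is 13:

/select?q=*:*&fq=category_path:13&facet=true&facet.field=category_path&rows=0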

2012/1/23  dar...@ontrenet.com:

 On Mon, 23 Jan 2012 14:33:00 -0800 (PST), Yuhao nfsvi...@yahoo.com
 wrote:
 Programmatically, something like this might work: for each facet field,
 add another hidden field that identifies its parent.  Then, program
 additional logic in the UI to show only the facet terms at the currently
 selected level.  For example, if one filters on cat:electronics, the new
 UI logic would apply the additional filter cat_parent:electronics. Can
 this be done?

 Yes. This is how I do it.

 Would it be a lot of work?
 No. It's not a lot of work; simply represent your hierarchy as parent/child
 relations in the document fields, and in your UI drill down by issuing new
 faceted searches. Use the current facet (tree level) as the parent:level
 in the next query. It's much easier than the other suggestions for this.

 Is there a better way?
 Not in my opinion, there isn't. This is the simplest to implement and
 understand.


 By the way, Flamenco (another faceted browser) has built-in support for
 hierarchies, and it has worked well for my data in this aspect (but less
 well than Solr in others).  I'm looking for the same kind of
 hierarchical
 UI feature in Solr.


Re: Multi CPU Cores

2011-10-17 Thread Johannes Goll
Yes, same thing. Those settings were for the Jetty servlet container, not Tomcat.
I would refer to the Tomcat documentation on how to configure the Java runtime
environment (JRE) arguments for your running instance.
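
One common way to do this, assuming a standard Tomcat 6 layout (the options
below are just the ones we use with Jetty, as placeholders):

# $CATALINA_HOME/bin/setenv.sh (create it if it does not exist; catalina.sh picks it up)
CATALINA_OPTS="-server -XX:+UseParallelGC -Xms3G -Xmx40G"
export CATALINA_OPTS
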
Johannes

On Oct 17, 2011, at 4:01 AM, Robert Brown r...@intelcompute.com wrote:

 Where exactly do you set this up?  We're running Solr 3.4 under Tomcat,
 OpenJDK 1.6.0.20
 
 btw, is the JRE just a different name for the VM?  Apologies for such a
 newbie Java question.
 
 
 
 On Sun, 16 Oct 2011 12:51:44 -0400, Johannes Goll
 johannes.g...@gmail.com wrote:
 We use the following in production:
 
 java -server -XX:+UseParallelGC -XX:+AggressiveOpts
 -XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port=your-port
 -Dsolr.solr.home=your-solr-home -jar start.jar
 
 more information
 http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
 
 Johannes
 


Re: Multi CPU Cores

2011-10-16 Thread Johannes Goll
Try using the -XX:+UseParallelGC VM option.

Johannes

On Oct 16, 2011, at 7:51 AM, Ken Krugler kkrugler_li...@transpac.com wrote:

 
 On Oct 16, 2011, at 1:44pm, Rob Brown wrote:
 
 Looks like I checked the load during a quiet period, ab -n 1 -c 1000
 saw a decent 40% load on each core.
 
 Still a little confused as to why 1 core stays at 100% constantly - even
 during the quiet periods?
 
 Could be background GC, depending on what you've got your JVM configured to 
 use.
 
 Though that shouldn't stay at 100% for very long.
 
 -- Ken
 
 
 -Original Message-
 From: Johannes Goll johannes.g...@gmail.com
 Reply-to: solr-user@lucene.apache.org
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Subject: Re: Multi CPU Cores
 Date: Sat, 15 Oct 2011 21:30:11 -0400
 
 Did you try to submit multiple search requests in parallel? The Apache ab
 tool is a great tool for simulating simultaneous load (using -n and -c).
 Johannes
 
 On Oct 15, 2011, at 7:32 PM, Rob Brown r...@intelcompute.com wrote:
 
 Hi,
 
 I'm running Solr on a machine with 16 CPU cores, yet watching top
 shows that java is only apparently using 1 and maxing it out.
 
 Is there anything that can be done to take advantage of more CPU cores?
 
 Solr 3.4 under Tomcat
 
 [root@solr01 ~]# java -version
 java version 1.6.0_20
 OpenJDK Runtime Environment (IcedTea6 1.9.8)
 (rhel-1.22.1.9.8.el5_6-x86_64)
 OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
 
 
 top - 14:36:18 up 22 days, 21:54,  4 users,  load average: 1.89, 1.24, 1.08
 Tasks: 317 total,   1 running, 315 sleeping,   0 stopped,   1 zombie
 Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 99.6%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu6  : 99.6%us,  0.4%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu13 :  0.7%us,  0.0%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Mem:  132088928k total, 23760584k used, 108328344k free,   318228k buffers
 Swap: 25920868k total,        0k used, 25920868k free, 18371128k cached
 
   PID USER    PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
  4466 tomcat  20   0 31.2g 4.0g 171m S 101.0  3.2  2909:38 java
  6495 root    15   0 42416 3892 1740 S   0.4  0.0  9:34.71 openvpn
 11456 root    16   0 12892 1312  836 R   0.4  0.0  0:00.08 top
     1 root    15   0 10368  632  536 S   0.0  0.0  0:04.69 init
 
 
 
 --
 Ken Krugler
 +1 530-210-6378
 http://bixolabs.com
 custom big data solutions  training
 Hadoop, Cascading, Mahout  Solr
 
 
 


Re: Multi CPU Cores

2011-10-16 Thread Johannes Goll
We use the following in production:

java -server -XX:+UseParallelGC -XX:+AggressiveOpts
-XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port=your-port
-Dsolr.solr.home=your-solr-home -jar start.jar

more information
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

Johannes


Re: Multi CPU Cores

2011-10-15 Thread Johannes Goll
Did you try to submit multiple search requests in parallel? The Apache ab tool
is a great tool for simulating simultaneous load (using -n and -c).
Johannes

On Oct 15, 2011, at 7:32 PM, Rob Brown r...@intelcompute.com wrote:

 Hi,
 
 I'm running Solr on a machine with 16 CPU cores, yet watching top
 shows that java is only apparently using 1 and maxing it out.
 
 Is there anything that can be done to take advantage of more CPU cores?
 
 Solr 3.4 under Tomcat
 
 [root@solr01 ~]# java -version
 java version 1.6.0_20
 OpenJDK Runtime Environment (IcedTea6 1.9.8)
 (rhel-1.22.1.9.8.el5_6-x86_64)
 OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
 
 
 top - 14:36:18 up 22 days, 21:54,  4 users,  load average: 1.89, 1.24, 1.08
 Tasks: 317 total,   1 running, 315 sleeping,   0 stopped,   1 zombie
 Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 99.6%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu6  : 99.6%us,  0.4%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu13 :  0.7%us,  0.0%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Mem:  132088928k total, 23760584k used, 108328344k free,   318228k buffers
 Swap: 25920868k total,        0k used, 25920868k free, 18371128k cached
 
   PID USER    PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
  4466 tomcat  20   0 31.2g 4.0g 171m S 101.0  3.2  2909:38 java
  6495 root    15   0 42416 3892 1740 S   0.4  0.0  9:34.71 openvpn
 11456 root    16   0 12892 1312  836 R   0.4  0.0  0:00.08 top
     1 root    15   0 10368  632  536 S   0.0  0.0  0:04.69 init
 
 


Re: Huge performance drop in distributed search w/ shards on the same server/container

2011-06-14 Thread Johannes Goll
I increased the maximum POST size and headerBufferSize to 10MB; lowThreads
to 50, maxThreads to 10 and lowResourceMaxIdleTime to 15000. We tried
Tomcat 6 using the following Connector settings:

<Connector port="8989"
           protocol="HTTP/1.1"
           redirectPort="8443"
           URIEncoding="UTF-8"
           maxThreads="1"
           maxKeepAliveRequests="200"
           maxHttpHeaderSize="10485760"
           />

I am getting the same exception as for Jetty:

SEVERE: org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: java.net.SocketException:
Connection reset

This seems to point towards a Solr-specific issue (solrj.SolrServerException
during individual shard searches). I monitored the CPU utilization while
executing sequential distributed searches and noticed that in the beginning
all CPUs are used for a short period of time (multiple lines for shard
searches are shown in the log with isShard=true arguments); then all CPUs
except one become idle, and the request is processed by this one CPU for
the longest period of time.

I also noticed in the logs that while most of the individual shard searches
(isShard=true) have low QTimes (5-10), a minority has extreme QTimes
(104402-105126). All shards are fairly similar in size and content (1.2 M
documents) and the StatsComponent is being used
[stats=true&stats.field=weight&stats.facet=library_id]. Here library_id
equals the shard/core name.

Is there an internal timeout for gathering shard results or some other fixed
resource limitation?
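
For reference, the distributed request looks roughly like this (host, port
and core names below are placeholders):

/solr/lib1/select?q=*:*&rows=0&stats=true&stats.field=weight&stats.facet=library_id&shards=localhost:8983/solr/lib1,localhost:8983/solr/lib2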

Johannes






2011/6/13 Yonik Seeley yo...@lucidimagination.com

 On Sun, Jun 12, 2011 at 9:10 PM, Johannes Goll johannes.g...@gmail.com
 wrote:
  However, Jetty 6.1.2X (shipped with Solr 3.1) sporadically throws Socket
  connect exceptions when executing distributed searches.

 Are you using the exact jetty.xml that shipped with the solr example
 server,
 or did you make any modifications?

 -Yonik
 http://www.lucidimagination.com




-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878


Re: Huge performance drop in distributed search w/ shards on the same server/container

2011-06-12 Thread Johannes Goll
Hi Fred,

we are having similar issues scaling Solr 3.1 distributed searches on a
single box with 18 cores. We use the StatsComponent, which seems to be mainly
CPU bound. Using distributed searches resulted in a 9-fold decrease in
response time. However, Jetty 6.1.2X (shipped with Solr 3.1) sporadically
throws Socket connect exceptions when executing distributed searches. Our
next step is to switch from Jetty to Tomcat. Did you find a solution for
improving the CPU utilization and requests per second for your system?

Johannes


2011/5/26 pravesh suyalprav...@yahoo.com

 Do you really require multi-shards? A single core/shard will do even for
 millions of documents, and the search will be faster than searching on
 multi-shards.

 Consider multi-shard when you cannot scale up on a single shard/machine
 (e.g., CPU, RAM etc. becomes the major bottleneck).

 Also read through the SOLR distributed search wiki to check on all the
 tuning required at the application server (Tomcat) end, like the maxHTTP
 request settings. For a single request in a multi-shard setup, internal
 HTTP requests are made to all queried shards, so make sure you set this
 parameter higher.

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Huge-performance-drop-in-distributed-search-w-shards-on-the-same-server-container-tp2938421p2988464.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878


Re: apache-solr-3.1 slow stats component queries

2011-05-05 Thread Johannes Goll
Hi,

I benchmarked the slow stats queries (a 6-point estimate) using the same
hardware on an index of 104M documents. We use a Solr/Lucene 3.1 mod which
returns only the sum and count for statistics component results. Solr/Lucene
is run on Jetty.

The relationship between query time and the set of found documents is linear
when using the stats component (R^2 = 0.99). I guess this is expected, as the
application needs to scan and sum up the stats field for all matching
documents?

Are there any plans for caching stats results for a certain stats field along
with the documents that match a filter query? Any other ideas that could
help to improve this (hardware/software configuration)? Even for a subset
of 10M entries, the stats search takes on the order of 10 seconds.

Thanks in advance.
Johannes



2011/4/18 Johannes Goll johannes.g...@gmail.com

 any ideas why in this case the stats summaries are so slow? Thank you
 very much in advance for any ideas/suggestions. Johannes


 2011/4/5 Johannes Goll johannes.g...@gmail.com

 Hi,

 thank you for making the new apache-solr-3.1 available.

 I have installed the version from

 http://apache.tradebit.com/pub//lucene/solr/3.1.0/

 and am running into very slow stats component queries (~ 1 minute)
 for fetching the computed sum of the stats field

 url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight

 <int name="QTime">52825</int>

 #documents: 78,359,699
 total RAM: 256G
 vm arguments: -server -Xmx40G

 the stats.field specification is as follows:
 <field name="weight" type="pfloat" indexed="true"
  stored="false" required="true" multiValued="false"
  default="1"/>

 filter queries that narrow down the #docs help to reduce it -
 QTime seems to be proportional to the number of docs being returned
 by a filter query.

 Is there any way to improve the performance of such stats queries ?
 Caching only helped to improve the filter query performance but if
 larger subsets are being returned, QTime increases unacceptably.

 Since I only need the sum and not the STD or sumsOfSquares/Min/Max,
 I have created a custom 3.1 version that does only return the sum. But
 this
 only slightly improved the performance. Of course I could somehow cache
 the larger sum queries on the client side but I want to do this only as a
 last resort.

 Thank you very much in advance for any ideas/suggestions.

 Johannes




 --
 Johannes Goll
 211 Curry Ford Lane
 Gaithersburg, Maryland 20878



Re: apache-solr-3.1 slow stats component queries

2011-04-18 Thread Johannes Goll
Any ideas why in this case the stats summaries are so slow? Thank you
very much in advance for any ideas/suggestions. Johannes

2011/4/5 Johannes Goll johannes.g...@gmail.com

 Hi,

 thank you for making the new apache-solr-3.1 available.

 I have installed the version from

 http://apache.tradebit.com/pub//lucene/solr/3.1.0/

 and am running into very slow stats component queries (~ 1 minute)
 for fetching the computed sum of the stats field

 url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight

 <int name="QTime">52825</int>

 #documents: 78,359,699
 total RAM: 256G
 vm arguments: -server -Xmx40G

 the stats.field specification is as follows:
 <field name="weight" type="pfloat" indexed="true"
  stored="false" required="true" multiValued="false"
  default="1"/>

 filter queries that narrow down the #docs help to reduce it -
 QTime seems to be proportional to the number of docs being returned
 by a filter query.

 Is there any way to improve the performance of such stats queries ?
 Caching only helped to improve the filter query performance but if
 larger subsets are being returned, QTime increases unacceptably.

 Since I only need the sum and not the STD or sumsOfSquares/Min/Max,
 I have created a custom 3.1 version that does only return the sum. But this
 only slightly improved the performance. Of course I could somehow cache
 the larger sum queries on the client side but I want to do this only as a
 last resort.

 Thank you very much in advance for any ideas/suggestions.

 Johannes




-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878


apache-solr-3.1 slow stats component queries

2011-04-05 Thread Johannes Goll
Hi,

thank you for making the new apache-solr-3.1 available.

I have installed the version from

http://apache.tradebit.com/pub//lucene/solr/3.1.0/

and am running into very slow stats component queries (~ 1 minute)
for fetching the computed sum of the stats field

url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight

<int name="QTime">52825</int>

#documents: 78,359,699
total RAM: 256G
vm arguments: -server -Xmx40G

the stats.field specification is as follows:
<field name="weight" type="pfloat" indexed="true"
 stored="false" required="true" multiValued="false"
 default="1"/>

Filter queries that narrow down the number of documents help to reduce it;
QTime seems to be proportional to the number of documents returned by a
filter query.
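
For example, a filtered stats request looks like this (the filter field and
value here are placeholders for whatever subset is being summed):

?q=*:*&fq=library:subset1&rows=0&stats=true&stats.field=weight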

Is there any way to improve the performance of such stats queries?
Caching only helped to improve the filter query performance, but if
larger subsets are being returned, QTime increases unacceptably.

Since I only need the sum and not the standard deviation or
sumOfSquares/min/max, I have created a custom 3.1 version that only returns
the sum, but this only slightly improved the performance. Of course I could
somehow cache the larger sum queries on the client side, but I want to do
that only as a last resort.

Thank you very much in advance for any ideas/suggestions.

Johannes


Re: solr upgrade question

2011-03-31 Thread Johannes Goll
Hi Alexander,

I posted the same question a few months ago. The only solution that came up
was to regenerate the index files using the new version. How exactly did you
do this with Luke 1.0.1? Would you mind sharing some of that magic?

Best,
Johannes


2011/3/31 Alexander Aristov alexander.aris...@gmail.com

 Didn't get any responses.

 But I tried Luke 1.0.1 and it did the magic. I ran an optimization and after
 that Solr came up.

 Best Regards
 Alexander Aristov


 On 30 March 2011 15:47, Alexander Aristov alexander.aris...@gmail.com
 wrote:

  People
 
  Is there a way to upgrade an existing index from Solr 1.4 to Solr 4 (trunk)?
  When I configured Solr 4 and launched it, it complained about an incorrect
  Lucene file version (3 instead of the old 2).

  Are there any procedures to convert the index?
 
 
  Best Regards
  Alexander Aristov
 



Re: Adding weightage to the facets count

2011-01-25 Thread Johannes Goll
Hi Siva,

try using the Solr Stats Component
http://wiki.apache.org/solr/StatsComponent

similar to
select/?q=*:*&stats=true&stats.field={your-weight-field}&stats.facet={your-facet-field}

and get the sum field from the response. You may need to re-sort the weighted
facet sums to get a descending list of facet counts.

Note, there is a bug for using the Stats Component with multi-valued facet
fields.

For details see
https://issues.apache.org/jira/browse/SOLR-1782

Johannes

2011/1/24 Chris Hostetter hossman_luc...@fucit.org


 : prod1 has tag called “Light Weight” with weightage 20,
 : prod2 has tag called “Light Weight” with weightage 100,
 :
 : If i get facet for “Light Weight” , i will get Light Weight (2) ,
 : here i need to consider the weightage in to account, and the result will
 be
 : Light Weight (120)
 :
 : How can we achieve this?Any ideas are really helpful.


 It's not really possible with Solr out of the box.  Faceting is fast and
 efficient in Solr because it's all done using set intersections (and most
 of the sets can be kept in RAM very compactly and reused).  For what you
 are describing you'd need to not only associate a weighted payload with
 every TermPosition, but also factor that weight in when doing the
 faceting, which means efficient set operations are now out the window.

 If you know Java it would probably be possible to write a custom
 SolrPlugin (a SearchComponent) to do this type of faceting in special
 cases (assuming you indexed in a particular way) but I'm not sure off the
 top of my head how well it would scale -- the basic algorithm I'm thinking of
 is (after indexing each facet term with a weight payload) to iterate over
 the DocSet of all matching documents in parallel with an iteration over
 a TermPositions, skipping ahead to only the docs that match the query, and
 recording the sum of the payloads for each term.

 Hmmm...

 except TermPositions iterates over term, doc, freq, position tuples,
 so you would have to iterate over every term, and for every term then loop
 over all matching docs ... like I said, not sure how efficient it would
 wind up being.

 You might be happier all around if you just do some sampling -- store the
 tag+weight pairs so that they can be retrieved with each doc, and then
 when you get your top facet constraints back, look at the first page of
 results, and figure out what the sum weight is for each of those
 constraints based solely on the page#1 results.

 I've had happy users using a similar approach in the past.

 -Hoss




-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878


Re: Tuning StatsComponent

2011-01-13 Thread Johannes Goll
What field type do you recommend for a float stats.field for optimal Solr
1.4.1 StatsComponent performance?

float, pfloat or tfloat?

Do you recommend indexing the field?


2011/1/12 stockii st...@shopgate.com


 my field type is double; maybe sint is better? But I need double...
 =(
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Tuning-StatsComponent-tp2225809p2241903.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: solrconfig luceneMatchVersion 2.9.3

2011-01-07 Thread Johannes Goll
according to
http://www.mail-archive.com/solr-user@lucene.apache.org/msg40491.html

there is no more trunk support for 2.9 indexes.

So I tried the suggested solution of executing an optimize to convert a 2.9.3
index to a 3.x index.
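
For reference, the optimize itself is just a plain update message (this
assumes the standard example post.jar; posting to the update handler with
.../update?optimize=true should work as well):

contents of optimize.xml:  <optimize/>

java -jar post.jar optimize.xml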

However, when I tried to optimize a 2.9.3 index using the Solr 4.0 trunk
version with luceneMatchVersion set to LUCENE_30 in solrconfig.xml,
I am getting:

SimplePostTool: POSTing file optimize.xml
SimplePostTool: FATAL: Solr returned an error: Severe errors in solr
configuration.  Check your log files for more detailed information on what
may be wrong.  -
java.lang.RuntimeException:
org.apache.lucene.index.IndexFormatTooOldException: Format version is not
supported in file '_0.fdx': 1 (needs to be between 2 and 2). This version of
Lucene only supports indexes created with release 3.0 and later.

Is there any other mechanism for converting index files to 3.x?



2011/1/6 Johannes Goll johannes.g...@gmail.com

 Hi,

 our index files have been created using Lucene 2.9.3 and solr 1.4.1.

 I am trying to use a patched version of the current trunk (solr 1.5.0 ? ).
 The patched version works fine with newly generated index data but
 not with our existing data:

 After adjusting the solrconfig.xml  - I added the line

   <luceneMatchVersion>LUCENE_40</luceneMatchVersion>

 also tried

   <luceneMatchVersion>LUCENE_30</luceneMatchVersion>

 I am getting the following exception

 java.lang.RuntimeException: 
 org.apache.lucene.index.IndexFormatTooOldException:
 Format version is not supported in file '_q.fdx': 1 (needs to be between 2 
 and 2)

 When I try to change it to

   <luceneMatchVersion>LUCENE_29</luceneMatchVersion>

 or

   <luceneMatchVersion>2.9</luceneMatchVersion>

 or

   <luceneMatchVersion>2.9.3</luceneMatchVersion>

 I am getting

 SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion
 '2.9', valid values are: [LUCENE_30, LUCENE_31, LUCENE_40, LUCENE_CURRENT]
 or a string in format 'V.V'

 Do you know a way to make this work with Lucene version 2.9.3 ?

 Thanks,
 Johannes




-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878


solrconfig luceneMatchVersion 2.9.3

2011-01-06 Thread Johannes Goll
Hi,

our index files have been created using Lucene 2.9.3 and solr 1.4.1.

I am trying to use a patched version of the current trunk (solr 1.5.0 ? ).
The patched version works fine with newly generated index data but
not with our existing data:

After adjusting the solrconfig.xml  - I added the line

  <luceneMatchVersion>LUCENE_40</luceneMatchVersion>

also tried

  <luceneMatchVersion>LUCENE_30</luceneMatchVersion>

I am getting the following exception

java.lang.RuntimeException:
org.apache.lucene.index.IndexFormatTooOldException:
Format version is not supported in file '_q.fdx': 1 (needs to be
between 2 and 2)

When I try to change it to

  <luceneMatchVersion>LUCENE_29</luceneMatchVersion>

or

  <luceneMatchVersion>2.9</luceneMatchVersion>

or

  <luceneMatchVersion>2.9.3</luceneMatchVersion>

I am getting

SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion
'2.9', valid values are: [LUCENE_30, LUCENE_31, LUCENE_40, LUCENE_CURRENT]
or a string in format 'V.V'

Do you know a way to make this work with Lucene version 2.9.3 ?

Thanks,
Johannes


Solr 1.4.1 stats component count not matching facet count for multi valued field

2010-12-23 Thread Johannes Goll
Hi,

I have a facet field called option which may be multi-valued and
a weight field which is single-valued.

When I use the Solr 1.4.1 stats component with a facet field, i.e.

q=*:*&version=2.2&stats=true&stats.field=weight&stats.facet=option

I get conflicting results for the stats count result
<long name="count">1</long>

when compared with the faceting counts obtained by

q=*:*&version=2.2&facet=true&facet.field=option

I would expect the same count for either method.

This happens when multiple values are stored in the option field.

It seems that for multiple values only the last entered value is being
considered by the stats component? What am I doing wrong here?

Thanks,
Johannes