Re: UI
yes, I am using this library and it works perfectly so far. If something does not work you can just modify it: http://code.google.com/p/solr-php-client/

Johannes

2012/5/21 Tolga :
> Hi,
>
> Can you recommend a good PHP UI to search? Is SolrPHPClient good?
Re: summing facets on a specific field
I meant stats=true&stats.field=price&stats.facet=category

2012/2/6 Johannes Goll :
> you can use the StatsComponent
>
> http://wiki.apache.org/solr/StatsComponent
>
> with stats=true&stats.price=category&stats.facet=category
>
> and pull the sum fields from the resulting stats facets.
>
> Johannes
>
> 2012/2/5 Paul Kapla :
>> Hi everyone,
>> I'm pretty new to solr and I'm not sure if this can even be done. Is there
>> a way to sum a specific field per each item in a facet. For example, you
>> have an ecommerce site that has the following documents:
>>
>> id,category,name,price
>> 1,books,'solr book', $10.00
>> 2,books,'lucene in action', $12.00
>> 3,video, 'cool video', $20.00
>>
>> so instead of getting (when faceting on category)
>> books(2)
>> video(1)
>>
>> I'd like to get:
>> books ($22)
>> video ($20)
>>
>> Is this something that can even be done? Any feedback would be much
>> appreciated.
>
> --
> Dipl.-Ing.(FH)
> Johannes Goll
> 211 Curry Ford Lane
> Gaithersburg, Maryland 20878
> USA

--
Dipl.-Ing.(FH)
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
USA
Re: summing facets on a specific field
you can use the StatsComponent

http://wiki.apache.org/solr/StatsComponent

with stats=true&stats.price=category&stats.facet=category

and pull the sum fields from the resulting stats facets.

Johannes

2012/2/5 Paul Kapla :
> Hi everyone,
> I'm pretty new to solr and I'm not sure if this can even be done. Is there
> a way to sum a specific field per each item in a facet. For example, you
> have an ecommerce site that has the following documents:
>
> id,category,name,price
> 1,books,'solr book', $10.00
> 2,books,'lucene in action', $12.00
> 3,video, 'cool video', $20.00
>
> so instead of getting (when faceting on category)
> books(2)
> video(1)
>
> I'd like to get:
> books ($22)
> video ($20)
>
> Is this something that can even be done? Any feedback would be much
> appreciated.

--
Dipl.-Ing.(FH)
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
USA
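[Editor's sketch] Using the corrected parameters from the follow-up in this thread (stats.field=price, stats.facet=category), the request and the extraction of per-category sums can be sketched as follows. The host, port, and core name `products` are assumptions, and the `sample` dict only mimics the shape of Solr's JSON stats section:

```python
from urllib.parse import urlencode

# Build the stats request; wt=json is assumed for easy client-side parsing.
params = {
    "q": "*:*",
    "rows": 0,
    "stats": "true",
    "stats.field": "price",
    "stats.facet": "category",
    "wt": "json",
}
url = "http://localhost:8983/solr/products/select?" + urlencode(params)

def facet_sums(response):
    """Extract {facet value: sum} from a Solr stats response dict."""
    facets = response["stats"]["stats_fields"]["price"]["facets"]["category"]
    return {value: stats["sum"] for value, stats in facets.items()}

# Mimicked response for the thread's example documents.
sample = {"stats": {"stats_fields": {"price": {"sum": 42.0, "facets": {
    "category": {"books": {"sum": 22.0, "count": 2},
                 "video": {"sum": 20.0, "count": 1}}}}}}}
print(facet_sums(sample))  # {'books': 22.0, 'video': 20.0}
```

This yields exactly the "books ($22) / video ($20)" style of result the original question asked for.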
Update Solr Schema To Store Field
Hi,

I am running apache-solr-3.1.0 and would like to change a field attribute from stored="false" to stored="true". I have several hundred cores that have been indexed without storing the field, which is fine, as I only would like to retrieve the value for new data that I plan to index with the updated schema. My question is whether this change affects the query behavior for the existing indexed documents which were loaded with stored="false".

Thanks a lot,
Johannes
Re: Hierarchical faceting in UI
another way is to store the original hierarchy in a sql database (in the form: id, parent_id, name, level) and in the Lucene index store the complete hierarchy (from root to leaf node) for each document in one field using the ids of the sql database, e.g. "1 13 32 42 23 12". In that way you can get documents at any level of the hierarchy. You can use the sql database to dynamically expand the tree by building facet queries to fetch document collections of child-nodes.

Johannes

2012/1/23 :
>
> On Mon, 23 Jan 2012 14:33:00 -0800 (PST), Yuhao wrote:
>> Programmatically, something like this might work: for each facet field,
>> add another hidden field that identifies its parent. Then, program
>> additional logic in the UI to show only the facet terms at the currently
>> selected level. For example, if one filters on "cat:electronics", the new
>> UI logic would apply the additional filter "cat_parent:electronics". Can
>> this be done?
>
> Yes. This is how I do it.
>
>> Would it be a lot of work?
> No. It's not a lot of work, simply represent your hierarchy as parent/child
> relations in the document fields and in your UI drill down by issuing new
> faceted searches. Use the current facet (tree level) as the parent: in
> the next query. It's much easier than other suggestions for this.
>
>> Is there a better way?
> Not in my opinion, there isn't. This is the simplest to implement and
> understand.
>
>> By the way, Flamenco (another faceted browser) has built-in support for
>> hierarchies, and it has worked well for my data in this aspect (but less
>> well than Solr in others). I'm looking for the same kind of hierarchical
>> UI feature in Solr.
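[Editor's sketch] The id-path scheme described above can be illustrated in a few lines. The parent map stands in for the SQL table, and the field name `category_path` is an assumption:

```python
# Parent map as it might come from the SQL table (id -> parent_id).
parents = {1: None, 13: 1, 32: 13, 42: 32}

def path_from_root(node_id):
    """Return the id path from the root down to node_id, e.g. "1 13 32"."""
    path = []
    while node_id is not None:
        path.append(node_id)
        node_id = parents[node_id]
    return " ".join(str(i) for i in reversed(path))

# Index-time: each document stores the full id path of its leaf category.
doc = {"id": "doc-1", "category_path": path_from_root(42)}

# Query-time: because every ancestor id is a token in the field, a plain
# term filter on a node id matches that node and all of its descendants.
fq = "category_path:13"
print(doc["category_path"])  # 1 13 32 42
```

The design choice here is that descendant lookup needs no recursive SQL at query time; the database is only consulted to render and expand the tree in the UI.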
Re: Multi CPU Cores
Yes, same thing. This was for the Jetty servlet container, not Tomcat. I would refer to the Tomcat documentation on how to modify/configure the Java runtime environment (JRE) arguments for your running instance.

Johannes

On Oct 17, 2011, at 4:01 AM, Robert Brown wrote:
> Where exactly do you set this up? We're running Solr 3.4 under Tomcat,
> OpenJDK 1.6.0.20
>
> btw, is the JRE just a different name for the VM? Apologies for such a
> newbie Java question.
>
> On Sun, 16 Oct 2011 12:51:44 -0400, Johannes Goll wrote:
>> we use the following in production
>>
>> java -server -XX:+UseParallelGC -XX:+AggressiveOpts
>> -XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port=
>> -Dsolr.solr.home= -jar start.jar
>>
>> more information:
>> http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
>>
>> Johannes
Re: Multi CPU Cores
we use the following in production:

java -server -XX:+UseParallelGC -XX:+AggressiveOpts -XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port= -Dsolr.solr.home= -jar start.jar

more information:
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

Johannes
Re: Multi CPU Cores
Try using -XX:+UseParallelGC as a VM option.

Johannes

On Oct 16, 2011, at 7:51 AM, Ken Krugler wrote:
>
> On Oct 16, 2011, at 1:44pm, Rob Brown wrote:
>
>> Looks like I checked the load during a quiet period, ab -n 1 -c 1000
>> saw a decent 40% load on each core.
>>
>> Still a little confused as to why 1 core stays at 100% constantly - even
>> during the quiet periods?
>
> Could be background GC, depending on what you've got your JVM configured to
> use.
>
> Though that shouldn't stay at 100% for very long.
>
> -- Ken
>
>> -Original Message-
>> From: Johannes Goll
>> Reply-to: solr-user@lucene.apache.org
>> To: solr-user@lucene.apache.org
>> Subject: Re: Multi CPU Cores
>> Date: Sat, 15 Oct 2011 21:30:11 -0400
>>
>> Did you try to submit multiple search requests in parallel? The apache ab
>> tool is a great tool to simulate simultaneous load (using -n and -c).
>> Johannes
>>
>> On Oct 15, 2011, at 7:32 PM, Rob Brown wrote:
>>
>>> Hi,
>>>
>>> I'm running Solr on a machine with 16 CPU cores, yet watching "top"
>>> shows that java is only apparently using 1 and maxing it out.
>>>
>>> Is there anything that can be done to take advantage of more CPU cores?
>>>
>>> Solr 3.4 under Tomcat
>>>
>>> [root@solr01 ~]# java -version
>>> java version "1.6.0_20"
>>> OpenJDK Runtime Environment (IcedTea6 1.9.8) (rhel-1.22.1.9.8.el5_6-x86_64)
>>> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
>>>
>>> top - 14:36:18 up 22 days, 21:54, 4 users, load average: 1.89, 1.24, 1.08
>>> Tasks: 317 total, 1 running, 315 sleeping, 0 stopped, 1 zombie
>>> Cpu0  : 0.0%us, 0.0%sy, 0.0%ni, 99.6%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu1  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu2  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu3  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu4  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu5  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu6  : 99.6%us, 0.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu7  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu8  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu9  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu13 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Mem: 132088928k total, 23760584k used, 108328344k free, 318228k buffers
>>> Swap: 25920868k total, 0k used, 25920868k free, 18371128k cached
>>>
>>>   PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
>>>  4466 tomcat 20  0 31.2g 4.0g 171m S 101.0  3.2 2909:38 java
>>>  6495 root   15  0 42416 3892 1740 S   0.4  0.0  9:34.71 openvpn
>>> 11456 root   16  0 12892 1312  836 R   0.4  0.0  0:00.08 top
>>>     1 root   15  0 10368  632  536 S   0.0  0.0  0:04.69 init
>
> --
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
Re: Multi CPU Cores
Did you try to submit multiple search requests in parallel? The apache ab tool is a great tool to simulate simultaneous load (using -n and -c).

Johannes

On Oct 15, 2011, at 7:32 PM, Rob Brown wrote:
> Hi,
>
> I'm running Solr on a machine with 16 CPU cores, yet watching "top"
> shows that java is only apparently using 1 and maxing it out.
>
> Is there anything that can be done to take advantage of more CPU cores?
>
> Solr 3.4 under Tomcat
>
> [root@solr01 ~]# java -version
> java version "1.6.0_20"
> OpenJDK Runtime Environment (IcedTea6 1.9.8) (rhel-1.22.1.9.8.el5_6-x86_64)
> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
>
> top - 14:36:18 up 22 days, 21:54, 4 users, load average: 1.89, 1.24, 1.08
> Tasks: 317 total, 1 running, 315 sleeping, 0 stopped, 1 zombie
> Cpu0  : 0.0%us, 0.0%sy, 0.0%ni, 99.6%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6  : 99.6%us, 0.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu8  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu9  : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu13 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 132088928k total, 23760584k used, 108328344k free, 318228k buffers
> Swap: 25920868k total, 0k used, 25920868k free, 18371128k cached
>
>   PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
>  4466 tomcat 20  0 31.2g 4.0g 171m S 101.0  3.2 2909:38 java
>  6495 root   15  0 42416 3892 1740 S   0.4  0.0  9:34.71 openvpn
> 11456 root   16  0 12892 1312  836 R   0.4  0.0  0:00.08 top
>     1 root   15  0 10368  632  536 S   0.0  0.0  0:04.69 init
Re: Huge performance drop in distributed search w/ shards on the same server/container
I increased the maximum POST size and headerBufferSize to 10MB; lowThreads to 50, maxThreads to 10 and lowResourceMaxIdleTime=15000. We tried Tomcat 6 using the following Connector settings: I am getting the same exception as for Jetty

SEVERE: org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset

This seems to point towards a Solr specific issue (solrj.SolrServerException during individual shard searches). I monitored the CPU utilization executing sequential distributed searches and noticed that in the beginning all CPUs are getting used for a short period of time (multiple lines for shard searches are shown in the log with isShard=true arguments), then all CPUs except one become idle and the request is being processed by this one CPU for the longest period of time.

I also noticed in the logs that while most of the individual shard searches (isShard=true) have low QTimes (5-10), a minority has extreme QTimes (104402-105126). All shards are fairly similar in size and content (1.2 M documents) and the StatsComponent is being used [stats=true&stats.field=weight&stats.facet=library_id]. Here library_id equals the shard/core name.

Is there an internal timeout for gathering shard results or other fixed resource limitation?

Johannes

2011/6/13 Yonik Seeley
> On Sun, Jun 12, 2011 at 9:10 PM, Johannes Goll wrote:
> > However, sporadically, Jetty 6.1.2X (shipped with Solr 3.1)
> > sporadically throws Socket connect exceptions when executing distributed
> > searches.
>
> Are you using the exact jetty.xml that shipped with the solr example server,
> or did you make any modifications?
>
> -Yonik
> http://www.lucidimagination.com

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
Re: Huge performance drop in distributed search w/ shards on the same server/container
Hi Fred,

we are having similar issues scaling Solr 3.1 distributed searches on a single box with 18 cores. We use the StatsComponent, which seems to be mainly CPU bound. Using distributed searches resulted in a 9-fold decrease in response time. However, Jetty 6.1.2X (shipped with Solr 3.1) sporadically throws Socket connect exceptions when executing distributed searches. Our next step is to switch from Jetty to Tomcat.

Did you find a solution for improving the CPU utilization and requests per second for your system?

Johannes

2011/5/26 pravesh
> Do you really require multi-shards? Single core/shard will do for even
> millions of documents and the search will be faster than searching on
> multi-shards.
>
> Consider multi-shard when you cannot scale-up on a single shard/machine
> (e.g., CPU, RAM etc. becomes major block).
>
> Also read through the SOLR distributed search wiki to check on all tuning up
> required at application server (Tomcat) end, like maxHTTP request settings.
> For a single request in a multi-shard setup internal HTTP requests are made
> through all queried shards, so, make sure you set this parameter higher.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Huge-performance-drop-in-distributed-search-w-shards-on-the-same-server-container-tp2938421p2988464.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
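[Editor's sketch] For reference, a distributed request against several cores on one box is an ordinary query plus a `shards` list naming every shard the coordinating core should fan out to. The host, port, and core names below are assumptions, not the poster's actual setup:

```python
from urllib.parse import urlencode

host = "localhost:8983"
# 18 cores on one box, as in the message above; names are hypothetical.
cores = [f"library{i:02d}" for i in range(1, 19)]

params = {
    "q": "*:*",
    "rows": 0,
    "stats": "true",
    "stats.field": "weight",
    "stats.facet": "library_id",
    # Comma-separated list of every shard to query.
    "shards": ",".join(f"{host}/solr/{core}" for core in cores),
}
url = f"http://{host}/solr/{cores[0]}/select?" + urlencode(params)
print(url)
```

The coordinating core issues one internal HTTP request per entry in `shards`, which is why the thread discusses servlet-container thread and connection limits.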
Re: apache-solr-3.1 slow stats component queries
Hi,

I bench-marked the slow stats queries (6 point estimate) using the same hardware on an index of size 104M. We use a Solr/Lucene 3.1-mod which returns only the sum and count for statistics component results. Solr/Lucene is run on Jetty.

The relationship between query time and the size of the set of found documents is linear when using the stats component (R^2 0.99). I guess this is expected, as the application needs to scan/sum-up the stat field for all matching documents?

Are there any plans for caching stats results for a certain stats field along with the documents that match a filter query? Any other ideas that could help to improve this (hardware/software configuration)? Even for a subset of 10M entries, the stats search takes on the order of 10 seconds.

Thanks in advance.
Johannes

2011/4/18 Johannes Goll
> any ideas why in this case the stats summaries are so slow ? Thank you
> very much in advance for any ideas/suggestions. Johannes
>
> 2011/4/5 Johannes Goll
>> Hi,
>>
>> thank you for making the new apache-solr-3.1 available.
>>
>> I have installed the version from
>>
>> http://apache.tradebit.com/pub//lucene/solr/3.1.0/
>>
>> and am running into very slow stats component queries (~ 1 minute)
>> for fetching the computed sum of the stats field
>>
>> url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight
>>
>> QTime: 52825
>>
>> #documents: 78,359,699
>> total RAM: 256G
>> vm arguments: -server -Xmx40G
>>
>> the stats.field specification is as follows:
>> stored="false" required="true" multiValued="false" default="1"/>
>>
>> filter queries that narrow down the #docs help to reduce it -
>> QTime seems to be proportional to the number of docs being returned
>> by a filter query.
>>
>> Is there any way to improve the performance of such stats queries ?
>> Caching only helped to improve the filter query performance but if
>> larger subsets are being returned, QTime increases unacceptably.
>>
>> Since I only need the sum and not the STD or sumsOfSquares/Min/Max,
>> I have created a custom 3.1 version that does only return the sum. But this
>> only slightly improved the performance. Of course I could somehow cache
>> the larger sum queries on the client side but I want to do this only as a
>> last resort.
>>
>> Thank you very much in advance for any ideas/suggestions.
>>
>> Johannes
>
> --
> Johannes Goll
> 211 Curry Ford Lane
> Gaithersburg, Maryland 20878
Re: apache-solr-3.1 slow stats component queries
any ideas why in this case the stats summaries are so slow? Thank you very much in advance for any ideas/suggestions.

Johannes

2011/4/5 Johannes Goll
> Hi,
>
> thank you for making the new apache-solr-3.1 available.
>
> I have installed the version from
>
> http://apache.tradebit.com/pub//lucene/solr/3.1.0/
>
> and am running into very slow stats component queries (~ 1 minute)
> for fetching the computed sum of the stats field
>
> url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight
>
> QTime: 52825
>
> #documents: 78,359,699
> total RAM: 256G
> vm arguments: -server -Xmx40G
>
> the stats.field specification is as follows:
> stored="false" required="true" multiValued="false" default="1"/>
>
> filter queries that narrow down the #docs help to reduce it -
> QTime seems to be proportional to the number of docs being returned
> by a filter query.
>
> Is there any way to improve the performance of such stats queries ?
> Caching only helped to improve the filter query performance but if
> larger subsets are being returned, QTime increases unacceptably.
>
> Since I only need the sum and not the STD or sumsOfSquares/Min/Max,
> I have created a custom 3.1 version that does only return the sum. But this
> only slightly improved the performance. Of course I could somehow cache
> the larger sum queries on the client side but I want to do this only as a
> last resort.
>
> Thank you very much in advance for any ideas/suggestions.
>
> Johannes

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
apache-solr-3.1 slow stats component queries
Hi,

thank you for making the new apache-solr-3.1 available.

I have installed the version from

http://apache.tradebit.com/pub//lucene/solr/3.1.0/

and am running into very slow stats component queries (~ 1 minute) for fetching the computed sum of the stats field

url: ?q=*:*&start=0&rows=0&stats=true&stats.field=weight

QTime: 52825

#documents: 78,359,699
total RAM: 256G
vm arguments: -server -Xmx40G

the stats.field specification is as follows:
stored="false" required="true" multiValued="false" default="1"/>

filter queries that narrow down the #docs help to reduce it - QTime seems to be proportional to the number of docs being returned by a filter query.

Is there any way to improve the performance of such stats queries? Caching only helped to improve the filter query performance but if larger subsets are being returned, QTime increases unacceptably.

Since I only need the sum and not the STD or sumsOfSquares/Min/Max, I have created a custom 3.1 version that does only return the sum. But this only slightly improved the performance. Of course I could somehow cache the larger sum queries on the client side but I want to do this only as a last resort.

Thank you very much in advance for any ideas/suggestions.

Johannes
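[Editor's sketch] The client-side caching mentioned above as a last resort can be a simple memoization keyed by the filter query. `fetch_sum` is a hypothetical stand-in for the actual Solr round trip, not part of any Solr API:

```python
import functools

expensive_calls = []  # records each simulated Solr round trip

def fetch_sum(fq):
    """Hypothetical stand-in for the request
    ?q=*:*&fq=...&rows=0&stats=true&stats.field=weight
    returning the stats sum for that filter query."""
    expensive_calls.append(fq)
    return 42.0  # placeholder value

@functools.lru_cache(maxsize=1024)
def cached_sum(fq):
    # Identical filter queries are served from the cache instead of Solr.
    # The cache must be cleared (cached_sum.cache_clear()) after commits,
    # since the sums change when documents are added or deleted.
    return fetch_sum(fq)

cached_sum("library_id:lib01")
cached_sum("library_id:lib01")  # second call: cache hit, no round trip
print(len(expensive_calls))  # 1
```

This only helps for repeated identical filter queries; it does nothing for the linear scan cost inside Solr that the benchmark above measures.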
Re: solr upgrade question
Hi Alexander,

I posted the same question a few months ago. The only solution that came up was to regenerate the index files using the new version. How did you do this exactly with Luke 1.0.1? Would you mind sharing some of that magic?

Best,
Johannes

2011/3/31 Alexander Aristov
> Didn't get any responses.
>
> But I tried luke 1.0.1 and it did the magic. I ran optimization and after
> that solr got up.
>
> Best Regards
> Alexander Aristov
>
> On 30 March 2011 15:47, Alexander Aristov wrote:
> > People
> >
> > Is there a way to upgrade an existing index from solr 1.4 to solr 4 (trunk)?
> > When I configured solr 4 and launched it, it complained about an incorrect
> > lucene file version (3 instead of old 2)
> >
> > Are there any procedures to convert the index?
> >
> > Best Regards
> > Alexander Aristov
Re: Adding weightage to the facets count
Hi Siva,

try using the Solr Stats Component

http://wiki.apache.org/solr/StatsComponent

similar to

select/?&q=*:*&stats=true&stats.field={your-weight-field}&stats.facet={your-facet-field}

and get the sum field from the response. You may need to re-sort the weighted facet counts to get a descending list of facet counts.

Note, there is a bug for using the Stats Component with multi-valued facet fields. For details see

https://issues.apache.org/jira/browse/SOLR-1782

Johannes

2011/1/24 Chris Hostetter
>
> : prod1 has tag called “Light Weight” with weightage 20,
> : prod2 has tag called “Light Weight” with weightage 100,
> :
> : If i get facet for “Light Weight” , i will get Light Weight (2) ,
> : here i need to consider the weightage in to account, and the result will be
> : Light Weight (120)
> :
> : How can we achieve this? Any ideas are really helpful.
>
> It's not really possible with Solr out of the box. Faceting is fast and
> efficient in Solr because it's all done using set intersections (and most
> of the sets can be kept in ram very compactly and reused). For what you
> are describing you'd need to not only associate a weighted payload with
> every TermPosition, but also factor that weight in when doing the
> faceting, which means efficient set operations are now out the window.
>
> If you know java it would probably be possible to write a custom
> SolrPlugin (a SearchComponent) to do this type of faceting in special
> cases (assuming you indexed in a particular way) but i'm not sure off the
> top of my head how well it would scale -- the basic algo i'm thinking of
> is (after indexing each facet term with a weight payload) to iterate over
> the DocSet of all matching documents in parallel with an iteration over
> a TermPositions, skipping ahead to only the docs that match the query, and
> recording the sum of the payloads for each term.
>
> Hmmm... except TermPositions iterates over tuples,
> so you would have to iterate over every term, and for every term then loop
> over all matching docs ... like i said, not sure how efficient it would
> wind up being.
>
> You might be happier all around if you just do some sampling -- store the
> tag+weight pairs so that they can be retrieved with each doc, and then
> when you get your top facet constraints back, look at the first page of
> results, and figure out what the sum "weight" is for each of those
> constraints based solely on the page#1 results.
>
> i've had happy users using a similar approach in the past.
>
> -Hoss

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
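[Editor's sketch] The re-sorting step suggested in the first reply — turning the stats facet sums into a descending weighted facet list — can look like this. The `sample` dict only mimics the shape of Solr's stats response; the field name `weight` and facet field `tag` are assumptions:

```python
def weighted_facets(response, stats_field, facet_field):
    """Return (facet value, sum) pairs sorted by descending weighted count."""
    facets = response["stats"]["stats_fields"][stats_field]["facets"][facet_field]
    return sorted(((value, stats["sum"]) for value, stats in facets.items()),
                  key=lambda pair: pair[1], reverse=True)

# Mimicked response for the example: "Light Weight" with weights 20 + 100.
sample = {"stats": {"stats_fields": {"weight": {"facets": {"tag": {
    "Light Weight": {"sum": 120.0, "count": 2},
    "Heavy Duty": {"sum": 35.0, "count": 1}}}}}}}
print(weighted_facets(sample, "weight", "tag"))
# [('Light Weight', 120.0), ('Heavy Duty', 35.0)]
```

This produces the "Light Weight (120)" style of listing the question asked for, with the ordering done client-side since Solr sorts facets by document count, not by the stats sum.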
Re: Tuning StatsComponent
What field type do you recommend for a float stats.field for optimal Solr 1.4.1 StatsComponent performance: float, pfloat or tfloat? Do you recommend indexing the field?

2011/1/12 stockii
>
> my field type is "double", maybe "sint" is better? but i need double ... =(
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Tuning-StatsComponent-tp2225809p2241903.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: solrconfig luceneMatchVersion 2.9.3
according to

http://www.mail-archive.com/solr-user@lucene.apache.org/msg40491.html

there is no more trunk support for 2.9 indexes. So I tried the suggested solution to execute an optimize to convert a 2.9.3 index to a 3.x index. However, when I tried to optimize a 2.9.3 index using the Solr 4.0 trunk version with luceneMatchVersion set to LUCENE_30 in the solrconfig.xml, I am getting

SimplePostTool: POSTing file optimize.xml
SimplePostTool: FATAL: Solr returned an error: Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. - java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported in file '_0.fdx': 1 (needs to be between 2 and 2). This version of Lucene only supports indexes created with release 3.0 and later.

Is there any other mechanism for converting index files to 3.x?

2011/1/6 Johannes Goll
> Hi,
>
> our index files have been created using Lucene 2.9.3 and solr 1.4.1.
>
> I am trying to use a patched version of the current trunk (solr 1.5.0 ?).
> The patched version works fine with newly generated index data but
> not with our existing data:
>
> After adjusting the solrconfig.xml - I added the line
>
> <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
>
> also tried
>
> <luceneMatchVersion>LUCENE_30</luceneMatchVersion>
>
> I am getting the following exception
>
> "java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException:
> Format version is not supported in file '_q.fdx': 1 (needs to be between 2 and 2)"
>
> When I try to change it to LUCENE_29 or 2.9 or 2.9.3, I am getting
>
> "SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion
> '2.9', valid values are: [LUCENE_30, LUCENE_31, LUCENE_40, LUCENE_CURRENT]
> or a string in format 'V.V'"
>
> Do you know a way to make this work with Lucene version 2.9.3 ?
>
> Thanks,
> Johannes

--
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878
solrconfig luceneMatchVersion 2.9.3
Hi,

our index files have been created using Lucene 2.9.3 and solr 1.4.1.

I am trying to use a patched version of the current trunk (solr 1.5.0 ?). The patched version works fine with newly generated index data but not with our existing data:

After adjusting the solrconfig.xml - I added the line

<luceneMatchVersion>LUCENE_40</luceneMatchVersion>

also tried

<luceneMatchVersion>LUCENE_30</luceneMatchVersion>

I am getting the following exception

"java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported in file '_q.fdx': 1 (needs to be between 2 and 2)"

When I try to change it to LUCENE_29 or 2.9 or 2.9.3, I am getting

"SEVERE: org.apache.solr.common.SolrException: Invalid luceneMatchVersion '2.9', valid values are: [LUCENE_30, LUCENE_31, LUCENE_40, LUCENE_CURRENT] or a string in format 'V.V'"

Do you know a way to make this work with Lucene version 2.9.3 ?

Thanks,
Johannes
Solr 1.4.1 stats component count not matching facet count for multi valued field
Hi,

I have a facet field called option, which may be multi-valued, and a weight field, which is single-valued. When I use the Solr 1.4.1 stats component with a facet field, i.e.

q=*:*&version=2.2&stats=true&stats.field=weight&stats.facet=option

I get conflicting results for the stats count when compared with the faceting counts obtained by

q=*:*&version=2.2&facet=true&facet.field=option

I would expect the same count for either method. This happens if multiple values are stored in the option field. It seems that for multiple values only the last entered value is being considered in the stats component. What am I doing wrong here?

Thanks,
Johannes