Grouping on a multi-valued field

2014-05-27 Thread Bhoomit Vasani
 Hi,

Does the latest release of Solr support grouping on a multi-valued field?

According to this
https://wiki.apache.org/solr/FieldCollapsing#Known_Limitations it doesn't,
but the doc was last updated 14 months ago...

-- 
-- 
Thanks & Regards,
Bhoomit Vasani | SE @ Mygola
WE are LIVE http://www.mygola.com/!
91-8892949849


Re: Applying boosting for keyword search

2014-05-27 Thread manju16832003
Hi Jack,
Thank you for the suggestions. :-)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Applying-boosting-for-keyword-search-tp4137523p4138239.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to apply multiplicative Boost in multivalued field

2014-05-27 Thread Aman Tandon
Any help here?

With Regards
Aman Tandon


On Mon, May 26, 2014 at 5:13 PM, Aman Tandon amantandon...@gmail.com wrote:

 Hi,

 I am confused about how to apply a multiplicative boost on a multivalued
 field.

 <field name="plid" type="string" indexed="true" stored="true"
 required="false" omitNorms="true" multiValued="true" />



 Suppose in plid the value goes like 111,1234,2345,4567,2335,9876,67

 I am applying the filters on the plid like *...&fq=plid:(111 1234 2345
 4567 2335 9876 67)*

 Now I need to apply the boost on the first three plid values as well; it is a
 multivalued field, so help me out here.

 With Regards
 Aman Tandon



Re: Applying boosting for keyword search

2014-05-27 Thread manju16832003
Hi Erick,

Your explanation leads me to one question :-)

if
*/select?q=featured:true^100&fq=make:toyota&sort=featured_date desc,price
asc*

The above query, without edismax, works well because, if I'm not mistaken,
it's boosting documents by the value method.

So I'm boosting all my documents with the value featured=true and all those
documents would be sorted by their featured date in descending order (Latest
featured documents) and price (lower to higher).

My question is,
If we were to boost the documents based on a value, how could we ensure
the order of the documents?

For example : 
https://wiki.apache.org/solr/SolrRelevancyFAQ
defType=dismax&qf=text&q=supervillians&bf=popularity

In the above case, all the documents that contain the word *popularity*
would be on top, depending on their score.

However, I want to order the documents by certain criteria that contain the
word popularity, so we would have to use *sort* to order the documents.

If we say boosting has no, or almost no, effect when we use sort, then what's
the contradiction between *sort* and *boost*?

:-) would be interesting to know the answer



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Applying-boosting-for-keyword-search-tp4137523p4138241.html
Sent from the Solr - User mailing list archive at Nabble.com.


Error uploading csv/json from exampledocs folder

2014-05-27 Thread Mukundaraman valakumaresan
Hi

I am using Solr version 4.8. When I upload the CSV and JSON files provided
in the exampledocs folder, I get a 400 response and the following
exception:

Could be a bug...

7955 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
QuerySenderListener sending requests to Searcher@3a5b069c[collection1]
main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
QuerySenderListener done.
7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
[collection1] Registered new searcher Searcher@3a5b069c[collection1]
main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
7957 [qtp756319399-19] INFO
 org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
webapp=/solr path=/update params={commit=true} {commit=} 0 975
11929 [qtp756319399-16] INFO
 org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
webapp=/solr path=/update params={} {} 0 2
11930 [qtp756319399-16] ERROR org.apache.solr.core.SolrCore  –
org.apache.solr.common.SolrException: Unexpected character '[' (code 91) in
prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)


Re: Error uploading csv/json from exampledocs folder

2014-05-27 Thread Alexandre Rafalovitch
What's the command line you are using? Or a sequence of steps if you
are not using post.jar.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, May 27, 2014 at 2:34 PM, Mukundaraman valakumaresan
muk...@8kmiles.com wrote:
 Hi

 I am using Solr version 4.8. When I upload the CSV and JSON files provided
 in the exampledocs folder, I get a 400 response and the following
 exception:

 Could be a bug...

 7955 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
 QuerySenderListener sending requests to Searcher@3a5b069c[collection1]
 main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
 7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
 QuerySenderListener done.
 7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
 [collection1] Registered new searcher Searcher@3a5b069c[collection1]
 main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
 7957 [qtp756319399-19] INFO
  org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
 webapp=/solr path=/update params={commit=true} {commit=} 0 975
 11929 [qtp756319399-16] INFO
  org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
 webapp=/solr path=/update params={} {} 0 2
 11930 [qtp756319399-16] ERROR org.apache.solr.core.SolrCore  –
 org.apache.solr.common.SolrException: Unexpected character '[' (code 91) in
 prolog; expected '<'
  at [row,col {unknown-source}]: [1,1]
 at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
 at
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
 at
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)


Re: Error uploading csv/json from exampledocs folder

2014-05-27 Thread Mukundaraman valakumaresan
Yes, I am using post.jar:

java -jar post.jar *.csv
java -jar post.jar *.json


On Tue, May 27, 2014 at 1:17 PM, Alexandre Rafalovitch
arafa...@gmail.com wrote:

 What's the command line you are using? Or a sequence of steps if you
 are not using post.jar.

 Regards,
Alex.
 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr
 proficiency


 On Tue, May 27, 2014 at 2:34 PM, Mukundaraman valakumaresan
 muk...@8kmiles.com wrote:
  Hi
 
  I am using Solr version 4.8. When I upload the CSV and JSON files provided in
  the exampledocs folder, I get a 400 response and the following
  exception:
 
  Could be a bug...
 
  7955 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
  QuerySenderListener sending requests to Searcher@3a5b069c[collection1]
  main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
  7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
  QuerySenderListener done.
  7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
  [collection1] Registered new searcher Searcher@3a5b069c[collection1]
  main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
  7957 [qtp756319399-19] INFO
   org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
  webapp=/solr path=/update params={commit=true} {commit=} 0 975
  11929 [qtp756319399-16] INFO
   org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
  webapp=/solr path=/update params={} {} 0 2
  11930 [qtp756319399-16] ERROR org.apache.solr.core.SolrCore  –
  org.apache.solr.common.SolrException: Unexpected character '[' (code 91) in
  prolog; expected '<'
   at [row,col {unknown-source}]: [1,1]
  at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
  at
 
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
  at
 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
  at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)



Re: Full Indexing fails on Solr - Probable connection issue. HELP!

2014-05-27 Thread Aniket Bhoi
On Mon, May 26, 2014 at 4:14 PM, Aniket Bhoi aniket.b...@gmail.com wrote:

 Another thing I have noted is that the exception always follows a commit
 operation. Log excerpt below:

 INFO: SolrDeletionPolicy.onCommit: commits:num=2
 commit{dir=/opt/solr/cores/calls/data/index,segFN=segments_2qt,version=1347458723267,generation=3557,filenames=[_3z9.tii,
 _3z3.fnm, _3z9.nrm, _3za.prx, _3z9.fdt, _3z9.fnm, _3z9.fdx, _3z3.frq,
 _3za.nrm, segments_2qt, _3z3.fdx, _3z9.prx, _3z3.fdt, _3za.fdx, _3z9.frq,
 _3z3.prx, _3za.fdt, _3z3.tii, _3za.tis, _3za.fnm, _3z3.nrm, _3z9.tis,
 _3za.tii, _3za.frq, _3z3.tis]
  
 commit{dir=/opt/solr/cores/calls/data/index,segFN=segments_2qu,version=1347458723269,generation=3558,filenames=[_3zb.fdt,
 _3z9.tii, _3z3.fnm, _3z9.nrm, _3zb.tii, _3zb.tis, _3zb.fdx, _3za.prx,
 _3z9.fdt, _3z9.fnm, _3z9.fdx, _3zb.frq, _3z3.frq, _3za.nrm, segments_2qu,
 _3z3.fdx, _3zb.prx, _3z9.prx, _3zb.fnm, _3z3.fdt, _3za.fdx, _3z9.frq,
 _3z3.prx, _3za.fdt, _3zb.nrm, _3z3.tii, _3za.tis, _3za.fnm, _3z3.nrm,
 _3z9.tis, _3za.tii, _3za.frq, _3z3.tis]
 May 24, 2014 5:49:05 AM org.apache.solr.core.SolrDeletionPolicy
 updateCommits
 INFO: newest commit = 1347458723269
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher init
 INFO: Opening Searcher@423dbcca main
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming Searcher@423dbcca main from Searcher@19c19869 main

 fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming result for Searcher@423dbcca main

 fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming Searcher@423dbcca main from Searcher@19c19869 main

 filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming result for Searcher@423dbcca main

 filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming Searcher@423dbcca main from Searcher@19c19869 main

 queryResultCache{lookups=1,hits=1,hitratio=1.00,inserts=3,evictions=0,size=3,warmupTime=2,cumulative_lookups=47,cumulative_hits=46,cumulative_hitratio=0.97,cumulative_inserts=1,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming result for Searcher@423dbcca main

 queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=2,cumulative_lookups=47,cumulative_hits=46,cumulative_hitratio=0.97,cumulative_inserts=1,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming Searcher@423dbcca main from Searcher@19c19869 main

 documentCache{lookups=0,hits=0,hitratio=0.00,inserts=40,evictions=0,size=40,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming result for Searcher@423dbcca main

 documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 May 24, 2014 5:49:05 AM org.apache.solr.core.QuerySenderListener
 newSearcher
 INFO: QuerySenderListener sending requests to Searcher@423dbcca main
 May 24, 2014 5:49:05 AM org.apache.solr.core.SolrCore execute
 INFO: [calls] webapp=null path=null
 params={start=0&event=newSearcher&q=*:*&rows=20} hits=40028 status=0
 QTime=2
 May 24, 2014 5:49:05 AM org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: end_commit_flush
 May 24, 2014 5:49:05 AM org.apache.solr.core.SolrCore execute
 INFO: [calls] webapp=null path=null
 params={start=0&event=newSearcher&q=banking&rows=20} hits=636 status=0
 QTime=3
 May 24, 2014 5:49:05 AM org.apache.solr.core.QuerySenderListener
 newSearcher
 INFO: QuerySenderListener done.
 May 24, 2014 5:49:05 AM
 org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener
 newSearcher
 INFO: Index is not optimized therefore skipping building spell check index
 for: default
 May 24, 2014 5:49:05 AM org.apache.solr.core.SolrCore registerSearcher
 INFO: [calls] Registered new searcher Searcher@423dbcca main
 May 

Re: Error uploading csv/json from exampledocs folder

2014-05-27 Thread Alexandre Rafalovitch
You are missing the content type, so your CSV/JSON is being
(mis-)treated as XML.

If you run java -jar post.jar -h, you will see the correct usage examples:
*) java -Dtype=text/csv -jar post.jar *.csv
*) java -Dtype=application/json -jar post.jar *.json

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, May 27, 2014 at 2:59 PM, Mukundaraman valakumaresan
muk...@8kmiles.com wrote:
 Yes, I am using post.jar:

 java -jar post.jar *.csv
 java -jar post.jar *.json


 On Tue, May 27, 2014 at 1:17 PM, Alexandre Rafalovitch
 arafa...@gmail.com wrote:

 What's the command line you are using? Or a sequence of steps if you
 are not using post.jar.

 Regards,
Alex.
 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr
 proficiency


 On Tue, May 27, 2014 at 2:34 PM, Mukundaraman valakumaresan
 muk...@8kmiles.com wrote:
  Hi
 
  I am using Solr version 4.8. When I upload the CSV and JSON files provided in
  the exampledocs folder, I get a 400 response and the following
  exception:
 
  Could be a bug...
 
  7955 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
  QuerySenderListener sending requests to Searcher@3a5b069c[collection1]
  main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
  7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
  QuerySenderListener done.
  7956 [searcherExecutor-5-thread-1] INFO  org.apache.solr.core.SolrCore  –
  [collection1] Registered new searcher Searcher@3a5b069c[collection1]
  main{StandardDirectoryReader(segments_2:3:nrt _0(4.8):C32)}
  7957 [qtp756319399-19] INFO
   org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
  webapp=/solr path=/update params={commit=true} {commit=} 0 975
  11929 [qtp756319399-16] INFO
   org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
  webapp=/solr path=/update params={} {} 0 2
  11930 [qtp756319399-16] ERROR org.apache.solr.core.SolrCore  –
  org.apache.solr.common.SolrException: Unexpected character '[' (code 91) in
  prolog; expected '<'
   at [row,col {unknown-source}]: [1,1]
  at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
  at
 
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
  at
 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
  at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)



Classpath for Solr Plugins

2014-05-27 Thread Jan Nehring
Hi!

We have developed a custom tokenizer for Solr 4.3. Our Solr server is started 
as a war file inside a Java application using Jetty. When the custom 
tokenizer is in the class path of the enclosing Java application, Solr fails on 
startup. The error is a ClassCastException on ArabicLetterTokenizer, which is a 
class we don't use in our system. We can put the custom tokenizer in the Solr 
classpath from any other location, but it needs to reside in the classpath of 
the enclosing application due to the overall architecture of our system.

The solution to our problem was to manipulate Jetty's class loading behaviour:

Server server = new org.eclipse.jetty.server.Server();
WebAppContext webapp = new WebAppContext();
webapp.setParentLoaderPriority(true);   // <-- this line
server.setHandler(webapp);
...

Using setParentLoaderPriority(true) solved all our classpath problems. We 
didn't even have to use any of the various methods to load the custom tokenizer 
into the Solr class path. My questions:

1) This method works too well and is too simple to implement; where is the 
drawback?
2) If there is no drawback, why is it not mentioned in the Solr doc at 
https://wiki.apache.org/solr/SolrPlugins ?

Thank you very much, Jan




Re: (Issue) How improve solr facet performance

2014-05-27 Thread Alice.H.Yang (mis.cnsh04.Newegg) 41493
Hi, Toke

1.
I set the 3 fields with hundreds of values to use fc and the rest to use 
enum; the performance improved 2 times compared with no parameter, and then 
I added facet.method=20, and the performance improved about 4 times compared 
with no parameter.
And I also tried setting the 9 facet fields to one copyfield; I tested the 
performance, and it improved about 2.5 times compared with no parameter.
So, it has improved a lot under your advice, thanks a lot.
2.
Now I have another performance issue: it's the group performance. The 
amount of data is the same as in the facet performance scenario.
When the keyword search hits about one million documents, the QTime is about 
600ms. (It isn't the first query; the result is already in cache.)

Query url: 
select?fl=item_catalog&q=default_search:paramter&defType=edismax&rows=50&group=true&group.field=item_group_id&group.ngroups=true&group.sort=stock4sort%20desc,final_price%20asc,is_selleritem%20asc&sort=score%20desc,default_sort%20desc

It needs a QTime of about 600ms.

This query has two parameters:
1. fl with one field
2. group=true,
group.ngroups=true

If I set group=false, the QTime is only 1 ms.
But I need to do group and group.ngroups. How can I improve the group performance 
under this demand? Do you have some advice for me? I'm looking forward to your 
reply.

Best Regards,
Alice Yang
+86-021-51530666*41493
Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)


-----Original Message-----
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk] 
Sent: May 24, 2014 15:17
To: solr-user@lucene.apache.org
Subject: RE: (Issue) How improve solr facet performance

Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
 1.  I'm sorry, I have made a mistake, the total number of documents is 32 
 Million, not 320 Million.
 2.  The system memory is large for solr index, OS total has 256G, I set the 
 solr tomcat HEAPSIZE=-Xms25G -Xmx100G

100G is a very high number. What special requirements dictates such a large 
heap size?

 Reply:  9 fields I facet on.

Solr treats each facet separately and with facet.method=fc and 10M hits, this 
means that it will iterate 9*10M = 90M document IDs and update the counters for 
those.

 Reply:  3 facet fields have one hundred unique values, other 6 facet fields' 
 unique values are between 3 to 15.

So very low cardinality. This is confirmed by your low response time of 6ms for 
2925 hits.

 And we test this scenario:  If the number of facet fields' unique values is 
 less we add facet.method=enum, there is a little to improve performance.

That is a shame: enum is normally the simple answer to a setup like yours. Have 
you tried fine-tuning your fc/enum selection, so that the 3 fields with 
hundreds of values use fc and the rest use enum? That might halve your 
response time.


Since the number of unique facets is so low, I do not think that DocValues can 
help you here. Besides the fine-grained fc/enum-selection above, you could try 
collapsing all 9 facet-fields into a single field. The idea behind this is that 
for facet.method=fc, performing faceting on a field with (for example) 300 
unique values takes practically the same amount of time as faceting on a field 
with 1000 unique values: Faceting on a single slightly larger field is much 
faster than faceting on 9 smaller fields. After faceting with facet.limit=-1 on 
the single super-facet-field, you must match the returned values back to their 
original fields:


If you have the facet-fields

field0: 34
field1: 187
field2: 78432
field3: 3
...

then collapse them by or-ing a field-specific mask that is bigger than the max 
in any field, then put it all into a single field:

fieldAll: 0xA000 | 34
fieldAll: 0xA100 | 187
fieldAll: 0xA200 | 78432
fieldAll: 0xA300 | 3
...

perform the facet request on fieldAll with facet.limit=-1 and split the 
resulting counts with

for (entry: facetResultAll) {
  switch (0xFF00 & entry.value) {
case 0xA000:
  field0.add(entry.value, entry.count);
  break;
case 0xA100:
  field1.add(entry.value, entry.count);
  break;
...
  }
}
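
For illustration, here is a minimal, self-contained Java sketch of the same 
encode/decode idea (the 32-bit masks and the field count are hypothetical, 
not taken from the setup above):

import java.util.HashMap;
import java.util.Map;

public class FacetMaskDemo {
  // One mask per original field; each must be larger than any value in its field.
  static final int[] MASKS = {0xA0000000, 0xA1000000, 0xA2000000, 0xA3000000};
  static final int FIELD_BITS = 0xFF000000; // selects the field part of a token

  public static void main(String[] args) {
    // Encode: value 187 of field1 becomes a single token stored in fieldAll.
    int token = MASKS[1] | 187;

    // Decode: route a facet count on fieldAll back to its source field.
    Map<Integer, Integer> field1Counts = new HashMap<>();
    int count = 42; // hypothetical facet count returned for this token
    if ((token & FIELD_BITS) == MASKS[1]) {
      field1Counts.put(token & ~FIELD_BITS, count); // recovers the value 187
    }
    System.out.println(field1Counts); // prints {187=42}
  }
}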


Regards,
Toke Eskildsen, State and University Library, Denmark


Re: Classpath for Solr Plugins

2014-05-27 Thread Alexandre Rafalovitch
In terms of drawback, I think strong dependence on Jetty is probably
one. Custom Java code is another. Most of the people run Solr as a
black box.

For the proposed solution, as the document said, usually the plugins
are placed in the directory defined in solrconfig.xml. So, most of the
people do not need other methods.

I am glad you have the solutions that worked for you, but it's a
little hard to tell whether the original problem is applicable to
anybody else.

Perhaps, your email to the mailing list can be the documentation
required for those who have similar issues. If it becomes something
bigger, a documentation JIRA can be created (by anybody).

Regards,
  Alex.


Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, May 27, 2014 at 3:13 PM, Jan Nehring jan.nehr...@semperlink.com wrote:
 Hi!

 We have developed a custom tokenizer for Solr 4.3. Our Solr server is started 
 as a war file inside a Java application using Jetty. When the custom 
 tokenizer is in the class path of the enclosing Java application, Solr fails 
 on startup. The error is a ClassCastException on ArabicLetterTokenizer, which is a 
 class we don't use in our system. We can put the custom tokenizer in the Solr 
 classpath from any other location, but it needs to reside in the classpath of 
 the enclosing application due to the overall architecture of our system.

 The solution to our problem was to manipulate Jetty's class loading behaviour:

 Server server = new org.eclipse.jetty.server.Server();
 WebAppContext webapp = new WebAppContext();
 webapp.setParentLoaderPriority(true);   // <-- this line
 server.setHandler(webapp);
 ...

 Using setParentLoaderPriority(true) solved all our classpath problems. We 
 didn't even have to use any of the various methods to load the custom 
 tokenizer into the Solr class path. My questions:

 1) This method works too well and is too simple to implement; where is the 
 drawback?
 2) If there is no drawback, why is it not mentioned in the Solr doc at 
 https://wiki.apache.org/solr/SolrPlugins ?

 Thank you very much Jan




Re: ExtractingRequestHandler indexing zip files

2014-05-27 Thread marotosg
Hi,

Thanks for your answer Alexandre.
I have zip files with only one document inside per zip file. These documents
are mainly pdf,xml,html.

I tried to index tini.txt.gz file which is located in the trunk to be used
by extraction tests
\trunk\solr\contrib\extraction\src\test-files\extraction\tini.txt.gz

I get the same issue: only the name of the file inside tini.txt.gz gets
indexed as content. That means ExtractingRequestHandler can open the file,
because it's getting the name inside, but for some reason it is not reading the
content.

Any suggestions?

Thanks
Sergio




--
View this message in context: 
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p4138255.html
Sent from the Solr - User mailing list archive at Nabble.com.


Regex with local params is not working

2014-05-27 Thread Lokn
Hi,

With solr local params, the regex is not working.
My sample query: q={!qf=$myfield_qf}/[a-d]ad/, where I have myfield_qf
defined in the solrconfig.xml.

Any help here would be appreciated.

Thanks,
Lokesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Regex-with-local-params-is-not-working-tp4138257.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Regex with local params is not working

2014-05-27 Thread Alexandre Rafalovitch
What's your query parser type (default? dismax? edismax?). Default
type does not know about 'qf' parameter, so you may need to define
your type as well: http://wiki.apache.org/solr/LocalParams

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, May 27, 2014 at 3:38 PM, Lokn nlokesh...@gmail.com wrote:
 Hi,

 With solr local params, the regex is not working.
 My sample query: q ={!qf=$myfield_qf}/[a-d]ad/, where I have myfield_qf
 defined in the solrconfig.xml.

 Any help here would be appreciated.

 Thanks,
 Lokesh



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Regex-with-local-params-is-not-working-tp4138257.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to Configure Solr For Test Purposes?

2014-05-27 Thread Furkan KAMACI
Hi All;

I have defined just this:

 <autoSoftCommit>
   <maxTime>1</maxTime>
 </autoSoftCommit>

then turned off hard commit. I've run my tests time and time again and did
not get any error. Anyone who wants to write a unit test that interacts with
Solr in a situation like mine can use it.

Thanks;
Furkan KAMACI


2014-05-26 23:37 GMT+03:00 Furkan KAMACI furkankam...@gmail.com:

 Hi Shawn;

 I know that it is a bad practice, but I just commit up to 5 documents and
 there will not be more than 5 documents at any time in any test method. It
 is just for test purposes, to see that my API works. I want to have
 automatic tests.

 What do you suggest for my purpose? If a test case fails, is re-running it
 a few times maybe a solution? What kind of configuration do you suggest for
 my Solr configuration?

 Thanks;
 Furkan KAMACI
  On 26 May 2014, at 21:03, Shawn Heisey s...@elyograg.org wrote:

 On 5/26/2014 10:57 AM, Furkan KAMACI wrote:
  Hi;
 
 I run Solr within my Test Suite. I delete documents or atomically update
 them and check whether it works or not. I know that I have to set up
 hard/soft commit timing for my test Solr. However, even with these
 settings:
 
   <autoCommit>
     <maxTime>1</maxTime>
     <openSearcher>true</openSearcher>
   </autoCommit>

 <autoSoftCommit>
   <maxTime>1</maxTime>
 </autoSoftCommit>

I hope you know that this is a BAD configuration.  Doing automatic commits
 on an interval of 1 millisecond is asking for a whole host of problems.
  In some cases, this could do a commit after every single document that
 is indexed, which is NOT recommended at all.  The openSearcher setting
 of true on autoCommit makes it even worse.  There's no reason to do
 both autoSoftCommit and autoCommit with openSearcher=true.  I don't know
 which one wins between autoCommit and autoSoftCommit if they both have
 the same config, but I would guess the hard commit does.

 and even if I wait (Thread.sleep()) for a time for Solr, *sometimes* my
 tests fail. I get a failure even when I increase the wait time. Example
 of a sometimes-failing code piece:
 
 for (int i = 0; i < dummyDocumentSize; i++) {
   deleteById(id + i);
   dummyDocumentSize--;
   queryResponse = query(solrParams);
   assertTrue(queryResponse.getResults().size() ==
 dummyDocumentSize);
}
 
 In debug mode, if I wait for Solr to reflect the changes, I see that I do
 not get an error. What do you think? What kind of configuration should I
 have for such purposes?

 Chances are that commits are going to take longer than 1 millisecond.
 If you're actively indexing, the system is going to be trying to stack
 up lots of commits at the same time.  The maxWarmingSearchers value will
 limit the number of new searchers that can be opened, but it will not
 stop the commits themselves.  When lots of commits are going on, each
 one will take *even longer* to complete, which probably explains the
 problem.

 Thanks,
 Shawn




Re: How does query on few-hits AND many-hits work

2014-05-27 Thread Per Steffensen
Thanks for responding, Yonik. I tried out your suggestion, and it seems 
to work as it is supposed to, and it performs at least as well as the 
hacky implementation I did myself. Wish you had responded earlier. Or 
maybe not, then I wouldn't have dived into it myself making an 
implementation that does (almost) exactly what seems to be done when 
using your approach, and then I wouldn't have learned so much. But the 
great thing is that now I do not have to go suggest (or implement 
myself) this idea as a new Solr/Lucene feature - it is already there!


See 
http://solrlucene.blogspot.dk/2014/05/performance-of-and-queries-with-uneven.html. 
Hope you do not mind that I reference you and the link you pointed out.


Thanks a lot!

Regards, Per Steffensen

On 23/05/14 18:13, Yonik Seeley wrote:

On Fri, May 23, 2014 at 11:37 AM, Toke Eskildsen t...@statsbiblioteket.dk 
wrote:

Per Steffensen [st...@designware.dk] wrote:

* It IS more efficient to just use the index for the
no_dlng_doc_ind_sto-part of the request to get doc-ids that match that
part and then fetch timestamp-doc-values for those doc-ids to filter out
the docs that does not match the timestamp_dlng_doc_ind_sto-part of
the query.

Thank you for the follow up. It sounds rather special-case though, with 
requirement of DocValues for the range-field. Do you think this can be 
generalized?

Maybe it already is?
http://heliosearch.org/advanced-filter-caching-in-solr/

Something like this:
  fq={!frange cache=false cost=150 v=timestampField l=beginTime u=endTime}


-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filtersfieldcache





Re: How does query on few-hits AND many-hits work

2014-05-27 Thread Per Steffensen
Well, the only search I did was to ask this question on this 
mailing list :-)


On 26/05/14 17:05, Alexandre Rafalovitch wrote:

Did not follow the whole story, but post-query-value-filter does exist in
Solr. Have you tried searching for pretty much that expression? And maybe
something about cost-based filters.

Regards,
 Alex




Re: Using fq as OR

2014-05-27 Thread Dmitry Kan
Erick:

Well, in principle you should be able to access both q (via qstr) and fq
(via params), as the method signature tells:

createParser(String qstr, SolrParams localParams, SolrParams params,
SolrQueryRequest req)

http://lucene.apache.org/solr/4_8_1/solr-core/org/apache/solr/search/QParserPlugin.html

so as you pointed out, the branching between q and fq can be done using the
local params syntax. Not sure, if there is any practical use case here,
though.

sorry for diverting from the original topic :)


On Mon, May 26, 2014 at 7:37 PM, Erick Erickson erickerick...@gmail.com wrote:

 Dmitry:

 You have a valid point. That said I'm pretty sure you could have the
 filter query use your custom parser by something like
 fq={!customparser} whatever

 Of course if you were doing something in your custom qparser that
 needed both halves, that wouldn't work either..

 Best,
 Erick

 On Mon, May 26, 2014 at 12:14 AM, Dmitry Kan solrexp...@gmail.com wrote:
  Erick,
 
  correct me if the following's wrong, but if you have a custom query
 parser
  configured to preprocess your searches, you'd need to send the
  corresponding bit of the search in the q= parameter, rather than fq=
  parameter. In that sense, q and fq are not exactly equal.
 
  Dmitry
 
 
  On Thu, May 22, 2014 at 5:57 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  Hmmm, not quite.
 
  AFAIK, anything you can put in a q clause can also be put in an fq
  clause. So it's not a matter of whether your search is precise or not
  that you should use for determining whether to use a q or fq clause.
  What _should_ influence this is whether docs that satisfy the clause
  should contribute to ranking.
 
  fq clauses do NOT contribute to ranking. They determine whether the
  doc is returned at all.
  q clauses contribute to the ranking.
 
  Additionally, the results of fq clauses are cached and may be re-used.
 
  That said, since fq clauses are often used in conjunction with
  faceting, they are very often used more precisely. But it's still a
  matter of caching and ranking that should determine where the clause
  goes.
 
  FWIW,
  Erick
 
  On Wed, May 21, 2014 at 9:09 PM, manju16832003 manju16832...@gmail.com
 
  wrote:
   The *fq* is used for searching more deterministic results, something
  like
   WHERE type={},
   whereas *q* is something like WHERE type LIKE '%%'.
  
   Use *fq* if you are sure of what you are going to search;
   use *q* if you are not sure what you are trying to search.
  
   If you are using fq and you do not get any matching documents, Solr
   returns 0 results or an error message,
   whereas q would try to match the nearest documents for your search query.
  
   That's what I have experienced so far. :-)
  
  
  
  
   --
   View this message in context:
 
 http://lucene.472066.n3.nabble.com/Using-fq-as-OR-tp4137411p4137525.html
   Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 
  --
  Dmitry Kan
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan




-- 
Dmitry Kan
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan


Re: How to Configure Solr For Test Purposes?

2014-05-27 Thread Tomás Fernández Löbbe
 What do you suggest for my purpose? If a test case fails re-running it for
 some times maybe a solution? What kind of configuration do you suggest for
 my Solr configuration?


From the snippet of test that you showed, it looks like it's testing only
Solr functionality. So, first make sure this is a test that you really
need. Solr has its own tests, and if you feel it could use more (for some
specific case or context), I'd open a Jira and try to get the test inside
Solr.
If my impression is wrong and your test is actually testing your code, then
I'd suggest you use an explicit soft commit call with waitSearcher = true
on your test instead of relying on the autocommit (and remove the
autocommit completely from your solrconfig).
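
For example, with SolrJ that explicit call could look like this (a minimal
sketch; "server" stands for your SolrServer instance):

// waitFlush=true, waitSearcher=true, softCommit=true
server.commit(true, true, true);

With waitSearcher=true the call does not return until a searcher that sees
the change has been registered, so subsequent queries in the test observe
the updated index.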

Tomás



Thanks;
 Furkan KAMACI
 On 26 May 2014, at 21:03, Shawn Heisey s...@elyograg.org wrote:

  On 5/26/2014 10:57 AM, Furkan KAMACI wrote:
   Hi;
  
   I run Solr within my Test Suite. I delete documents or atomically
 update
   them and check whether if it works or not. I know that I have to setup
 a
   hard/soft commit timing for my test Solr. However even I have that
  settings:
  
  <autoCommit>
    <maxTime>1</maxTime>
    <openSearcher>true</openSearcher>
  </autoCommit>

  <autoSoftCommit>
    <maxTime>1</maxTime>
  </autoSoftCommit>
 
  I hope you know that this is BAD configuration.  Doing automatic commits
  on an interval of 1 millisecond is asking for a whole host of problems.
   In some cases, this could do a commit after every single document that
  is indexed, which is NOT recommended at all.  The openSearcher setting
  of true on autoCommit makes it even worse.  There's no reason to do
  both autoSoftCommit and autoCommit with openSearcher=true.  I don't know
  which one wins between autoCommit and autoSoftCommit if they both have
  the same config, but I would guess the hard commit does.
 
   and even I wait (Thread.sleep()) for a time to wait Solr *sometimes* my
   tests are failed. I get fail error even I increase wait time.  Example
  of a
   sometimes failed code piece:
  
    for (int i = 0; i < dummyDocumentSize; i++) {
deleteById(id + i);
dummyDocumentSize--;
queryResponse = query(solrParams);
assertTrue(queryResponse.getResults().size() ==
  dummyDocumentSize);
 }
  
   at debug mode if I wait for Solr to reflect changes I see that I do not
  get
   error. What do you think, what kind of configuration I should have for
  such
   kind of purposes?
 
  Chances are that commits are going to take longer than 1 millisecond.
  If you're actively indexing, the system is going to be trying to stack
  up lots of commits at the same time.  The maxWarmingSearchers value will
  limit the number of new searchers that can be opened, but it will not
  stop the commits themselves.  When lots of commits are going on, each
  one will take *even longer* to complete, which probably explains the
  problem.
 
  Thanks,
  Shawn
 
 



Re: How to Configure Solr For Test Purposes?

2014-05-27 Thread Furkan KAMACI
Hi;

I've developed a proxy application that takes requests from clients and
sends them to Solr, then gets the response and sends it back to the client.
So, I am testing my application, my proxy for Solr.

Thanks;
Furkan KAMACI


2014-05-27 14:52 GMT+03:00 Tomás Fernández Löbbe tomasflo...@gmail.com:

  What do you suggest for my purpose? If a test case fails re-running it
 for
  some times maybe a solution? What kind of configuration do you suggest
 for
  my Solr configuration?
 
 
 From the snippet of test that you showed, it looks like it's testing only
 Solr functionality. So, first make sure this is a test that you really
 need. Solr has its own tests, and if you feel it could use more (for some
 specific case or context), I'd open a Jira and try to get the test inside
 Solr.
 If my impression is wrong and your test is actually testing your code, then
 I'd suggest you use an explicit soft commit call with waitSearcher = true
 on your test instead of relying on the autocommit (and remove the
 autocommit completely from your solrconfig).

 Tomás



 Thanks;
  Furkan KAMACI
  On 26 May 2014, at 21:03, Shawn Heisey s...@elyograg.org wrote:
 
   On 5/26/2014 10:57 AM, Furkan KAMACI wrote:
Hi;
   
I run Solr within my Test Suite. I delete documents or atomically
  update
them and check whether if it works or not. I know that I have to
 setup
  a
hard/soft commit timing for my test Solr. However even I have that
   settings:
   
  <autoCommit>
    <maxTime>1</maxTime>
    <openSearcher>true</openSearcher>
  </autoCommit>

  <autoSoftCommit>
    <maxTime>1</maxTime>
  </autoSoftCommit>
  
   I hope you know that this is BAD configuration.  Doing automatic
 commits
   on an interval of 1 millisecond is asking for a whole host of problems.
In some cases, this could do a commit after every single document that
   is indexed, which is NOT recommended at all.  The openSearcher setting
   of true on autoCommit makes it even worse.  There's no reason to do
   both autoSoftCommit and autoCommit with openSearcher=true.  I don't
 know
   which one wins between autoCommit and autoSoftCommit if they both
 have
   the same config, but I would guess the hard commit does.
  
and even I wait (Thread.sleep()) for a time to wait Solr *sometimes*
 my
tests are failed. I get fail error even I increase wait time.
  Example
   of a
sometimes failed code piece:
   
for (int i = 0; i < dummyDocumentSize; i++) {
 deleteById(id + i);
 dummyDocumentSize--;
 queryResponse = query(solrParams);
 assertTrue(queryResponse.getResults().size() ==
   dummyDocumentSize);
  }
   
at debug mode if I wait for Solr to reflect changes I see that I do
 not
   get
error. What do you think, what kind of configuration I should have
 for
   such
kind of purposes?
  
   Chances are that commits are going to take longer than 1 millisecond.
   If you're actively indexing, the system is going to be trying to stack
   up lots of commits at the same time.  The maxWarmingSearchers value
 will
   limit the number of new searchers that can be opened, but it will not
   stop the commits themselves.  When lots of commits are going on, each
   one will take *even longer* to complete, which probably explains the
   problem.
  
   Thanks,
   Shawn
  
  
 



Re: (Issue) How improve solr facet performance

2014-05-27 Thread david.w.smi...@gmail.com
Alice,

RE grouping, try Solr 4.8’s new “collapse” qparser w/ the “expand”
SearchComponent.  The ref guide has the docs.  It’s usually a faster,
equivalent approach to group=true.
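
As a rough sketch (untested against your schema), your grouped query could
then look something like:

select?fl=item_catalog&q=default_search:paramter&defType=edismax&rows=50&fq={!collapse field=item_group_id}&expand=true

where the collapse filter keeps one representative document per
item_group_id (so numFound corresponds to the group count) and the expand
component returns the other group members.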

Do you care to comment further on NewEgg’s apparent switch from Endeca to
Solr?  (confirm true/false and rationale)

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, May 27, 2014 at 4:17 AM, Alice.H.Yang (mis.cnsh04.Newegg) 41493 
alice.h.y...@newegg.com wrote:

 Hi, Toke

 1.
 I set the 3 fields with hundreds of values to use fc and the rest
 to use enum; the performance improved 2 times compared with no parameter,
 and then I added facet.method=20, and the performance improved about 4 times
 compared with no parameter.
 And I also tried setting the 9 facet fields to one copyfield; I tested
 the performance, and it improved about 2.5 times compared with no parameter.
 So, it has improved a lot under your advice, thanks a lot.
 2.
 Now I have another performance issue: it's the group performance.
 The amount of data is the same as in the facet performance scenario.
 When the keyword search hits about one million documents, the QTime is
 about 600ms. (It isn't the first query; the result is already in cache.)

 Query url:

 select?fl=item_catalog&q=default_search:paramter&defType=edismax&rows=50&group=true&group.field=item_group_id&group.ngroups=true&group.sort=stock4sort%20desc,final_price%20asc,is_selleritem%20asc&sort=score%20desc,default_sort%20desc

 It needs a QTime of about 600ms.

 This query has two parameters:
 1. fl with one field
 2. group=true,
 group.ngroups=true

 If I set group=false, the QTime is only 1 ms.
 But I need to do group and group.ngroups. How can I improve the group
 performance under this demand? Do you have some advice for me? I'm looking
 forward to your reply.

 Best Regards,
 Alice Yang
 +86-021-51530666*41493
 Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)


 -----Original Message-----
 From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
 Sent: May 24, 2014 15:17
 To: solr-user@lucene.apache.org
 Subject: RE: (Issue) How improve solr facet performance

 Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
  1.  I'm sorry, I have made a mistake, the total number of documents is
 32 Million, not 320 Million.
  2.  The system memory is large for solr index, OS total has 256G, I set
 the solr tomcat HEAPSIZE=-Xms25G -Xmx100G

 100G is a very high number. What special requirements dictates such a
 large heap size?

  Reply:  9 fields I facet on.

 Solr treats each facet separately and with facet.method=fc and 10M hits,
 this means that it will iterate 9*10M = 90M document IDs and update the
 counters for those.

  Reply:  3 facet fields have one hundred unique values, other 6 facet
 fields' unique values are between 3 to 15.

 So very low cardinality. This is confirmed by your low response time of
 6ms for 2925 hits.

  And we test this scenario:  If the number of facet fields' unique values
 is less we add facet.method=enum, there is a little to improve performance.

 That is a shame: enum is normally the simple answer to a setup like yours.
 Have you tried fine-tuning your fc/enum selection, so that the 3 fields
 with hundreds of values uses fc and the rest uses enum? That might halve
 your response time.


 Since the number of unique facets is so low, I do not think that DocValues
 can help you here. Besides the fine-grained fc/enum-selection above, you
 could try collapsing all 9 facet-fields into a single field. The idea
 behind this is that for facet.method=fc, performing faceting on a field
 with (for example) 300 unique values takes practically the same amount of
 time as faceting on a field with 1000 unique values: Faceting on a single
 slightly larger field is much faster than faceting on 9 smaller fields.
 After faceting with facet.limit=-1 on the single super-facet-field, you
 must match the returned values back to their original fields:


 If you have the facet-fields

 field0: 34
 field1: 187
 field2: 78432
 field3: 3
 ...

 then collapse them by or-ing a field-specific mask that is bigger than the
 max in any field, then put it all into a single field:

 fieldAll: 0xA000 | 34
 fieldAll: 0xA100 | 187
 fieldAll: 0xA200 | 78432
 fieldAll: 0xA300 | 3
 ...

 perform the facet request on fieldAll with facet.limit=-1 and split the
 resulting counts with

 for (entry: facetResultAll) {
    switch (0xFF00 & entry.value) {
 case 0xA000:
   field0.add(entry.value, entry.count);
   break;
 case 0xA100:
   field1.add(entry.value, entry.count);
   break;
 ...
   }
 }


 Regards,
 Toke Eskildsen, State and University Library, Denmark



Re: Regex with local params is not working

2014-05-27 Thread Yonik Seeley
On Tue, May 27, 2014 at 4:38 AM, Lokn nlokesh...@gmail.com wrote:
 With solr local params, the regex is not working.
 My sample query: q ={!qf=$myfield_qf}/[a-d]ad/, where I have myfield_qf
 defined in the solrconfig.xml.

add debugQuery=true to the request to see what query is actually produced.

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filtersfieldcache


Re: Internals about Too many values for UnInvertedField faceting on field xxx

2014-05-27 Thread Yonik Seeley
On Mon, May 26, 2014 at 9:21 PM, 张月祥 zhan...@calis.edu.cn wrote:
 Thanks a lot.

 There are only 256 byte arrays to hold all of the ord data, and the
 pointers into those arrays are only 24 bits long.  That gets you back
 to 32 bits, or 4GB of ord data max.  It's practically less since you
 only have to overflow one array before the exception is thrown.

 What does the ord data mean? Term ID, or Term-Document relation, or
 Document-Term relation?

Every document has a list of term numbers (term ords) associated with it.
The deltas between sorted term numbers are vInt encoded.

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filtersfieldcache


SolrCloud: facet range option f.field.facet.mincount=1 omits buckets on response

2014-05-27 Thread Ronald Matamoros
Good afternoon,

Is the f.field.facet.mincount option supported on a distributed search?
Under SolrCloud, I am experiencing that some buckets are ignored when using the 
option f.field.facet.mincount=1.

The Solr logs do not indicate any error or warning during execution.
The debug=true option and increasing the log levels for the FacetComponent do 
not provide any hints about the behaviour.

Replicated the issue on both Solr 4.5.1 & 4.8.1.
Attached a PDF that provides additional details and steps to replicate the 
behaviour using the out of the box Solr distribution.

Any insight or recommendation to tackle this situation is much appreciated.

Example:

  Removing the f.field.facet.mincount=1 option gives the expected list of 
buckets for the 6 documents matched:

<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="0.0">0</int>
      <int name="50.0">1</int>
      <int name="100.0">0</int>
      <int name="150.0">3</int>
      <int name="200.0">0</int>
      <int name="250.0">1</int>
      <int name="300.0">0</int>
      <int name="350.0">0</int>
      <int name="400.0">0</int>
      <int name="450.0">0</int>
      <int name="500.0">0</int>
      <int name="550.0">0</int>
      <int name="600.0">0</int>
      <int name="650.0">0</int>
      <int name="700.0">0</int>
      <int name="750.0">1</int>
      <int name="800.0">0</int>
      <int name="850.0">0</int>
      <int name="900.0">0</int>
      <int name="950.0">0</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">2</int>
  </lst>
</lst>

  Using the f.field.facet.mincount=1 option removes the 0-count buckets 
but will also omit the bucket <int name="250.0">:

<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="50.0">1</int>
      <int name="150.0">3</int>
      <int name="750.0">1</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">4</int>
  </lst>
</lst>

 Refreshing the query using the browser's F5 option renders a different 
bucket list (you may need to refresh multiple times):

<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="150.0">3</int>
      <int name="250.0">1</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">2</int>
  </lst>
</lst>

Regards 
Ronald Matamoros


RE: Wordbreak spellchecker excessive breaking.

2014-05-27 Thread Dyer, James
You can do this if you set it up like in the main Solr example:

<lst name="spellchecker">
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <str name="field">name</str>
  <str name="combineWords">true</str>
  <str name="breakWords">true</str>
  <int name="maxChanges">10</int>
</lst>

The combineWords and breakWords flags let you tell it which kind of 
wordbreak correction you want.  maxChanges controls the maximum number of 
words it can break 1 word into, or the maximum number of words it can combine.  
It is reasonable to set this to 1 or 2.

The best way to use this is in conjunction with a regular spellchecker like 
DirectSolrSpellChecker.  When used together with the collation functionality, 
it should take a query like mob ile and, depending on what actually returns 
results from your data, suggest either mobile or perhaps mob lie or both.  
The one thing it cannot do is fix a transposition or misspelling and combine or 
break words in one shot.  That is, it cannot detect that mob lie should 
become mobile.
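
For instance, with a regular spellchecker and the wordbreak checker both
configured on the same component, a collating request could look roughly
like this (the handler and dictionary names are assumptions, not taken from
your setup):

/select?q=mob ile&spellcheck=true&spellcheck.dictionary=default&spellcheck.dictionary=wordbreak&spellcheck.collate=true&spellcheck.maxCollations=5

The component then combines suggestions from both dictionaries when it
builds collations.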

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: S.L [mailto:simpleliving...@gmail.com] 
Sent: Saturday, May 24, 2014 4:21 PM
To: solr-user@lucene.apache.org
Subject: Wordbreak spellchecker excessive breaking.

I am using the Solr wordbreak spellchecker, and the issue is that when I search
for a term like mob ile, expecting that the wordbreak spellchecker would
actually return a suggestion for mobile, it breaks the search term into
letters like m o b. I have two issues with this behavior.

 1. How can I make Solr combine mob ile into mobile?
 2. Notwithstanding the fact that my search term mob ile is being broken
incorrectly into individual letters, I realize that the wordbreak is
needed in certain cases; how do I control the wordbreak so that it does not
break the term into letters like m o b, which seems like excessive breaking to
me?

Thanks.


SolrCloud zkcli

2014-05-27 Thread Matt Kuiper
Hello,

I am using ZkCLI -cmd upconfig with a reload of each Solr node to update my 
solrconfig within my SolrCloud.  I noticed the linkconfig option at 
https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities, and do 
not quite understand where this option is designed to be used.

Can anyone clarify for me?

Thanks,
Matt


Re: ExtractingRequestHandler indexing zip files

2014-05-27 Thread Siegfried Goeschl
Hi Sergio,

you either do the stuff on the caller side (which is probably a good idea 
since you off-load the SOLR server) or extend the ExtractingRequestHandler.

Cheers,

Siegfried Goeschl

On 27 May 2014, at 10:37, marotosg marot...@gmail.com wrote:

 Hi,
 
 Thanks for your answer Alexandre.
 I have zip files with only one document inside per zip file. These documents
 are mainly pdf,xml,html.
 
 I tried to index tini.txt.gz file which is located in the trunk to be used
 by extraction tests
 \trunk\solr\contrib\extraction\src\test-files\extraction\tini.txt.gz
 
 I get the same issue only the name of the file inside tini.txt.gz gets
 indexed as content. That means ExtractRequesthandler can open the file
 because it's getting the name inside but for some reason is not reading the
 content.
 
 Any suggestions?
 
 Thanks
 Sergio
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p4138255.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrCloud zkcli

2014-05-27 Thread Shawn Heisey
On 5/27/2014 10:41 AM, Matt Kuiper wrote:
 I am using ZkCLI -cmd upconfig with a reload of each Solr node to update my 
 solrconfig within my SolrCloud.  I noticed the linkconfig option at 
 https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities, and 
 do not quite understand where this option is designed to be used.

If you use linkconfig with a collection that does not exist, you end up
with a collection *stub* in zookeeper that just contains the
configName.  At that point you can create the collection manually with
CoreAdmin or automatically with the Collection Admin, and I would expect
that with the latter, you'd be able to leave out the
collection.configName parameter.

If you use linkconfig with a collection that already exists, you can
change which configName is linked to that collection.  This is an easy
way to swap in a dev config on an existing collection.
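
For example, a linkconfig invocation looks something like this (the
classpath and the collection/config names are illustrative; adjust them to
your install):

java -classpath "example/solr-webapp/webapp/WEB-INF/lib/*" \
  org.apache.solr.cloud.ZkCLI -cmd linkconfig -collection collection1 \
  -confname myconf -zkhost localhost:2181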

Thanks,
Shawn



How to Get Highlighting Working in Velocity (Solr 4.8.0)

2014-05-27 Thread Teague James
My Solr 4.8.0 index includes a field called 'dom_title'. The field is
displayed in the result set. I want to be able to highlight keywords from
this field in the displayed results. I have tried configuring solrconfig.xml
and I have tried adding parameters to the query (hl=true&hl.fl=dom_title),
but the searched keyword never gets highlighted in the results. I am
attempting to use the Velocity Browse interface to demonstrate this. Most of
the configuration is right out of the box, except for the fields in the
schema.

From my solrconfig.xml:

<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defautls">
    <str name="echoParams">explicit</str>
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.layout">layout</str>
    <str name="hl">on</str>
    <str name="hl.fl">dom_title</str>
    <str name="hl.encoder">html</str>
    <str name="hl.simple.pre">&lt;b&gt;</str>
    <str name="hl.simple.post">&lt;/b&gt;</str>

I omitted a lot of basic query settings and facet field info from this
snippet to focus on the highlighting component. What am I missing?

-Teague



RE: SolrCloud zkcli

2014-05-27 Thread Matt Kuiper
Great, thanks

Matt

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Tuesday, May 27, 2014 11:13 AM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud zkcli

On 5/27/2014 10:41 AM, Matt Kuiper wrote:
 I am using ZkCLI -cmd upconfig with a reload of each Solr node to update my 
 solrconfig within my SolrCloud.  I noticed the linkconfig option at 
 https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities, and 
 do not quite understand where this option is designed to be used.

If you use linkconfig with a collection that does not exist, you end up with a 
collection *stub* in zookeeper that just contains the configName.  At that 
point you can create the collection manually with CoreAdmin or automatically 
with the Collection Admin, and I would expect that with the latter, you'd be 
able to leave out the collection.configName parameter.

If you use linkconfig with a collection that already exists, you can change 
which configName is linked to that collection.  This is an easy way to swap in 
a dev config on an existing collection.

Thanks,
Shawn



Solr Suggester vs Solr SpellChecker

2014-05-27 Thread Mukundaraman valakumaresan
Hi

For auto-suggestions I see that the SpellChecker component is being used.

In solrconfig.xml, I see the following SuggestComponent. What is the
difference between these 2 components? Which one should be used for auto
suggestions?

I could see the SpellChecker component working; are there any references for
using the SuggestComponent?

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>  <!-- org.apache.solr.spelling.suggest.fst -->
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>  <!-- org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory -->
    <str name="field">cat</str>
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">string</str>
  </lst>
</searchComponent>
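
For reference, the component alone is not enough; it has to be wired into a
request handler before it can be queried. A minimal sketch of a matching
handler for the component above (the handler path is illustrative):

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Queries then go to /suggest?suggest.q=prefix, with suggest.build=true on the
first request to build the dictionary.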

Thanks & Regards
Mukund


Any Solrj API to obtain field list?

2014-05-27 Thread T. Kuro Kurosaka
I'd like to write Solr client code that writes text to language specific 
field, say, myfield_es, for Spanish,
if the field myfield_es is defined in schema.xml, and otherwise to a 
fall-back field myfield. To do this,
I need to obtain a list of defined fields (and dynamic fields) from the 
server. But I cannot find
a suitable Solrj API. Is there any? I'm using Solr 4.6.1. I could write 
code to use Schema REST API

(https://wiki.apache.org/solr/SchemaRESTAPI) but I would much prefer to use
the existing code if one exists.

--
T. Kuro Kurosaka • Senior Software Engineer




Re: Any Solrj API to obtain field list?

2014-05-27 Thread Sujit Pal
Have you looked at IndexSchema? That would offer you methods to query index
metadata using SolrJ.

http://lucene.apache.org/solr/4_7_2/solr-core/org/apache/solr/schema/IndexSchema.html

-sujit



On Tue, May 27, 2014 at 1:56 PM, T. Kuro Kurosaka k...@healthline.comwrote:

 I'd like to write Solr client code that writes text to language specific
 field, say, myfield_es, for Spanish,
 if the field myfield_es is defined in schema.xml, and otherwise to a
 fall-back field myfield. To do this,
 I need to obtain a list of defined fields (and dynamic fields) from the
 server. But I cannot find
 a suitable Solrj API. Is there any? I'm using Solr 4.6.1. I could write
 code to use Schema REST API
 (https://wiki.apache.org/solr/SchemaRESTAPI) but I would much prefer to
 use
 the existing code if one exists.

 --
 T. Kuro Kurosaka • Senior Software Engineer





Re: Any Solrj API to obtain field list?

2014-05-27 Thread Ahmet Arslan
Hi,

https://wiki.apache.org/solr/LukeRequestHandler  make sure numTerms=0 for 
performance 
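
For example (core name is illustrative):

http://localhost:8983/solr/collection1/admin/luke?numTerms=0&wt=json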



On Tuesday, May 27, 2014 11:56 PM, T. Kuro Kurosaka k...@healthline.com wrote:



I'd like to write Solr client code that writes text to language specific 
field, say, myfield_es, for Spanish,
if the field myfield_es is defined in schema.xml, and otherwise to a 
fall-back field myfield. To do this,
I need to obtain a list of defined fields (and dynamic fields) from the 
server. But I cannot find
a suitable Solrj API. Is there any? I'm using Solr 4.6.1. I could write 
code to use Schema REST API
(https://wiki.apache.org/solr/SchemaRESTAPI) but I would much prefer to use
the existing code if one exists.

-- 
T. Kuro Kurosaka • Senior Software Engineer


Re: Any Solrj API to obtain field list?

2014-05-27 Thread Jack Krupansky
You might consider an update request processor as an alternative. It runs on 
the server and might be simpler. You can even use the stateless script 
update processor to avoid having to write any custom Java code.
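
A rough sketch of that approach, using a hypothetical script name and the
field names from the question. In solrconfig.xml:

<updateRequestProcessorChain name="langfield">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">route-lang-field.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

and in conf/route-lang-field.js (the stock example script also defines no-op
stubs for processDelete, processCommit, and the other events):

function processAdd(cmd) {
  var doc = cmd.solrDoc;  // the incoming SolrInputDocument
  var schema = req.getCore().getLatestSchema();
  var val = doc.getFieldValue("myfield_es");
  // if the language-specific field is not in the schema, fall back to "myfield"
  if (val != null && schema.getFieldOrNull("myfield_es") == null) {
    doc.removeField("myfield_es");
    doc.addField("myfield", val);
  }
}

Update requests then need update.chain=langfield, or the chain set as the
update handler's default.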


-- Jack Krupansky

-Original Message- 
From: T. Kuro Kurosaka

Sent: Tuesday, May 27, 2014 4:56 PM
To: 'solr-user@lucene.apache.org'
Subject: Any Solrj API to obtain field list?

I'd like to write Solr client code that writes text to language specific
field, say, myfield_es, for Spanish,
if the field myfield_es is defined in schema.xml, and otherwise to a
fall-back field myfield. To do this,
I need to obtain a list of defined fields (and dynamic fields) from the
server. But I cannot find
a suitable Solrj API. Is there any? I'm using Solr 4.6.1. I could write
code to use Schema REST API
(https://wiki.apache.org/solr/SchemaRESTAPI) but I would much prefer to use
the existing code if one exists.

--
T. Kuro Kurosaka • Senior Software Engineer



Re: Any Solrj API to obtain field list?

2014-05-27 Thread T. Kuro Kurosaka

On 05/27/2014 02:29 PM, Jack Krupansky wrote:
You might consider an update request processor as an alternative. It 
runs on the server and might be simpler. You can even use the 
stateless script update processor to avoid having to write any custom 
Java code.


-- Jack Krupansky 


That's an interesting approach. I'd consider it.


On 05/27/2014 02:04 PM, Sujit Pal wrote:

Have you looked at IndexSchema? That would offer you methods to query index
metadata using SolrJ.

http://lucene.apache.org/solr/4_7_2/solr-core/org/apache/solr/schema/IndexSchema.html

-sujit

The question was essentially how to get an IndexSchema on the SolrJ client
side, hopefully without having to parse the XML schema file myself.


On 05/27/2014 02:16 PM, Ahmet Arslan wrote:

Hi,

https://wiki.apache.org/solr/LukeRequestHandler   make sure numTerms=0 for 
performance


I'm afraid this won't work because when the index is empty, Luke won't 
return any fields.
And for the fields that are written, this method returns more 
information than I'd like to know.

I just want to know if a field is valid or not.


Kuro



Re: Any Solrj API to obtain field list?

2014-05-27 Thread Steve Rowe
You can call the Schema API from SolrJ - see Shawn Heisey’s example code here: 
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3c51daecd2.6030...@elyograg.org%3e
 

Steve

On May 27, 2014, at 5:50 PM, T. Kuro Kurosaka k...@healthline.com wrote:

 On 05/27/2014 02:29 PM, Jack Krupansky wrote:
 You might consider an update request processor as an alternative. It runs on 
 the server and might be simpler. You can even use the stateless script 
 update processor to avoid having to write any custom Java code.
 
 -- Jack Krupansky 
 
 That's an interesting approach. I'd consider it.
 
 
 On 05/27/2014 02:04 PM, Sujit Pal wrote:
 Have you looked at IndexSchema? That would offer you methods to query index
 metadata using SolrJ.
 
 http://lucene.apache.org/solr/4_7_2/solr-core/org/apache/solr/schema/IndexSchema.html
 
 -sujit
 The question was essentially how to get an IndexSchema on the SolrJ client
 side, hopefully without having to parse the XML schema file myself.
 
 
 On 05/27/2014 02:16 PM, Ahmet Arslan wrote:
 Hi,
 
 https://wiki.apache.org/solr/LukeRequestHandler   make sure numTerms=0 for 
 performance
 
 I'm afraid this won't work because when the index is empty, Luke won't return 
 any fields.
 And for the fields that are written, this method returns more information 
 than I'd like to know.
 I just want to know if a field is valid or not.
 
 
 Kuro
 



search component needs access to results of previous component

2014-05-27 Thread Jitka
Hello and thanks for reading my question.

If our high-level search handler doesn't get enough results back from a Solr
query, it tweaks the query, re-sends to Solr, and combines the result sets
from the two queries. I would like to modify our Solr set-up so that Solr
itself could handle the deciding, tweaking, re-sending and combining within
a single call.

My current thought is to create a FollowUpQueryComponent that would live in
the last-components section of the definition of our usual requestHandler
in solrconfig.xml. 
The new component would take a look at the results of the default
QueryComponent and decide whether or not to tweak the query and bring about
another search. I tried to do the looking-and-tweaking at the aggregator
level in the new component's prepare() function, but ran into two problems.

1) The new component's prepare() function is called before the default
QueryComponent has done its work, so it doesn't have the information it
needs to decide whether the new call will be needed or not, and

2) Even if I postpone the decision until after the default QueryComponent's
results are in, as far as I can tell those results stay on the shards and
are not available to the aggregator.
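
For concreteness, a stripped-down single-node sketch of what I have in mind
(the class name and threshold are mine):

import java.io.IOException;

import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;
import org.apache.solr.search.DocListAndSet;

public class FollowUpQueryComponent extends SearchComponent {
  private static final int MIN_HITS = 10;  // illustrative threshold

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    // too early: QueryComponent has not produced results yet
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    DocListAndSet results = rb.getResults();
    if (results != null && results.docList.matches() < MIN_HITS) {
      // tweak rb.getQuery(), re-search via rb.req.getSearcher().getDocList(...),
      // and merge the two result sets into the response
    }
  }

  @Override
  public String getDescription() {
    return "follow-up query";
  }

  @Override
  public String getSource() {
    return null;
  }
}

My understanding is that in distributed mode this logic would instead have to
live in distributedProcess()/finishStage() on the aggregator, where
rb.resultIds is populated. Is that the right direction?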

Either I am going about this all wrong, or these must be standard problems
for creators of custom QueryComponents. Are there standard solutions? Any
feedback would be much appreciated.

Thanks, 
Jitka




--
View this message in context: 
http://lucene.472066.n3.nabble.com/search-component-needs-access-to-results-of-previous-component-tp4138335.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Any Solrj API to obtain field list?

2014-05-27 Thread T. Kuro Kurosaka

On 05/27/2014 02:55 PM, Steve Rowe wrote:
You can call the Schema API from SolrJ - see Shawn Heisey’s example code here: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3c51daecd2.6030...@elyograg.org%3e


Steve

It looks like this returns a Json representation of fields if I do
query.setRequestHandler("/schema/fields");

I guess this is the closest Solrj can do.

Thank  you, Steve.



Re: Any Solrj API to obtain field list?

2014-05-27 Thread Steve Rowe
Shawn’s code shows that SolrJ parses the JSON for you into NamedList 
(response.getResponse()). - Steve

On May 27, 2014, at 7:11 PM, T. Kuro Kurosaka k...@healthline.com wrote:

 On 05/27/2014 02:55 PM, Steve Rowe wrote:
 You can call the Schema API from SolrJ - see Shawn Heisey’s example code 
 here:http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3c51daecd2.6030...@elyograg.org%3e
   
 Steve
 It looks like this returns a Json representation of fields if I do
query.setRequestHandler("/schema/fields");
 
 I guess this is the closest Solrj can do.
 
 Thank  you, Steve.
 



Re: Any Solrj API to obtain field list?

2014-05-27 Thread T. Kuro Kurosaka

On 05/27/2014 04:21 PM, Steve Rowe wrote:

Shawn’s code shows that SolrJ parses the JSON for you into NamedList 
(response.getResponse()). - Steve

Thank you for pointing it out.

It wasn't apparent what get(key) returns, since the method signature
of getResponse() merely says it returns a NamedList<Object>.
After running test code under a debugger, I found that, for key "fields",
the returned object is an ArrayList<SimpleOrderedMap<NameValuePair>>.
This is what I came up with:

private static final String url = "http://localhost:8983/solr/hlbase";
private static final SolrServer server = new HttpSolrServer(url);
   ...
// query the Schema API's field list instead of /select
SolrQuery query = new SolrQuery();
query.setRequestHandler("/schema/fields");
QueryResponse response = server.query(query);

List<SimpleOrderedMap<NameValuePair>> fields =
    (ArrayList<SimpleOrderedMap<NameValuePair>>)
    response.getResponse().get("fields");

for (SimpleOrderedMap<NameValuePair> fmap : fields) {
  System.out.println(fmap.get("name"));  // print each field's name
}


Kuro



Re: solrcloud shards backup/restoration

2014-05-27 Thread rulinma
I also want to know how to do this.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solrcloud-shards-backup-restoration-tp4088447p4138358.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Backup strategy for SolrCloud

2014-05-27 Thread rulinma
I use this approach now, but I would still like an index snapshot as well.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Backup-strategy-for-SolrCloud-tp4009291p4138359.html
Sent from the Solr - User mailing list archive at Nabble.com.


Error enquiry- exceeded limit of maxWarmingSearchers=2

2014-05-27 Thread M, Arjun (NSN - IN/Bangalore)
Hi,

I am getting the below error.

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again 
later.

Can you please help?



Strict mode at searching and indexing

2014-05-27 Thread 小川修
Hi.

I use Solr 4.7.2.

When I search with an invalid query,
for example
[non_exists_field:value],

or when I index invalid XML,
for example
[<add><doc><field name="non_exist_field">value</field></doc></add>],

Solr reports nothing.

I want to get an error message.
Is there an option that makes Solr strict?

Thanks.
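
(For reference: with the standard lucene query parser, an undefined field in
q normally raises an "undefined field" error, while dismax/edismax silently
treat it as plain text. On the indexing side, Solr rejects unknown fields
with an error unless schema.xml contains a catch-all dynamicField, e.g.

<dynamicField name="*" type="ignored" multiValued="true"/>

which quietly absorbs and discards them. Removing such a rule makes Solr
strict about field names.)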


Re: Error enquiry- exceeded limit of maxWarmingSearchers=2

2014-05-27 Thread Shawn Heisey
 Hi,

 I am getting the below error.

 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 Error opening new searcher. exceeded limit of
 maxWarmingSearchers=2, try again later.

This error is usually a symptom of a problem, not the actual problem.

Either you are running into performance issues that are making your
commits slow, or you are committing too frequently. Either way, you've
got a situation where one commit (with openSearcher=true) is not able to
finish before the next commit starts.

Solr puts a limit on the number of searcher objects that can be starting
up (warming) at the same time. You've exceeded that limit. Here's a wiki
page about slow commits:

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits
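
If the root cause is commit frequency, one common remedy is to stop
committing from the client and let Solr do it, along these lines in
solrconfig.xml (values are illustrative):

<autoCommit>
  <maxTime>60000</maxTime>  <!-- hard commit every minute, for durability -->
  <openSearcher>false</openSearcher>  <!-- no searcher warming on hard commits -->
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>  <!-- new documents become visible within 5 seconds -->
</autoSoftCommit>

With this setup only the soft commits open new searchers, and at a
controlled rate.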

The rest of that wiki page discusses other things that can cause Solr
performance issues.

Thanks,
Shawn






Re: Regex with local params is not working

2014-05-27 Thread Lokn
Thanks for the reply.
I am using edismax for query parsing, and it's still not working.
If I use the field directly instead of local params, the regex
works fine.

Thanks,
Lokesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Regex-with-local-params-is-not-working-tp4138257p4138372.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Regex with local params is not working

2014-05-27 Thread Lokn
I tried with debug on; Solr generates a DisjunctionMaxQuery instead of a
RegexpQuery.
Is there something I need to add to the query?

Thanks,
Lokesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Regex-with-local-params-is-not-working-tp4138257p4138373.html
Sent from the Solr - User mailing list archive at Nabble.com.