Re: query action with wrong result size zero

2016-05-05 Thread max mi
Thank you, Jay Potharaju.


I made a discovery: in the same solr core, I put two kinds of docs, which
means that they do not have the same fields. Does this mean that different
kinds of docs cannot be put into the same solr core?


thanks!

max mi




------------------ Original ------------------
From: "Erick Erickson";
Date: May 6, 2016 (Fri) 12:14
To: "solr-user";

Subject: Re: query action with wrong result size zero



Please show us:
1> a sample doc that you expect to be returned
2> the results of adding '&debug=query' to the URL
3> the schema definition for the field you're querying against.

It is likely that your query isn't quite what you think it is, is going
against a different field than you think or your schema isn't
quite doing what you think...

On Thu, May 5, 2016 at 9:40 AM, Jay Potharaju  wrote:
> Can you check if the field you are searching on is case sensitive? You can
> quickly test it by copying the exact contents of the brand field into your
> query and comparing it against the query you have posted above.
>
> On Thu, May 5, 2016 at 8:57 AM, mixiangliu <852262...@qq.com> wrote:
>
>>
>> i found a strange thing with solr query: when i set the value of the query
>> field like "brand:amd", the size of the query result is zero, but the real data
>> is not zero. can somebody tell me why? thank you very much!!
>> my english is not very good, wish somebody understands my words!
>>
>
>
>
> --
> Thanks
> Jay Potharaju

Re: fq behavior...

2016-05-05 Thread Susmit Shukla
Please take a look at this blog, specifically the "Leapfrog Anyone?" section:
http://yonik.com/advanced-filter-caching-in-solr/

Thanks,
Susmit

On Thu, May 5, 2016 at 10:54 PM, Bastien Latard - MDPI AG <
lat...@mdpi.com.invalid> wrote:

> Hi guys,
>
> Just a quick question to which I did not find an easy answer.
>
> 1.
>
>    Is the fq "executed" before or after the usual query (q)?
>
>    e.g.: select?q=title:"something really specific"&fq=bPublic:true&rows=10
>
>Would it first:
>
>  * get all the "specific" results, and then apply the filter
>  * OR is it first getting all the docs matching the fq and then
>running the "q" query
>
> In other words, does it first check for "the best cardinality"?
>
> Kind regards,
> Bastien
>
>


fq behavior...

2016-05-05 Thread Bastien Latard - MDPI AG

Hi guys,

Just a quick question to which I did not find an easy answer.

1.

   Is the fq "executed" before or after the usual query (q)?

   e.g.: select?q=title:"something really specific"&fq=bPublic:true&rows=10

   Would it first:

 * get all the "specific" results, and then apply the filter
 * OR is it first getting all the docs matching the fq and then
   running the "q" query

In other words, does it first check for "the best cardinality"?

Kind regards,
Bastien



RE: Facet ignoring repeated word

2016-05-05 Thread G, Rajesh
Hi,

Can you please help? If there is an existing solution that will be easy; otherwise I have
to create a script in Python that can process the results from
TermVectorComponent and group the results by word across documents to
find the word counts. The Python script will accept the exported Solr result as
input.

Thanks
Rajesh



CEB India Private Limited. Registration No: U741040HR2004PTC035324. Registered 
office: 6th Floor, Tower B, DLF Building No.10 DLF Cyber City, Gurgaon, 
Haryana-122002, India.

This e-mail and/or its attachments are intended only for the use of the 
addressee(s) and may contain confidential and legally privileged information 
belonging to CEB and/or its subsidiaries, including SHL. If you have received 
this e-mail in error, please notify the sender and immediately, destroy all 
copies of this email and its attachments. The publication, copying, in whole or 
in part, or use or dissemination in any other way of this e-mail and 
attachments by anyone other than the intended person(s) is prohibited.

-Original Message-
From: G, Rajesh [mailto:r...@cebglobal.com]
Sent: Thursday, May 5, 2016 4:29 PM
To: Ahmet Arslan ; solr-user@lucene.apache.org; 
erickerick...@gmail.com
Subject: RE: Facet ignoring repeated word

Hi,

TermVectorComponent works. I am able to find the repeating words within the
same document, which facet was not able to do. The problem I see is that
TermVectorComponent produces results per document (e.g. below), and I have to
combine the counts, i.e. the count of the word "my" is 6 across the list of
documents. Can you please suggest a solution to group counts by word across
documents? Basically we want to build a word cloud from the Solr result.


[term-vector response snippet: the XML tags were stripped by the mailing-list
archive; it showed two terms in document 1675 with tf values 4 and 2]
http://localhost:8182/solr/dev/tvrh?q=*:*&tv=true&tv.fl=comments&tv.tf=true&fl=comments&rows=1000
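A minimal sketch of the aggregation Rajesh describes, assuming a dict shaped like the per-document term-vector output (the doc keys, terms, and structure here are illustrative, not the actual response format):

```python
from collections import Counter

# Hypothetical per-document term frequencies, mirroring what a /tvrh
# response provides: {doc_key: {term: {"tf": count}, ...}, ...}
term_vectors = {
    "doc1": {"my": {"tf": 4}, "solr": {"tf": 1}},
    "doc2": {"my": {"tf": 2}, "cloud": {"tf": 3}},
}

# Sum each term's tf across all documents to get corpus-wide counts
totals = Counter()
for terms in term_vectors.values():
    for term, stats in terms.items():
        totals[term] += stats["tf"]

print(totals.most_common(2))  # word-cloud input: [('my', 6), ('cloud', 3)]
```

The `most_common()` output is exactly the (word, weight) list most word-cloud libraries expect.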


Hi Erick,
I need the count of repeated words to build word cloud

Thanks
Rajesh




-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Tuesday, May 3, 2016 6:19 AM
To: solr-user@lucene.apache.org; G, Rajesh 
Subject: Re: Facet ignoring repeated word

Hi,

StatsComponent does not respect the query parameter. However you can feed a 
function query (e.g., termfreq) to it.

Instead consider using TermVectors or MLT's interesting terms.


https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component
https://cwiki.apache.org/confluence/display/solr/MoreLikeThis

Ahmet


On Monday, May 2, 2016 9:31 AM, "G, Rajesh"  wrote:
Hi Erick/ Ahmet,

Thanks for your suggestion. Can we have a query in TermsComponent like the one
below? I need the word count of comments for one question id, not all. When I
include the query q=questionid=123 I still see counts for all:

http://localhost:8182/solr/dev/terms?terms.fl=comments&terms=true&terms.limit=1000&q=questionid=123

StatsComponent does not support text fields:

Field type textcloud_en {class=org.apache.solr.schema.TextField,
analyzer=org.apache.solr.analysis.TokenizerChain,
args={positionIncrementGap=100, class=solr.TextField}} is not currently supported

[the textcloud_en field type definition was stripped by the mailing-list archive]

Thanks
Rajesh





-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, April 29, 2016 9:16 PM
To: solr-user ; Ahmet Arslan 
Subject: Re: Facet ignoring 

Re: OOM script executed

2016-05-05 Thread Bastien Latard - MDPI AG

Thank you Shawn!

So if I run the two following requests, it will only store the 7.5 MB bitset
once, right?

- select?q=*:*&fq=bPublic:true&rows=10
- select?q=field:my_search&fq=bPublic:true&rows=10

kr,

Bast

On 04/05/2016 16:22, Shawn Heisey wrote:

On 5/3/2016 11:58 PM, Bastien Latard - MDPI AG wrote:

Thank you for your email.
You said "have big caches or request big pages (e.g. 100k docs)"...
Does an fq cache all the potential results, or only the ones the query
returns?
e.g.: select?q=*:*&fq=bPublic:true&rows=10

=> with this query, if I have 60 million public documents, would
it cache 10 IDs or 60 million?
...and is the filter cache (from fq) kept in the OS cache or
in the Java heap?

The result of a filter query is a bitset.  If the core contains 60
million documents, each bitset is 7.5 million bytes in length.  It is
not a list of IDs -- it's a large array of bits representing every
document in the Lucene index, including deleted documents (the Max Doc
value from the core overview).  There are two values for each bit - 0 or
1, depending on whether each document matches the filter or not.
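Shawn's 7.5 million figure follows directly from one bit per document:

```python
# Each cached filter entry is a bitset with one bit per document (Max Doc)
max_doc = 60_000_000
bitset_bytes = max_doc // 8          # 8 bits per byte
print(bitset_bytes)                  # 7500000 bytes, i.e. ~7.5 MB per entry
```

The size is the same no matter how many documents actually match the filter, which is why a 10-row result and a 60-million-document filter cost the cache the same amount.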

Thanks,
Shawn




Kind regards,
Bastien Latard
Web engineer
--
MDPI AG
Postfach, CH-4005 Basel, Switzerland
Office: Klybeckstrasse 64, CH-4057
Tel. +41 61 683 77 35
Fax: +41 61 302 89 18
E-mail:
lat...@mdpi.com
http://www.mdpi.com/



How to get all the docs whose field contain a specialized string?

2016-05-05 Thread ????????
Hi, all


I did a query via the solr admin UI, but the response is not what I desired!
My operations follow.


first step: get all data.
http://127.0.0.1:8080/solr/example/select?q=*%3A*=json=true


response follows:

"response": {
  "numFound": 5,
  "start": 0,
  "docs": [
    { "id": "1", "goods_name_s": "cpu1", "brand_s": "amd",
      "_version_": 1533546720443498500 },
    { "id": "2", "goods_name_s": "cpu2", "brand_s": "ibm",      // there is an 'ibm'
      "_version_": 1533546730775117800 },
    { "id": "3", "goods_name_s": "cpu3", "brand_s": "intel",
      "_version_": 1533546741316452400 },
    { "id": "4", "goods_name_s": "cpu4", "brand_s": "other",
      "_version_": 1533546750936088600 },
    { "id": "5", "goods_name_s": "cpu5", "brand_s": "ibm hp",   // there is an 'ibm'
      "_version_": 1533548604687384600 }
  ]
}

second step: query the records whose 'brand_s' contains 'ibm'.
http://127.0.0.1:8080/solr/example/select?q=brand_s%3Aibm=json=true


"response": {
  "numFound": 1,
  "start": 0,
  "docs": [
    { "id": "2", "goods_name_s": "cpu2", "brand_s": "ibm",
      "_version_": 1533546730775117800 }
  ]
}


My question is: why is only one doc found? There are two docs that
contain 'ibm'.

BigDecimal Solr Field in schema

2016-05-05 Thread Roshan Kamble
Hello All,

I am using Solr 6.0.0 in cloud mode and have a requirement to support all numbers
as BigDecimal.

Does anyone know which solr field type should be used for BigDecimal?

I tried using TrieDoubleField but it does not meet the requirement and rounds
very big numbers after approximately 16 digits.


Regards,
Roshan

The information in this email is confidential and may be legally privileged. It 
is intended solely for the addressee. Access to this email by anyone else is 
unauthorised. If you are not the intended recipient, any disclosure, copying, 
distribution or any action taken or omitted to be taken in reliance on it, is 
prohibited and may be unlawful.


Re: id field always stored?

2016-05-05 Thread Alexandre Rafalovitch
Solr 6 or Solr 5.5, right?

docValues now return the values, even if stored=false. That's probably
what you are hitting. Check release notes (under 5.5 I believe) for
more details.

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 6 May 2016 at 06:30, Siddhartha Singh Sandhu  wrote:
> Hi,
>
> 1. I was doing some exploration and wanted to know if
> the id field is always stored even when I set stored
> = false.
>
>
> <field name="id" type="string" multiValued="false" stored="false"/>
>
> 2. Also, even though I removed dynamic fields, anything tagged *_id is
> getting stored despite marking that field stored = false.
>
> <dynamicField name="*_id" type="string" required="true" stored="false"/>
>
> Where string is defined as:
>
> <fieldType name="string" class="solr.StrField" docValues="true" />
>
> Regards,
>
> Sid.
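If returning the docValues content is unwanted, it can be switched off per field with the useDocValuesAsStored attribute introduced alongside this behavior in Solr 5.5. A sketch, not the poster's actual schema (the field name is hypothetical):

```
<!-- Hypothetical field: usable for sorting/faceting via docValues,
     but never returned in query results -->
<field name="user_id" type="string" stored="false"
       docValues="true" useDocValuesAsStored="false"/>
```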


Re: collection alias and solr streaming expression

2016-05-05 Thread sudsport s
Thanks Joel,

I have created a JIRA issue. Please let me know if you have any feedback.

https://issues.apache.org/jira/browse/SOLR-9077

On Thu, May 5, 2016 at 2:38 PM, Joel Bernstein  wrote:

> Yes, this needs to be supported. If you open up a ticket, I'll be happy to
> review.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, May 5, 2016 at 2:24 PM, sudsport s  wrote:
>
> > I tried to run a solr streaming expression using a collection alias; I get a
> > null pointer exception. After looking at the log I see that getSlices
> > returns
> > null.
> >
> > can someone suggest if it is good idea to add support for collection
> alias
> > in solr streaming expression?
> > If yes I would like to submit fix and add support for this feature.
> >
> >
> >
> >
> > --
> > Thanks
> >
>


Re: query action with wrong result size zero

2016-05-05 Thread Erick Erickson
Please show us:
1> a sample doc that you expect to be returned
2> the results of adding '&debug=query' to the URL
3> the schema definition for the field you're querying against.

It is likely that your query isn't quite what you think it is, is going
against a different field than you think or your schema isn't
quite doing what you think...

On Thu, May 5, 2016 at 9:40 AM, Jay Potharaju  wrote:
> Can you check if the field you are searching on is case sensitive? You can
> quickly test it by copying the exact contents of the brand field into your
> query and comparing it against the query you have posted above.
>
> On Thu, May 5, 2016 at 8:57 AM, mixiangliu <852262...@qq.com> wrote:
>
>>
>> i found a strange thing with solr query: when i set the value of the query
>> field like "brand:amd", the size of the query result is zero, but the real data
>> is not zero. can somebody tell me why? thank you very much!!
>> my english is not very good, wish somebody understands my words!
>>
>
>
>
> --
> Thanks
> Jay Potharaju
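Case sensitivity of the kind Jay describes usually comes down to whether the field's analyzer lower-cases tokens at index and query time. A typical case-insensitive text type looks like this (a generic sketch, not the poster's schema):

```
<fieldType name="text_ci" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lower-cases both indexed tokens and query terms -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If the field is a plain string type (solr.StrField) there is no analysis at all, so only a byte-for-byte exact match will hit.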


Re: Query String Limit

2016-05-05 Thread Erick Erickson
Or perhaps the TermsQueryParser? See:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser

Best,
Erick
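For reference, the TermsQueryParser takes a comma-separated list of values, which keeps a huge ID filter out of the Boolean-clause limit entirely. A sketch of building such a filter string (field name record_id taken from the query below; the ID list is truncated for illustration):

```python
# A few of the IDs from the original query; the real list has hundreds
ids = ["604929", "504197", "500759", "510957"]

# {!terms f=field}v1,v2,... creates one terms query, not one Boolean
# clause per ID, so maxBooleanClauses never comes into play
fq = "{!terms f=record_id}" + ",".join(ids)
print(fq)  # {!terms f=record_id}604929,504197,500759,510957
```

With long lists like this, sending the query as a POST body also avoids any URL-length limits in the servlet container.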

On Thu, May 5, 2016 at 11:13 AM, Ahmet Arslan  wrote:
> Hi,
>
> Wow, that's a lot of IDs. Where are they coming from?
> Maybe you can consider using the join options of lucene/solr if these IDs are
> the result of another query.
>
> Also terms query parser would be better choice in case of lots of IDs.
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser
>
> Ahmet
>
>
> On Thursday, May 5, 2016 7:45 AM, Prasanna S. Dhakephalkar 
>  wrote:
> Hi
>
> We had increased the maxBooleanClauses to a large number, but it did not
> work
>
> Here is the query
>
> http://localhost:8983/solr/collection1/select?fq=record_id%3A(604929+504197+
> 500759+510957+624719+524081+544530+375687+494822+468221+553049+441998+495212
> +462613+623866+344379+462078+501936+189274+609976+587180+620273+479690+60601
> 8+487078+496314+497899+374231+486707+516582+74518+479684+1696152+1090711+396
> 784+377205+600603+539686+550483+436672+512228+1102968+600604+487699+612271+4
> 87978+433952+479846+492699+380838+412290+487086+515836+487957+525335+495426+
> 619724+49726+444558+67422+368749+630542+473638+613887+1679503+509367+1108299
> +498818+528683+530270+595087+468595+585998+487888+600612+515884+455568+60643
> 8+526281+497992+460147+587530+576456+526021+790508+486148+469160+365923+4846
> 54+510829+488792+610933+254610+632700+522376+594418+514817+439283+1676569+52
> 4031+431557+521628+609255+627205+1255921+57+477017+519675+548373+350309+
> 491176+524276+570935+549458+495765+512814+494722+382249+619036+477309+487718
> +470604+514622+1240902+570607+613830+519130+479708+630293+496994+623870+5706
> 72+390434+483496+609115+490875+443859+292168+522383+501802+606498+596773+479
> 881+486020+488654+490422+512636+495512+489480+626269+614618+498967+476988+47
> 7608+486568+270095+295480+478367+607120+583892+593474+494373+368030+484522+5
> 01183+432822+448109+553418+584084+614868+486206+481014+495027+501880+479113+
> 615208+488161+512278+597663+569409+139097+489490+584000+493619+607479+281080
> +518617+518803+487896+719003+584153+484341+505689+278177+539722+548001+62529
> 6+1676456+507566+619039+501882+530385+474125+293642+612857+568418+640839+519
> 893+524335+612859+618762+479460+479719+593700+573677+525991+610965+462087+52
> 1251+501197+443642+1684784+533972+510695+475499+490644+613829+613893+479467+
> 542478+1102898+499230+436921+458632+602303+488468+1684407+584373+494603+4992
> 45+548019+600436+606997+59+503156+440428+518759+535013+548023+494273+649
> 062+528704+469282+582249+511250+496466+497675+505937+489504+600444+614240+19
> 35577+464232+522398+613809+1206232+607149+607644+498059+506810+487115+550976
> +638174+600849+525655+625011+500082+606336+507156+487887+333601+457209+60111
> 0+494927+1712081+601280+486061+501558+600451+263864+527378+571918+472415+608
> 130+212386+380460+590400+478850+631886+486782+608013+613824+581767+527023+62
> 3207+607013+505819+485418+486786+537626+507047+92+527473+495520+553141+5
> 17837+497295+563266+495506+532725+267057+497321+453249+524341+429654+720001+
> 539946+490813+479491+479628+479630+1125985+351147+524296+565077+439949+61241
> 3+495854+479493+1647796+600259+229346+492571+485638+596394+512112+477237+600
> 459+263780+704068+485934+450060+475944+582280+488031+1094010+1687904+539515+
> 525820+539516+505985+600461+488991+387733+520928+362967+351847+531586+616101
> +479925+494156+511292+515729+601903+282655+491244+610859+486081+325500+43639
> 7+600708+523445+480737+486083+614767+486278+1267655+484845+495145+562624+493
> 381+8060+638731+501347+565979+325132+501363+268866+614113+479646+1964487+631
> 934+25717+461612+376451+513712+527557+459209+610194+1938903+488861+426305+47
> 7676+1222682+1246647+567986+501908+791653+325802+498354+435156+484862+533068
> +339875+395827+475148+331094+528741+540715+623480+416601+516419+600473+62563
> 2+480570+447412+449778+503316+492365+563298+486361+500907+514521+138405+6123
> 27+495344+596879+524918+474563+47273+514739+553189+548418+448943+450612+6006
> 78+484753+485302+271844+474199+487922+473784+431524+535371+513583+514746+612
> 534+327470+485855+517878+384102+485856+612768+494791+504840+601330+493551+55
> 8620+540131+479809+394179+487866+559955+578444+576571+485861+488879+573089+4
> 97552+487898+490369+535756+614155+633027+487473+517912+523364+527419+600487+
> 486128+278040+598478+487395+600579+585691+498970+488151+608187+445943+631971
> +230291+504552+534443+501924+489148+292672+528874+434783+479533+485301+61908
> 9+629083+479383+600981+534717+645420+604921+618714+522329+597822+507413+5706
> 05+491732+464741+511564+613929+526049+614817+589065+603307+491990+467339+264
> 426+487907+492982+589067+487674+487820+492983+486708+504140+1216198+625736+4
> 92984+530116+615663+503248+1896822+600588+518139+494994+621846+599669+488207
> 

Re: Filter queries & caching

2016-05-05 Thread Erick Erickson
Well, first of all I would use separate fq clauses. IOW:
 fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *]  &&
type:"abc")

would probably be better written as:
 fq=fromfield:[* TO NOW/DAY+1DAY]&fq=tofield:[NOW/DAY-7DAY TO *]&fq=type:"abc"

You can always spoof the * at the _end_ of the range by putting it at
some date far in the
future, i.e.
fq=fromfield:[some-date TO NOW/DAY+1000DAYS]

But you can empirically figure this out, just go to the
admin>>core>>plugins/stats
and look at the hits for filterCache when you issue test queries.

Best,
Erick
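Erick's "separate fq clauses" advice means sending the fq parameter more than once, so each clause gets its own filter-cache entry. A sketch of building such a request (endpoint and field names from the thread; urlencode is stdlib):

```python
from urllib.parse import urlencode

# Each fq is cached independently; type:"abc" can then be reused by
# any other query that filters on type, regardless of the date ranges
params = [
    ("q", "*:*"),
    ("fq", "fromfield:[* TO NOW/DAY+1DAY]"),
    ("fq", "tofield:[NOW/DAY-7DAY TO *]"),
    ("fq", 'type:"abc"'),
]
query_string = urlencode(params)
print(query_string.count("fq="))  # 3 separate filter clauses
```

A list of (key, value) pairs, rather than a dict, is what lets the same parameter name repeat.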

On Thu, May 5, 2016 at 4:55 PM, Ahmet Arslan  wrote:
> Hi,
>
> It depends on your re-use patterns. Query supplied to the filter query (fq) 
> will be the key of the cache map. Subsequent filter query with an existing 
> key will be served from cache.
>
> For example lets say that you always use these two clauses together.
> Then it makes sense to use fq=+fromfield:[* TO NOW/DAY+1DAY] 
> +tofield:[NOW/DAY-7DAY TO *]
>
> If you have other requests where you filter on type, then separate it: 
> fq=type:abc
>
> It is beter to create smart cache keys that are likely to be issued again.
>
> If type:abc already restricts document space into a very small subset, then 
> you may use post filter option for the remaining restricting clauses.
>
> Ahmet
>
>
> On Friday, May 6, 2016 12:05 AM, Jay Potharaju  wrote:
> I have almost 50 million docs and growing... That being said, in a high
> query volume case does it make sense to use
>
> fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *]  &&
> type:"abc")
>
> OR
> fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
> fq=filter(type:abc)
>
> Is this something that I would need to determine by running some tests?
> Thanks
>
> On Thu, May 5, 2016 at 1:44 PM, Jay Potharaju  wrote:
>
>> Are you suggesting rewriting it like this ?
>> fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
>> fq=filter(type:abc)
>>
>> Is this a better use of the cache as opposed to fq=fromfield:[* TO
>> NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] && type:"abc"
>>
>> Thanks
>>
>> On Thu, May 5, 2016 at 12:50 PM, Ahmet Arslan 
>> wrote:
>>
>>> Hi,
>>>
>>> The cache's enemy is not * but NOW. Since you round it to DAY, the cache
>>> will work within the day.
>>> I would use separate filer queries, especially fq=type:abc for the
>>> structured query so it will be cached independently.
>>>
>>> Also consider disabling caching (using cost) in expensive queries:
>>> http://yonik.com/advanced-filter-caching-in-solr/
>>>
>>> Ahmet
>>>
>>>
>>>
>>> On Thursday, May 5, 2016 8:25 PM, Jay Potharaju 
>>> wrote:
>>> Hi,
>>> I have a filter query that gets  documents based on date ranges from last
>>> n
>>> days to anytime in future.
>>>
>>> The objective is to get documents between a date range, but the start date
>>> and end date values are stored in different fields and that is why I wrote
>>> the filter query as below
>>>
>>> fq=fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] &&
>>> type:"abc"
>>>
>>> The way these queries are currently written, I think, won't leverage the
>>> filter cache because of "*". Is there a better way to write this query so
>>> that I can leverage the cache.
>>>
>>>
>>>
>>> --
>>> Thanks
>>> Jay
>
>>>
>>
>>
>>
>> --
>> Thanks
>> Jay Potharaju
>>
>>
>
>
>
> --
> Thanks
> Jay Potharaju


Re: Results of facet differs with change in facet.limit.

2016-05-05 Thread Erick Erickson
OK, this is strange on the face of it. Is there any chance you could
create a test case that fails? Even if it only fails a small percentage
of the time...

Best,
Erick

On Wed, May 4, 2016 at 3:02 AM, Modassar Ather  wrote:
> The "val1" is the same for both the test with limit 100 and limit 200, so the
> following is true:
>
> limit=100
> val1: 1225
> val2: 1082
> val3: 1076
>
> limit=200
> val1: 1366
> val2: 1321
> val3: 1315
>
> This I have noticed irrespective of facet.limit too. Please refer to my
> previous mail for the example.
>
> Thanks,
> Modassar
>
> On Wed, May 4, 2016 at 3:01 PM, Toke Eskildsen 
> wrote:
>
>> On Mon, 2016-05-02 at 15:53 +0530, Modassar Ather wrote:
>> > E.g.
>> > Query : text_field:term&facet.field=f&facet.limit=100
>> > Result :
>> > val1: 1225
>> > val2: 1082
>> > val3: 1076
>> >
>> > Query : text_field:term&facet.field=f&facet.limit=200
>> > val1: 1366
>> > val2: 1321
>> > val3: 1315
>>
>> Is the "val1" in your limit=100 test the same term as your "val1" in
>> your limit=200-test?
>>
>>
>> Or to phrase it another way: Do you have
>>
>> limit=100
>> val1: 1225
>> val2: 1082
>> val3: 1076
>>
>> limit=200
>> val1: 1366
>> val2: 1321
>> val3: 1315
>>
>>
>> or
>>
>> limit=100
>> val1: 1225
>> val2: 1082
>> val3: 1076
>>
>> limit=200
>> val4: 1366
>> val5: 1321
>> val6: 1315
>>
>>
>> - Toke Eskildsen, State and University Library, Denmark
>>
>>


Re: Implementing partial search and exact matching

2016-05-05 Thread Lasitha Wattaladeniya
Hi Nick,

Thanks for the reply. Actually:

q="software engineering" -> doc1
q="software engineer" -> no results
q="Software engineer" -> doc2

I hope the above test cases explain my requirements further. So far I'm
thinking of changing the qf according to whether the query has enclosing double
quotations or not.

If somebody knows a better approach/queryparser please point me

Thanks
On 6 May 2016 10:56 am, "ND"  wrote:

> Lasitha,
>
> I think I understand what you are asking and if you have something like
> Doc1 = software engineering
> Doc2 = Software engineer
>
> And if you query
> q=software+engineer -> Doc1 & Doc2
>
> but
>
> q="software+engineer" -> Doc1
>
> Correct?
>
> If this is correct then to my knowledge no, Solr out of the box cannot do
> what you are asking, that is recognize an exact (quoted) search and change
> the query fields to non-ngram fields. Of course you could do this in code
> with some regex to see if the first and last characters are double quotes,
> but there are a number of drawbacks to this: q= "Software Engineer"
> and Ninja won't do what you want it to, but you could always do something
> else.
>
> From my experience I would still want q="software+engineer" -> Doc1 and
> Doc2 because technically that exact phrase does exist in both Docs.
>
> Maybe there is someone else on here who can offer some more perspective on
> this or a possible query analyzer that I haven't heard of that can solve
> this issue (I would also be interested in that).
>
> Nick
>
> On Thu, May 5, 2016 at 6:33 PM, Lasitha Wattaladeniya 
> wrote:
>
> > Hi nd,
> >
> > Here's the issue. Let's say I search: Software Engineer. For example,
> > let's say this query will return 10 results when searched against the ngram
> > field. Now I search "Software Engineer" with double quotations. This should
> > not return the same result set as the previous query.
> >
> > I thought the query parser I'm using (edismax)  may have an inbuilt
> > function for that.
> >
> > Do I have to specifically change the query field (qf)  to solve this
> issue
> > for each query?
> > On 6 May 2016 8:26 am, "ND"  wrote:
> >
> > We implemented something similar, it sounds, to what you are asking, but I
> > don't see why you would need to match the original field. Since
> > technically
> > a field that has *software engineer* indexed is matched by a query like
> > "software eng" through "software engineer" with the ngrams, which means
> > the
> > exact phrase is still valid.
> >
> > The problem we were trying to solve was the exact-phrase issue, which can
> > be solved by taking in a qs value of 0 (or higher depending on your
> > definition of exactness).
> >
> > Maybe an example would help if there is something I am not understanding.
> >
> > Also your field definitions might help or a example of the schema
> breakdown
> > similar to the admin analyze page.
> >
> > Nick
> >
> >
> > On Thu, May 5, 2016 at 12:25 AM, Lasitha Wattaladeniya <
> watt...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I'm trying to implement a search functionality using solr. Currently I'm
> > > using the edismax parser with ngram fields to do the search against. So far
> > > it works well.
> > >
> > > The question I have is: when the user inputs double quotations in the
> > > search, per the requirement this should match against the original field,
> > > not against the ngram field.
> > >
> > > Currently what I have thought of doing is, identify the double
> quotations
> > > in the user input and change the query field (qf) according to that (to
> > > ngram field or to the exact field). Isn't there any out of the box
> > solution
> > > for this, I feel like it's a generic requirement and don't want to
> > reinvent
> > > the wheel. Appreciate your comments
> > >
> > > [1]. https://issues.apache.org/jira/browse/SOLR-6842
> > >
> > > [2].
> > >
> > >
> >
> >
> http://grokbase.com/t/lucene/solr-user/14cbghncvh/different-fields-for-user-supplied-phrases-in-edismax
> > >
> > > Thanks,
> > >
> > > Lasitha Wattaladeniya
> > >
> > > Software Engineer
> > >
> >
>
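The qf-switching approach Lasitha describes can be done in the application layer before the request is sent. A minimal sketch (the field names comments_ngram and comments_exact are placeholders, not anyone's actual schema):

```python
def pick_qf(user_query: str) -> str:
    """Route quoted queries to the exact field, everything else to ngrams."""
    q = user_query.strip()
    if len(q) >= 2 and q.startswith('"') and q.endswith('"'):
        return "comments_exact"   # exact/phrase-match field
    return "comments_ngram"       # partial-match (ngram) field

print(pick_qf('"Software Engineer"'))  # comments_exact
print(pick_qf("Software Engineer"))    # comments_ngram
```

As Nick notes in the thread, this simple first/last-character check misfires on mixed inputs like `"Software Engineer" and Ninja`, so it is a starting point rather than a complete parser.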


Re: I need Consultation/Suggestion and I am even willing to pay fee for that

2016-05-05 Thread Alexandre Rafalovitch
My reading is that this whole thing is a content farm, automatically
generating website based on the user queries against some sort of
internal database of documents (PDFs, ebooks, etc perhaps).

The goal seems to be SEO rather than user experience.

Regards,
   Alex.


Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 6 May 2016 at 00:53, John Bickerstaff  wrote:
> This statement has two possible meanings in my mind...
>
> "I want everything as automated manner with minimal manual work."
>
> Do you mean minimal work for your users?  Or do you mean minimal work to
> get your idea up and running and generating income for you or your company?
>
> The first meaning is laudable and a good idea -- and generally the nicer
> you want to make things for your users, the more time you will spend in
> analysis and development (I.E. greater cost in time and money)
>
> The second meaning suggests you want to spend a minimum of time and money
> to get something working -- which is generally incompatible with a really
> great user experience...
>
> And, of course, I may have totally missed your meaning and you may have had
> something totally different in mind...
>
> On Thu, May 5, 2016 at 8:33 AM, John Bickerstaff 
> wrote:
>
>> I'll just briefly add some thoughts...
>>
>> #1 This can be done several ways - including keeping a totally separate
>> document that contains ONLY the data you're willing to expose for free --
>> but what you want to accomplish is not clear enough to me for me to start
>> making recommendations.  I'll just say that this is not a problem or an
>> issue.  A way can be found to address #1 without much problem.
>>
>> #2 is difficult to understand.  I have the sense that you're only
>> beginning to think about a full application you want to build - with Search
>> at the center -- answering #2 is going to take a lot more clarity about
>> exactly what you're trying to accomplish.
>>
>>
>> #3  SOLR allows you to store original content so that you can return it
>> from Solr to an application at some future point.  You don't need to worry
>> about that.  By far the simplest way to handle images is to store metadata
>> about the image (including a link, or some way to get it quickly out of
>> your database, say, the DB id) and then go get the image as part of a
>> secondary process of building your web page after Solr has returned
>> results...  At least that's the way I and the teams I've worked with have
>> always handled it.
>>
>> #4  I must admit, I don't understand question #4...  Do you mean "Will the
>> way I'm handling documents affect the way my site is ranked by Google?"
>> Um.  Probably?  If you were giving everything away for free you'd
>> probably end up with a higher rank over time, but that's not what you want
>> to do, so maybe it's not an issue?  I'm not an expert on getting good
>> rankings from Google, so I'll leave that to others to comment on.
>>
>> As for 5 - what is the something you want to do?  I could try to answer,
>> but I don't have enough information to be sure my answer will match what
>> you're looking for.
>>
>> On Thu, May 5, 2016 at 4:46 AM, Zara Parst  wrote:
>>
>>> What is in my mind!!
>>>
>>>
>>>
>>> I have data in TBs, mainly educational assignments and projects, which will
>>> contain text, images and maybe code as well if it is from computer
>>> science. I will index all the documents into Solr and I will also have
>>> original copies of those documents. Now, I want to create a library where
>>> users can search the content and see a few parts of relevant documents,
>>> like 5 to 10 related documents, but in a restricted manner. For unrestricted
>>> access they have to pay for each document.
>>>
>>>
>>>
>>> I also want to create pages for the content which has already been shown
>>> to the user as a restricted part, so that the number of pages on my website
>>> keeps
>>> on increasing, which will give a boost to my website for search engine
>>> ranking. Obviously more pages mean better rank. I want everything in an
>>> automated manner with minimal manual work. Now the issues that I am facing:
>>>
>>> 1.  How to generate the most relevant restricted part out of Solr
>>> (I can implement a sliding-window display which might serve this, but if
>>> there is already something in Solr then I will prefer that)
>>>
>>>
>>> 2.  How to create pages from that content, and how to manage the url of
>>> that
>>> page on my website (one solution would be a url based on the query, but what
>>> if someone searches almost the same thing and some other document comes up
>>> first; how to resolve the issue of the same url? This will also create an
>>> issue of overlapping content under different urls if I am implementing a
>>> sliding window)
>>>
>>>
>>>
>>> 3.  About creating pages: shall I create the page from the Solr content
>>> or from the original content, because it 

Re: Implementing partial search and exact matching

2016-05-05 Thread ND
Lasitha,

I think I understand what you are asking and if you have something like
Doc1 = software engineering
Doc2 = Software engineer

And if you query
q=software+engineer -> Doc1 & Doc2

but

q="software+engineer" -> Doc1

Correct?

If this is correct then to my knowledge no, Solr out of the box cannot do
what you are asking, that is, recognize an exact (quoted) search and change
the query fields to non-ngram fields. Of course you could do this in code
with some regex to see if the first and last characters are double quotes,
but there are a number of drawbacks to this: e.g. q="Software Engineer"
AND Ninja won't do what you want it to, though you could always do something
else.

From my experience I would still want q="software+engineer" -> Doc1 and
Doc2 because technically that exact phrase does exist in both Docs.

Maybe there is someone else on here who can offer some more perspective on
this or a possible query analyzer that I haven't heard of that can solve
this issue (I would also be interested in that).

Nick

On Thu, May 5, 2016 at 6:33 PM, Lasitha Wattaladeniya 
wrote:

> Hi nd,
>
> Here's the issue.  Let's say I search for Software Engineer. For example,
> let's say this query returns 10 results when searched against the ngram
> field. Now I search "Software Engineer" with double quotation marks. This
> should not return the same result set as the previous query.
>
> I thought the query parser I'm using (edismax)  may have an inbuilt
> function for that.
>
> Do I have to specifically change the query field (qf)  to solve this issue
> for each query?
> On 6 May 2016 8:26 am, "ND"  wrote:
>
We implemented something similar to what it sounds like you are asking, but I
don't see why you would need to match the original field. Since technically
a field that has *software engineer* indexed is matched by queries from
"software eng" to "software engineer" with the ngrams, the
exact phrase is still valid.

The problem we were trying to solve was the exact-phrase issue, which can
be solved by taking in a qs value of 0 (or higher depending on your
definition of exactness).
>
> Maybe an example would help if there is something I am not understanding.
>
> Also your field definitions might help or a example of the schema breakdown
> similar to the admin analyze page.
>
> Nick
>
>
> On Thu, May 5, 2016 at 12:25 AM, Lasitha Wattaladeniya 
> wrote:
>
> > Hi All,
> >
> > I'm trying to implement search functionality using Solr. Currently I'm
> > using the edismax parser with ngram fields to search against. So far
> > it works well.
> >
> > The question I have is when the user input double quotations to the
> search,
> > As the requirement this should  match against the original field, not
> > against the ngram field.
> >
> > Currently what I have thought of doing is, identify the double quotations
> > in the user input and change the query field (qf) according to that (to
> > ngram field or to the exact field). Isn't there any out of the box
> solution
> > for this, I feel like it's a generic requirement and don't want to
> reinvent
> > the wheel. Appreciate your comments
> >
> > [1]. https://issues.apache.org/jira/browse/SOLR-6842
> >
> > [2].
> >
> >
>
> http://grokbase.com/t/lucene/solr-user/14cbghncvh/different-fields-for-user-supplied-phrases-in-edismax
> >
> > Thanks,
> >
> > Lasitha Wattaladeniya
> >
> > Software Engineer
> >
>
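The ngram behavior under discussion can be sketched in a few lines. This is an
illustrative stand-in, not Solr's actual EdgeNGramFilter; the `edge_ngrams`
helper and its `min_len` parameter are invented here for the example.

```python
def edge_ngrams(term, min_len=3):
    # Front-edge n-grams for one token, roughly what an edge-ngram
    # filter would index (a real filter's configuration may differ).
    return [term[:i] for i in range(min_len, len(term) + 1)]

grams = edge_ngrams("engineering")
print(grams[:3])            # ['eng', 'engi', 'engin']
print("engineer" in grams)  # True -- the indexed grams of "engineering"
                            # include "engineer", so a query for
                            # "software engineer" matches both docs
```

This is exactly why quoted and unquoted queries return the same result set
unless the quoted form is routed to a non-ngram field.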


Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-05 Thread deniz
Joel Bernstein wrote
>> Can you post your classpath?

classpath as follows:


solr-solrj-6.0.0
commons-io-2.4
httpclient-4.4.1
httpcore-4.4.1
httpmime-4.4.1
zookeeper-3.4.6
stax2-api-3.1.4
woodstox-core-asl-4.4.1
noggit-0.6
jcl-over-slf4j-1.7.7
slf4j-api-1.7.7




-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451p4274979.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Implementing partial search and exact matching

2016-05-05 Thread Lasitha Wattaladeniya
Hi nd,

Here's the issue.  Let's say I search for Software Engineer. For example,
let's say this query returns 10 results when searched against the ngram
field. Now I search "Software Engineer" with double quotation marks. This
should not return the same result set as the previous query.

I thought the query parser I'm using (edismax)  may have an inbuilt
function for that.

Do I have to specifically change the query field (qf)  to solve this issue
for each query?
On 6 May 2016 8:26 am, "ND"  wrote:

We implemented something similar to what it sounds like you are asking, but I
don't see why you would need to match the original field. Since technically
a field that has *software engineer* indexed is matched by queries from
"software eng" to "software engineer" with the ngrams, the
exact phrase is still valid.

The problem we were trying to solve was the exact-phrase issue, which can
be solved by taking in a qs value of 0 (or higher depending on your
definition of exactness).

Maybe an example would help if there is something I am not understanding.

Also your field definitions might help or a example of the schema breakdown
similar to the admin analyze page.

Nick


On Thu, May 5, 2016 at 12:25 AM, Lasitha Wattaladeniya 
wrote:

> Hi All,
>
> I'm trying to implement search functionality using Solr. Currently I'm
> using the edismax parser with ngram fields to search against. So far it
> works well.
>
> The question I have is when the user input double quotations to the
search,
> As the requirement this should  match against the original field, not
> against the ngram field.
>
> Currently what I have thought of doing is, identify the double quotations
> in the user input and change the query field (qf) according to that (to
> ngram field or to the exact field). Isn't there any out of the box
solution
> for this, I feel like it's a generic requirement and don't want to
reinvent
> the wheel. Appreciate your comments
>
> [1]. https://issues.apache.org/jira/browse/SOLR-6842
>
> [2].
>
>
http://grokbase.com/t/lucene/solr-user/14cbghncvh/different-fields-for-user-supplied-phrases-in-edismax
>
> Thanks,
>
> Lasitha Wattaladeniya
>
> Software Engineer
>


Filtering on nGroups

2016-05-05 Thread Nick Vasilyev
I am grouping documents on a field and would like to retrieve documents
where the number of items in a group matches a specific value or a range.

I haven't been able to experiment with all new functionality, but I wanted
to see if this is possible without having to calculate the count and add it
at index time as a field.

Does anyone have any ideas?

Thanks in advance


Re: Implementing partial search and exact matching

2016-05-05 Thread ND
We implemented something similar to what it sounds like you are asking, but I
don't see why you would need to match the original field. Since technically
a field that has *software engineer* indexed is matched by queries from
"software eng" to "software engineer" with the ngrams, the
exact phrase is still valid.

The problem we were trying to solve was the exact-phrase issue, which can
be solved by taking in a qs value of 0 (or higher depending on your
definition of exactness).

Maybe an example would help if there is something I am not understanding.

Also your field definitions might help or a example of the schema breakdown
similar to the admin analyze page.

Nick


On Thu, May 5, 2016 at 12:25 AM, Lasitha Wattaladeniya 
wrote:

> Hi All,
>
> I'm trying to implement search functionality using Solr. Currently I'm
> using the edismax parser with ngram fields to search against. So far it
> works well.
>
> The question I have is when the user input double quotations to the search,
> As the requirement this should  match against the original field, not
> against the ngram field.
>
> Currently what I have thought of doing is, identify the double quotations
> in the user input and change the query field (qf) according to that (to
> ngram field or to the exact field). Isn't there any out of the box solution
> for this, I feel like it's a generic requirement and don't want to reinvent
> the wheel. Appreciate your comments
>
> [1]. https://issues.apache.org/jira/browse/SOLR-6842
>
> [2].
>
> http://grokbase.com/t/lucene/solr-user/14cbghncvh/different-fields-for-user-supplied-phrases-in-edismax
>
> Thanks,
>
> Lasitha Wattaladeniya
>
> Software Engineer
>


Re: Filter queries & caching

2016-05-05 Thread Ahmet Arslan
Hi,

It depends on your re-use patterns. The query supplied to the filter query (fq)
will be the key of the cache map. A subsequent filter query with an existing key
will be served from the cache.

For example, let's say that you always use these two clauses together.
Then it makes sense to use fq=+fromfield:[* TO NOW/DAY+1DAY]
+tofield:[NOW/DAY-7DAY TO *]

If you have other requests where you filter on type, then separate it:
fq=type:abc

It is better to create smart cache keys that are likely to be issued again.

If type:abc already restricts the document space to a very small subset, then
you may use the post-filter option for the remaining restricting clauses.

Ahmet


On Friday, May 6, 2016 12:05 AM, Jay Potharaju  wrote:
I have almost 50 million docs and growing... That being said, in a
high-query-volume case does it make sense to use

fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *]  &&
type:"abc")

OR
fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
fq=filter(type:abc)

Is this something that I would need to determine by running some tests?
Thanks

On Thu, May 5, 2016 at 1:44 PM, Jay Potharaju  wrote:

> Are you suggesting rewriting it like this ?
> fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
> fq=filter(type:abc)
>
> Is this a better use of the cache as opposed to fq=fromfield:[* TO
> NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] && type:"abc"
>
> Thanks
>
> On Thu, May 5, 2016 at 12:50 PM, Ahmet Arslan 
> wrote:
>
>> Hi,
>>
>> The cache's enemy is not * but NOW. Since you round it to DAY, the cache
>> will work within-day.
>> I would use separate filter queries, especially fq=type:abc for the
>> structured query, so it will be cached independently.
>>
>> Also consider disabling caching (using cost) in expensive queries:
>> http://yonik.com/advanced-filter-caching-in-solr/
>>
>> Ahmet
>>
>>
>>
>> On Thursday, May 5, 2016 8:25 PM, Jay Potharaju 
>> wrote:
>> Hi,
>> I have a filter query that gets  documents based on date ranges from last
>> n
>> days to anytime in future.
>>
>> The objective is to get documents between a date range, but the start date
>> and end date values are stored in different fields and that is why I wrote
>> the filter query as below
>>
>> fq=fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] &&
>> type:"abc"
>>
>> The way these queries are currently written, I think, won't leverage the
>> filter cache because of "*". Is there a better way to write this query so
>> that I can leverage the cache?
>>
>>
>>
>> --
>> Thanks
>> Jay

>>
>
>
>
> --
> Thanks
> Jay Potharaju
>
>



-- 
Thanks
Jay Potharaju
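Ahmet's point about NOW rounding can be illustrated with a toy cache keyed by
the fq string. This is illustrative only -- Solr's filterCache and date-math
evaluation are internal -- and the `fq_key` helper is invented for the sketch.

```python
from datetime import datetime, timezone

def fq_key(now):
    # NOW/DAY rounds down to midnight UTC, so every request in the same
    # day produces the same fq string -- and hence the same cache key.
    day = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return "fromfield:[* TO %s+1DAY]" % day.strftime("%Y-%m-%dT%H:%M:%SZ")

morning = fq_key(datetime(2016, 5, 5, 9, 30, tzinfo=timezone.utc))
evening = fq_key(datetime(2016, 5, 5, 22, 0, tzinfo=timezone.utc))
print(morning)             # fromfield:[* TO 2016-05-05T00:00:00Z+1DAY]
print(morning == evening)  # True: one filterCache entry serves the whole day
```

A bare, unrounded NOW would change the key on every request and make the cache
useless, which is exactly why the rounding matters.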


Re: collection alias and solr streaming expression

2016-05-05 Thread Joel Bernstein
Yes, this needs to be supported. If you open up a ticket, I'll be happy to
review.

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, May 5, 2016 at 2:24 PM, sudsport s  wrote:

> I tried to run a Solr streaming expression using a collection alias, and I
> get a null pointer exception. After looking at the log, I see that
> getSlices returns null.
>
> Can someone suggest whether it is a good idea to add support for collection
> aliases in Solr streaming expressions?
> If yes, I would like to submit a fix and add support for this feature.
>
>
>
>
> --
> Thanks
>


Re: Filter queries & caching

2016-05-05 Thread Jay Potharaju
I have almost 50 million docs and growing... That being said, in a
high-query-volume case does it make sense to use

 fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *]  &&
type:"abc")

OR
fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
fq=filter(type:abc)

Is this something that I would need to determine by running some tests?
Thanks

On Thu, May 5, 2016 at 1:44 PM, Jay Potharaju  wrote:

> Are you suggesting rewriting it like this ?
> fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
> fq=filter(type:abc)
>
> Is this a better use of the cache as opposed to fq=fromfield:[* TO
> NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] && type:"abc"
>
> Thanks
>
> On Thu, May 5, 2016 at 12:50 PM, Ahmet Arslan 
> wrote:
>
>> Hi,
>>
>> The cache's enemy is not * but NOW. Since you round it to DAY, the cache
>> will work within-day.
>> I would use separate filter queries, especially fq=type:abc for the
>> structured query, so it will be cached independently.
>>
>> Also consider disabling caching (using cost) in expensive queries:
>> http://yonik.com/advanced-filter-caching-in-solr/
>>
>> Ahmet
>>
>>
>>
>> On Thursday, May 5, 2016 8:25 PM, Jay Potharaju 
>> wrote:
>> Hi,
>> I have a filter query that gets  documents based on date ranges from last
>> n
>> days to anytime in future.
>>
>> The objective is to get documents between a date range, but the start date
>> and end date values are stored in different fields and that is why I wrote
>> the filter query as below
>>
>> fq=fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] &&
>> type:"abc"
>>
>> The way these queries are currently written, I think, won't leverage the
>> filter cache because of "*". Is there a better way to write this query so
>> that I can leverage the cache?
>>
>>
>>
>> --
>> Thanks
>> Jay
>>
>
>
>
> --
> Thanks
> Jay Potharaju
>
>



-- 
Thanks
Jay Potharaju


Re: Filter queries & caching

2016-05-05 Thread Jay Potharaju
Are you suggesting rewriting it like this ?
fq=filter(fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] )
fq=filter(type:abc)

Is this a better use of the cache as opposed to fq=fromfield:[* TO
NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] && type:"abc"

Thanks

On Thu, May 5, 2016 at 12:50 PM, Ahmet Arslan 
wrote:

> Hi,
>
> The cache's enemy is not * but NOW. Since you round it to DAY, the cache
> will work within-day.
> I would use separate filter queries, especially fq=type:abc for the
> structured query, so it will be cached independently.
>
> Also consider disabling caching (using cost) in expensive queries:
> http://yonik.com/advanced-filter-caching-in-solr/
>
> Ahmet
>
>
>
> On Thursday, May 5, 2016 8:25 PM, Jay Potharaju 
> wrote:
> Hi,
> I have a filter query that gets  documents based on date ranges from last n
> days to anytime in future.
>
> The objective is to get documents between a date range, but the start date
> and end date values are stored in different fields and that is why I wrote
> the filter query as below
>
> fq=fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] &&
> type:"abc"
>
> The way these queries are currently written, I think, won't leverage the
> filter cache because of "*". Is there a better way to write this query so
> that I can leverage the cache?
>
>
>
> --
> Thanks
> Jay
>



-- 
Thanks
Jay Potharaju


id field always stored?

2016-05-05 Thread Siddhartha Singh Sandhu
Hi,

1. I was doing some exploration and wanted to know if
the id field is always stored even when I set stored
= false.


**

2. Also, even though I removed dynamic fields, anything tagged *_id is
getting stored despite marking that field stored = false.

**

Where string is defined as:

**

Regards,

Sid.


Re: Upper and lower fence in stats component

2016-05-05 Thread Yonik Seeley
On Thu, May 5, 2016 at 3:21 PM, Duane Rackley
 wrote:
> My team is switching from a custom statistics patch to using Stats Component 
> in Solr 5.4.1. One of the features that we haven't been able to replicate in 
> Stats Component is an upper and lower fence.

With the JSON Facet API, you can do a query facet to narrow the domain
and then calculate stats under that:

http://yonik.com/json-facet-api/#QueryFacet

-Yonik
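Yonik's suggestion can be written out concretely. Below is a sketch of a
json.facet request body, assuming the techproducts `price` field from the
original question; the facet name `fenced` and the stat names are invented
for the example, and this is not an official snippet.

```python
import json

facet = {
    "fenced": {
        "type": "query",
        "q": "price:[100 TO 1000]",  # the fence: narrows the stats domain only
        "facet": {
            # stats computed over the fenced domain
            "avg_price": "avg(price)",
            "max_price": "max(price)",
        },
    }
}
print(json.dumps(facet, sort_keys=True))
```

The main query's results are unaffected; only the statistics are computed over
the fenced subset, which is the behavior the custom patch provided.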


Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-05 Thread Susheel Kumar
Yes, Nick, I am using the chroot to share the ZK for different instances.

On Thu, May 5, 2016 at 3:08 PM, Nick Vasilyev 
wrote:

> Just out of curiosity, are you sharing the zookeepers between the
> different versions of Solr? If so, are you specifying a zookeeper chroot?
> On May 5, 2016 2:05 PM, "Susheel Kumar"  wrote:
>
> > Nick, Hoss -  Things are back to normal with ZK 3.4.8 and Solr 6.0.0.  I
> > switched to Solr 5.5.0 with ZK 3.4.8, which worked fine, and then installed
> > 6.0.0.  I suspect (not 100% sure) I left ZK dataDir / Solr collection
> > directory data from the previous ZK/Solr version, which probably was
> > putting Solr 6 in an unstable state.
> >
> > Thanks,
> > Susheel
> >
> > On Wed, May 4, 2016 at 9:56 PM, Susheel Kumar 
> > wrote:
> >
> > > Thanks, Nick & Hoss.  I am using the exact same machine, have wiped out
> > > Solr 5.5.0 and installed solr-6.0.0 with external ZK 3.4.8.  I checked
> > > the file descriptor limit for user solr, which was 12000, and increased
> > > it to 52000. I don't see the "too many files open..." error now in the
> > > Solr log, but the Solr connection is still getting lost in the Admin
> > > panel.
> > >
> > > Let me do some more tests and install older version back to confirm and
> > > will share the findings.
> > >
> > > Thanks,
> > > Susheel
> > >
> > > On Wed, May 4, 2016 at 8:11 PM, Chris Hostetter <
> > hossman_luc...@fucit.org>
> > > wrote:
> > >
> > >>
> > >> : Thanks, Nick. Do we know any suggested # for file descriptor limit
> > with
> > >> : Solr6?  Also wondering why i haven't seen this problem before with
> > Solr
> > >> 5.x?
> > >>
> > >> are you running Solr6 on the exact same host OS that you were running
> > >> Solr5 on?
> > >>
> > >> even if you are using the "same OS version" on a diff machine, that
> > >> could explain the discrepancy if you (or someone else) increased the
> > >> file descriptor limit on the "old machine" but that never happened on
> > >> the "new machine"
> > >>
> > >>
> > >>
> > >> : On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev <
> > nick.vasily...@gmail.com
> > >> >
> > >> : wrote:
> > >> :
> > >> : > It looks like you have too many open files, try increasing the
> file
> > >> : > descriptor limit.
> > >> : >
> > >> : > On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar <
> > susheel2...@gmail.com>
> > >> : > wrote:
> > >> : >
> > >> : > > Hello,
> > >> : > >
> > >> : > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8
> and
> > >> used
> > >> : > the
> > >> : > > install service to setup solr.
> > >> : > >
> > >> : > > After launching the Solr Admin Panel on server1, it loses
> > >> : > > connections in a few seconds and then comes back, and the other
> > >> : > > node server2 is marked as Down in the cloud graph. After a few
> > >> : > > seconds it loses the connection and comes back again.
> > >> : > >
> > >> : > > Any idea what may be going wrong? Has anyone used Solr 6 with ZK
> > >> 3.4.8.
> > >> : > > Have never seen this error before with solr 5.x with ZK 3.4.6.
> > >> : > >
> > >> : > > Below log from server1 & server2.  The ZK has 3 nodes with
> chroot
> > >> : > enabled.
> > >> : > >
> > >> : > > Thanks,
> > >> : > > Susheel
> > >> : > >
> > >> : > > server1/solr.log
> > >> : > >
> > >> : > > 
> > >> : > >
> > >> : > >
> > >> : > > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
> > >> : > > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
> > >> : > > [configName]=[collection1] specified config exists in ZooKeeper
> > >> : > >
> > >> : > > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
> > >> : > o.a.s.s.HttpSolrCall
> > >> : > > [admin] webapp=null path=/admin/collections
> > >> : > > params={action=CLUSTERSTATUS=json&_=1462389588125} status=0
> > >> QTime=25
> > >> : > >
> > >> : > > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
> > >> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list
> with
> > >> params
> > >> : > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> > >> : > >
> > >> : > > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
> > >> : > o.a.s.s.HttpSolrCall
> > >> : > > [admin] webapp=null path=/admin/collections
> > >> : > > params={action=LIST=json&_=1462389588125} status=0 QTime=2
> > >> : > >
> > >> : > > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
> > >> : > o.a.s.s.HttpSolrCall
> > >> : > > [admin] webapp=null path=/admin/cores
> > >> : > > params={indexInfo=false=json&_=1462389588124} status=0
> QTime=0
> > >> : > >
> > >> : > > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
> > >> : > o.a.s.s.HttpSolrCall
> > >> : > > [admin] webapp=null path=/admin/info/system
> > >> : > > params={wt=json&_=1462389588126} status=0 QTime=25
> > >> : > >
> > >> : > > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
> > >> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list
> with
> > >> params
> > >> : > > action=LIST=json&_=1462389588125 
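
Since file descriptor limits came up twice in this thread, here is a quick way
to check them programmatically (Unix only; the 52000 threshold echoes the value
Susheel raised his limit to in the thread, not an official recommendation).

```python
import resource

# Soft and hard open-file limits for the current process -- the limit that
# the "too many open files" errors in this thread run into.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d hard=%d" % (soft, hard))

if soft < 52000:
    # Raise it persistently for the solr user in /etc/security/limits.conf,
    # not from application code.
    print("consider raising the file descriptor limit for the solr user")
```

Running this under the same user that starts Solr shows the limit Solr will
actually inherit.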

Re: Filter queries & caching

2016-05-05 Thread Ahmet Arslan
Hi,

The cache's enemy is not * but NOW. Since you round it to DAY, the cache will
work within-day.
I would use separate filter queries, especially fq=type:abc for the structured
query, so it will be cached independently.

Also consider disabling caching (using cost) in expensive queries:
http://yonik.com/advanced-filter-caching-in-solr/

Ahmet



On Thursday, May 5, 2016 8:25 PM, Jay Potharaju  wrote:
Hi,
I have a filter query that gets  documents based on date ranges from last n
days to anytime in future.

The objective is to get documents between a date range, but the start date
and end date values are stored in different fields and that is why I wrote
the filter query as below

fq=fromfield:[* TO NOW/DAY+1DAY]&& tofield:[NOW/DAY-7DAY TO *] && type:"abc"

The way these queries are currently written, I think, won't leverage the
filter cache because of "*". Is there a better way to write this query so
that I can leverage the cache?



-- 
Thanks
Jay


Re: JSON-Facet ignoring excludeTags when numFound is 0

2016-05-05 Thread Yonik Seeley
On Thu, May 5, 2016 at 3:37 PM, Siddharth Modala
 wrote:
> Thanks Yonik,
>
> That fixed the issue. Will this experimental flag be removed from future
> versions?

I don't think so... it's needed functionality, I just don't
particularly like where I had to put it (in the "facet" block instead
of in the block w/ other facet params like offset, limit, etc).
While it's nice that we start off with an implicit facet bucket, that
leaves us w/o a place to specify parameters for the root facet.

-Yonik


> Is there any other webpage apart from your blog (which is btw really
> awesome) where I can find more info on the new facet module (like info
> regarding the processEmpty flag, etc.)?
> On May 5, 2016 2:37 PM, "Yonik Seeley"  wrote:
>
>> On Thu, May 5, 2016 at 2:27 PM, Siddharth Modala
>>  wrote:
>> > Hi All,
>> >
>> > We are facing the following issue where Json Facet with excludeTag
>> doesn't
>> > return any results  when numFound=0, even though excluding the filter
>> will
>> > result in matching few docs. (Note: excludeTag works when numFound is >
>> 0)
>>
>> Yeah, we perhaps need to be smarter at knowing when a 0 result could
>> still yield useful sub-facets.
>> For now, there is an experimental flag you can add to the facet block:
>> processEmpty:true
>>
>> So in your case:
>> json.facet={
>>   processEmpty:true,
>>   exCounts:{type:terms, field:group_id_s, limit:-1,
>>  domain:{excludeTags:AMOUNT} },
>>  }
>>
>> -Yonik
>>
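
Putting the fix together, the full request parameters might look like the
following sketch (the field names and the AMOUNT tag come from this thread,
not from any standard example).

```python
import json

params = {
    "q": "*:*",
    "fq": "{!tag=AMOUNT}amt_td:[201 TO *]",
    "json.facet": json.dumps({
        "processEmpty": True,  # compute sub-facets even when numFound is 0
        "exCounts": {
            "type": "terms",
            "field": "group_id_s",
            "limit": -1,
            "domain": {"excludeTags": "AMOUNT"},
        },
    }),
}
print(params["json.facet"])
```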


Re: JSON-Facet ignoring excludeTags when numFound is 0

2016-05-05 Thread Siddharth Modala
Thanks Yonik,

That fixed the issue. Will this experimental flag be removed from future
versions?

Is there any other webpage apart from your blog (which is btw really
awesome) where I can find more info on the new facet module (like info
regarding the processEmpty flag, etc.)?
On May 5, 2016 2:37 PM, "Yonik Seeley"  wrote:

> On Thu, May 5, 2016 at 2:27 PM, Siddharth Modala
>  wrote:
> > Hi All,
> >
> > We are facing the following issue where Json Facet with excludeTag
> doesn't
> > return any results  when numFound=0, even though excluding the filter
> will
> > result in matching few docs. (Note: excludeTag works when numFound is >
> 0)
>
> Yeah, we perhaps need to be smarter at knowing when a 0 result could
> still yield useful sub-facets.
> For now, there is an experimental flag you can add to the facet block:
> processEmpty:true
>
> So in your case:
> json.facet={
>   processEmpty:true,
>   exCounts:{type:terms, field:group_id_s, limit:-1,
>  domain:{excludeTags:AMOUNT} },
>  }
>
> -Yonik
>


Upper and lower fence in stats component

2016-05-05 Thread Duane Rackley
Hello,

My team is switching from a custom statistics patch to using Stats Component in 
Solr 5.4.1. One of the features that we haven't been able to replicate in Stats 
Component is an upper and lower fence. The fences limit the data that is sent 
to the Stats Component but not the data that is returned by the query. We have 
tried a couple of things, queries on the tutorial index listed below, but 
neither of them returned the desired behavior. I was wondering if anyone knew 
how to accomplish this feature or of a patch that adds this feature?

http://localhost:8983/solr/techproducts/select?rows=0=*:*=true=%7B!lucene%7Dprice:[100%20TO%201000]
http://localhost:8983/solr/techproducts/select?rows=2=*:*=true=%7B!frange%20l=100%20u=2000%7Dprice=price




Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-05 Thread Nick Vasilyev
Just out of curiosity, are you sharing the zookeepers between the
different versions of Solr? If so, are you specifying a zookeeper chroot?
On May 5, 2016 2:05 PM, "Susheel Kumar"  wrote:

> Nick, Hoss -  Things are back to normal with ZK 3.4.8 and Solr 6.0.0.  I
> switched to Solr 5.5.0 with ZK 3.4.8, which worked fine, and then installed
> 6.0.0.  I suspect (not 100% sure) I left ZK dataDir / Solr collection
> directory data from the previous ZK/Solr version, which probably was putting
> Solr 6 in an unstable state.
>
> Thanks,
> Susheel
>
> On Wed, May 4, 2016 at 9:56 PM, Susheel Kumar 
> wrote:
>
> > Thanks, Nick & Hoss.  I am using the exact same machine, have wiped out
> > Solr 5.5.0 and installed solr-6.0.0 with external ZK 3.4.8.  I checked
> > the file descriptor limit for user solr, which was 12000, and increased it
> > to 52000. I don't see the "too many files open..." error now in the Solr
> > log, but the Solr connection is still getting lost in the Admin panel.
> >
> > Let me do some more tests and install older version back to confirm and
> > will share the findings.
> >
> > Thanks,
> > Susheel
> >
> > On Wed, May 4, 2016 at 8:11 PM, Chris Hostetter <
> hossman_luc...@fucit.org>
> > wrote:
> >
> >>
> >> : Thanks, Nick. Do we know any suggested # for file descriptor limit
> with
> >> : Solr6?  Also wondering why i haven't seen this problem before with
> Solr
> >> 5.x?
> >>
> >> are you running Solr6 on the exact same host OS that you were running
> >> Solr5 on?
> >>
> >> even if you are using the "same OS version" on a diff machine, that
> >> could explain the discrepancy if you (or someone else) increased the
> >> file descriptor limit on the "old machine" but that never happened on
> >> the "new machine"
> >>
> >>
> >>
> >> : On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev <
> nick.vasily...@gmail.com
> >> >
> >> : wrote:
> >> :
> >> : > It looks like you have too many open files, try increasing the file
> >> : > descriptor limit.
> >> : >
> >> : > On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar <
> susheel2...@gmail.com>
> >> : > wrote:
> >> : >
> >> : > > Hello,
> >> : > >
> >> : > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and
> >> used
> >> : > the
> >> : > > install service to setup solr.
> >> : > >
> >> : > > After launching the Solr Admin Panel on server1, it loses
> >> : > > connections in a few seconds and then comes back, and the other
> >> : > > node server2 is marked as Down in the cloud graph. After a few
> >> : > > seconds it loses the connection and comes back again.
> >> : > >
> >> : > > Any idea what may be going wrong? Has anyone used Solr 6 with ZK
> >> 3.4.8.
> >> : > > Have never seen this error before with solr 5.x with ZK 3.4.6.
> >> : > >
> >> : > > Below log from server1 & server2.  The ZK has 3 nodes with chroot
> >> : > enabled.
> >> : > >
> >> : > > Thanks,
> >> : > > Susheel
> >> : > >
> >> : > > server1/solr.log
> >> : > >
> >> : > > 
> >> : > >
> >> : > >
> >> : > > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
> >> : > > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
> >> : > > [configName]=[collection1] specified config exists in ZooKeeper
> >> : > >
> >> : > > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
> >> : > o.a.s.s.HttpSolrCall
> >> : > > [admin] webapp=null path=/admin/collections
> >> : > > params={action=CLUSTERSTATUS=json&_=1462389588125} status=0
> >> QTime=25
> >> : > >
> >> : > > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
> >> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
> >> params
> >> : > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> >> : > >
> >> : > > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
> >> : > o.a.s.s.HttpSolrCall
> >> : > > [admin] webapp=null path=/admin/collections
> >> : > > params={action=LIST=json&_=1462389588125} status=0 QTime=2
> >> : > >
> >> : > > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
> >> : > o.a.s.s.HttpSolrCall
> >> : > > [admin] webapp=null path=/admin/cores
> >> : > > params={indexInfo=false=json&_=1462389588124} status=0 QTime=0
> >> : > >
> >> : > > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
> >> : > o.a.s.s.HttpSolrCall
> >> : > > [admin] webapp=null path=/admin/info/system
> >> : > > params={wt=json&_=1462389588126} status=0 QTime=25
> >> : > >
> >> : > > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
> >> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
> >> params
> >> : > > action=LIST=json&_=1462389588125 and sendToOCPQueue=true
> >> : > >
> >> : > > 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ]
> >> : > o.a.s.s.HttpSolrCall
> >> : > > [admin] webapp=null path=/admin/collections
> >> : > > params={action=LIST=json&_=1462389588125} status=0 QTime=3
> >> : > >
> >> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
> >> : > > o.a.h.i.c.DefaultHttpClient I/O exception
> (java.net.SocketException)
> >> : > 

Re: JSON-Facet ignoring excludeTags when numFound is 0

2016-05-05 Thread Yonik Seeley
On Thu, May 5, 2016 at 2:27 PM, Siddharth Modala
 wrote:
> Hi All,
>
> We are facing the following issue: JSON Facet with excludeTags doesn't
> return any results when numFound=0, even though excluding the filter would
> result in matching a few docs. (Note: excludeTags works when numFound is > 0)

Yeah, we perhaps need to be smarter at knowing when a 0 result could
still yield useful sub-facets.
For now, there is an experimental flag you can add to the facet block:
processEmpty:true

So in your case:
json.facet={
  processEmpty:true,
  exCounts:{type:terms, field:group_id_s, limit:-1,
 domain:{excludeTags:AMOUNT} },
 }

-Yonik


JSON-Facet ignoring excludeTags when numFound is 0

2016-05-05 Thread Siddharth Modala
Hi All,

We are facing the following issue where JSON Facet with excludeTags doesn't
return any results when numFound=0, even though excluding the filter would
result in matching a few docs. (Note: excludeTags works when numFound is > 0)

We are using Solr 5.4.1
For eg.

If we have the following data in Solr

acct_s,group_id_s,amt_td
A1, G1,100
A2, G1, 200
A3,G2,100
A4,G3, 100


and we make a solr query as follows

q=*:*&fq={!tag='AMOUNT'}amt_td:[201 TO *]&
json.facet={
  exCounts:{type:terms, field:group_id_s, limit:-1,
domain:{excludeTags:AMOUNT} },
}

Result is numfound=0 and facet:[]

but clearly excluding the 'AMOUNT' filter will match all the documents.


This issue is not there in the normal facet module

eg.

 q=*:*&fq={!tag='AMOUNT'}amt_td:[201 TO
*]&facet=true&facet.field={!ex=AMOUNT}group_id_s&facet.limit=-1

will return numFound=0
facet_counts:{
  facet_fields:{
  group_id_s:[
  G1,2
  G2,1
  G3,1
 ]
  }
   }


This issue shows up in a different form when we have our data distributed
on multiple shards in cloud mode.

Eg.
If the data is distributed as follows

Shard1 on node 1:

acct_s,group_id_s,amt_td
A1, G1,100
A3,G2,100


Shard2 on node2:

acct_s,group_id_s,amt_td
A2, G1, 200
A4,G3, 100

When we run the following query

Note: below fq will match one document

q=*:*&fq={!tag='AMOUNT'}amt_td:[101 TO *]&
json.facet={
  exCounts:{type:terms, field:group_id_s, limit:-1,
domain:{excludeTags:AMOUNT} },
}

Will give

numFound=1
facets:{
  exCounts:{
   buckets:[{
  val:G1, count:1
  },{
  val:G3, count:1
  }]
  }
  }

We can clearly see that the counts of the groups are not as we would
expect. That's because in Shard1 the above query has no matching docs, so
the JSON facet didn't return any results from that shard. But Shard2 has one
matching doc, and so the JSON facet module returned some results.

Is this a bug? Or is this just the way the new JSON facet module is implemented?

Has anyone else faced something similar? Did you find any workaround for
this issue?

PS: We are using jsonFacet because of its performance benefits.


collection alias and solr streaming expression

2016-05-05 Thread sudsport s
I tried to run a Solr streaming expression using a collection alias, and I get
a NullPointerException. After looking at the log I see that getSlices returns
null.

Can someone suggest whether it is a good idea to add support for collection
aliases in Solr streaming expressions?
If yes, I would like to submit a fix and add support for this feature.




--
Thanks


Re: Query String Limit

2016-05-05 Thread Ahmet Arslan
Hi,

Wow, that's a lot of IDs. Where are they coming from?
Maybe you can consider using the join options of Lucene/Solr if these IDs are
the result of another query.

Also, the terms query parser would be a better choice in the case of lots of IDs.
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser
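As an illustration of the terms query parser form Ahmet mentions (a sketch, not from the thread — the field name record_id is taken from the query below):

```python
ids = ["604929", "504197", "500759", "510957"]  # in practice, thousands of IDs

# TermsQueryParser takes a comma-separated list of raw terms and builds a
# filter without per-term scoring, unlike a giant boolean OR clause:
fq = "{!terms f=record_id}" + ",".join(ids)
```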

Ahmet


On Thursday, May 5, 2016 7:45 AM, Prasanna S. Dhakephalkar 
 wrote:
Hi

We had increased the maxBooleanClauses to a large number, but it did not
work

Here is the query

http://localhost:8983/solr/collection1/select?fq=record_id%3A(604929+504197+
500759+510957+624719+524081+544530+375687+494822+468221+553049+441998+495212
+462613+623866+344379+462078+501936+189274+609976+587180+620273+479690+60601
8+487078+496314+497899+374231+486707+516582+74518+479684+1696152+1090711+396
784+377205+600603+539686+550483+436672+512228+1102968+600604+487699+612271+4
87978+433952+479846+492699+380838+412290+487086+515836+487957+525335+495426+
619724+49726+444558+67422+368749+630542+473638+613887+1679503+509367+1108299
+498818+528683+530270+595087+468595+585998+487888+600612+515884+455568+60643
8+526281+497992+460147+587530+576456+526021+790508+486148+469160+365923+4846
54+510829+488792+610933+254610+632700+522376+594418+514817+439283+1676569+52
4031+431557+521628+609255+627205+1255921+57+477017+519675+548373+350309+
491176+524276+570935+549458+495765+512814+494722+382249+619036+477309+487718
+470604+514622+1240902+570607+613830+519130+479708+630293+496994+623870+5706
72+390434+483496+609115+490875+443859+292168+522383+501802+606498+596773+479
881+486020+488654+490422+512636+495512+489480+626269+614618+498967+476988+47
7608+486568+270095+295480+478367+607120+583892+593474+494373+368030+484522+5
01183+432822+448109+553418+584084+614868+486206+481014+495027+501880+479113+
615208+488161+512278+597663+569409+139097+489490+584000+493619+607479+281080
+518617+518803+487896+719003+584153+484341+505689+278177+539722+548001+62529
6+1676456+507566+619039+501882+530385+474125+293642+612857+568418+640839+519
893+524335+612859+618762+479460+479719+593700+573677+525991+610965+462087+52
1251+501197+443642+1684784+533972+510695+475499+490644+613829+613893+479467+
542478+1102898+499230+436921+458632+602303+488468+1684407+584373+494603+4992
45+548019+600436+606997+59+503156+440428+518759+535013+548023+494273+649
062+528704+469282+582249+511250+496466+497675+505937+489504+600444+614240+19
35577+464232+522398+613809+1206232+607149+607644+498059+506810+487115+550976
+638174+600849+525655+625011+500082+606336+507156+487887+333601+457209+60111
0+494927+1712081+601280+486061+501558+600451+263864+527378+571918+472415+608
130+212386+380460+590400+478850+631886+486782+608013+613824+581767+527023+62
3207+607013+505819+485418+486786+537626+507047+92+527473+495520+553141+5
17837+497295+563266+495506+532725+267057+497321+453249+524341+429654+720001+
539946+490813+479491+479628+479630+1125985+351147+524296+565077+439949+61241
3+495854+479493+1647796+600259+229346+492571+485638+596394+512112+477237+600
459+263780+704068+485934+450060+475944+582280+488031+1094010+1687904+539515+
525820+539516+505985+600461+488991+387733+520928+362967+351847+531586+616101
+479925+494156+511292+515729+601903+282655+491244+610859+486081+325500+43639
7+600708+523445+480737+486083+614767+486278+1267655+484845+495145+562624+493
381+8060+638731+501347+565979+325132+501363+268866+614113+479646+1964487+631
934+25717+461612+376451+513712+527557+459209+610194+1938903+488861+426305+47
7676+1222682+1246647+567986+501908+791653+325802+498354+435156+484862+533068
+339875+395827+475148+331094+528741+540715+623480+416601+516419+600473+62563
2+480570+447412+449778+503316+492365+563298+486361+500907+514521+138405+6123
27+495344+596879+524918+474563+47273+514739+553189+548418+448943+450612+6006
78+484753+485302+271844+474199+487922+473784+431524+535371+513583+514746+612
534+327470+485855+517878+384102+485856+612768+494791+504840+601330+493551+55
8620+540131+479809+394179+487866+559955+578444+576571+485861+488879+573089+4
97552+487898+490369+535756+614155+633027+487473+517912+523364+527419+600487+
486128+278040+598478+487395+600579+585691+498970+488151+608187+445943+631971
+230291+504552+534443+501924+489148+292672+528874+434783+479533+485301+61908
9+629083+479383+600981+534717+645420+604921+618714+522329+597822+507413+5706
05+491732+464741+511564+613929+526049+614817+589065+603307+491990+467339+264
426+487907+492982+589067+487674+487820+492983+486708+504140+1216198+625736+4
92984+530116+615663+503248+1896822+600588+518139+494994+621846+599669+488207
+640923+487580+539856+603968+444717+492991+614824+491735+492992+495149+52117
2+365778+261681+600502+479682+597464+492997+587172+624381+482355+1246338+593
642+492000+494707+620137+493000+20617+585199+587176+587177+1877064+587179+53
3478+606061+647089+612257+558521+612259+612261+612264+612266+612268+612273+6
12274+612275+612276+612278+612279+1414843+883571+206887+147419+617296+547518

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-05 Thread Susheel Kumar
Nick, Hoss -  Things are back to normal with ZK 3.4.8 and Solr 6.0.0.  I
switched to Solr 5.5.0 with ZK 3.4.8, which worked fine, and then installed
6.0.0.  I suspect (not 100% sure) I left ZK dataDir / Solr collection
directory data from the previous ZK/Solr version, which was probably leaving
Solr 6 in an unstable state.

Thanks,
Susheel

On Wed, May 4, 2016 at 9:56 PM, Susheel Kumar  wrote:

> Thanks, Nick & Hoss.  I am using the exact same machine, have wiped out
> solr 5.5.0 and installed solr-6.0.0 with external ZK 3.4.8.  I checked the
> File Description limit for user solr, which is 12000 and increased to
> 52000. Don't see "too many files open..." error now in Solr log but still
> Solr connection getting lost in Admin panel.
>
> Let me do some more tests and install older version back to confirm and
> will share the findings.
>
> Thanks,
> Susheel
>
> On Wed, May 4, 2016 at 8:11 PM, Chris Hostetter 
> wrote:
>
>>
>> : Thanks, Nick. Do we know any suggested # for file descriptor limit with
>> : Solr6?  Also wondering why i haven't seen this problem before with Solr
>> 5.x?
>>
>> are you running Solr6 on the exact same host OS that you were running
>> Solr5 on?
>>
>> even if you are using the "same OS version" on a diff machine, that could
>> explain the discrepancy if you (or someone else) increased the file
>> descriptor limit on the "old machine" but that never happened on the "new
>> machine"
>>
>>
>>
>> : On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev > >
>> : wrote:
>> :
>> : > It looks like you have too many open files, try increasing the file
>> : > descriptor limit.
>> : >
>> : > On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar 
>> : > wrote:
>> : >
>> : > > Hello,
>> : > >
>> : > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and
>> used
>> : > the
>> : > > install service to setup solr.
>> : > >
>> : > > After launching Solr Admin Panel on server1, it loses connections
>> in a few
>> : > > seconds and then comes back, and the other node server2 is marked as
>> Down in
>> : > > the cloud graph. After a few seconds it's losing the connection and comes
>> back.
>> : > >
>> : > > Any idea what may be going wrong? Has anyone used Solr 6 with ZK
>> 3.4.8.
>> : > > Have never seen this error before with solr 5.x with ZK 3.4.6.
>> : > >
>> : > > Below log from server1 & server2.  The ZK has 3 nodes with chroot
>> : > enabled.
>> : > >
>> : > > Thanks,
>> : > > Susheel
>> : > >
>> : > > server1/solr.log
>> : > >
>> : > > 
>> : > >
>> : > >
>> : > > 2016-05-04 19:20:53.804 INFO  (qtp1989972246-14) [   ]
>> : > > o.a.s.c.c.ZkStateReader path=[/collections/collection1]
>> : > > [configName]=[collection1] specified config exists in ZooKeeper
>> : > >
>> : > > 2016-05-04 19:20:53.806 INFO  (qtp1989972246-14) [   ]
>> : > o.a.s.s.HttpSolrCall
>> : > > [admin] webapp=null path=/admin/collections
> >> : > > params={action=CLUSTERSTATUS&wt=json&_=1462389588125} status=0
>> QTime=25
>> : > >
>> : > > 2016-05-04 19:20:53.859 INFO  (qtp1989972246-19) [   ]
>> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
>> params
>> : > > action=LIST&wt=json&_=1462389588125 and sendToOCPQueue=true
>> : > >
>> : > > 2016-05-04 19:20:53.861 INFO  (qtp1989972246-19) [   ]
>> : > o.a.s.s.HttpSolrCall
>> : > > [admin] webapp=null path=/admin/collections
> >> : > > params={action=LIST&wt=json&_=1462389588125} status=0 QTime=2
>> : > >
>> : > > 2016-05-04 19:20:57.520 INFO  (qtp1989972246-13) [   ]
>> : > o.a.s.s.HttpSolrCall
>> : > > [admin] webapp=null path=/admin/cores
> >> : > > params={indexInfo=false&wt=json&_=1462389588124} status=0 QTime=0
>> : > >
>> : > > 2016-05-04 19:20:57.546 INFO  (qtp1989972246-15) [   ]
>> : > o.a.s.s.HttpSolrCall
>> : > > [admin] webapp=null path=/admin/info/system
>> : > > params={wt=json&_=1462389588126} status=0 QTime=25
>> : > >
>> : > > 2016-05-04 19:20:57.610 INFO  (qtp1989972246-13) [   ]
>> : > > o.a.s.h.a.CollectionsHandler Invoked Collection Action :list with
>> params
>> : > > action=LIST&wt=json&_=1462389588125 and sendToOCPQueue=true
>> : > >
>> : > > 2016-05-04 19:20:57.613 INFO  (qtp1989972246-13) [   ]
>> : > o.a.s.s.HttpSolrCall
>> : > > [admin] webapp=null path=/admin/collections
> >> : > > params={action=LIST&wt=json&_=1462389588125} status=0 QTime=3
>> : > >
>> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5980) [   ]
>> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
>> : > caught
>> : > > when connecting to {}->http://server2:8983: Too many open files
>> : > >
>> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5983) [   ]
>> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
>> : > caught
>> : > > when connecting to {}->http://server2:8983: Too many open files
>> : > >
>> : > > 2016-05-04 19:21:29.139 INFO  (qtp1989972246-5984) [   ]
>> : > > o.a.h.i.c.DefaultHttpClient I/O exception (java.net.SocketException)
>> : > caught
>> 

Filter queries & caching

2016-05-05 Thread Jay Potharaju
Hi,
I have a filter query that gets  documents based on date ranges from last n
days to anytime in future.

The objective is to get documents between a date range, but the start date
and end date values are stored in different fields and that is why I wrote
the filter query as below

fq=fromfield:[* TO NOW/DAY+1DAY] && tofield:[NOW/DAY-7DAY TO *] && type:"abc"

The way these queries are currently written, I think, won't leverage the
filter cache because of "*". Is there a better way to write this query so
that I can leverage the cache?
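One commonly suggested rewrite — an illustration, not from this thread — is to send each clause as its own fq parameter, since Solr caches each fq string as a separate filterCache entry and the NOW/DAY rounding already keeps the date endpoints stable for a whole day. A sketch in Python:

```python
from urllib.parse import urlencode

# Three separate fq parameters: each gets its own filterCache entry, so the
# type filter (which never changes) is reused across queries, and each date
# filter stays cacheable for a full day thanks to the NOW/DAY rounding.
params = [
    ("q", "*:*"),
    ("fq", "fromfield:[* TO NOW/DAY+1DAY]"),
    ("fq", "tofield:[NOW/DAY-7DAY TO *]"),
    ("fq", 'type:"abc"'),
]
query_string = urlencode(params)
```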



-- 
Thanks
Jay


Re: query action with wrong result size zero

2016-05-05 Thread Jay Potharaju
Can you check if the field you are searching on is case sensitive? You can
quickly test it by copying the exact contents of the brand field into your
query and comparing it against the query you have posted above.

On Thu, May 5, 2016 at 8:57 AM, mixiangliu <852262...@qq.com> wrote:

>
> I found a strange thing with a Solr query: when I set the value of the query
> field like "brand:amd", the size of the query result is zero, but the real data
> is not zero. Can somebody tell me why? Thank you very much!!
> My English is not very good; I hope somebody understands my words!
>



-- 
Thanks
Jay Potharaju


Re: 11:12:25 ERROR SolrCore org.apache.solr.common.SolrException: undefined field 1948

2016-05-05 Thread Garfinkel, David
Thanks Shawn!

On Thu, May 5, 2016 at 12:14 PM, Shawn Heisey  wrote:

> On 5/5/2016 9:52 AM, Garfinkel, David wrote:
> > I'm new to administering Solr, but it is part of my DAM and I'd like to
> > have a better understanding. If I understand correctly I have a field in
> my
> > schema with uuid 1948 that is causing an issue right?
>
> The data being indexed contains a field *named* 1948.  That is not the
> value of the field, it's the name.  Your schema does not contain a field
> named 1948, so Solr refuses to index the data.
>
> Thanks,
> Shawn
>
>


-- 
David Garfinkel
Digital Asset Management/Helpdesk/Systems Support
The Museum of Modern Art
212.708.9866
david_garfin...@moma.org


Re: 11:12:25 ERROR SolrCore org.apache.solr.common.SolrException: undefined field 1948

2016-05-05 Thread Shawn Heisey
On 5/5/2016 9:52 AM, Garfinkel, David wrote:
> I'm new to administering Solr, but it is part of my DAM and I'd like to
> have a better understanding. If I understand correctly I have a field in my
> schema with uuid 1948 that is causing an issue right?

The data being indexed contains a field *named* 1948.  That is not the
value of the field, it's the name.  Your schema does not contain a field
named 1948, so Solr refuses to index the data.

Thanks,
Shawn



query action with wrong result size zero

2016-05-05 Thread mixiangliu

I found a strange thing with a Solr query: when I set the value of a query field
like "brand:amd", the size of the query result is zero, but the real data is not
zero. Can somebody tell me why? Thank you very much!!
My English is not very good; I hope somebody understands my words!


11:12:25 ERROR SolrCore org.apache.solr.common.SolrException: undefined field 1948

2016-05-05 Thread Garfinkel, David
I'm new to administering Solr, but it is part of my DAM and I'd like to
have a better understanding. If I understand correctly I have a field in my
schema with uuid 1948 that is causing an issue right?

-- 
David Garfinkel
Digital Asset Management/Helpdesk/Systems Support
The Museum of Modern Art
212.708.9866
david_garfin...@moma.org


Re: JDK requirements for Solr 5.5

2016-05-05 Thread Shawn Heisey
On 5/5/2016 1:48 AM, t...@sina.com wrote:
> Can Solr run on JDK 7 32 bit? Or must be 64 bit?

You can use a 32-bit JVM ... but it will be limited to 2GB of heap. 
This is a Java limitation, not a Solr limitation.  Depending on how
large your index is, 2GB may not be enough.

Thanks,
Shawn



Re: Passing Ids in query takes more time

2016-05-05 Thread Jeff Wartes

An ID lookup is a very simple and fast query, for one ID. Or’ing a lookup for 
80k ids though is basically 80k searches as far as Solr is concerned, so it’s 
not altogether surprising that it takes a while. Your complaint seems to be 
that the query planner doesn’t know in advance that <criteria> should be
run first, and then the id selection applied to the reduced set. 

So, I can think of a few things for you to look at, in no particular order:

1. TermsQueryParser is designed for lists of terms, you might get better 
results from that: 
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser

2. If your <criteria> is the real discriminating factor in your search,
you could just search for <criteria> and then apply your ID list as a
PostFilter: http://yonik.com/advanced-filter-caching-in-solr/
I guess that’d look something like fq={!terms f=<id field> v="<id list>" cost=150}. A cost >= 100
should qualify it as a post filter, which only operates on an already-found
result set instead of the full index. (Note: I haven’t confirmed that the Terms 
query parser supports post filtering.)

3. I’m not really aware of any storage engine that’ll love doing a filter on 
80k ids at once, but a key-value store like Cassandra might work out better for 
that.

4. There is a thing called a JoinQParserPlugin 
(https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser)
 that can join to another collection 
(https://issues.apache.org/jira/browse/SOLR-4905). But I’ve never used it, and 
there are some significant restrictions.
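A hypothetical shape for the post-filter in option 2 (the field name and cost value are made up; cost >= 100 is what marks an fq as a PostFilter):

```python
ids = [str(i) for i in (111, 222, 333, 444)]  # in practice, ~80k IDs

# cache=false skips the filterCache (an 80k-ID filter would only pollute it);
# cost=150 (>= 100) asks Solr to run this as a PostFilter, i.e. only against
# documents that already matched the main query.
post_filter = '{!terms f=doc_id cache=false cost=150 v="%s"}' % ",".join(ids)
```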




On 5/5/16, 2:46 AM, "Bhaumik Joshi"  wrote:

>Hi,
>
>
>I am retrieving ids from collection1 based on some query and passing those ids
>as a query to collection2, so the query to collection2 which contains the ids
>takes much more time compared to a normal query.
>
>
>Que. 1 - When passing ids in the query, why does it take more time compared to
>the normal query, even though we are narrowing the criteria by passing ids?
>
>e.g.  query-1: doc_id:(111 222 333 444 ...) AND <criteria> is slower
>(passing 80k ids takes 7-9 sec) than query-2: only <criteria> (700-800
>ms). Both return 250 records with the same set of fields.
>
>
>Que. 2 - Any idea how I can achieve the above (get ids from one collection and
>pass those ids to the other one) in an efficient manner, or any other way to get
>data from one collection based on the response of another collection?
>
>
>Thanks & Regards,
>
>Bhaumik Joshi


Re: Query String Limit

2016-05-05 Thread Shawn Heisey
On 5/4/2016 10:45 PM, Prasanna S. Dhakephalkar wrote:
> We had increased the maxBooleanClauses to a large number, but it did not
> work

It looks like you have 1161 values here, so maxBooleanClauses does need
to be increased beyond the default, but the error message would be
different if that limit were being reached.

The problem seems to be that this URL is 8251 bytes long, counting the
extra space after json, which a browser would likely expand to three
characters, making it 8253 bytes long.  Every webserver I have ever
looked at has a default 8192 byte limit on HTTP headers -- the actual
header here would have "GET " before the URL and " HTTP/1.1" after it,
so there's a potential header size of 8266 bytes here -- which won't
work if the server config is left alone.

You have two choices.  You must either bump up the max header size in
the Jetty config, or change to a POST request instead of a GET request,
and put the query parameters in the POST body.
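A sketch of the POST alternative in Python (the host and core name are placeholders); because the parameters travel in the request body, the 8192-byte header budget never comes into play:

```python
from urllib.parse import urlencode
from urllib.request import Request

# A deliberately oversized ID filter, roughly like the one in this thread.
ids = " ".join(str(600000 + n) for n in range(1200))
body = urlencode({"q": "*:*", "fq": "record_id:(%s)" % ids, "wt": "json"})

# The same string as a GET URL would exceed Jetty's default header limit:
assert len(body) > 8192

# Parameters go in the POST body, so no jetty.xml change is needed.
req = Request(
    "http://localhost:8983/solr/collection1/select",  # placeholder URL
    data=body.encode("utf-8"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)
```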

In 5.x and 6.x, the server/etc/jetty.xml config already has a line for
the requestHeaderSize, you just need to change the number.  This is from
my config -- we have queries up to 20K in size:



As you can see, this config uses a property.  You could add
"-Dsolr.jetty.request.header.size=32768" to your Solr startup options
and achieve the same result.  Personally, I just edit jetty.xml -- which
I need to remember to do again when I upgrade Solr.

I need to check whether we are using POST in our code or not.  That
would solve the entire issue.

Thanks,
Shawn



Re: I need Consultation/Suggestion and I am even willing to pay fee for that

2016-05-05 Thread John Bickerstaff
This statement has two possible meanings in my mind...

"I want everything as automated manner with minimal manual work."

Do you mean minimal work for your users?  Or do you mean minimal work to
get your idea up and running and generating income for you or your company?

The first meaning is laudable and a good idea -- and generally the nicer
you want to make things for your users, the more time you will spend in
analysis and development (I.E. greater cost in time and money)

The second meaning suggests you want to spend a minimum of time and money
to get something working -- which is generally incompatible with a really
great user experience...

And, of course, I may have totally missed your meaning and you may have had
something totally different in mind...

On Thu, May 5, 2016 at 8:33 AM, John Bickerstaff 
wrote:

> I'll just briefly add some thoughts...
>
> #1 This can be done several ways - including keeping a totally separate
> document that contains ONLY the data you're willing to expose for free --
> but what you want to accomplish is not clear enough to me for me to start
> making recommendations.  I'll just say that this is not a problem or an
> issue.  A way can be found to address #1 without much problem.
>
> #2 is difficult to understand.  I have the sense that you're only
> beginning to think about a full application you want to build - with Search
> at the center -- answering #2 is going to take a lot more clarity about
> exactly what you're trying to accomplish.
>
>
> #3  SOLR allows you to store original content so that you can return it
> from Solr to an application at some future point.  You don't need to worry
> about that.  By far the simplest way to handle images is to store metadata
> about the image (including a link, or some way to get it quickly out of
> your database, say, the DB id) and then go get the image as part of a
> secondary process of building your web page after Solr has returned
> results...  At least that's the way I and the teams I've worked with have
> always handled it.
>
> #4  I must admit, I don't understand question #4...  Do you mean "Will the
> way I'm handling documents affect the way my site is ranked by Google?"
> Um.  Probably?  If you were giving everything away for free you'd
> probably end up with a higher rank over time, but that's not what you want
> to do, so maybe it's not an issue?  I'm not an expert on getting good
> rankings from Google, so I'll leave that to others to comment on.
>
> As for 5 - what is the something you want to do?  I could try to answer,
> but I don't have enough information to be sure my answer will match what
> you're looking for.
>
> On Thu, May 5, 2016 at 4:46 AM, Zara Parst  wrote:
>
>> What is in my mind!!
>>
>>
>>
>> I have data in TB, mainly educational assignments and projects, which will
>> contain text, images and maybe code as well if it is from computer
>> science.  I will index all the documents into Solr and I will also have
>> original copies of those documents. Now, I want to create a library where
>> users can search the content and can see a few parts of relevant documents,
>> like 5 to 10 related documents, but in a restricted manner.  For unrestricted
>> access they have to pay for each document.
>>
>>
>>
>> I also want to create pages for the content which has already been shown
>> to the user as a restricted part, so that the number of pages on my website
>> keeps
>> increasing, which will give a boost to my website for search engine
>> ranking. Obviously more pages mean better rank. I want everything done in an
>> automated manner with minimal manual work. Now the issues that I am facing:
>>
>> 1.  How to generate restricted part out of solr which is most relevant
>> ( I can implement sliding window display which might serve this but if
>> there is already something in solr then I will prefer that one)
>>
>>
>> 2.  How to create pages from that content and how to manage url of
>> that
>> page on my website (one solution would be url based on query but what if
>> someone search almost same thing and some other document comes as first
>> option and how to resolve the issue of the same url, this will also create
>> issue of overlapping content with different url if I am implementing
>> sliding window)
>>
>>
>>
>> 3.  About creating page, shall I create the page from solr content or
>> from original content because it might have image in content so better
>> option would be from original content.  More suitable choice looks like
>> from original content, if that is the case then how to extract those part
>> from the original content corresponding to the solr result.
>>
>>
>>
>> 4.  Will this affect my site ranking in negative way.
>>
>>
>>
>> 5.  Can we do something for Meta keyword, Title etc. of generated
>> page.
>>
>
>


Re: I need Consultation/Suggestion and I am even willing to pay fee for that

2016-05-05 Thread John Bickerstaff
I'll just briefly add some thoughts...

#1 This can be done several ways - including keeping a totally separate
document that contains ONLY the data you're willing to expose for free --
but what you want to accomplish is not clear enough to me for me to start
making recommendations.  I'll just say that this is not a problem or an
issue.  A way can be found to address #1 without much problem.

#2 is difficult to understand.  I have the sense that you're only beginning
to think about a full application you want to build - with Search at the
center -- answering #2 is going to take a lot more clarity about exactly
what you're trying to accomplish.


#3  SOLR allows you to store original content so that you can return it
from Solr to an application at some future point.  You don't need to worry
about that.  By far the simplest way to handle images is to store metadata
about the image (including a link, or some way to get it quickly out of
your database, say, the DB id) and then go get the image as part of a
secondary process of building your web page after Solr has returned
results...  At least that's the way I and the teams I've worked with have
always handled it.

#4  I must admit, I don't understand question #4...  Do you mean "Will the
way I'm handling documents affect the way my site is ranked by Google?"
Um.  Probably?  If you were giving everything away for free you'd
probably end up with a higher rank over time, but that's not what you want
to do, so maybe it's not an issue?  I'm not an expert on getting good
rankings from Google, so I'll leave that to others to comment on.

As for 5 - what is the something you want to do?  I could try to answer,
but I don't have enough information to be sure my answer will match what
you're looking for.

On Thu, May 5, 2016 at 4:46 AM, Zara Parst  wrote:

> What is in my mind!!
>
>
>
> I have data in TB, mainly educational assignments and projects, which will
> contain text, images and maybe code as well if it is from computer
> science.  I will index all the documents into Solr and I will also have
> original copies of those documents. Now, I want to create a library where
> users can search the content and can see a few parts of relevant documents,
> like 5 to 10 related documents, but in a restricted manner.  For unrestricted
> access they have to pay for each document.
>
>
>
> I also want to create pages for the content which has already been shown
> to the user as a restricted part, so that the number of pages on my website keeps
> increasing, which will give a boost to my website for search engine
> ranking. Obviously more pages mean better rank. I want everything done in an
> automated manner with minimal manual work. Now the issues that I am facing:
>
> 1.  How to generate restricted part out of solr which is most relevant
> ( I can implement sliding window display which might serve this but if
> there is already something in solr then I will prefer that one)
>
>
> 2.  How to create pages from that content and how to manage url of that
> page on my website (one solution would be url based on query but what if
> someone search almost same thing and some other document comes as first
> option and how to resolve the issue of the same url, this will also create
> issue of overlapping content with different url if I am implementing
> sliding window)
>
>
>
> 3.  About creating page, shall I create the page from solr content or
> from original content because it might have image in content so better
> option would be from original content.  More suitable choice looks like
> from original content, if that is the case then how to extract those part
> from the original content corresponding to the solr result.
>
>
>
> 4.  Will this affect my site ranking in negative way.
>
>
>
> 5.  Can we do something for Meta keyword, Title etc. of generated page.
>


Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-05 Thread Joel Bernstein
Also since the same query is working from curl, it's a pretty strong
indication that the error is occurring on the client. The logs show that
the includeMetadata  parameter is being sent properly. This is done
automatically by the JDBC driver. So the /sql handler should be sending the
metadata Tuple. The error you are seeing will occur if the metadata Tuple
is not present or if the first Tuple is null. My guess is the first Tuple
is null due to a ClassNotFoundException which is getting swallowed up during
the parse.

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, May 5, 2016 at 8:37 AM, Joel Bernstein  wrote:

> In looking at the logs things look good on the server side. The sql query
> is sent to the /sql handler. It's translated to a solr query and sent to
> the select handler. Results are returned and no errors.
>
> So, I'm going to venture a guess that the problem is on the client side. I'm
> wondering if you're tripping a ClassNotFoundException once the parsing of
> the json result comes back. I've seen instances where
> ClassNotFoundExceptions get swallowed.
>
> Can you post your classpath?
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, May 5, 2016 at 4:57 AM, deniz  wrote:
>
>> Also found
>>
>> // JDBC requires metadata like field names from the SQLHandler. Force
>> this property to be true.
>> props.setProperty("includeMetadata", "true");
>>
>>
>> in org.apache.solr.client.solrj.io.sql.DriverImpl
>>
>> are there any other ways to get response on solrj without metaData to
>> avoid
>> the error?
>>
>>
>>
>> -
>> Zeki ama calismiyor... Calissa yapar...
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451p4274739.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>


Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-05 Thread Joel Bernstein
In looking at the logs things look good on the server side. The sql query
is sent to the /sql handler. It's translated to a solr query and sent to
the select handler. Results are returned and no errors.

So, I'm going to venture a guess that the problem is on the client side. I'm
wondering if you're tripping a ClassNotFoundException once the parsing of
the json result comes back. I've seen instances where
ClassNotFoundExceptions get swallowed.

Can you post your classpath?



Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, May 5, 2016 at 4:57 AM, deniz  wrote:

> Also found
>
> // JDBC requires metadata like field names from the SQLHandler. Force
> this property to be true.
> props.setProperty("includeMetadata", "true");
>
>
> in org.apache.solr.client.solrj.io.sql.DriverImpl
>
> are there any other ways to get response on solrj without metaData to avoid
> the error?
>
>
>
> -
> Zeki ama calismiyor... Calissa yapar...
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451p4274739.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


RE: Facet ignoring repeated word

2016-05-05 Thread G, Rajesh
Hi,

TermVectorComponent works. I am able to find the repeating words within the same 
document, which facet was not able to do. The problem I see is that 
TermVectorComponent produces results per document, e.g. below, and I have to combine 
the counts, i.e. the count of the word "my" is 6 across the list of documents. Can you 
please suggest a solution to group counts by word across documents? Basically we want 
to build a word cloud from the Solr results


1675


4





1675


2




http://localhost:8182/solr/dev/tvrh?q=*:*=true=comments=true=comments=1000


Hi Erick,
I need the count of repeated words to build word cloud

Thanks
Rajesh
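Since the tv response is per document, the corpus-wide totals have to be merged on the client. A minimal sketch of that merge (the plain maps stand in for the parsed tv section; class and method names are illustrative, not SolrJ API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: merge per-document term frequencies (one map per document, as you
// would extract them from the tv section of a TermVectorComponent response)
// into corpus-wide counts for a word cloud.
public class TermCloud {

    public static Map<String, Long> merge(List<Map<String, Integer>> perDocCounts) {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Integer> doc : perDocCounts) {
            for (Map.Entry<String, Integer> e : doc.entrySet()) {
                // sum this document's tf into the running corpus-wide total
                totals.merge(e.getKey(), e.getValue().longValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> doc1 = Map.of("my", 2, "work", 4);
        Map<String, Integer> doc2 = Map.of("my", 4);
        Map<String, Long> totals = merge(List.of(doc1, doc2));
        System.out.println(totals.get("my")); // prints 6
    }
}
```

The word-cloud sizes can then be read straight off the merged map.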



CEB India Private Limited. Registration No: U741040HR2004PTC035324. Registered 
office: 6th Floor, Tower B, DLF Building No.10 DLF Cyber City, Gurgaon, 
Haryana-122002, India.

This e-mail and/or its attachments are intended only for the use of the 
addressee(s) and may contain confidential and legally privileged information 
belonging to CEB and/or its subsidiaries, including SHL. If you have received 
this e-mail in error, please notify the sender and immediately, destroy all 
copies of this email and its attachments. The publication, copying, in whole or 
in part, or use or dissemination in any other way of this e-mail and 
attachments by anyone other than the intended person(s) is prohibited.

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Tuesday, May 3, 2016 6:19 AM
To: solr-user@lucene.apache.org; G, Rajesh 
Subject: Re: Facet ignoring repeated word

Hi,

StatsComponent does not respect the query parameter. However you can feed a 
function query (e.g., termfreq) to it.

Instead consider using TermVectors or MLT's interesting terms.


https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component
https://cwiki.apache.org/confluence/display/solr/MoreLikeThis

Ahmet
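For what it's worth, feeding termfreq to the StatsComponent as Ahmet suggests might look like the following (core, field, and term are taken from this thread; treat the exact URL as a sketch):

```
http://localhost:8182/solr/dev/select?q=questionid:123&rows=0&stats=true&stats.field={!func}termfreq(comments,'work')
```

The sum statistic in the response is then the total number of occurrences of 'work' across the documents matching the query, counting repeats within a document, which plain faceting does not do.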


On Monday, May 2, 2016 9:31 AM, "G, Rajesh"  wrote:
Hi Erick/ Ahmet,

Thanks for your suggestion. Can we have a query in TermsComponent like. I need 
the word count of comments for a question id not all. When I include the query 
q=questionid=123 I still see count of all

http://localhost:8182/solr/dev/terms?terms.fl=comments=true=1000=questionid=123

StatsComponent is not supporting text fields

Field type 
textcloud_en{class=org.apache.solr.schema.TextField,analyzer=org.apache.solr.analysis.TokenizerChain,args={positionIncrementGap=100,
 class=solr.TextField}} is not currently supported

  

  
  
  


  
  

  

Thanks
Rajesh





-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, April 29, 2016 9:16 PM
To: solr-user ; Ahmet Arslan 
Subject: Re: Facet ignoring repeated word

That's the way faceting is designed to work. It counts the _documents_ that a 
term appears in that satisfy your query, if a word appears multiple times in a 
doc, it'll only count it once.

For the general use-case it'd be unsettling for a user to see a facet count of 
500, then click on it and discover that the number of docs in the corpus was 
really 345 or something.

Ahmet's hints might help, but I'd really ask if counting words multiple times 
really satisfies the use case.

Best,
Erick

On Fri, Apr 29, 2016 at 7:10 AM, Ahmet Arslan  wrote:
> Hi,
>
> Depending on your requirements; StatsComponent, TermsComponent, 
> LukeRequestHandler can also be used.
>
>
> https://cwiki.apache.org/confluence/display/solr/The+Terms+Component
> https://wiki.apache.org/solr/LukeRequestHandler
> https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
> Ahmet
>
>
>
> On Friday, April 29, 2016 11:56 AM, "G, Rajesh"  wrote:
> Hi,
>
> I am trying to implement word 
> 

Re: Nodes appear twice in state.json

2016-05-05 Thread Shalin Shekhar Mangar
Hmm not good. Definitely a new bug. Please open an issue.

Please look up the core node name in core.properties for that particular
core and remove the other one from state.json manually. Probably best to do
a cluster restart to avoid surprises. This is certainly uncharted territory.
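One way to carry out the manual state.json edit described above, assuming a stateFormat=2 collection and the stock zkcli script (host and paths are illustrative; take a backup of the file first):

```
server/scripts/cloud-scripts/zkcli.sh -zkhost idx1.oi.dev:2181 -cmd getfile /collections/search/state.json state.json
# edit state.json: remove the duplicate replica entry whose coreNodeName
# does not match the coreNodeName in that core's core.properties, then:
server/scripts/cloud-scripts/zkcli.sh -zkhost idx1.oi.dev:2181 -cmd putfile /collections/search/state.json state.json
```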

On Wed, May 4, 2016 at 6:54 PM, Markus Jelsma 
wrote:

> Hi - we've just upgraded a development environment from 5.5 to Solr 6.0.
> After the upgrade, which went fine, we see two replicas appear twice in
> the cloud view (see below), both being leader. We've seen this happen
> before on some older 5.x versions. Is there a Jira issue i am missing? An
> unknown issue?
>
> Also, how to fix this. How do we remove the double node from the
> state.json?
>
> Many thanks!
> Markus
>
> {"search":{
> "replicationFactor":"3",
> "shards":{
>   "shard1":{
> "range":"8000-",
> "state":"active",
> "replicas":{
>   "core_node6":{
> "core":"search_shard1_replica1",
> "base_url":"http://idx5.oi.dev:8983/solr",
> "node_name":"idx5.oi.dev:8983_solr",
> "state":"down"},
>   "core_node2":{
> "core":"search_shard1_replica2",
> "base_url":"http://idx2.oi.dev:8983/solr",
> "node_name":"idx2.oi.dev:8983_solr",
> "state":"active",
> "leader":"true"},
>   "core_node3":{
> "core":"search_shard1_replica2",
> "base_url":"http://idx2.oi.dev:8983/solr",
> "node_name":"idx2.oi.dev:8983_solr",
> "state":"down",
> "leader":"true"},
>   "core_node5":{
> "core":"search_shard1_replica3",
> "base_url":"http://idx3.oi.dev:8983/solr",
> "node_name":"idx3.oi.dev:8983_solr",
> "state":"down"}}},
>   "shard2":{
> "range":"0-7fff",
> "state":"active",
> "replicas":{
>   "core_node1":{
> "core":"search_shard2_replica1",
> "base_url":"http://idx4.oi.dev:8983/solr",
> "node_name":"idx4.oi.dev:8983_solr",
> "state":"down"},
>   "core_node2":{
> "core":"search_shard2_replica2",
> "base_url":"http://idx6.oi.dev:8983/solr",
> "node_name":"idx6.oi.dev:8983_solr",
> "state":"down"},
>   "core_node4":{
> "core":"search_shard2_replica3",
> "base_url":"http://idx1.oi.dev:8983/solr",
> "node_name":"idx1.oi.dev:8983_solr",
> "state":"active",
> "leader":"true"}}}},
> "router":{"name":"compositeId"},
> "maxShardsPerNode":"1",
> "autoAddReplicas":"false"}}
>
>
>
>


-- 
Regards,
Shalin Shekhar Mangar.


I need Consultation/Suggestion and I am even willing to pay fee for that

2016-05-05 Thread Zara Parst
What is in my mind!!



I have data in TBs, mainly educational assignments and projects, which will
contain text, images and maybe code as well if it is from computer
science. I will index all the documents into Solr and I will also keep the
original copies of those documents. Now, I want to create a library where a
user can search the content and can see a few parts of relevant documents,
like 5 to 10 related documents, but in a restricted manner. For unrestricted
access they have to pay for each document.



I also want to create pages for the content that has already been shown
to users as a restricted part, so that the number of pages on my website keeps
increasing, which will give a boost to my website's search engine
ranking. Obviously, more pages mean a better rank. I want everything in an
automated manner with minimal manual work. Now, the issues that I am facing:

1.  How to generate the restricted part out of Solr which is most relevant
(I can implement a sliding-window display which might serve this, but if
there is already something in Solr then I will prefer that one)


2.  How to create pages from that content and how to manage the URL of that
page on my website (one solution would be a URL based on the query, but what if
someone searches almost the same thing and some other document comes up as the
first option, and how to resolve the issue of the same URL; this will also
create an issue of overlapping content with different URLs if I am implementing
a sliding window)



3.  About creating the page: shall I create the page from the Solr content or
from the original content? It might have images in the content, so the better
option would be from the original content. If that is the case, then how do I
extract the parts of the original content corresponding to the Solr result?



4.  Will this affect my site ranking in a negative way?



5.  Can we do something for the meta keywords, title, etc. of the generated page?


Passing Ids in query takes more time

2016-05-05 Thread Bhaumik Joshi
Hi,


I am retrieving ids from collection1 based on some query and passing those ids
as a query to collection2; the query to collection2, which contains the ids,
takes much more time compared to a normal query.


Que. 1 - While passing ids in the query, why does it take more time compared to a
normal query, even though we are narrowing the criteria by passing ids?

e.g.  query-1: doc_id:(111 222 333 444 ...) AND  slower 
(passing 80k ids takes 7-9 sec) than query-2: only  (700-800 
ms). Both return 250 records with the same set of fields.


Que. 2 - Any idea how I can achieve the above (get ids from one collection and
pass those ids to the other one) in an efficient manner, or any other way to get
data from one collection based on the response of another collection?


Thanks & Regards,

Bhaumik Joshi
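A large OR of ids is parsed and scored as tens of thousands of boolean clauses, which is where the time tends to go. One commonly used alternative (an assumption on my part, not something stated in this thread) is the terms query parser, which matches a long id list as an unscored filter:

```
fq={!terms f=doc_id}111,222,333,444
```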


Passing IDs in query takes more time

2016-05-05 Thread Bhaumik Joshi
Hi,


I am retrieving ids from collection1 based on some query and passing those ids
as a query to collection2; the query to collection2, which contains the ids,
takes much more time compared to a normal query.


Que. 1 - While passing ids in the query, why does it take more time compared to a
normal query, even though we are narrowing the criteria by passing ids?

e.g.  query-1: doc_id:(111 222 333 444 ...) AND  slower (takes 
7-9 sec) than

only  (700-800 ms). Please note that in this case I am passing 
80k ids in  and retrieving 250 rows.


Que. 2 - Any idea how I can achieve the above (get ids from one collection and
pass those ids to the other one) in an efficient manner, or any other way to get
data from one collection based on the response of another collection?


Thanks & Regards,

Bhaumik Joshi


RE: Facet ignoring repeated word

2016-05-05 Thread G, Rajesh
Hi,

Please ignore my previous email.




-Original Message-
From: G, Rajesh [mailto:r...@cebglobal.com]
Sent: Thursday, May 5, 2016 2:29 PM
To: Ahmet Arslan ; solr-user@lucene.apache.org
Subject: RE: Facet ignoring repeated word

Hi,

TermVectorComponent is also not respecting the query parameter. The query below 
shows results for all question ids instead of only question id 3426

http://localhost:8182/solr/dev/terms?terms.fl=comments=true=1000=questionid=3426

Thanks
Rajesh




-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Tuesday, May 3, 2016 6:19 AM
To: solr-user@lucene.apache.org; G, Rajesh 
Subject: Re: Facet ignoring repeated word

Hi,

StatsComponent does not respect the query parameter. However you can feed a 
function query (e.g., termfreq) to it.

Instead consider using TermVectors or MLT's interesting terms.


https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component
https://cwiki.apache.org/confluence/display/solr/MoreLikeThis

Ahmet


On Monday, May 2, 2016 9:31 AM, "G, Rajesh"  wrote:
Hi Erick/ Ahmet,

Thanks for your suggestion. Can we have a query in TermsComponent like. I need 
the word count of comments for a question id not all. When I include the query 
q=questionid=123 I still see count of all

http://localhost:8182/solr/dev/terms?terms.fl=comments=true=1000=questionid=123

StatsComponent is not supporting text fields

Field type 
textcloud_en{class=org.apache.solr.schema.TextField,analyzer=org.apache.solr.analysis.TokenizerChain,args={positionIncrementGap=100,
 class=solr.TextField}} is not currently supported

  

  
  
  


  
  

  

Thanks
Rajesh





-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, April 29, 2016 9:16 PM
To: solr-user ; Ahmet Arslan 
Subject: Re: Facet ignoring repeated word

That's the way faceting is designed to work. It counts the _documents_ that a 
term appears in that satisfy your query, if a word appears multiple times in a 
doc, it'll only count it once.

For the general use-case it'd be unsettling for a user to see a facet count of 
500, then click on it and discover that the number of docs in the corpus was 
really 345 or something.

Ahmet's hints might help, but I'd really ask if counting words multiple times 
really satisfies the use case.

Best,
Erick

On Fri, Apr 29, 2016 at 7:10 AM, Ahmet Arslan  wrote:
> Hi,
>
> Depending on your requirements; StatsComponent, TermsComponent, 
> LukeRequestHandler can also be used.
>
>
> https://cwiki.apache.org/confluence/display/solr/The+Terms+Component
> 

RE: Facet ignoring repeated word

2016-05-05 Thread G, Rajesh
Hi,

TermVectorComponent is also not respecting the query parameter. The query below 
shows results for all question ids instead of only question id 3426

http://localhost:8182/solr/dev/terms?terms.fl=comments=true=1000=questionid=3426

Thanks
Rajesh




-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Tuesday, May 3, 2016 6:19 AM
To: solr-user@lucene.apache.org; G, Rajesh 
Subject: Re: Facet ignoring repeated word

Hi,

StatsComponent does not respect the query parameter. However you can feed a 
function query (e.g., termfreq) to it.

Instead consider using TermVectors or MLT's interesting terms.


https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component
https://cwiki.apache.org/confluence/display/solr/MoreLikeThis

Ahmet


On Monday, May 2, 2016 9:31 AM, "G, Rajesh"  wrote:
Hi Erick/ Ahmet,

Thanks for your suggestion. Can we have a query in TermsComponent like. I need 
the word count of comments for a question id not all. When I include the query 
q=questionid=123 I still see count of all

http://localhost:8182/solr/dev/terms?terms.fl=comments=true=1000=questionid=123

StatsComponent is not supporting text fields

Field type 
textcloud_en{class=org.apache.solr.schema.TextField,analyzer=org.apache.solr.analysis.TokenizerChain,args={positionIncrementGap=100,
 class=solr.TextField}} is not currently supported

  

  
  
  


  
  

  

Thanks
Rajesh





-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, April 29, 2016 9:16 PM
To: solr-user ; Ahmet Arslan 
Subject: Re: Facet ignoring repeated word

That's the way faceting is designed to work. It counts the _documents_ that a 
term appears in that satisfy your query, if a word appears multiple times in a 
doc, it'll only count it once.

For the general use-case it'd be unsettling for a user to see a facet count of 
500, then click on it and discover that the number of docs in the corpus was 
really 345 or something.

Ahmet's hints might help, but I'd really ask if counting words multiple times 
really satisfies the use case.

Best,
Erick

On Fri, Apr 29, 2016 at 7:10 AM, Ahmet Arslan  wrote:
> Hi,
>
> Depending on your requirements; StatsComponent, TermsComponent, 
> LukeRequestHandler can also be used.
>
>
> https://cwiki.apache.org/confluence/display/solr/The+Terms+Component
> https://wiki.apache.org/solr/LukeRequestHandler
> https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
> Ahmet
>
>
>
> On Friday, April 29, 2016 11:56 AM, "G, Rajesh"  wrote:
> Hi,
>
> I am trying to implement word 
> cloud
>   using Solr.  The problem I have is Solr facet query ignores repeated words 
> in a document eg.
>
> I have indexed the text :
> It seems that the harder I work, the more work I get for the same 
> compensation and reward. The more work I take on gets absorbed into my 
> "normal" workload and I'm not recognized for working harder than my peers, 
> which makes me not want to work to my potential. I am very 

Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-05 Thread deniz
Also found 

// JDBC requires metadata like field names from the SQLHandler. Force
this property to be true.
props.setProperty("includeMetadata", "true");


in org.apache.solr.client.solrj.io.sql.DriverImpl 

are there any other ways to get response on solrj without metaData to avoid
the error? 



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451p4274739.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-05 Thread deniz
could it be something with includeMetaData=true param? I have tried to set it
to false but then the logs look like:

webapp=/solr path=/sql
params={includeMetadata=true=false=1=json=2.2=select+id,+text+from+test+where+tits%3D1+limit+5=map_reduce}
status=0 QTime=3 



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451p4274733.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Suggester no results

2016-05-05 Thread Grigoris Iliopoulos
Hi there,

I want to use the Solr suggester component for city names. I have the
following settings:
schema.xml

Field definition


  



  


The field i want to apply the suggester on



The copy field



The field



solr-config.xml

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">citySuggest</str>
    <str name="suggestAnalyzerFieldType">string</str>
  </lst>
</searchComponent>

Then i run

http://localhost:8983/solr/company/suggest?suggest=true=mySuggester=json=Ath=true

to build the suggest component

Finally i run

   
http://localhost:8983/solr/company/suggest?suggest=true=mySuggester=json=Ath

but i get an empty result set

{"responseHeader":{"status":0,"QTime":0},"suggest":{"mySuggester":{"Ath":{"numFound":0,"suggestions":[]}}}}

Are there any obvious mistakes? Any thoughts?


Re: getZkStateReader() returning NULL

2016-05-05 Thread Alan Woodward
You'll need to call this.server.connect() - the state reader is instantiated 
lazily.

Alan Woodward
www.flax.co.uk


On 5 May 2016, at 01:10, Boman wrote:

> I am attempting to check for existence of a collection prior to creating a
> new one with that name, using Solrj:
> 
>System.out.println("Checking for existence of collection...");
>ZkStateReader zkStateReader = this.server.getZkStateReader(); 
>zkStateReader.updateClusterState();
> 
> this.server was created using:
> 
>   this.server = new CloudSolrClient(this.ZK_HOST);
> 
> The call: this.server.getZkStateReader() consistently returns a NULL.
> 
> Any help would be appreciated. Thanks.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/getZkStateReader-returning-NULL-tp4274663.html
> Sent from the Solr - User mailing list archive at Nabble.com.



JDK requirements for Solr 5.5

2016-05-05 Thread tjlp
 Hi,
Can Solr run on JDK 7 32 bit? Or must be 64 bit?
Thanks


Implementing partial search and exact search

2016-05-05 Thread Lasitha Wattaladeniya
Hi All,

I'm trying to implement search functionality using Solr. Currently I'm
using the edismax parser with ngram fields to search against. So far it
works well.

The question I have is about when the user puts double quotation marks in the
search input. As per the requirement, this should match against the original
field, not against the ngram field.

Currently what I have thought of doing is to identify the double quotation marks
in the user input and change the query field (qf) accordingly (to the
ngram field or to the exact field). Isn't there an out-of-the-box solution
for this? It feels like a generic requirement and I don't want to reinvent
the wheel. I appreciate your comments.

[1]. https://issues.apache.org/jira/browse/SOLR-6842
[2].
http://grokbase.com/t/lucene/solr-user/14cbghncvh/different-fields-for-user-supplied-phrases-in-edismax

Thanks,
Lasitha Wattaladeniya
Software Engineer


Implementing partial search and exact matching

2016-05-05 Thread Lasitha Wattaladeniya
Hi All,

I'm trying to implement search functionality using Solr. Currently I'm
using the edismax parser with ngram fields to search against. So far it
works well.

The question I have is about when the user puts double quotation marks in the
search input. As per the requirement, this should match against the original
field, not against the ngram field.

Currently what I have thought of doing is to identify the double quotation marks
in the user input and change the query field (qf) accordingly (to the
ngram field or to the exact field). Isn't there an out-of-the-box solution
for this? It feels like a generic requirement and I don't want to reinvent
the wheel. I appreciate your comments.

[1]. https://issues.apache.org/jira/browse/SOLR-6842

[2].
http://grokbase.com/t/lucene/solr-user/14cbghncvh/different-fields-for-user-supplied-phrases-in-edismax

Thanks,

Lasitha Wattaladeniya

Software Engineer
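The client-side switch described above can be sketched as follows (the field names "title_exact" and "title_ngram" are made-up placeholders, and real input would also need escaping before being put into a Solr query):

```java
// Sketch: if the user's input is wrapped in double quotes, target the exact
// field; otherwise target the ngram field.
public class QueryFieldChooser {

    public static String chooseQf(String userInput) {
        String trimmed = userInput.trim();
        boolean quoted = trimmed.length() >= 2
                && trimmed.startsWith("\"")
                && trimmed.endsWith("\"");
        return quoted ? "title_exact" : "title_ngram";
    }

    public static void main(String[] args) {
        System.out.println(chooseQf("\"solr cloud\"")); // prints title_exact
        System.out.println(chooseQf("solr cloud"));     // prints title_ngram
    }
}
```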


Re: Solr 6 / Solrj RuntimeException: First tuple is not a metadata tuple

2016-05-05 Thread deniz

> The logs you shared don't seem to be the full logs. There will be a
> related
> exception on the Solr server side. The exception on the Solr server side
> will explain the cause of the problem.

The logs are the full logs which I got on the console when I run the code,
and there is no exception on server side at all (it prints the incoming
query and shows the hits actually, already pasted above)

the same query is fine if I run with curl only though...



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-6-Solrj-RuntimeException-First-tuple-is-not-a-metadata-tuple-tp4274451p4274715.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to range search on the field which contains multiple decimal point (eg: 2.5.0.4)

2016-05-05 Thread Toke Eskildsen
Santhosh Sheshasayanan  wrote:
> When I do range search on the "version" field with criteria
> [* TO 2.5.0.5], it gave me all the value like (2.5.0.1, 2.5.0.10,
> 2.5.0.4). But this is wrong result. Since I was expecting only
> 2.5.0.1 and 2.5.0.4.
> But it include 2.5.0.10 with the results. I googled and found
> that solr does lexical sorting. But I want numerical sorting.

Numbers in Solr-land are either integers or floating point values. From that 
perspective, '2.5.0.10' is just digits delimited by dots. You want a special 
sort that handles versioning designations. As far as I know that is not 
supported by Solr, so you will have to roll your own.

The simplest solution is to pad each component with zeroes, so '2.5.0.4' becomes 
'02050004', '2.5.0.10' becomes '02050010', and '2.5.0' becomes '020500', which 
sorts lexically as you want.

You will have to make your padding conservative enough to cover all your 
versioning cases. Remember to consider 'alpha', 'beta', 'SNAPSHOT' and similar 
postfixes.


- Toke Eskildsen
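The zero-padding idea above, sketched as it might run at index time (the width of 8 and the class/method names are arbitrary choices; this assumes purely numeric components, and non-numeric postfixes such as 'alpha' or 'SNAPSHOT' would need extra rules):

```java
// Sketch: pad every dot-separated numeric component to a fixed width so that
// lexicographic order matches numeric order.
public class VersionPad {

    public static String pad(String version) {
        StringBuilder sb = new StringBuilder();
        for (String part : version.split("\\.")) {
            // "4" -> "00000004"
            sb.append(String.format("%8s", part).replace(' ', '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // padded "2.5.0.4" sorts before padded "2.5.0.10", unlike the raw strings
        System.out.println(pad("2.5.0.4").compareTo(pad("2.5.0.10")) < 0); // prints true
    }
}
```

Index the padded form into a string field for sorting/range queries and keep the original for display.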


Can Highlighting and MoreLikeThis works together in same requestHandler?

2016-05-05 Thread Zheng Lin Edwin Yeo
Hi,

I'm finding out if we could possibly implement highlighting and
MoreLikeThis (MLT) function in the same requestHandler.

I understand that highlighting uses the normal SearchHandler, while MLT
uses MoreLikeThisHandler, and reach requestHandler can only have one
handler. For this case, we have to implement two different requestHandler,
and when user does a search, we have to send two different queries to Solr.

Is there anyway which we can combine the two together, so that when user
does a search, the same query can give both the highlight and MLT results?

I'm using Solr 5.4.0.

Regards,
Edwin


Duplicate docs in pagination with same score in Solr Cloud

2016-05-05 Thread Jeffery Yuan
We are running a match all query in solr cloud(for example, one shard with 3
replicas) - all data have same score. 

In Solr, if docs have the same score, they will be sorted by internal docid.
In Solr Cloud, CloudSolrClient will send requests to different replicas -
round robin.

Same document may have different internal docid in different replicas.

So is it possible that in replicaA docA's docid is 1, while in replicaB its docid
is 11?
Then if the first query (rows 0-10) is sent to replicaA and the second
query (rows 11-20) is sent to replicaB, will the same doc be returned twice?

Is this possible? 
If so, does this mean if we want to avoid duplicate data, we have to add
sort to break same-score-tie?

- This happens to me in my test environment: we get the solr data from
another machine, then make some change using atomic update, then do
pagination to export same data, found that it returns duplicate data.
-- My fix is to sort data by updateDate.
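The tie-breaker fix can be expressed directly on the request; adding the uniqueKey as a secondary sort makes the order deterministic across replicas (the field name id is assumed here to be the uniqueKey):

```
q=*:*&sort=score desc,id asc&start=10&rows=10
```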




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Duplicate-docs-in-pagination-with-same-score-in-Solr-Cloud-tp4274712.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query String Limit

2016-05-05 Thread Susmit Shukla
Hi Prasanna,

What is the exact number you set it to?
What error did you get on solr console and in the solr logs?
Did you reload the core/restarted solr after bumping up the solrconfig?

Thanks,
Susmit

On Wed, May 4, 2016 at 9:45 PM, Prasanna S. Dhakephalkar <
prasann...@merajob.in> wrote:

> Hi
>
> We had increased the maxBooleanClauses to a large number, but it did not
> work
>
> Here is the query
>
>
> http://localhost:8983/solr/collection1/select?fq=record_id%3A(604929+504197+
>
> 500759+510957+624719+524081+544530+375687+494822+468221+553049+441998+495212
>
> [… a very long "+"-separated list of numeric document IDs, quoted from the original query, elided …]