Re: Creating a Custom Query Response Writer

2014-12-07 Thread Erik Hatcher
I wouldn’t personally do anything custom for JSON - but rather just pull out 
what you need client-side (and make the request such that it doesn’t return 
more than you need).  Doing a custom JSON format for this would limit your 
later flexibility in case you wanted to get different pieces of the response, 
possibly.   Also, if you had your own custom JSON format writer it could become 
problematic later on if Solr’s internal APIs changed and you needed to upgrade 
(or worse, maintain multiple versions). 

Erik

 On Dec 6, 2014, at 5:13 PM, Ryan Yacyshyn ryan.yacys...@gmail.com wrote:
 
 Hi Erik,
 
 Wow that's great. Thanks for the explanation, I tried using the
 VelocityResponseWriter with your template provided and it worked as
 expected, makes sense!
 
 What if I want to return a custom JSON response back, rather than HTML for
 auto-suggesting? I'm thinking about using Twitter's typeahead jQuery plugin
 (https://twitter.github.io/typeahead.js/examples/#custom-templates) and
 passing it JSON, and have a template on the front-end to the autosuggest.
 
 Ryan
 
 
 
 On Sat Dec 06 2014 at 3:41:38 AM Erik Hatcher erik.hatc...@gmail.com wrote:
 
 Ryan - I just pulled Taming Text off my shelf and refreshed my memory of
 this custom response writer.
 
 While having a custom writer is a neat example, it’s unnecessary for that
 particular functionality.  Solr has a built-in templatable response writer,
 the VelocityResponseWriter.  You can see it in action for a similar suggest
 feature in Solr’s example /browse interface (type “ip” and wait a second in
 the /browse UI with the sample data indexed).  In there is a little bit of
 jQuery autocomplete plugin usage that calls back to the /terms handler,
 using a suggest.vm template (in conf/velocity).  The difference with the
 Taming Text example is that it returns stored fields of a standard
 search rather than just raw terms; with a little adjustment you can get
 basically the same thing as TT.  Leveraging the Solr example (v4.10.2 for
 me here), I created a conf/velocity/typeahead.vm:
 
  <ul>
  #foreach($doc in $response.results)
    <li>$doc.name</li>
  #end
  </ul>
 
 (the docs in the example data have a ‘name’ field)
 
 This request:
 http://localhost:8983/solr/collection1/select?q=name:ip*&wt=velocity&v.template=typeahead
 results in this response:
 
  <ul>
    <li>Belkin Mobile Power Cord for iPod w/ Dock</li>
    <li>iPod & iPod Mini USB 2.0 Cable</li>
    <li>Apple 60 GB iPod with Video Playback Black</li>
  </ul>
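
 The same request can be assembled programmatically so that the separators
 and the colon are encoded correctly. A minimal Python sketch, assuming the
 host, core name and template name from the request above; the helper name
 typeahead_url is made up for illustration:

```python
from urllib.parse import urlencode, parse_qs

# Base URL copied from the request quoted in this thread (an assumption for
# any other installation).
SELECT_URL = "http://localhost:8983/solr/collection1/select"


def typeahead_url(prefix, template="typeahead"):
    """Build the select URL for a prefix query on the 'name' field."""
    params = {
        "q": "name:%s*" % prefix,  # prefix query, e.g. name:ip*
        "wt": "velocity",          # render with the VelocityResponseWriter
        "v.template": template,    # resolves to conf/velocity/<template>.vm
    }
    # urlencode joins the pairs with '&' and percent-encodes ':' and '*'.
    return SELECT_URL + "?" + urlencode(params)


url = typeahead_url("ip")
print(url)
```

 Decoding the query string again with parse_qs gives back the three
 parameters unchanged, which is a quick way to check the encoding.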
 
Erik
 
 
 On Dec 6, 2014, at 2:24 AM, Ryan Yacyshyn ryan.yacys...@gmail.com
 wrote:
 
 Hey Everyone,
 
 I'm a little stuck on building a custom query response writer. I want to
 create a response writer similar to the one explained in the book, Taming
 Text, on the TypeAheadResponseWriter. I know I need to implement the
 QueryResponseWriter, but I'm not sure where to find the Solr JAR files I
 need to include. Where can I find these?
 
 Thanks,
 Ryan



RE: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Dinesh Babu
Hi Alex,

My requirement is that I should be able to search for a person, for example 
Tom Hanks, by either

1) the whole of the first name (Tom)
2) or a partial first name with prefix (To)
3) or a partial first name without prefix (om)
4) or the whole of the surname (Hanks)
5) or a partial surname with prefix (Han)
6) or a partial surname without prefix (ank)
7) or the whole name (Tom Hanks)
8) or a partial first name with or without prefix and a partial surname with or 
without prefix (To Han, om ank)
9) All of the above as a case-insensitive search

Thanks in advance for your help

Regards,
Dinesh Babu.


-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: 07 December 2014 01:20
To: solr-user
Subject: Re: How to stop Solr tokenising search terms with spaces

There is no spoon. And, there is no phrase search. Certainly nothing that is 
one approach that fits all.

What is actually happening is that you seem to want both phrase and prefix 
search. In your original question you did not explain the second part. So, you 
were given a solution for the first one.

To get the second part, you now need to put some sort of NGram into the 
index-type analyzer chain. But the problem is, you need to be very clear on 
what you want there. Do you want:
1) Major Hanks
2) Major Ha
3) Hanks Ma (swapped)
4) Hanks random text Major (swapped and apart)
5) Ha Ma (prefix on both words)
6) ha ma (lower case searches too)
Or only some of those?

Each of these things has implications and trade-offs. Once you know what you 
want to find, we can help you get there.

Regards,
   Alex.
P.s. If you are not sure what I am talking about with the analyzer chain, may I 
recommend my own book:
http://www.amazon.ca/Instant-Apache-Solr-Indexing-Data-ebook/dp/B00D85K9XC
It seems to be on sale right now.
Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and 
newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers 
community: https://www.linkedin.com/groups?gid=6713853


On 6 December 2014 at 19:17, Dinesh Babu dinesh.b...@pb.com wrote:

 Just curious, why does Solr not provide a simple mechanism to do a phrase 
 search? It is a very common use case, and it is very surprising that there is 
 no straightforward way to do it in Solr; at least I have not found one after 
 so much research.

 Regards,
 Dinesh


 -Original Message-
 From: Dinesh Babu [mailto:dinesh.b...@pb.com]
 Sent: 05 December 2014 17:29
 To: solr-user@lucene.apache.org
 Subject: RE: How to stop Solr tokenising search terms with spaces

 Hi Erik,

 I probably celebrated too soon. When I tested {!field} it seemed to
 work only because the data I queried happened to make it look that way.
 Using the example that I originally mentioned, searching for
 Tom Hanks Major:

 1) If I search {!field f=displayName}: Hanks Major,  it works

 2) If I provide partial word {!field f=displayName}: Hanks Ma,  it
 does not work

 Is this how {!field} is designed to work?

 Also I tried without and with escaping space as you suggested. It has
 the same issue

 1) q= field1:Hanks Major , it works
 2) q= field1:Hanks Maj , does not work

 Regards,
 Dinesh Babu.



 -Original Message-
 From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
 Sent: 05 December 2014 16:44
 To: solr-user@lucene.apache.org
 Subject: Re: How to stop Solr tokenising search terms with spaces

 But also, to spell out the more typical way to do that:

q=field1:”…” OR field2:”…”

 The nice thing about {!field} is that the value doesn’t have to have quotes 
 and deal with escaping issues, but if you just want phrase queries and 
 quote/escaping isn’t a hassle maybe that’s cleaner for you.
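
 To make the difference between the two forms concrete, here is a small
 Python sketch that builds both sets of request parameters. The helper names
 and the f1_val/f2_val style parameter names are illustrative only, and
 whether the OR-combined {!field} form parses as intended depends on the
 Solr version and default parser in use, so treat it as a starting point
 rather than a verified query:

```python
from urllib.parse import urlencode


def quoted_phrase_query(field, phrase):
    """field:"phrase" form: backslashes and double quotes must be escaped."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return '%s:"%s"' % (field, escaped)


def field_parser_params(pairs):
    """{!field} form: each value travels in its own request parameter."""
    params = {}
    clauses = []
    for i, (field, value) in enumerate(pairs, start=1):
        ref = "f%d_val" % i                       # hypothetical parameter name
        clauses.append("{!field f=%s v=$%s}" % (field, ref))
        params[ref] = value                       # raw value, no quoting needed
    params["q"] = " OR ".join(clauses)
    return params


# Quoted form: the caller has to take care of escaping.
print(quoted_phrase_query("displayName", 'Tom "Major" Hanks'))

# {!field} form: values are referenced via $name and sent separately.
params = field_parser_params([("field1", "Tom Hanks"), ("field2", "Hanks Major")])
print(urlencode(params))
```

 The trade-off is the one described above: the quoted form is shorter, while
 the parameter form keeps user input out of the query syntax entirely.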

 Erik


 On Dec 5, 2014, at 11:30 AM, Dinesh Babu dinesh.b...@pb.com wrote:

 One more quick question Erik,

 If I want to search on multiple fields using {!field}, do we have a
 query similar to what {!prefix} has:
 q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val}
 where f1_val=field 1 value&f2_val=field2 value

 Regards,
 Dinesh Babu.



 -Original Message-
 From: Dinesh Babu
 Sent: 05 December 2014 16:26
 To: solr-user@lucene.apache.org
 Subject: RE: How to stop Solr tokenising search terms with spaces

 Thanks a lot Erik. {!field} seems to solve our issue. Much appreciate
 your help


 Regards,
 Dinesh Babu.



 -Original Message-
 From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
 Sent: 05 December 2014 16:00
 To: solr-user@lucene.apache.org
 Subject: Re: How to stop Solr tokenising search terms with spaces

 try using {!field} instead of {!prefix}.  {!field} will create a
 phrase query (or term query if it’s just one term) after analysis.
 [it also could construct other query types if the analysis overlaps
 tokens, but maybe not relevant here]

 Also note that you can use multiple of these expressions if needed:
 q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where
 f1_val=field 1 

RE: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Dinesh Babu
Hi Jack,

Reproducing the email that specifies my requirement.

My requirement is that I should be able to search for a person , for example 
Tom Hanks, by either

1) the whole of first name (Tom)
2) or partial first name with prefix  (To )
3) or partial first name without prefix  ( om)
4) or the whole of surname ( Hanks)
5) or partial surname with prefix (Han)
6) or partial surname without prefix (ank)
7) or the whole name (Tom Hanks)
8) or partial first name with or without prefix and partial surname with or 
without prefix ( To Han , om ank)
9) All of the above as case insensitive search

Thanks in advance for your help

Regards,
Dinesh Babu




-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: 07 December 2014 02:04
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

AFAIK, partial word matching is not a common use case. Could you provide a 
citation that shows otherwise?

Solr does provide a simple mechanism for phrase search - just place your 
phrase in quotes.

If you wish to do something more complex, then of course the solution may be 
more complex.

The starting point would be for you to provide a more complete description of 
your use case, which is clearly not simple phrase search.

Your most recent messages suggested that you want to match on partial words, 
but... you need to be more specific - don't make us try to guess your 
requirements. Feeding us partial requirements, one partial requirement at a 
time is not particularly effective.

Finally, are you really trying to match names within arbitrary text, or do you 
have a field that simply contains a complete name? Again, this comes back to 
providing us with more specific requirements. My guess, from your mention of 
LDAP, is that the field would contain only a name, but... that's me guessing 
when you need to be specific. Once this distinction is cleared up, we can then 
focus on solutions that work either for arbitrary text or single value fields.

-- Jack Krupansky

-Original Message-
From: Dinesh Babu
Sent: Saturday, December 6, 2014 7:17 PM
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces


Just curious, why solr does not provide a simple mechanism to do a phrase
search ? It is a very common use case and it is very surprising that there
is no straight forward, at least I have not found one after so much
research,  way to do it in Solr.

Regards,
Dinesh


-Original Message-
From: Dinesh Babu [mailto:dinesh.b...@pb.com]
Sent: 05 December 2014 17:29
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Erik,

Probably I celebrated too soon. When I tested {!field} it seemed to work as
the query was on such a data that it made to look like it is working.  using
the example that I originally mentioned to search for Tom Hanks Major

1) If I search {!field f=displayName}: Hanks Major,  it works

2) If I provide partial word {!field f=displayName}: Hanks Ma,  it does not
work

Is this how {!field is designed to work?

Also I tried without and with escaping space as you suggested. It has the
same issue

1) q= field1:Hanks Major , it works
2) q= field1:Hanks Maj , does not works

Regards,
Dinesh Babu.



-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 05 December 2014 16:44
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

But also, to spell out the more typical way to do that:

   q=field1:”…” OR field2:”…”

The nice thing about {!field} is that the value doesn’t have to have quotes
and deal with escaping issues, but if you just want phrase queries and
quote/escaping isn’t a hassle maybe that’s cleaner for you.

Erik


 On Dec 5, 2014, at 11:30 AM, Dinesh Babu dinesh.b...@pb.com wrote:

 One more quick question Erik,

 If I want to do search on multiple fields using {!field} do we have a
 query similar to what  {!prefix} has
 :  q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val} where
 f1_val=field 1 valuef2_val=field2 value

 Regards,
 Dinesh Babu.



 -Original Message-
 From: Dinesh Babu
 Sent: 05 December 2014 16:26
 To: solr-user@lucene.apache.org
 Subject: RE: How to stop Solr tokenising search terms with spaces

 Thanks a lot Erik. {!field} seems to solve our issue. Much appreciate your
 help


 Regards,
 Dinesh Babu.



 -Original Message-
 From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
 Sent: 05 December 2014 16:00
 To: solr-user@lucene.apache.org
 Subject: Re: How to stop Solr tokenising search terms with spaces

 try using {!field} instead of {!prefix}.  {!field} will create a phrase
 query (or term query if it’s just one term) after analysis.  [it also
 could construct other query types if the analysis overlaps tokens, but
 maybe not relevant here]

 Also note that you can use multiple of these 

RE: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Dinesh Babu
Thanks Yonik. This does not seem to work for me. This is what I did:

1) q=displayName:rvn* brings me two records (a) RVN Viewpoint Users and (b) 
RVN Project Admins

2) {!complexphrase}"RVN*" -- Unknown query type 
\"org.apache.lucene.search.PrefixQuery\" found in phrase query string \"RVN*\"

3) {!complexphrase}RVN V* -- Does not bring any result back.

4) {!complexphrase}RVN Viewpoint* -- Does not bring any result back.

Do I need to make any configuration changes to get this working?

Regards,
Dinesh Babu.



-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 07 December 2014 03:30
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

On Sat, Dec 6, 2014 at 7:17 PM, Dinesh Babu dinesh.b...@pb.com wrote:
 Just curious, why solr does not provide a simple mechanism to do a phrase 
 search ?

Simple phrase queries:
q= field1:"Hanks Major"

Phrase queries with wildcards / partial matches are a different story... they 
are complex:

q={!complexphrase}"hanks ma*"

See more examples here:
http://heliosearch.org/solr-4-8-features/

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, 
off-heap data





Re: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Jack Krupansky
Thanks for the clarification. You may be able to get by using an ngram 
filter at index time - but not at query time.


Then "Tom" would be indexed at position 0 as "to", "om", and "tom", and 
"Hanks" would be indexed at position 1 as "ha", "an", "nk", "ks", "han", 
"ank", "nks", "hank", "anks", and "hanks", permitting all of your queries, 
as unquoted terms or quoted simple phrases, such as "to ank".


Use the standard tokenizer combined with the NGramFilterFactory and lower 
case filter, but only use the ngram filter at index time.


See:
http://lucene.apache.org/core/4_10_2/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html

But be aware that use of the ngram filter dramatically increases the index 
size, so don't use it on large text fields, just short text fields like 
names.
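
To see why index-time n-grams cover the requested cases, here is a rough 
Python simulation. It is not Solr code: whitespace splitting stands in for 
the standard tokenizer, the gram sizes (2 to 5) are assumptions chosen to 
reproduce the term lists above, and phrase_matches only approximates how a 
quoted phrase is matched against adjacent positions:

```python
def ngrams(token, min_gram=2, max_gram=5):
    """All lowercased substrings of token with length min_gram..max_gram."""
    token = token.lower()
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def index_name(name):
    """Map token position -> set of n-grams emitted at that position."""
    return {pos: set(ngrams(tok)) for pos, tok in enumerate(name.split())}


def phrase_matches(index, query):
    """True if the lowercased query terms occur at consecutive positions."""
    terms = query.lower().split()
    for start in range(len(index) - len(terms) + 1):
        if all(term in index[start + i] for i, term in enumerate(terms)):
            return True
    return False


index = index_name("Tom Hanks")
queries = ["Tom", "To", "om", "Hanks", "Han", "ank",
           "Tom Hanks", "To Han", "om ank", "tom hanks"]
for q in queries:
    print(q, phrase_matches(index, q))
```

With these sizes a single-character query cannot match, and every additional 
gram size adds terms per token, which is the index-size trade-off mentioned 
above.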


-- Jack Krupansky

-Original Message- 
From: Dinesh Babu

Sent: Sunday, December 7, 2014 2:58 PM
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Alex,

My requirement is that I should be able to search for a person , for example 
Tom Hanks, by either


1) the whole of first name (Tom)
2) or partial first name with prefix  (To )
3) or partial first name without prefix  ( om)
4) or the whole of surname ( Hanks)
5) or partial surname with prefix (Han)
6) or partial surname without prefix (ank)
7) or the whole name (Tom Hanks)
8) or partial first name with or without prefix and partial surname with or 
without prefix ( To Han , om ank)

9) All of the above as case insensitive search

Thanks in advance for your help

Regards,
Dinesh Babu.


-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: 07 December 2014 01:20
To: solr-user
Subject: Re: How to stop Solr tokenising search terms with spaces

There is no spoon. And, there is no phrase search. Certainly nothing that 
is one approach that fits all.


What is actually happening is that you seem to want both phrase and prefix 
search. In your original question you did not explain the second part. So, 
you were given a solution for the first one.


To get the second part, you now need to put some sort of NGram into the 
index-type analyzer chain. But the problem is, you need to be very clear on 
what you want there. Do you want:

1) Major Hanks
2) Major Ha
3) Hanks Ma (swapped)
4) Hanks random text Major (swapped and apart)
5) Ha Ma (prefix on both words)
6) ha ma (lower case searches too)
Or only some of those?

Each of these things has implications and trade-offs. Once you know what 
you want to find, we can help you get there.


Regards,
  Alex.
P.s. If you are not sure what I am talking about with the analyzer chain, 
may I recommend my own book:

http://www.amazon.ca/Instant-Apache-Solr-Indexing-Data-ebook/dp/B00D85K9XC
It seems to be on sale right now.
Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and 
newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers 
community: https://www.linkedin.com/groups?gid=6713853



On 6 December 2014 at 19:17, Dinesh Babu dinesh.b...@pb.com wrote:


Just curious, why solr does not provide a simple mechanism to do a phrase 
search ? It is a very common use case and it is very surprising that there 
is no straight forward, at least I have not found one after so much 
research,  way to do it in Solr.


Regards,
Dinesh


-Original Message-
From: Dinesh Babu [mailto:dinesh.b...@pb.com]
Sent: 05 December 2014 17:29
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Erik,

Probably I celebrated too soon. When I tested {!field} it seemed to
work as the query was on such a data that it made to look like it is
working.  using the example that I originally mentioned to search for
Tom Hanks Major

1) If I search {!field f=displayName}: Hanks Major,  it works

2) If I provide partial word {!field f=displayName}: Hanks Ma,  it
does not work

Is this how {!field is designed to work?

Also I tried without and with escaping space as you suggested. It has
the same issue

1) q= field1:Hanks Major , it works
2) q= field1:Hanks Maj , does not works

Regards,
Dinesh Babu.



-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 05 December 2014 16:44
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

But also, to spell out the more typical way to do that:

   q=field1:”…” OR field2:”…”

The nice thing about {!field} is that the value doesn’t have to have 
quotes and deal with escaping issues, but if you just want phrase queries 
and quote/escaping isn’t a hassle maybe that’s cleaner for you.


Erik



On Dec 5, 2014, at 11:30 AM, Dinesh Babu dinesh.b...@pb.com wrote:

One more quick question Erik,

If I want to do search on multiple fields using {!field} do we have a
query similar to what  {!prefix} has
:  q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val}
where f1_val=field 1 

Re: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Yonik Seeley
On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote:
 Thanks Yonik. This does not seem to work for me. This is what I did:

 1) q=displayName:rvn* brings me two records (a) RVN Viewpoint Users and (b) 
 RVN Project Admins

 2) {!complexphrase}"RVN*" -- Unknown query type 
 \"org.apache.lucene.search.PrefixQuery\" found in phrase query string 
 \"RVN*\"

Looks like you found a bug in this part... a prefix query being quoted
when it doesn't need to be.

 3) {!complexphrase}RVN V* -- Does not bring any result back.

This type of query should work (and does for me).  Is it because the
default search field does not have these terms, and you didn't specify
a different field to search?
Try this:
{!complexphrase}displayName:"RVN V*"
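
A minimal Python sketch of building such a query, assuming the phrase is
wrapped in double quotes after the field name, as the complexphrase parser
expects. The helper name is illustrative, and naming the field explicitly
avoids depending on the default search field:

```python
from urllib.parse import urlencode


def complexphrase_query(field, phrase):
    """Build a complexphrase query with an explicit field and quoted phrase."""
    return '{!complexphrase}%s:"%s"' % (field, phrase)


q = complexphrase_query("displayName", "RVN V*")
print(q)
# URL-encoded form, ready to append to a /select request.
print(urlencode({"q": q}))
```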

-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets, off-heap data


Re: Creating a Custom Query Response Writer

2014-12-07 Thread Ryan Yacyshyn
Thanks Erik. That's what I did in the end and it works great. I thought I'd
need to create a custom response to remove unnecessary fields but was able
to make the request return pretty much only what I need, even adding
omitHeader=true. I'm using the EdgeNGramFilterFactory during indexing on
the title field, and then returning just the title for the matching set.

For example, a call to http://localhost:8983/solr/movies/suggest_movie?q=sav,
will return:

{
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {
        "title": "Saving Private Ryan"
      },
      {
        "title": "Into the Breach: 'Saving Private Ryan'"
      }
    ]
  }
}
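
Pulling the titles out client-side from a response shaped like the one above
could look like this. It is only a sketch: the response body is pasted in as
a string instead of being fetched, and the function name is made up:

```python
import json

# Response body as shown in this message, with JSON quoting in place.
raw = """
{
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"title": "Saving Private Ryan"},
      {"title": "Into the Breach: 'Saving Private Ryan'"}
    ]
  }
}
"""


def suggestions(body):
    """Return the title of each matching document, in response order."""
    data = json.loads(body)
    return [doc["title"] for doc in data["response"]["docs"]]


titles = suggestions(raw)
print(titles)
```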

Thanks for your help!

Ryan




On Sun Dec 07 2014 at 7:14:23 AM Erik Hatcher erik.hatc...@gmail.com
wrote:

 I wouldn’t personally do anything custom for JSON - but rather just pull
 out what you need client-side (and make the request such that it doesn’t
 return more than you need).  Doing a custom JSON format for this would
 limit your later flexibility in case you wanted to get different pieces of
 the response, possibly.   Also, if you had your own custom JSON format
 writer it could become problematic later on if Solr’s internal APIs changed
 and you needed to upgrade (or worse, maintain multiple versions).

 Erik

  On Dec 6, 2014, at 5:13 PM, Ryan Yacyshyn ryan.yacys...@gmail.com
 wrote:
 
  Hi Erik,
 
  Wow that's great. Thanks for the explanation, I tried using the
  VelocityResponseWriter with your template provided and it worked as
  expected, makes sense!
 
  What if I want to return a custom JSON response back, rather than HTML
 for
  auto-suggesting? I'm thinking about using Twitter's typeahead jQuery
 plugin
  https://twitter.github.io/typeahead.js/examples/#custom-templates 
 https://twitter.github.io/typeahead.js/examples/#custom-templates and
  passing it JSON, and have a template on the front-end to the autosuggest.
 
  Ryan
 
 
 
  On Sat Dec 06 2014 at 3:41:38 AM Erik Hatcher erik.hatc...@gmail.com
 mailto:erik.hatc...@gmail.com
  wrote:
 
  Ryan - I just pulled Taming Text off my shelf and refreshed my memory of
  this custom response writer.
 
  While having a custom writer is a neat example, it’s unnecessary for
 that
  particular functionality.  Solr has a built-in templatable response
 writer,
  the VelocityResponseWriter.  You can see it in action for a similar
 suggest
  feature in Solr’s example /browse interface (type “ip” and wait a
 second in
  the /browse UI with the sample data indexed).  In there is a little bit
 of
  jQuery autocomplete plugin usage that calls back to the /terms handler,
  using a suggest.vm template (in conf/velocity).  The difference with the
  Taming Text example is that it returns stored fields of a standard
  search rather than just raw terms; with a little adjustment you can get
  basically the same thing as TT.  Leveraging the Solr example (v4.10.2
 for
  me here), I created a conf/velocity/typeahead.vm:
 
   <ul>
   #foreach($doc in $response.results)
     <li>$doc.name</li>
   #end
   </ul>
 
  (the docs in the example data have a ‘name’ field)
 
  This request:
  http://localhost:8983/solr/collection1/select?q=name:ip*&wt=velocity&v.template=typeahead
  results in this response:
 
   <ul>
     <li>Belkin Mobile Power Cord for iPod w/ Dock</li>
     <li>iPod & iPod Mini USB 2.0 Cable</li>
     <li>Apple 60 GB iPod with Video Playback Black</li>
   </ul>
 
 Erik
 
 
  On Dec 6, 2014, at 2:24 AM, Ryan Yacyshyn ryan.yacys...@gmail.com
  wrote:
 
  Hey Everyone,
 
  I'm a little stuck on building a custom query response writer. I want
 to
  create a response writer similar to the one explained in the book,
 Taming
  Text, on the TypeAheadResponseWriter. I know I need to implement the
  QueryResponseWriter, but I'm not sure where to find the Solr JAR files
 I
  need to include. Where can I find these?
 
  Thanks,
  Ryan




Re: CloudSolrServer, concurrency and too many connections

2014-12-07 Thread JoeSmith
I've upgraded to 4.10.2 on the client-side.  Still seeing this connection
problem when connecting to the Zookeeper port.  If I connect directly to
SolrServer, the connections do not increase.  But when connecting to
Zookeeper, the connections increase up to 60 and then start to fail.  I
understand Zookeeper is configured to fail after 60 connections to prevent
a DOS attack, but I don't see why we keep adding new connections (up to
60).  Does the client-side Zookeeper code also use HttpClient
ConnectionPooling for its Connection Pool?  Below is the Exception that
shows up in the log file when this happens.  When we execute queries we are
using the _route_ parameter, could this explain anything?

o.a.zookeeper.ClientCnxn - Session 0x0 for server
aweqca3utmtc10.cloud..com/10.22.10.107:9983, unexpected error, closing
socket connection and attempting reconnect

java.io.IOException: Connection reset by peer

at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_55]

at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
~[na:1.7.0_55]

at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
~[na:1.7.0_55]

at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[na:1.7.0_55]

at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
~[na:1.7.0_55]

at
org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
~[zookeeper-3.4.6.jar:3.4.6-1569965]

at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
~[zookeeper-3.4.6.jar:3.4.6-1569965]

at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
~[zookeeper-3.4.6.jar:3.4.6-1569965]


Will try to get the server code upgraded to 4.10.2.



On Sat, Dec 6, 2014 at 3:52 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 12/6/2014 12:09 PM, JoeSmith wrote:
  We are currently using CloudSolrServer, but it looks like this class is not
  thread-safe (setDefaultCollection). Should this instance be initialized
  once (at startup) and then re-used (in all threads) until shutdown when the
  process terminates?  Or should it be re-instantiated for each request?
 
  Currently, we are trying to use CloudSolrServer as a singleton, but it
  looks like the connections to the host are not being closed and under load
  we start getting failures.  In the Zookeeper logs we see this error:
 
  WARN  - 2014-12-04 10:09:14.364;
  org.apache.zookeeper.server.NIOServerCnxnFactory; Too many connections from
  /11.22.33.44 - max is 60
 
  netstat (on the Zookeeper host) shows that the connections are not being
  closed. What is the 'correct' way to fix this?   Apologies if I have missed
  any documentation that explains this; pointers would be helpful.

 All SolrServer implementations in SolrJ, including CloudSolrServer, are
 supposed to be threadsafe.  If it turns out they're not actually
 threadsafe, then we treat that as a bug.  The discussion to determine
 that it's a bug takes place on this mailing list, and once we determine
 that, the next step is to file an issue in Jira.

 The general way to use SolrJ is to initialize the server instance at the
 beginning and re-use it for all client communication to Solr.  With
 CloudSolrServer, you normally only need a single server instance to talk
 to the entire cloud, because you can set the collection parameter on
 each request to indicate which collection to work on.  If you only have
 a handful of collections, you might want to use multiple instances and
 use setDefaultCollection  to specify the collection.  With
 HttpSolrServer, an instance is required for each core, because the core
 name is in the initialization URL.
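
 The reuse pattern described above can be illustrated without SolrJ at all.
 In the following Python sketch, FakeCloudClient is a hypothetical stand-in
 (not a real API) that merely counts how many client instances are created;
 the ZooKeeper host string is also made up. One shared instance serves every
 thread, and the collection is passed per request:

```python
import threading


class FakeCloudClient:
    """Hypothetical stand-in for a cloud client; it only counts instances."""

    opened = 0
    _lock = threading.Lock()

    def __init__(self, zk_host):
        self.zk_host = zk_host
        with FakeCloudClient._lock:
            # Modelled as one ZooKeeper connection per instance
            # (an assumption of this sketch).
            FakeCloudClient.opened += 1

    def query(self, collection, q):
        # The collection travels with each request, so no shared state
        # on the client needs to change between calls.
        return "%s: %s" % (collection, q)


# Created once at startup and shared by every worker thread.
CLIENT = FakeCloudClient("zk1:2181,zk2:2181,zk3:2181")


def worker(n, results):
    results[n] = CLIENT.query("collection1", "id:%d" % n)


results = {}
threads = [threading.Thread(target=worker, args=(n, results)) for n in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(FakeCloudClient.opened, len(results))
```

 Creating a new client inside worker instead would raise the instance count
 to one per request, which is the kind of growth a per-host connection limit
 is meant to catch.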

 I've not looked at the code, but I can't imagine that the client ever
 needs to make more than one connection to each server in the zookeeper
 ensemble.  Here's a list of the open connections on one of my zookeeper
 servers for my SolrCloud 4.2.1 install:

 java    21800 root   21u  IPv6   2836983  0t0  TCP 10.8.0.151:50178->10.8.0.152:2888 (ESTABLISHED)
 java    21800 root   22u  IPv6   2661097  0t0  TCP 10.8.0.151:3888->10.8.0.152:34116 (ESTABLISHED)
 java    21800 root   26u  IPv6  28065088  0t0  TCP 10.8.0.151:2181->10.8.0.141:52583 (ESTABLISHED)
 java    21800 root   27u  IPv6  23967470  0t0  TCP 10.8.0.151:2181->10.8.0.152:49436 (ESTABLISHED)
 java    21800 root   28r  IPv6  23969636  0t0  TCP 10.8.0.151:2181->10.8.0.151:57290 (ESTABLISHED)
 java    21800 root   29r  IPv6  23969951  0t0  TCP 10.8.0.151:3888->10.8.0.153:54721 (ESTABLISHED)

 The 151, 152, and 153 addresses are my ZK servers, with Solr also
 running on 151 and 152.  The 141 address is the SolrJ client.  The main
 ZK port is 2181, with ports 2888 and 3888 used for internal zookeeper
 communication.  I actually would have expected to see two client
 connections from .141 ... one for the indexer program and one 
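
 When checking for connection build-up, a listing like the one above can be
 summarised per remote host. A small Python sketch, assuming lsof-style lines
 in which local and remote endpoints are separated by ->, and treating 2181
 as the ZooKeeper client port as in this setup; the function name is
 illustrative:

```python
import re
from collections import Counter

# Sample lines in lsof format, using the addresses from the listing above.
LSOF = """\
java    21800 root   26u  IPv6  28065088  0t0  TCP 10.8.0.151:2181->10.8.0.141:52583 (ESTABLISHED)
java    21800 root   27u  IPv6  23967470  0t0  TCP 10.8.0.151:2181->10.8.0.152:49436 (ESTABLISHED)
java    21800 root   28r  IPv6  23969636  0t0  TCP 10.8.0.151:2181->10.8.0.151:57290 (ESTABLISHED)
java    21800 root   29r  IPv6  23969951  0t0  TCP 10.8.0.151:3888->10.8.0.153:54721 (ESTABLISHED)
"""

# Captures the local port and the remote address of an established connection.
PATTERN = re.compile(r"TCP \S+:(\d+)->(\S+):\d+ \(ESTABLISHED\)")


def clients_per_host(lsof_text, zk_port=2181):
    """Count established connections to zk_port, grouped by remote address."""
    counts = Counter()
    for line in lsof_text.splitlines():
        match = PATTERN.search(line)
        if match and int(match.group(1)) == zk_port:
            counts[match.group(2)] += 1
    return counts


counts = clients_per_host(LSOF)
for host, n in sorted(counts.items()):
    print(host, n)
```

 A host whose count keeps climbing towards the configured maximum (60 in the
 warning quoted earlier) is the one creating new sessions instead of reusing
 one.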

Re: CloudSolrServer, concurrency and too many connections

2014-12-07 Thread Shawn Heisey
On 12/7/2014 9:11 PM, JoeSmith wrote:
 I've upgraded to 4.10.2 on the client-side.  Still seeing this connection
 problem when connecting to the Zookeeper port.  If I connect directly to
 SolrServer, the connections do not increase.  But when connecting to
 Zookeeper, the connections increase up to 60 and then start to fail.  I
 understand Zookeeper is configured to fail after 60 connections to prevent
 a DOS attack, but I don't see why we keep adding new connections (up to
 60).  Does the client-side Zookeeper code also use HttpClient
 ConnectionPooling for its Connection Pool?  Below is the Exception that
 shows up in the log file when this happens.  When we execute queries we are
 using the _route_ parameter, could this explain anything?

The docs say that Zookeeper uses NIO communication directly by default,
so there's no layer like HttpClient.  I don't think it uses pooling ...
it does everything over a single TCP connection that doesn't normally
disconnect until the program exits.

Basically, the Zookeeper authors built their own networking layer that
uses TCP directly.  You have the option of using Netty instead:

http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Communication+using+the+Netty+framework

Are you running version 3.4.6 for your zookeeper servers?  That's the
version of ZK client code you'll find in Solr 4.10.x, and the
recommended version for both the server and your SolrJ program.

The most likely reasons for the connection problems you are seeing are:

1) A bug in the networking layer of your JVM.
1a) The latest Oracle Java 7 (currently 7u72) is highly recommended.
2) A bug or misconfig in the OS TCP stack, or possibly its firewall.
3) A bug or misconfig in zookeeper.

I can't rule out the fourth possibility, but so far I think it's unlikely:

4) A bug in SolrJ that has not yet been reported or fixed.

Thanks,
Shawn



Re: Logging in Solr's DataImportHandler

2014-12-07 Thread Mikhail Khludnev
Hello Dan,

Usually it works well. Can you describe exactly how you run it, e.g.
what you downloaded and what the command line is?

On Fri, Dec 5, 2014 at 11:37 PM, Dan Davis dansm...@gmail.com wrote:

 I have a script transformer and a log transformer, and I'm not seeing the
 log messages, at least not where I expect.
 Is there any way I can simply log a custom message from within my script?
 Can the script easily interact with its container's logger?




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Boosting the score using edismax for a non empty and non indexed field.

2014-12-07 Thread S.L
Hi All,

I have a situation where I need to boost the score of a query if a field
(imageURL) in the given document is non-empty. I am using edismax, so I
know that using the bq parameter would solve the problem. However, the field
imageURL that I am trying to boost on is not indexed (stored = true and
indexed = false). Can I use the bq parameter for a non-indexed field, or
should I be looking at re-indexing after changing the schema to make this
an indexed field?

Also, my use case is such that I want the documents that have an imageURL
to be boosted so that they appear before those documents that do not have
one when sorted by score in descending order. The field in question,
imageURL, is sometimes present and sometimes not, which is why I am looking
at boosting the score of those documents that have it.
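
For illustration, this is roughly what the bq approach would look like as
request parameters. The sketch assumes imageURL has been made an indexed
field, since as far as I understand the clause imageURL:[* TO *] can only
match against indexed terms; the boost factor of 10 and the helper name are
arbitrary:

```python
from urllib.parse import urlencode


def boosted_params(user_query, boost=10):
    """edismax parameters that add a boost for documents with any imageURL."""
    return {
        "defType": "edismax",
        "q": user_query,
        # Open-ended range: matches every document that has a value in the field.
        "bq": "imageURL:[* TO *]^%d" % boost,
    }


params = boosted_params("red shoes")
print(urlencode(params))
```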

Thanks, any help and suggestions are much appreciated!


RE: How to stop Solr tokenising search terms with spaces

2014-12-07 Thread Dinesh Babu
I just tried  your suggestion

{!complexphrase}displayName:RVN Viewpoint users

Even the above did not work. Am I missing any configuration changes for this 
parser to work?

Regards,
Dinesh Babu.



-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 07 December 2014 20:49
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote:
 Thanks Yonik. This does not seem to work for me. This is what I did:

 1) q=displayName:rvn* brings me two records (a) RVN Viewpoint Users and (b) 
 RVN Project Admins

 2) {!complexphrase}"RVN*" -- Unknown query type 
 \"org.apache.lucene.search.PrefixQuery\" found in phrase query string 
 \"RVN*\"

Looks like you found a bug in this part... a prefix query being quoted when it 
doesn't need to be.

 3) {!complexphrase}RVN V* -- Does not bring any result back.

This type of query should work (and does for me).  Is it because the default 
search field does not have these terms, and you didn't specify a different 
field to search?
Try this:
{!complexphrase}displayName:"RVN V*"

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, 
off-heap data