Re: Parallelizing queries without Custom Component

2018-01-15 Thread Max Bridgewater
Thanks Emir. Looks indeed like what I need.

On Mon, Jan 15, 2018 at 11:33 AM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Max,
> It seems to me that you are looking for the grouping
> (https://lucene.apache.org/solr/guide/6_6/result-grouping.html) or field
> collapsing
> (https://lucene.apache.org/solr/guide/6_6/collapse-and-expand-results.html)
> feature.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 15 Jan 2018, at 17:27, Max Bridgewater wrote:
> >
> > Hi,
> >
> > My index is composed of product reviews. Each review contains the id of
> > the product it refers to, as well as a rating for this product and the
> > number of negative feedback items received on this product.
> >
> > {
> >   id: solr doc id,
> >   rating: number between 0 and 5,
> >   product_id: the product that is being reviewed,
> >   negative_feedback: how many negative feedbacks on this product
> > }
> >
> > The query below returns the "worst" review for the given product 7453632.
> > "Worst" is defined as rated 1 to 3 and having the highest number of
> > negative feedback items.
> >
> > /select?q=product_id:7453632&fq=rating:[1 TO 3]&sort=negative_feedback desc&rows=1
> >
> > The query works as intended. Now the challenging part is to extend this
> > query to support many product_id values. If executed with several product
> > IDs, the result should be the list of worst reviews for all the provided
> > products.
> >
> > A query of the following form would return the list of worst reviews for
> > products 7453632, 645454, and 534664:
> >
> > /select?q=product_id:[7453632,645454,534664]&fq=rating:[1 TO 3]&sort=negative_feedback desc
> >
> > Is there a way to do this in Solr without custom component?
> >
> > Thanks.
> > Max
>
>


Parallelizing queries without Custom Component

2018-01-15 Thread Max Bridgewater
Hi,

My index is composed of product reviews. Each review contains the id of the
product it refers to, as well as a rating for this product and the number of
negative feedback items received on this product.

{
   id: solr doc id,
   rating: number between 0 and 5,
   product_id: the product that is being reviewed,
   negative_feedback: how many negative feedbacks on this product
}

The query below returns the "worst" review for the given product 7453632.
"Worst" is defined as rated 1 to 3 and having the highest number of negative
feedback items.

/select?q=product_id:7453632&fq=rating:[1 TO 3]&sort=negative_feedback desc&rows=1

The query works as intended. Now the challenging part is to extend this
query to support many product_id values. If executed with several product
IDs, the result should be the list of worst reviews for all the provided
products.

A query of the following form would return the list of worst reviews for
products 7453632, 645454, and 534664:

/select?q=product_id:[7453632,645454,534664]&fq=rating:[1 TO 3]&sort=negative_feedback desc

Is there a way to do this in Solr without custom component?

Thanks.
Max
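
Editor's sketch: the grouping and collapse features suggested in the reply
could be applied to this question roughly as below. This is a minimal
sketch, not a tested Solr setup. The class and method names are
hypothetical; the request parameters (group, group.field, group.sort,
group.limit, and the {!collapse field=... max=...} filter) are the
documented ones. It assumes product_id is a single-valued field, which the
collapse parser requires.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds query strings for "worst review per product".
final class WorstReviewQuery {

    // Shared part: match any of the given products, keep only ratings 1 to 3.
    private static String base(List<Long> productIds) {
        String ids = productIds.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(" OR "));
        return "q=" + enc("product_id:(" + ids + ")")
                + "&fq=" + enc("rating:[1 TO 3]");
    }

    // Collapse variant: keep, per product_id, the document with the highest
    // negative_feedback value.
    static String collapseQuery(List<Long> productIds) {
        return base(productIds)
                + "&fq=" + enc("{!collapse field=product_id max=negative_feedback}")
                + "&sort=" + enc("negative_feedback desc");
    }

    // Grouping variant: one group per product_id, top document of each group
    // chosen by negative_feedback desc.
    static String groupingQuery(List<Long> productIds) {
        return base(productIds)
                + "&group=true&group.field=product_id"
                + "&group.sort=" + enc("negative_feedback desc")
                + "&group.limit=1";
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(7453632L, 645454L, 534664L);
        System.out.println("/select?" + collapseQuery(ids));
        System.out.println("/select?" + groupingQuery(ids));
    }
}
```

The collapse variant returns a flat list of documents, while the grouping
variant returns a grouped response, so the choice may depend on what the
client code expects.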


Do I need to declare TermVectorComponent for best MoreLikeThis results?

2017-07-12 Thread Max Bridgewater
Hi,

The MLT documentation says that for best results, the fields should have
stored term vectors in schema.xml, with:



My question: should I also create the TermVectorComponent and declare it in
the search handler?

In other words, do I have to do this in my solrconfig.xml for best results?




  
true
  
  
tvComponent
  



I am seeing continuously increasing MLT response times and I am wondering
if I am doing something wrong.

Thanks.
Max.


MoreLikeThis Clarifications

2017-06-22 Thread Max Bridgewater
I am trying to confirm my understanding of MLT after going through the
following page:
https://cwiki.apache.org/confluence/display/solr/MoreLikeThis.

Three approaches are mentioned:

1) Use it as a request handler and send text to the MoreLikeThis request
handler as needed.
2) Use it as a search component and MLT is performed on every document
returned
3) Use it as a request handler but with externally supplied text.


What are example queries in each case and what config changes are required
for each case?

There is also MLTQParser. When can I use this parser as opposed to using any
of the three approaches above?

Thanks,
Max.


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Max Bridgewater
Thanks Susheel. The challenge is that if I search for the word "between"
alone, I still get plenty of results. In a way I want the query to match
the document title exactly (up to a few characters) and the document title
match the query exactly (up to a few characters). KeywordTokenizer allows
that. But complexphrase does not seem to work with KeywordTokenizer.

On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar wrote:

> The ComplexPhraseQuery parser is what you need to look at:
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
> See the example below.
>
>
>
> http://localhost:8983/solr/techproducts/select?debugQuery=on&indent=on&q=manu:%22Bridge%20the%20gat~1%20between%20your%20skills%20and%20your%20goals%22&defType=complexphrase
>
> On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater <max.bridgewa...@gmail.com> wrote:
>
> > Hi,
> >
> > I am trying to do phrase exact match. For this, I use
> > KeywordTokenizerFactory. This basically does what I want to do. My field
> > type is defined as follows:
> >
> >  > positionIncrementGap="100">
> >   
> > 
> > 
> >   
> >   
> > 
> > 
> >   
> > 
> >
> >
> > In addition to this, I want to tolerate typos of two or three letters. I
> > thought fuzzy search could allow me to accept this margin of error. But
> > this doesn't seem to work.
> >
> > A typical query I would have is:
> >
> > q=subjet:"Bridge the gap between your skills and your goals"
> >
> > Now, in this query, if I replace gap with gat, I was hoping I could do
> > something such as:
> >
> > q=subjet:"Bridge the gat between your skills and your goals"~0.8
> >
> > But this doesn't quite do what I am trying to achieve.
> >
> > Any suggestion?
> >
>


Phrase Exact Match with Margin of Error

2017-06-15 Thread Max Bridgewater
Hi,

I am trying to do phrase exact match. For this, I use
KeywordTokenizerFactory. This basically does what I want to do. My field
type is defined as follows:


  


  
  


  



In addition to this, I want to tolerate typos of two or three letters. I
thought fuzzy search could allow me to accept this margin of error. But
this doesn't seem to work.

A typical query I would have is:

q=subjet:"Bridge the gap between your skills and your goals"

Now, in this query, if I replace gap with gat, I was hoping I could do
something such as:

q=subjet:"Bridge the gat between your skills and your goals"~0.8

But this doesn't quite do what I am trying to achieve.

Any suggestion?
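
Editor's sketch: in the standard query syntax, a "~" after a quoted phrase
is read as phrase slop, not as fuzziness, which may explain why the query
above does not behave as hoped. Since KeywordTokenizer indexes the whole
title as one term, one option that might work is a fuzzy query on that
single term, with the spaces escaped and an edit distance of at most 2
(the maximum Lucene supports). The helper below only builds such a query
string; the class and method names are hypothetical, and whether it
matches as intended depends on the field's analysis chain, which is not
shown in the message.

```java
// Hypothetical helper: builds a single-term fuzzy query for a field whose
// whole value is indexed as one token (e.g. via KeywordTokenizerFactory).
final class FuzzyTitleQuery {

    // Characters with special meaning in the Lucene/Solr query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    static String fuzzyTerm(String field, String title, int maxEdits) {
        if (maxEdits < 0 || maxEdits > 2) {
            throw new IllegalArgumentException("maxEdits must be 0, 1 or 2");
        }
        StringBuilder sb = new StringBuilder(field).append(':');
        for (char c : title.toCharArray()) {
            // Escape query syntax characters and whitespace so the parser
            // treats the whole title as one term instead of several clauses.
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('~').append(maxEdits).toString();
    }

    public static void main(String[] args) {
        System.out.println("q=" + fuzzyTerm("subjet",
                "Bridge the gat between your skills and your goals", 2));
    }
}
```

Because the edit distance is capped at 2, this would cover typos of one or
two characters in total across the title, not three.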


Invoking a SearchHandler inside Solr Plugin

2017-04-11 Thread Max Bridgewater
I am looking for best practices when a search component in one handler
needs to invoke another handler, say /basic. So far, I have this working
prototype:

public void process(ResponseBuilder rb) throws IOException {
  SolrQueryResponse response = new SolrQueryResponse();
  // Parameters for the nested request; q is taken from the outer request.
  ModifiableSolrParams params = new ModifiableSolrParams();
  params.add("defType", "lucene")
        .add("fl", "product_id")
        .add("wt", "json")
        .add("df", "competitor_product_titles")
        .add("echoParams", "explicit")
        .add("q", rb.req.getParams().get("q"));
  SolrQueryRequest request = new LocalSolrQueryRequest(rb.req.getCore(), params);
  try {
    SolrRequestHandler hdlr = rb.req.getCore().getRequestHandler("/basic");
    rb.req.getCore().execute(hdlr, request, response);
    DocList docList = ((ResultContext) response.getValues().get("response")).docs;
    // Do some crazy stuff with the result
  } finally {
    request.close(); // release resources held by the local request
  }
}


My concerns:

1) What is a clean way to read the /basic handler's default parameters
from solrconfig.xml and use them in LocalSolrQueryRequest().
2) Is there a better way to accomplish this task overall?


Thanks,
Max.


Re: Query.extractTerms disappeared from 5.1.0 to 5.2.0

2017-02-01 Thread Max Bridgewater
Perfect. Thanks a lot.

On Wed, Feb 1, 2017 at 2:01 PM, Alan Woodward  wrote:

> Hi, extractTerms() is now on Weight rather than on Query.
>
> Alan
>
> > On 1 Feb 2017, at 17:43, Max Bridgewater wrote:
> >
> > Hi,
> >
> > It seems Query.extractTerms() disappeared between 5.1.0
> > (http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/Query.html)
> > and 5.2.0
> > (http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/Query.html).
> >
> > However, I cannot find any comment on it in the 5.2.0 release notes. Any
> > recommendation on what I should use in place of that method? I am
> > migrating some legacy code from Solr 4 to Solr 6.
> >
> > Thanks,
> > Max.
>
>


Query.extractTerms disappeared from 5.1.0 to 5.2.0

2017-02-01 Thread Max Bridgewater
Hi,

It seems Query.extractTerms() disappeared between 5.1.0
(http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/Query.html)
and 5.2.0
(http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/Query.html).

However, I cannot find any comment on it in the 5.2.0 release notes. Any
recommendation on what I should use in place of that method? I am migrating
some legacy code from Solr 4 to Solr 6.

Thanks,
Max.


Solr 6 Default Core URL

2016-12-13 Thread Max Bridgewater
I have one Solr core on my Solr 6 instance and I can query it with:

http://localhost:8983/solr/mycore/search?q=*:*

Is there a way to configure Solr 6 so that I can query it with this
simpler URL?

http://localhost:8983/search?q=*:*


Thanks.
Max.


Re: Solr 6 Performance Suggestions

2016-11-28 Thread Max Bridgewater
Thanks again, folks. I tried each suggestion and none made any difference. I
am setting up a lab for performance monitoring using AppDynamics.
Hopefully I will be able to figure something out.

On Mon, Nov 28, 2016 at 11:20 AM, Erick Erickson wrote:

> bq: If you know the maximum size you ever will need, setting Xmx is good.
>
> Not quite sure what you're getting at here. I pretty much guarantee that a
> production system will eat up the default heap size, so not setting Xmx
> will cause OOM errors pretty soon. Or did you mean Xms?
>
> As far as setting Xms, there are differing opinions, mostly though since
> Solr likes memory so much there's a lot of tuning to try to determine Xmx
> and it's pretty much guaranteed that Java will need close to that amount
> of memory. So setting Xms=Xmx is a minor optimization if that assumption
> is true. It's arguable how much practical difference it makes though.
>
> Best,
> Erick
>
> On Mon, Nov 28, 2016 at 2:14 AM, Florian Gleixner  wrote:
> > Am 28.11.2016 um 00:00 schrieb Shawn Heisey:
> >>
> >> On 11/27/2016 12:51 PM, Florian Gleixner wrote:
> >>>
> >>> On 22.11.2016 14:54, Max Bridgewater wrote:
> >>>>
> >>>> test cases were exactly the same, the machines were exactly the same
> >>>> and heap settings exactly the same (Xms24g, Xmx24g). Requests were
> >>>> sent with
> >>>
> >>> Setting heap too large is a common error. Recent Solr versions use the
> >>> filesystem cache, so you don't have to set heap to the size of the
> >>> index. The available RAM has to be able to run the OS, run the JVM and
> >>> hold most of the index data in filesystem cache. If you have 32GB RAM
> >>> and a 20GB index, then set -Xms never higher than 10GB. I personally
> >>> would set -Xms to 4GB and omit -Xmx
> >>
> >>
> >> In my mind, the Xmx setting is much more important than Xms.  Setting
> >> both to the same number avoids any need for Java to detect memory
> >> pressure before increasing the heap size, which can be helpful.
> >>
> >
> > From https://cwiki.apache.org/confluence/display/solr/JVM+Settings
> >
> > "The maximum heap size, set with -Xmx, is more critical. If the memory
> > heap grows to this size, object creation may begin to fail and throw
> > OutOfMemoryException. Setting this limit too low can cause spurious
> > errors in your application, but setting it too high can be detrimental
> > as well."
> >
> > You are right, Xmx is more important. But setting Xms to Xmx will waste
> > RAM that the OS can use to cache your index data. Setting Xmx can avoid
> > problems in some situations where Solr can eat up your filesystem cache
> > until the next GC has finished.
> >
> >> Without Xmx, Java is in control of the max heap size, and it may not
> >> make the correct choice.  It's important to know what your max heap is,
> >> because chances are excellent that the max heap *will* be reached.  Solr
> >> allocates a lot of memory to do its job.
> >>
> >
> > If you know the maximum size you ever will need, setting Xmx is good.
> >
> >
> >
> >
>
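
Editor's sketch: the thread above stresses knowing what the maximum heap
actually is. One way to check what a JVM ended up with, under whatever
-Xms/-Xmx flags it was started with, is to ask the Runtime. The class name
below is hypothetical; Runtime.maxMemory(), totalMemory() and freeMemory()
are standard JDK calls. This reports the JVM it runs in, so it would have
to be started with the same flags as Solr to be comparable.

```java
// Hypothetical helper: reports the heap limits of the current JVM.
final class HeapReport {

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    static String describe(Runtime rt) {
        // maxMemory: upper bound the heap may grow to (roughly -Xmx)
        // totalMemory: heap currently committed (starts near -Xms)
        // freeMemory: unused part of the committed heap
        return String.format("max=%d MiB, committed=%d MiB, free=%d MiB",
                toMiB(rt.maxMemory()), toMiB(rt.totalMemory()), toMiB(rt.freeMemory()));
    }

    public static void main(String[] args) {
        System.out.println(describe(Runtime.getRuntime()));
    }
}
```

For example, running it as "java -Xms4g -Xmx24g HeapReport.java" should
show a committed size near 4 GiB and a maximum near 24 GiB, while
-Xms24g -Xmx24g should show both near 24 GiB.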


Re: Solr 6 Performance Suggestions

2016-11-25 Thread Max Bridgewater
Thanks folks. It looks like the sweet spot where I get comparable results
is at 30 concurrent threads. It progressively degrades from there as I
increase the number of concurrent threads in the test script.

This made me think that something is configured in Tomcat (Solr 4) that is
not set comparably in Solr 6. The only thing I found that would make
sense is the connector's maximum number of threads, which we have set at 800
for Tomcat. However, in jetty.xml, maxThreads is set to 5. Not sure if
these two maxThreads settings have the same effect.

I thought about Yonik's suggestion a little bit. Where I am scratching my
head is that if specific kinds of queries were more expensive than others,
shouldn't this be reflected even at 30 concurrent threads?

Anyway, still digging.

On Wed, Nov 23, 2016 at 9:56 AM, Walter Underwood wrote:

> I recently ran benchmarks on 4.10.4 and 6.2.1 and found very little
> difference in query performance.
>
> This was with 8 million documents (homework problems) from production. I
> used query logs from production. The load is a constant number of requests
> per minute from 100 threads. CPU usage is under 50% in order to avoid
> congestion. The benchmarks ran for 100 minutes.
>
> Measuring median and 95th percentile, the times were within 10%. I think
> that is within the repeatability of the benchmark. A different number of
> GCs could make that difference.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Nov 23, 2016, at 8:14 AM, Bram Van Dam  wrote:
> >
> > On 22/11/16 15:34, Prateek Jain J wrote:
> >> I am not sure, but I heard in one of the discussions that you can't
> >> migrate directly from Solr 4 to Solr 6. It has to be incremental, like
> >> Solr 4 to Solr 5 and then to Solr 6. I might be wrong, but it is worth
> >> trying.
> >
> > Ideally the index needs to be upgraded using the IndexUpgrader.
> >
> > Something like this should do the trick:
> >
> > java -cp lucene-core-6.0.0.jar:lucene-backward-codecs-6.0.0.jar
> > org.apache.lucene.index.IndexUpgrader /path/to/index
> >
> > - Bram
>
>


Solr 6 Performance Suggestions

2016-11-22 Thread Max Bridgewater
I migrated an application from Solr 4 to Solr 6. solrconfig.xml and
schema.xml are essentially the same. The JVM params are also pretty much
the same. The indices each have about 2 million documents. No particular
tuning was done to Solr 6 beyond the default settings. Solr 4 is running in
Tomcat 7.

Early results seem to show Solr 4 outperforming Solr 6. The first shows an
average response time of 280 ms while the second averages 430 ms. The
test cases were exactly the same, the machines were exactly the same and
heap settings exactly the same (Xms24g, Xmx24g). Requests were sent with
JMeter with 50 concurrent threads for 2h.

I know that this is not enough information to claim that Solr 4 generally
outperforms Solr 6. I also know that this pretty much depends on what the
application does. So I am not claiming anything general. All I want to do
is get some input before I start digging.

What are some things I could tune to improve the numbers for Solr 6? Have
you guys experienced such discrepancies?

Thanks,
Max.


Re: Edismax query parsing in Solr 4 vs Solr 6

2016-11-12 Thread Max Bridgewater
Hi Greg,

Your analysis is SPOT ON. I did some debugging and found out that we had
q.op in the defaults set to AND. When I changed that to OR, things
worked exactly as in Solr 4. So it seems Solr 6 was behaving as it
should. What I could not explain was whether Solr 4 was using the
q.op configured in the defaults or not. But your explanation
makes sense now.

Thanks,
Max.



On Sat, Nov 12, 2016 at 4:54 PM, Greg Pendlebury wrote:

> This has come up a lot on the lists lately. Keep in mind that edismax
> parses your query using additional parameters such as 'mm' and 'q.op'. It is
> the handling of these parameters (and the selection of default values)
> which has changed between versions to address a few functionality gaps.
>
> The most common issue I've seen is where users were not setting those
> values and relying on the defaults. You might now need to set them
> explicitly to return to desired behaviour.
>
> I can't see all of your configuration, but I'm guessing the important one
> here is 'q.op', which was previously hard coded to 'OR', irrespective of
> either parameters or solrconfig. Try setting that to 'OR' explicitly...
> maybe you have your default operator set to 'AND' in solrconfig and that is
> now being applied? The other option is 'mm', which I suspect should be set
> to '0' unless you have some reason to want it. If it was set to '100%' it
> might insert the additional '+' flags, but it can also show up as a '~'
> operator on the end.
>
> Ta,
> Greg
>
> On 8 November 2016 at 22:13, Max Bridgewater wrote:
>
> > I am migrating a Solr-based app from Solr 4 to Solr 6. One of the
> > discrepancies I am noticing is around edismax query parsing. My code
> > makes the following call:
> >
> >
> >  userQuery="+(title:shirts isbn:shirts) +(id:20446 id:82876)"
> >   Query query=QParser.getParser(userQuery, "edismax", req).getQuery();
> >
> >
> > With Solr 4, query becomes:
> >
> > +(+(title:shirt isbn:shirts) +(id:20446 id:82876))
> >
> > With Solr 6 it however becomes:
> >
> > +(+(+title:shirt +isbn:shirts) +(+id:20446 +id:82876))
> >
> > Digging deeper, it appears that parseOriginalQuery() in
> > ExtendedDismaxQParser is adding those additional + signs.
> >
> >
> > Is there a way to prevent this altering of queries?
> >
> > Thanks,
> > Max.
> >
>
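
Editor's sketch: the advice above is to stop relying on defaults and pass
q.op (and possibly mm) explicitly. A minimal illustration of building such
a request is below. The class and method names are hypothetical; the
parameter names defType, q, q.op and mm are the ones discussed in the
thread, and the values OR and 0 are the ones Greg suggests trying, not
values known to be right for every application.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: edismax request parameters with explicit q.op and mm.
final class EdismaxParams {

    static Map<String, String> explicitDefaults(String userQuery) {
        Map<String, String> p = new LinkedHashMap<>(); // keeps insertion order
        p.put("defType", "edismax");
        p.put("q", userQuery);
        p.put("q.op", "OR"); // do not inherit an AND default from solrconfig
        p.put("mm", "0");    // no minimum-should-match requirement
        return p;
    }

    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String userQuery = "+(title:shirts isbn:shirts) +(id:20446 id:82876)";
        System.out.println("/select?" + toQueryString(explicitDefaults(userQuery)));
    }
}
```

When the parser is created in plugin code, as in the question, the same
two parameters would need to be present on the request passed to the
parser rather than on a URL.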


Edismax query parsing in Solr 4 vs Solr 6

2016-11-08 Thread Max Bridgewater
I am migrating a Solr-based app from Solr 4 to Solr 6. One of the
discrepancies I am noticing is around edismax query parsing. My code makes
the following call:


 userQuery="+(title:shirts isbn:shirts) +(id:20446 id:82876)"
  Query query=QParser.getParser(userQuery, "edismax", req).getQuery();


With Solr 4, query becomes:

+(+(title:shirt isbn:shirts) +(id:20446 id:82876))

With Solr 6 it however becomes:

+(+(+title:shirt +isbn:shirts) +(+id:20446 +id:82876))

Digging deeper, it appears that parseOriginalQuery() in
ExtendedDismaxQParser is adding those additional + signs.


Is there a way to prevent this altering of queries?

Thanks,
Max.


BooleanQuery Migration from Solr 4 to Solr 6

2016-07-18 Thread Max Bridgewater
Hi Folks,

I am tasked with migrating a Solr app from Solr 4 to Solr 6. This solr app
is in essence a bunch of solr components/handlers. One part that challenges
me is BooleanQuery immutability in Solr 6.

Here is the challenge: In our old code base, we had classes that
implemented custom interfaces and extended BooleanQuery. These custom
interfaces were essentially markers that told our various components where
the user came from. Based on the user's origin, different pieces of logic
would apply.

Now, in Solr 6, our custom boolean query can no longer extend BooleanQuery
since BooleanQuery only has a private constructor. I am looking for a clean
solution to this problem.

Here are some ideas I had:

1) Remove the logic that depends on the custom boolean query => Big risk to
our search logic.
2) Simply remove BooleanQuery as the superclass of the custom boolean query =>
Major risk. Wherever we do "if (query instanceof BooleanQuery)", we would
not catch our custom queries.
3) Remove BooleanQuery as parent of the custom query (e.g. make it extend
Query) AND refactor to move all "if (query instanceof BooleanQuery)" checks
into a dedicated method: isCustomBooleanQuery. This would return "query
instanceof BooleanQuery || query instanceof CustomQuery". We then need to
change ALL 20 occurrences of this test and ensure we handle both cases
appropriately. => Very invasive.
4) Add a method createCustomQuery() that would return a boolean query
wherein a special clause is added that allows us to identify our custom
queries. This special clause should not impact search results. => Pretty
ugly.


Is there another clean, low-risk, and less invasive solution?


Max.
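
Editor's sketch: option 3 above could look roughly like this, with the
custom query wrapping a BooleanQuery instead of extending it. Everything
here is a stand-in so the example runs without Lucene: Query and
BooleanQuery are empty placeholder classes, not the real Lucene types, and
OriginMarker, CustomQuery and asBoolean are hypothetical names. Only
isCustomBooleanQuery is taken from the message.

```java
// Stand-in types so the sketch is self-contained; NOT the Lucene classes.
final class QueryChecks {

    static class Query { }

    static final class BooleanQuery extends Query { }

    // Hypothetical marker interface telling components where the user came from.
    interface OriginMarker {
        String origin();
    }

    // Custom query that wraps (rather than extends) a BooleanQuery.
    static final class CustomQuery extends Query implements OriginMarker {
        private final BooleanQuery delegate;
        private final String origin;

        CustomQuery(BooleanQuery delegate, String origin) {
            this.delegate = delegate;
            this.origin = origin;
        }

        BooleanQuery delegate() {
            return delegate;
        }

        @Override
        public String origin() {
            return origin;
        }
    }

    // Single place that replaces the scattered "instanceof BooleanQuery" tests.
    static boolean isCustomBooleanQuery(Query q) {
        return q instanceof BooleanQuery || q instanceof CustomQuery;
    }

    // Gives callers the underlying BooleanQuery in both cases.
    static BooleanQuery asBoolean(Query q) {
        if (q instanceof BooleanQuery) {
            return (BooleanQuery) q;
        }
        if (q instanceof CustomQuery) {
            return ((CustomQuery) q).delegate();
        }
        throw new IllegalArgumentException("not a boolean-like query");
    }

    public static void main(String[] args) {
        Query q = new CustomQuery(new BooleanQuery(), "mobile");
        System.out.println(isCustomBooleanQuery(q) + " " + ((OriginMarker) q).origin());
    }
}
```

With real Lucene types, a wrapper like this would also have to implement
whatever Query requires in that version (for example rewriting to the
wrapped query), which this sketch leaves out.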


Determine Containing Handler

2016-05-19 Thread Max Bridgewater
Hi,

I am implementing a component that needs to redirect calls to the handler
that originally called it. Say the call comes to handler /search; the
component would then do some processing, alter the query, and then send
the query back to /search again.

It works great. The only issue is that the handler is not always called
/search, leading me to have to force people to pass the handler name as
parameter to the component, which is not ideal.

The question thus is: is there a way to find out what handler a component
was invoked from?

I checked SolrCore and SolrQueryRequest, but I can't seem to find a method
that would do this.

Thanks,
Max.


Re: Function Query Parsing problem in Solr 5.4.1 and Solr 5.5.0

2016-04-02 Thread Max Bridgewater
Thank you Mike, that was it.

Max.

On Sat, Apr 2, 2016 at 2:40 AM, Mikhail Khludnev  wrote:

> Hello Max,
>
> Since it reports the first space occurrence at pos=32, I advise nuking all
> spaces between the parentheses in sum().
>
> On Fri, Apr 1, 2016 at 7:40 PM, Max Bridgewater wrote:
>
> > Hi,
> >
> > I have the following configuration for firstSearcher handler in
> > solrconfig.xml:
> >
> >
> >   
> >   
> > 
> >   parts
> >   score desc, Review1 asc, Rank2 asc
> > 
> > 
> >   make
> >   {!func}sum(product(0.01,param1),
> > product(0.20,param2),  min(param2,0.4)) desc
> > 
> >   
> > 
> >
> > This works great in Solr 4.10. However, in Solr 5.4.1 and Solr 5.5.0, I
> > get the below error. How do I write this kind of query with Solr 5?
> >
> >
> > Thanks,
> > Max.
> >
> >
> > ERROR org.apache.solr.handler.RequestHandlerBase  [   x:productsearch] –
> > org.apache.solr.common.SolrException: Can't determine a Sort Order (asc
> > or desc) in sort spec '{!func}sum(product(0.01,param1),
> > product(0.20,param2), min(param2,0.4)) desc', pos=32
> >     at org.apache.solr.search.SortSpecParsing.parseSortSpec(SortSpecParsing.java:143)
> >     at org.apache.solr.search.QParser.getSort(QParser.java:247)
> >     at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:187)
> >     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:247)
> >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
> >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2073)
> >     at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:69)
> >     at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1840)
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> 
>
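
Editor's sketch: Mikhail's fix is to remove the spaces inside the function
call so that the only remaining space is the one before "desc". Done by
hand it is a one-line change; the helper below shows the same
transformation in code, dropping whitespace only while inside parentheses.
The class and method names are hypothetical, and the helper does not try
to handle quoted string arguments that contain spaces.

```java
// Hypothetical helper: removes whitespace inside parentheses of a sort spec,
// leaving the space before the sort direction untouched.
final class SortSpecCleaner {

    static String stripSpacesInsideParens(String spec) {
        StringBuilder out = new StringBuilder(spec.length());
        int depth = 0; // current nesting level of parentheses
        for (int i = 0; i < spec.length(); i++) {
            char c = spec.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            if (depth > 0 && Character.isWhitespace(c)) {
                continue; // skip spaces and line breaks within the function call
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String spec = "{!func}sum(product(0.01,param1), product(0.20,param2),  min(param2,0.4)) desc";
        System.out.println(stripSpacesInsideParens(spec));
        // prints: {!func}sum(product(0.01,param1),product(0.20,param2),min(param2,0.4)) desc
    }
}
```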


Function Query Parsing problem in Solr 5.4.1 and Solr 5.5.0

2016-04-01 Thread Max Bridgewater
Hi,

I have the following configuration for firstSearcher handler in
solrconfig.xml:


  
  

  parts
  score desc, Review1 asc, Rank2 asc


  make
  {!func}sum(product(0.01,param1),
product(0.20,param2),  min(param2,0.4)) desc

  


This works great in Solr 4.10. However, in Solr 5.4.1 and Solr 5.5.0, I get
the below error. How do I write this kind of query with Solr 5?


Thanks,
Max.


ERROR org.apache.solr.handler.RequestHandlerBase  [   x:productsearch] –
org.apache.solr.common.SolrException: Can't determine a Sort Order (asc or
desc) in sort spec '{!func}sum(product(0.01,param1), product(0.20,param2),
min(param2,0.4)) desc', pos=32
    at org.apache.solr.search.SortSpecParsing.parseSortSpec(SortSpecParsing.java:143)
    at org.apache.solr.search.QParser.getSort(QParser.java:247)
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:187)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:247)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2073)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:69)
    at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1840)


Re: Load Resource from within Solr Plugin

2016-03-31 Thread Max Bridgewater
Hi Folks,

Thanks for all the great suggestions. I will try them and see which one works
best.
@Hoss: The WEB-INF folder is just in my dev environment. I have a local
Solr instance and I point it to target/WEB-INF. It is a simple, convenient
setup for development purposes.

Much appreciated.

Max.

On Wed, Mar 30, 2016 at 4:24 PM, Rajesh Hazari wrote:

> Max,
> Have you looked at the external file field, which is reloaded on every hard
> commit? The only disadvantage is that the file (personal-words.txt) has to
> be placed in the data folder of each Solr core, for which we have a bash
> script.
>
>
> https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes
>
> Ignore this if it does not meet your requirement.
>
> *Rajesh**.*
>
> On Wed, Mar 30, 2016 at 1:21 PM, Chris Hostetter wrote:
>
> > :
> > :  > : regex=".*\.jar" />
> >
> > 1) as a general rule, if you have a  declaration which includes
> > "WEB-INF" you are probably doing something wrong.
> >
> > Maybe not in this case -- maybe "search-webapp/target" is a completely
> > distinct java application and you are just re-using its jars.  But 9
> > times out of 10, when people have a  WEB-INF path they are trying to load
> > jars from, it's because they *first* added their jars to Solr's WEB-INF
> > directory, and then when that didn't work they added the path to the
> > WEB-INF dir as a  ... but now you've got those classes being loaded
> > twice, and you've multiplied all of your problems.
> >
> > 2) let's ignore the fact that your path has WEB-INF in it, and just
> > assume it's some path to somewhere where on disk that has nothing to
> > do with solr, and you want to load those jars.
> >
> > great -- solr will do that for you, and all of those classes will be
> > available to plugins.
> >
> > Now if you want to explicitly do something classloader related, you do
> > *not* want to be using Thread.currentThread().getContextClassLoader() ...
> > because the threads that execute everything in Solr are a pool of worker
> > threads that is created before solr ever has a chance to parse your  > /> directive.
> >
> > You want to ensure anything you do related to a Classloader uses the
> > ClassLoader Solr sets up for plugins -- that's available from the
> > SolrResourceLoader.
> >
> > You can always get the SolrResourceLoader via
> > SolrCore.getResourceLoader().  from there you can getClassLoader() if
> > you really need some hairy custom stuff -- or if you are just trying to
> > load a simple resource file as an InputStream, use openResource(String
> > name) ... that will start by checking for it in the conf dir, and will
> > fallback to your jar -- so you can have a default resource file shipped
> > with your plugin, but allow users to override it in their collection
> > configs.
> >
> >
> > -Hoss
> > http://www.lucidworks.com/
> >
>
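
Editor's sketch: the point Hoss makes is that a resource shipped in a
plugin jar is visible through the class loader that loaded the plugin, not
necessarily through the thread's context class loader. The plain-JDK demo
below reproduces that effect without Solr: a URLClassLoader over a
temporary directory stands in for Solr's plugin class loader, and all
class and method names are hypothetical. Inside a real plugin, the
equivalent step would be the openResource(...) call on the
SolrResourceLoader described above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: same resource name, two different class loaders.
final class PluginResourceDemo {

    // Returns the first line of the resource, or null if the loader cannot see it.
    static String readFirstLine(ClassLoader loader, String name) throws IOException {
        if (loader == null) {
            return null;
        }
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                return null;
            }
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                return r.readLine();
            }
        }
    }

    // Returns { value seen via the context loader, value seen via the plugin loader }.
    static String[] demo() {
        String name = "personal-words.txt";
        try {
            // The temp directory plays the role of the plugin's lib folder.
            Path dir = Files.createTempDirectory("plugin-lib");
            Path file = dir.resolve(name);
            Files.write(file, "hello\n".getBytes(StandardCharsets.UTF_8));
            try (URLClassLoader pluginLoader = new URLClassLoader(
                    new URL[] { dir.toUri().toURL() },
                    PluginResourceDemo.class.getClassLoader())) {
                String viaContext = readFirstLine(
                        Thread.currentThread().getContextClassLoader(), name);
                String viaPlugin = readFirstLine(pluginLoader, name);
                return new String[] { viaContext, viaPlugin };
            } finally {
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("context loader: " + r[0] + ", plugin loader: " + r[1]);
    }
}
```

The context loader was created before the temporary directory existed, so
it does not see the file, which mirrors the worker-thread situation
described in the reply.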


Load Resource from within Solr Plugin

2016-03-29 Thread Max Bridgewater
Hi,

I am facing the exact issue described here:
http://stackoverflow.com/questions/25623797/solr-plugin-classloader.

Basically I'm writing a Solr plugin by extending the SearchComponent class. My
new class is part of the a.jar archive, and it depends on another jar, b.jar. I
placed both jars in my own folder and declared it in solrconfig.xml with:



I also declared my new component in solrconfig.xml. The component is
invoked correctly up to the point where a class ClassFromB from b.jar
attempts to load the resource personal-words.txt from the classpath.

The piece of code in class ClassFromB looks like this:

Thread.currentThread().getContextClassLoader().getResources("personal-words.txt")


Unfortunately, this returns an empty list. Any recommendation?


Thanks,

Max.