Re: Incorrect Guava version in maven repository

2019-03-19 Thread Shawn Heisey

On 3/19/2019 6:17 PM, Amber Liu wrote:

When I try to upgrade the Guava that Solr depends on, I notice the Guava
version listed in the Maven repository for Solr is 14.0.1
(https://mvnrepository.com/artifact/org.apache.solr/solr-core/8.0.0). I also
noticed that there is a Jira issue resolved in Solr that upgraded the Guava
dependency to 25.1 (https://issues.apache.org/jira/browse/SOLR-11763). Is
the Guava version listed in the Maven repository correct? Which Guava version
do Solr 8.0.0 and 7.5.0 depend on?


If you check that issue, it says it was fixed in 8.1 and 9.0.  Neither 
of those versions has been released yet.


All currently released versions of Solr depend on an old Guava; 
14.0.1 sounds like the right version.


Thanks,
Shawn


Re: Need help on LTR

2019-03-19 Thread Mohomed Rimash
One more thing I noticed: your feature params values aren't wrapped in a q or
qf field. Check that as well.

On Wed, 20 Mar 2019 at 01:34, Amjad Khan  wrote:

> Did, but same error
>
> {
>   "responseHeader":{
> "status":400,
> "QTime":5},
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","java.lang.NullPointerException"],
> "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> exist org.apache.solr.ltr.model.LinearModel",
> "code":400}}
>
>
>
> > On Mar 19, 2019, at 3:26 PM, Mohomed Rimash 
> wrote:
> >
> > Please update the weights values to greater than 0 and less than 1.
> >
> > On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:
> >
> >> Feature File
> >> ===
> >>
> >> [
> >>  {
> >>"store" : "exampleFeatureStore",
> >>"name" : "isCityName",
> >>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> >>"params" : { "field" : "CITY_NAME" }
> >>  },
> >>  {
> >>"store" : "exampleFeatureStore",
> >>"name" : "originalScore",
> >>"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
> >>"params" : {}
> >>  },
> >>  {
> >>"store" : "exampleFeatureStore",
> >>"name" : "isLat",
> >>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> >>"params" : { "field" : "LATITUDE" }
> >>  }
> >> ]
> >>
> >> Model File
> >> ==
> >> {
> >>  "store": "exampleFeatureStore",
> >>  "class": "org.apache.solr.ltr.model.LinearModel",
> >>  "name": "exampleModelStore",
> >>  "features": [{
> >>  "name": "isCityName"
> >>},
> >>{
> >>  "name": "isLat"
> >>},
> >>{
> >>  "name": "original_score"
> >>}
> >>  ],
> >>  "params": {
> >>"weights": {
> >>  "isCityName": 0.0,
> >>  "isLat": 0.0,
> >>  "original_score": 1.0
> >>}
> >>  }
> >> }
> >>
> >>
> >>
> >>> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
> >> wrote:
> >>>
> >>> Can you share the feature file and the model file,
> >>> 1. I had few instances where invalid values for parameters (ie weights
> >> set
> >>> to more than 1 , with minmaxnormalizer) resulted the above error,
> >>> 2, Check all the features added to the model has a weight under params
> ->
> >>> weights in the model
> >>>
> >>>
> >>> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> >>>
>  Does your feature definitions and the feature names used in the model
>  match?
> 
>  On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
> >> wrote:
> 
> > Yes, I did.
> >
> > I can see the feature that I created by this
> > schema/feature-store/exampleFeatureStore and it return me the
> features
> >> I
> > created. But issue is when I try to put store-model.
> >
> >> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> > wrote:
> >>
> >> Hi Amjad, After adding the libraries into the path, Did you restart
> >> the
> >> SOLR ?
> >>
> >> On Tue, 19 Mar 2019 at 08:45, Amjad Khan 
> wrote:
> >>
> >>> I followed the Solr LTR Documentation
> >>>
> >>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> >>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> >>>
> >>> 1. Added library into the solr-config
> >>> 
> >>>  >>> regex=".*\.jar" />
> >>>  >>> regex="solr-ltr-\d.*\.jar" />
> >>> 2. Successfully added feature
> >>> 3. Get schema to see feature is available
> >>> 4. When I try to push model I see the error below, however I added
> >> the
> > lib
> >>> into solr-cofig
> >>>
> >>> Response
> >>> {
> >>> "responseHeader":{
> >>>  "status":400,
> >>>  "QTime":1},
> >>> "error":{
> >>>  "metadata":[
> >>>"error-class","org.apache.solr.common.SolrException",
> >>>"root-error-class","java.lang.NullPointerException"],
> >>>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does
>  not
> >>> exist org.apache.solr.ltr.model.LinearModel",
> >>>  "code":400}}
> >>>
> >>> Thanks
> >
> >
> 
> >>
> >>
>
>


Incorrect Guava version in maven repository

2019-03-19 Thread Amber Liu
Hi,

When I try to upgrade the Guava that Solr depends on, I notice the Guava
version listed in the Maven repository for Solr is 14.0.1
(https://mvnrepository.com/artifact/org.apache.solr/solr-core/8.0.0). I also
noticed that there is a Jira issue resolved in Solr that upgraded the Guava
dependency to 25.1 (https://issues.apache.org/jira/browse/SOLR-11763). Is
the Guava version listed in the Maven repository correct? Which Guava version
do Solr 8.0.0 and 7.5.0 depend on?

Thanks,
Amber


Re: Solr Suggester component not returning any results

2019-03-19 Thread Zheng Lin Edwin Yeo
Hi,

Will you be able to post your configurations for your /suggest
requestHandler in solrconfig.xml?
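
For comparison, the minimal setup from the ref guide looks roughly like the
sketch below (the lookup/dictionary implementations, field, and field type are
placeholders and have to match your schema; whatever you have should be in the
same general shape):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>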

Regards,
Edwin

On Wed, 20 Mar 2019 at 02:40, Deoxyribonucleic_DNA ... <
deoxyribonucleic_...@hotmail.com> wrote:

> Hi, I'm trying to implement the suggester component in Solr,  based off
> the tutorial:
>
>
> https://lucene.apache.org/solr/guide/7_7/suggester.html#suggester-search-component-parameters
>
> I'm not getting any errors about how the suggester is set up or anything,
> but when I try to use it I get 0 results returned. Including images of the
> schema/config parts I added to my files, as shown to do in the tutorial,
> can someone take a look? Or does anyone have any suggestions based on prior
> experience or common problems?
>
> Also, I've got solr 7.6. There wasn't some change that would affect this
> just in that slight version change was there? I
>
> Thanks in advance.
>
>
>
>
>
>
> Link used in search:
>
>
> http://localhost:8984/solr/nutch/suggest?suggest=true&
> suggest.build=true&suggest.dictionary=mySuggester&suggest.q=Canad
>
> (due to documents containing "Canada" as the intended word)
>
> Result:
>
>
>


Re: Problem understanding why QPS is so low

2019-03-19 Thread Erick Erickson
The thing that jumps out at me are:

1> your filterCache. The autowarm count is very, very high. I usually start 
with about 16. This will be especially important if you open new searchers 
often, i.e. your soft commit interval or hard-commit-with-opensearcher-true. 
Essentially you’re executing 900 filter queries just as though you’d sent them 
from outside Solr every time you open a searcher. I’d also make it 
substantially smaller. It’ll suck up approximately maxDoc/8 bytes per entry. 

My fear is that you’re spending a lot of resources on this cache that are 
starving your queries…
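
To put rough numbers on that: with ~50M documents over 8 shards you have on the 
order of 6M docs per core, so each cached filter is roughly 6,000,000 / 8 ≈ 750 KB; 
1,600 entries per core, times two cores on each PULL node, can tie up a couple of 
GB of the 15 GB heap, on top of the 900 autowarm queries executed every time a new 
searcher opens. A much smaller cache and autowarm count, along these lines (a 
sketch, not a tuned value for your workload), is usually a better starting point:

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>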

What are your autocommit settings? And are you continuously indexing? If so at 
what rate?

2> A 2-4 second latency is very, very large for a corpus this size, something 
is (probably) misconfigured.

3> Document cache disabled. If you can absolutely guarantee that stored fields 
are never returned you’re probably OK. The purpose of the documentCache is to 
prevent components of the _same_ query from re-reading and decompressing 
minimum 16K blocks for each returned document. You’re right that if there are 
no stored fields this is probably useless, OTOH if there are no docs, then it 
won’t be used either, it’s not pre-allocated.

4> on a quick glance your boosts don’t appear too bad, I’d be surprised if this 
were the root cause. 

5> I think you’re oversharded unless you expect to have much larger numbers of 
documents. My rule of thumb is to expect about 50M docs/shard given reasonable 
machines. YMMV of course. If that intuition is true, you could get much greater 
throughput by having fewer shards.

6> Here’s what I’d do. I’m guessing you have some kind of load testing tool 
jMeter or the like (you _must_ use “real” queries BTW). Go ahead and set up a 
test run on a node, changing one thing at a time. For instance, you could 
remove all of the boosting and test that. Try with no indexing going on, etc.

7> Throw a profiler at it to find out where you’re spending time. My 
suggestions are guesswork at some level. It may be easier to try some things 
rather than get a profiler running on it but…

8> You could work with an isolated machine, even a local box. Just create a 
one-shard collection, then copy the index from _one_ of your shards locally to 
the right place and beat it up.

Good luck!
Erick

> On Mar 19, 2019, at 3:38 PM, Ash Ramesh  wrote:
> 
> Hi everybody,
> 
> My team run a solr cluster which has very low QPS throughput. I have been
> going through the different configurations in our setup, and think that
> it's probably the way we have defined our request handlers that is causing
> the slowness.
> 
> Details of our cluster are below the fold.
> 
> *Questions:*
> 
>   1. Obviously we have a set of 'expensive' boosts here. Are there any
>   inherent anti pattens obvious in the request handler?
>   2. Is it normal for such a request handler to max out at around 13 QPS
>   before latency starts hitting 2-4 seconds?
>   3. Have we maybe architected our cluster incorrectly?
>   4. Are there any patterns we should adopt to increase through put?
> 
> 
> Thank you so much for taking time to read this email. We would really
> appreciate any feedback. We are happy to provide more details into our
> cluster if needed.
> 
> Regards,
> 
> Ash
> 
> *Information about our Cluster:*
> 
>   - *Solr Version: *7.4.0
>   - *Architecture: *TLOG/PULL - 8 Shards (Default shard hashing)
>  - Doc Count: 50 Million ~
>  - TLOG - EC2 Machines hosting TLOGs have all 8 shards. Approximately
>  12G index total
>  - PULL - EC2 Machines host 2 shards. There are 4 ASGs such that each
>  ASG host one of the shard combinations - [shard1, shard2], [shard3,
>  shard4], [shard5, shard6], [shard7, shard8]
> - We scale up on CPU utilisation
>  - Schema: No stored fields (except for `id`)
>  - Indexing: Use the SolrJ Zookeeper Client to talk directly to TLOGs
>  to update (fully replace) documents
> - Deleted docs: Between 10-25% depending on when the merge policy
> was last executed
>  - Request Serving: PULL ASGs are wrapped around a ELB, such that we
>  use the SolrJ HTTP Client to make requests.
> - All read requests are sent with the
> '"shards.preference=replica.location:local,replica.type:PULL"' in an
> attempt to direct all traffic to PULL nodes.
> - *Average QPS* per full copy of the index (PULL nodes of
>   shard1-shard8): *13 queries per second*
>   - *Heap Size PULL: *15G
>  - Index is fully memory mapped with extra RAM to spare on all PULL
>  machines
>   - *Solr Caches:*
>  - Document Cache: Disabled - No stored fields, seems pointless
>  - Query Cache: Disabled - too many different queries no reason to use
>  this
>  - Filter Cache: 1600 in size (900 autowarm) - we have a set of well
>  defined filter queries, we are thinking of increasing this since hit rate
>  is 0.86
> 
> *Example Request Handler (Obfu

Re: is df needed for SolrCloud replication?

2019-03-19 Thread Shawn Heisey

On 3/19/2019 4:48 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:

I recently noticed that my solr.log files have been getting the following error 
message:
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name 
specified in query and no default specified via 'df' param

The timing of these messages coincides with pings to the leader node of the 
SolrCloud from other nodes of the SolrCloud (the message appears only on 
whatever node is currently the leader).

I believe that the user for whom I set up this SolrCloud intentionally removed 
df from the defaults section of solrconfig.xml (in order to streamline out 
parts of the code which he does not use).

I have not (yet) noticed any ill effects from this error. Is this error benign? 
Or shall I ask the user to reinstate df in the defaults section of 
solrconfig.xml? Or can SolrCloud replication be configured to work around any 
ill effects that there may be?


If you don't define df (which means "default field"), then every query 
must indicate which field(s) it will query, or you will see that error 
message.


It sounds like the query that is in the ping handler needs to be changed 
so it includes a field name.  Typically ping handlers use *:* for their 
query, which is special syntax for all documents, and works even when no 
fields are defined.  That query is usually extremely fast.
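
For example, a ping handler along these lines (a minimal sketch; keep whatever
other params your existing handler already has) avoids the "no field name"
error, because *:* does not need a default field:

<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="q">*:*</str>
  </lst>
</requestHandler>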


Thanks,
Shawn


is df needed for SolrCloud replication?

2019-03-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I recently noticed that my solr.log files have been getting the following error 
message:
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name 
specified in query and no default specified via 'df' param

The timing of these messages coincides with pings to the leader node of the 
SolrCloud from other nodes of the SolrCloud (the message appears only on 
whatever node is currently the leader).

I believe that the user for whom I set up this SolrCloud intentionally removed 
df from the defaults section of solrconfig.xml (in order to streamline out 
parts of the code which he does not use).

I have not (yet) noticed any ill effects from this error. Is this error benign? 
Or shall I ask the user to reinstate df in the defaults section of 
solrconfig.xml? Or can SolrCloud replication be configured to work around any 
ill effects that there may be?

Please advise


Re: Need help on LTR

2019-03-19 Thread Roopa ML
In model file replace original_score with originalScore
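
For illustration, with that rename applied the relevant parts of the model file
posted earlier would look like this (everything else unchanged):

  "features": [
    { "name": "isCityName" },
    { "name": "isLat" },
    { "name": "originalScore" }
  ],
  "params": {
    "weights": {
      "isCityName": 0.0,
      "isLat": 0.0,
      "originalScore": 1.0
    }
  }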

Roopa

Sent from my iPhone

> On Mar 19, 2019, at 2:44 PM, Amjad Khan  wrote:
> 
> Roopa,
> 
> Yes
> 
>> On Mar 19, 2019, at 11:51 AM, Roopa Rao  wrote:
>> 
>> Does your feature definitions and the feature names used in the model match?
>> 
>>> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
>>> 
>>> Yes, I did.
>>> 
>>> I can see the feature that I created by this
>>> schema/feature-store/exampleFeatureStore and it return me the features I
>>> created. But issue is when I try to put store-model.
>>> 
 On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
>>> wrote:
 
 Hi Amjad, After adding the libraries into the path, Did you restart the
 SOLR ?
 
> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> 
> I followed the Solr LTR Documentation
> 
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> 
> 1. Added library into the solr-config
> 
>  regex=".*\.jar" />
>  regex="solr-ltr-\d.*\.jar" />
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the
>>> lib
> into solr-cofig
> 
> Response
> {
> "responseHeader":{
>  "status":400,
>  "QTime":1},
> "error":{
>  "metadata":[
>"error-class","org.apache.solr.common.SolrException",
>"root-error-class","java.lang.NullPointerException"],
>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> exist org.apache.solr.ltr.model.LinearModel",
>  "code":400}}
> 
> Thanks
>>> 
>>> 
> 


Problem understanding why QPS is so low

2019-03-19 Thread Ash Ramesh
Hi everybody,

My team runs a Solr cluster which has very low QPS throughput. I have been
going through the different configurations in our setup, and I think that
it's probably the way we have defined our request handlers that is causing
the slowness.

Details of our cluster are below the fold.

*Questions:*

   1. Obviously we have a set of 'expensive' boosts here. Are there any
   inherent anti-patterns obvious in the request handler?
   2. Is it normal for such a request handler to max out at around 13 QPS
   before latency starts hitting 2-4 seconds?
   3. Have we maybe architected our cluster incorrectly?
   4. Are there any patterns we should adopt to increase throughput?


Thank you so much for taking time to read this email. We would really
appreciate any feedback. We are happy to provide more details into our
cluster if needed.

Regards,

Ash

*Information about our Cluster:*

   - *Solr Version: *7.4.0
   - *Architecture: *TLOG/PULL - 8 Shards (Default shard hashing)
  - Doc Count: 50 Million ~
  - TLOG - EC2 Machines hosting TLOGs have all 8 shards. Approximately
  12G index total
  - PULL - EC2 Machines host 2 shards. There are 4 ASGs such that each
  ASG host one of the shard combinations - [shard1, shard2], [shard3,
  shard4], [shard5, shard6], [shard7, shard8]
 - We scale up on CPU utilisation
  - Schema: No stored fields (except for `id`)
  - Indexing: Use the SolrJ Zookeeper Client to talk directly to TLOGs
  to update (fully replace) documents
 - Deleted docs: Between 10-25% depending on when the merge policy
 was last executed
  - Request Serving: PULL ASGs are wrapped around a ELB, such that we
  use the SolrJ HTTP Client to make requests.
 - All read requests are sent with the
 '"shards.preference=replica.location:local,replica.type:PULL"' in an
 attempt to direct all traffic to PULL nodes.
 - *Average QPS* per full copy of the index (PULL nodes of
   shard1-shard8): *13 queries per second*
   - *Heap Size PULL: *15G
  - Index is fully memory mapped with extra RAM to spare on all PULL
  machines
   - *Solr Caches:*
  - Document Cache: Disabled - No stored fields, seems pointless
  - Query Cache: Disabled - too many different queries no reason to use
  this
  - Filter Cache: 1600 in size (900 autowarm) - we have a set of well
  defined filter queries, we are thinking of increasing this since hit rate
  is 0.86

*Example Request Handler (Obfuscated field names and boost values)*


**

**

*  *
*  en*
*  *

*  edismax*
*  10*
*  id*
*  * _query_*

*  fieldA^0.99 fieldB^0.99 fieldC^0.99 fieldD^0.99
fieldE^0.99*
*  fieldA_$${lang}*
*  fieldB_$${lang}*
*  fieldC_$${lang}*
*  fieldD_$${lang}*
*  textContent_$${lang}*


*  2*
*  0.99*
*  true*

*  *
*  fieldA_$${lang}^0.99 fieldB_$${lang}^0.99*
**

**
*  {!type=edismax v=$qq}*
**

**
*  {!edismax qf=fieldA^0.99 mm=100% bq="" boost="" pf=""
tie=1.00 v=$qq}*
*  {!edismax qf=fieldB^0.99 mm=100% bq="" boost="" pf=""
tie=1.00 v=$qq}*
*  {!edismax qf=fieldC^0.99 mm=100% bq="" boost="" pf=""
tie=1.00 v=$qq}*
*  {!edismax qf=fieldD^0.99 mm=100% bq="" boost="" pf=""
tie=1.00 v=$qq}*

*  {!edismax qf=fieldA^0.99 fieldB^0.99 fieldC^0.99
fieldD^0.99 mm=100% bq="" boost="" pf="" tie=1.00 v=$qq}*

*  {!func}mul(termfreq(docBoostFieldB,$qq),100)*
*  if(termfreq(docBoostFieldB,$qq),1,def(docBoostFieldA,1))*
**

**
*  elevator*
**

*  *

*Notes:*

   - *We have a data science team that feeds back click through data to the
   boostFields to re-order results for popular queries*
   - *We do sorting on 'score DESC dateSubmitted DESC'*
   - *We use the 'elevator' component quite heavily - e.g.
   'elevateIds=A,B,C'*
   - *We have some localized fields - thus we do aliasing in the request
   handler*

-- 
P.S. We've launched a new blog to share the latest ideas and case studies 
from our team. Check it out here: product.canva.com

Re: Need information on EofExceptions in solr 4.8.1

2019-03-19 Thread Shawn Heisey

On 3/19/2019 10:39 AM, Vijay Rawlani wrote:

We are using solr 4.8.1 in our project. We are observing following
EofExceptions in solr.
It would be helpful for us to know in what situations we might land up
with this.
Can we get rid of this with any solr configuration or is there any way
forward at all?
Kindly let us know some information about the exception and the scenario
where it can occur.


When Solr throws Jetty's EofException, it almost always means this has 
occurred:


The client talking to Solr has disconnected its TCP connection before 
Solr has finished processing the request.  When Solr (Jetty) finally 
finishes and tries to send the response, the client is long gone and the 
response cannot be sent.  So EofException is thrown.


Usually this is because the client has a short timeout and the requests 
Solr is being asked to process are taking longer than that timeout to 
complete.  If a client is configured with a timeout, it should be quite 
long ... normally at least a minute, and two to five minutes would be 
better.
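
For example, with current SolrJ the timeouts are set on the HttpSolrClient
builder. A minimal sketch (the URL and values are placeholders; on the old 4.x
HttpSolrServer the equivalents are setConnectionTimeout() and setSoTimeout()):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TimeoutExample {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection")
                .withConnectionTimeout(10_000)   // 10 seconds to establish the TCP connection
                .withSocketTimeout(300_000)      // 5 minutes of read timeout, so slow queries are not cut off
                .build();
        client.query(new SolrQuery("*:*"));      // the client now waits instead of dropping the connection early
        client.close();
    }
}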


Solr 4.8.1 is nearly four years old.  This is ancient in the open source 
world.  There will be no bugfixes for a version that old.  I do not 
think the behavior you are seeing is a bug, but if you do encounter one, 
you'll need to reproduce it in an 8.x version before we can fix it.


Thanks,
Shawn


RE: Upgrading tika

2019-03-19 Thread Phil Scadden
As per Erick's advice, I would strongly recommend that you do anything Tika-related in a 
separate SolrJ programme. You do not want to have your Solr instance doing the Tika 
processing.

-Original Message-
From: Tannen, Lev (USAEO) [Contractor] 
Sent: Wednesday, 20 March 2019 08:17
To: solr-user@lucene.apache.org
Subject: RE: Upgrading tika

Sorry Erick,
Please disregard my previous message. Somehow I downloaded the version without 
those two files. I am going to download the latest version solr 8.0.0 and try 
it.
Best
Lev Tannen

-Original Message-
From: Erick Erickson 
Sent: Tuesday, March 19, 2019 2:48 PM
To: solr-user 
Subject: Re: Upgrading tika

Yes, Solr is distributed with Tika. Look in:
./solr/contrib/extraction/lib

Tika is upgraded when new versions come out, so the underlying files are 
whatever are current at the time.

The integration is a fairly loose coupling, if you're using some external 
program (say a SolrJ program) to parse the files, there's no requirement to use 
the jars distributed with Solr, use whatever suits your fancy. An external 
program just constructs a SolrDocument to send to Solr. What you use to create 
that document is irrelevant. See:
https://lucidworks.com/2012/02/14/indexing-with-solrj/ for some background.

If you're using the ExtractingRequestHandler, where you just send the 
semi-structured docs to Solr (PDFs, Word or whatever), then needing to know 
anything about individual Tika-related jar files is kind of strange.

If your predecessors wrote some custom code that runs as part of Solr, I don't 
know what to say...

Best,
Erick

On Tue, Mar 19, 2019 at 10:47 AM Tannen, Lev (USAEO) [Contractor] 
 wrote:
>
> Thank you Shawn.
> I assumed that tika has been integrated with solr. I the project written 
> before me they used two tika files taken from solr distribution. I am trying 
> to do the same with solr 7.7.1. However this version contains a different set 
> of tika related files. So I am confused. Does  solr does not have integrated 
> tika anymore, or I just cannot recognize them?
>
> -Original Message-
> From: Shawn Heisey 
> Sent: Tuesday, March 19, 2019 11:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrading tika
>
> On 3/19/2019 9:03 AM, levtannen wrote:
> > Could anybody suggest me what files do I need to use the latest
> > version of Tika and where to find them?
>
> This mailing list is solr-user.  Tika is an entirely separate project from 
> Solr within the Apache Foundation.  To get help with Tika, you'll need to ask 
> that project.
>
> https://tika.apache.org/mail-lists.html
>
> Thanks,
> Shawn
Notice: This email and any attachments are confidential and may not be used, 
published or redistributed without the prior written consent of the Institute 
of Geological and Nuclear Sciences Limited (GNS Science). If received in error 
please destroy and immediately notify GNS Science. Do not copy or disclose the 
contents.


Re: Behavior of Function Query

2019-03-19 Thread Mikhail Khludnev
Ok. ValueSourceAugmenter.transform(SolrDocument, int) omits null values
coming from QueryDocValues.objectVal(int) on non-matching docs.
It might seems odd, but that's it.

On Tue, Mar 19, 2019 at 1:21 PM Erik Hatcher  wrote:

> Try adding fl=* into the request.   There’s an oddity with fl, iirc, where
> it can skip functions if * isn’t there (or maybe a concrete non-score
> field?)
>
>Erik
>
> > On Mar 18, 2019, at 10:19, Ashish Bisht  wrote:
> >
> > Please see the below requests and response
> >
> > http://Sol:8983/solr/SCSpell/select?q="*internet of
> >
> things*"&defType=edismax&qf=spellcontent&wt=json&rows=1&fl=score,internet_of_things:query({!edismax
> > v='"*internet of things*"'}),instant_of_things:query({!edismax
> v='"instant
> > of things"'})
> >
> >
> > Response contains score from function query
> >
> > "fl":"score,internet_of_things:query({!edismax v='\"internet of
> > things\"'}),instant_of_things:query({!edismax v='\"instant of
> things\"'})",
> >  "rows":"1",
> >  "wt":"json"}},
> >  "response":{"numFound":851,"start":0,"maxScore":7.6176834,"docs":[
> >  {
> >"score":7.6176834,
> >   * "internet_of_things":7.6176834*}]
> >  }}
> >
> >
> > But if in the same request q is changed,it doesn't give score
> >
> > http://Sol-1:8983/solr/SCSpell/select?q="*wall
> >
> street*"&defType=edismax&qf=spellcontent&wt=json&rows=1&fl=score,internet_of_things:query({!edismax
> > v='"*internet of things*"'}),instant_of_things:query({!edismax
> v='"instant
> > of things"'})
> >
> >   "q":"\"wall street\"",
> >  "defType":"edismax",
> >  "qf":"spellcontent",
> >  "fl":"score,internet_of_things:query({!edismax v='\"internet of
> > things\"'}),instant_of_things:query({!edismax v='\"instant of
> things\"'})",
> >  "rows":"1",
> >  "wt":"json"}},
> >  "response":{"numFound":46,"start":0,"maxScore":15.670144,"docs":[
> >  {
> >"score":15.670144}]
> >  }}
> >
> >
> > Why score of function query is getting applied when q is a different.
> >
> >
> >
> > --
> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Did, but same error

{
  "responseHeader":{
"status":400,
"QTime":5},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","java.lang.NullPointerException"],
"msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
org.apache.solr.ltr.model.LinearModel",
"code":400}}



> On Mar 19, 2019, at 3:26 PM, Mohomed Rimash  wrote:
> 
> Please update the weights values to greater than 0 and less than 1.
> 
> On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:
> 
>> Feature File
>> ===
>> 
>> [
>>  {
>>"store" : "exampleFeatureStore",
>>"name" : "isCityName",
>>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
>>"params" : { "field" : "CITY_NAME" }
>>  },
>>  {
>>"store" : "exampleFeatureStore",
>>"name" : "originalScore",
>>"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
>>"params" : {}
>>  },
>>  {
>>"store" : "exampleFeatureStore",
>>"name" : "isLat",
>>"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
>>"params" : { "field" : "LATITUDE" }
>>  }
>> ]
>> 
>> Model File
>> ==
>> {
>>  "store": "exampleFeatureStore",
>>  "class": "org.apache.solr.ltr.model.LinearModel",
>>  "name": "exampleModelStore",
>>  "features": [{
>>  "name": "isCityName"
>>},
>>{
>>  "name": "isLat"
>>},
>>{
>>  "name": "original_score"
>>}
>>  ],
>>  "params": {
>>"weights": {
>>  "isCityName": 0.0,
>>  "isLat": 0.0,
>>  "original_score": 1.0
>>}
>>  }
>> }
>> 
>> 
>> 
>>> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
>> wrote:
>>> 
>>> Can you share the feature file and the model file,
>>> 1. I had few instances where invalid values for parameters (ie weights
>> set
>>> to more than 1 , with minmaxnormalizer) resulted the above error,
>>> 2, Check all the features added to the model has a weight under params ->
>>> weights in the model
>>> 
>>> 
>>> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
>>> 
 Does your feature definitions and the feature names used in the model
 match?
 
 On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
>> wrote:
 
> Yes, I did.
> 
> I can see the feature that I created by this
> schema/feature-store/exampleFeatureStore and it return me the features
>> I
> created. But issue is when I try to put store-model.
> 
>> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> wrote:
>> 
>> Hi Amjad, After adding the libraries into the path, Did you restart
>> the
>> SOLR ?
>> 
>> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
>> 
>>> I followed the Solr LTR Documentation
>>> 
>>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
>>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
>>> 
>>> 1. Added library into the solr-config
>>> 
>>> >> regex=".*\.jar" />
>>> >> regex="solr-ltr-\d.*\.jar" />
>>> 2. Successfully added feature
>>> 3. Get schema to see feature is available
>>> 4. When I try to push model I see the error below, however I added
>> the
> lib
>>> into solr-cofig
>>> 
>>> Response
>>> {
>>> "responseHeader":{
>>>  "status":400,
>>>  "QTime":1},
>>> "error":{
>>>  "metadata":[
>>>"error-class","org.apache.solr.common.SolrException",
>>>"root-error-class","java.lang.NullPointerException"],
>>>  "msg":"org.apache.solr.ltr.model.ModelException: Model type does
 not
>>> exist org.apache.solr.ltr.model.LinearModel",
>>>  "code":400}}
>>> 
>>> Thanks
> 
> 
 
>> 
>> 



Re: Need help on LTR

2019-03-19 Thread Mohomed Rimash
Please update the weight values to be greater than 0 and less than 1.

On Wed, 20 Mar 2019 at 00:13, Amjad Khan  wrote:

> Feature File
> ===
>
> [
>   {
> "store" : "exampleFeatureStore",
> "name" : "isCityName",
> "class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> "params" : { "field" : "CITY_NAME" }
>   },
>   {
> "store" : "exampleFeatureStore",
> "name" : "originalScore",
> "class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
> "params" : {}
>   },
>   {
> "store" : "exampleFeatureStore",
> "name" : "isLat",
> "class" : "org.apache.solr.ltr.feature.FieldValueFeature",
> "params" : { "field" : "LATITUDE" }
>   }
> ]
>
> Model File
> ==
> {
>   "store": "exampleFeatureStore",
>   "class": "org.apache.solr.ltr.model.LinearModel",
>   "name": "exampleModelStore",
>   "features": [{
>   "name": "isCityName"
> },
> {
>   "name": "isLat"
> },
> {
>   "name": "original_score"
> }
>   ],
>   "params": {
> "weights": {
>   "isCityName": 0.0,
>   "isLat": 0.0,
>   "original_score": 1.0
> }
>   }
> }
>
>
>
> > On Mar 19, 2019, at 2:04 PM, Mohomed Rimash 
> wrote:
> >
> > Can you share the feature file and the model file,
> > 1. I had few instances where invalid values for parameters (ie weights
> set
> > to more than 1 , with minmaxnormalizer) resulted the above error,
> > 2, Check all the features added to the model has a weight under params ->
> > weights in the model
> >
> >
> > On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> >
> >> Does your feature definitions and the feature names used in the model
> >> match?
> >>
> >> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan 
> wrote:
> >>
> >>> Yes, I did.
> >>>
> >>> I can see the feature that I created by this
> >>> schema/feature-store/exampleFeatureStore and it return me the features
> I
> >>> created. But issue is when I try to put store-model.
> >>>
>  On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> >>> wrote:
> 
>  Hi Amjad, After adding the libraries into the path, Did you restart
> the
>  SOLR ?
> 
>  On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> 
> > I followed the Solr LTR Documentation
> >
> > https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> > https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> >
> > 1. Added library into the solr-config
> > 
> >  > regex=".*\.jar" />
> >  > regex="solr-ltr-\d.*\.jar" />
> > 2. Successfully added feature
> > 3. Get schema to see feature is available
> > 4. When I try to push model I see the error below, however I added
> the
> >>> lib
> > into solr-cofig
> >
> > Response
> > {
> > "responseHeader":{
> >   "status":400,
> >   "QTime":1},
> > "error":{
> >   "metadata":[
> > "error-class","org.apache.solr.common.SolrException",
> > "root-error-class","java.lang.NullPointerException"],
> >   "msg":"org.apache.solr.ltr.model.ModelException: Model type does
> >> not
> > exist org.apache.solr.ltr.model.LinearModel",
> >   "code":400}}
> >
> > Thanks
> >>>
> >>>
> >>
>
>


RE: Upgrading tika

2019-03-19 Thread Tannen, Lev (USAEO) [Contractor]
Sorry Erick, 
Please disregard my previous message. Somehow I downloaded the version without 
those two files. I am going to download the latest version solr 8.0.0 and try 
it.
Best 
Lev Tannen

-Original Message-
From: Erick Erickson  
Sent: Tuesday, March 19, 2019 2:48 PM
To: solr-user 
Subject: Re: Upgrading tika

Yes, Solr is distributed with Tika. Look in:
./solr/contrib/extraction/lib

Tika is upgraded when new versions come out, so the underlying files are 
whatever are current at the time.

The integration is a fairly loose coupling, if you're using some external 
program (say a SolrJ program) to parse the files, there's no requirement to use 
the jars distributed with Solr, use whatever suits your fancy. An external 
program just constructs a SolrDocument to send to Solr. What you use to create 
that document is irrelevant. See:
https://lucidworks.com/2012/02/14/indexing-with-solrj/ for some background.

If you're using the ExtractingRequestHandler, where you just send the 
semi-structured docs to Solr (PDFs, Word or whatever), then needing to know 
anything about individual Tika-related jar files is kind of strange.

If your predecessors wrote some custom code that runs as part of Solr, I don't 
know what to say...

Best,
Erick

On Tue, Mar 19, 2019 at 10:47 AM Tannen, Lev (USAEO) [Contractor] 
 wrote:
>
> Thank you Shawn.
> I assumed that tika has been integrated with solr. I the project written 
> before me they used two tika files taken from solr distribution. I am trying 
> to do the same with solr 7.7.1. However this version contains a different set 
> of tika related files. So I am confused. Does  solr does not have integrated 
> tika anymore, or I just cannot recognize them?
>
> -Original Message-
> From: Shawn Heisey 
> Sent: Tuesday, March 19, 2019 11:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrading tika
>
> On 3/19/2019 9:03 AM, levtannen wrote:
> > Could anybody suggest me what files do I need to use the latest 
> > version of Tika and where to find them?
>
> This mailing list is solr-user.  Tika is an entirely separate project from 
> Solr within the Apache Foundation.  To get help with Tika, you'll need to ask 
> that project.
>
> https://tika.apache.org/mail-lists.html
>
> Thanks,
> Shawn


RE: Upgrading tika

2019-03-19 Thread Tannen, Lev (USAEO) [Contractor]
Thank you Erick,
The problem is that the Solr 7.7.1 distribution does not contain the files tika-core 
and tika-parsers in the contrib/extraction/lib folder. It contains only 
tika-java7-1.19.1.jar and tika-xmp-1.19.1.jar instead.
Have I lost some files while downloading?
Best,
Lev Tannen

-Original Message-
From: Erick Erickson  
Sent: Tuesday, March 19, 2019 2:48 PM
To: solr-user 
Subject: Re: Upgrading tika

Yes, Solr is distributed with Tika. Look in:
./solr/contrib/extraction/lib

Tika is upgraded when new versions come out, so the underlying files are 
whatever are current at the time.

The integration is a fairly loose coupling, if you're using some external 
program (say a SolrJ program) to parse the files, there's no requirement to use 
the jars distributed with Solr, use whatever suits your fancy. An external 
program just constructs a SolrDocument to send to Solr. What you use to create 
that document is irrelevant. See:
https://lucidworks.com/2012/02/14/indexing-with-solrj/ for some background.

If you're using the ExtractingRequestHandler, where you just send the 
semi-structured docs to Solr (PDFs, Word or whatever), then needing to know 
anything about individual Tika-related jar files is kind of strange.

If your predecessors wrote some custom code that runs as part of Solr, I don't 
know what to say...

Best,
Erick

On Tue, Mar 19, 2019 at 10:47 AM Tannen, Lev (USAEO) [Contractor] 
 wrote:
>
> Thank you Shawn.
> I assumed that tika has been integrated with solr. I the project written 
> before me they used two tika files taken from solr distribution. I am trying 
> to do the same with solr 7.7.1. However this version contains a different set 
> of tika related files. So I am confused. Does  solr does not have integrated 
> tika anymore, or I just cannot recognize them?
>
> -Original Message-
> From: Shawn Heisey 
> Sent: Tuesday, March 19, 2019 11:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrading tika
>
> On 3/19/2019 9:03 AM, levtannen wrote:
> > Could anybody suggest me what files do I need to use the latest 
> > version of Tika and where to find them?
>
> This mailing list is solr-user.  Tika is an entirely separate project from 
> Solr within the Apache Foundation.  To get help with Tika, you'll need to ask 
> that project.
>
> https://tika.apache.org/mail-lists.html
>
> Thanks,
> Shawn


Re: Upgrading tika

2019-03-19 Thread Erick Erickson
Yes, Solr is distributed with Tika. Look in:
./solr/contrib/extraction/lib

Tika is upgraded when new versions come out, so the underlying files
are whatever are current at the time.

The integration is a fairly loose coupling, if you're using some
external program (say a SolrJ program) to parse the files, there's no
requirement to use the jars distributed with Solr, use whatever suits
your fancy. An external program just constructs a SolrDocument to send
to Solr. What you use to create that document is irrelevant. See:
https://lucidworks.com/2012/02/14/indexing-with-solrj/ for some
background.
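
As a rough sketch of that flow (the Tika and SolrJ calls are the standard ones;
the URL, file path, and field names are placeholders):

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class TikaIndexer {
    public static void main(String[] args) throws Exception {
        // Parse the file with Tika outside of Solr
        AutoDetectParser parser = new AutoDetectParser();
        BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no write limit
        Metadata metadata = new Metadata();
        try (InputStream in = Files.newInputStream(Paths.get("/path/to/file.pdf"))) {
            parser.parse(in, handler, metadata);
        }

        // Build a document from the extracted text/metadata and send it to Solr
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "file-1");
        doc.addField("title", metadata.get(TikaCoreProperties.TITLE));
        doc.addField("text", handler.toString());

        try (HttpSolrClient solr =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build()) {
            solr.add(doc);
            solr.commit();
        }
    }
}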

If you're using the ExtractingRequestHandler, where you just send the
semi-structured docs to Solr (PDFs, Word or whatever), then needing to
know anything about individual Tika-related jar files is kind of
strange.

If your predecessors wrote some custom code that runs as part of Solr,
I don't know what to say...

Best,
Erick

On Tue, Mar 19, 2019 at 10:47 AM Tannen, Lev (USAEO) [Contractor]
 wrote:
>
> Thank you Shawn.
> I assumed that tika has been integrated with solr. I the project written 
> before me they used two tika files taken from solr distribution. I am trying 
> to do the same with solr 7.7.1. However this version contains a different set 
> of tika related files. So I am confused. Does  solr does not have integrated 
> tika anymore, or I just cannot recognize them?
>
> -Original Message-
> From: Shawn Heisey 
> Sent: Tuesday, March 19, 2019 11:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrading tika
>
> On 3/19/2019 9:03 AM, levtannen wrote:
> > Could anybody suggest me what files do I need to use the latest
> > version of Tika and where to find them?
>
> This mailing list is solr-user.  Tika is an entirely separate project from 
> Solr within the Apache Foundation.  To get help with Tika, you'll need to ask 
> that project.
>
> https://tika.apache.org/mail-lists.html
>
> Thanks,
> Shawn


Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Roopa,

Yes

> On Mar 19, 2019, at 11:51 AM, Roopa Rao  wrote:
> 
> Does your feature definitions and the feature names used in the model match?
> 
> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
> 
>> Yes, I did.
>> 
>> I can see the feature that I created by this
>> schema/feature-store/exampleFeatureStore and it return me the features I
>> created. But issue is when I try to put store-model.
>> 
>>> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
>> wrote:
>>> 
>>> Hi Amjad, After adding the libraries into the path, Did you restart the
>>> SOLR ?
>>> 
>>> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
>>> 
 I followed the Solr LTR Documentation
 
 https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
 https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
 
 1. Added library into the solr-config
 
 >>> regex=".*\.jar" />
 >>> regex="solr-ltr-\d.*\.jar" />
 2. Successfully added feature
 3. Get schema to see feature is available
 4. When I try to push model I see the error below, however I added the
>> lib
 into solr-cofig
 
 Response
 {
 "responseHeader":{
   "status":400,
   "QTime":1},
 "error":{
   "metadata":[
 "error-class","org.apache.solr.common.SolrException",
 "root-error-class","java.lang.NullPointerException"],
   "msg":"org.apache.solr.ltr.model.ModelException: Model type does not
 exist org.apache.solr.ltr.model.LinearModel",
   "code":400}}
 
 Thanks
>> 
>> 



Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Feature File
===

[
  {
"store" : "exampleFeatureStore",
"name" : "isCityName",
"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
"params" : { "field" : "CITY_NAME" }
  },
  {
"store" : "exampleFeatureStore",
"name" : "originalScore",
"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
"params" : {}
  },
  {
"store" : "exampleFeatureStore",
"name" : "isLat",
"class" : "org.apache.solr.ltr.feature.FieldValueFeature",
"params" : { "field" : "LATITUDE" }
  }
]

Model File
==
{
  "store": "exampleFeatureStore",
  "class": "org.apache.solr.ltr.model.LinearModel",
  "name": "exampleModelStore",
  "features": [{
  "name": "isCityName"
},
{
  "name": "isLat"
},
{
  "name": "original_score"
}
  ],
  "params": {
"weights": {
  "isCityName": 0.0,
  "isLat": 0.0,
  "original_score": 1.0
}
  }
}



> On Mar 19, 2019, at 2:04 PM, Mohomed Rimash  wrote:
> 
> Can you share the feature file and the model file,
> 1. I had few instances where invalid values for parameters (ie weights set
> to more than 1 , with minmaxnormalizer) resulted the above error,
> 2, Check all the features added to the model has a weight under params ->
> weights in the model
> 
> 
> On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:
> 
>> Does your feature definitions and the feature names used in the model
>> match?
>> 
>> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
>> 
>>> Yes, I did.
>>> 
>>> I can see the feature that I created by this
>>> schema/feature-store/exampleFeatureStore and it return me the features I
>>> created. But issue is when I try to put store-model.
>>> 
 On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
>>> wrote:
 
 Hi Amjad, After adding the libraries into the path, Did you restart the
 SOLR ?
 
 On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
 
> I followed the Solr LTR Documentation
> 
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> 
> 1. Added library into the solr-config
> 
>  regex=".*\.jar" />
>  regex="solr-ltr-\d.*\.jar" />
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the
>>> lib
> into solr-cofig
> 
> Response
> {
> "responseHeader":{
>   "status":400,
>   "QTime":1},
> "error":{
>   "metadata":[
> "error-class","org.apache.solr.common.SolrException",
> "root-error-class","java.lang.NullPointerException"],
>   "msg":"org.apache.solr.ltr.model.ModelException: Model type does
>> not
> exist org.apache.solr.ltr.model.LinearModel",
>   "code":400}}
> 
> Thanks
>>> 
>>> 
>> 



Solr Suggester component not returning any results

2019-03-19 Thread Deoxyribonucleic_DNA ...
Hi, I'm trying to implement the suggester component in Solr, based on the 
tutorial:

https://lucene.apache.org/solr/guide/7_7/suggester.html#suggester-search-component-parameters

I'm not getting any errors about how the suggester is set up or anything, but 
when I try to use it I get 0 results returned. I'm including images of the 
schema/config parts I added to my files, as shown in the tutorial; can someone 
take a look? Or does anyone have any suggestions based on prior experience or 
common problems?

Also, I've got Solr 7.6. There wasn't some change that would affect this just 
in that slight version change, was there?

Thanks in advance.

[inline image: schema/config excerpt]


[inline image: schema/config excerpt]

Link used in search:


http://localhost:8984/solr/nutch/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&suggest.q=Canad

(due to documents containing "Canada" as the intended word)

Result:

[inline image: suggester response]


Re: Need information on EofExceptions in solr 4.8.1

2019-03-19 Thread Atita Arora
Precisely: check the socketTimeout (in ms) if it's your indexing pipeline code.
We faced this when our docs were unusually big.

On Tue, Mar 19, 2019 at 7:08 PM Saurabh Sharma 
wrote:

> Hi,
>
> Seems like it is not a problem with solr. It is happening due to stream
> termination at the jetty.
> Please make sure your client is not setting very low read timeout. You can
> also increase max sessions timeout and idleTimeout at jetty level.
>
> Thanks
> Saurabh Sharma
>
> On Tue, Mar 19, 2019 at 11:19 PM Vijay Rawlani 
> wrote:
>
> > Dear Concerned,
> >
> > We are using solr 4.8.1 in our project. We are observing following
> > EofExceptions in solr.
> > It would be helpful for us to know in what situations we might land up
> > with this.
> > Can we get rid of this with any solr configuration or is there any way
> > forward at all?
> > Kindly let us know some information about the exception and the scenario
> > where it can occur.
> >
> > 019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@
> > org.apache.solr.servlet.SolrDispatchFilter:120 -
> > null:org.eclipse.jetty.io.EofException#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at
> >
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
> >
> >
> org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:207)#012#011at
> >
> >
> org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:98)#012#011at
> >
> >
> org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:51)#012#011at
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:755)#012#011at
> >
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:431)#012...
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)#012#011at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)#012#011at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)#012#011at
> >
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)#012#011at
> >
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)#012#011at
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)#012#011at
> >
> > org.eclipse.jetty.server.session.SessionHandler.doScope(Sess...
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@193)#012#011at
> >
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)#012#011at
> >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)#012#011at
> >
> > org.eclipse.jetty.server.Server.handle(Server.java:368)#012#011at
> >
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)#012#011at
> >
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)#012#011at
> >
> > org.e...
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@953)#012#011at
> >
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)#012#011at
> >
> >
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)#012#011at
> >
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)#012#011at
> >
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)#012#011at
> >
> >
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)#012#011at
> >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)#012#011at
> >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)#012#011at
> >
> > java.lang.Thread.run(Thread.java:748)#012
> >
> > 2019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@
> > org.eclipse.jetty.server.Response:312 - Committed before 500
> > {trace=org.eclipse.jetty.io.EofException#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at
> > org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at
> >
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
> >
> >
> org.apache.solr.common.util.FastOutp

Re: Need information on EofExceptions in solr 4.8.1

2019-03-19 Thread Saurabh Sharma
Hi,

Seems like it is not a problem with Solr. It is happening due to stream
termination at Jetty.
Please make sure your client is not setting a very low read timeout. You can
also increase the max session timeout and idleTimeout at the Jetty level.

Thanks
Saurabh Sharma

On Tue, Mar 19, 2019 at 11:19 PM Vijay Rawlani 
wrote:

> Dear Concerned,
>
> We are using solr 4.8.1 in our project. We are observing following
> EofExceptions in solr.
> It would be helpful for us to know in what situations we might land up
> with this.
> Can we get rid of this with any solr configuration or is there any way
> forward at all?
> Kindly let us know some information about the exception and the scenario
> where it can occur.
>
> 019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@
> org.apache.solr.servlet.SolrDispatchFilter:120 -
> null:org.eclipse.jetty.io.EofException#012#011at
> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at
> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
>
> org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:207)#012#011at
>
> org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:98)#012#011at
>
> org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:51)#012#011at
>
> org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:755)#012#011at
>
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:431)#012...
>
> 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)#012#011at
>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)#012#011at
>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)#012#011at
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)#012#011at
>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)#012#011at
>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)#012#011at
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)#012#011at
>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)#012#011at
>
> org.eclipse.jetty.server.session.SessionHandler.doScope(Sess...
>
> 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@193)#012#011at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)#012#011at
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)#012#011at
>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)#012#011at
>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)#012#011at
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)#012#011at
>
> org.eclipse.jetty.server.Server.handle(Server.java:368)#012#011at
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)#012#011at
>
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)#012#011at
>
> org.e...
>
> 2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@953)#012#011at
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)#012#011at
>
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)#012#011at
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)#012#011at
>
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)#012#011at
>
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)#012#011at
>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)#012#011at
>
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)#012#011at
>
> java.lang.Thread.run(Thread.java:748)#012
>
> 2019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@
> org.eclipse.jetty.server.Response:312 - Committed before 500
> {trace=org.eclipse.jetty.io.EofException#012#011at
> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at
> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
>
> org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:207)#012#011at
>
> org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:98)#012#011at
>
> org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:51)#012#011at
>
> org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:755)#012#011at
>
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.ja...
>
>
> Thanks 

Re: Need help on LTR

2019-03-19 Thread Mohomed Rimash
Can you share the feature file and the model file?
1. I had a few instances where invalid values for parameters (i.e. weights set
to more than 1, with MinMaxNormalizer) resulted in the above error.
2. Check that all the features added to the model have a weight under params ->
weights in the model.


On Tue, 19 Mar 2019 at 21:21, Roopa Rao  wrote:

> Does your feature definitions and the feature names used in the model
> match?
>
> On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:
>
> > Yes, I did.
> >
> > I can see the feature that I created by this
> > schema/feature-store/exampleFeatureStore and it return me the features I
> > created. But issue is when I try to put store-model.
> >
> > > On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> > wrote:
> > >
> > > Hi Amjad, After adding the libraries into the path, Did you restart the
> > > SOLR ?
> > >
> > > On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> > >
> > >> I followed the Solr LTR Documentation
> > >>
> > >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> > >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> > >>
> > >> 1. Added library into the solr-config
> > >> 
> > >>   > >> regex=".*\.jar" />
> > >>  > >> regex="solr-ltr-\d.*\.jar" />
> > >> 2. Successfully added feature
> > >> 3. Get schema to see feature is available
> > >> 4. When I try to push model I see the error below, however I added the
> > lib
> > >> into solr-cofig
> > >>
> > >> Response
> > >> {
> > >>  "responseHeader":{
> > >>"status":400,
> > >>"QTime":1},
> > >>  "error":{
> > >>"metadata":[
> > >>  "error-class","org.apache.solr.common.SolrException",
> > >>  "root-error-class","java.lang.NullPointerException"],
> > >>"msg":"org.apache.solr.ltr.model.ModelException: Model type does
> not
> > >> exist org.apache.solr.ltr.model.LinearModel",
> > >>"code":400}}
> > >>
> > >> Thanks
> >
> >
>


Need information on EofExceptions in solr 4.8.1

2019-03-19 Thread Vijay Rawlani
Dear Concerned,

We are using Solr 4.8.1 in our project and are observing the following
EofExceptions in Solr.
It would be helpful for us to know in what situations we might run into
this.
Can we get rid of it with any Solr configuration, or is there any other way
forward at all?
Kindly share some information about the exception and the scenarios
in which it can occur.

2019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@ 
org.apache.solr.servlet.SolrDispatchFilter:120 - 
null:org.eclipse.jetty.io.EofException#012#011at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at 
org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
 
org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:207)#012#011at
 
org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:98)#012#011at
 
org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:51)#012#011at
 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:755)#012#011at
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:431)#012...

2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@ 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)#012#011at
 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)#012#011at
 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)#012#011at
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)#012#011at
 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)#012#011at
 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)#012#011at
 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)#012#011at
 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)#012#011at
 
org.eclipse.jetty.server.session.SessionHandler.doScope(Sess...

2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@193)#012#011at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)#012#011at
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)#012#011at
 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)#012#011at
 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)#012#011at
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)#012#011at
 
org.eclipse.jetty.server.Server.handle(Server.java:368)#012#011at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)#012#011at
 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)#012#011at
 
org.e...

2019-03-17T00:00:25.604457+00:00@solr@[INVALID_PROGRAM]@953)#012#011at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)#012#011at
 
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)#012#011at 
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)#012#011at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)#012#011at
 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)#012#011at
 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)#012#011at
 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)#012#011at
 
java.lang.Thread.run(Thread.java:748)#012

2019-03-17T00:00:25.604457+00:00@solr@UNKNOWN@ 
org.eclipse.jetty.server.Response:312 - Committed before 500 
{trace=org.eclipse.jetty.io.EofException#012#011at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)#012#011at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)#012#011at 
org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)#012#011at
 
org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:207)#012#011at
 
org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:98)#012#011at
 
org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:51)#012#011at
 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:755)#012#011at
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.ja...


Thanks in advance.

Best Regards,
Vijay Rawlani
Tata Consultancy Services Ltd.
+91 7416448048

RE: Upgrading tika

2019-03-19 Thread Tannen, Lev (USAEO) [Contractor]
Thank you Shawn. 
I assumed that Tika has been integrated with Solr. In the project written before 
me, they used two Tika files taken from the Solr distribution. I am trying to do the 
same with Solr 7.7.1. However, this version contains a different set of Tika-related 
files, so I am confused. Does Solr no longer ship an integrated Tika, or am I just 
not recognizing the files?
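
For context, in a stock 7.7.1 download I would expect the Tika pieces to sit
roughly here (relative to the distribution root), but please correct me if that
is wrong:

  ls contrib/extraction/lib/ | grep -i tika   # tika-core, tika-parsers and the parser dependencies
  ls dist/ | grep -i cell                     # solr-cell-7.7.1.jar, the Solr/Tika integration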

-Original Message-
From: Shawn Heisey  
Sent: Tuesday, March 19, 2019 11:11 AM
To: solr-user@lucene.apache.org
Subject: Re: Upgrading tika

On 3/19/2019 9:03 AM, levtannen wrote:
> Could anybody suggest me what files do I need to use the latest 
> version of Tika and where to find them?

This mailing list is solr-user.  Tika is an entirely separate project from Solr 
within the Apache Foundation.  To get help with Tika, you'll need to ask that 
project.

https://tika.apache.org/mail-lists.html

Thanks,
Shawn


Re: Upgrading tika

2019-03-19 Thread Shawn Heisey

On 3/19/2019 9:03 AM, levtannen wrote:

Could anybody suggest me what files do I need to use the latest version of
Tika and where to find them?


This mailing list is solr-user.  Tika is an entirely separate project 
from Solr within the Apache Foundation.  To get help with Tika, you'll 
need to ask that project.


https://tika.apache.org/mail-lists.html

Thanks,
Shawn


Re: Solr index slow response

2019-03-19 Thread Walter Underwood
A sharded index will go faster, because the indexing workload is split among 
the machines.

A 5 Mbyte batch for indexing seems a little large, but it may be OK. Increase 
the client threads until you get CPU around 80%. 
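
A rough SolrJ sketch of that pattern, in case it helps (the URL, collection and
field names below are placeholders, not a definitive setup):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
  import org.apache.solr.common.SolrInputDocument;

  public class IndexerSketch {
      public static void main(String[] args) throws Exception {
          int threads = Runtime.getRuntime().availableProcessors() * 2; // ~2 client threads per CPU
          ConcurrentUpdateSolrClient client = new ConcurrentUpdateSolrClient.Builder(
                  "http://localhost:8983/solr/mycollection")   // placeholder URL
              .withQueueSize(threads * 4)    // batches buffered while senders are busy
              .withThreadCount(threads)      // concurrent sender threads
              .build();

          List<SolrInputDocument> batch = new ArrayList<>();
          for (int i = 0; i < 1_000_000; i++) {          // stand-in for the real documents
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", Integer.toString(i));
              doc.addField("title_t", "document " + i);
              batch.add(doc);
              if (batch.size() == 100) {                 // batch size to tune empirically
                  client.add(batch);                     // queued; sent by the background threads
                  batch = new ArrayList<>();
              }
          }
          if (!batch.isEmpty()) {
              client.add(batch);
          }
          client.blockUntilFinished();                   // wait for the queue to drain
          client.commit();                               // one commit at the end
          client.close();
      }
  }

Raise the thread count until CPU sits around the 80% mark mentioned above.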

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 19, 2019, at 8:53 AM, Aaron Yingcai Sun  wrote:
> 
> Hello, Walter,
> 
> Thanks for the hint. it looks like the size matters, our documents size are 
> not fixed, there are many small documents, such as 59KB perl 10 documents, 
> the response time is around 10ms which is pretty good, there I could let it 
> send bigger batch still I get reasonable response time.
> 
> 
> I will try with Solr Could cluster, maybe get better speed there.
> 
> 
> //Aaron
> 
> 
> From: Walter Underwood 
> Sent: Tuesday, March 19, 2019 3:29:17 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
> 
> Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client 
> threads, you should be able to drive CPU to over 90%.
> 
> Start with two client threads per CPU. That allows one thread to be sending 
> data over the network while another is waiting for Solr to process the batch.
> 
> A couple of years ago, I was indexing a million docs per minute into a Solr 
> Cloud cluster. I think that was four shards on instances with 16 CPUs, so it 
> was 64 CPUs available for indexing. That was with Java 8, G1GC, and 8 GB of 
> heap.
> 
> Your document are averaging about 50 kbytes, which is pretty big. Our 
> documents average about 3.5 kbytes. A lot of the indexing work is handling 
> the text, so those larger documents would be at least 10X slower than ours.
> 
> Are you doing atomic updates? That would slow things down a lot.
> 
> If you want to use G1GC, use the configuration I sent earlier.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Mar 19, 2019, at 7:05 AM, Bernd Fehling  
>> wrote:
>> 
>> Isn't there somthing about largePageTables which must be enabled
>> in JAVA and also supported by OS for such huge heaps?
>> 
>> Just a guess.
>> 
>> Am 19.03.19 um 15:01 schrieb Jörn Franke:
>>> It could be an issue with jdk 8 that may not be suitable for such large 
>>> heaps. Have more nodes with smaller heaps (eg 31 gb)
 Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun :
 
 Hello, Solr!
 
 
 We are having some performance issue when try to send documents for solr 
 to index. The repose time is very slow and unpredictable some time.
 
 
 Solr server is running on a quit powerful server, 32 cpus, 400GB RAM, 
 while 300 GB is reserved for solr, while this happening, cpu usage is 
 around 30%, mem usage is 34%.  io also look ok according to iotop. SSD 
 disk.
 
 
 Our application send 100 documents to solr per request, json encoded. the 
 size is around 5M each time. some times the response time is under 1 
 seconds, some times could be 300 seconds, the slow response happens very 
 often.
 
 
 "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 
 360ms; if 100 uncommited docs"
 
 
 There are around 100 clients sending those documents at the same time, but 
 each for the client is blocking call which wait the http response then 
 send the next one.
 
 
 I tried to make the number of documents smaller in one request, such as 
 20, but  still I see slow response time to time, like 80 seconds.
 
 
 Would you help to give some hint how improve the response time?  solr does 
 not seems very loaded, there must be a way to make the response faster.
 
 
 BRs
 
 //Aaron
 
 
 
> 



Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
Hello, Walter,

Thanks for the hint. It looks like size matters. Our document sizes are not 
fixed; there are many small documents. For example, at 59 KB per 10 documents, the 
response time is around 10 ms, which is pretty good, so I could let it send 
bigger batches and still get a reasonable response time.


I will try with a SolrCloud cluster; maybe I will get better speed there.


//Aaron


From: Walter Underwood 
Sent: Tuesday, March 19, 2019 3:29:17 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr index slow response

Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client 
threads, you should be able to drive CPU to over 90%.

Start with two client threads per CPU. That allows one thread to be sending 
data over the network while another is waiting for Solr to process the batch.

A couple of years ago, I was indexing a million docs per minute into a Solr 
Cloud cluster. I think that was four shards on instances with 16 CPUs, so it 
was 64 CPUs available for indexing. That was with Java 8, G1GC, and 8 GB of 
heap.

Your document are averaging about 50 kbytes, which is pretty big. Our documents 
average about 3.5 kbytes. A lot of the indexing work is handling the text, so 
those larger documents would be at least 10X slower than ours.

Are you doing atomic updates? That would slow things down a lot.

If you want to use G1GC, use the configuration I sent earlier.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 19, 2019, at 7:05 AM, Bernd Fehling  
> wrote:
>
> Isn't there somthing about largePageTables which must be enabled
> in JAVA and also supported by OS for such huge heaps?
>
> Just a guess.
>
> Am 19.03.19 um 15:01 schrieb Jörn Franke:
>> It could be an issue with jdk 8 that may not be suitable for such large 
>> heaps. Have more nodes with smaller heaps (eg 31 gb)
>>> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun :
>>>
>>> Hello, Solr!
>>>
>>>
>>> We are having some performance issue when try to send documents for solr to 
>>> index. The repose time is very slow and unpredictable some time.
>>>
>>>
>>> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM, while 
>>> 300 GB is reserved for solr, while this happening, cpu usage is around 30%, 
>>> mem usage is 34%.  io also look ok according to iotop. SSD disk.
>>>
>>>
>>> Our application send 100 documents to solr per request, json encoded. the 
>>> size is around 5M each time. some times the response time is under 1 
>>> seconds, some times could be 300 seconds, the slow response happens very 
>>> often.
>>>
>>>
>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 360ms; 
>>> if 100 uncommited docs"
>>>
>>>
>>> There are around 100 clients sending those documents at the same time, but 
>>> each for the client is blocking call which wait the http response then send 
>>> the next one.
>>>
>>>
>>> I tried to make the number of documents smaller in one request, such as 20, 
>>> but  still I see slow response time to time, like 80 seconds.
>>>
>>>
>>> Would you help to give some hint how improve the response time?  solr does 
>>> not seems very loaded, there must be a way to make the response faster.
>>>
>>>
>>> BRs
>>>
>>> //Aaron
>>>
>>>
>>>



Re: Need help on LTR

2019-03-19 Thread Roopa Rao
Do your feature definitions and the feature names used in the model match?

On Tue, Mar 19, 2019 at 10:17 AM Amjad Khan  wrote:

> Yes, I did.
>
> I can see the feature that I created by this
> schema/feature-store/exampleFeatureStore and it return me the features I
> created. But issue is when I try to put store-model.
>
> > On Mar 19, 2019, at 12:18 AM, Mohomed Rimash 
> wrote:
> >
> > Hi Amjad, After adding the libraries into the path, Did you restart the
> > SOLR ?
> >
> > On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> >
> >> I followed the Solr LTR Documentation
> >>
> >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
> >> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
> >>
> >> 1. Added library into the solr-config
> >> 
> >>   >> regex=".*\.jar" />
> >>  >> regex="solr-ltr-\d.*\.jar" />
> >> 2. Successfully added feature
> >> 3. Get schema to see feature is available
> >> 4. When I try to push model I see the error below, however I added the
> lib
> >> into solr-cofig
> >>
> >> Response
> >> {
> >>  "responseHeader":{
> >>"status":400,
> >>"QTime":1},
> >>  "error":{
> >>"metadata":[
> >>  "error-class","org.apache.solr.common.SolrException",
> >>  "root-error-class","java.lang.NullPointerException"],
> >>"msg":"org.apache.solr.ltr.model.ModelException: Model type does not
> >> exist org.apache.solr.ltr.model.LinearModel",
> >>"code":400}}
> >>
> >> Thanks
>
>


RE: Upgrading tika

2019-03-19 Thread Tannen, Lev (USAEO) [Contractor]
Thank you Jeremy,
I am not using Maven, but I took both jars from the same distribution of CXF, so 
they are supposed to be compatible.

-Original Message-
From: Branham, Jeremy (Experis)  
Sent: Tuesday, March 19, 2019 11:10 AM
To: solr-user@lucene.apache.org
Subject: Re: Upgrading tika

I’m not positive – But I think you should match the CXF jar versions.

"cxf-core-3.2.8.jar" 


org.apache.cxf
cxf-rt-frontend-jaxrs
3.2.8


 
Jeremy Branham
jb...@allstate.com

On 3/19/19, 10:03 AM, "levtannen"  wrote:

"cxf-core-3.2.8.jar" and "cxf-rt-fromtend-jaxrs-2.6.3.jar"



Re: Solr index slow response

2019-03-19 Thread Emir Arnautović
The fact that it is happening with a single client suggests that it is not 
about concurrency. If it is happening equally frequently, I would assume it is 
about the bulks - they might appear the same but might be significantly different 
from Solr's POV. Is it always an update, or always an append? If it is append, maybe try to 
isolate a bulk that is taking longer and repeat the same bulk multiple times to see 
if it is always slow.
Maybe also take a thread dump while a slow bulk is being processed; it might show you 
some pointers to where Solr is spending time. Or even try sending single-
doc bulks and see if some documents are significantly heavier than others.
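
For example, something like this while a slow bulk is in flight (assuming the JDK
tools are on the path; adjust how you find the Solr PID to your setup):

  SOLR_PID=$(pgrep -f start.jar | head -1)
  for i in 1 2 3; do
    jstack -l "$SOLR_PID" > threaddump-$i.txt
    sleep 5
  done

A few dumps taken a few seconds apart usually make it clear whether the update
threads are busy, blocked, or waiting on something else.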

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 19 Mar 2019, at 13:22, Aaron Yingcai Sun  wrote:
> 
> Yes, the same behavior even with a single thread client. The following page 
> says "In general, adding many documents per update request is faster than one 
> per update request."  but in reality, add many documents per request result 
> in much longer response time, it's not liner, response time of 100 docs per 
> request  is bigger than (the response time of 10 docs per request) * 10.
> 
> 
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor
> 
> SolrPerformanceFactors - Solr 
> Wiki
> wiki.apache.org
> Schema Design Considerations. indexed fields. The number of indexed fields 
> greatly increases the following: Memory usage during indexing ; Segment merge 
> time
> 
> 
> 
> 
> 
> From: Emir Arnautović 
> Sent: Tuesday, March 19, 2019 1:00:19 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
> 
> If you start indexing with just a single thread/client, do you still see slow 
> bulks?
> 
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> 
>> On 19 Mar 2019, at 12:54, Aaron Yingcai Sun  wrote:
>> 
>> "QTime" value is from the solr rest api response, extracted from the 
>> http/json payload.  The "Request time" is what I measured from client side, 
>> it's almost the same value as QTime, just some milliseconds difference.  I 
>> could provide tcpdump to prove that it is really solr slow response.
>> 
>> Those long response time is not really spikes, it's constantly happening, 
>> almost half of the request has such long delay.  The more document added in 
>> one request the more delay it has.
>> 
>> 
>> From: Emir Arnautović 
>> Sent: Tuesday, March 19, 2019 12:30:33 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr index slow response
>> 
>> Just to add different perspective here: how do you send documents to Solr? 
>> Are those log lines from your client? Maybe it is not Solr that is slow. 
>> Could it be network or client itself. If you have some dry run on client, 
>> maybe try running it without Solr to eliminate client from the suspects.
>> 
>> Do you observe similar spikes when you run indexing with less concurrent 
>> clients?
>> 
>> It is really hard to pinpoint the issue without looking at some monitoring 
>> tool.
>> 
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>> 
>> 
>> 
>>> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun  wrote:
>>> 
>>> We have around 80 million documents to index, total index size around 3TB,  
>>> I guess I'm not the first one to work with this big amount of data. with 
>>> such slow response time, the index process would take around 2 weeks. While 
>>> the system resource is not very loaded, there must be a way to speed it up.
>>> 
>>> 
>>> To Walter, I don't see why G1GC would improve this, we only do index, no 
>>> query in the background. There is no memory constraint.  it's more feel 
>>> like some internal thread are blocking each other.
>>> 
>>> 
>>> I used to run with more documents in one request, that give much worse 
>>> response time, 300 documents in one request could end up 20 minutes 
>>> response time, now I changed to max 10 documents in one request, still many 
>>> response time around 30 seconds, while some of them are very fast( ~100 
>>> ms).  How come there are such big difference? the documents size does not 
>>> have such big difference.
>>> 
>>> 
>>> I just want to speed it up since nothing seems to be overloaded.  Are there 
>>> any other faster way to index such big amount of data?
>>> 
>>> 
>>> BRs
>>> 
>>> //Aaron
>>> 
>>> 
>>> From: Walter Underwood 
>>> Sent: Monday, March 18, 2019 4:59:20 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Solr index slow response
>>> 
>>> Solr is not designed to have consistent response times for updates. You are 
>>> expecting Solr to do something t

Re: Upgrading tika

2019-03-19 Thread Branham, Jeremy (Experis)
I’m not positive – But I think you should match the CXF jar versions.

"cxf-core-3.2.8.jar" 


org.apache.cxf
cxf-rt-frontend-jaxrs
3.2.8


 
Jeremy Branham
jb...@allstate.com

On 3/19/19, 10:03 AM, "levtannen"  wrote:

"cxf-core-3.2.8.jar" and "cxf-rt-fromtend-jaxrs-2.6.3.jar"



Upgrading tika

2019-03-19 Thread levtannen
Hello community,
I am using Tika to extract the text content from pdf files before indexing
them. I used version 1.7 and it worked OK except it produced a lot warnings
like "Font not found". Now I am trying to move to the newer version 1.19.1
and I have problems finding all necessary dependencies. First it gave me the
exception that it cannot find the class 
"org/apache/cxf/jaxrs/ext/multipart/ContentDisposition". I have added jars
"cxf-core-3.2.8.jar" and "cxf-rt-fromtend-jaxrs-2.6.3.jar" from the cxf
project. Now it gives me an exception "java.lang.NoClassDefFoundError:
javax/ws/rs/core/MultivaluedMap".
Could anybody suggest which files I need in order to use the latest version of
Tika, and where to find them?
Thank you.
 





Re: Solr index slow response

2019-03-19 Thread Michael Gibney
I'll second Emir's suggestion to try disabling swap. "I doubt swap would
affect it since there is such huge free memory." -- sounds reasonable, but
has not been my experience, and the stats you sent indicate that swap is in
fact being used. Also, note that in many cases setting vm.swappiness=0 is
not equivalent to disabling swap (i.e., swapoff -a). If you're inclined to
try disabling swap, verify that it's successfully disabled by checking (and
re-checking) actual swap usage (that may sound obvious or trivial, but
relying on possibly-incorrect assumptions related to amount of free memory,
swappiness, etc. can be misleading). Good luck!
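
For instance (exact commands vary by distro, so treat this as a sketch):

  sudo swapoff -a        # disable swap
  cat /proc/swaps        # should list no active swap devices afterwards
  free -m                # the "Swap:" row should show 0 total / 0 used
  vmstat 1 5             # si/so columns should stay at 0 while indexing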

On Tue, Mar 19, 2019 at 10:29 AM Walter Underwood 
wrote:

> Indexing is CPU bound. If you have enough RAM, SSD disks, and enough
> client threads, you should be able to drive CPU to over 90%.
>
> Start with two client threads per CPU. That allows one thread to be
> sending data over the network while another is waiting for Solr to process
> the batch.
>
> A couple of years ago, I was indexing a million docs per minute into a
> Solr Cloud cluster. I think that was four shards on instances with 16 CPUs,
> so it was 64 CPUs available for indexing. That was with Java 8, G1GC, and 8
> GB of heap.
>
> Your document are averaging about 50 kbytes, which is pretty big. Our
> documents average about 3.5 kbytes. A lot of the indexing work is handling
> the text, so those larger documents would be at least 10X slower than ours.
>
> Are you doing atomic updates? That would slow things down a lot.
>
> If you want to use G1GC, use the configuration I sent earlier.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Mar 19, 2019, at 7:05 AM, Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de> wrote:
> >
> > Isn't there somthing about largePageTables which must be enabled
> > in JAVA and also supported by OS for such huge heaps?
> >
> > Just a guess.
> >
> > Am 19.03.19 um 15:01 schrieb Jörn Franke:
> >> It could be an issue with jdk 8 that may not be suitable for such large
> heaps. Have more nodes with smaller heaps (eg 31 gb)
> >>> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun :
> >>>
> >>> Hello, Solr!
> >>>
> >>>
> >>> We are having some performance issue when try to send documents for
> solr to index. The repose time is very slow and unpredictable some time.
> >>>
> >>>
> >>> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM,
> while 300 GB is reserved for solr, while this happening, cpu usage is
> around 30%, mem usage is 34%.  io also look ok according to iotop. SSD disk.
> >>>
> >>>
> >>> Our application send 100 documents to solr per request, json encoded.
> the size is around 5M each time. some times the response time is under 1
> seconds, some times could be 300 seconds, the slow response happens very
> often.
> >>>
> >>>
> >>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for
> 360ms; if 100 uncommited docs"
> >>>
> >>>
> >>> There are around 100 clients sending those documents at the same time,
> but each for the client is blocking call which wait the http response then
> send the next one.
> >>>
> >>>
> >>> I tried to make the number of documents smaller in one request, such
> as 20, but  still I see slow response time to time, like 80 seconds.
> >>>
> >>>
> >>> Would you help to give some hint how improve the response time?  solr
> does not seems very loaded, there must be a way to make the response faster.
> >>>
> >>>
> >>> BRs
> >>>
> >>> //Aaron
> >>>
> >>>
> >>>
>
>


Re: Solr index slow response

2019-03-19 Thread Walter Underwood
Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client 
threads, you should be able to drive CPU to over 90%.

Start with two client threads per CPU. That allows one thread to be sending 
data over the network while another is waiting for Solr to process the batch.

A couple of years ago, I was indexing a million docs per minute into a Solr 
Cloud cluster. I think that was four shards on instances with 16 CPUs, so it 
was 64 CPUs available for indexing. That was with Java 8, G1GC, and 8 GB of 
heap.

Your documents are averaging about 50 kbytes, which is pretty big. Our documents 
average about 3.5 kbytes. A lot of the indexing work is handling the text, so 
those larger documents would be at least 10X slower than ours.

Are you doing atomic updates? That would slow things down a lot.

If you want to use G1GC, use the configuration I sent earlier.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 19, 2019, at 7:05 AM, Bernd Fehling  
> wrote:
> 
> Isn't there somthing about largePageTables which must be enabled
> in JAVA and also supported by OS for such huge heaps?
> 
> Just a guess.
> 
> Am 19.03.19 um 15:01 schrieb Jörn Franke:
>> It could be an issue with jdk 8 that may not be suitable for such large 
>> heaps. Have more nodes with smaller heaps (eg 31 gb)
>>> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun :
>>> 
>>> Hello, Solr!
>>> 
>>> 
>>> We are having some performance issue when try to send documents for solr to 
>>> index. The repose time is very slow and unpredictable some time.
>>> 
>>> 
>>> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM, while 
>>> 300 GB is reserved for solr, while this happening, cpu usage is around 30%, 
>>> mem usage is 34%.  io also look ok according to iotop. SSD disk.
>>> 
>>> 
>>> Our application send 100 documents to solr per request, json encoded. the 
>>> size is around 5M each time. some times the response time is under 1 
>>> seconds, some times could be 300 seconds, the slow response happens very 
>>> often.
>>> 
>>> 
>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 360ms; 
>>> if 100 uncommited docs"
>>> 
>>> 
>>> There are around 100 clients sending those documents at the same time, but 
>>> each for the client is blocking call which wait the http response then send 
>>> the next one.
>>> 
>>> 
>>> I tried to make the number of documents smaller in one request, such as 20, 
>>> but  still I see slow response time to time, like 80 seconds.
>>> 
>>> 
>>> Would you help to give some hint how improve the response time?  solr does 
>>> not seems very loaded, there must be a way to make the response faster.
>>> 
>>> 
>>> BRs
>>> 
>>> //Aaron
>>> 
>>> 
>>> 



Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Yes, I did.

I can see the feature that I created via 
schema/feature-store/exampleFeatureStore, and it returns the features I 
created. But the issue is when I try to put the store-model.

> On Mar 19, 2019, at 12:18 AM, Mohomed Rimash  wrote:
> 
> Hi Amjad, After adding the libraries into the path, Did you restart the
> SOLR ?
> 
> On Tue, 19 Mar 2019 at 08:45, Amjad Khan  wrote:
> 
>> I followed the Solr LTR Documentation
>> 
>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html <
>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html>
>> 
>> 1. Added library into the solr-config
>> 
>>  > regex=".*\.jar" />
>> > regex="solr-ltr-\d.*\.jar" />
>> 2. Successfully added feature
>> 3. Get schema to see feature is available
>> 4. When I try to push model I see the error below, however I added the lib
>> into solr-cofig
>> 
>> Response
>> {
>>  "responseHeader":{
>>"status":400,
>>"QTime":1},
>>  "error":{
>>"metadata":[
>>  "error-class","org.apache.solr.common.SolrException",
>>  "root-error-class","java.lang.NullPointerException"],
>>"msg":"org.apache.solr.ltr.model.ModelException: Model type does not
>> exist org.apache.solr.ltr.model.LinearModel",
>>"code":400}}
>> 
>> Thanks



Re: Need help on LTR

2019-03-19 Thread Amjad Khan
Hi,

Yes, I did restart the Solr server with this JVM param.
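
For comparison, the reference guide's example directives are roughly the
following (paths assume a default install layout, so they may need adjusting):

  <lib dir="${solr.install.dir:../../../..}/contrib/ltr/lib/" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-ltr-\d.*\.jar" />

and the node is started with the feature flag, e.g.:

  bin/solr start -Dsolr.ltr.enabled=true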

> On Mar 19, 2019, at 3:35 AM, Jörn Franke  wrote:
> 
> Did you add the option -Dsolr.ltr.enabled=true ?
> 
>> Am 19.03.2019 um 04:15 schrieb Amjad Khan :
>> 
>> I followed the Solr LTR Documentation 
>> 
>> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html 
>> 
>> 
>> 1. Added library into the solr-config
>> 
>> > />
>> > />
>> 2. Successfully added feature
>> 3. Get schema to see feature is available
>> 4. When I try to push model I see the error below, however I added the lib 
>> into solr-cofig
>> 
>> Response
>> {
>> "responseHeader":{
>>   "status":400,
>>   "QTime":1},
>> "error":{
>>   "metadata":[
>> "error-class","org.apache.solr.common.SolrException",
>> "root-error-class","java.lang.NullPointerException"],
>>   "msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
>> org.apache.solr.ltr.model.LinearModel",
>>   "code":400}}
>> 
>> Thanks



Re: Solr index slow response

2019-03-19 Thread Bernd Fehling

Isn't there something about largePageTables which must be enabled
in Java and also supported by the OS for such huge heaps?

Just a guess.

On 19.03.19 at 15:01, Jörn Franke wrote:

It could be an issue with jdk 8 that may not be suitable for such large heaps. 
Have more nodes with smaller heaps (eg 31 gb)


On 18.03.2019 at 11:47, Aaron Yingcai Sun wrote:

Hello, Solr!


We are having some performance issue when try to send documents for solr to 
index. The repose time is very slow and unpredictable some time.


Solr server is running on a quit powerful server, 32 cpus, 400GB RAM, while 300 
GB is reserved for solr, while this happening, cpu usage is around 30%, mem 
usage is 34%.  io also look ok according to iotop. SSD disk.


Our application send 100 documents to solr per request, json encoded. the size 
is around 5M each time. some times the response time is under 1 seconds, some 
times could be 300 seconds, the slow response happens very often.


"Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 360ms; if 
100 uncommited docs"


There are around 100 clients sending those documents at the same time, but each 
for the client is blocking call which wait the http response then send the next 
one.


I tried to make the number of documents smaller in one request, such as 20, but 
 still I see slow response time to time, like 80 seconds.


Would you help to give some hint how improve the response time?  solr does not 
seems very loaded, there must be a way to make the response faster.


BRs

//Aaron





Re: Re: obfuscated password error

2019-03-19 Thread Satya Marivada
Hi Jeremy,

Thanks for the points. Yes, agreed that there is some conflicting property
somewhere that is not letting it work. So I basically restored the solr-6.3.0
directory from another environment and replaced the host name appropriately
for this environment. I used the original keystore that had been
generated for this environment, and it worked fine. So basically the
keystore is good as well, except that there is some conflicting property
which is not letting the deobfuscation work correctly.
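
One way to rule the OBF string itself in or out would be to decode it with the
same jetty-util jar, roughly like this (just a sketch; paste the real OBF value in):

  import org.eclipse.jetty.util.security.Password;

  public class CheckObf {
      public static void main(String[] args) {
          // should print the original plain-text password
          System.out.println(Password.deobfuscate("OBF:..."));
      }
  }

  javac -cp jetty-util-9.3.8.v20160314.jar CheckObf.java
  java  -cp .:jetty-util-9.3.8.v20160314.jar CheckObf

If that prints the expected password on the failing host, the OBF string is fine
and the problem is elsewhere in that environment's configuration.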

Thanks,
Satya

On Mon, Mar 18, 2019 at 2:32 PM Branham, Jeremy (Experis) <
jb...@allstate.com> wrote:

> I’m not sure if you are sharing the trust/keystores, so I may be off-base
> here…
>
> Some thoughts –
> - Verify your VM arguments, to be sure there aren’t conflicting SSL
> properties.
> - Verify the environment is targeting the correct version of Java
> - Verify the trust/key stores exist where they are expected, and you can
> list the contents with the keytool
> - Verify the correct CA certs are trusted
>
>
> Jeremy Branham
> jb...@allstate.com
>
> On 3/18/19, 1:08 PM, "Satya Marivada"  wrote:
>
> Any suggestions please.
>
> Thanks,
> Satya
>
> On Mon, Mar 18, 2019 at 11:12 AM Satya Marivada <
> satya.chaita...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > Using solr-6.3.0, to obfuscate the password, have used jetty util to
> > generate obfuscated password
> >
> >
> > java -cp jetty-util-9.3.8.v20160314.jar
> > org.eclipse.jetty.util.security.Password mypassword
> >
> >
> > The output has been used in
> https://urldefense.proofpoint.com/v2/url?u=http-3A__solr.in.sh&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=YtmCJK2U90u6mqx-FOmBS5nqy03luM2J-Zc_LhImnG0&e=
> as below
> >
> >
> >
> >
> SOLR_SSL_KEY_STORE=/sanfs/mnt/vol01/solr/solr-6.3.0/server/etc/solr-ssl.keystore.jks
> >
> >
> SOLR_SSL_KEY_STORE_PASSWORD="OBF:1bcd1l161lts1ltu1uum1uvk1lq41lq61k221b9t"
> >
> >
> >
> SOLR_SSL_TRUST_STORE=/sanfs/mnt/vol01/solr/solr-6.3.0/server/etc/solr-ssl.keystore.jks
> >
> >
> >
> SOLR_SSL_TRUST_STORE_PASSWORD="OBF:1bcd1l161lts1ltu1uum1uvk1lq41lq61k221b9t"
> >
> > Solr does not start fine with below exception, any suggestions? If I
> use
> > the plain text password, it works fine. One more thing is that the
> same
> > setup with obfuscated password works in other environments except
> one which
> > got this exception. Recently system level patches are applied, just
> saying
> > though dont think that could have impact,
> >
> > Caused by: java.net.SocketException:
> > java.security.NoSuchAlgorithmException: Error constructing
> implementation
> > (algorithm: Default, provider: SunJSSE, class:
> sun.security.ssl.SSLContextIm
> > pl$DefaultSSLContext)
> > at
> > javax.net.ssl.DefaultSSLSocketFactory.throwException(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:248
> )
> > at
> > javax.net.ssl.DefaultSSLSocketFactory.createSocket(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:255
> )
> > at
> > org.apache.http.conn.ssl.SSLSocketFactory.createSocket(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:513
> )
> > at
> > org.apache.http.conn.ssl.SSLSocketFactory.createSocket(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:383
> )
> > at
> >
> org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__DefaultClientConnectionOperator.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=EATR9hBi7P9kYpCcJ8maLn81bHA72GhhvwWQY0V9EQw&e=:165
> )
> > at
> > org.apache.http.impl.conn.ManagedClientConnectionImpl.open(
> https://urldefense.proofpoint.com/v2/url?u=http-3A__ManagedClientConnectionImpl.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=yuCHQjzNKMtl0uWKiDWB01ChPkiY1tCaPX8n8lhdR-s&e=:304
> )

Re: Solr index slow response

2019-03-19 Thread Jörn Franke
It could be an issue with JDK 8, which may not be suitable for such large heaps. 
Have more nodes with smaller heaps (e.g. 31 GB).

> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun :
> 
> Hello, Solr!
> 
> 
> We are having some performance issue when try to send documents for solr to 
> index. The repose time is very slow and unpredictable some time.
> 
> 
> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM, while 
> 300 GB is reserved for solr, while this happening, cpu usage is around 30%, 
> mem usage is 34%.  io also look ok according to iotop. SSD disk.
> 
> 
> Our application send 100 documents to solr per request, json encoded. the 
> size is around 5M each time. some times the response time is under 1 seconds, 
> some times could be 300 seconds, the slow response happens very often.
> 
> 
> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 360ms; 
> if 100 uncommited docs"
> 
> 
> There are around 100 clients sending those documents at the same time, but 
> each for the client is blocking call which wait the http response then send 
> the next one.
> 
> 
> I tried to make the number of documents smaller in one request, such as 20, 
> but  still I see slow response time to time, like 80 seconds.
> 
> 
> Would you help to give some hint how improve the response time?  solr does 
> not seems very loaded, there must be a way to make the response faster.
> 
> 
> BRs
> 
> //Aaron
> 
> 
> 


Re: Spellchecker -File based vs Index based

2019-03-19 Thread Ashish Bisht
The spellcheck configuration is the default one:


solr.FileBasedSpellChecker
file
spellings.txt
UTF-8
./spellcheckerFile




  default
  jkdefault
  file
  on
  true
  10
  5
  5
  true
  10
  true
  10
  5


Also, the words are present in the file. For example, the word 'things', which is
being corrected, is present in the file, and the suggestions related to it are
present as well.

I don't want suggestions for correctly spelled words (of, things). Is there any
problem with the request? I tried two combinations:

1./spell?spellcheck.q=intnet of
things&spellcheck=true&spellcheck.collateParam.q.op=AND&df=spellcontent&spellcheck.dictionary=file

2./spell?q=intnet of
things&defType=edismax&qf=spellcontent&wt=json&rows=0&&spellcheck=true&spellcheck.dictionary=file&q.op=AND

Please suggest






Re: Re: obfuscated password error

2019-03-19 Thread Satya Marivada
It has been generated with the plain password, the same as in the other environments,
but it works in those other environments.

Thanks,
Satya

On Mon, Mar 18, 2019, 10:42 PM Zheng Lin Edwin Yeo 
wrote:

> Hi,
>
> Did you generate your keystore with the obfuscated password or the plain
> text password?
>
> Regards,
> Edwin
>
> On Tue, 19 Mar 2019 at 02:32, Branham, Jeremy (Experis) <
> jb...@allstate.com>
> wrote:
>
> > I’m not sure if you are sharing the trust/keystores, so I may be off-base
> > here…
> >
> > Some thoughts –
> > - Verify your VM arguments, to be sure there aren’t conflicting SSL
> > properties.
> > - Verify the environment is targeting the correct version of Java
> > - Verify the trust/key stores exist where they are expected, and you can
> > list the contents with the keytool
> > - Verify the correct CA certs are trusted
> >
> >
> > Jeremy Branham
> > jb...@allstate.com
> >
> > On 3/18/19, 1:08 PM, "Satya Marivada"  wrote:
> >
> > Any suggestions please.
> >
> > Thanks,
> > Satya
> >
> > On Mon, Mar 18, 2019 at 11:12 AM Satya Marivada <
> > satya.chaita...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Using solr-6.3.0, to obfuscate the password, have used jetty util
> to
> > > generate obfuscated password
> > >
> > >
> > > java -cp jetty-util-9.3.8.v20160314.jar
> > > org.eclipse.jetty.util.security.Password mypassword
> > >
> > >
> > > The output has been used in
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__solr.in.sh&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=YtmCJK2U90u6mqx-FOmBS5nqy03luM2J-Zc_LhImnG0&e=
> > as below
> > >
> > >
> > >
> > >
> >
> SOLR_SSL_KEY_STORE=/sanfs/mnt/vol01/solr/solr-6.3.0/server/etc/solr-ssl.keystore.jks
> > >
> > >
> >
> SOLR_SSL_KEY_STORE_PASSWORD="OBF:1bcd1l161lts1ltu1uum1uvk1lq41lq61k221b9t"
> > >
> > >
> > >
> >
> SOLR_SSL_TRUST_STORE=/sanfs/mnt/vol01/solr/solr-6.3.0/server/etc/solr-ssl.keystore.jks
> > >
> > >
> > >
> >
> SOLR_SSL_TRUST_STORE_PASSWORD="OBF:1bcd1l161lts1ltu1uum1uvk1lq41lq61k221b9t"
> > >
> > > Solr does not start fine with below exception, any suggestions? If
> I
> > use
> > > the plain text password, it works fine. One more thing is that the
> > same
> > > setup with obfuscated password works in other environments except
> > one which
> > > got this exception. Recently system level patches are applied, just
> > saying
> > > though dont think that could have impact,
> > >
> > > Caused by: java.net.SocketException:
> > > java.security.NoSuchAlgorithmException: Error constructing
> > implementation
> > > (algorithm: Default, provider: SunJSSE, class:
> > sun.security.ssl.SSLContextIm
> > > pl$DefaultSSLContext)
> > > at
> > > javax.net.ssl.DefaultSSLSocketFactory.throwException(
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:248
> > )
> > > at
> > > javax.net.ssl.DefaultSSLSocketFactory.createSocket(
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:255
> > )
> > > at
> > > org.apache.http.conn.ssl.SSLSocketFactory.createSocket(
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:513
> > )
> > > at
> > > org.apache.http.conn.ssl.SSLSocketFactory.createSocket(
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__SSLSocketFactory.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=dud5QRNkwTMDiH04sCjNs1U9_5t8wBMxJNiyQRdjXRk&e=:383
> > )
> > > at
> > >
> > org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__DefaultClientConnectionOperator.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=EATR9hBi7P9kYpCcJ8maLn81bHA72GhhvwWQY0V9EQw&e=:165
> > )
> > > at
> > > org.apache.http.impl.conn.ManagedClientConnectionImpl.open(
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__ManagedClientConnectionImpl.java&d=DwIBaQ&c=gtIjdLs6LnStUpy9cTOW9w&r=0SwsmPELGv6GC1_5JSQ9T7ZPMLljrIkbF_2jBCrKXI0&m=Ix7ZcyM45ms93i2fWx4SNPgiLA7TGHVDOjCklcxbvLs&s=yuCHQjzNKMtl0uWKiDWB01ChPkiY1t

Re: Solr index slow response

2019-03-19 Thread Chris Ulicny
Do you know what is causing the 30% CPU usage? That seems awfully high for
Solr when only indexing (unless it is iowait), but it could be expected based
on any custom update processors and analyzers you may have.

Are you putting all of the documents into a single core or multiple? Also,
are you using SATA or PCIe solid state drives?

We have a similar index situation: about 65 million documents that take up
about 4.5 TB for a single copy of the index. We've never tried to put all of
the documents into a single Solr core before, so I don't know how well that
would work.
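
A quick way to separate user CPU from iowait while an indexing run is going
(assuming the sysstat tools are installed; the PID is whatever the Solr JVM runs as):

  mpstat 1 5               # %usr vs %iowait across CPUs
  iostat -x 1 5            # per-device %util and await; high values point at the disks
  top -H -p <solr-pid>     # per-thread CPU inside the Solr JVM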

On Tue, Mar 19, 2019 at 8:28 AM Aaron Yingcai Sun  wrote:

> Yes, the same behavior even with a single thread client. The following
> page says "In general, adding many documents per update request is faster
> than one per update request."  but in reality, add many documents per
> request result in much longer response time, it's not liner, response time
> of 100 docs per request  is bigger than (the response time of 10 docs per
> request) * 10.
>
>
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor
>
> SolrPerformanceFactors - Solr Wiki<
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor>
> wiki.apache.org
> Schema Design Considerations. indexed fields. The number of indexed fields
> greatly increases the following: Memory usage during indexing ; Segment
> merge time
>
>
>
>
> 
> From: Emir Arnautović 
> Sent: Tuesday, March 19, 2019 1:00:19 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
>
> If you start indexing with just a single thread/client, do you still see
> slow bulks?
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 19 Mar 2019, at 12:54, Aaron Yingcai Sun  wrote:
> >
> > "QTime" value is from the solr rest api response, extracted from the
> http/json payload.  The "Request time" is what I measured from client side,
> it's almost the same value as QTime, just some milliseconds difference.  I
> could provide tcpdump to prove that it is really solr slow response.
> >
> > Those long response time is not really spikes, it's constantly
> happening, almost half of the request has such long delay.  The more
> document added in one request the more delay it has.
> >
> > 
> > From: Emir Arnautović 
> > Sent: Tuesday, March 19, 2019 12:30:33 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr index slow response
> >
> > Just to add different perspective here: how do you send documents to
> Solr? Are those log lines from your client? Maybe it is not Solr that is
> slow. Could it be network or client itself. If you have some dry run on
> client, maybe try running it without Solr to eliminate client from the
> suspects.
> >
> > Do you observe similar spikes when you run indexing with less concurrent
> clients?
> >
> > It is really hard to pinpoint the issue without looking at some
> monitoring tool.
> >
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> >> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun  wrote:
> >>
> >> We have around 80 million documents to index, total index size around
> 3TB,  I guess I'm not the first one to work with this big amount of data.
> with such slow response time, the index process would take around 2 weeks.
> While the system resource is not very loaded, there must be a way to speed
> it up.
> >>
> >>
> >> To Walter, I don't see why G1GC would improve this, we only do index,
> no query in the background. There is no memory constraint.  it's more feel
> like some internal thread are blocking each other.
> >>
> >>
> >> I used to run with more documents in one request, that give much worse
> response time, 300 documents in one request could end up 20 minutes
> response time, now I changed to max 10 documents in one request, still many
> response time around 30 seconds, while some of them are very fast( ~100
> ms).  How come there are such big difference? the documents size does not
> have such big difference.
> >>
> >>
> >> I just want to speed it up since nothing seems to be overloaded.  Are
> there any other faster way to index such big amount of data?
> >>
> >>
> >> BRs
> >>
> >> //Aaron
> >>
> >> 
> >> From: Walter Underwood 
> >> Sent: Monday, March 18, 2019 4:59:20 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Solr index slow response
> >>
> >> Solr is not designed to have consistent response times for updates. You
> are expecting Solr to do something that it does not do.
> >>
> >> About Xms and Xmx, the JVM will continue to allocate memory until it
> hits the max. After it hits the max, it will start to collect garbage. A
> smaller Xms just wastes time doing allocations after the JVM is running.
> Avoid that by making Xm

Connect to multiple Solr Servers via Spring boot

2019-03-19 Thread Rushikesh Garadade
Hi,
I have a multitenant application on Spring Boot, where we want to search data
from different Solr deployments for different tenants.

I have referred to many sites on the internet; they provide a solution that takes
the server URL from a properties file. But if we take it from a properties file,
then it will always be one Solr.
The code is as follows:

@Configuration
@EnableSolrRepositories
@ComponentScan
public class SolrConectionUtils {

    @Value("${spring.data.solr.host}")
    private String solrConnectionUrl;

    @Bean
    public SolrClient solrClient() {
        System.out.println("Connection URL: " + solrConnectionUrl);
        SolrClient solrClient = new HttpSolrClient.Builder(solrConnectionUrl).build();
        return solrClient;
    }

    @Bean
    public SolrTemplate solrTemplate() throws Exception {
        SolrClient httpSolrClient = solrClient();
        return solrTemplate(httpSolrClient);
    }

    @Bean
    public SolrTemplate solrTemplate(SolrClient httpSolrClient) throws Exception {
        return new SolrTemplate(httpSolrClient);
    }
}



In order to achieve the multi-Solr-deployment requirement, we have planned to
store the server URL in MySQL per tenant, i.e. instead of a properties file we
will fetch the connection URL from MySQL.
Following is the code:

@Configuration
@EnableSolrRepositories
@ComponentScan
public class SolrConectionUtils {

    @Autowired
    private DataBaseConnectionPropertiesService dataBaseConnectionPropertiesService;

    @Autowired
    private WebRequestContext webRequestContext;

    @Autowired
    private JobContext jobContext;

    @Autowired
    private AppSourceService appSourceService;

    private String solrConnectionUrl;

    private Long tenantId;

    public String Connection() {
        // CODE TO GET SOLR URL FROM MYSQL TABLE
    }

    public SolrClient solrClient() {
        solrConnectionUrl = Connection();
        System.out.println("Just TRy-" + solrConnectionUrl);
        HttpSolrClient solrClient = new HttpSolrClient.Builder(solrConnectionUrl).build();
        return solrClient;
    }

    public SolrTemplate solrTemplate() throws Exception {
        SolrClient client = this.solrClient();
        return solrTemplate(client);
    }

    @Bean
    public SolrTemplate solrTemplate(SolrClient client) throws Exception {
        return new SolrTemplate(client);
    }
}


We get the following error:

java.lang.NullPointerException
at
org.springframework.data.solr.core.SolrTemplate.querySolr(SolrTemplate.java:498)
at
org.springframework.data.solr.core.SolrTemplate.doQueryForPage(SolrTemplate.java:297)
at
org.springframework.data.solr.core.SolrTemplate.query(SolrTemplate.java:326)
at
com.solix.emailarchiving.email.solr.search.SolrTemplateWrapper.query(SolrTemplateWrapper.java:89)
at
com.solix.emailarchiving.email.solr.search.EmailSearchRepositoryImpl.findAllEmails(EmailSearchRepositoryImpl.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:338)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at
org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
at com.sun.proxy.$Proxy208.findAllEmails(Unknown Source)
at
com.solix.emailarchiving.email.EmailSolrServiceImpl.advanceSearch(EmailSolrServiceImpl.java:108)
at
com.solix.emailarchiving.email.EmailSolrServiceImpl$$FastClassBySpringCGLIB$$40f18538.invoke()
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:747)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at
org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)
at
com.solix.emailarchiving.annotation.ServiceTransactionAspect.beforeMethod(ServiceTransactionAspect.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:643)
at
org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:632)
at
org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdv
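
One possible direction, in case it helps frame the question (a rough sketch only;
TenantUrlLookup and the other names below are placeholders): keep one fully
initialized SolrTemplate per tenant instead of @Bean methods that depend on
request-time state, since a SolrTemplate built by hand seems to need the
initialization Spring would normally perform on a bean.

  import java.util.Map;
  import java.util.concurrent.ConcurrentHashMap;
  import org.apache.solr.client.solrj.SolrClient;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.springframework.data.solr.core.SolrTemplate;
  import org.springframework.stereotype.Component;

  @Component
  public class TenantSolrTemplates {

      // tenantId -> initialized template; TenantUrlLookup stands in for the MySQL lookup
      private final Map<Long, SolrTemplate> templates = new ConcurrentHashMap<>();
      private final TenantUrlLookup urlLookup;

      public TenantSolrTemplates(TenantUrlLookup urlLookup) {
          this.urlLookup = urlLookup;
      }

      public SolrTemplate forTenant(Long tenantId) {
          return templates.computeIfAbsent(tenantId, id -> {
              SolrClient client = new HttpSolrClient.Builder(urlLookup.solrUrlFor(id)).build();
              SolrTemplate template = new SolrTemplate(client);
              try {
                  template.afterPropertiesSet();   // the initialization Spring would do for a bean
              } catch (Exception e) {
                  throw new IllegalStateException("Could not init SolrTemplate for tenant " + id, e);
              }
              return template;
          });
      }
  }

Callers would then use tenantSolrTemplates.forTenant(tenantId).query(...) instead of
injecting a single SolrTemplate bean.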

Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
Yes, the same behavior even with a single-threaded client. The following page 
says "In general, adding many documents per update request is faster than one 
per update request.", but in reality adding many documents per request results in 
much longer response times, and it is not linear: the response time for 100 docs per 
request is bigger than (the response time for 10 docs per request) * 10.


https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor
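
For reference, the style of batched request being discussed looks roughly like this
(collection and field names here are made up):

  curl 'http://localhost:8983/solr/mycollection/update?commitWithin=60000' \
       -H 'Content-Type: application/json' \
       --data-binary '[
         {"id": "doc-1", "title_t": "first document"},
         {"id": "doc-2", "title_t": "second document"}
       ]'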






From: Emir Arnautović 
Sent: Tuesday, March 19, 2019 1:00:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr index slow response

If you start indexing with just a single thread/client, do you still see slow 
bulks?

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 19 Mar 2019, at 12:54, Aaron Yingcai Sun  wrote:
>
> "QTime" value is from the solr rest api response, extracted from the 
> http/json payload.  The "Request time" is what I measured from client side, 
> it's almost the same value as QTime, just some milliseconds difference.  I 
> could provide tcpdump to prove that it is really solr slow response.
>
> Those long response time is not really spikes, it's constantly happening, 
> almost half of the request has such long delay.  The more document added in 
> one request the more delay it has.
>
> 
> From: Emir Arnautović 
> Sent: Tuesday, March 19, 2019 12:30:33 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
>
> Just to add different perspective here: how do you send documents to Solr? 
> Are those log lines from your client? Maybe it is not Solr that is slow. 
> Could it be network or client itself. If you have some dry run on client, 
> maybe try running it without Solr to eliminate client from the suspects.
>
> Do you observe similar spikes when you run indexing with less concurrent 
> clients?
>
> It is really hard to pinpoint the issue without looking at some monitoring 
> tool.
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
>> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun  wrote:
>>
>> We have around 80 million documents to index, total index size around 3TB,  
>> I guess I'm not the first one to work with this big amount of data. with 
>> such slow response time, the index process would take around 2 weeks. While 
>> the system resource is not very loaded, there must be a way to speed it up.
>>
>>
>> To Walter, I don't see why G1GC would improve this, we only do index, no 
>> query in the background. There is no memory constraint.  it's more feel like 
>> some internal thread are blocking each other.
>>
>>
>> I used to run with more documents in one request, that give much worse 
>> response time, 300 documents in one request could end up 20 minutes response 
>> time, now I changed to max 10 documents in one request, still many response 
>> time around 30 seconds, while some of them are very fast( ~100 ms).  How 
>> come there are such big difference? the documents size does not have such 
>> big difference.
>>
>>
>> I just want to speed it up since nothing seems to be overloaded.  Are there 
>> any other faster way to index such big amount of data?
>>
>>
>> BRs
>>
>> //Aaron
>>
>> 
>> From: Walter Underwood 
>> Sent: Monday, March 18, 2019 4:59:20 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr index slow response
>>
>> Solr is not designed to have consistent response times for updates. You are 
>> expecting Solr to do something that it does not do.
>>
>> About Xms and Xmx, the JVM will continue to allocate memory until it hits 
>> the max. After it hits the max, it will start to collect garbage. A smaller 
>> Xms just wastes time doing allocations after the JVM is running. Avoid that 
>> by making Xms and Xmx the same.
>>
>> We run all of our JVMs with 8 GB of heap and the G1 collector. You probably 
>> do not need more than 8 GB unless you are doing high-cardinality facets or 
>> some other memory-hungry querying.
>>
>> The first step would be to use a good configuration. We start our Java 8 
>> JVMs with these parameters:
>>
>> SOLR_HEAP=8g
>> # Use G1 GC  -- wunder 2017-01-23
>> # Settings from https://wiki.apache.org/solr/ShawnHeisey
>> GC_TUNE=" \
>> -XX:+UseG1GC \
>> -XX:+ParallelRefProcEnabled \
>> -XX:G1HeapRegionSize=8m \
>> -XX:MaxGCPauseMillis=200 \
>> -XX:+UseLargePages \
>> -XX:+AggressiveOpts \
>> "
>>
>> Use SSD for disks, with total space about 3X as big as the expected index 
>> size.
>>
>> Have RAM not used by Solr

Re: Solr index slow response

2019-03-19 Thread Emir Arnautović
If you start indexing with just a single thread/client, do you still see slow 
bulks?

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 19 Mar 2019, at 12:54, Aaron Yingcai Sun  wrote:
> 
> "QTime" value is from the solr rest api response, extracted from the 
> http/json payload.  The "Request time" is what I measured from client side, 
> it's almost the same value as QTime, just some milliseconds difference.  I 
> could provide tcpdump to prove that it is really solr slow response.
> 
> Those long response time is not really spikes, it's constantly happening, 
> almost half of the request has such long delay.  The more document added in 
> one request the more delay it has.
> 
> 
> From: Emir Arnautović 
> Sent: Tuesday, March 19, 2019 12:30:33 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
> 
> Just to add different perspective here: how do you send documents to Solr? 
> Are those log lines from your client? Maybe it is not Solr that is slow. 
> Could it be network or client itself. If you have some dry run on client, 
> maybe try running it without Solr to eliminate client from the suspects.
> 
> Do you observe similar spikes when you run indexing with less concurrent 
> clients?
> 
> It is really hard to pinpoint the issue without looking at some monitoring 
> tool.
> 
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> 
>> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun  wrote:
>> 
>> We have around 80 million documents to index, total index size around 3TB,  
>> I guess I'm not the first one to work with this big amount of data. with 
>> such slow response time, the index process would take around 2 weeks. While 
>> the system resource is not very loaded, there must be a way to speed it up.
>> 
>> 
>> To Walter, I don't see why G1GC would improve this, we only do index, no 
>> query in the background. There is no memory constraint.  it's more feel like 
>> some internal thread are blocking each other.
>> 
>> 
>> I used to run with more documents in one request, that give much worse 
>> response time, 300 documents in one request could end up 20 minutes response 
>> time, now I changed to max 10 documents in one request, still many response 
>> time around 30 seconds, while some of them are very fast( ~100 ms).  How 
>> come there are such big difference? the documents size does not have such 
>> big difference.
>> 
>> 
>> I just want to speed it up since nothing seems to be overloaded.  Are there 
>> any other faster way to index such big amount of data?
>> 
>> 
>> BRs
>> 
>> //Aaron
>> 
>> 
>> From: Walter Underwood 
>> Sent: Monday, March 18, 2019 4:59:20 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr index slow response
>> 
>> Solr is not designed to have consistent response times for updates. You are 
>> expecting Solr to do something that it does not do.
>> 
>> About Xms and Xmx, the JVM will continue to allocate memory until it hits 
>> the max. After it hits the max, it will start to collect garbage. A smaller 
>> Xms just wastes time doing allocations after the JVM is running. Avoid that 
>> by making Xms and Xmx the same.
>> 
>> We run all of our JVMs with 8 GB of heap and the G1 collector. You probably 
>> do not need more than 8 GB unless you are doing high-cardinality facets or 
>> some other memory-hungry querying.
>> 
>> The first step would be to use a good configuration. We start our Java 8 
>> JVMs with these parameters:
>> 
>> SOLR_HEAP=8g
>> # Use G1 GC  -- wunder 2017-01-23
>> # Settings from https://wiki.apache.org/solr/ShawnHeisey
>> GC_TUNE=" \
>> -XX:+UseG1GC \
>> -XX:+ParallelRefProcEnabled \
>> -XX:G1HeapRegionSize=8m \
>> -XX:MaxGCPauseMillis=200 \
>> -XX:+UseLargePages \
>> -XX:+AggressiveOpts \
>> "
>> 
>> Use SSD for disks, with total space about 3X as big as the expected index 
>> size.
>> 
>> Have RAM not used by Solr or the OS that is equal to the expected index size.
>> 
>> After that, let’s figure out what the real requirement is. If you must have 
>> consistent response times for update requests, you’ll need to do that 
>> outside of Solr. But if you need high data import rates, we can probably 
>> help.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun  wrote:
>>> 
>>> Hello, Chris
>>> 
>>> 
>>> Thanks for the tips.  So I tried to set it as you suggested, not see too 
>>> much improvement.  Since I don't need it visible immediately, softCommit is 
>>> disabled totally.
>>> 
>>> The slow response is happening every few seconds,  if it happens hourly I 
>>> would suspect the hourly auto-commit.  But it happen much more frequently.  
>>> I don't see any CPU/RAM/

Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
"QTime" value is from the solr rest api response, extracted from the http/json 
payload. The "Request time" is what I measured on the client side; it is 
almost the same value as QTime, just a few milliseconds apart. I could provide 
a tcpdump to prove that it really is Solr responding slowly.

Those long response times are not really spikes; they happen constantly, and 
almost half of the requests have such a long delay. The more documents are 
added in one request, the longer the delay.


From: Emir Arnautović 
Sent: Tuesday, March 19, 2019 12:30:33 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr index slow response

Just to add different perspective here: how do you send documents to Solr? Are 
those log lines from your client? Maybe it is not Solr that is slow. Could it 
be network or client itself. If you have some dry run on client, maybe try 
running it without Solr to eliminate client from the suspects.

Do you observe similar spikes when you run indexing with less concurrent 
clients?

It is really hard to pinpoint the issue without looking at some monitoring tool.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun  wrote:
>
> We have around 80 million documents to index, total index size around 3TB,  I 
> guess I'm not the first one to work with this big amount of data. with such 
> slow response time, the index process would take around 2 weeks. While the 
> system resource is not very loaded, there must be a way to speed it up.
>
>
> To Walter, I don't see why G1GC would improve this, we only do index, no 
> query in the background. There is no memory constraint.  it's more feel like 
> some internal thread are blocking each other.
>
>
> I used to run with more documents in one request, that give much worse 
> response time, 300 documents in one request could end up 20 minutes response 
> time, now I changed to max 10 documents in one request, still many response 
> time around 30 seconds, while some of them are very fast( ~100 ms).  How come 
> there are such big difference? the documents size does not have such big 
> difference.
>
>
> I just want to speed it up since nothing seems to be overloaded.  Are there 
> any other faster way to index such big amount of data?
>
>
> BRs
>
> //Aaron
>
> 
> From: Walter Underwood 
> Sent: Monday, March 18, 2019 4:59:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
>
> Solr is not designed to have consistent response times for updates. You are 
> expecting Solr to do something that it does not do.
>
> About Xms and Xmx, the JVM will continue to allocate memory until it hits the 
> max. After it hits the max, it will start to collect garbage. A smaller Xms 
> just wastes time doing allocations after the JVM is running. Avoid that by 
> making Xms and Xmx the same.
>
> We run all of our JVMs with 8 GB of heap and the G1 collector. You probably 
> do not need more than 8 GB unless you are doing high-cardinality facets or 
> some other memory-hungry querying.
>
> The first step would be to use a good configuration. We start our Java 8 JVMs 
> with these parameters:
>
> SOLR_HEAP=8g
> # Use G1 GC  -- wunder 2017-01-23
> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> GC_TUNE=" \
> -XX:+UseG1GC \
> -XX:+ParallelRefProcEnabled \
> -XX:G1HeapRegionSize=8m \
> -XX:MaxGCPauseMillis=200 \
> -XX:+UseLargePages \
> -XX:+AggressiveOpts \
> "
>
> Use SSD for disks, with total space about 3X as big as the expected index 
> size.
>
> Have RAM not used by Solr or the OS that is equal to the expected index size.
>
> After that, let’s figure out what the real requirement is. If you must have 
> consistent response times for update requests, you’ll need to do that outside 
> of Solr. But if you need high data import rates, we can probably help.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun  wrote:
>>
>> Hello, Chris
>>
>>
>> Thanks for the tips.  So I tried to set it as you suggested, not see too 
>> much improvement.  Since I don't need it visible immediately, softCommit is 
>> disabled totally.
>>
>> The slow response is happening every few seconds,  if it happens hourly I 
>> would suspect the hourly auto-commit.  But it happen much more frequently.  
>> I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS level.  It just 
>> looks like solr server is blocking internally itself.
>>
>>
>> <   ${solr.autoCommit.maxTime:360}
>> ---
>>> ${solr.autoCommit.maxTime:15000}
>> 16c16
>> <   true
>> ---
>>> false
>>
>>
>>
>> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 
>> 200, QTime: 1405 ms, Request time: 1407 ms.
>> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 
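
For context, the autoCommit fragment being diffed in the quoted message above 
normally lives in solrconfig.xml roughly as in the sketch below; the values 
shown are illustrative assumptions, not the exact settings from this thread:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- hard commit: flush segments to disk, but do not open a new searcher -->
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <!-- -1 disables soft commits entirely, as described in the thread -->
    <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
  </autoSoftCommit>
</updateHandler>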

Re: Solr index slow response

2019-03-19 Thread Emir Arnautović
Just to add a different perspective here: how do you send documents to Solr? 
Are those log lines from your client? Maybe it is not Solr that is slow; could 
it be the network or the client itself? If you have a dry-run mode on the 
client, maybe try running it without Solr to eliminate the client from the suspects.

Do you observe similar spikes when you run indexing with less concurrent 
clients?

It is really hard to pinpoint the issue without looking at some monitoring tool.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun  wrote:
> 
> We have around 80 million documents to index, total index size around 3TB,  I 
> guess I'm not the first one to work with this big amount of data. with such 
> slow response time, the index process would take around 2 weeks. While the 
> system resource is not very loaded, there must be a way to speed it up.
> 
> 
> To Walter, I don't see why G1GC would improve this, we only do index, no 
> query in the background. There is no memory constraint.  it's more feel like 
> some internal thread are blocking each other.
> 
> 
> I used to run with more documents in one request, that give much worse 
> response time, 300 documents in one request could end up 20 minutes response 
> time, now I changed to max 10 documents in one request, still many response 
> time around 30 seconds, while some of them are very fast( ~100 ms).  How come 
> there are such big difference? the documents size does not have such big 
> difference.
> 
> 
> I just want to speed it up since nothing seems to be overloaded.  Are there 
> any other faster way to index such big amount of data?
> 
> 
> BRs
> 
> //Aaron
> 
> 
> From: Walter Underwood 
> Sent: Monday, March 18, 2019 4:59:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
> 
> Solr is not designed to have consistent response times for updates. You are 
> expecting Solr to do something that it does not do.
> 
> About Xms and Xmx, the JVM will continue to allocate memory until it hits the 
> max. After it hits the max, it will start to collect garbage. A smaller Xms 
> just wastes time doing allocations after the JVM is running. Avoid that by 
> making Xms and Xmx the same.
> 
> We run all of our JVMs with 8 GB of heap and the G1 collector. You probably 
> do not need more than 8 GB unless you are doing high-cardinality facets or 
> some other memory-hungry querying.
> 
> The first step would be to use a good configuration. We start our Java 8 JVMs 
> with these parameters:
> 
> SOLR_HEAP=8g
> # Use G1 GC  -- wunder 2017-01-23
> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> GC_TUNE=" \
> -XX:+UseG1GC \
> -XX:+ParallelRefProcEnabled \
> -XX:G1HeapRegionSize=8m \
> -XX:MaxGCPauseMillis=200 \
> -XX:+UseLargePages \
> -XX:+AggressiveOpts \
> "
> 
> Use SSD for disks, with total space about 3X as big as the expected index 
> size.
> 
> Have RAM not used by Solr or the OS that is equal to the expected index size.
> 
> After that, let’s figure out what the real requirement is. If you must have 
> consistent response times for update requests, you’ll need to do that outside 
> of Solr. But if you need high data import rates, we can probably help.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun  wrote:
>> 
>> Hello, Chris
>> 
>> 
>> Thanks for the tips.  So I tried to set it as you suggested, not see too 
>> much improvement.  Since I don't need it visible immediately, softCommit is 
>> disabled totally.
>> 
>> The slow response is happening every few seconds,  if it happens hourly I 
>> would suspect the hourly auto-commit.  But it happen much more frequently.  
>> I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS level.  It just 
>> looks like solr server is blocking internally itself.
>> 
>> 
>> <   ${solr.autoCommit.maxTime:360}
>> ---
>>> ${solr.autoCommit.maxTime:15000}
>> 16c16
>> <   true
>> ---
>>> false
>> 
>> 
>> 
>> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 
>> 200, QTime: 1405 ms, Request time: 1407 ms.
>> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 
>> 200, QTime: 1357 ms, Request time: 1360 ms.
>> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 
>> 200, QTime: 90 ms, Request time: 92 ms.
>> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res code: 
>> 200, QTime: 5306 ms, Request time: 5308 ms.
>> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res code: 
>> 200, QTime: 4775 ms, Request time: 4777 ms.
>> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 
>> 200, QTime: 20222 ms, Request time: 20224 ms.
>> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res

RE: Update handler and atomic update

2019-03-19 Thread Martin Frank Hansen (MHQ)
Hi Thierry,

Thanks for your help. I think I will try to make my own handler instead.

Best regards

Martin


Internal - KMD A/S

-Original Message-
From: THIERRY BOUCHENY 
Sent: 19. marts 2019 10:38
To: solr-user@lucene.apache.org
Subject: Re: Update handler and atomic update

Hi Martin,

I read after answering your email that you don’t want to use curl, that might 
be a problem. I might be wrong but I don’t think you can make an atomic update 
with a GET request having the params in the url. I think you need to make a 
POST request and embed [{"id":"docid","clicks":{"inc":"1"}}] in the raw body 
hence using curl or any other app that allows you this like Postman.

Best regards

Thierry

> On 19 Mar 2019, at 08:59, Martin Frank Hansen (MHQ)  wrote:
>
> Hi Thierry,
>
> Do you mean something like this?
>
> http://localhost:8983/solr/.../update? 
> [{"id":"docid","clicks":{"inc":"1"}}]commit=true
>
> I do not get an error, but it does not increase the value of clicks 
> (unfortunately).
>
> Best regards
>
> Martin
>
>
> Internal - KMD A/S
>
> -Original Message-
> From: THIERRY BOUCHENY 
> Sent: 19. marts 2019 09:51
> To: solr-user@lucene.apache.org
> Subject: Re: Update handler and atomic update
>
> Hi Martin,
>
> Have you tried doing a POST with some JSON or XML Body.
>
> I would POST some json like the following
>
> [{"id":"docid","clicks":{"inc":"1"}}]
>
> In an /update?commit=true
>
> Best regards
>
> Thierry
>
> See documentation here 
> https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html
>
>> On 19 Mar 2019, at 08:14, Martin Frank Hansen (MHQ)  wrote:
>>
>> Hi,
>>
>> Hope someone can help me, I am trying to make an incremental update for one 
>> document using the API, but cannot make it work. I have tried a lot of 
>> things and all I actually want is to increment the value of the field 
>> “clicks” by one.
>>
>> I have something like this:
>> http://localhost:8983/solr/.../update?id:docid&inc:clicks=1&commit=true
>>
>> in the schema.xml the field looks like this:
>>
>> > multiValued="false" docValues="true"/>
>>
>> Please note that I do not wish to use curl for this operation.
>>
>> Thanks in advance.
>>
>> Best regards
>>
>> Martin
>>
>>
>> Internal - KMD A/S
>>
>> Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du 
>> KMD’s Privatlivspolitik, der fortæller, 
>> hvordan vi behandler oplysninger om dig.
>>
>> Protection of your personal data is important to us. Here you can read KMD’s 
>> Privacy Policy outlining how we process 
>> your personal data.
>>
>> Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. 
>> Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst 
>> informere afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi 
>> dig slette e-mailen i dit system uden at videresende eller kopiere den. 
>> Selvom e-mailen og ethvert vedhæftet bilag efter vores overbevisning er fri 
>> for virus og andre fejl, som kan påvirke computeren eller it-systemet, hvori 
>> den modtages og læses, åbnes den på modtagerens eget ansvar. Vi påtager os 
>> ikke noget ansvar for tab og skade, som er opstået i forbindelse med at 
>> modtage og bruge e-mailen.
>>
>> Please note that this message may contain confidential information. If you 
>> have received this message by mistake, please inform the sender of the 
>> mistake by sending a reply, then delete the message from your system without 
>> making, distributing or retaining any copies of it. Although we believe that 
>> the message and any attachments are free from viruses and other errors that 
>> might affect the computer or it-system where it is received and read, the 
>> recipient opens the message at his or her own risk. We assume no 
>> responsibility for any loss or damage arising from the receipt or use of 
>> this message.


Re: Behavior of Function Query

2019-03-19 Thread Erik Hatcher
Try adding fl=* into the request.   There’s an oddity with fl, iirc, where it 
can skip functions if * isn’t there (or maybe a concrete non-score field?)

   Erik
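
For example, the second request from the quoted message below with fl=* added, 
reusing the same collection and function aliases (just a sketch):

http://Sol-1:8983/solr/SCSpell/select?q="wall street"&defType=edismax&qf=spellcontent&wt=json&rows=1&fl=*,score,internet_of_things:query({!edismax v='"internet of things"'}),instant_of_things:query({!edismax v='"instant of things"'})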

> On Mar 18, 2019, at 10:19, Ashish Bisht  wrote:
> 
> Please see the below requests and response
> 
> http://Sol:8983/solr/SCSpell/select?q="internet of
> things"&defType=edismax&qf=spellcontent&wt=json&rows=1&fl=score,internet_of_things:query({!edismax
> v='"internet of things"'}),instant_of_things:query({!edismax v='"instant
> of things"'})
> 
> 
> Response contains score from function query
> 
> "fl":"score,internet_of_things:query({!edismax v='\"internet of
> things\"'}),instant_of_things:query({!edismax v='\"instant of things\"'})",
>  "rows":"1",
>  "wt":"json"}},
>  "response":{"numFound":851,"start":0,"maxScore":7.6176834,"docs":[
>  {
>"score":7.6176834,
>   "internet_of_things":7.6176834}]
>  }}
> 
> 
> But if in the same request q is changed,it doesn't give score
> 
> http://Sol-1:8983/solr/SCSpell/select?q="wall
> street"&defType=edismax&qf=spellcontent&wt=json&rows=1&fl=score,internet_of_things:query({!edismax
> v='"internet of things"'}),instant_of_things:query({!edismax v='"instant
> of things"'})
> 
>   "q":"\"wall street\"",
>  "defType":"edismax",
>  "qf":"spellcontent",
>  "fl":"score,internet_of_things:query({!edismax v='\"internet of
> things\"'}),instant_of_things:query({!edismax v='\"instant of things\"'})",
>  "rows":"1",
>  "wt":"json"}},
>  "response":{"numFound":46,"start":0,"maxScore":15.670144,"docs":[
>  {
>"score":15.670144}]
>  }}
> 
> 
> Why is the score of the function query not returned when q is different?
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


How to use custom analyzer for solr from janusgraph

2019-03-19 Thread Ahemad Ali
Hi,
I have a new field type "str_rev" in schema.xml which contains a pattern 
replacement factory and a keyword tokenizer.
How can I use this field type from JanusGraph so that the indexed values get 
the pattern replacement applied?
I would really appreciate any ideas and suggestions.
Thanks, Ahemad

Sent from Yahoo Mail on Android
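
For reference, the Solr side of such a field type is usually declared roughly 
like the sketch below; the pattern and replacement values are placeholders, 
and the JanusGraph mapping itself is not covered here:

<fieldType name="str_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- keep the whole value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- placeholder pattern/replacement; a PatternReplaceCharFilterFactory
         before the tokenizer is an alternative -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="YOUR_PATTERN" replacement="YOUR_REPLACEMENT" replace="all"/>
  </analyzer>
</fieldType>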

Re: Update handler and atomic update

2019-03-19 Thread THIERRY BOUCHENY
Hi Martin,

I read after answering your email that you don’t want to use curl; that might 
be a problem. I might be wrong, but I don’t think you can make an atomic update 
with a GET request that has the params in the URL. I think you need to make a 
POST request and embed [{"id":"docid","clicks":{"inc":"1"}}] in the raw body, 
hence using curl or any other app that allows this, like Postman. 

Best regards

Thierry
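
For readers who, like Martin, would rather not shell out to curl: the same 
atomic update can also be sent from SolrJ, roughly as in the sketch below. The 
core name and client setup are assumptions for illustration:

import java.util.Collections;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class IncrementClicks {
    public static void main(String[] args) throws Exception {
        // Hypothetical core/collection name; adjust to your own.
        SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycore").build();

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "docid");
        // A map value marks this as an atomic update: increment "clicks" by 1.
        doc.addField("clicks", Collections.singletonMap("inc", 1));

        client.add(doc);
        client.commit(); // or rely on autoCommit / commitWithin instead
        client.close();
    }
}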

> On 19 Mar 2019, at 08:59, Martin Frank Hansen (MHQ)  wrote:
> 
> Hi Thierry,
> 
> Do you mean something like this?
> 
> http://localhost:8983/solr/.../update? 
> [{"id":"docid","clicks":{"inc":"1"}}]commit=true
> 
> I do not get an error, but it does not increase the value of clicks 
> (unfortunately).
> 
> Best regards
> 
> Martin
> 
> 
> Internal - KMD A/S
> 
> -Original Message-
> From: THIERRY BOUCHENY 
> Sent: 19. marts 2019 09:51
> To: solr-user@lucene.apache.org
> Subject: Re: Update handler and atomic update
> 
> Hi Martin,
> 
> Have you tried doing a POST with some JSON or XML Body.
> 
> I would POST some json like the following
> 
> [{"id":"docid","clicks":{"inc":"1"}}]
> 
> In an /update?commit=true
> 
> Best regards
> 
> Thierry
> 
> See documentation here 
> https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html
> 
>> On 19 Mar 2019, at 08:14, Martin Frank Hansen (MHQ)  wrote:
>> 
>> Hi,
>> 
>> Hope someone can help me, I am trying to make an incremental update for one 
>> document using the API, but cannot make it work. I have tried a lot of 
>> things and all I actually want is to increment the value of the field 
>> “clicks” by one.
>> 
>> I have something like this:
>> http://localhost:8983/solr/.../update?id:docid&inc:clicks=1&commit=true
>> 
>> in the schema.xml the field looks like this:
>> 
>> > multiValued="false" docValues="true"/>
>> 
>> Please note that I do not wish to use curl for this operation.
>> 
>> Thanks in advance.
>> 
>> Best regards
>> 
>> Martin
>> 
>> 
>> Internal - KMD A/S
>> 
>> Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du 
>> KMD’s Privatlivspolitik, der fortæller, 
>> hvordan vi behandler oplysninger om dig.
>> 
>> Protection of your personal data is important to us. Here you can read KMD’s 
>> Privacy Policy outlining how we process 
>> your personal data.
>> 
>> Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. 
>> Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst 
>> informere afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi 
>> dig slette e-mailen i dit system uden at videresende eller kopiere den. 
>> Selvom e-mailen og ethvert vedhæftet bilag efter vores overbevisning er fri 
>> for virus og andre fejl, som kan påvirke computeren eller it-systemet, hvori 
>> den modtages og læses, åbnes den på modtagerens eget ansvar. Vi påtager os 
>> ikke noget ansvar for tab og skade, som er opstået i forbindelse med at 
>> modtage og bruge e-mailen.
>> 
>> Please note that this message may contain confidential information. If you 
>> have received this message by mistake, please inform the sender of the 
>> mistake by sending a reply, then delete the message from your system without 
>> making, distributing or retaining any copies of it. Although we believe that 
>> the message and any attachments are free from viruses and other errors that 
>> might affect the computer or it-system where it is received and read, the 
>> recipient opens the message at his or her own risk. We assume no 
>> responsibility for any loss or damage arising from the receipt or use of 
>> this message.



RE: Update handler and atomic update

2019-03-19 Thread Martin Frank Hansen (MHQ)
Hi Thierry,

Do you mean something like this?

http://localhost:8983/solr/.../update? 
[{"id":"docid","clicks":{"inc":"1"}}]commit=true

I do not get an error, but it does not increase the value of clicks 
(unfortunately).

Best regards

Martin


Internal - KMD A/S

-Original Message-
From: THIERRY BOUCHENY 
Sent: 19. marts 2019 09:51
To: solr-user@lucene.apache.org
Subject: Re: Update handler and atomic update

Hi Martin,

Have you tried doing a POST with some JSON or XML Body.

I would POST some json like the following

[{"id":"docid","clicks":{"inc":"1"}}]

In an /update?commit=true

Best regards

Thierry

See documentation here 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html

> On 19 Mar 2019, at 08:14, Martin Frank Hansen (MHQ)  wrote:
>
> Hi,
>
> Hope someone can help me, I am trying to make an incremental update for one 
> document using the API, but cannot make it work. I have tried a lot of things 
> and all I actually want is to increment the value of the field “clicks” by 
> one.
>
> I have something like this:
> http://localhost:8983/solr/.../update?id:docid&inc:clicks=1&commit=true
>
> in the schema.xml the field looks like this:
>
>  multiValued="false" docValues="true"/>
>
> Please note that I do not wish to use curl for this operation.
>
> Thanks in advance.
>
> Best regards
>
> Martin
>
>
> Internal - KMD A/S
>
> Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du 
> KMD’s Privatlivspolitik, der fortæller, 
> hvordan vi behandler oplysninger om dig.
>
> Protection of your personal data is important to us. Here you can read KMD’s 
> Privacy Policy outlining how we process 
> your personal data.
>
> Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. 
> Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst informere 
> afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi dig slette 
> e-mailen i dit system uden at videresende eller kopiere den. Selvom e-mailen 
> og ethvert vedhæftet bilag efter vores overbevisning er fri for virus og 
> andre fejl, som kan påvirke computeren eller it-systemet, hvori den modtages 
> og læses, åbnes den på modtagerens eget ansvar. Vi påtager os ikke noget 
> ansvar for tab og skade, som er opstået i forbindelse med at modtage og bruge 
> e-mailen.
>
> Please note that this message may contain confidential information. If you 
> have received this message by mistake, please inform the sender of the 
> mistake by sending a reply, then delete the message from your system without 
> making, distributing or retaining any copies of it. Although we believe that 
> the message and any attachments are free from viruses and other errors that 
> might affect the computer or it-system where it is received and read, the 
> recipient opens the message at his or her own risk. We assume no 
> responsibility for any loss or damage arising from the receipt or use of this 
> message.


Re: Update handler and atomic update

2019-03-19 Thread THIERRY BOUCHENY
Hi Martin,

Have you tried doing a POST with a JSON or XML body?

I would POST some JSON like the following

[{"id":"docid","clicks":{"inc":"1"}}]

to /update?commit=true

Best regards

Thierry

See documentation here 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html

> On 19 Mar 2019, at 08:14, Martin Frank Hansen (MHQ)  wrote:
> 
> Hi,
> 
> Hope someone can help me, I am trying to make an incremental update for one 
> document using the API, but cannot make it work. I have tried a lot of things 
> and all I actually want is to increment the value of the field “clicks” by 
> one.
> 
> I have something like this:
> http://localhost:8983/solr/.../update?id:docid&inc:clicks=1&commit=true
> 
> in the schema.xml the field looks like this:
> 
>  multiValued="false" docValues="true"/>
> 
> Please note that I do not wish to use curl for this operation.
> 
> Thanks in advance.
> 
> Best regards
> 
> Martin
> 
> 
> Internal - KMD A/S
> 
> Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du 
> KMD’s Privatlivspolitik, der fortæller, 
> hvordan vi behandler oplysninger om dig.
> 
> Protection of your personal data is important to us. Here you can read KMD’s 
> Privacy Policy outlining how we process 
> your personal data.
> 
> Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. 
> Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst informere 
> afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi dig slette 
> e-mailen i dit system uden at videresende eller kopiere den. Selvom e-mailen 
> og ethvert vedhæftet bilag efter vores overbevisning er fri for virus og 
> andre fejl, som kan påvirke computeren eller it-systemet, hvori den modtages 
> og læses, åbnes den på modtagerens eget ansvar. Vi påtager os ikke noget 
> ansvar for tab og skade, som er opstået i forbindelse med at modtage og bruge 
> e-mailen.
> 
> Please note that this message may contain confidential information. If you 
> have received this message by mistake, please inform the sender of the 
> mistake by sending a reply, then delete the message from your system without 
> making, distributing or retaining any copies of it. Although we believe that 
> the message and any attachments are free from viruses and other errors that 
> might affect the computer or it-system where it is received and read, the 
> recipient opens the message at his or her own risk. We assume no 
> responsibility for any loss or damage arising from the receipt or use of this 
> message.



Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
We have around 80 million documents to index, with a total index size of around 
3 TB, so I guess I'm not the first one to work with this amount of data. With 
such slow response times the indexing process would take around 2 weeks, yet 
the system resources are not heavily loaded, so there must be a way to speed it up.


To Walter: I don't see why G1GC would improve this; we only index, with no 
queries in the background, and there is no memory constraint. It feels more 
like some internal threads are blocking each other.


I used to send more documents in one request, and that gave much worse response 
times; 300 documents in one request could end up with a 20-minute response 
time. Now I have changed to a maximum of 10 documents per request, and many 
responses still take around 30 seconds while some are very fast (~100 ms). How 
come there is such a big difference? The document sizes do not differ that much.


I just want to speed it up since nothing seems to be overloaded. Is there any 
other, faster way to index such a big amount of data?


BRs

//Aaron
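
Not advice from this thread, but one client-side option often suggested for 
bulk indexing is SolrJ's ConcurrentUpdateSolrClient, which buffers documents 
and streams them to Solr over several connections. A minimal sketch; the 
collection name, field names, queue size and thread count are assumptions:

import java.util.UUID;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
    public static void main(String[] args) throws Exception {
        ConcurrentUpdateSolrClient client = new ConcurrentUpdateSolrClient.Builder(
                "http://localhost:8983/solr/mycollection")
            .withQueueSize(1000)   // documents buffered on the client
            .withThreadCount(8)    // connections draining the queue in parallel
            .build();

        for (int i = 0; i < 100_000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", UUID.randomUUID().toString());
            doc.addField("title_s", "dummy document " + i);
            client.add(doc);       // returns quickly; sending happens in background
        }
        client.blockUntilFinished(); // wait for the queue to drain
        client.commit();             // single commit at the end
        client.close();
    }
}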


From: Walter Underwood 
Sent: Monday, March 18, 2019 4:59:20 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr index slow response

Solr is not designed to have consistent response times for updates. You are 
expecting Solr to do something that it does not do.

About Xms and Xmx, the JVM will continue to allocate memory until it hits the 
max. After it hits the max, it will start to collect garbage. A smaller Xms 
just wastes time doing allocations after the JVM is running. Avoid that by 
making Xms and Xmx the same.

We run all of our JVMs with 8 GB of heap and the G1 collector. You probably do 
not need more than 8 GB unless you are doing high-cardinality facets or some 
other memory-hungry querying.

The first step would be to use a good configuration. We start our Java 8 JVMs 
with these parameters:

SOLR_HEAP=8g
# Use G1 GC  -- wunder 2017-01-23
# Settings from https://wiki.apache.org/solr/ShawnHeisey
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"

Use SSD for disks, with total space about 3X as big as the expected index size.

Have RAM not used by Solr or the OS that is equal to the expected index size.

After that, let’s figure out what the real requirement is. If you must have 
consistent response times for update requests, you’ll need to do that outside 
of Solr. But if you need high data import rates, we can probably help.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun  wrote:
>
> Hello, Chris
>
>
> Thanks for the tips.  So I tried to set it as you suggested, not see too much 
> improvement.  Since I don't need it visible immediately, softCommit is 
> disabled totally.
>
> The slow response is happening every few seconds,  if it happens hourly I 
> would suspect the hourly auto-commit.  But it happen much more frequently.  I 
> don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS level.  It just 
> looks like solr server is blocking internally itself.
>
>
> <   ${solr.autoCommit.maxTime:360}
> ---
>>  ${solr.autoCommit.maxTime:15000}
> 16c16
> <   true
> ---
>>  false
>
>
>
> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 
> 200, QTime: 1405 ms, Request time: 1407 ms.
> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 
> 200, QTime: 1357 ms, Request time: 1360 ms.
> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 
> 200, QTime: 90 ms, Request time: 92 ms.
> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res code: 
> 200, QTime: 5306 ms, Request time: 5308 ms.
> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res code: 
> 200, QTime: 4775 ms, Request time: 4777 ms.
> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 
> 200, QTime: 20222 ms, Request time: 20224 ms.
> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 
> 200, QTime: 27732 ms, Request time: 27735 ms.
> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res code: 
> 200, QTime: 28106 ms, Request time: 28109 ms.
> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 
> 200, QTime: 16703 ms, Request time: 16706 ms.
> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res code: 
> 200, QTime: 28216 ms, Request time: 28218 ms.
> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res code: 
> 200, QTime: 28133 ms, Request time: 28136 ms.
> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 
> 200, QTime: 27909 ms, Request time: 27912 ms.
> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res code: 
> 200, QTime: 28172 ms, Request time: 28174 ms.
> 190318-162815.056-189986 DBG1:doc_count: 10 ,

Update handler and atomic update

2019-03-19 Thread Martin Frank Hansen (MHQ)
Hi,

Hope someone can help me, I am trying to make an incremental update for one 
document using the API, but cannot make it work. I have tried a lot of things 
and all I actually want is to increment the value of the field “clicks” by one.

I have something like this:
http://localhost:8983/solr/.../update?id:docid&inc:clicks=1&commit=true

in the schema.xml the field looks like this:



Please note that I do not wish to use curl for this operation.

Thanks in advance.

Best regards

Martin


Internal - KMD A/S

Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du 
KMD’s Privatlivspolitik, der fortæller, 
hvordan vi behandler oplysninger om dig.

Protection of your personal data is important to us. Here you can read KMD’s 
Privacy Policy outlining how we process your 
personal data.

Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. Hvis 
du ved en fejltagelse modtager e-mailen, beder vi dig venligst informere 
afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi dig slette 
e-mailen i dit system uden at videresende eller kopiere den. Selvom e-mailen og 
ethvert vedhæftet bilag efter vores overbevisning er fri for virus og andre 
fejl, som kan påvirke computeren eller it-systemet, hvori den modtages og 
læses, åbnes den på modtagerens eget ansvar. Vi påtager os ikke noget ansvar 
for tab og skade, som er opstået i forbindelse med at modtage og bruge e-mailen.

Please note that this message may contain confidential information. If you have 
received this message by mistake, please inform the sender of the mistake by 
sending a reply, then delete the message from your system without making, 
distributing or retaining any copies of it. Although we believe that the 
message and any attachments are free from viruses and other errors that might 
affect the computer or it-system where it is received and read, the recipient 
opens the message at his or her own risk. We assume no responsibility for any 
loss or damage arising from the receipt or use of this message.


Re: Need help on LTR

2019-03-19 Thread Jörn Franke
Did you add the option -Dsolr.ltr.enabled=true ?
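
If I recall the bundled example configs correctly, -Dsolr.ltr.enabled=true only 
toggles LTR elements that those configsets already declare; with a custom 
solrconfig.xml the contrib has to be wired in explicitly. A "Model type does 
not exist" error with a NullPointerException root cause often means the 
solr-ltr jar was not actually picked up by the core. A rough sketch, with paths 
and cache sizes assumed from a default install layout:

<!-- load the LTR contrib jars -->
<lib dir="${solr.install.dir:../../../..}/contrib/ltr/lib/" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-ltr-\d.*\.jar" />

<!-- LTR query parser, used for reranking via rq={!ltr model=... reRankDocs=100} -->
<queryParser name="ltr" class="org.apache.solr.ltr.search.LTRQParserPlugin"/>

<!-- feature-value cache (declared inside the <query> section) plus the
     [features] transformer used for feature logging -->
<cache name="QUERY_DOC_FV" class="solr.search.LRUCache" size="4096"
       initialSize="2048" autowarmCount="4096"
       regenerator="solr.search.NoOpRegenerator"/>
<transformer name="features"
             class="org.apache.solr.ltr.response.transform.LTRFeatureLoggerTransformerFactory">
  <str name="fvCacheName">QUERY_DOC_FV</str>
</transformer>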

> Am 19.03.2019 um 04:15 schrieb Amjad Khan :
> 
> I followed the Solr LTR Documentation 
> 
> https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html 
> 
> 
> 1. Added library into the solr-config
> 
>   />
> 
> 2. Successfully added feature
> 3. Get schema to see feature is available
> 4. When I try to push model I see the error below, however I added the lib 
> into solr-cofig
> 
> Response
> {
>  "responseHeader":{
>"status":400,
>"QTime":1},
>  "error":{
>"metadata":[
>  "error-class","org.apache.solr.common.SolrException",
>  "root-error-class","java.lang.NullPointerException"],
>"msg":"org.apache.solr.ltr.model.ModelException: Model type does not exist 
> org.apache.solr.ltr.model.LinearModel",
>"code":400}}
> 
> Thanks