Re: [EXTERNAL] - SolR OOM error due to query injection

2020-06-11 Thread Michael Gibney
Guilherme,
The answer likely depends on the query parser, the query parser
configuration, and the analysis chains. If you post those, it would help
with troubleshooting. One thing that jumps to mind is the asterisks
("*") -- if they're interpreted as wildcards, that could be
problematic? More generally, it's of course true that Solr won't parse
this input as SQL, but as Isabelle pointed out, there are still
potentially lots of meta-characters (in addition to quite a few short,
common terms).
Michael
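
For illustration, a rough sketch of the kind of selective clean-up this hints at, applied in the client application before the query string ever reaches Solr. The class, the regexes and the choice of what to keep (':' and '-' for field queries and dashed IDs) versus drop (SQL comment markers, bare wildcards) are assumptions for the sketch, not anything Solr ships with:

// Hypothetical pre-filter (not part of Solr): strip SQL-comment markers and
// stray wildcards, but leave ':' and '-' alone so field queries and dashed IDs
// still work. Patterns would need tuning against real traffic.
import java.util.regex.Pattern;

public class QuerySanitizer {

    // matches /**/ style comment markers used by the injection attempts
    private static final Pattern SQL_COMMENT = Pattern.compile("/\\*.*?\\*/");
    // matches '*' or '?' runs that are not attached to a term (a bare wildcard)
    private static final Pattern BARE_WILDCARD = Pattern.compile("(?<!\\w)[*?]+(?!\\w)");

    public static String sanitize(String rawQuery) {
        String q = SQL_COMMENT.matcher(rawQuery).replaceAll(" ");
        q = BARE_WILDCARD.matcher(q).replaceAll(" ");
        return q.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // decoded fragment of the first attack query vs. a legitimate field query
        System.out.println(sanitize("IPP\")))/**/AND/**/(SELECT/**/2*(IF((SELECT"));
        System.out.println(sanitize("title:Title AND id:ABC-123"));
    }
}

Anything much smarter than this quickly turns into writing a mini query parser, which is why the blunt ratio check Isabelle suggests further down the thread may be the more practical first line of defence.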


On Thu, Jun 11, 2020 at 7:43 AM Guilherme Viteri  wrote:
>
> Hi Isabelle
> Thanks for your input.
> In fact Solr returns 30 results for these queries. Why does it behave in a
> way that causes an OOM? Also, the commands are SQL commands, and Solr would
> parse them as normal characters …
>
> Thanks
>
>
> > On 10 Jun 2020, at 22:50, Isabelle Giguere  
> > wrote:
> >
> > Hi Guilherme;
> >
> > The only thing I can think of right now is the number of non-alphanumeric 
> > characters.
> >
> > In the first 'q' in your examples, after resolving the character escapes, 
> > 1/3 of characters are non-alphanumeric (* / = , etc).
> >
> > Maybe filter out queries that contain too many non-alphanumeric characters
> > before sending the request to Solr? Whatever "too many" could be.
> >
> > Isabelle Giguère
> > Computational Linguist & Java Developer
> > Linguiste informaticienne & développeur java
> >
> >
> > 
> > From: Guilherme Viteri
> > Sent: 10 June 2020 16:57
> > To: solr-user@lucene.apache.org
> > Subject: [EXTERNAL] - SolR OOM error due to query injection
> >
> > Hi,
> >
> > Environment: Solr 6.6.2, with org.apache.solr.solr-core:6.1.0. This setup
> > has been running for at least 4 years without an OutOfMemory error. (It is
> > never too late for an OOM…)
> >
> > This week, our search tool was hit by SQL-injection-like requests, and that
> > led to an OOM. These requests weren't aggressive in the sense of stressing
> > the server with an excessive number of hits; however, 5 to 10 requests of
> > this nature were enough to crash the server.
> >
> > I've come across this link:
> > https://stackoverflow.com/questions/26862474/prevent-from-solr-query-injections-when-using-solrj
> > However, that's not what I am after. In our case we do allow Lucene query
> > syntax and field searches like title:Title, and our IDs contain dashes; if
> > those get escaped, the search won't work properly.
> >
> > Does anyone have an idea ?
> >
> > Cheers
> > G
> >
> > Here are some of the requests that appeared in the logs in relation to the 
> > attack (see below: sorry it is messy)
> > query?q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22YBXk%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22YBXk=Homo%20sapiens=Reaction=Pathway=true
> >
> > q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22rDmG%22%3D%22rDmG=Homo%20sapiens=Reaction=Pathway=true
> >
> > q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22dfkM%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22dfkM=Homo%20sapiens=Reaction=Pathway=true
> >
> > q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR

Re: [EXTERNAL] - SolR OOM error due to query injection

2020-06-11 Thread Guilherme Viteri
Hi Isabelle
Thanks for your input.
In fact Solr returns 30 results for these queries. Why does it behave in a
way that causes an OOM? Also, the commands are SQL commands, and Solr would
parse them as normal characters …

Thanks


> On 10 Jun 2020, at 22:50, Isabelle Giguere  
> wrote:
> 
> Hi Guilherme;
> 
> The only thing I can think of right now is the number of non-alphanumeric 
> characters.
> 
> In the first 'q' in your examples, after resolving the character escapes, 1/3 
> of characters are non-alphanumeric (* / = , etc).
> 
> Maybe filter out queries that contain too many non-alphanumeric characters
> before sending the request to Solr? Whatever "too many" could be.
> 
> Isabelle Giguère
> Computational Linguist & Java Developer
> Linguiste informaticienne & développeur java
> 
> 
> 
> From: Guilherme Viteri
> Sent: 10 June 2020 16:57
> To: solr-user@lucene.apache.org
> Subject: [EXTERNAL] - SolR OOM error due to query injection
> 
> Hi,
> 
> Environment: Solr 6.6.2, with org.apache.solr.solr-core:6.1.0. This setup has
> been running for at least 4 years without an OutOfMemory error. (It is never
> too late for an OOM…)
>
> This week, our search tool was hit by SQL-injection-like requests, and that
> led to an OOM. These requests weren't aggressive in the sense of stressing the
> server with an excessive number of hits; however, 5 to 10 requests of this
> nature were enough to crash the server.
>
> I've come across this link:
> https://stackoverflow.com/questions/26862474/prevent-from-solr-query-injections-when-using-solrj
> However, that's not what I am after. In our case we do allow Lucene query
> syntax and field searches like title:Title, and our IDs contain dashes; if
> those get escaped, the search won't work properly.
> 
> Does anyone have an idea ?
> 
> Cheers
> G
> 
> Here are some of the requests that appeared in the logs in relation to the 
> attack (see below: sorry it is messy)
> query?q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22YBXk%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22YBXk=Homo%20sapiens=Reaction=Pathway=true
> 
> q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22rDmG%22%3D%22rDmG=Homo%20sapiens=Reaction=Pathway=true
> 
> q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22dfkM%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22dfkM=Homo%20sapiens=Reaction=Pathway=true
> 
> q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22yBhx%22%3D%22yBhx=Homo%20sapiens=Reaction=Pathway=true
> 
> q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F1695%3DCTXSYS.DRITHSX.SN%281695%2C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%7C%7C%28SELECT%2F%2A%2A%2F%28CASE%2F%2A%2A%2FWHEN%2F%2A%2A%2F%281695%3D1695%29%2F%2A%2A%2FTHEN%2F%2A%2A%2F1%2F%2A%2A%2FELSE%2F%2A%2A%2F0%2F%2A%2A%2FEND%29%2F%2A%2A%2FFROM%2F%2A%2A%2FDUAL%29%7C%7CCHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%2898%29%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22eEdc%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22eEdc=Homo%20sapiens=Reaction=Pathway=true
> 
> q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F1695%3DCTXSYS.DRITHSX.SN%281695%2C%28CHR%

Re: [EXTERNAL] - SolR OOM error due to query injection

2020-06-10 Thread Isabelle Giguere
Hi Guilherme;

The only thing I can think of right now is the number of non-alphanumeric 
characters.

In the first 'q' in your examples, after resolving the character escapes, 1/3 
of characters are non-alphanumeric (* / = , etc).

Maybe filter out queries that contain too many non-alphanumeric characters
before sending the request to Solr? Whatever "too many" could be.
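
To make the idea concrete, a minimal sketch of such a gate, run in the client application before the request is sent to Solr. The class name and the 1/3 threshold are made up for illustration and would need tuning against real traffic:

// Sketch only: reject a query whose decoded form is mostly punctuation.
// The 0.33 threshold is arbitrary; measure legitimate queries before picking one.
import java.net.URLDecoder;

public class QueryGate {

    public static boolean looksLikeInjection(String rawParam) throws Exception {
        String q = URLDecoder.decode(rawParam, "UTF-8");
        long nonAlnum = q.chars()
                .filter(c -> !Character.isLetterOrDigit(c) && !Character.isWhitespace(c))
                .count();
        return !q.isEmpty() && (double) nonAlnum / q.length() > 0.33;
    }

    public static void main(String[] args) throws Exception {
        // a prefix of the first attack query trips the check; a plain field query does not
        System.out.println(looksLikeInjection(
                "IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A"));
        System.out.println(looksLikeInjection("title%3ATitle"));
    }
}

A check like this will also reject some legitimate queries, so logging what it would block for a while before actually enforcing it seems prudent.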

Isabelle Giguère
Computational Linguist & Java Developer
Linguiste informaticienne & développeur java



From: Guilherme Viteri
Sent: 10 June 2020 16:57
To: solr-user@lucene.apache.org
Subject: [EXTERNAL] - SolR OOM error due to query injection

Hi,

Environment: Solr 6.6.2, with org.apache.solr.solr-core:6.1.0. This setup has
been running for at least 4 years without an OutOfMemory error. (It is never
too late for an OOM…)

This week, our search tool was hit by SQL-injection-like requests, and that
led to an OOM. These requests weren't aggressive in the sense of stressing the
server with an excessive number of hits; however, 5 to 10 requests of this
nature were enough to crash the server.

I've come across this link:
https://stackoverflow.com/questions/26862474/prevent-from-solr-query-injections-when-using-solrj
However, that's not what I am after. In our case we do allow Lucene query
syntax and field searches like title:Title, and our IDs contain dashes; if
those get escaped, the search won't work properly.

Does anyone have an idea ?

Cheers
G

Here are some of the requests that appeared in the logs in relation to the 
attack (see below: sorry it is messy)
query?q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22YBXk%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22YBXk=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22rDmG%22%3D%22rDmG=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22dfkM%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22dfkM=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22yBhx%22%3D%22yBhx=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F1695%3DCTXSYS.DRITHSX.SN%281695%2C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%7C%7C%28SELECT%2F%2A%2A%2F%28CASE%2F%2A%2A%2FWHEN%2F%2A%2A%2F%281695%3D1695%29%2F%2A%2A%2FTHEN%2F%2A%2A%2F1%2F%2A%2A%2FELSE%2F%2A%2A%2F0%2F%2A%2A%2FEND%29%2F%2A%2A%2FFROM%2F%2A%2A%2FDUAL%29%7C%7CCHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%2898%29%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22eEdc%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22eEdc=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F1695%3DCTXSYS.DRITHSX.SN%281695%2C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%7C%7C%28SELECT%2F%2A%2A%2F%28CASE%2F%2A%2A%2FWHEN%2F%2A%2A%2F%281695%3D1695%29%2F%2A%2A%2FTHEN%2F%2A%2A%2F1%2F%2A%2A%2FELSE%2F%2A%2A%2F0%2F%2A%2A%2FEND%29%2F%2A%2A%2FFROM%2F%2A%2A%2FDUAL%29%7C%7CCHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%2898%29%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22zAUD%22%3D%22zAUD=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F4144%3DCONVERT%28INT%2C%28SELECT%2F%2A%2A%2FCHAR%28113%29%2BCHAR%28122%29%2BCHAR%28112%29%2BCH

SolR OOM error due to query injection

2020-06-10 Thread Guilherme Viteri
Hi,

Environment: Solr 6.6.2, with org.apache.solr.solr-core:6.1.0. This setup has
been running for at least 4 years without an OutOfMemory error. (It is never
too late for an OOM…)

This week, our search tool was hit by SQL-injection-like requests, and that
led to an OOM. These requests weren't aggressive in the sense of stressing the
server with an excessive number of hits; however, 5 to 10 requests of this
nature were enough to crash the server.

I've come across this link:
https://stackoverflow.com/questions/26862474/prevent-from-solr-query-injections-when-using-solrj
However, that's not what I am after. In our case we do allow Lucene query
syntax and field searches like title:Title, and our IDs contain dashes; if
those get escaped, the search won't work properly.

Does anyone have an idea ?

Cheers
G

Here are some of the requests that appeared in the logs in relation to the 
attack (see below: sorry it is messy)
query?q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22YBXk%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22YBXk=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F2%2A%28IF%28%28SELECT%2F%2A%2A%2F%2A%2F%2A%2A%2FFROM%2F%2A%2A%2F%28SELECT%2F%2A%2A%2FCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283235%3D3235%2C1%29%29%29%2C0x717a626271%2C0x78%29%29s%29%2C%2F%2A%2A%2F8446744073709551610%2C%2F%2A%2A%2F8446744073709551610%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22rDmG%22%3D%22rDmG=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22dfkM%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22dfkM=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28SELECT%2F%2A%2A%2F3641%2F%2A%2A%2FFROM%28SELECT%2F%2A%2A%2FCOUNT%28%2A%29%2CCONCAT%280x717a707871%2C%28SELECT%2F%2A%2A%2F%28ELT%283641%3D3641%2C1%29%29%29%2C0x717a626271%2CFLOOR%28RAND%280%29%2A2%29%29x%2F%2A%2A%2FFROM%2F%2A%2A%2FINFORMATION_SCHEMA.PLUGINS%2F%2A%2A%2FGROUP%2F%2A%2A%2FBY%2F%2A%2A%2Fx%29a%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22yBhx%22%3D%22yBhx=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F1695%3DCTXSYS.DRITHSX.SN%281695%2C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%7C%7C%28SELECT%2F%2A%2A%2F%28CASE%2F%2A%2A%2FWHEN%2F%2A%2A%2F%281695%3D1695%29%2F%2A%2A%2FTHEN%2F%2A%2A%2F1%2F%2A%2A%2FELSE%2F%2A%2A%2F0%2F%2A%2A%2FEND%29%2F%2A%2A%2FFROM%2F%2A%2A%2FDUAL%29%7C%7CCHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%2898%29%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22eEdc%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22eEdc=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F1695%3DCTXSYS.DRITHSX.SN%281695%2C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%7C%7C%28SELECT%2F%2A%2A%2F%28CASE%2F%2A%2A%2FWHEN%2F%2A%2A%2F%281695%3D1695%29%2F%2A%2A%2FTHEN%2F%2A%2A%2F1%2F%2A%2A%2FELSE%2F%2A%2A%2F0%2F%2A%2A%2FEND%29%2F%2A%2A%2FFROM%2F%2A%2A%2FDUAL%29%7C%7CCHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%2898%29%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22zAUD%22%3D%22zAUD=Homo%20sapiens=Reaction=Pathway=true

q=IPP%22%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F4144%3DCONVERT%28INT%2C%28SELECT%2F%2A%2A%2FCHAR%28113%29%2BCHAR%28122%29%2BCHAR%28112%29%2BCHAR%28120%29%2BCHAR%28113%29%2B%28SELECT%2F%2A%2A%2F%28CASE%2F%2A%2A%2FWHEN%2F%2A%2A%2F%284144%3D4144%29%2F%2A%2A%2FTHEN%2F%2A%2A%2FCHAR%2849%29%2F%2A%2A%2FELSE%2F%2A%2A%2FCHAR%2848%29%2F%2A%2A%2FEND%29%29%2BCHAR%28113%29%2BCHAR%28122%29%2BCHAR%2898%29%2BCHAR%2898%29%2BCHAR%28113%29%29%29%2F%2A%2A%2FAND%2F%2A%2A%2F%28%28%28%22ePUW%22%2F%2A%2A%2FLIKE%2F%2A%2A%2F%22ePUW=Homo%20sapiens=Reaction=Pathway=true


Re: OOM Error

2016-11-09 Thread Susheel Kumar
Thanks, Shawn, for looking into it. Your assumption is right: the end of the
graph is the OOM. I am trying to collect all the queries & ingestion numbers
around 9:12, but here is one more observation and a question from today.

Observed that 2-3 VMs out of 12 show high heap usage even though heavy
ingestion stopped more than an hour ago, while the other machines show
normal usage.  Does that tell you anything?

Snapshot 1 showing high usage of heap
===
https://www.dropbox.com/s/c1qy1s5nc9uo6cp/2016-11-09_15-55-24.png?dl=0

Snapshot  2 showing normal usage of heap
===
https://www.dropbox.com/s/9v016ilmhcahs28/2016-11-09_15-58-28.png?dl=0

The other question: we found that our ingestion batch size varies (it goes
from 200 to 4000+ docs depending on queue size). I am asking the ingestion
folks to fix the batch size, but I am wondering whether it matters, in terms
of load on Solr / heap usage, if we submit small batches (like 500 docs max)
more frequently rather than submitting bigger batches less frequently. So far
the bigger batch size has not caused any issues except these two incidents.
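
(For reference, a minimal SolrJ sketch of what capping the batch size on the client side could look like; the method and the 500-doc batch size are illustrative, not the actual ingestion code.)

// Illustrative only: drain any document source in fixed-size batches so each
// update request stays bounded, regardless of how full the upstream queue is.
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchedIndexer {

    private static final int BATCH_SIZE = 500;   // placeholder value

    public static void index(SolrClient client, Iterable<SolrInputDocument> source)
            throws Exception {
        List<SolrInputDocument> batch = new ArrayList<>(BATCH_SIZE);
        for (SolrInputDocument doc : source) {
            batch.add(doc);
            if (batch.size() == BATCH_SIZE) {
                client.add(batch);   // one update request per 500 docs
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            client.add(batch);       // flush the remainder
        }
        // leave committing to autoCommit / commitWithin rather than per batch
    }
}

Smaller, steadier batches mainly bound the per-request memory on the client and in Solr's update handler; they don't change how much heap a newly opened searcher needs, which is closer to what Toke describes further down the thread.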

Thanks,
Susheel





On Wed, Nov 9, 2016 at 10:19 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 11/8/2016 12:49 PM, Susheel Kumar wrote:
> > Ran into OOM Error again right after two weeks. Below is the GC log
> > viewer graph. The first time we run into this was after 3 months and
> > then second time in two weeks. After first incident reduced the cache
> > size and increase heap from 8 to 10G. Interestingly query and
> > ingestion load is like normal other days and heap utilisation remains
> > stable and suddenly jumps to x2.
>
> It looks like something happened at about 9:12:30 on that graph.  Do you
> know what that was?  Starting at about that time, GC times went through
> the roof and the allocated heap began a steady rise.  At about 9:15, a
> lot of garbage was freed up and GC times dropped way down again.  At
> about 9:18, the GC once again started taking a long time, and the used
> heap was still going up steadily. At about 9:21, the full GCs started --
> the wide black bars.  I assume that the end of the graph is the OOM.
>
> > We are looking to reproduce this in test environment by producing
> > similar queries/ingestion but wondering if running into some memory
> > leak or bug like "SOLR-8922 - DocSetCollector can allocate massive
> > garbage on large indexes" which can cause this issue. Also we have
> > frequent updates and wondering if not optimizing the index can result
> > into this situation
>
> It looks more like a problem with allocated memory that's NOT garbage
> than a problem with garbage, but I can't really rule anything out, and
> even what I've said below could be wrong.
>
> Most of the allocated heap is in the old generation.  If there's a bug
> in Solr causing this problem, it would probably be a memory leak, but
> SOLR-8922 doesn't talk about a leak.  A memory leak is always possible,
> but those have been rare in Solr.  The most likely problem is that
> something changed in your indexing or query patterns which required a
> lot more memory than what happened before that point.
>
> Thanks,
> Shawn
>
>


Re: OOM Error

2016-11-09 Thread Shawn Heisey
On 11/8/2016 12:49 PM, Susheel Kumar wrote:
> Ran into OOM Error again right after two weeks. Below is the GC log
> viewer graph. The first time we run into this was after 3 months and
> then second time in two weeks. After first incident reduced the cache
> size and increase heap from 8 to 10G. Interestingly query and
> ingestion load is like normal other days and heap utilisation remains
> stable and suddenly jumps to x2. 

It looks like something happened at about 9:12:30 on that graph.  Do you
know what that was?  Starting at about that time, GC times went through
the roof and the allocated heap began a steady rise.  At about 9:15, a
lot of garbage was freed up and GC times dropped way down again.  At
about 9:18, the GC once again started taking a long time, and the used
heap was still going up steadily. At about 9:21, the full GCs started --
the wide black bars.  I assume that the end of the graph is the OOM.

> We are looking to reproduce this in test environment by producing
> similar queries/ingestion but wondering if running into some memory
> leak or bug like "SOLR-8922 - DocSetCollector can allocate massive
> garbage on large indexes" which can cause this issue. Also we have
> frequent updates and wondering if not optimizing the index can result
> into this situation

It looks more like a problem with allocated memory that's NOT garbage
than a problem with garbage, but I can't really rule anything out, and
even what I've said below could be wrong.

Most of the allocated heap is in the old generation.  If there's a bug
in Solr causing this problem, it would probably be a memory leak, but
SOLR-8922 doesn't talk about a leak.  A memory leak is always possible,
but those have been rare in Solr.  The most likely problem is that
something changed in your indexing or query patterns which required a
lot more memory than what happened before that point.

Thanks,
Shawn



Re: OOM Error

2016-11-08 Thread Susheel Kumar
Hello,

Ran into the OOM error again right after two weeks. Below is the GC log viewer
graph.  The first time we ran into this was after 3 months, and the second
time was two weeks later. After the first incident we reduced the cache size
and increased the heap from 8 to 10G.  Interestingly, query and ingestion load
is normal, like on other days, and heap utilisation remains stable and then
suddenly jumps to 2x.

We are looking to reproduce this in a test environment by producing similar
queries/ingestion, but we are wondering if we are running into some memory
leak or bug like "SOLR-8922 - DocSetCollector can allocate massive garbage on
large indexes" which could cause this issue.  Also, we have frequent updates
and are wondering if not optimizing the index can result in this situation.

Any thoughts ?

GC Viewer

https://www.dropbox.com/s/bb29ub5q2naljdl/gc_log_snapshot.png?dl=0




On Wed, Oct 26, 2016 at 10:47 AM, Susheel Kumar <susheel2...@gmail.com>
wrote:

> Hi Toke,
>
> I think your guess is right.  We have ingestion running in batches.  We
> have 6 shards & 6 replicas on 12 VM's each around 40+ million docs on each
> shard.
>
> Thanks everyone for the suggestions/pointers.
>
> Thanks,
> Susheel
>
> On Wed, Oct 26, 2016 at 1:52 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
> wrote:
>
>> On Tue, 2016-10-25 at 15:04 -0400, Susheel Kumar wrote:
>> > Thanks, Toke.  Analyzing GC logs helped to determine that it was a
>> > sudden
>> > death.
>>
>> > The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9
>>
>> Peaks yes, but there is a pattern of
>>
>> 1) Stable memory use
>> 2) Temporary doubling of the memory used and a lot of GC
>> 3) Increased (relative to last stable period) but stable memory use
>> 4) Goto 2
>>
>> If I should guess, I would say that you are running ingests in batches,
>> which temporarily causes 2 searchers to be open at the same time. That
>> is step 2 in the list above. After the batch ingest, the baseline moves up,
>> presumably because you have added quite a lot of documents, relative to
>> the overall number of documents.
>>
>>
>> The temporary doubling of the baseline is hard to avoid, but I am
>> surprised at the amount of heap that you need in the stable periods.
>> Just to be clear: this is from a Solr with 8GB of heap handling only 1
>> shard of 20GB, and you are using DocValues? How many documents do you
>> have in such a shard?
>>
>> - Toke Eskildsen, State and University Library, Denmark
>>
>
>


Re: OOM Error

2016-10-26 Thread Susheel Kumar
Hi Toke,

I think your guess is right.  We have ingestion running in batches.  We
have 6 shards & 6 replicas on 12 VMs, with around 40+ million docs on each
shard.

Thanks everyone for the suggestions/pointers.

Thanks,
Susheel

On Wed, Oct 26, 2016 at 1:52 AM, Toke Eskildsen 
wrote:

> On Tue, 2016-10-25 at 15:04 -0400, Susheel Kumar wrote:
> > Thanks, Toke.  Analyzing GC logs helped to determine that it was a
> > sudden
> > death.
>
> > The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9
>
> Peaks yes, but there is a pattern of
>
> 1) Stable memory use
> 2) Temporary doubling of the memory used and a lot of GC
> 3) Increased (relative to last stable period) but stable memory use
> 4) Goto 2
>
> If I should guess, I would say that you are running ingests in batches,
> which temporarily causes 2 searchers to be open at the same time. That
> is step 2 in the list above. After the batch ingest, the baseline moves up,
> presumably because you have added quite a lot of documents, relative to
> the overall number of documents.
>
>
> The temporary doubling of the baseline is hard to avoid, but I am
> surprised at the amount of heap that you need in the stable periods.
> Just to be clear: this is from a Solr with 8GB of heap handling only 1
> shard of 20GB, and you are using DocValues? How many documents do you
> have in such a shard?
>
> - Toke Eskildsen, State and University Library, Denmark
>


Re: OOM Error

2016-10-26 Thread Tom Evans
On Wed, Oct 26, 2016 at 4:53 AM, Shawn Heisey  wrote:
> On 10/25/2016 8:03 PM, Susheel Kumar wrote:
>> Agree, Pushkar.  I had docValues for sorting / faceting fields from the
>> beginning (since I set up Solr 6.0).  So good on that side. I am going to
>> analyze the queries to find any potential issue. Two questions which I am
>> puzzling with
>>
>> a) Should the below JVM parameter be included for Prod to get heap dump
>>
>> "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/the/dump"
>
> A heap dump can take a very long time to complete, and there may not be
> enough memory in the machine to start another instance of Solr until the
> first one has finished the heap dump.  Also, I do not know whether Java
> would release the listening port before the heap dump finishes.  If not,
> then a new instance would not be able to start immediately.
>
> If a different heap dump file is created each time, that might lead to
> problems with disk space after repeated dumps.  I don't know how the
> option works.
>
>> b) Currently OOM script just kills the Solr instance. Shouldn't it be
>> enhanced to wait and restart Solr instance
>
> As long as there is a problem causing OOMs, it seems rather pointless to
> start Solr right back up, as another OOM is likely.  The safest thing to
> do is kill Solr (since its operation would be unpredictable after OOM)
> and let the admin sort the problem out.
>

Occasionally our cloud nodes can OOM when particularly complex
faceting is performed. The current OOM management can be exceedingly
annoying; a user will make an overly complex analysis request, bringing
down one server and taking it out of the balancer. The user gets fed up
at the lack of response, so reloads the page, re-submitting the analysis
and bringing down the next server in the cluster.

Lather, rinse, repeat - and then you get to have a meeting to discuss
why we invest so much in HA infrastructure that can be made non-HA by
one user with a complex query. In those meetings it is much harder to
justify not restarting.

Cheers

Tom


Re: OOM Error

2016-10-25 Thread Toke Eskildsen
On Tue, 2016-10-25 at 15:04 -0400, Susheel Kumar wrote:
> Thanks, Toke.  Analyzing GC logs helped to determine that it was a
> sudden
> death.  

> The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9

Peaks yes, but there is a pattern of 

1) Stable memory use
2) Temporary doubling of the memory used and a lot of GC
3) Increased (relative to last stable period) but stable memory use
4) Goto 2

If I should guess, I would say that you are running ingests in batches,
which temporarily causes 2 searchers to be open at the same time. That
is step 2 in the list above. After the batch ingest, the baseline moves up,
presumably because you have added quite a lot of documents, relative to
the overall number of documents.


The temporary doubling of the baseline is hard to avoid, but I am
surprised at the amount of heap that you need in the stable periods.
Just to be clear: this is from a Solr with 8GB of heap handling only 1
shard of 20GB, and you are using DocValues? How many documents do you
have in such a shard?

- Toke Eskildsen, State and University Library, Denmark


Re: OOM Error

2016-10-25 Thread Shawn Heisey
On 10/25/2016 8:03 PM, Susheel Kumar wrote:
> Agree, Pushkar.  I had docValues for sorting / faceting fields from the
> beginning (since I set up Solr 6.0).  So good on that side. I am going to
> analyze the queries to find any potential issue. Two questions which I am
> puzzling with
>
> a) Should the below JVM parameter be included for Prod to get heap dump
>
> "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/the/dump"

A heap dump can take a very long time to complete, and there may not be
enough memory in the machine to start another instance of Solr until the
first one has finished the heap dump.  Also, I do not know whether Java
would release the listening port before the heap dump finishes.  If not,
then a new instance would not be able to start immediately.

If a different heap dump file is created each time, that might lead to
problems with disk space after repeated dumps.  I don't know how the
option works.

> b) Currently OOM script just kills the Solr instance. Shouldn't it be
> enhanced to wait and restart Solr instance

As long as there is a problem causing OOMs, it seems rather pointless to
start Solr right back up, as another OOM is likely.  The safest thing to
do is kill Solr (since its operation would be unpredictable after OOM)
and let the admin sort the problem out.

Thanks,
Shawn



Re: OOM Error

2016-10-25 Thread Erick Erickson
Off the top of my head:

a) Should the below JVM parameter be included for Prod to get heap dump

Makes sense. It may produce quite a large dump file, but then this is
an extraordinary situation so that's probably OK.

b) Currently OOM script just kills the Solr instance. Shouldn't it be
enhanced to wait and restart Solr instance

Personally I don't think so. IMO there's no real point in restarting
Solr; you have to address this issue, as this situation is likely to
recur. Restarting Solr may hide this very serious problem; how
would you even know to look? Restarting Solr could potentially lead to
a long involved process of wondering why selected queries seem to fail
and not noticing that the OOM script killed Solr. Having the default
_not_ restart Solr forces you to notice.

If you have to change the script to restart Solr, you also know that
you made the change and you should _really_ notify ops that they
should monitor this situation.

I admit this can be argued either way; personally, I'd rather "fail
fast and often".

Best,
Erick

On Tue, Oct 25, 2016 at 7:03 PM, Susheel Kumar  wrote:
> Agree, Pushkar.  I had docValues for sorting / faceting fields from the
> beginning (since I set up Solr 6.0).  So good on that side. I am going to
> analyze the queries to find any potential issue. Two questions which I am
> puzzling with
>
> a) Should the below JVM parameter be included for Prod to get heap dump
>
> "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/the/dump"
>
> b) Currently OOM script just kills the Solr instance. Shouldn't it be
> enhanced to wait and restart Solr instance
>
> Thanks,
> Susheel
>
>
>
>
> On Tue, Oct 25, 2016 at 7:35 PM, Pushkar Raste 
> wrote:
>
>> You should look into using docValues.  docValues are stored off heap and
>> hence you would be better off than just bumping up the heap.
>>
>> Don't enable docValues on existing fields unless you plan to reindex data
>> from scratch.
>>
>> On Oct 25, 2016 3:04 PM, "Susheel Kumar"  wrote:
>>
>> > Thanks, Toke.  Analyzing GC logs helped to determine that it was a sudden
>> > death.  The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9
>> >
>> > Will look into the queries more closely and also adjust the cache
>> > sizing.
>> >
>> >
>> > Thanks,
>> > Susheel
>> >
>> > On Tue, Oct 25, 2016 at 3:37 AM, Toke Eskildsen 
>> > wrote:
>> >
>> > > On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote:
>> > > > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
>> > > > today. So far our solr cluster has been running fine but suddenly
>> > > > today many of the VM's Solr instance got killed.
>> > >
>> > > As you have the GC-logs, you should be able to determine if it was a
>> > > slow death (e.g. caches gradually being filled) or a sudden one (e.g.
>> > > grouping or faceting on a large new non-DocValued field).
>> > >
>> > > Try plotting the GC logs with time on the x-axis and free memory after
>> > > GC on the y-axis. If it happens to be a sudden death, the last lines in
>> > > solr.log might hold a clue after all.
>> > >
>> > > - Toke Eskildsen, State and University Library, Denmark
>> > >
>> >
>>


Re: OOM Error

2016-10-25 Thread Susheel Kumar
Agree, Pushkar.  I had docValues for sorting / faceting fields from the
beginning (since I set up Solr 6.0).  So good on that side. I am going to
analyze the queries to find any potential issue. Two questions which I am
puzzling with

a) Should the below JVM parameter be included for Prod to get heap dump

"-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/the/dump"

b) Currently OOM script just kills the Solr instance. Shouldn't it be
enhanced to wait and restart Solr instance

Thanks,
Susheel




On Tue, Oct 25, 2016 at 7:35 PM, Pushkar Raste 
wrote:

> You should look into using docValues.  docValues are stored off heap and
> hence you would be better off than just bumping up the heap.
>
> Don't enable docValues on existing fields unless you plan to reindex data
> from scratch.
>
> On Oct 25, 2016 3:04 PM, "Susheel Kumar"  wrote:
>
> > Thanks, Toke.  Analyzing GC logs helped to determine that it was a sudden
> > death.  The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9
> >
> > Will look into the queries more closely and also adjust the cache
> > sizing.
> >
> >
> > Thanks,
> > Susheel
> >
> > On Tue, Oct 25, 2016 at 3:37 AM, Toke Eskildsen 
> > wrote:
> >
> > > On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote:
> > > > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> > > > today. So far our solr cluster has been running fine but suddenly
> > > > today many of the VM's Solr instance got killed.
> > >
> > > As you have the GC-logs, you should be able to determine if it was a
> > > slow death (e.g. caches gradually being filled) or a sudden one (e.g.
> > > grouping or faceting on a large new non-DocValued field).
> > >
> > > Try plotting the GC logs with time on the x-axis and free memory after
>> > GC on the y-axis. If it happens to be a sudden death, the last lines in
> > > solr.log might hold a clue after all.
> > >
> > > - Toke Eskildsen, State and University Library, Denmark
> > >
> >
>


Re: OOM Error

2016-10-25 Thread Pushkar Raste
You should look into using docValues.  docValues are stored off heap and
hence you would be better off than just bumping up the heap.

Don't enable docValues on existing fields unless you plan to reindex data
from scratch.

On Oct 25, 2016 3:04 PM, "Susheel Kumar"  wrote:

> Thanks, Toke.  Analyzing GC logs helped to determine that it was a sudden
> death.  The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9
>
> Will look into the queries more closely and also adjust the cache sizing.
>
>
> Thanks,
> Susheel
>
> On Tue, Oct 25, 2016 at 3:37 AM, Toke Eskildsen 
> wrote:
>
> > On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote:
> > > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> > > today. So far our solr cluster has been running fine but suddenly
> > > today many of the VM's Solr instance got killed.
> >
> > As you have the GC-logs, you should be able to determine if it was a
> > slow death (e.g. caches gradually being filled) or a sudden one (e.g.
> > grouping or faceting on a large new non-DocValued field).
> >
> > Try plotting the GC logs with time on the x-axis and free memory after
> > GC on the y-axis. If it happens to be a sudden death, the last lines in
> > solr.log might hold a clue after all.
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
>


Re: OOM Error

2016-10-25 Thread Susheel Kumar
Thanks, Toke.  Analyzing GC logs helped to determine that it was a sudden
death.  The peaks in last 20 mins... See   http://tinypic.com/r/n2zonb/9

Will look into the queries more closely and also adjust the cache sizing.


Thanks,
Susheel

On Tue, Oct 25, 2016 at 3:37 AM, Toke Eskildsen 
wrote:

> On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote:
> > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> > today. So far our solr cluster has been running fine but suddenly
> > today many of the VM's Solr instance got killed.
>
> As you have the GC-logs, you should be able to determine if it was a
> slow death (e.g. caches gradually being filled) or a sudden one (e.g.
> grouping or faceting on a large new non-DocValued field).
>
> Try plotting the GC logs with time on the x-axis and free memory after
> GC on the y-axis. If it happens to be a sudden death, the last lines in
> solr.log might hold a clue after all.
>
> - Toke Eskildsen, State and University Library, Denmark
>


Re: OOM Error

2016-10-25 Thread William Bell
I would also add that 8GB is cutting it close for a Java 8 JVM with
Solr. We use 12GB and have had issues with 8GB. But your mileage may vary.

On Tue, Oct 25, 2016 at 1:37 AM, Toke Eskildsen 
wrote:

> On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote:
> > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> > today. So far our solr cluster has been running fine but suddenly
> > today many of the VM's Solr instance got killed.
>
> As you have the GC-logs, you should be able to determine if it was a
> slow death (e.g. caches gradually being filled) or a sudden one (e.g.
> grouping or faceting on a large new non-DocValued field).
>
> Try plotting the GC logs with time on the x-axis and free memory after
> GC on the y-axis. If it happens to be a sudden death, the last lines in
> solr.log might hold a clue after all.
>
> - Toke Eskildsen, State and University Library, Denmark
>



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: OOM Error

2016-10-25 Thread Toke Eskildsen
On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote:
> I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> today. So far our solr cluster has been running fine but suddenly
> today many of the VM's Solr instance got killed.

As you have the GC-logs, you should be able to determine if it was a
slow death (e.g. caches gradually being filled) or a sudden one (e.g.
grouping or faceting on a large new non-DocValued field).

Try plotting the GC logs with time on the x-axis and free memory after
GC on the y-axis. If it happens to be a sudden death, the last lines in
solr.log might hold a clue after all.

- Toke Eskildsen, State and University Library, Denmark
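
For anyone who wants to try this, a rough sketch of turning a classic JDK 8 GC log (the -XX:+PrintGCDetails / -XX:+PrintGCTimeStamps style used before unified logging) into a CSV of uptime versus free heap after each collection, ready for plotting. The regexes are assumptions about that log format and will need adjusting for other GC flags:

// Sketch: extract "free heap after GC" from a classic JDK 8 GC log for plotting.
// Assumes lines like "123.456: [GC ... 567890K->467890K(8388608K), 0.01 secs]";
// the exact format depends on which collector and logging flags are enabled.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogToCsv {

    // uptime (seconds since JVM start) immediately before "[GC" or "[Full GC"
    private static final Pattern STAMP = Pattern.compile("([0-9]+\\.[0-9]+): \\[(?:Full )?GC");
    // "usedBeforeK->usedAfterK(totalK)"; the last match on a line is the whole heap
    private static final Pattern HEAP = Pattern.compile("([0-9]+)K->([0-9]+)K\\(([0-9]+)K\\)");

    public static void main(String[] args) throws IOException {
        System.out.println("uptimeSeconds,freeAfterGcMB");
        for (String line : Files.readAllLines(Paths.get(args[0]))) {
            Matcher stamp = STAMP.matcher(line);
            if (!stamp.find()) {
                continue;                 // not a collection line
            }
            long after = -1, total = -1;
            Matcher heap = HEAP.matcher(line);
            while (heap.find()) {         // keep the last (whole-heap) figures
                after = Long.parseLong(heap.group(2));
                total = Long.parseLong(heap.group(3));
            }
            if (total > 0) {
                System.out.println(stamp.group(1) + "," + ((total - after) / 1024));
            }
        }
    }
}

The resulting CSV plots easily in a spreadsheet or gnuplot; GCViewer, which the graphs linked in this thread appear to come from, does the same job interactively.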


Re: OOM Error

2016-10-24 Thread Susheel Kumar
Thanks, Pushkar. Solr was already killed by the OOM script, so I believe we
can't get a heap dump.

Hi Shawn, I used the Solr service scripts to launch Solr, and it looks like
bin/solr doesn't include the below JVM parameter by default.

"-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/the/dump"

Is that something we should add to the Solr launch scripts, or maybe at least
include there in disabled (commented-out) form?

Thanks,
Susheel

On Mon, Oct 24, 2016 at 8:20 PM, Shawn Heisey  wrote:

> On 10/24/2016 4:27 PM, Susheel Kumar wrote:
> > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> > today. So far our solr cluster has been running fine but suddenly today
> > many of the VM's Solr instance got killed. I had 8G of heap allocated on
> 64
> > GB machines with 20+ GB of index size on each shards.
> >
> > What could be looked to find the exact root cause. I am suspecting of any
> > query (wildcard prefix query etc.) might have caused this issue.  The
> > ingestion and query load looks normal as other days.  I have the solr GC
> > logs as well.
>
> It is unlikely that you will be able to figure out exactly what is using
> too much memory from Solr logs.  The place where the OOM happens may be
> completely unrelated to the parts of the system that are using large
> amounts of memory.  That point is just the place where Java ran out of
> memory to allocate, which could happen when allocating a tiny amount of
> memory just as easily as it could happen when allocating a large amount
> of memory.
>
> What I can tell you has been placed on this wiki page:
>
> https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
>
> Thanks,
> Shawn
>
>


Re: OOM Error

2016-10-24 Thread Shawn Heisey
On 10/24/2016 4:27 PM, Susheel Kumar wrote:
> I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> today. So far our solr cluster has been running fine but suddenly today
> many of the VM's Solr instance got killed. I had 8G of heap allocated on 64
> GB machines with 20+ GB of index size on each shards.
>
> What could be looked to find the exact root cause. I am suspecting of any
> query (wildcard prefix query etc.) might have caused this issue.  The
> ingestion and query load looks normal as other days.  I have the solr GC
> logs as well.

It is unlikely that you will be able to figure out exactly what is using
too much memory from Solr logs.  The place where the OOM happens may be
completely unrelated to the parts of the system that are using large
amounts of memory.  That point is just the place where Java ran out of
memory to allocate, which could happen when allocating a tiny amount of
memory just as easily as it could happen when allocating a large amount
of memory.

What I can tell you has been placed on this wiki page:

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

Thanks,
Shawn



Re: OOM Error

2016-10-24 Thread Pushkar Raste
Did you look into the heap dump ?

On Mon, Oct 24, 2016 at 6:27 PM, Susheel Kumar 
wrote:

> Hello,
>
> I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's
> today. So far our solr cluster has been running fine but suddenly today
> many of the VM's Solr instance got killed. I had 8G of heap allocated on 64
> GB machines with 20+ GB of index size on each shards.
>
> What could be looked to find the exact root cause. I am suspecting of any
> query (wildcard prefix query etc.) might have caused this issue.  The
> ingestion and query load looks normal as other days.  I have the solr GC
> logs as well.
>
> Thanks,
> Susheel
>


OOM Error

2016-10-24 Thread Susheel Kumar
Hello,

I am seeing the OOM script kill Solr (Solr 6.0.0) on a couple of our VMs
today. So far our Solr cluster has been running fine, but suddenly today
the Solr instance on many of the VMs got killed. I had 8G of heap allocated on
64 GB machines, with 20+ GB of index size on each shard.

What could be looked at to find the exact root cause? I suspect a
query (wildcard prefix query, etc.) might have caused this issue.  The
ingestion and query load look normal, as on other days.  I have the Solr GC
logs as well.

Thanks,
Susheel


Re: Non-Heap OOM Error with Small Index Size

2014-06-12 Thread msoltow
We've managed to fix our issue, but just in case anyone has the same problem,
I wanted to identify our solution.

We were originally using the version of Tomcat that was packaged with CentOS
(Tomcat 6.0.24).  We tried downloading a newer version of Tomcat (7.0.52)
and running Solr there, and this fixed the problem.  We're not sure exactly
what the problem was, but that's all it took.

Hope this helps someone!

Michael





Non-Heap OOM Error with Small Index Size

2014-06-11 Thread msoltow
While running a Solr-based Web application on Tomcat 6, we have been
repeatedly running into Out of Memory issues.  However, these OOM errors are
not related to the Java heap.  A snapshot of our Solr dashboard just before
the OOM error reported:

Physical memory: 7.13/7.29 GB
JVM-Memory: 57.90 MB - 3.05 GB - 3.56 GB

In addition, the top command displays:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
25382 tomcat    20   0 9716m 6.8g 175m S 47.9 92.6 196:13.70 java

We're unsure as to why the physical memory usage is so much higher than the
JVM usage, especially given that the size of our index is roughly 500 MB. 
We were originally using OpenJDK, and we tried switching to Oracle JDK with
no luck.

Is it normal for physical memory usage to be this high?  We do not want to
upgrade our RAM if the problem is really just an error in the configuration.

I've attached environment info below, as well as an excerpt of the latest
OOM error report.

Thank you very much in advance.

Kind regards,
Michael


Additional info about our application:
We index documents from a remote location by retrieving them via a REST API. 
The entire remote repository is crawled at regular intervals by our
application.  Twenty-five documents are loaded at a time (the page size
provided by the API), and we manually commit each set of twenty-five
documents.  We do have auto-commit (but not auto-soft-commit) enabled with a
time of 60s, but an auto-commit has never actually occurred.

Solr Info:
Solr 4.8.0
524 MB Index Size
31 Fields
Just under 3000 documents
Directory factory is MMapDirectory
Caches enabled with default settings/size limits

Selected JVM Arguments:
-XX:MaxPermSize=128m
-Dorg.apache.pdfbox.baseParser.pushBackSize=524288
-Xmx4096m
-Xms1024m

Environment:
64-bit AWS EC2 running CentOS 6.5
Tomcat 6.0.24
7.5 GB RAM
Tried using both Oracle JDK 1.7.0_60 and Open JDK

OOM Log Entry:
OpenJDK 64-Bit Server VM warning: INFO:
os::commit_memory(0x000773c8, 366477312, 0) failed; error='Cannot
allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 366477312 bytes for
committing reserved memory.
# An error report file with more information is saved as:
# /tmp/jvm-18372/hs_error.log

OOM Error Report Snippet:
# Native memory allocation (malloc) failed to allocate 366477312 bytes for
committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2769), pid=18372, tid=140031038150400
#
# JRE version: OpenJDK Runtime Environment (7.0_55-b13) (build
1.7.0_55-mockbuild_2014_04_16_12_11-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.51-b03 mixed mode linux-amd64
compressed oops)





Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-22 Thread Umesh Prasad
Hi Shawn,
    Thanks for the advice :). The JVM heap size usage on the indexer machine
has been consistently about 95% (both total and old gen) for the past 3 days.
It might have nothing to do with Solr 3.6 vs. Solr 4.2, because the Solr 3.6
indexer gets restarted once every 2-3 days.
  Will investigate why memory usage is so high on the indexer.



On Wed, May 22, 2013 at 10:03 AM, Shawn Heisey s...@elyograg.org wrote:

 On 5/21/2013 9:22 PM, Umesh Prasad wrote:
  This is our own implementation of data source (canon name
  com.flipkart.w3.solr.MultiSPCMSProductsDataSource) , which pulls the data
  from out downstream service and it doesn't cache data in RAM. It fetches
  the data in batches of 200 and iterates over it when DIH asks for it. I
  will check the possibility of leak, but unlikely.
 Can OOM issue be because during analysis, IndexWriter finds the
  document to be too large to fit in 100 MB memory and can't flush to disk
 ?
  Our analyzer chain doesn't make easy (specially with a field like) (does
 a
  cross product of synonyms terms)

 If your documents are really large (hundreds of KB, or a few MB), you
 might need a bigger ramBufferSizeMB value ... but if that were causing
 problems, I would expect it to show up during import, not at commit time.

 How much of your 32GB heap is in use during indexing?  Would you be able
 to try with the heap at 31GB instead of 32GB?  One of Java's default
 optimizations (UseCompressedOops) gets turned off with a heap size of
 32GB because it doesn't work any more, and that might lead to strange
 things happening.

 Do you have the ability to try 4.3 instead of 4.2.1?

 Thanks,
 Shawn




-- 
---
Thanks & Regards
Umesh Prasad


Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Otis Gospodnetic
Hi,

Maybe you can share more info, such as your java command line or jstat
output from right before the OOM ...

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 21, 2013 1:58 AM, Umesh Prasad umesh.i...@gmail.com wrote:

 Hi All,
   I am hitting an OOM error while trying to do a hard commit on one of
 the cores.

 Transaction log dir is Empty and DIH shows indexing going on for  13 hrs..

 *Indexing since 13h 22m 22s*
 Requests: 5,211,392 (108/s), Fetched: 1,902,792 (40/s), Skipped: 106,853,
 Processed: 1,016,696 (21/s)
 Started: about 13 hours ago



 <response>
 <lst name="responseHeader"><int name="status">500</int><int
 name="QTime">4</int></lst><lst name="error"><str name="msg">this writer hit
 an OutOfMemoryError; cannot commit</str><str
 name="trace">java.lang.IllegalStateException: this writer hit an
 OutOfMemoryError; cannot commit
 at

 org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2661)
 at
 org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
 at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)
 at

 org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)
 at

 org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
 at

 org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
 at

 org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1055)
 at

 org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
 at

 org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
 at

 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
 at

 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
 at

 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
 at

 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at

 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at

 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at

 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at

 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at

 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at

 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at
 org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:554)
 at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
 at
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at

 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
 at
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)




 --
 ---
 Thanks & Regards
 Umesh Prasad



Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Shawn Heisey
On 5/20/2013 11:57 PM, Umesh Prasad wrote:
I am hitting an OOM error while trying to do a hard commit on one of
 the cores.
 
 Transaction log dir is Empty and DIH shows indexing going on for  13 hrs..
 
 *Indexing since 13h 22m 22s*
 Requests: 5,211,392 (108/s), Fetched: 1,902,792 (40/s), Skipped: 106,853,
 Processed: 1,016,696 (21/s)
 Started: about 13 hours ago

In addition to what Otis requested, can you also provide your dataimport
config file?  If you need to obscure connection details like username,
password, hostname, and port, that would be perfectly OK, but the
overall details from the connection must be intact.

Please use a paste website, like pastie.org, fpaste.org, or whatever
your favorite is, and send us link(s).

Thanks,
Shawn



Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Jack Krupansky

Try again on a machine with more memory. Or did you do that already?

-- Jack Krupansky

-Original Message- 
From: Umesh Prasad

Sent: Tuesday, May 21, 2013 1:57 AM
To: solr-user@lucene.apache.org
Subject: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

Hi All,
  I am hitting an OOM error while trying to do a hard commit on one of
the cores.

Transaction log dir is empty and DIH shows indexing going on for over 13 hrs.

*Indexing since 13h 22m 22s*
Requests: 5,211,392 (108/s), Fetched: 1,902,792 (40/s), Skipped: 106,853,
Processed: 1,016,696 (21/s)
Started: about 13 hours ago



<response>
<lst name="responseHeader"><int name="status">500</int><int name="QTime">4</int></lst>
<lst name="error"><str name="msg">this writer hit an OutOfMemoryError; cannot commit</str>
<str name="trace">java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
   at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2661)
   at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
   at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)
   at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)
   at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
   at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
   at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1055)
   at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
   at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:554)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
   at java.lang.Thread.run(Thread.java:662)




--
---
Thanks & Regards
Umesh Prasad 



Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Umesh Prasad
We have sufficient RAM on the machine (64 GB) and we have given the JVM 32 GB of
memory. The machine primarily runs indexing.

The JVM itself doesn't run out of memory; it is the IndexWriter of that particular
SolrCore which does. Maybe we have specified too little memory for the IndexWriter.

We index mainly product data and use DIH to pull data from downstream
services. Autocommit is off and commits are infrequent for legacy
reasons: roughly 1 commit every 2-3 hrs. If it makes a difference, a core can
have more than 10 lakh (1 million) documents uncommitted at a time. The IndexWriter
has a RAM buffer of 100 MB.
We ran with the same config on Solr 3.5 and never ran out of memory,
but then I hadn't tried hard commits on Solr 3.5.

Data-Source Entry:
<dataConfig>
  <dataSource name="products" type="MultiSPCMSProductsDataSource"
              spCmsHost="$config.spCmsHost" spCmsPort="$config.spCmsPort"
              spCmsTimeout="3" cmsBatchSize="200" psURL="$config.psUrl"
              autoCommit="false"/>
  <document name="products">
    <entity name="item" pk="id"
            transformer="w3.solr.transformers.GenericProductsTransformer"
            dataSource="products">
    </entity>
  </document>
</dataConfig>

IndexConfig:

<ramBufferSizeMB>100</ramBufferSizeMB>
<maxMergeDocs>2147483647</maxMergeDocs>
<maxFieldLength>5</maxFieldLength>
<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>1</commitLockTimeout>





On Tue, May 21, 2013 at 7:07 PM, Jack Krupansky j...@basetechnology.com wrote:

 Try again on a machine with more memory. Or did you do that already?

 -- Jack Krupansky

 -Original Message- From: Umesh Prasad
 Sent: Tuesday, May 21, 2013 1:57 AM
 To: solr-user@lucene.apache.org
 Subject: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1


 Hi All,
   I am hitting an OOM error while trying to do a hard commit on one of
 the cores.

 Transaction log dir is empty and DIH shows indexing going on for over 13 hrs.

 *Indexing since 13h 22m 22s*
 Requests: 5,211,392 (108/s), Fetched: 1,902,792 (40/s), Skipped: 106,853,
 Processed: 1,016,696 (21/s)
 Started: about 13 hours ago



 <response>
 <lst name="responseHeader"><int name="status">500</int><int name="QTime">4</int></lst>
 <lst name="error"><str name="msg">this writer hit an OutOfMemoryError; cannot commit</str>
 <str name="trace">java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
   at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2661)
   at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
   at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)
   at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)
   at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
   at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
   at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1055)
   at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
   at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:554)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
   at java.lang.Thread.run(Thread.java:662)

Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Shawn Heisey

On 5/21/2013 5:14 PM, Umesh Prasad wrote:

We have sufficient RAM on the machine (64 GB) and we have given the JVM 32 GB of
memory. The machine primarily runs indexing.

The JVM itself doesn't run out of memory; it is the IndexWriter of that particular
SolrCore which does. Maybe we have specified too little memory for the IndexWriter.

We index mainly product data and use DIH to pull data from downstream
services. Autocommit is off and commits are infrequent for legacy
reasons: roughly 1 commit every 2-3 hrs. If it makes a difference, a core can
have more than 10 lakh (1 million) documents uncommitted at a time. The IndexWriter
has a RAM buffer of 100 MB.
We ran with the same config on Solr 3.5 and never ran out of memory,
but then I hadn't tried hard commits on Solr 3.5.


Hard commits are the only kind of commits that Solr 3.x has.  It's soft 
commits that are new with 4.x.
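
For reference, a minimal solrconfig.xml sketch of how the two are typically
configured in 4.x (values are only illustrative, not taken from your setup):

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: flushes segments to disk and truncates the transaction log -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit (new in 4.x): makes documents searchable without a full flush -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>

With openSearcher=false the hard commit stays cheap and only controls
durability, while the soft commit controls visibility.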



Data-Source Entry:
<dataConfig>
<dataSource name="products" type="MultiSPCMSProductsDataSource"


This appears to be using a custom data source, not one of the well-known 
types.  If it had been JDBC, I would be saying that your JDBC driver is 
trying to cache the entire result set in RAM.  With a MySQL data source, 
a batchSize of -1 fixes this problem, by internally changing the JDBC 
fetchSize to Integer.MIN_VALUE.  Other databases have different mechanisms.
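
For comparison, a JDBC data source configured that way might look like this
(driver, URL, and credentials here are made up for illustration):

<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://dbhost:3306/products"
            user="reader" password="secret"
            batchSize="-1"/>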


With this data source, I have no idea at all how to make sure that it 
doesn't cache all results in RAM.  It might be that the combination of 
the new Solr and this custom data source causes a memory leak, something 
that doesn't happen with the old Solr version.


You said that the transaction log directory is empty.  That rules out 
one possibility, which would be solved by the autoCommit settings on 
this page:


http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup

Aside from the memory leak idea, or possibly having your entire source 
data cached in RAM, I have no idea what's happening here.


Thanks,
Shawn



Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Umesh Prasad
Hi Shawn,
This is our own implementation of a data source (canonical name
com.flipkart.w3.solr.MultiSPCMSProductsDataSource), which pulls the data
from our downstream service and doesn't cache data in RAM. It fetches
the data in batches of 200 and iterates over it when DIH asks for it. I
will check for the possibility of a leak, but it seems unlikely.
   Could the OOM be because, during analysis, the IndexWriter finds a
document too large to fit in the 100 MB buffer and can't flush to disk?
Our analyzer chain doesn't make that easy, especially with a field like the
one below (it does a cross product of synonym terms):

<fieldType name="textStemmed" class="solr.TextField" indexed="true"
    stored="false" multiValued="true" positionIncrementGap="100"
    omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.KStemFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"
        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.KStemFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"
        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
  </analyzer>
</fieldType>
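
For illustration only, hypothetical entries in a file like synonyms_index.txt
(not our real data):

tv, television, televisions
cell phone, cellphone, mobile phone, mobile

With expand=true every matching term is replaced by all of its variants at the
same position, and because the chain applies the synonym filter twice, those
variants can be expanded again; that is the cross product I am referring to.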




On Wed, May 22, 2013 at 5:03 AM, Shawn Heisey s...@elyograg.org wrote:

 On 5/21/2013 5:14 PM, Umesh Prasad wrote:

 We have sufficient RAM on the machine (64 GB) and we have given the JVM 32 GB of
 memory. The machine primarily runs indexing.

 The JVM itself doesn't run out of memory; it is the IndexWriter of that particular
 SolrCore which does. Maybe we have specified too little memory for the IndexWriter.

 We index mainly product data and use DIH to pull data from downstream
 services. Autocommit is off and commits are infrequent for legacy
 reasons: roughly 1 commit every 2-3 hrs. If it makes a difference, a core can
 have more than 10 lakh (1 million) documents uncommitted at a time. The IndexWriter
 has a RAM buffer of 100 MB.
   We ran with the same config on Solr 3.5 and never ran out of memory,
 but then I hadn't tried hard commits on Solr 3.5.


 Hard commits are the only kind of commits that Solr 3.x has.  It's soft
 commits that are new with 4.x.


  Data-Source Entry:
 <dataConfig>
 <dataSource name="products" type="MultiSPCMSProductsDataSource"


 This appears to be using a custom data source, not one of the well-known
 types.  If it had been JDBC, I would be saying that your JDBC driver is
 trying to cache the entire result set in RAM.  With a MySQL data source, a
 batchSize of -1 fixes this problem, by internally changing the JDBC
 fetchSize to Integer.MIN_VALUE.  Other databases have different mechanisms.

 With this data source, I have no idea at all how to make sure that it
 doesn't cache all results in RAM.  It might be that the combination of the
 new Solr and this custom data source causes a memory leak, something that
 doesn't happen with the old Solr version.

 You said that the transaction log directory is empty.  That rules out one
 possibility, which would be solved by the autoCommit settings on this page:

 http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup

 Aside from the memory leak idea, or possibly having your entire source
 data cached in RAM, I have no idea what's happening here.

 Thanks,
 Shawn




-- 
---
Thanks & Regards
Umesh Prasad


Re: Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-21 Thread Shawn Heisey
On 5/21/2013 9:22 PM, Umesh Prasad wrote:
 This is our own implementation of a data source (canonical name
 com.flipkart.w3.solr.MultiSPCMSProductsDataSource), which pulls the data
 from our downstream service and doesn't cache data in RAM. It fetches
 the data in batches of 200 and iterates over it when DIH asks for it. I
 will check for the possibility of a leak, but it seems unlikely.
    Could the OOM be because, during analysis, the IndexWriter finds a
 document too large to fit in the 100 MB buffer and can't flush to disk?
 Our analyzer chain doesn't make that easy, especially with a field like the
 one I posted (it does a cross product of synonym terms).

If your documents are really large (hundreds of KB, or a few MB), you
might need a bigger ramBufferSizeMB value ... but if that were causing
problems, I would expect it to show up during import, not at commit time.
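
For example (value purely illustrative), that would just be a change to the
setting you already have:

<ramBufferSizeMB>256</ramBufferSizeMB>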

How much of your 32GB heap is in use during indexing?  Would you be able
to try with the heap at 31GB instead of 32GB?  One of Java's default
optimizations (UseCompressedOops) gets turned off with a heap size of
32GB because it doesn't work any more, and that might lead to strange
things happening.
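
For example, something like this wherever your Tomcat startup sets JVM options
(sizes illustrative):

JAVA_OPTS="$JAVA_OPTS -Xms31g -Xmx31g"

You can also check whether compressed oops are actually in use for a given
heap size with:

java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops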

Do you have the ability to try 4.3 instead of 4.2.1?

Thanks,
Shawn



Hard Commit giving OOM Error on Index Writer in Solr 4.2.1

2013-05-20 Thread Umesh Prasad
Hi All,
   I am hitting an OOM error while trying to do a hard commit on one of
the cores.

Transaction log dir is empty and DIH shows indexing going on for over 13 hrs.

*Indexing since 13h 22m 22s*
Requests: 5,211,392 (108/s), Fetched: 1,902,792 (40/s), Skipped: 106,853,
Processed: 1,016,696 (21/s)
Started: about 13 hours ago



<response>
<lst name="responseHeader"><int name="status">500</int><int name="QTime">4</int></lst>
<lst name="error"><str name="msg">this writer hit an OutOfMemoryError; cannot commit</str>
<str name="trace">java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
   at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2661)
   at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
   at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)
   at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)
   at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
   at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
   at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1055)
   at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
   at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:554)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
   at java.lang.Thread.run(Thread.java:662)




-- 
---
Thanks & Regards
Umesh Prasad


PermGen OOM Error

2012-05-16 Thread richard.pog...@holidaylettings.co.uk
When running Solr we are experiencing PermGen OOM exceptions; the problem gets
worse and worse as more documents are added and committed.

Stopping the Java process does not seem to free the memory.

Has anyone experienced issues like this?

Kind regards,

Richard


Re: PermGen OOM Error

2012-05-16 Thread SH

So you will have to increase the memory available to the JVM. What servlet
container are you using?

SH

On 05/16/2012 01:50 PM, richard.pog...@holidaylettings.co.uk wrote:

When running Solr we are experiencing PermGen OOM exceptions; the problem gets
worse and worse as more documents are added and committed.

Stopping the Java process does not seem to free the memory.

Has anyone experienced issues like this?

Kind regards,

Richard



--
Silvio Hermann
Friedrich-Schiller-Universität Jena
Thüringer Universitäts- und Landesbibliothek
Bibliotheksplatz 2
07743 Jena
Phone: ++49 3641 940019
FAX:   ++49 3641 940022

http://www.historische-bestaende.de


Re: PermGen OOM Error

2012-05-16 Thread Jack Krupansky
PermGen memory has to do with the number of classes loaded, rather than the
number of documents.


Here are a couple of pages that help explain Java PermGen issues. The bottom 
line is that you can increase the PermGen space, or enable unloading of 
classes, or at least trace class loading to see why the problem occurs.


http://stackoverflow.com/questions/88235/how-to-deal-with-java-lang-outofmemoryerror-permgen-space-error

http://www.brokenbuild.com/blog/2006/08/04/java-jvm-gc-permgen-and-memory-options/
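
For example, JVM options along those lines (values are only examples; set them
wherever your servlet container picks up JVM options):

-XX:MaxPermSize=256m                                   (increase PermGen space)
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled  (allow class unloading with CMS)
-verbose:class                                         (trace class loading/unloading)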

-- Jack Krupansky

-Original Message- 
From: richard.pog...@holidaylettings.co.uk

Sent: Wednesday, May 16, 2012 7:50 AM
To: solr-user@lucene.apache.org
Subject: PermGen OOM Error

When running Solr we are experiencing PermGen OOM exceptions; the problem
gets worse and worse as more documents are added and committed.

Stopping the Java process does not seem to free the memory.

Has anyone experienced issues like this?

Kind regards,

Richard 



Re: OOM error during merge - index still ok?

2009-09-25 Thread Yonik Seeley
On Fri, Sep 25, 2009 at 8:20 AM, Phillip Farber pfar...@umich.edu wrote:
  Can I expect the index to be left in a usable state after an out-of-memory
 error during a merge, or is it most likely to be corrupt?

It should be in the state it was after the last successful commit.

-Yonik
http://www.lucidimagination.com

  I'd really hate to
 have to start this index build again from square one.  Thanks.

 Thanks,

 Phil

 ---
 Exception in thread "http-8080-Processor2505" java.lang.OutOfMemoryError: Java heap space
 Exception in thread "RMI TCP Connection(131)-141.213.128.155" java.lang.OutOfMemoryError: Java heap space
 Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: Java heap space
 Exception in thread "http-8080-Processor2537" java.lang.OutOfMemoryError: Java heap space
 Exception in thread "http-8080-Processor2483" java.lang.OutOfMemoryError: Java heap space
 Exception in thread "RMI Scheduler(0)" java.lang.OutOfMemoryError: Java heap space
 Exception in thread "Lucene Merge Thread #202" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
   at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
   at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
 Caused by: java.lang.OutOfMemoryError: Java heap space
 Exception in thread "Lucene Merge Thread #266" org.apache.lucene.index.MergePolicy$MergeException: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot merge
   at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
   at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
 Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot merge
   at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:4529)
   at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:4512)
   at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4424)
   at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
   at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
 WARN: The method class
 org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
 WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.
 WARN: The method class
 org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
 WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.