Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Erick Erickson
Bah, I should have said when you create a collection. You get the
following if you create your collection using the default schema:

WARNING: Using _default configset with data driven schema
functionality. NOT RECOMMENDED for production use.
 To turn off: bin/solr config -c eoe -p 8982 -action
set-user-property -property update.autoCreateFields -value false
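
If you'd rather hit the Config API directly, something along these lines should
set the same user property (the collection name "eoe" and port 8982 are just
carried over from the example above):

  curl http://localhost:8982/solr/eoe/config -H 'Content-type:application/json' -d '{
    "set-user-property": {"update.autoCreateFields": "false"}
  }'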

But as Shawn says, if you have add-unknown-fields-to-the-schema
configured in your update processor chain in solrconfig.xml, you're
using schemaless mode.

Best,
Erick
On Thu, Oct 11, 2018 at 9:58 AM Shawn Heisey  wrote:
>
> On 10/11/2018 10:07 AM, Mikhail Ibraheem wrote:
> > Hi Erick, thanks for your reply. No, we aren't using schemaless mode. The
> > schemaFactory element is not explicitly declared in our solrconfig.xml.
>
> Schemaless mode is not turned on by the schemaFactory config element.
>
> The default configurations that Solr ships with have schemaless mode
> enabled.  It is handled with an update processor chain named
> add-unknown-fields-to-the-schema.  As Erick mentioned, we *STRONGLY*
> recommend that you disable this in production.  In version 7.5, you only
> need to change whether that update chain is set to the default chain.
> This can be handled with a property (update.autoCreateFields) but you
> can just also edit solrconfig.xml to change the "default" attribute to
> false.
>
> The exception you're getting is
> org.apache.zookeeper.KeeperException$BadVersionException which comes
> from zookeeper.  Can you share the *entire* error?  It will likely be a
> few dozen lines in length and may contain multiple "Caused by" sections,
> each with their own stacktrace.  With the entire error, we can narrow
> down what part of Solr code is executing when the error happens.
>
> Thanks,
> Shawn
>


Which is the best option to secure the solr specific data in zookeeper?

2018-10-11 Thread Andaka Mundaka
Hi,

When Solr authentication is enabled, which is better: using ZK ACLs, or
enabling authentication for the whole ZooKeeper ensemble itself? Or is there
any other better option?
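
For context, the ZK ACL option we are looking at is the one wired up through
solr.in.sh, roughly like this (these are Solr's stock digest ACL/credentials
provider classes; the usernames and passwords are placeholders):

  SOLR_ZK_CREDS_AND_ACLS="-DzkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider \
    -DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
    -DzkDigestUsername=admin-user -DzkDigestPassword=CHANGE-ME \
    -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=CHANGE-ME"
  SOLR_OPTS="$SOLR_OPTS $SOLR_ZK_CREDS_AND_ACLS"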


Thanks,
Yamuna J


Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-11 Thread Shawn Heisey

On 10/11/2018 4:51 AM, yasoobhaider wrote:

Hi Shawn, thanks for the inputs.

I have uploaded the gc logs of one of the slaves here:
https://ufile.io/ecvag (should work till 18th Oct '18)

I uploaded the logs to gceasy as well and it says that the problem is
consecutive full GCs. According to the solution they have mentioned,
increasing the heap size is a solution. But I am already running on a pretty
big heap, so don't think increasing the heap size is going to be a long term
solution.


Surprisingly, the GC performance in that logfile is actually pretty 
good.  I was more interested in how much heap was actually being used 
than the performance.


The "Heap after GC" button on the gceasy report page (which controls 
which graph is shown) shows that you really are using most of that 
80GB.  If the info you shared about your index sizes is accurate, the
only way I can imagine this much heap being necessary is something in the
configuration.


It sounds like each system should have two configurations -- 
solrconfig.xml and the schema are the primary things in those configs.  
Can you share the unique configuration directories for each of your 
indexes, which I think means there will be two of them? For each of the 
configurations, indicate the number of documents and size on disk.  
You'll need to use a file sharing site.  It would be best to archive 
each directory into its own zipfile or .tar.gz file.


If your systems are running in cloud mode, the active configuration will 
be stored in zookeeper.


Thanks,
Shawn



Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Shawn Heisey

On 10/11/2018 10:07 AM, Mikhail Ibraheem wrote:

Hi Erick, thanks for your reply. No, we aren't using schemaless mode. The
schemaFactory element is not explicitly declared in our solrconfig.xml.


Schemaless mode is not turned on by the schemaFactory config element.

The default configurations that Solr ships with have schemaless mode 
enabled.  It is handled with an update processor chain named 
add-unknown-fields-to-the-schema.  As Erick mentioned, we *STRONGLY* 
recommend that you disable this in production.  In version 7.5, you only 
need to change whether that update chain is set to the default chain.  
This can be handled with a property (update.autoCreateFields), but you 
can also just edit solrconfig.xml to change the "default" attribute to 
false.
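
For reference, the relevant piece of the _default configset's solrconfig.xml
looks roughly like this (the processor list is trimmed here); flipping that
"${update.autoCreateFields:true}" default to false is the edit I mean:

  <updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
      default="${update.autoCreateFields:true}"
      processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.DistributedUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>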


The exception you're getting is 
org.apache.zookeeper.KeeperException$BadVersionException which comes 
from zookeeper.  Can you share the *entire* error?  It will likely be a 
few dozen lines in length and may contain multiple "Caused by" sections, 
each with their own stacktrace.  With the entire error, we can narrow 
down what part of Solr code is executing when the error happens.


Thanks,
Shawn



Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Terry Steichen
Erick,

I don't get any such message when I start solr - could you share what
that curl command should be?

You suggest modifying solrconfig.xml - could you be more explicit on
what changes to make?

Terry


On 10/11/2018 11:52 AM, Erick Erickson wrote:
> bq: Also why solr updates and persists the managed-schema while ingesting 
> data?
>
> I'd guess you are using "schemaless mode", which is expressly
> recommended _against_ for production systems. See "Schemaless Mode" in
> the reference guide.
>
> I'd disable schemaless mode (when you start Solr there should be a
> message telling you how to disable it via curl, but I'd modify my
> solrconfig.xml file to remove it permanently)
>
> Best,
> Erick
> On Thu, Oct 11, 2018 at 8:02 AM Mikhail Ibraheem
>  wrote:
>> Hi, we upgraded to Solr 7.5, and we are trying to ingest into Solr using
>> SolrJ with concurrent updates (many threads). We are getting this exception:
>>
>> o.a.s.s.ManagedIndexSchema Bad version when trying to persist schema using 1
>> due to: org.apache.zookeeper.KeeperException$BadVersionException:
>> KeeperErrorCode = BadVersion for /configs/my-core/managed-schema
>> o.a.s.s.ManagedIndexSchema Failed to persist managed schema at
>> /configs/my-core/managed-schema - version mismatch
>>
>> Also, why does Solr update and persist the managed-schema while ingesting
>> data? I see managed-schema shouldn't be affected by data updates.
>> Thanks



Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Mikhail Ibraheem
Hi Erick, thanks for your reply. No, we aren't using schemaless mode. The
schemaFactory element is not explicitly declared in our solrconfig.xml. Also,
we have only one replica and one shard.
Any help?
Thanks
Mikhail
On Thursday, 11 October 2018, 17:53:01 EET, Erick Erickson 
 wrote:  
 
 bq: Also why solr updates and persists the managed-schema while ingesting data?

I'd guess you are using "schemaless mode", which is expressly
recommended _against_ for production systems. See "Schemaless Mode" in
the reference guide.

I'd disable schemaless mode (when you start Solr there should be a
message telling you how to disable it via curl, but I'd modify my
solrconfig.xml file to remove it permanently)

Best,
Erick
On Thu, Oct 11, 2018 at 8:02 AM Mikhail Ibraheem
 wrote:
>
> Hi, we upgraded to Solr 7.5, and we are trying to ingest into Solr using
> SolrJ with concurrent updates (many threads). We are getting this exception:
>
> o.a.s.s.ManagedIndexSchema Bad version when trying to persist schema using 1
> due to: org.apache.zookeeper.KeeperException$BadVersionException:
> KeeperErrorCode = BadVersion for /configs/my-core/managed-schema
> o.a.s.s.ManagedIndexSchema Failed to persist managed schema at
> /configs/my-core/managed-schema - version mismatch
>
> Also, why does Solr update and persist the managed-schema while ingesting
> data? I see managed-schema shouldn't be affected by data updates.
> Thanks

Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Erick Erickson
bq: Also why solr updates and persists the managed-schema while ingesting data?

I'd guess you are using "schemaless mode", which is expressly
recommended _against_ for production systems. See "Schemaless Mode" in
the reference guide.

I'd disable schemaless mode (when you start Solr there should be a
message telling you how to disable it via curl, but I'd modify my
solrconfig.xml file to remove it permanently)

Best,
Erick
On Thu, Oct 11, 2018 at 8:02 AM Mikhail Ibraheem
 wrote:
>
> Hi, we upgraded to Solr 7.5, and we are trying to ingest into Solr using
> SolrJ with concurrent updates (many threads). We are getting this exception:
>
> o.a.s.s.ManagedIndexSchema Bad version when trying to persist schema using 1
> due to: org.apache.zookeeper.KeeperException$BadVersionException:
> KeeperErrorCode = BadVersion for /configs/my-core/managed-schema
> o.a.s.s.ManagedIndexSchema Failed to persist managed schema at
> /configs/my-core/managed-schema - version mismatch
>
> Also, why does Solr update and persist the managed-schema while ingesting
> data? I see managed-schema shouldn't be affected by data updates.
> Thanks


Re: Tika and Solr : rejected document due to mime type restrictions

2018-10-11 Thread Shawn Heisey

On 10/11/2018 9:06 AM, Bisonti Mario wrote:

I startup tika server from command line:
java -jar /opt/tika/tika-server-1.19.1.jar

I configured, with ManifoldCF a connector to Solr.

When I start the ingest of pdf and .xls documents, I see in the tika server:



so it seems that the Tika server processes the documents, but the Solr server
doesn't ingest them.

I obtain the error:
Solr connector rejected document due to mime type restrictions: 
(application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
Solr connector rejected document due to mime type restrictions: 
(application/pdf)


Those errors are not coming from Solr.  Do you see any errors in 
solr.log?  If you do, then we can help you with those.


Since ManifoldCF calls its components connectors, I am betting the 
errors are being generated by ManifoldCF, and that for those documents, 
nothing has actually been sent to Solr, so you won't see errors in the 
solr.log for those files.  ManifoldCF is a separate project within 
Apache, which has its own support infrastructure.


https://manifoldcf.apache.org/en_US/mail.html

Thanks,
Shawn



Tika and Solr : rejected document due to mime type restrictions

2018-10-11 Thread Bisonti Mario
Hello.
I startup tika server from command line:
java -jar /opt/tika/tika-server-1.19.1.jar

I configured, with ManifoldCF a connector to Solr.

When I start the ingest of pdf and .xls documents, I see in the tika server:

INFO  Setting the server's publish address to be http://localhost:9998/
INFO  Logging initialized @1053ms to org.eclipse.jetty.util.log.Slf4jLog
INFO  jetty-9.4.z-SNAPSHOT; built: 2018-06-05T18:24:03.829Z; git: 
d5fc0523cfa96bfebfbda19606cad384d772f04c; jvm 10.0.2+13-Ubuntu-1ubuntu0.18.04.2
INFO  Started ServerConnector@f74e835{HTTP/1.1,[http/1.1]}{localhost:9998}
INFO  Started @1134ms
WARN  Empty contextPath
INFO  Started o.e.j.s.h.ContextHandler@68d6972f{/,null,AVAILABLE}
INFO  Started Apache Tika server at http://localhost:9998/
INFO  meta (application/pdf)
INFO  meta (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
WARN  Using fallback font 'LiberationSans' for 'TimesNewRomanPS-BoldMT'
WARN  Using fallback font 'LiberationSans' for 'Arial-Black'
WARN  Using fallback font 'LiberationSans' for 'TimesNewRomanPSMT'
WARN  Using fallback font 'LiberationSans' for 'Arial-BoldMT'
WARN  Using fallback font 'LiberationSans' for 'ArialMT'
WARN  Using fallback font 'LiberationSans' for 'CourierNewPSMT'
WARN  Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
INFO  tika (application/pdf)
WARN  Using fallback font 'LiberationSans' for 'TimesNewRomanPS-BoldMT'
WARN  Using fallback font 'LiberationSans' for 'Arial-Black'
WARN  Using fallback font 'LiberationSans' for 'TimesNewRomanPSMT'
WARN  Using fallback font 'LiberationSans' for 'Arial-BoldMT'
WARN  Using fallback font 'LiberationSans' for 'ArialMT'
WARN  Using fallback font 'LiberationSans' for 'CourierNewPSMT'
WARN  Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
INFO  tika (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

so it seems that the Tika server processes the documents, but the Solr server
doesn't ingest them.

I obtain the error:
Solr connector rejected document due to mime type restrictions: 
(application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
Solr connector rejected document due to mime type restrictions: 
(application/pdf)

I understood that Tika converts all documents to text so they can be indexed
into Solr. Or are there any restrictions on the mime types the Tika server
handles?
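
As a sanity check (the file path here is just an example), I can apparently
send a file straight to the Tika server and see whether text comes back,
independent of ManifoldCF and Solr:

  curl -T /path/to/sample.pdf http://localhost:9998/tika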

Thanks a lot

Mario


ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Mikhail Ibraheem
Hi, we upgraded to Solr 7.5, and we are trying to ingest into Solr using SolrJ
with concurrent updates (many threads). We are getting this exception:

o.a.s.s.ManagedIndexSchema Bad version when trying to persist schema using 1
due to: org.apache.zookeeper.KeeperException$BadVersionException:
KeeperErrorCode = BadVersion for /configs/my-core/managed-schema
o.a.s.s.ManagedIndexSchema Failed to persist managed schema at
/configs/my-core/managed-schema - version mismatch

Also, why does Solr update and persist the managed-schema while ingesting
data? I see managed-schema shouldn't be affected by data updates.
Thanks

Re: Solr JVM Memory settings

2018-10-11 Thread Christopher Schultz

Shawn,

On 10/11/18 12:54 AM, Shawn Heisey wrote:
> On 10/10/2018 10:08 PM, Sourav Moitra wrote:
>> We have a Solr server with 8gb of memory. We are using Solr in
>> cloud mode, Solr version is 7.5, Java version is Oracle Java 9,
>> and the Xmx and Xms values are set to 2g, but we are observing
>> that RAM usage reaches 98% when doing indexing.
>> 
>> How can I ensure that SolrCloud doesn't use more than N GB of
>> memory ?
> 
> Where precisely are you seeing the 98% usage?  It is completely
> normal for a modern operating system to report that almost all the
> system memory is in use, at least after the system has been
> shuffling a lot of data.  All modern operating systems will use
> memory that has not been specifically allocated to programs for
> disk caching purposes, and system information tools will generally
> indicate that this memory is in use, even though it can be
> instantly claimed by any program that requests it.
> 
> https://en.wikipedia.org/wiki/Page_cache
> 
> If you tell a Java program that it is limited to a 2GB heap, then
> that program will never use more than 2GB, plus a little extra for
> the java runtime itself.  I cannot give you an exact figure for
> that little bit extra.  But every bit of data on disk that Solr
> accesses will end up (at least temporarily) in the operating
> system's disk cache -- using that unallocated memory.
> 
> https://wiki.apache.org/solr/SolrPerformanceProblems#RAM

To be fair, the JVM can use *much more* memory than you have specified
for your Java heap. It's just that the Java heap itself won't exceed
those values.

The JVM uses quite a bit of native memory which isn't counted in the
Java heap. There is only one way I know of to control that, and it's
to set a process-limit at the OS level on the amount of memory
allowed. I'm not sure how sensitive to those limits the JVM actually
is, so attempting to artificially constrain the JVM might end up with
a native OOM crash.
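
As a sketch only -- if Solr runs under systemd, that kind of cap would look
something like this (the 6G value is purely illustrative, and the override
file path is hypothetical):

  # /etc/systemd/system/solr.service.d/limit.conf
  [Service]
  MemoryLimit=6G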

-chris


Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-11 Thread Erick Erickson
I have to echo what others have said. An 80G heap is waaay outside the norm,
especially when you consider the size of your indexes and the number of docs.

Understanding why you think you need that much heap should be your top
priority. As has already been suggested, ensuring docValues are set for all
fields that are used for sorting, faceting and grouping is a must. Deep paging
can hurt too.
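
For example, a field you facet on would be declared something like this in the
schema (the field name here is made up); keep in mind that adding docValues to
an existing field means reindexing:

  <field name="category" type="string" indexed="true" stored="true" docValues="true"/>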

In addition, I'd check the cache settings. Do you have a huge filterCache?
What about the other caches? One common mistake is very high cache sizes;
in your setup I'd stick with 512 to start.
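
In solrconfig.xml that would be something along these lines (the numbers are
only a starting point):

  <filterCache class="solr.FastLRUCache"
               size="512"
               initialSize="512"
               autowarmCount="16"/>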

Without _data_ it's hard to say, so if those settings don't help, the next
thing I'd do is take a heap dump or put a profiler on the JVM and see where
the heap is actually allocated.
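
The stock JDK tooling is enough for a first look; a sketch (the pid and the
output path are placeholders):

  jmap -dump:live,format=b,file=/tmp/solr-heap.hprof <solr-pid>

and then open the .hprof file in something like Eclipse MAT.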

It's quite possible that you arrived at 80G with some mistaken
assumptions and once
those are cleared up you can reduce your heap a lot. You say "through
a lot of trial and error",
what exactly happens when you use, say, a 32G heap? OOMs? Slowdowns?

This is also starving your OS cache where most of the Lucene index
data is stored, see:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Best,
Erick


On Thu, Oct 11, 2018 at 4:42 AM yasoobhaider  wrote:
>
> Hi Shawn, thanks for the inputs.
>
> I have uploaded the gc logs of one of the slaves here:
> https://ufile.io/ecvag (should work till 18th Oct '18)
>
> I uploaded the logs to gceasy as well and it says that the problem is
> consecutive full GCs. According to the solution they have mentioned,
> increasing the heap size is a solution. But I am already running on a pretty
> big heap, so don't think increasing the heap size is going to be a long term
> solution.
>
> From what I understood from a bit more looking around, this is a Concurrent
> Mode Failure for CMS. I found an old blog mentioning the use of
> -XX:CMSFullGCsBeforeCompaction=1 to make sure that compaction is done prior
> to the next collection trigger. So if it is a fragmentation problem, this
> will hopefully solve it.
>
> I will also try out using docValues as suggested by Ere on a couple of
> fields on which we make a lot of faceting queries to reduce memory usage on
> the slaves.
>
> Please share any ideas that you may have from the gc logs analysis
>
> Thanks
> Yasoob
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr JVM Memory settings

2018-10-11 Thread Terry Steichen
Don't know if this directly affects what you're trying to do.  But I
have an 8GB server, and when I run "solr status" I can see what % of the
automatic memory allocation is being used.  As it turned out, solr would
occasionally exceed that (and crash).

I then began starting solr with the additional parameter "-m 1g".  Now
the solr consumption is almost always 50% or less, and I have had no
further problems.
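
For reference, that just means starting and checking it along these lines
(add the -c flag only if you run in cloud mode):

  bin/solr start -m 1g
  bin/solr status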


On 10/11/2018 12:08 AM, Sourav Moitra wrote:
> Hello,
>
> We have a Solr server with 8gb of memory. We are using Solr in cloud
> mode, Solr version is 7.5, Java version is Oracle Java 9, and the Xmx
> and Xms values are set to 2g, but we are observing that RAM usage
> reaches 98% when doing indexing.
>
> How can I ensure that SolrCloud doesn't use more than N GB of memory ?
>
> Sourav Moitra
> https://souravmoitra.com
>



Re: Realtime get not always returning existing data

2018-10-11 Thread Chris Ulicny
We are relatively far behind with this one. The collections that we
experience the problem on are currently running on 6.3.0. If it's easy
enough for you to upgrade, it might be worth a try, but I didn't see any
changes to the RealTimeGet in either of the 7.4/5 change logs after a
cursory glance.

Due to the volume and the number of different processes that use this cluster,
it requires more coordination to reindex and upgrade. So it's currently the
last one on our plan to get upgraded to 7.X (or 8.X if timing allows).

On Thu, Oct 11, 2018 at 8:22 AM sgaron cse  wrote:

> Hey Chris,
>
> Which version of SOLR are you running? I was thinking of maybe trying
> another version to see if it fixes the issue.
>
> On Thu, Oct 11, 2018 at 8:11 AM Chris Ulicny  wrote:
>
> > We've also run into that issue of not being able to reproduce it outside
> of
> > running production loads.
> >
> > However, we haven't been encountering the problem in live production
> quite
> > as much as we used to, and I think that might be from the /get requests
> > being spread out a little more evenly over the running interval which is
> > due to other process changes.
> >
> > If I get any new information, I'll update as well.
> >
> > Thanks for your help.
> >
> > On Wed, Oct 10, 2018 at 10:53 AM sgaron cse 
> wrote:
> >
> > > I haven't found a way to reproduce the problem other than running our
> > > entire set of code. I've also been trying different things to make sure
> > > the problem is not from my end and so far I haven't managed to fix it by
> > > changing my code. It has to be a race condition somewhere but I just
> > > can't put my finger on it.
> > >
> > > I'll message back if I find a way to reproduce.
> > >
> > > On Wed, Oct 10, 2018 at 10:48 AM Erick Erickson <
> erickerick...@gmail.com
> > >
> > > wrote:
> > >
> > > > Well assigning a bogus version that generates a 409 error then
> > > > immediately doing an RTG on the doc doesn't fail for me either 18
> > > > million tries later. So I'm afraid I haven't a clue where to go from
> > > > here. Unless we can somehow find a way to generate this failure I'm
> > > > going to drop it for the foreseeable future.
> > > >
> > > > Erick
> > > > On Tue, Oct 9, 2018 at 7:39 AM Erick Erickson <
> erickerick...@gmail.com
> > >
> > > > wrote:
> > > > >
> > > > > H. I wonder if a version conflict or perhaps other failure can
> > > > > somehow cause this. It shouldn't be very hard to add that to my
> test
> > > > > setup, just randomly add n _version_ field value.
> > > > >
> > > > > Erick
> > > > > On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson <
> > erickerick...@gmail.com
> > > >
> > > > wrote:
> > > > > >
> > > > > > Thanks. I'll be away for the rest of the week, so won't be able
> to
> > > try
> > > > > > anything more
> > > > > > On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny 
> > > wrote:
> > > > > > >
> > > > > > > In our case, we are heavily indexing in the collection while
> the
> > > /get
> > > > > > > requests are happening which is what we assumed was causing
> this
> > > > very rare
> > > > > > > behavior. However, we have experienced the problem for a
> > collection
> > > > where
> > > > > > > the following happens in sequence with minutes in between them.
> > > > > > >
> > > > > > > 1. Document id=1 is indexed
> > > > > > > 2. Document successfully retrieved with /get?id=1
> > > > > > > 3. Document failed to be retrieved with /get?id=1
> > > > > > > 4. Document successfully retrieved with /get?id=1
> > > > > > >
> > > > > > > We haven't looked at the issue in a while, so I don't have
> the
> > > > exact
> > > > > > > timing of that sequence on hand right now. I'll try to find an
> > > actual
> > > > > > > example, although I'm relatively certain it was multiple
> minutes
> > in
> > > > between
> > > > > > > each of those requests. However our autocommit (and soft
> commit)
> > > > times are
> > > > > > > 60s for both collections.
> > > > > > >
> > > > > > > I think the following two are probably the biggest differences
> > for
> > > > our
> > > > > > > setup, besides the version difference (v6.3.0):
> > > > > > >
> > > > > > > > index to this collection, perhaps not at a high rate
> > > > > > > > separate the machines running solr from the one doing any
> > > querying
> > > > or
> > > > > > > indexing
> > > > > > >
> > > > > > > The clients are on 3 hosts separate from the solr instances.
> The
> > > > total
> > > > > > > number of threads that are making updates and making /get
> > requests
> > > is
> > > > > > > around 120-150. About 40-50 per host. Each of our two
> collections
> > > > gets an
> > > > > > > average of 500 requests per second constantly for ~5 minutes,
> and
> > > > then the
> > > > > > > number slowly tapers off to essentially 0 after ~15 minutes.
> > > > > > >
> > > > > > > Every thread attempts to make the same series of requests.
> > > > > > >
> > > > > > > -- Update with "_version_=-1". If successful, no other requests
> > are
> > > > made.
> > > 

Re: Realtime get not always returning existing data

2018-10-11 Thread sgaron cse
Hey Chris,

Which version of SOLR are you running? I was thinking of maybe trying
another version to see if it fixes the issue.

On Thu, Oct 11, 2018 at 8:11 AM Chris Ulicny  wrote:

> We've also run into that issue of not being able to reproduce it outside of
> running production loads.
>
> However, we haven't been encountering the problem in live production quite
> as much as we used to, and I think that might be from the /get requests
> being spread out a little more evenly over the running interval which is
> due to other process changes.
>
> If I get any new information, I'll update as well.
>
> Thanks for your help.
>
> On Wed, Oct 10, 2018 at 10:53 AM sgaron cse  wrote:
>
> > I haven't found a way to reproduce the problem other than running our
> > entire set of code. I've also been trying different things to make sure
> > the problem is not from my end and so far I haven't managed to fix it by
> > changing my code. It has to be a race condition somewhere but I just
> > can't put my finger on it.
> >
> > I'll message back if I find a way to reproduce.
> >
> > On Wed, Oct 10, 2018 at 10:48 AM Erick Erickson  >
> > wrote:
> >
> > > Well assigning a bogus version that generates a 409 error then
> > > immediately doing an RTG on the doc doesn't fail for me either 18
> > > million tries later. So I'm afraid I haven't a clue where to go from
> > > here. Unless we can somehow find a way to generate this failure I'm
> > > going to drop it for the foreseeable future.
> > >
> > > Erick
> > > On Tue, Oct 9, 2018 at 7:39 AM Erick Erickson  >
> > > wrote:
> > > >
> > > > H. I wonder if a version conflict or perhaps other failure can
> > > > somehow cause this. It shouldn't be very hard to add that to my test
> > > > setup, just randomly add n _version_ field value.
> > > >
> > > > Erick
> > > > On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson <
> erickerick...@gmail.com
> > >
> > > wrote:
> > > > >
> > > > > Thanks. I'll be away for the rest of the week, so won't be able to
> > try
> > > > > anything more
> > > > > On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny 
> > wrote:
> > > > > >
> > > > > > In our case, we are heavily indexing in the collection while the
> > /get
> > > > > > requests are happening which is what we assumed was causing this
> > > very rare
> > > > > > behavior. However, we have experienced the problem for a
> collection
> > > where
> > > > > > the following happens in sequence with minutes in between them.
> > > > > >
> > > > > > 1. Document id=1 is indexed
> > > > > > 2. Document successfully retrieved with /get?id=1
> > > > > > 3. Document failed to be retrieved with /get?id=1
> > > > > > 4. Document successfully retrieved with /get?id=1
> > > > > >
> > > > > > We haven't looked at the issue in a while, so I don't have the
> > > exact
> > > > > > timing of that sequence on hand right now. I'll try to find an
> > actual
> > > > > > example, although I'm relatively certain it was multiple minutes
> in
> > > between
> > > > > > each of those requests. However our autocommit (and soft commit)
> > > times are
> > > > > > 60s for both collections.
> > > > > >
> > > > > > I think the following two are probably the biggest differences
> for
> > > our
> > > > > > setup, besides the version difference (v6.3.0):
> > > > > >
> > > > > > > index to this collection, perhaps not at a high rate
> > > > > > > separate the machines running solr from the one doing any
> > querying
> > > or
> > > > > > indexing
> > > > > >
> > > > > > The clients are on 3 hosts separate from the solr instances. The
> > > total
> > > > > > number of threads that are making updates and making /get
> requests
> > is
> > > > > > around 120-150. About 40-50 per host. Each of our two collections
> > > gets an
> > > > > > average of 500 requests per second constantly for ~5 minutes, and
> > > then the
> > > > > > number slowly tapers off to essentially 0 after ~15 minutes.
> > > > > >
> > > > > > Every thread attempts to make the same series of requests.
> > > > > >
> > > > > > -- Update with "_version_=-1". If successful, no other requests
> are
> > > made.
> > > > > > -- On 409 Conflict failure, it makes a /get request for the id
> > > > > > -- On doc:null failure, the client handles the error and moves on
> > > > > >
> > > > > > Combining this with the previous series of /get requests, we end
> up
> > > with
> > > > > > situations where an update fails as expected, but the subsequent
> > /get
> > > > > > request fails to retrieve the existing document:
> > > > > >
> > > > > > 1. Thread 1 updates id=1 successfully
> > > > > > 2. Thread 2 tries to update id=1, fails (409)
> > > > > > 3. Thread 2 tries to get id=1 succeeds.
> > > > > >
> > > > > > ...Minutes later...
> > > > > >
> > > > > > 4. Thread 3 tries to update id=1, fails (409)
> > > > > > 5. Thread 3 tries to get id=1, fails (doc:null)
> > > > > >
> > > > > > ...Minutes later...
> > > > > >
> > > > > > 6. Thread 4 tries to update id=1, fails (409)
> > > > > > 7. 

Re: Realtime get not always returning existing data

2018-10-11 Thread Chris Ulicny
We've also run into that issue of not being able to reproduce it outside of
running production loads.

However, we haven't been encountering the problem in live production quite
as much as we used to, and I think that might be from the /get requests
being spread out a little more evenly over the running interval which is
due to other process changes.

If I get any new information, I'll update as well.

Thanks for your help.

On Wed, Oct 10, 2018 at 10:53 AM sgaron cse  wrote:

> I haven't found a way to reproduce the problem other than running our
> entire set of code. I've also been trying different things to make sure
> the problem is not from my end and so far I haven't managed to fix it by
> changing my code. It has to be a race condition somewhere but I just can't
> put my finger on it.
>
> I'll message back if I find a way to reproduce.
>
> On Wed, Oct 10, 2018 at 10:48 AM Erick Erickson 
> wrote:
>
> > Well assigning a bogus version that generates a 409 error then
> > immediately doing an RTG on the doc doesn't fail for me either 18
> > million tries later. So I'm afraid I haven't a clue where to go from
> > here. Unless we can somehow find a way to generate this failure I'm
> > going to drop it for the foreseeable future.
> >
> > Erick
> > On Tue, Oct 9, 2018 at 7:39 AM Erick Erickson 
> > wrote:
> > >
> > > H. I wonder if a version conflict or perhaps other failure can
> > > somehow cause this. It shouldn't be very hard to add that to my test
> > > setup, just randomly add n _version_ field value.
> > >
> > > Erick
> > > On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson  >
> > wrote:
> > > >
> > > > Thanks. I'll be away for the rest of the week, so won't be able to
> try
> > > > anything more
> > > > On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny 
> wrote:
> > > > >
> > > > > In our case, we are heavily indexing in the collection while the
> /get
> > > > > requests are happening which is what we assumed was causing this
> > very rare
> > > > > behavior. However, we have experienced the problem for a collection
> > where
> > > > > the following happens in sequence with minutes in between them.
> > > > >
> > > > > 1. Document id=1 is indexed
> > > > > 2. Document successfully retrieved with /get?id=1
> > > > > 3. Document failed to be retrieved with /get?id=1
> > > > > 4. Document successfully retrieved with /get?id=1
> > > > >
> > > > > We haven't looked at the issue in a while, so I don't have the
> > exact
> > > > > timing of that sequence on hand right now. I'll try to find an
> actual
> > > > > example, although I'm relatively certain it was multiple minutes in
> > between
> > > > > each of those requests. However our autocommit (and soft commit)
> > times are
> > > > > 60s for both collections.
> > > > >
> > > > > I think the following two are probably the biggest differences for
> > our
> > > > > setup, besides the version difference (v6.3.0):
> > > > >
> > > > > > index to this collection, perhaps not at a high rate
> > > > > > separate the machines running solr from the one doing any
> querying
> > or
> > > > > indexing
> > > > >
> > > > > The clients are on 3 hosts separate from the solr instances. The
> > total
> > > > > number of threads that are making updates and making /get requests
> is
> > > > > around 120-150. About 40-50 per host. Each of our two collections
> > gets an
> > > > > average of 500 requests per second constantly for ~5 minutes, and
> > then the
> > > > > number slowly tapers off to essentially 0 after ~15 minutes.
> > > > >
> > > > > Every thread attempts to make the same series of requests.
> > > > >
> > > > > -- Update with "_version_=-1". If successful, no other requests are
> > made.
> > > > > -- On 409 Conflict failure, it makes a /get request for the id
> > > > > -- On doc:null failure, the client handles the error and moves on
> > > > >
> > > > > Combining this with the previous series of /get requests, we end up
> > with
> > > > > situations where an update fails as expected, but the subsequent
> /get
> > > > > request fails to retrieve the existing document:
> > > > >
> > > > > 1. Thread 1 updates id=1 successfully
> > > > > 2. Thread 2 tries to update id=1, fails (409)
> > > > > 3. Thread 2 tries to get id=1 succeeds.
> > > > >
> > > > > ...Minutes later...
> > > > >
> > > > > 4. Thread 3 tries to update id=1, fails (409)
> > > > > 5. Thread 3 tries to get id=1, fails (doc:null)
> > > > >
> > > > > ...Minutes later...
> > > > >
> > > > > 6. Thread 4 tries to update id=1, fails (409)
> > > > > 7. Thread 4 tries to get id=1 succeeds.
> > > > >
> > > > > As Steven mentioned, it happens very, very rarely. We tried to
> > recreate it
> > > > > in a more controlled environment, but ran into the same issue that
> > you are,
> > > > > Erick. Every simplified situation we ran produced no problems.
> Since
> > it's
> > > > > not a large issue for us and happens very rarely, we stopped trying
> > to
> > > > > recreate it.
> > > > >
> > > > >
> > > > > On Sun, Sep 

Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-11 Thread yasoobhaider
Hi Shawn, thanks for the inputs.

I have uploaded the gc logs of one of the slaves here:
https://ufile.io/ecvag (should work till 18th Oct '18)

I uploaded the logs to gceasy as well and it says that the problem is
consecutive full GCs. According to the solution they have mentioned,
increasing the heap size is a solution. But I am already running on a pretty
big heap, so don't think increasing the heap size is going to be a long term
solution.

From what I understood from a bit more looking around, this is a Concurrent
Mode Failure for CMS. I found an old blog mentioning the use of
-XX:CMSFullGCsBeforeCompaction=1 to make sure that compaction is done prior
to the next collection trigger. So if it is a fragmentation problem, this
will hopefully solve it.
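
The plan is to pass it through GC_TUNE in solr.in.sh, roughly like this
(keeping our existing CMS flags; the values are only illustrative, not a
recommendation):

  GC_TUNE="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSFullGCsBeforeCompaction=1"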

I will also try out using docValues as suggested by Ere on a couple of
fields on which we make a lot of faceting queries to reduce memory usage on
the slaves.

Please share any ideas that you may have from the gc logs analysis

Thanks
Yasoob



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr JVM Memory settings

2018-10-11 Thread Hendrik Haddorp
Besides the heap, the JVM has other memory areas, like the metaspace: 
https://docs.oracle.com/javase/9/tools/java.htm

-> MaxMetaspaceSize
Search for "size" in that document and you'll find tons of further 
settings. I have not tried out Oracle Java 9 yet.
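
If you want to cap some of those areas, solr.in.sh is the usual place; a
sketch with example sizes only:

  SOLR_OPTS="$SOLR_OPTS -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=512m"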


regards,
Hendrik

On 11.10.2018 06:08, Sourav Moitra wrote:

Hello,

We have a Solr server with 8gb of memory. We are using Solr in cloud
mode, Solr version is 7.5, Java version is Oracle Java 9, and the Xmx
and Xms values are set to 2g, but we are observing that RAM usage
reaches 98% when doing indexing.

How can I ensure that SolrCloud doesn't use more than N GB of memory ?

Sourav Moitra
https://souravmoitra.com