Re: Solr Geospatial Polygon Indexing/Querying Issue

2019-07-30 Thread David Smiley
On Tue, Jul 30, 2019 at 4:41 PM Sanders, Marshall (CAI - Atlanta) <
marshall.sande...@coxautoinc.com> wrote:

> I’ll explain the context around the use case we’re trying to solve and
> then attempt to respond as best I can to each of your points.  What we have
> is a list of documents that in our case the location is sometimes a point
> and sometimes a circle.  These basically represent (in our case) inventory
> at a physical location (point) or inventory that can be delivered to you
> within X km (configurable per document) which represents the circle use
> case.  We want to be able to allow a user to say I want all documents
> within X distance of my location, but also all documents that are able to
> be delivered to your point where the delivery distance is defined on the
> inventory (creating the circle).
>

That background info helps me understand things!


> This is why we were actually trying to combine both point based data and
> poly/circle data into a single geospatial field, since I don’t believe you
> could do something like fq=geofilt(latlng, x, y, d) OR
> geofilt(latlngCircle, x, y, 1) but perhaps we’re just not getting quite the
> right syntax, etc.
>

Oh quite possible :-).   It would look something like this:   fq= {!geofilt
sfield=latLng d=queryDistance} OR {!geofilt sfield=latLngCircle
d=0}&pt=myLocation
Notice the space after the fq= which is critical so that the first
local-params (i.e. first geofilt) does not "own" the entire filter query
string end to end.  Due to the space, the whole thing is parsed by the
default lucene/standard query parser, and then we have the two clauses
clearly there.  The second geofilt has distance 0; it'd be nice if it
internally optimized to a point but nonetheless it's fine.  Alternatively
there's another syntax to embed WKT where you can specify a point
explicitly... something like this: ...  {!field f=latLngCircle
v="Intersects(POINT(x y))"}

That said, it's also just fine to do as you were planning -- have one RPT
based field for the shape representation (mixture of points and circles),
and one LLPSF field purely for the center point that is used for sorting.
That LLPSF field would be indexed=false docValues=true since you wouldn't
be filtering on it.
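
As a rough sketch of how that pair of fields might be declared (field/type names and
attribute values here are illustrative, not a tested config):

  <fieldType name="shape_rpt" class="solr.RptWithGeometrySpatialField"
             distanceUnits="kilometers"/>
  <fieldType name="point" class="solr.LatLonPointSpatialField"/>

  <!-- mixture of points and circles, used for filtering -->
  <field name="latLngCircle" type="shape_rpt" indexed="true" stored="true"/>
  <!-- center point only, used for sorting; no need to index it -->
  <field name="latLngCenter" type="point" indexed="false" stored="false" docValues="true"/>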

>
> * Generally RptWithGeometrySpatialField should be used over
> SpatialRecursivePrefixTreeFieldType unless you want heatmaps or are willing
> to make trade-offs in higher index size and lossy precision in order to get
> faster search.  It's up to you; if you benchmark both I'd love to hear how
> it went.
>
> -We may explore both but typically we’re more interested in speed
> than accuracy, benchmarking it may be a very interesting exercise however.
> For sorting for instance we’re actually using sqedist instead of geodist
> because we’re not overly concerned about sorting accuracy.
>

Okay... though geodist on a LLPSF field is remarkably optimized.
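
e.g., sorting on the LLPSF field sketched above would be something like (same placeholder
field name and point as before):

  &sfield=latLngCenter&pt=45.15,-93.85&sort=geodist() asc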


> * I see you are using Geo3D, which is not the default.  Geo3D is strict
about the coordinate order -- counter-clockwise.  Your triangle is
> clockwise and thus it has an inverted interpretation -- thus it's a shape
> that covers nearly the whole globe.  I recently documented this
> https://issues.apache.org/jira/browse/SOLR-13467 but it's not published
> yet since it's so new.
>
> - Thanks for this clarification as well.  I had read this in the
> WKT docs too, again something we tried but really weren’t sure about what
> the right answer was and had been going back and forth on.  The
> documentation seems to specify that you need to specify either JTS or
> Geo3d, but doesn’t provide much info/guidance about which to use when and
> since JTS required adding another jar manually and therefore complicates
> our build process significantly (at least vs using Geo3D) we tried Geo3D.
> I’d love to hear more about the tradeoffs and other considerations between
> the two, but sounds like we should switch to JTS (the default, correct?)
>

The default spatialContextFactory is something internal; not JTS or Geo3D.
Based on your requirements, you needn't specify either JTS or Geo3D, mostly
because you don't actually need polygons.  I wouldn't bother specifying it
unless you want to experiment with some benchmarking.  JTS would give you
nothing here but Geo3D + prefixTree=S2 (in Solr 8.2) might be faster.
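
If you do benchmark that combination, a sketch of the field type (Solr 8.2 or later; the
attribute values are assumptions, not a tested config):

  <fieldType name="shape_rpt_s2" class="solr.RptWithGeometrySpatialField"
             spatialContextFactory="Geo3D" prefixTree="s2"
             distanceUnits="kilometers"/>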


> * You can absolutely index a circle in Solr -- this is something cool and
> somewhat unique. And you don't need format=legacy.  The documentation needs
to call it out better, though it at least refers to circles as a "buffered
> point" which is the currently supported way of representing it, and it does
> have one example.  Search for "BUFFER" and you'll see a WKT-like syntax to
> do it.  BUFFER is not standard WKT; it was added on to do this.  The first
> arg is a X Y center, and 2nd arg is a distance in decimal degrees (not
> km).  BTW Geo3D is a good choice here but not essential either.
>
> -   This sounds very promising and we’ll 
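
For reference, an indexed buffered-point circle as described above might look like this in
a document (a sketch only: 0.09 decimal degrees approximates a 10 km radius, since one
degree is roughly 111 km; verify that conversion for your own data):

  { "id": "42",
    "latLngCircle": "BUFFER(POINT(-93.85 45.15), 0.09)" }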

Solr 7.7.2 vs Solr 8.2.0

2019-07-30 Thread Arnold Bronley
Hi,

We are trying to decide whether we should upgrade to Solr 7.7.2 version or
Solr 8.2.0 version. We are currently on Solr 6.3.0 version.

On one hand, the 8.2.0 version feels like a good choice because it is the latest
version. But then experience tells us that initial versions usually have a lot
of bugs compared to the later LTS versions.

Also, there is one more issue. There is this major JIRA bug
https://issues.apache.org/jira/browse/SOLR-13336 which mostly won't get
fixed in any 7.x version, but is fixed in Solr 8.1. I checked and our Solr
configuration is vulnerable to it. Do you have any recommendation as to
which Solr version one should move to given these facts?


SOLR 8.1.1 EdgeNGramFilterFactory parsing query

2019-07-30 Thread Hodder, Rick
I have a SOLR 4.10.2 core, and I am upgrading to 8.1.1.
I created an 8.1.1 core manually using the _default configset, and then
brought the settings over into the 8.1.1 schema.
I have adjusted the schema.xml and solrconfig.xml, and I have the core 
queryable in 8.1.1.

I have a field named Company:





In 4.10.2 when I run the query:

IDX_Company:blue

with debugQuery on, I see the query parsed into pieces (correctly)

"debug": {
"rawquerystring": "IDX_Company:blue",
"querystring": "IDX_Company:blue",
"parsedquery": "(IDX_Company:b IDX_Company:bl IDX_Company:blu 
IDX_Company:blue)/no_coord",
...

When I run this against 8.1.1, with debugQuery on, I get the following:

"debug":{
"rawquerystring":"IDX_Company:blue",
"querystring":"IDX_Company:blue",
"parsedquery":"IDX_Company:blue",
...

It seems to not be applying the EdgeNGramFilterFactory - the only change I made 
to the EdgeNGramFilterFactory configuration was to remove the "side" attribute, 
per the documentation.
Also, per the documentation, I replaced the SynonymFilterFactory with
SynonymGraphFilterFactory, and added the FlattenGraphFilterFactory.

I have tried removing the FlattenGraphFilterFactory, I have cleared and 
repopulated the core (reindexed), I have stopped and started SOLR 8.1.1, and no 
difference.

Here is the definition of text_general I am using in schema.xml
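
(The XML itself did not survive the list archive. As a rough sketch only, a chain matching
the description above, with EdgeNGram minGramSize=1 to match the 4.10.2 parse shown
earlier, might look like this; filter order, file names and maxGramSize are guesses, not
the poster's actual config:)

  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.FlattenGraphFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
    </analyzer>
  </fieldType>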


Re: SolrCloud recommended I/O RAID level

2019-07-30 Thread Shawn Heisey

On 7/30/2019 12:12 PM, Kaminski, Adi wrote:
Indeed RAID10 with both mirroring and striping should satisfy the need,
but per some benchmarks found online there is still an impact on write
performance compared to RAID0, which is considered much better
(attaching a table that summarizes different RAID levels and their
pros/cons and capacity ratios).


RAID10 offers the best combination of performance and reliability. 
RAID0 might beat it *slightly* on performance, but if ANY drive fails on 
RAID0, the entire volume is lost.


If we have ~200-320 shards spread across our 7 Solr node servers (part of the
SolrCloud cluster) on a single core/collection configured with replication
factor 2, shouldn't that provide application-level redundancy of the indexed
data?


Yes, you could rely on Solr alone for data redundancy.  But if there's a 
drive failure, do you REALLY want to be single-stranded for the time it 
takes to rebuild the entire server and copy data?  That's what you would 
end up doing if you choose RAID0.


It is true that RAID1 or RAID10 means you have to buy double your usable 
capacity.  I would argue that drives are cheap and will cost less than 
either downtime or sysadmin effort.


Thanks,
Shawn


Re: SolrCloud recommended I/O RAID level

2019-07-30 Thread Kaminski, Adi
Hi Furkan,
Thanks for your response !

Indeed RAID10 with both mirroring and striping should satisfy the need, but per
some benchmarks found online there is still an impact on write performance on it
compared to RAID0, which is considered much better (attaching a table that
summarizes different RAID levels and their pros/cons and capacity ratios).

If we have ~200-320 shards spread across our 7 Solr node servers (part of the SolrCloud
cluster) on a single core/collection configured with replication factor 2,
shouldn't that provide application-level redundancy of the indexed data?
Solr will hold each shard, with its documents, in two places, and will ensure
that every shard is placed on 2 different servers, no?

If that's the case, why not choose RAID0, which gives the best of both read and
write performance?

Thanks,
Adi

Sent from Workspace ONE Boxer

On Jul 30, 2019 20:51, Furkan KAMACI  wrote:
Hi Adi,

RAID10 is good for satisfying both indexing and query, striping across
mirror sets. However, you lose half of your raw disk space, just like with
RAID1.

Here is a mail thread of mine which discusses RAID levels for Solr
specific:
https://lists.apache.org/thread.html/462d7467b2f2d064223eb46763a6a6e606ac670fe7f7b40858d97c0d@1366325333@%3Csolr-user.lucene.apache.org%3E

Kind Regards,
Furkan KAMACI

On Mon, Jul 29, 2019 at 10:25 PM Kaminski, Adi 
wrote:

> Hi,
> We are about to size large environment with 7 nodes/servers with
> replication factor 2 of SolrCloud cluster (using Solr 7.6).
>
> The system contains parent-child (nested documents) schema, and about to
> have 40M parent docs with 50-80 child docs each (in total 2-3.2B Solr docs).
>
> We have a use case that will require to update parent document fields
> triggered by an application flow (with re-indexing or atomic/partial update
> approach, that will probably require to upgrade to Solr 8.1.1 that supports
> this feature and contains some fixes in nested docs handling area).
>
> Since these updates might be quite heavy from IOPS perspective, we would
> like to make sure that the IO hardware and RAID configuration are optimized
> (r/w ratio of 50% read and 50% write, to allow balanced search and update
> flows).
>
> Can someone share similar scale/use- case/deployment RAID level
> configuration ?
> (I assume that RAID5&6 are not an option due to parity/dual parity heavy
> impact on write operations, so it leaves RAID 0, 1 or 10).
>
> Thanks in advance,
> Adi
>
>
>
>
> Sent from Workspace ONE Boxer
>
>
>




Re: Problem with solr suggester in case of non-ASCII characters

2019-07-30 Thread Szűcs Roland
Hi Furkan,

Thanks for the suggestion; I always forget the most effective debugging tool,
the Analysis page.

It turned out that "Jó" was a stop word and it was eliminated during the
text analysis. What I will do is to create a new field type but without
stop word removal and I will use it like this:
short_text_hu_without_stop_removal
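
(A sketch of what that could look like; the type name comes from above, but the analyzer
chain and attribute values here are assumptions:)

  <fieldType name="short_text_hu_without_stop_removal" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  and then point the suggester at it in solrconfig.xml:

  <str name="suggestAnalyzerFieldType">short_text_hu_without_stop_removal</str>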

Thanks again

Roland

Furkan KAMACI  wrote (on Tue, Jul 30, 2019,
16:17):

> Hi Roland,
>
> Could you check Analysis tab (
> https://lucene.apache.org/solr/guide/8_1/analysis-screen.html) and tell
> how
> the term is analyzed for both query and index?
>
> Kind Regards,
> Furkan KAMACI
>
> On Tue, Jul 30, 2019 at 4:50 PM Szűcs Roland 
> wrote:
>
> > Hi All,
> >
> > I have an author suggester (searchcomponent and the related request
> > handler) defined in solrconfig:
> > 
> > >
> > 
> >   author
> >   AnalyzingInfixLookupFactory
> >   DocumentDictionaryFactory
> >   BOOK_productAuthor
> >   short_text_hu
> >   suggester_infix_author
> >   false
> >   false
> >   2
> > 
> > 
> >
> >  > startup="lazy" >
> > 
> >   true
> >   10
> >   author
> > 
> > 
> >   suggest
> > 
> > 
> >
> > Author field has just a minimal text processing in query and index time
> > based on the following definition:
> >  > positionIncrementGap="100" multiValued="true">
> > 
> >   
> >   
> >> ignoreCase="true"/>
> >   
> > 
> > 
> >   
> >> ignoreCase="true"/>
> >   
> > 
> >   
> >> docValues="true"/>
> >> docValues="true" multiValued="true"/>
> >> positionIncrementGap="100">
> > 
> >   
> >   
> >words="lang/stopwords_ar.txt"
> > ignoreCase="true"/>
> >   
> >   
> > 
> >   
> >
> > When I use qeries with only ASCII characters, the results are correct:
> > "Al":{
> > "term":"Alexandre Dumas", "weight":0, "payload":""}
> >
> > When I try it with Hungarian authorname with special character:
> > "Jó":"author":{
> > "Jó":{ "numFound":0, "suggestions":[]}}
> >
> > When I try it with three letters, it works again:
> > "Józ":"author":{
> > "Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", "
> > weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "
> > payload":""}, { "term":"Eötvös József", "weight":0,
> "payload":""}, {
> > "term":"Eötvös József", "weight":0, "payload":""}, {
> > "term":"József
> > Attila", "weight":0, "payload":""}..
> >
> > Any idea how can it happen that a longer string has more matches than a
> > shorter one. It is inconsistent. What can I do to fix it as it would
> > results poor customer experience.
> > They would feel that sometimes they need 2 sometimes 3 characters to get
> > suggestions.
> >
> > Thanks in advance,
> > Roland
> >
>


Re: SolrCloud recommended I/O RAID level

2019-07-30 Thread Furkan KAMACI
Hi Adi,

RAID10 is good for satisfying both indexing and query, striping across
mirror sets. However, you lose half of your raw disk space, just like with
RAID1.

Here is a mail thread of mine which discusses RAID levels for Solr
specific:
https://lists.apache.org/thread.html/462d7467b2f2d064223eb46763a6a6e606ac670fe7f7b40858d97c0d@1366325333@%3Csolr-user.lucene.apache.org%3E

Kind Regards,
Furkan KAMACI

On Mon, Jul 29, 2019 at 10:25 PM Kaminski, Adi 
wrote:

> Hi,
> We are about to size large environment with 7 nodes/servers with
> replication factor 2 of SolrCloud cluster (using Solr 7.6).
>
> The system contains parent-child (nested documents) schema, and about to
> have 40M parent docs with 50-80 child docs each (in total 2-3.2B Solr docs).
>
> We have a use case that will require to update parent document fields
> triggered by an application flow (with re-indexing or atomic/partial update
> approach, that will probably require to upgrade to Solr 8.1.1 that supports
> this feature and contains some fixes in nested docs handling area).
>
> Since these updates might be quite heavy from IOPS perspective, we would
> like to make sure that the IO hardware and RAID configuration are optimized
> (r/w ratio of 50% read and 50% write, to allow balanced search and update
> flows).
>
> Can someone share similar scale/use- case/deployment RAID level
> configuration ?
> (I assume that RAID5&6 are not an option due to parity/dual parity heavy
> impact on write operations, so it leaves RAID 0, 1 or 10).
>
> Thanks in advance,
> Adi
>
>
>
>
> Sent from Workspace ONE Boxer
>
>
>


Re: Solr Geospatial Polygon Indexing/Querying Issue

2019-07-30 Thread Sanders, Marshall (CAI - Atlanta)
David,

Firstly, thanks for putting together such a thorough email it helps a lot to 
understand some of the things we were just guessing at because (as you 
mentioned a few times) the documentation around all of this is rather sparse.

I’ll explain the context around the use case we’re trying to solve and then 
attempt to respond as best I can to each of your points.  What we have is a 
list of documents that in our case the location is sometimes a point and 
sometimes a circle.  These basically represent (in our case) inventory at a 
physical location (point) or inventory that can be delivered to you within X km 
(configurable per document) which represents the circle use case.  We want to 
be able to allow a user to say I want all documents within X distance of my 
location, but also all documents that are able to be delivered to your point 
where the delivery distance is defined on the inventory (creating the circle).

This is why we were actually trying to combine both point based data and 
poly/circle data into a single geospatial field, since I don’t believe you 
could do something like fq=geofilt(latlng, x, y, d) OR geofilt(latlngCircle, x, 
y, 1) but perhaps we’re just not getting quite the right syntax, etc.

* Personally, I find it highly confusing to have a field named "latlng" and 
have it be anything other than a simple point -- it's all you have if given a 
single latitude longitude pair.  If you intend for the data to be a circle 
(either exactly or approximated) then perhaps call it latLngCircle

- This is happening because we’re trying to combine two different use 
cases into a single field, since I don’t think we have that option from the 
query side.  The name is really just us re-using our current field for this 
exploration, but would probably end up being named something different.

* geodist() and for that matter any other attempt to get the distance to a 
non-point shape is not going to work -- either error or confusing results; I 
forget.  This is hard to do and the logic isn't there for it, and probably 
wouldn't perform to user's expectations if it did.  This ought to be documented 
but seems not to be.

-Good to know, so no matter what we’ll have to have a point value 
stored somewhere for each document and calculate geodist on that.

* Generally RptWithGeometrySpatialField should be used over 
SpatialRecursivePrefixTreeFieldType unless you want heatmaps or are willing to 
make trade-offs in higher index size and lossy precision in order to get faster 
search.  It's up to you; if you benchmark both I'd love to hear how it went.

-We may explore both but typically we’re more interested in speed than 
accuracy, benchmarking it may be a very interesting exercise however.  For 
sorting for instance we’re actually using sqedist instead of geodist because 
we’re not overly concerned about sorting accuracy.

* In WKT format, the ordinate order is "X Y" (thus longitude then latitude).  
Looking at your triangle, it is extremely close to Antarctica, and I'm 
skeptical you intended that. This is not directly documented AFAICT but it's 
such a common mistake that it ought to be called out in the docs.

-Definitely did not intend it to be close to Antarctica.  I think we
tried both but probably went back to lat,long, which was definitely more common in
our (failed) testing.


* I see you are using Geo3D, which is not the default.  Geo3D is strict about 
the coordinate order -- counter-clockwise.  Your triangle is clockwise and thus 
it has an inverted interpretation -- thus it's a shape that covers nearly the 
whole globe.  I recently documented this 
https://issues.apache.org/jira/browse/SOLR-13467 but it's not published yet 
since it's so new.

- Thanks for this clarification as well.  I had read this in the WKT 
docs too, again something we tried but really weren’t sure about what the right 
answer was and had been going back and forth on.  The documentation seems to 
specify that you need to specify either JTS or Geo3d, but doesn’t provide much 
info/guidance about which to use when and since JTS required adding another jar 
manually and therefore complicates our build process significantly (at least vs 
using Geo3D) we tried Geo3D.  I’d love to hear more about the tradeoffs and 
other considerations between the two, but sounds like we should switch to JTS 
(the default, correct?)


* You can absolutely index a circle in Solr -- this is something cool and 
somewhat unique. And you don't need format=legacy.  The documentation needs to 
call this out better, though it at least refers to circles as a "buffered 
point" which is the currently supported way of representing it, and it does 
have one example.  Search for "BUFFER" and you'll see a WKT-like syntax to do 
it.  BUFFER is not standard WKT; it was added on to do this.  The first arg is 
a X Y center, and 2nd arg is a distance in decimal degrees (not km).  BTW Geo3D 
is a good choice here but 

Re: Solr Backup

2019-07-30 Thread Jan Høydahl
The FS backup feature requires a shared drive as you say, and this is clearly 
documented. No way around it. Cloud Filestore would likely fix it.
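
For reference, with a location that every node can read and write (Cloud Filestore or any
other shared mount), the call would be along these lines; the collection and path are
placeholders based on this thread:

  curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=coupon-backup&collection=coupon&location=/mnt/shared/backups"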

Or you could write a new backup repo plugin for backup directly to Google Cloud 
Storage?

Jan Høydahl

> On 30 Jul 2019, at 13:41, Jayadevan Maymala  wrote:
> 
> Hello all,
> 
> We have a 3-node Solr cluster running on google cloud platform. I would
> like to schedule a backup and have been trying the backup API and getting
> java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException error.
> I suspect it is because a shared drive is necessary. Google VM instances
> don't have this feature, unless I go for Cloud Filestore.
> Is there a work-around? Some way in which I can have the cluster back up
> taken on the node on which I am executing the backup command? Solr Version
> is 7.3
> 
> Regards,
> Jayadevan


Re: Basic Query Not Working - Please Help

2019-07-30 Thread Furkan KAMACI
Hi Vipul,

You are welcome!

Kind Regards,
Furkan KAMACI

On Fri, Jul 26, 2019 at 11:07 AM Vipul Bahuguna <
newthings4learn...@gmail.com> wrote:

> Hi Furkan -
>
> I realized that I was searching incorrectly.
> I later realized that if I need to search by a specific field, I need to do
> as you suggested -
> q=appname:App1 .
>
> OR if I need to simply search by App1, then I need to use copyField to
> index my field appname at the time of insertion so that it can later be
> searched without specifying the fieldname.
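
(For reference, the copyField approach mentioned there typically looks like this; _text_
is the catch-all field used by the _default configset, which also sets df to _text_:)

  <copyField source="appname" dest="_text_"/>

after which a plain q=App1 is searched against _text_ without naming the field.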
>
> thanks for your response.
>
> On Tue, Jul 23, 2019 at 6:07 AM Furkan KAMACI 
> wrote:
>
> > Hi Vipul,
> >
> > Which query do you submit? Is that one:
> >
> > q=appname:App1
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna <
> > newthings4learn...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have installed SOLR 8.1.1.
> > > I am new and trying the very basics.
> > >
> > > I installed solr8.1.1 on Windows and I am using SOLR in standalone
> mode.
> > >
> > > Steps I followed -
> > >
> > > 1. created a core as follows:
> > > solr create_core -c dox
> > >
> > > 2. updated the managed_schema.xml file to add few specific fields
> > specific
> > > to my schema as belows:
> > >
> > >  stored="true"/>
> > >  stored="true"/>
> > >  > stored="true"/>
> > >  > > stored="true"/>
> > >
> > > 3. then i restarted SOLR
> > >
> > > 4. then i went to the Documents tab to enter my sample data for
> indexing,
> > > which looks like below:
> > > {
> > >
> > >   "id" : "1",
> > >   "prjname" : "Project1",
> > >   "apps" : [
> > > {
> > >   "appname" : "App1",
> > >   "topics" : [
> > > {
> > >   "topicname" : "topic1",
> > >   "links" : [
> > > "http://www.google.com;,
> > > "http://www.t6.com;
> > >   ]
> > > },
> > > {
> > >   "topicname" : "topic2",
> > >   "links" : [
> > > "http://www.java.com;,
> > > "http://www.rediff.com;
> > >   ]
> > > }
> > >   ]
> > > },
> > > {
> > >   "appname" : "App2",
> > >   "topics" : [
> > > {
> > >   "topicname" : "topic3",
> > >   "links" : [
> > > "http://www.t3.com;,
> > > "http://www.t4.com;
> > >   ]
> > > },
> > > {
> > >   "topicname" : "topic4",
> > >   "links" : [
> > > "http://www.rules.com;,
> > > "http://www.amazon.com;
> > >   ]
> > > }
> > >   ]
> > > }
> > >   ]
> > > }
> > >
> > > 5. Now when i go to Query tab and click Execute Search with *.*, it
> shows
> > > my recently added document as follows:
> > > {
> > > "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
> > > "1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ {
> > "id":"1",
> > > "
> > > prjname":["Project1"], "apps":["{appname=App1,
> topics=[{topicname=topic1,
> > > links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
> > > links=[http://www.java.com, http://www.rediff.com]}]};,
> "{appname=App2,
> > > topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com
> > ]},
> > > {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com
> > > ]}]}"],
> > > "_version_":1639742305772503040}] }}
> > >
> > > 6. But now when I am trying to search based on the field topicname or
> > prjname,
> > > it does not return any documents. Even if I put anything in q, like App1,
> > zero
> > > results are being returned.
> > >
> > >
> > > Can someone help me understand what I might have done incorrectly?
> > > Maybe I defined my schema incorrectly.
> > >
> > > Thanks in advance
> > >
> >
>


Re: Problem with solr suggester in case of non-ASCII characters

2019-07-30 Thread Furkan KAMACI
Hi Roland,

Could you check Analysis tab (
https://lucene.apache.org/solr/guide/8_1/analysis-screen.html) and tell how
the term is analyzed for both query and index?

Kind Regards,
Furkan KAMACI

On Tue, Jul 30, 2019 at 4:50 PM Szűcs Roland 
wrote:

> Hi All,
>
> I have an author suggester (searchcomponent and the related request
> handler) defined in solrconfig:
> 
> >
> 
>   author
>   AnalyzingInfixLookupFactory
>   DocumentDictionaryFactory
>   BOOK_productAuthor
>   short_text_hu
>   suggester_infix_author
>   false
>   false
>   2
> 
> 
>
>  startup="lazy" >
> 
>   true
>   10
>   author
> 
> 
>   suggest
> 
> 
>
> Author field has just a minimal text processing in query and index time
> based on the following definition:
>  positionIncrementGap="100" multiValued="true">
> 
>   
>   
>ignoreCase="true"/>
>   
> 
> 
>   
>ignoreCase="true"/>
>   
> 
>   
>docValues="true"/>
>docValues="true" multiValued="true"/>
>positionIncrementGap="100">
> 
>   
>   
>ignoreCase="true"/>
>   
>   
> 
>   
>
> When I use qeries with only ASCII characters, the results are correct:
> "Al":{
> "term":"Alexandre Dumas", "weight":0, "payload":""}
>
> When I try it with Hungarian authorname with special character:
> "Jó":"author":{
> "Jó":{ "numFound":0, "suggestions":[]}}
>
> When I try it with three letters, it works again:
> "Józ":"author":{
> "Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", "
> weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "
> payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, {
> "term":"Eötvös József", "weight":0, "payload":""}, {
> "term":"József
> Attila", "weight":0, "payload":""}..
>
> Any idea how can it happen that a longer string has more matches than a
> shorter one. It is inconsistent. What can I do to fix it as it would
> results poor customer experience.
> They would feel that sometimes they need 2 sometimes 3 characters to get
> suggestions.
>
> Thanks in advance,
> Roland
>


Re: Solr Backup

2019-07-30 Thread Shawn Heisey

On 7/30/2019 7:11 AM, Jayadevan Maymala wrote:

We will need the *FULL* error message.  It is probably dozens of lines
long and MIGHT contain multiple "Caused by" sections.


{
   "responseHeader":{
 "status":500,
 "QTime":22},
   "Operation backup caused
exception:":"java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException:
/backups/coupon",


That is the response, and it doesn't have the information I was hoping 
to see.  It doesn't show what happened ... basically it says there was 
an error, but doesn't have any real detail about what the error was. 
Analyzing what stacktrace data is available doesn't reveal anything useful.


I was hoping for actual logfile data.  The solr.log file from the server 
side may contain a more complete error.  If you can use a file sharing 
website or paste website to share the whole logfile, we might be able to 
find more information.


Thanks,
Shawn


Problem with solr suggester in case of non-ASCII characters

2019-07-30 Thread Szűcs Roland
Hi All,

I have an author suggester (searchcomponent and the related request
handler) defined in solrconfig:

>

  author
  AnalyzingInfixLookupFactory
  DocumentDictionaryFactory
  BOOK_productAuthor
  short_text_hu
  suggester_infix_author
  false
  false
  2





  true
  10
  author


  suggest



Author field has just a minimal text processing in query and index time
based on the following definition:


  
  
  
  


  
  
  

  
  
  
  

  
  
  
  
  

  

When I use queries with only ASCII characters, the results are correct:
"Al":{
"term":"Alexandre Dumas", "weight":0, "payload":""}

When I try it with Hungarian authorname with special character:
"Jó":"author":{
"Jó":{ "numFound":0, "suggestions":[]}}

When I try it with three letters, it works again:
"Józ":"author":{
"Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", "
weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "
payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, {
"term":"Eötvös József", "weight":0, "payload":""}, {
"term":"József
Attila", "weight":0, "payload":""}..

Any idea how it can happen that a longer string has more matches than a
shorter one? It is inconsistent. What can I do to fix it, as it would
result in a poor customer experience?
They would feel that sometimes they need 2 and sometimes 3 characters to get
suggestions.

Thanks in advance,
Roland


Re: Solr Backup

2019-07-30 Thread Jayadevan Maymala
On Tue, Jul 30, 2019 at 5:56 PM Shawn Heisey  wrote:

> On 7/30/2019 5:41 AM, Jayadevan Maymala wrote:
> > We have a 3-node Solr cluster running on google cloud platform. I would
> > like to schedule a backup and have been trying the backup API and getting
> > java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException
> error.
> > I suspect it is because a shared drive is necessary. Google VM instances
> > don't have this feature, unless I go for Cloud Filestore.
> > Is there a work-around? Some way in which I can have the cluster back up
> > taken on the node on which I am executing the backup command? Solr
> Version
> > is 7.3
>
> We will need the *FULL* error message.  It is probably dozens of lines
> long and MIGHT contain multiple "Caused by" sections.


{
  "responseHeader":{
"status":500,
"QTime":22},
  "Operation backup caused
exception:":"java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException:
/backups/coupon",
  "exception":{
"msg":"/backups/coupon",
"rspCode":-1},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.common.SolrException"],
"msg":"/backups/coupon",
"trace":"org.apache.solr.common.SolrException: /backups/coupon\n\tat
org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:258)\n\tat
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:230)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)\n\tat
org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:736)\n\tat
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:717)\n\tat
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:498)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:530)\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)\n\tat
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)\n\tat
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)\n\tat
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)\n\tat
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)\n\tat
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)\n\tat
java.lang.Thread.run(Thread.java:748)\n",
"code":500}}

> We will also need
> the complete Solr version to be able to interpret the error -- 7.3 is
> not 

Re: [ZOOKEEPER] - Error - HEAP MEMORY

2019-07-30 Thread Rodrigo Oliveira
Hi,

Here follows the process listing, and more data about my SOLR + ZOOKEEPER.

root  48425  1 26 Jul29 ?03:00:39 java -server -Xms28g
-Xmx32g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4
-XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark
-XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000
-XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled
-XX:-OmitStackTraceInFastThrow -verbose:gc -XX:+PrintHeapAtGC
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
-Xloggc:/solr/server/logs/solr_gc.log -XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M -DzkClientTimeout=15000
-DzkHost=177.55.55.152:2181,177.55.55.153:2181,177.55.55.154:2181,
177.55.55.155:2181,177.55.55.156:2181 -Dsolr.log.dir=/solr/server/logs
-Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
-Djetty.home=/solr/server -Dsolr.solr.home=/solr/server/solr
-Dsolr.data.home= -Dsolr.install.dir=/solr
-Dsolr.default.confdir=/solr/server/solr/configsets/_default/conf -Xss256k
-Dsolr.jetty.https.port=8983 -Dsolr.log.muteconsole
-XX:OnOutOfMemoryError=/solr/bin/oom_solr.sh 8983 /solr/server/logs -jar
start.jar --module=http

root  48163  1  0 Jul29 ?00:01:33 java
-Dzookeeper.log.dir=/zoop/bin/../logs
-Dzookeeper.log.file=zookeeper-root-server-eddison0001.log
-Dzookeeper.root.logger=INFO,CONSOLE -XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p -cp
/zoop/bin/../zookeeper-server/target/classes:/zoop/bin/../build/classes:/zoop/bin/../zookeeper-server/target/lib/*.jar:/zoop/bin/../build/lib/*.jar:/zoop/bin/../lib/zookeeper-jute-3.5.5.jar:/zoop/bin/../lib/zookeeper-3.5.5.jar:/zoop/bin/../lib/slf4j-log4j12-1.7.25.jar:/zoop/bin/../lib/slf4j-api-1.7.25.jar:/zoop/bin/../lib/netty-all-4.1.29.Final.jar:/zoop/bin/../lib/log4j-1.2.17.jar:/zoop/bin/../lib/json-simple-1.1.1.jar:/zoop/bin/../lib/jline-2.11.jar:/zoop/bin/../lib/jetty-util-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-servlet-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-server-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-security-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-io-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-http-9.4.17.v20190418.jar:/zoop/bin/../lib/javax.servlet-api-3.1.0.jar:/zoop/bin/../lib/jackson-databind-2.9.8.jar:/zoop/bin/../lib/jackson-core-2.9.8.jar:/zoop/bin/../lib/jackson-annotations-2.9.0.jar:/zoop/bin/../lib/commons-cli-1.2.jar:/zoop/bin/../lib/audience-annotations-0.5.0.jar:/zoop/bin/../zookeeper-*.jar:/zoop/bin/../zookeeper-server/src/main/resources/lib/*.jar:/zoop/bin/../conf:
-*Xmx1000m -Xmx4096m* -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
org.apache.zookeeper.server.quorum.QuorumPeerMain /zoop/conf/zoo.cfg
root  91749  91497  0 09:43 pts/000:00:00 grep --color=auto -i zook
[09:43:55] root@eddison0001:~$

To answer the question, look at the ZOOKEEPER process: 2 Xmx values.

By the way, I changed the SOLR settings (Xms 28 GB / Xmx 32 GB) because, after about
5 days of using SOLR, I received the Heap Memory message in SOLR. Nowadays
I no longer get the Heap Memory message in Solr.

Regards,


On Tue, Jul 30, 2019 at 08:41, Jörn Franke 
wrote:

> 2 xmx does not make sense,
>
> Your heap seems unusual large usually your heap should be much smaller
> than available memory so solr can use it for index caching which is off-heap
>
> > On 30.07.2019 at 13:25, Rodrigo Oliveira <
> adamantina.rodr...@gmail.com> wrote:
> >
> > Hi,
> >
> > My environment have 5 servers with solr + zookeeper in the same hosts.
> >
> > However, I've had 48 RAM each servers (solr - xms 28 gb and xmx - 32 gb).
> >
> > Looking for my servers and process, in zookeepers has xmx 1000 mb and xmx
> > 4096 mb (last item, was change for me).
> >
> > Why 2 values for xmx?
> >
> > Regards,
> >
> > On Tue, Jul 30, 2019 at 04:44, Dominique Bejean <
> dominique.bej...@eolya.fr>
> > wrote:
> >
> >> Hi,
> >>
> >> I don’t find any documentation about the parameter
> >> zookeeper_server_java_heaps
> >> in zoo.cfg.
> >> The way to control java heap size is either the java.env file of the
> >> zookeeper-env.sh file. In zookeeper-env.sh
> >> SERVER_JVMFLAGS="-Xmx=512m"
> >>
> >> How many RAM on your server ?
> >>
> >> Regards
> >>
> >> Dominique
> >>
> >>
> >>
> >>
> >> On Mon, Jul 29, 2019 at 20:35, Rodrigo Oliveira <
> >> adamantina.rodr...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> After 3 days running, my zookeeper showing this error.
> >>>
> >>> 2019-07-29 15:10:41,906 [myid:1] - WARN
> >>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
> >>> Connection broken for id 4332550065071534382, my id = 1, error =
> >>> java.io.IOException: Received packet with invalid packet: 824196618
> >>> at
> >>>
> >>>
> >>
> 

Re: Solr Backup

2019-07-30 Thread Shawn Heisey

On 7/30/2019 5:41 AM, Jayadevan Maymala wrote:

We have a 3-node Solr cluster running on google cloud platform. I would
like to schedule a backup and have been trying the backup API and getting
java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException error.
I suspect it is because a shared drive is necessary. Google VM instances
don't have this feature, unless I go for Cloud Filestore.
Is there a work-around? Some way in which I can have the cluster back up
taken on the node on which I am executing the backup command? Solr Version
is 7.3


We will need the *FULL* error message.  It is probably dozens of lines 
long and MIGHT contain multiple "Caused by" sections.  We will also need 
the complete Solr version to be able to interpret the error -- 7.3 is 
not specific enough.


Thanks,
Shawn


Re: [ZOOKEEPER] - Error - HEAP MEMORY

2019-07-30 Thread Jörn Franke
2 xmx does not make sense,

Your heap seems unusual large usually your heap should be much smaller than 
available memory so solr can use it for index caching which is off-heap

> On 30.07.2019 at 13:25, Rodrigo Oliveira 
> wrote:
> 
> Hi,
> 
> My environment have 5 servers with solr + zookeeper in the same hosts.
> 
> However, I've had 48 RAM each servers (solr - xms 28 gb and xmx - 32 gb).
> 
> Looking for my servers and process, in zookeepers has xmx 1000 mb and xmx
> 4096 mb (last item, was change for me).
> 
> Why 2 values for xmx?
> 
> Regards,
> 
> On Tue, Jul 30, 2019 at 04:44, Dominique Bejean 
> wrote:
> 
>> Hi,
>> 
>> I don’t find any documentation about the parameter
>> zookeeper_server_java_heaps
>> in zoo.cfg.
>> The way to control java heap size is either the java.env file of the
>> zookeeper-env.sh file. In zookeeper-env.sh
>> SERVER_JVMFLAGS="-Xmx=512m"
>> 
>> How many RAM on your server ?
>> 
>> Regards
>> 
>> Dominique
>> 
>> 
>> 
>> 
>> On Mon, Jul 29, 2019 at 20:35, Rodrigo Oliveira <
>> adamantina.rodr...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> After 3 days running, my zookeeper showing this error.
>>> 
>>> 2019-07-29 15:10:41,906 [myid:1] - WARN
>>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
>>> Connection broken for id 4332550065071534382, my id = 1, error =
>>> java.io.IOException: Received packet with invalid packet: 824196618
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
>>> 2019-07-29 15:10:41,906 [myid:1] - WARN
>>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] -
>>> Interrupting SendWorker
>>> 2019-07-29 15:10:41,907 [myid:1] - WARN
>>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] -
>>> Interrupted while waiting for message on queue
>>> java.lang.InterruptedException
>>> at
>>> 
>>> 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>>> at
>>> 
>>> 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>>> at
>>> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
>>> 2019-07-29 15:10:41,907 [myid:1] - WARN
>>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] -
>> Send
>>> worker leaving thread  id 4332550065071534382 my id = 1
>>> 2019-07-29 15:10:41,917 [myid:1] - INFO  [/177.55.55.152:3888
>>> :QuorumCnxManager$Listener@888] - Received connection request /
>>> 177.55.55.63:53972
>>> 2019-07-29 15:10:41,920 [myid:1] - WARN
>>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
>>> Connection broken for id 4332550065071534382, my id = 1, error =
>>> java.io.IOException: Received packet with invalid packet: 840973834
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
>>> 2019-07-29 15:10:41,921 [myid:1] - WARN
>>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] -
>>> Interrupting SendWorker
>>> 2019-07-29 15:10:41,922 [myid:1] - WARN
>>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] -
>>> Interrupted while waiting for message on queue
>>> java.lang.InterruptedException
>>> at
>>> 
>>> 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>>> at
>>> 
>>> 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>>> at
>>> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
>>> at
>>> 
>>> 
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
>>> 2019-07-29 15:10:41,922 [myid:1] - WARN
>>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] -
>> Send
>>> worker leaving thread  id 4332550065071534382 my id = 1
>>> 2019-07-29 15:10:41,932 [myid:1] - INFO  [/177.55.55.152:3888
>>> :QuorumCnxManager$Listener@888] - Received connection request /
>>> 177.55.55.63:38633
>>> 2019-07-29 15:10:41,933 [myid:1] - WARN
>>> [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1176] -
>>> Connection broken for id 4332550065071534638, my id = 1, error =
>>> java.io.IOException: Received packet with invalid packet: 807419402
>>> at
>>> 
>>> 
>> 

Solr Backup

2019-07-30 Thread Jayadevan Maymala
Hello all,

We have a 3-node Solr cluster running on google cloud platform. I would
like to schedule a backup and have been trying the backup API and getting
java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException error.
I suspect it is because a shared drive is necessary. Google VM instances
don't have this feature, unless I go for Cloud Filestore.
Is there a work-around? Some way in which I can have the cluster back up
taken on the node on which I am executing the backup command? Solr Version
is 7.3

Regards,
Jayadevan


Re: [ZOOKEEPER] - Error - HEAP MEMORY

2019-07-30 Thread Rodrigo Oliveira
Hi,

My environment has 5 servers with solr + zookeeper on the same hosts.

However, I have 48 GB RAM on each server (solr - Xms 28 GB and Xmx 32 GB).

Looking at my servers and processes, the zookeeper process has Xmx 1000 MB and Xmx
4096 MB (the last one was changed by me).

Why 2 values for Xmx?

Regards,

On Tue, Jul 30, 2019 at 04:44, Dominique Bejean 
wrote:

> Hi,
>
> I don’t find any documentation about the parameter
> zookeeper_server_java_heaps
> in zoo.cfg.
> The way to control java heap size is either the java.env file of the
> zookeeper-env.sh file. In zookeeper-env.sh
> SERVER_JVMFLAGS="-Xmx=512m"
>
> How many RAM on your server ?
>
> Regards
>
> Dominique
>
>
>
>
> On Mon, Jul 29, 2019 at 20:35, Rodrigo Oliveira <
> adamantina.rodr...@gmail.com> wrote:
>
> > Hi,
> >
> > After 3 days running, my zookeeper showing this error.
> >
> > 2019-07-29 15:10:41,906 [myid:1] - WARN
> >  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
> > Connection broken for id 4332550065071534382, my id = 1, error =
> > java.io.IOException: Received packet with invalid packet: 824196618
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
> > 2019-07-29 15:10:41,906 [myid:1] - WARN
> >  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] -
> > Interrupting SendWorker
> > 2019-07-29 15:10:41,907 [myid:1] - WARN
> >  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] -
> > Interrupted while waiting for message on queue
> > java.lang.InterruptedException
> > at
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> > at
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> > at
> > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
> > 2019-07-29 15:10:41,907 [myid:1] - WARN
> >  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] -
> Send
> > worker leaving thread  id 4332550065071534382 my id = 1
> > 2019-07-29 15:10:41,917 [myid:1] - INFO  [/177.55.55.152:3888
> > :QuorumCnxManager$Listener@888] - Received connection request /
> > 177.55.55.63:53972
> > 2019-07-29 15:10:41,920 [myid:1] - WARN
> >  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
> > Connection broken for id 4332550065071534382, my id = 1, error =
> > java.io.IOException: Received packet with invalid packet: 840973834
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
> > 2019-07-29 15:10:41,921 [myid:1] - WARN
> >  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] -
> > Interrupting SendWorker
> > 2019-07-29 15:10:41,922 [myid:1] - WARN
> >  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] -
> > Interrupted while waiting for message on queue
> > java.lang.InterruptedException
> > at
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> > at
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> > at
> > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
> > 2019-07-29 15:10:41,922 [myid:1] - WARN
> >  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] -
> Send
> > worker leaving thread  id 4332550065071534382 my id = 1
> > 2019-07-29 15:10:41,932 [myid:1] - INFO  [/177.55.55.152:3888
> > :QuorumCnxManager$Listener@888] - Received connection request /
> > 177.55.55.63:38633
> > 2019-07-29 15:10:41,933 [myid:1] - WARN
> >  [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1176] -
> > Connection broken for id 4332550065071534638, my id = 1, error =
> > java.io.IOException: Received packet with invalid packet: 807419402
> > at
> >
> >
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
> > 2019-07-29 15:10:41,933 [myid:1] - WARN
> >  [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1179] -
> > Interrupting SendWorker
> > 2019-07-29 15:10:41,934 [myid:1] - WARN
> >  

Re: Problem with uploading Large synonym files in cloud mode

2019-07-30 Thread Bernd Fehling

You have to increase the -Djute.maxbuffer for large configs.

In Solr bin/solr/solr.in.sh use e.g.
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=10000000"
This will increase maxbuffer for zookeeper on the Solr side to 10MB.

In Zookeeper zookeeper/conf/zookeeper-env.sh:
SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djute.maxbuffer=10000000"

I have a >10MB Thesaurus and use 30MB for jute.maxbuffer; it works perfectly.

Regards


On 30.07.19 at 13:09, Salmaan Rashid Syed wrote:

Hi Solr Users,

I have a very big synonym file (>5MB). I am unable to start Solr in cloud
mode as it throws an error message stating that the synonyms file is
too large. I figured out that zookeeper doesn't accept a file greater
than 1MB in size.

I tried to break my synonyms file down into smaller chunks of less than 1MB
each. But I am not sure how to include all the filenames in the
Solr schema.

Should they be separated by commas, like synonyms = "__1_synonyms.txt,
__2_synonyms.txt, __3synonyms.txt"

Or is there a better way of doing that? And will the bigger file, when broken
down into smaller chunks, be uploaded to zookeeper as well?

Please help or please guide me to relevant documentation regarding this.

Thank you.

Regards.
Salmaan.



Re: Problem with uploading Large synonym files in cloud mode

2019-07-30 Thread Jörn Franke
Aside from the fact that a 5 MB synonym file is rather unusual (what is the use case for such
a large synonym file?) and that it will have an impact on index size and/or query
time:

You can configure zookeeper server and the Solr client to allow larger files 
using the jute.maxbuffer option.
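
On the splitting question: as far as I know, the synonyms attribute of the synonym filter
accepts a comma-separated list of files, so splitting should work. A sketch (file names are
placeholders, and each file still has to fit under ZooKeeper's jute.maxbuffer unless that
is raised as discussed in the other replies):

  <filter class="solr.SynonymGraphFilterFactory"
          synonyms="synonyms_1.txt,synonyms_2.txt,synonyms_3.txt"
          ignoreCase="true" expand="true"/>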

> On 30.07.2019 at 13:09, Salmaan Rashid Syed 
> wrote:
> 
> Hi Solr Users,
> 
> I have a very big synonym file (>5MB). I am unable to start Solr in cloud
> mode as it throws an error message stating that the synonmys file is
> too large. I figured out that the zookeeper doesn't take a file greater
> than 1MB size.
> 
> I tried to break down my synonyms file to smaller chunks less than 1MB
> each. But, I am not sure about how to include all the filenames into the
> Solr schema.
> 
> Should it be seperated by commas like synonyms = "__1_synonyms.txt,
> __2_synonyms.txt, __3synonyms.txt"
> 
> Or is there a better way of doing that? Will the bigger file when broken
> down to smaller chunks will be uploaded to zookeeper as well.
> 
> Please help or please guide me to relevant documentation regarding this.
> 
> Thank you.
> 
> Regards.
> Salmaan.


Problem with uploading Large synonym files in cloud mode

2019-07-30 Thread Salmaan Rashid Syed
Hi Solr Users,

I have a very big synonym file (>5MB). I am unable to start Solr in cloud
mode as it throws an error message stating that the synonyms file is
too large. I figured out that zookeeper doesn't accept a file greater
than 1MB in size.

I tried to break my synonyms file down into smaller chunks of less than 1MB
each. But I am not sure how to include all the filenames in the
Solr schema.

Should they be separated by commas, like synonyms = "__1_synonyms.txt,
__2_synonyms.txt, __3synonyms.txt"

Or is there a better way of doing that? And will the bigger file, when broken
down into smaller chunks, be uploaded to zookeeper as well?

Please help or please guide me to relevant documentation regarding this.

Thank you.

Regards.
Salmaan.


Re: [ZOOKEEPER] - Error - HEAP MEMORY

2019-07-30 Thread Dominique Bejean
Hi,

I don’t find any documentation about the parameter zookeeper_server_java_heaps
in zoo.cfg.
The way to control the java heap size is either the java.env file or the
zookeeper-env.sh file. In zookeeper-env.sh:
SERVER_JVMFLAGS="-Xmx512m"

How many RAM on your server ?

Regards

Dominique




On Mon, Jul 29, 2019 at 20:35, Rodrigo Oliveira <
adamantina.rodr...@gmail.com> wrote:

> Hi,
>
> After 3 days running, my zookeeper showing this error.
>
> 2019-07-29 15:10:41,906 [myid:1] - WARN
>  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
> Connection broken for id 4332550065071534382, my id = 1, error =
> java.io.IOException: Received packet with invalid packet: 824196618
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
> 2019-07-29 15:10:41,906 [myid:1] - WARN
>  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] -
> Interrupting SendWorker
> 2019-07-29 15:10:41,907 [myid:1] - WARN
>  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] -
> Interrupted while waiting for message on queue
> java.lang.InterruptedException
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> at
> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
> 2019-07-29 15:10:41,907 [myid:1] - WARN
>  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - Send
> worker leaving thread  id 4332550065071534382 my id = 1
> 2019-07-29 15:10:41,917 [myid:1] - INFO  [/177.55.55.152:3888
> :QuorumCnxManager$Listener@888] - Received connection request /
> 177.55.55.63:53972
> 2019-07-29 15:10:41,920 [myid:1] - WARN
>  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] -
> Connection broken for id 4332550065071534382, my id = 1, error =
> java.io.IOException: Received packet with invalid packet: 840973834
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
> 2019-07-29 15:10:41,921 [myid:1] - WARN
>  [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] -
> Interrupting SendWorker
> 2019-07-29 15:10:41,922 [myid:1] - WARN
>  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] -
> Interrupted while waiting for message on queue
> java.lang.InterruptedException
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> at
> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
> 2019-07-29 15:10:41,922 [myid:1] - WARN
>  [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - Send
> worker leaving thread  id 4332550065071534382 my id = 1
> 2019-07-29 15:10:41,932 [myid:1] - INFO  [/177.55.55.152:3888
> :QuorumCnxManager$Listener@888] - Received connection request /
> 177.55.55.63:38633
> 2019-07-29 15:10:41,933 [myid:1] - WARN
>  [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1176] -
> Connection broken for id 4332550065071534638, my id = 1, error =
> java.io.IOException: Received packet with invalid packet: 807419402
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163)
> 2019-07-29 15:10:41,933 [myid:1] - WARN
>  [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1179] -
> Interrupting SendWorker
> 2019-07-29 15:10:41,934 [myid:1] - WARN
>  [SendWorker:4332550065071534638:QuorumCnxManager$SendWorker@1092] -
> Interrupted while waiting for message on queue
> java.lang.InterruptedException
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> at
> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
> at
>
>