Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Drew Kidder
As an additional bit of information, here's the tcpdump of my startup of
solr in the docker container, after logging into the container and running
"bin/solr start -f -c" (which is the same CMD my Dockerfile executes):

root@91e3883fb675:/opt/solr-8.2.0# tcpdump -nvvv -i any -c 100 host
172.20.60.138
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size
262144 bytes
21:54:49.426019 IP (tos 0x0, ttl 64, id 44803, offset 0, flags [DF], proto
TCP (6), length 60)
172.17.0.2.60562 > 172.20.60.138.2181: Flags [S], cksum 0x94e0
(incorrect -> 0x19d3), seq 2175798173, win 29200, options [mss
1460,sackOK,TS val 6792350 ecr 0,nop,wscale 7], length 0
21:54:49.472340 IP (tos 0x0, ttl 37, id 37699, offset 0, flags [none],
proto TCP (6), length 48)
172.20.60.138.2181 > 172.17.0.2.60562: Flags [S.], cksum 0xd892
(correct), seq 452884582, ack 2175798174, win 65535, options [mss
1460,wscale 2,eol], length 0
21:54:49.472428 IP (tos 0x0, ttl 64, id 44804, offset 0, flags [DF], proto
TCP (6), length 40)
172.17.0.2.60562 > 172.20.60.138.2181: Flags [.], cksum 0x94cc
(incorrect -> 0x0472), seq 1, ack 1, win 229, length 0
21:54:49.472950 IP (tos 0x0, ttl 64, id 44805, offset 0, flags [DF], proto
TCP (6), length 89)
172.17.0.2.60562 > 172.20.60.138.2181: Flags [P.], cksum 0x94fd
(incorrect -> 0x8ecb), seq 1:50, ack 1, win 229, length 49
21:54:49.473400 IP (tos 0x0, ttl 37, id 33425, offset 0, flags [none],
proto TCP (6), length 40)
172.20.60.138.2181 > 172.17.0.2.60562: Flags [.], cksum 0x0526
(correct), seq 1, ack 50, win 65535, length 0
21:54:59.448636 IP (tos 0x0, ttl 64, id 44806, offset 0, flags [DF], proto
TCP (6), length 40)
172.17.0.2.60562 > 172.20.60.138.2181: Flags [F.], cksum 0x94cc
(incorrect -> 0x0440), seq 50, ack 1, win 229, length 0
21:54:59.449070 IP (tos 0x0, ttl 37, id 3430, offset 0, flags [none], proto
TCP (6), length 40)
172.20.60.138.2181 > 172.17.0.2.60562: Flags [.], cksum 0x0525
(correct), seq 1, ack 51, win 65535, length 0
21:55:21.518447 IP (tos 0x0, ttl 37, id 2259, offset 0, flags [none], proto
TCP (6), length 40)
172.20.60.138.2181 > 172.17.0.2.60562: Flags [F.], cksum 0x0524
(correct), seq 1, ack 51, win 65535, length 0
21:55:21.518513 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 40)
172.17.0.2.60562 > 172.20.60.138.2181: Flags [.], cksum 0x043f
(correct), seq 51, ack 2, win 229, length 0

172.17.0.2 is my solr docker container, 172.20.60.138 is my zk1 docker
container residing out in AWS.

From this, it looks like communication is happening but that it's finishing
and closing the connection instead of holding it open. Am I interpreting
this correctly?
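A quick way to double-check reachability from inside the container, independent of Solr, is ZooKeeper's four-letter-word commands over a plain TCP socket (a healthy server answers `srvr` with its status block; note that in ZK 3.5 commands other than `srvr` may need to be whitelisted via `4lw.commands.whitelist`). A minimal sketch, with the hostname as a placeholder:

```python
import socket

def four_letter_word(host, port, cmd="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter-word command and return the reply.

    ZooKeeper answers on the same connection and then closes it,
    so reading until EOF collects the full response.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `four_letter_word("zk1.zookeeper.internal", 2181)` from the Solr container should print the same `srvr` block that `echo srvr | nc` returns.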


--
Drew(i...@gmail.com)
http://wyntermute.dyndns.org/blog/

-- I Drive Way Too Fast To Worry About Cholesterol.


On Fri, Oct 18, 2019 at 1:18 PM Drew Kidder  wrote:

> Again, thank you all for the suggestions.
>
> My ZK ensemble is talking to each other and the outside world:
>
> solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk1.zookeeper.internal 2181
> Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built
> on 05/03/2019 12:07 GMT
> Latency min/avg/max: 0/0/0
> Received: 53
> Sent: 33
> Connections: 1
> Outstanding: 19
> Zxid: 0x0
> Mode: follower
> Node count: 5
>
> solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk2.zookeeper.internal 2181
> Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built
> on 05/03/2019 12:07 GMT
> Latency min/avg/max: 0/0/0
> Received: 37
> Sent: 17
> Connections: 1
> Outstanding: 19
> Zxid: 0x2
> Mode: leader
> Node count: 5
> Proposal sizes last/min/max: 32/32/36
>
> solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk3.zookeeper.internal 2181
> Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built
> on 05/03/2019 12:07 GMT
> Latency min/avg/max: 0/0/0
> Received: 7
> Sent: 3
> Connections: 1
> Outstanding: 3
> Zxid: 0x2
> Mode: follower
> Node count: 5
>
> All of these commands can be executed on the solr container as either the
> root user or the solr user (see the command prompt in each command). Note
> that zk2 is the leader and zk1 and zk3 are followers. The configuration
> files (including the ZOO_MY_ID and ZOO_SERVERS environment variables) are
> all set up correctly and, for all intents and purposes, ZK appears to be
> set up correctly and functioning.
>
> Jörn Franke: I tried implementing your suggestion of providing "/" as the
> root node by appending "/" to the end of the ZK_HOST connection string and
> it still did not work (e.g. ENV ZK_HOST
> zk1.zookeeper.internal:2181,zk2.zookeeper.internal:2181,zk3.zookeeper.internal:2181/
> in the Dockerfile). Was this what you meant?  Or were you suggesting to set
> the ZK_ROOT in the Solr configs/environment instead?
>
> --
> Drew(i...@gmail.com)
> http://wyntermute.dyndns.org/blog/
>
> -- I Drive Way Too Fast To Worry About Cholesterol.
>
>
> On Fri, Oct 18, 2019 at 12:11 PM Ahmed Adel  wrote:
>
>> This could be 

Re: [CAUTION] Converting graph query to stream graph query

2019-10-18 Thread Joel Bernstein
I believe we were debugging why graph results were not being returned in a
different thread. It looks like the same problem.

Is your Solr instance a straight install, or have you moved config files
from an older version of Solr to a newer version?

Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Oct 16, 2019 at 1:09 AM Natarajan, Rajeswari <
rajeswari.natara...@sap.com> wrote:

> I need to gather all the children of docid 1. The root item has parent set
> to null. (Sample data below.)
>
> Tried as below
>
> nodes(graphtest,
>   walk="1->parent",
>   gather="docid",
>   scatter="branches, leaves")
>
> Response :
> {
>   "result-set": {
> "docs": [
>   {
> "node": "1",
> "collection": "graphtest,",
> "field": "node",
> "level": 0
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 5
>   }
> ]
>   }
> }
>
> Query just gets the root item and not its children. It looks like I am
> missing something obvious. Any pointers, please?
>
> As I said earlier the below graph query gets all the children of docid 1.
>
> fq={!graph from=parent to=docid}docid:"1"
>
> Thanks,
> Rajeswari
>
>
>
> On 10/15/19, 12:04 PM, "Natarajan, Rajeswari" <
> rajeswari.natara...@sap.com> wrote:
>
> Hi,
>
>
> curl -XPOST -H 'Content-Type: application/json' '
> http://localhost:8983/solr/ggg/update' --data-binary '{
> "add" : { "doc" : { "id" : "a", "docid" : "1", "name" : "Root document
> one" } },
> "add" : { "doc" : { "id" : "b", "docid" : "2", "name" : "Root document
> two" } },
> "add" : { "doc" : {  "id" : "c", "docid" : "3", "name" : "Root
> document three" } },
> "add" : { "doc" : {  "id" : "d", "docid" : "11", "parent" : "1",
> "name" : "First level document 1, child one" } },
> "add" : { "doc" : {  "id" : "e", "docid" : "12", "parent" : "1",
> "name" : "First level document 1, child two" } },
> "add" : { "doc" : {  "id" : "f", "docid" : "13", "parent" : "1",
> "name" : "First level document 1, child three" } },
> "add" : { "doc" : {  "id" : "g", "docid" : "21", "parent" : "2",
> "name" : "First level document 2, child one" } },
> "add" : { "doc" : {  "id" : "h", "docid" : "22", "parent" : "2",
> "name" : "First level document 2, child two" } },
> "add" : { "doc" : {  "id" : "j", "docid" : "121", "parent" : "12",
> "name" : "Second level document 12, child one" } },
> "add" : { "doc" : {  "id" : "k", "docid" : "122", "parent" : "12",
> "name" : "Second level document 12, child two" } },
> "add" : { "doc" : {  "id" : "l", "docid" : "131", "parent" : "13",
> "name" : "Second level document 13, child three" } },
> "commit" : {}
> }'
>
>
> For the above data , the below query gets all the children of document
> with docid 1.
>
>
> http://localhost:8983/solr/graphtest/select?q=*:*&fq={!graph%20from=parent%20to=docid}docid:%221%22
>
>
> How can I convert this query into streaming graph query with nodes
> expression.
>
> Thanks,
> Rajeswari
>
>
>
>


Re: Help with Stream Graph

2019-10-18 Thread Joel Bernstein
The query that is created looks good to me, but it returns no
results. Let's just do a basic query using the select handler:

product_s:product1

If this brings back zero results then we know we have a problem with the
data.

Joel Bernstein
http://joelsolr.blogspot.com/
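To run that check programmatically, here is a small sketch (the collection name `knr` is taken from earlier messages in this thread; the helper names are my own) that builds the select URL and reads `numFound` out of the JSON response:

```python
import json
from urllib.parse import urlencode

def select_url(base, collection, query):
    """Build a /select URL for a basic Solr query with a JSON response."""
    params = urlencode({"q": query, "wt": "json"})
    return f"{base}/solr/{collection}/select?{params}"

def num_found(response_body):
    """Extract numFound from a Solr JSON response body."""
    return json.loads(response_body)["response"]["numFound"]
```

Fetched with `urllib.request.urlopen(select_url("http://localhost:8983", "knr", "product_s:product1"))`, a `num_found` of 0 would confirm the data problem described above.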


On Fri, Oct 18, 2019 at 1:41 PM Rajeswari Natarajan 
wrote:

> Hi Joel,
>
> Do you see anything wrong in the config or data . I am using 7.6.
>
> Thanks,
> Rajeswari
>
> On Thu, Oct 17, 2019 at 8:36 AM Rajeswari Natarajan 
> wrote:
>
> > My config is from
> >
> >
> >
> https://github.com/apache/lucene-solr/tree/branch_7_6/solr/solrj/src/test-files/solrj/solr/configsets/streaming/conf
> >
> >
> > 
> >
> >  > docValues="true"/>
> >
> >
> >
> > 
> >
> >  omitNorms="true"
> > positionIncrementGap="0"/>
> >
> >
> >
> > Thanks,
> >
> > Rajeswari
> >
> > On Thu, Oct 17, 2019 at 8:16 AM Rajeswari Natarajan 
> > wrote:
> >
> >> I tried the below query and it returns 0 results:
> >>
> >>
> >>
> >> http://localhost:8983/solr/knr/export?q={!terms+f%3Dproduct_s}product1&distrib=false&fl=basket_s,product_s&sort=basket_s+asc,product_s+asc&wt=json&version=2.2
> >>
> >>
> >> {
> >>   "responseHeader":{"status":0},
> >>   "response":{
> >> "numFound":0,
> >> "docs":[]}}
> >>
> >> Regards,
> >> Rajeswari
> >> On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan  >
> >> wrote:
> >>
> >>> Thanks Joel.
> >>>
> >>> Here is the logs for below request
> >>>
> >>> curl --data-urlencode
> >>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
> >>> http://localhost:8983/solr/knr/stream
> >>>
> >>> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> >>>
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
> >>> status=0 QTime=0
> >>>
> >>> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/export
> >>>
> params={q={!terms+f%3Dproduct_s}product1&distrib=false&indent=off&fl=basket_s,product_s&sort=basket_s+asc,product_s+asc&wt=json&version=2.2}
> >>> hits=0 status=0 QTime=1
> >>>
> >>>
> >>>
> >>> Here is the logs for
> >>>
> >>>
> >>>
> >>> curl --data-urlencode
> >>>
> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
> >>> leaves")' http://localhost:8983/solr/knr/stream
> >>>
> >>>
> >>> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
> >>>
> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
> >>> status=0 QTime=0
> >>>
> >>> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
> >>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
> >>> [knr_shard1_replica_n1]  webapp=/solr path=/export
> >>>
> params={q={!terms+f%3Dproduct_s}product1&distrib=false&indent=off&fl=basket_s,product_s&sort=basket_s+asc,product_s+asc&wt=json&version=2.2}
> >>> hits=0 status=0 QTime=0
> >>>
> >>>
> >>>
> >>>
> >>> Thank you,
> >>>
> >>> Rajeswari
> >>>
> >>> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein 
> >>> wrote:
> >>>
>  Can you show the logs from this request. There will be a Solr query
> that
>  gets sent with product1 searched against the product_s field. Let's
> see
>  how
>  many documents that query returns.
> 
> 
>  Joel Bernstein
>  http://joelsolr.blogspot.com/
> 
> 
>  On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan <
> rajis...@gmail.com
>  >
>  wrote:
> 
>  > Hi,
>  >
 > Since the stream graph query for my use case didn't work, I took the
 > data from the Solr source code tests and also copied the schema and
 > solrconfig.xml from the Solr 7.6 source code. I had to substitute a few
 > variables.
>  >
>  > Posted below data
>  >
>  > curl -X POST http://localhost:8983/solr/knr/update -H
>  > 'Content-type:text/csv' -d '
>  > id, basket_s, product_s, prics_f
>  > 90,basket1,product1,20
>  > 91,basket1,product3,30
>  > 92,basket1,product5,1
>  > 93,basket2,product1,2
>  > 94,basket2,product6,5
>  > 95,basket2,product7,10
>  > 96,basket3,product4,20
>  > 97,basket3,product3,10
>  > 98,basket3,product1,10
>  > 99,basket4,product4,40
>  > 110,basket4,product3,10
>  > 111,basket4,product1,10'
>  > After this I committed and made sure the data got published. to solr
>  >
>  > curl --data-urlencode
>  > 

Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Drew Kidder
Again, thank you all for the suggestions.

My ZK ensemble is talking to each other and the outside world:

solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk1.zookeeper.internal 2181
Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on
05/03/2019 12:07 GMT
Latency min/avg/max: 0/0/0
Received: 53
Sent: 33
Connections: 1
Outstanding: 19
Zxid: 0x0
Mode: follower
Node count: 5

solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk2.zookeeper.internal 2181
Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on
05/03/2019 12:07 GMT
Latency min/avg/max: 0/0/0
Received: 37
Sent: 17
Connections: 1
Outstanding: 19
Zxid: 0x2
Mode: leader
Node count: 5
Proposal sizes last/min/max: 32/32/36

solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk3.zookeeper.internal 2181
Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on
05/03/2019 12:07 GMT
Latency min/avg/max: 0/0/0
Received: 7
Sent: 3
Connections: 1
Outstanding: 3
Zxid: 0x2
Mode: follower
Node count: 5

All of these commands can be executed on the solr container as either the
root user or the solr user (see the command prompt in each command). Note
that zk2 is the leader and zk1 and zk3 are followers. The configuration
files (including the ZOO_MY_ID and ZOO_SERVERS environment variables) are
all set up correctly and, for all intents and purposes, ZK appears to be
set up correctly and functioning.
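Since the `srvr` output above is line-oriented `Key: value` text, the leader/follower roles can also be checked programmatically rather than by eye. A small sketch (function names are my own):

```python
def parse_srvr(output):
    """Parse ZooKeeper 'srvr' four-letter-word output into a dict."""
    info = {}
    for line in output.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            info[key.strip()] = value.strip()
    return info

def find_leader(outputs_by_host):
    """Given {host: srvr_output}, return the host reporting Mode: leader."""
    for host, out in outputs_by_host.items():
        if parse_srvr(out).get("Mode") == "leader":
            return host
    return None
```

Feeding it the three `srvr` outputs above would report zk2 as the leader, matching the manual reading.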

Jörn Franke: I tried implementing your suggestion of providing "/" as the
root node by appending "/" to the end of the ZK_HOST connection string and
it still did not work (e.g. ENV ZK_HOST
zk1.zookeeper.internal:2181,zk2.zookeeper.internal:2181,zk3.zookeeper.internal:2181/
in the Dockerfile). Was this what you meant?  Or were you suggesting to set
the ZK_ROOT in the Solr configs/environment instead?
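For reference on that chroot question: in the ZK_HOST syntax, the chroot is everything from the first "/" after the host list, so a bare trailing "/" means the ZooKeeper root and `zk1:2181,zk2:2181/solr` would mean a `/solr` chroot. A sketch of that split (illustrative parsing, not Solr's actual code):

```python
def split_zk_host(zk_host):
    """Split a ZK_HOST string into (list of host:port, chroot or None).

    The chroot, if present, starts at the first '/' after the host list,
    e.g. 'zk1:2181,zk2:2181/solr' -> (['zk1:2181', 'zk2:2181'], '/solr').
    """
    slash = zk_host.find("/")
    if slash == -1:
        hosts, chroot = zk_host, None
    else:
        hosts, chroot = zk_host[:slash], zk_host[slash:]
    return hosts.split(","), chroot
```

Under this reading, appending "/" changes the parsed chroot from "none given" to "the root", which some clients treat differently.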

--
Drew(i...@gmail.com)
http://wyntermute.dyndns.org/blog/

-- I Drive Way Too Fast To Worry About Cholesterol.


On Fri, Oct 18, 2019 at 12:11 PM Ahmed Adel  wrote:

> This could be because Zookeeper ensemble is not properly configured. Using
> a very similar setup which consists of ZK cluster of three hosts and one
> Solr Cloud node (all are containers), the system got running. Each ZK host
> has ZOO_MY_ID and ZOO_SERVERS environment variables set before running ZK.
> In this case, the former variable value would be from 1 to 3 on each host
> and the latter would be "server.1=z1:2888:3888;2181
> server.2=z2:2888:3888;2181 server.3=z3:2888:3888;2181" the same on all
> hosts (the double quotes may be needed for proper parsing). This
> ZOO_SERVERS syntax is for ZK version 3.5. 3.4 is slightly different.
>
> http://aadel.io
>
> On Fri, Oct 18, 2019 at 5:28 PM Drew Kidder  wrote:
>
> > Thank you all for your suggestions! I appreciate the fast turnaround.
> >
> > My setup is using Amazon ECS for our solr cloud installation. Each ZK is
> in
> > its own container, using Route53 Service Discovery to provide the DNS
> name.
> > The ZK nodes can all talk to each other, and I can communicate to each
> one
> > of those nodes from my local machine and from within the solr container.
> > Solr is one node per container, as Martijn correctly assumed. I am not
> > using a zkRoot at present because my intention is to use ZK solely for
> Solr
> > Cloud and nothing else.
> >
> > I have tried removing the "-z" option from the Dockerfile CMD and using
> the
> > ZK_HOST environment variable (see below). I have even also modified the
> > solr.in.sh and set the ZK_HOST variable there, all to no avail. I have
> > tried both the Dockerfile command route, and have logged into the solr
> > container and tried to run the CMD manually to see if there was a problem
> > with the way I was using the CMD entry. All of those methods give me the
> > same result output captured in the gist below.
> >
> > The gist for my solr.log output is here:
> > https://gist.github.com/dkidder/2db9a6d393dedb97a39ed32e2be0c087
> >
> > My Dockerfile for the solr container looks like this:
> >
> >
> > FROMsolr:8.2
> >
> > EXPOSE8983 8999 2181
> >
> > VOLUME/app/logs
> > VOLUME/app/data
> > VOLUME/app/conf
> >
> > ## add our jetty configuration (increased request size!)
> > COPY   jetty.xml /opt/solr/server/etc/
> >
> > ## SolrCloud configuration
> > ENV ZK_HOST zk1:2181,zk2:2181,zk3:2181
> > ENV ZK_CLIENT_TIMEOUT 3
> >
> > USER   root
> > RUNapt-get update
> > RUNapt-get install -y netcat net-tools vim procps
> > USER   solr
> >
> > # Copy over custom solr plugins
> > COPYmyplugins/src/resources/* /opt/solr/server/solr/my-resources/
> > COPYlib/*.jar /opt/solr/my-lib/
> >
> > # Copy over my configs
> > COPYconf/ /app/conf
> >
> > #Start solr in cloud mode, connecting to zookeeper
> > CMD   ["solr","start","-f","-c"]
> >
> > The docker command I use to execute this Dockerfile is `docker run -p
> > 8983:8983 -p 2181:2181 --name $(APP_NAME) $(APP_NAME):latest`
> >
> > Output of `ps -eflww` from within the 

Re: Help with Stream Graph

2019-10-18 Thread Rajeswari Natarajan
Hi Joel,

Do you see anything wrong in the config or data . I am using 7.6.

Thanks,
Rajeswari

On Thu, Oct 17, 2019 at 8:36 AM Rajeswari Natarajan 
wrote:

> My config is from
>
>
> https://github.com/apache/lucene-solr/tree/branch_7_6/solr/solrj/src/test-files/solrj/solr/configsets/streaming/conf
>
>
> 
>
>  docValues="true"/>
>
>
>
> 
>
>  omitNorms="true"
> positionIncrementGap="0"/>
>
>
>
> Thanks,
>
> Rajeswari
>
> On Thu, Oct 17, 2019 at 8:16 AM Rajeswari Natarajan 
> wrote:
>
>> I tried the below query and it returns 0 results:
>>
>>
>> http://localhost:8983/solr/knr/export?q={!terms+f%3Dproduct_s}product1&distrib=false&fl=basket_s,product_s&sort=basket_s+asc,product_s+asc&wt=json&version=2.2
>>
>>
>> {
>>   "responseHeader":{"status":0},
>>   "response":{
>> "numFound":0,
>> "docs":[]}}
>>
>> Regards,
>> Rajeswari
>> On Thu, Oct 17, 2019 at 8:05 AM Rajeswari Natarajan 
>> wrote:
>>
>>> Thanks Joel.
>>>
>>> Here is the logs for below request
>>>
>>> curl --data-urlencode
>>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
>>> http://localhost:8983/solr/knr/stream
>>>
>>> 2019-10-17 15:02:06.969 INFO  (qtp952486988-280) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
>>> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s")}
>>> status=0 QTime=0
>>>
>>> 2019-10-17 15:02:06.975 INFO  (qtp952486988-192) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/export
>>> params={q={!terms+f%3Dproduct_s}product1&distrib=false&indent=off&fl=basket_s,product_s&sort=basket_s+asc,product_s+asc&wt=json&version=2.2}
>>> hits=0 status=0 QTime=1
>>>
>>>
>>>
>>> Here is the logs for
>>>
>>>
>>>
>>> curl --data-urlencode
>>> 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
>>> leaves")' http://localhost:8983/solr/knr/stream
>>>
>>>
>>> 2019-10-17 15:03:57.068 INFO  (qtp952486988-356) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/stream
>>> params={expr=gatherNodes(knr,walk%3D"product1->product_s",gather%3D"basket_s",scatter%3D"branches,+leaves")}
>>> status=0 QTime=0
>>>
>>> 2019-10-17 15:03:57.071 INFO  (qtp952486988-400) [c:knr s:shard1
>>> r:core_node2 x:knr_shard1_replica_n1] o.a.s.c.S.Request
>>> [knr_shard1_replica_n1]  webapp=/solr path=/export
>>> params={q={!terms+f%3Dproduct_s}product1&distrib=false&indent=off&fl=basket_s,product_s&sort=basket_s+asc,product_s+asc&wt=json&version=2.2}
>>> hits=0 status=0 QTime=0
>>>
>>>
>>>
>>>
>>> Thank you,
>>>
>>> Rajeswari
>>>
>>> On Thu, Oct 17, 2019 at 5:23 AM Joel Bernstein 
>>> wrote:
>>>
 Can you show the logs from this request. There will be a Solr query that
 gets sent with product1 searched against the product_s field. Let's see
 how
 many documents that query returns.


 Joel Bernstein
 http://joelsolr.blogspot.com/


 On Thu, Oct 17, 2019 at 1:41 AM Rajeswari Natarajan >>> >
 wrote:

 > Hi,
 >
 > Since the stream graph query for my use case didn't work, I took the
 > data from the Solr source code tests and also copied the schema and
 > solrconfig.xml from the Solr 7.6 source code. I had to substitute a few
 > variables.
 >
 > Posted below data
 >
 > curl -X POST http://localhost:8983/solr/knr/update -H
 > 'Content-type:text/csv' -d '
 > id, basket_s, product_s, prics_f
 > 90,basket1,product1,20
 > 91,basket1,product3,30
 > 92,basket1,product5,1
 > 93,basket2,product1,2
 > 94,basket2,product6,5
 > 95,basket2,product7,10
 > 96,basket3,product4,20
 > 97,basket3,product3,10
 > 98,basket3,product1,10
 > 99,basket4,product4,40
 > 110,basket4,product3,10
 > 111,basket4,product1,10'
 > After this I committed and made sure the data got published. to solr
 >
 > curl --data-urlencode
 > 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s")'
 > http://localhost:8983/solr/knr/stream
 >
 > {
 >
 >   "result-set":{
 >
 > "docs":[{
 >
 > "EOF":true,
 >
 > "RESPONSE_TIME":4}]}}
 >
 >
 > and if I add *scatter="branches, leaves" , there is one doc.*
 >
 >
 >
 > curl --data-urlencode
 >
 >
 'expr=gatherNodes(knr,walk="product1->product_s",gather="basket_s",scatter="branches,
 > leaves")' http://localhost:8983/solr/knr/stream
 >
 > {
 >
 >   "result-set":{
 >
 > "docs":[{
 >
 > "node":"product1",
 >
 > "collection":"knr",
 >
 > "field":"node",
 >
 > "level":0}
 >
 >   ,{
 >
 > "EOF":true,
 >
 > "RESPONSE_TIME":4}]}}

Re: AEM 6.4 Compatibility

2019-10-18 Thread Shawn Heisey

On 10/18/2019 11:10 AM, Natalie Hannigan wrote:

I am new to this group. I am working with a vendor to get Solr up and running 
with AEM 6.4. Has anyone had any experience with this? I am wanting to use Solr 
8.1, but I cannot find documentation that says they are compatible. Does anyone 
know for sure?


I had absolutely no idea what AEM was until I Googled it.

We can help you with Solr, but not with the Adobe software.  I did find 
this:


https://helpx.adobe.com/experience-manager/using/aem_solr64.html

If you need help with this integration, you will need to talk to Adobe. 
If a question about Solr comes up, feel free to ask us about it.


Thanks,
Shawn


Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Ahmed Adel
This could be because Zookeeper ensemble is not properly configured. Using
a very similar setup which consists of ZK cluster of three hosts and one
Solr Cloud node (all are containers), the system got running. Each ZK host
has ZOO_MY_ID and ZOO_SERVERS environment variables set before running ZK.
In this case, the former variable value would be from 1 to 3 on each host
and the latter would be "server.1=z1:2888:3888;2181
server.2=z2:2888:3888;2181 server.3=z3:2888:3888;2181" the same on all
hosts (the double quotes may be needed for proper parsing). This
ZOO_SERVERS syntax is for ZK version 3.5. 3.4 is slightly different.
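The 3.5-vs-3.4 syntax difference can be made concrete with a small generator sketch (the function name is my own; as described above, 3.5 appends `;clientPort` to each server entry, while 3.4 stops at `host:peerPort:electionPort`):

```python
def zoo_servers(hosts, zk35=True, peer=2888, election=3888, client=2181):
    """Build a ZOO_SERVERS string for the official ZooKeeper images."""
    entries = []
    for i, host in enumerate(hosts, start=1):
        entry = f"server.{i}={host}:{peer}:{election}"
        if zk35:
            entry += f";{client}"  # 3.5 syntax embeds the client port
        entries.append(entry)
    return " ".join(entries)
```

`zoo_servers(["z1", "z2", "z3"])` reproduces the 3.5 string quoted above; passing `zk35=False` gives the 3.4 form.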

http://aadel.io

On Fri, Oct 18, 2019 at 5:28 PM Drew Kidder  wrote:

> Thank you all for your suggestions! I appreciate the fast turnaround.
>
> My setup is using Amazon ECS for our solr cloud installation. Each ZK is in
> its own container, using Route53 Service Discovery to provide the DNS name.
> The ZK nodes can all talk to each other, and I can communicate to each one
> of those nodes from my local machine and from within the solr container.
> Solr is one node per container, as Martijn correctly assumed. I am not
> using a zkRoot at present because my intention is to use ZK solely for Solr
> Cloud and nothing else.
>
> I have tried removing the "-z" option from the Dockerfile CMD and using the
> ZK_HOST environment variable (see below). I have even also modified the
> solr.in.sh and set the ZK_HOST variable there, all to no avail. I have
> tried both the Dockerfile command route, and have logged into the solr
> container and tried to run the CMD manually to see if there was a problem
> with the way I was using the CMD entry. All of those methods give me the
> same result output captured in the gist below.
>
> The gist for my solr.log output is here:
> https://gist.github.com/dkidder/2db9a6d393dedb97a39ed32e2be0c087
>
> My Dockerfile for the solr container looks like this:
>
>
> FROMsolr:8.2
>
> EXPOSE8983 8999 2181
>
> VOLUME/app/logs
> VOLUME/app/data
> VOLUME/app/conf
>
> ## add our jetty configuration (increased request size!)
> COPY   jetty.xml /opt/solr/server/etc/
>
> ## SolrCloud configuration
> ENV ZK_HOST zk1:2181,zk2:2181,zk3:2181
> ENV ZK_CLIENT_TIMEOUT 3
>
> USER   root
> RUNapt-get update
> RUNapt-get install -y netcat net-tools vim procps
> USER   solr
>
> # Copy over custom solr plugins
> COPYmyplugins/src/resources/* /opt/solr/server/solr/my-resources/
> COPYlib/*.jar /opt/solr/my-lib/
>
> # Copy over my configs
> COPYconf/ /app/conf
>
> #Start solr in cloud mode, connecting to zookeeper
> CMD   ["solr","start","-f","-c"]
>
> The docker command I use to execute this Dockerfile is `docker run -p
> 8983:8983 -p 2181:2181 --name $(APP_NAME) $(APP_NAME):latest`
>
> Output of `ps -eflww` from within the solr container (as root):
>
> root@fe0ad5b40b42:/opt/solr-8.2.0# ps -eflww
> F S UIDPID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY  TIME
> CMD
> 4 S solr 1 0  9  80   0 - 1043842 -14:36 ?00:00:07
> /usr/local/openjdk-11/bin/java -server -Xms512m -Xmx512m -XX:+UseG1GC
> -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
> -XX:MaxGCPauseMillis=250 -XX:+UseLargePages -XX:+AlwaysPreTouch
>
> -Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.local.only=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.port=18983
> -Dcom.sun.management.jmxremote.rmi.port=18983 -DzkClientTimeout=3
> -DzkHost=zk1:2181,zk2:2181,zk3:2181 -Dsolr.log.dir=/var/solr/logs
> -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> -Djetty.home=/opt/solr/server -Dsolr.solr.home=/var/solr/data
> -Dsolr.data.home= -Dsolr.install.dir=/opt/solr
> -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf
> -Dlog4j.configurationFile=file:/var/solr/log4j2.xml -Xss256k
> -Dsolr.jetty.https.port=8983 -jar start.jar --module=http
> 4 S root90 0  0  80   0 -  4988 -  14:37 pts/000:00:00
> /bin/bash
> 0 R root9590  0  80   0 -  9595 -  14:37 pts/000:00:00
> ps -eflww
>
> Output of netstat from within the solr container (as root):
>
> root@fe0ad5b40b42:/opt/solr-8.2.0# netstat
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address   Foreign Address State
> tcp0  0 fe0ad5b40b42:43678  172.20.28.179:2181
>  TIME_WAIT
> tcp0  0 fe0ad5b40b42:60164  172.20.155.241:2181
> TIME_WAIT
> tcp0  0 fe0ad5b40b42:60500  172.20.60.138:2181
>  TIME_WAIT
> Active UNIX domain sockets (w/o servers)
> Proto RefCnt Flags   Type   State I-Node   Path
> unix  2  [ ] STREAM CONNECTED 129252
> unix  2  [ ] STREAM CONNECTED 129270
>
> I'm beginning to 

AEM 6.4 Compatibility

2019-10-18 Thread Natalie Hannigan
Hey all,

I am new to this group. I am working with a vendor to get Solr up and running 
with AEM 6.4. Has anyone had any experience with this? I am wanting to use Solr 
8.1, but I cannot find documentation that says they are compatible. Does anyone 
know for sure?

Thanks in advance.

- Natalie



Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Jörn Franke
Even if you do not have a dedicated zkRoot node, you will need to provide "/" in
the connection string.

Then, even if the ZK nodes can connect with each other, it does not mean they
form an ensemble. You need to adapt zoo.cfg on all nodes and add all nodes to
it. Additionally, each node will need a myid file with a unique id.
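As a sketch, a minimal zoo.cfg for a three-node ensemble might look like this (hostnames taken from earlier in the thread; paths and timing values are illustrative, not prescriptive):

```ini
# zoo.cfg -- identical on all three nodes
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data
clientPort=2181
server.1=zk1.zookeeper.internal:2888:3888
server.2=zk2.zookeeper.internal:2888:3888
server.3=zk3.zookeeper.internal:2888:3888
```

Each node's myid file then holds just its own server number, e.g. the file at dataDir/myid on server.1 contains `1`.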

On 18.10.2019 at 17:28, Drew Kidder wrote:
> 
> Thank you all for your suggestions! I appreciate the fast turnaround.
> 
> My setup is using Amazon ECS for our solr cloud installation. Each ZK is in
> its own container, using Route53 Service Discovery to provide the DNS name.
> The ZK nodes can all talk to each other, and I can communicate to each one
> of those nodes from my local machine and from within the solr container.
> Solr is one node per container, as Martijn correctly assumed. I am not
> using a zkRoot at present because my intention is to use ZK solely for Solr
> Cloud and nothing else.
> 
> I have tried removing the "-z" option from the Dockerfile CMD and using the
> ZK_HOST environment variable (see below). I have even also modified the
> solr.in.sh and set the ZK_HOST variable there, all to no avail. I have
> tried both the Dockerfile command route, and have logged into the solr
> container and tried to run the CMD manually to see if there was a problem
> with the way I was using the CMD entry. All of those methods give me the
> same result output captured in the gist below.
> 
Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Shawn Heisey

On 10/18/2019 9:28 AM, Drew Kidder wrote:

I'm beginning to think that ZK is not setup correctly. I haven't uploaded
any configuration files to ZK yet; my understanding was that I could start
up a solr cloud node with no collections and upload the configuration from
there. I was under the impression that it would try to connect to ZK and if
it couldn't get config files from there it would use local config files.


SolrCloud will always read index configs from ZooKeeper.  It will not 
use local config files.  I believe that the only config that will be 
read locally is solr.xml, but that can also be placed in ZK.


Solr will run with no collections in the cloud.  When the first 
SolrCloud node in a cluster is started, that is the state it will be in. 
 All of the nodes can run with no collections.



Do I need to upload the solr cloud configuration files to ZK before starting
up the cluster?  The netstat output makes it look like the solr container
is indeed connected to the ZK containers, but there's no indication as to
why it cannot connect to Zookeeper that I can see.


If Solr finds no information at all in ZK when it starts, then it will 
create the required structures within ZK for the cluster.  Index configs 
will not normally be uploaded just by starting Solr.  Some methods of 
creating collections will also upload the config.  Some will require 
that you upload the configuration first, or use one that is already there.
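For reference, one way to upload a configset ahead of time is the `zk` subcommand of `bin/solr` (a sketch; the ZK hosts are the ones from this thread, and the config name, directory, and collection name are illustrative):

```shell
# Upload a local config directory to ZooKeeper under the name "myconfig",
# then create a collection that references it.
bin/solr zk upconfig -z zk1:2181,zk2:2181,zk3:2181 -n myconfig -d /app/conf
bin/solr create_collection -c mycollection -n myconfig
```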


The entries on the netstat output show TIME_WAIT.  If there were active 
connections, they would show ESTABLISHED.  When a ZK client is running, 
it maintains continuous connections to all of the servers that it is given.
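A quick way to check for those long-lived sessions from inside the Solr container (assuming net-tools is installed, as in the Dockerfile earlier in this thread):

```shell
# A healthy ZK client shows one ESTABLISHED line per ZooKeeper server;
# seeing only TIME_WAIT entries, as in the netstat output above, means
# the connections are being opened and then dropped.
netstat -tan | grep ':2181'
```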


All of the work related to making ZK connections is handled by the ZK 
client, not Solr itself.  I'm not sure what options are available for 
getting that client to provide more information about what went wrong. 
Based on the information available, there seems to be some kind of 
network problem.  I do not know whether it is something in Java, Docker, 
or somewhere else.


Have you tried your "ruok" test as the user that is running Solr, or was 
that test done as root?


Thanks,
Shawn


Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Drew Kidder
Thank you all for your suggestions! I appreciate the fast turnaround.

My setup is using Amazon ECS for our solr cloud installation. Each ZK is in
its own container, using Route53 Service Discovery to provide the DNS name.
The ZK nodes can all talk to each other, and I can communicate to each one
of those nodes from my local machine and from within the solr container.
Solr is one node per container, as Martijn correctly assumed. I am not
using a zkRoot at present because my intention is to use ZK solely for Solr
Cloud and nothing else.

I have tried removing the "-z" option from the Dockerfile CMD and using the
ZK_HOST environment variable (see below). I have even modified
solr.in.sh and set the ZK_HOST variable there, all to no avail. I have
tried both the Dockerfile CMD route and logging into the solr container
to run the CMD manually, to see if there was a problem with the way I was
using the CMD entry. All of those methods give me the same result, the
output captured in the gist below.

The gist for my solr.log output is here:
https://gist.github.com/dkidder/2db9a6d393dedb97a39ed32e2be0c087

My Dockerfile for the solr container looks like this:


FROM solr:8.2

EXPOSE 8983 8999 2181

VOLUME /app/logs
VOLUME /app/data
VOLUME /app/conf

## add our jetty configuration (increased request size!)
COPY jetty.xml /opt/solr/server/etc/

## SolrCloud configuration
ENV ZK_HOST zk1:2181,zk2:2181,zk3:2181
ENV ZK_CLIENT_TIMEOUT 3

USER root
RUN apt-get update
RUN apt-get install -y netcat net-tools vim procps
USER solr

# Copy over custom solr plugins
COPY myplugins/src/resources/* /opt/solr/server/solr/my-resources/
COPY lib/*.jar /opt/solr/my-lib/

# Copy over my configs
COPY conf/ /app/conf

# Start solr in cloud mode, connecting to zookeeper
CMD ["solr","start","-f","-c"]

The docker command I use to run the resulting image is `docker run -p
8983:8983 -p 2181:2181 --name $(APP_NAME) $(APP_NAME):latest`

Output of `ps -eflww` from within the solr container (as root):

root@fe0ad5b40b42:/opt/solr-8.2.0# ps -eflww
F S UIDPID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY  TIME
CMD
4 S solr 1 0  9  80   0 - 1043842 -14:36 ?00:00:07
/usr/local/openjdk-11/bin/java -server -Xms512m -Xmx512m -XX:+UseG1GC
-XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
-XX:MaxGCPauseMillis=250 -XX:+UseLargePages -XX:+AlwaysPreTouch
-Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=18983
-Dcom.sun.management.jmxremote.rmi.port=18983 -DzkClientTimeout=3
-DzkHost=zk1:2181,zk2:2181,zk3:2181 -Dsolr.log.dir=/var/solr/logs
-Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
-Djetty.home=/opt/solr/server -Dsolr.solr.home=/var/solr/data
-Dsolr.data.home= -Dsolr.install.dir=/opt/solr
-Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf
-Dlog4j.configurationFile=file:/var/solr/log4j2.xml -Xss256k
-Dsolr.jetty.https.port=8983 -jar start.jar --module=http
4 S root90 0  0  80   0 -  4988 -  14:37 pts/000:00:00
/bin/bash
0 R root9590  0  80   0 -  9595 -  14:37 pts/000:00:00
ps -eflww

Output of netstat from within the solr container (as root):

root@fe0ad5b40b42:/opt/solr-8.2.0# netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address   Foreign Address State
tcp0  0 fe0ad5b40b42:43678  172.20.28.179:2181
 TIME_WAIT
tcp0  0 fe0ad5b40b42:60164  172.20.155.241:2181
TIME_WAIT
tcp0  0 fe0ad5b40b42:60500  172.20.60.138:2181
 TIME_WAIT
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags   Type   State I-Node   Path
unix  2  [ ] STREAM CONNECTED 129252
unix  2  [ ] STREAM CONNECTED 129270

I'm beginning to think that ZK is not set up correctly. I haven't uploaded
any configuration files to ZK yet; my understanding was that I could start
up a solr cloud node with no collections and upload the configuration from
there. I was under the impression that it would try to connect to ZK and if
it couldn't get config files from there it would use local config files. Do
I need to upload the solr cloud configuration files to ZK before starting
up the cluster?  The netstat output makes it look like the solr container
is indeed connected to the ZK containers, but I can see no indication of
why it cannot connect to ZooKeeper.

--
Drew(i...@gmail.com)
http://wyntermute.dyndns.org/blog/

-- I Drive Way Too Fast To Worry About Cholesterol.


On Fri, Oct 18, 2019 at 3:11 AM Martijn Koster 
wrote:

>
>
> > On 18 Oct 2019, at 00:25, Drew Kidder  wrote:
>
> > * I'm using the following 

RE: Solr JVM Tuning - 7.2.1

2019-10-18 Thread Sethuraman, Ganesh
Solr Users,

Any suggestions or insights on the Solr behavior below would help. 

Regards
Ganesh

-Original Message-
From: Sethuraman, Ganesh  
Sent: Wednesday, October 16, 2019 9:25 PM
To: solr-user@lucene.apache.org
Subject: Solr JVM Tuning - 7.2.1



Hi,

We are using Solr 7.2.1 with 2 nodes (245GB RAM each) and a 3-node ZK cluster in 
production. We are using Java 8 with default GC settings (with NewRatio=3) and a 
15GB max heap, changed to 16GB after the performance issue mentioned below.

We have about 90 collections (~8 shards each), about 50 of which are actively 
being used. About 3 collections are actively updated using SolrJ update queries 
with a soft commit of 30 secs. Other collections go through update handler 
batch CSV updates.

We had a read timeout/slowness issue when young generation usage peaked, as you 
can see in the GC report below from the problem window. After that we increased 
the overall heap size to 16GB (from 15GB), and as you can see we did not see 
the read issue again.

  1.  I see our heap is very large, and we are seeing higher usage of the young 
generation. Is this due to the SolrJ updates (concurrent single-record updates)?
  2.  Should we change NewRatio to 2 (so that the young generation grows)? We 
are seeing only 58% usage of the old gen.
  3.  We are also seeing that if we restart Solr in production while updates are 
happening, one server starts up but does not have all collections and shards up; 
when we restart both servers, it comes up fine. Is this behavior also related to 
the SolrJ updates?
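For context on question 2, the generation sizing that NewRatio controls can be sketched in a few lines (a rough illustration of the HotSpot arithmetic, not Solr-specific advice):

```python
def young_gen_gb(heap_gb, new_ratio):
    """HotSpot sizes the generations as old/young = NewRatio,
    so the young generation gets heap / (NewRatio + 1)."""
    return heap_gb / (new_ratio + 1)

# With a 16 GB heap: NewRatio=3 gives a 4 GB young generation,
# while NewRatio=2 grows it to about 5.33 GB.
print(young_gen_gb(16, 3))  # 4.0
print(young_gen_gb(16, 2))
```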



Problem GC Report
https://gceasy.io/my-gc-report.jsp?p=YXJjaGl2ZWQvMjAxOS8xMC83Ly0tMDJfc29scl9nYy5sb2cuNi5jdXJyZW50LS0xNC00My01OA%3D%3D&channel=WEB

No Problem GC Report (still see higher young generation use)
https://gceasy.io/my-gc-report.jsp?p=YXJjaGl2ZWQvMjAxOS8xMC85Ly0tMDJfX3NvbHJfZ2MubG9nLjIuY3VycmVudC0tMjAtNDQtMjY%3D&channel=WEB

Any help on the above questions is appreciated.

Thanks ,

Ganesh






Re: Performance Issue since Solr 7.7 with wt=javabin

2019-10-18 Thread Paras Lehana
Hi Andy,

Have you run performance benchmarking for some time and made sure that
Solr caching and GC don't impact the performance? I recommend that you
rebuild the performance matrix after a few warmups and requests. Have
you ruled this out?

On Fri, 18 Oct 2019 at 12:35, Jan Høydahl  wrote:

> Hi,
>
> Did you find a solution to your performance problem?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 17. jun. 2019 kl. 17:17 skrev Andy Reek :
> >
> > Hi Solr team,
> >
> > we are using Solr in version 7.1 as search engine in our online shop
> (SAP Hybris). And as a task I needed to migrate to the most recent Solr in
> version 7 (7.7). Doing this I faced extreme performance issues. After
> debugging and testing different setups I found out, that they were caused
> by the parameter wt=javabin. These issues begin to raise since version 7.7,
> in 7.6 it is still working as fast as in 7.1.
> >
> > Just an example: Doing a simple query for *.* and wt=javabin in 7.6: 0.2
> seconds and in 7.7: 34 seconds!
> >
> > The configuration of the schema.xml and solrconfig.xml are equal in both
> versions. Version 8.1 has the same effect as 7.7. Using something other
> than wt=javabin (e.g. wt=xml) will work fast in every version - which is
> our current workaround.
> >
> >
> > To reproduce this issue I have attached my used configsets folder plus
> some test data. This all can be tested with docker and wget:
> >
> > Solr 7.6:
> > docker run -d --name solr7.6 -p 8983:8983 --rm -v
> $PWD/configsets/default:/opt/solr/server/solr/configsets/myconfig:ro
> solr:7.6-slim solr-create -c mycore -d
> /opt/solr/server/solr/configsets/myconfig
> > docker cp $PWD/data.json solr7.6:/opt/solr/data.json
> > docker exec -it --user solr solr7.6 bin/post -c mycore data.json
> > wget "
> http://localhost:8983/solr/mycore/select?q=*:*=javabin= <
> http://localhost:8983/solr/mycore/select?q=*:*=javabin=>"
> > (0.2s)
> >
> > Solr 7.7:
> > docker run -d --name solr7.7 -p 18983:8983 --rm -v
> $PWD/configsets/default:/opt/solr/server/solr/configsets/myconfig:ro
> solr:7.7-slim solr-create -c mycore -d
> /opt/solr/server/solr/configsets/myconfig
> > docker cp $PWD/data.json solr7.7:/opt/solr/data.json
> > docker exec -it --user solr solr7.7 bin/post -c mycore data.json
> > (34s)
> >
> > For me it seems like a bug. But if not, then please let me know what I
> did wrong ;-)
> >
> >
> > Best Regards,
> >
> > Andy Reek
> > Principal Software Developer
> >
> > diva-e Jena
> > Mälzerstraße 3, 07745 Jena, Deutschland
> >
> > T:   +49 (3641) 3678 (223)
> > F:   +49 (3641) 3678 101
> > andy.r...@diva-e.com 
> >
> > www.diva-e.com  follow us: facebook <
> https://www.facebook.com/digital.value.enterprise/?ref=hl>, twitter <
> https://twitter.com/diva_enterprise>, LinkedIn <
> https://www.linkedin.com/company/diva-e-digital-value-enterprise-gmbh>,
> Xing 
> > 
> >
> > diva-e AGETO GmbH
> > Handelsregister: HRB 210399 Amtsgericht Jena
> > Geschäftsführung: Sascha Sauer, Sirko Schneppe, Axel Jahn
> > 
>
>

-- 
Regards,

*Paras Lehana* [65871]
Software Programmer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*

-- 
IMPORTANT: 
NEVER share your IndiaMART OTP/ Password with anyone.


Re: Query regarding positionIncrementGap

2019-10-18 Thread Paras Lehana
Hi Shubham,

In other words, *you specify a large positionIncrementGap to make sure that
your queries don't match across multiple values of a field*.

For example, for a query like title:"paper plate making machine", you don't
want it to match a doc having two values for title: "paper plate",
"making machine". A positionIncrementGap of 100 will make sure that "plate"
and "making" are 100 positions apart. To see this, note the positions of the
terms (in the format token -> position) and remember that position matching
matters in Lucene query matching:


   - paper -> 1
   - plate -> 2
   - making -> 3
   - machine -> 4

Now with positionIncrementGap of 0, the doc will have these positions for
title:

   - paper -> 1
   - plate -> 2
   - making -> 3 (0+3)
   - machine -> 4 (0+4)

which will match the query. But if we have a positionIncrementGap of 100,
the doc will have these positions for title:

   - paper -> 1
   - plate -> 2
   - making -> 103 (100+3)
   - machine -> 104 (100+4)

which will not match the *"exact"* query due to different positions.

Hope this helps. My position calculation is a bit different from
@erickerick...@gmail.com  as I tried to replicate
the maths I could understand from the source code. Please feel free to
correct it if not. Anyways, the idea remains the same. :)
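The position arithmetic above can be sketched in a few lines of Python (this follows the convention in this mail of shifting each later value by gap × value-index; Lucene's internal off-by-one details may differ):

```python
def term_positions(values, gap=100):
    # Assign positions to whitespace tokens across the multiple values of a
    # multiValued field, shifting each later value by `gap` positions.
    positions = {}
    base = 0
    for i, value in enumerate(values):
        for token in value.split():
            base += 1
            positions[token] = base + i * gap
    return positions

doc = ["paper plate", "making machine"]
print(term_positions(doc, gap=0))    # {'paper': 1, 'plate': 2, 'making': 3, 'machine': 4}
print(term_positions(doc, gap=100))  # {'paper': 1, 'plate': 2, 'making': 103, 'machine': 104}
```

With the gap of 100, a phrase query with slop below 100 can no longer match "plate making" across the value boundary.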

On Fri, 18 Oct 2019 at 18:36, Erick Erickson 
wrote:

> I really don’t understand the question. The field has to be multiValued,
> but there’s no other restriction. It’s all about whether a document you
> input has the same field name specified more than once, i.e. is
> multiValued. That’s why the example I gave has multiple field entries.
> Imagine you’re indexing a document. The client side breaks up the doc on
> sentence boundaries and enters them as multiple mentions of the same field,
> i.e.
> <doc>
>   <field name="f">sentence one</field>
>   <field name="f">sentence two</field>
>   <field name="f">sentence three</field>
>   <field name="f">sentence four</field>
>   <field name="f">sentence five</field>
> </doc>
>
> I think you’re missing the implication that the incoming document
> _already_ has the multiple fields put there by the time it gets to Solr.
>
> Best,
> Erick
>
>
> > On Oct 18, 2019, at 2:28 AM, Shubham Goswami 
> wrote:
> >
> > Hi Erick
> >
> > Thanks for reply and your example is very helpful.
> > But i think we can only use this attribute if we are getting data from a
> > single field
> > which has the copy of all data from every field.
> > Please correct me if i am wrong.
> > Thanks for your great support.
> >
> > Shubham
> >
> > On Thu, Oct 17, 2019 at 5:56 PM Erick Erickson 
> > wrote:
> >
> >> First, it only counts if you add multiple entries for the field.
> Consider
> >> the following
> >> <doc>
> >>   <field name="f">a b c</field>
> >>   <field name="f">def</field>
> >> </doc>
> >>
> >> where the field has a positionIncrementGap of 100. The term positions of
> >> the entries are
> >> a:1
> >> b:2
> >> c:3
> >> d:103
> >> e:104
> >> f:105
> >>
> >> Now consider the doc where there’s only one field:
> >> <doc>
> >>   <field name="f">a b c d e f</field>
> >> </doc>
> >>
> >> The term positions are
> >> a:1
> >> b:2
> >> c:3
> >> d:4
> >> e:5
> >> f:6
> >>
> >> The use-case is if you, say, index individual sentences and want to
> match
> >> two or more words in the _same_ sentence. You can specify a phrase query
> >> where the slop is < the positionIncrementGap. So in the first case, if I
> >> search for “a b”~99 I’d get a match. But if I searched for “a d”~99 I
> >> wouldn’t.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Oct 17, 2019, at 2:09 AM, Shubham Goswami <
> shubham.gosw...@hotwax.co>
> >> wrote:
> >>>
> >>> Hi Community
> >>>
> >>> I am a beginner in solr and i am trying to understand the working of
> >>> positionIncrementGap but i am still not clear how it exactly works for
> >> the
> >>> phrase queries and general queires.
> >>>  Can somebody please help me to understand this with the help fo an
> >>> example ?
> >>> Any help will be appreciated. Thanks in advance.
> >>>
> >>> --
> >>> *Thanks & Regards*
> >>> Shubham Goswami
> >>> Enterprise Software Engineer
> >>> *HotWax Systems*
> >>> *Enterprise open source experts*
> >>> cell: +91-7803886288
> >>> office: 0731-409-3684
> >>> http://www.hotwaxsystems.com
> >>
> >>
> >
> > --
> > *Thanks & Regards*
> > Shubham Goswami
> > Enterprise Software Engineer
> > *HotWax Systems*
> > *Enterprise open source experts*
> > cell: +91-7803886288
> > office: 0731-409-3684
> > http://www.hotwaxsystems.com
>
>

-- 
Regards,

*Paras Lehana* [65871]
Software Programmer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*



Re: Query regarding positionIncrementGap

2019-10-18 Thread Erick Erickson
I really don’t understand the question. The field has to be multiValued, but 
there’s no other restriction. It’s all about whether a document you input has 
the same field name specified more than once, i.e. is multiValued. That’s why 
the example I gave has multiple entries for the same field:
<doc>
  <field name="f">sentence one</field>
  <field name="f">sentence two</field>
  <field name="f">sentence three</field>
  <field name="f">sentence four</field>
  <field name="f">sentence five</field>
</doc>

I think you’re missing the implication that the incoming document _already_ has 
the multiple fields put there by the time it gets to Solr.

Best,
Erick
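For reference, positionIncrementGap is an attribute of the field type in the schema; a minimal sketch (the type and field names here are illustrative, not from the original messages):

```xml
<fieldType name="text_sentences" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
<field name="blah" type="text_sentences" indexed="true" stored="true" multiValued="true"/>
```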


> On Oct 18, 2019, at 2:28 AM, Shubham Goswami  
> wrote:
> 
> Hi Erick
> 
> Thanks for reply and your example is very helpful.
> But i think we can only use this attribute if we are getting data from a
> single field
> which has the copy of all data from every field.
> Please correct me if i am wrong.
> Thanks for your great support.
> 
> Shubham
> 
> On Thu, Oct 17, 2019 at 5:56 PM Erick Erickson 
> wrote:
> 
>> First, it only counts if you add multiple entries for the field. Consider
>> the following
>> <doc>
>>   <field name="f">a b c</field>
>>   <field name="f">def</field>
>> </doc>
>> 
>> where the field has a positionIncrementGap of 100. The term positions of
>> the entries are
>> a:1
>> b:2
>> c:3
>> d:103
>> e:104
>> f:105
>> 
>> Now consider the doc where there’s only one field:
>> <doc>
>>   <field name="f">a b c d e f</field>
>> </doc>
>> 
>> The term positions are
>> a:1
>> b:2
>> c:3
>> d:4
>> e:5
>> f:6
>> 
>> The use-case is if you, say, index individual sentences and want to match
>> two or more words in the _same_ sentence. You can specify a phrase query
>> where the slop is < the positionIncrementGap. So in the first case, if I
>> search for “a b”~99 I’d get a match. But if I searched for “a d”~99 I
>> wouldn’t.
>> 
>> Best,
>> Erick
>> 
>>> On Oct 17, 2019, at 2:09 AM, Shubham Goswami 
>> wrote:
>>> 
>>> Hi Community
>>> 
>>> I am a beginner in solr and i am trying to understand the working of
>>> positionIncrementGap but i am still not clear how it exactly works for
>> the
>>> phrase queries and general queires.
>>>  Can somebody please help me to understand this with the help fo an
>>> example ?
>>> Any help will be appreciated. Thanks in advance.
>>> 
>>> --
>>> *Thanks & Regards*
>>> Shubham Goswami
>>> Enterprise Software Engineer
>>> *HotWax Systems*
>>> *Enterprise open source experts*
>>> cell: +91-7803886288
>>> office: 0731-409-3684
>>> http://www.hotwaxsystems.com
>> 
>> 
> 
> -- 
> *Thanks & Regards*
> Shubham Goswami
> Enterprise Software Engineer
> *HotWax Systems*
> *Enterprise open source experts*
> cell: +91-7803886288
> office: 0731-409-3684
> http://www.hotwaxsystems.com



Re: Japanese Query Unexpectedly Misses

2019-10-18 Thread Yasufumi Mizoguchi
Hi,

There are two solutions as far as I know.

1. Use userDictionary attribute
This is common and safe way I think.
Add userDictionary attribute into your tokenizer configuration and define
userDictionary file as follows.

Tokenizer:
<tokenizer class="solr.JapaneseTokenizerFactory" mode="search" userDictionary="lang/userdict_ja.txt"/>

userDictionary (lang/userdict_ja.txt in the above setting):
日本人,日本 人,ニッポン ジン,カスタム名詞

This gives you the result you want.

But "カスタム名詞" (customized noun) might not be an appropriate part of speech
for your service; therefore you should change "カスタム名詞" to another part of
speech, I think.


2. Use nBest attribute
If you use Solr 6.0 or higher (maybe...), this is worth trying.
Add the nBestExamples attribute to the tokenizer configuration as follows.

Tokenizer:
<tokenizer class="solr.JapaneseTokenizerFactory" mode="search" nBestExamples="/日本人-日本/日本人-人/"/>

When tokenizing sentences, JapaneseTokenizer considers various candidate
results (tokenized sentences) and calculates a cost for each. The tokenizer
then returns the result with the lowest cost. With nBest, JapaneseTokenizer
returns not only the lowest-cost result but also some other candidates.
However, this can affect not only the case you want to solve but other
cases as well.

And both ways require you to re-index all documents with the Japanese field
type.


Thanks,
Yasufumi

2019年10月18日(金) 2:44 Stephen Lewis Bianamara :

> Hi SOLR Community,
>
> I have an example of a basic Japanese indexing/recall scenario which I am
> trying to support, but cannot get to work.
>
> The scenario is: I would like for 日本人 (Japanese Person) to be matched by
> either 日本 (Japan) or 人 (Person). Currently, I am not seeing this work. My
> Japanese text field currently has the tokenizer
>
>> <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
>>
> What is most surprising to me is that I thought this is what mode="search"
> was made for. From the docs, I see
>
>> Use search mode to get a noun-decompounding effect useful for search.
>> search mode improves segmentation for search at the expense of
>> part-of-speech accuracy
>>
>
> I analyzed the breakdown, and I can see that the tokenizer is not
> generating three tokens (one for Japan, one for person, and one for
> Japanese Person) as I would have expected. Interestingly, the tokenizer
> does recognize that  日本人 is a compound noun, so it would seem to be that it
> should decompound it (see image below).
>
> Can you help me figure out if my configuration is incorrect, or if there
> is some way to fix this scenario?
>
> Thanks!
> Stephen
>
>
> [image: image.png]
>
>


Re: Solr 8.2 docker image in cloud mode not connecting to Zookeeper on startup

2019-10-18 Thread Martijn Koster


> On 18 Oct 2019, at 00:25, Drew Kidder  wrote:

> * I'm using the following command line to start a basic solr cloud instance
> as per the documentation: `bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181`

I assume you’re just looking to run a single Solr node in a single container, 
right?

Just set the ZK_HOST environment variable, and remove the command-line 
arguments.
And you don’t need to specify the port number unless you deviate from the 
default.
Have a look at this example:
https://github.com/docker-solr/docker-solr-examples/blob/master/swarm/docker-compose.yml


The “start” command starts Solr in the background, which is typically not what 
you want
when running Solr under docker.
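A minimal invocation along those lines (the ZK hostnames are the ones from this thread; `solr-foreground` is the foreground entrypoint shipped in the docker-solr image):

```shell
docker run -d --name solr \
  -e ZK_HOST=zk1:2181,zk2:2181,zk3:2181 \
  -p 8983:8983 \
  solr:8.2 solr-foreground
```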


Why your command isn’t working as is, is not clear. When you say you’re using 
that
command-line, how do you actually do that? In a full docker command line,
or a compose file, or from a “docker exec”, or from some orchestrator.
Share the exact thing you’re doing; perhaps there is a mistake there.
Also, run `ps -eflww` in the container to see what command-line arguments the 
JVM actually got started with.
And share the full startup log somewhere (in a GitHub gist perhaps), there 
might be something of interest earlier on.

>> (running `echo ruok | nc zk1 2181` returns the expected "imok" response
>> from ZK within the docker container where Solr is located)
>> * The netcat command mentioned above shows up in the ZK logs, but the Solr
>> attempts to connect do not (it's like the request isn't even getting to ZK)

Then it doesn’t sound like an environmental firewall/security-group/routing 
issue.
Next step to debug then could be to check if you actually see Solr make tcp 
connections
to port 2181, in the Solr container, using tcpdump/sysdig/netstat or some such.
If that gives a negative result, then you know it’s an issue in your Solr 
invocation config, or name resolution.
If that gives a positive result, then it’s environmental after all; and you can 
dig further.
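For example, a capture limited to ZooKeeper traffic, run inside the Solr container (tcpdump must be installed there):

```shell
# Watch for outbound connection attempts to the ZK client port.
tcpdump -n -i any 'tcp port 2181'
```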


But try the ZK_HOST thing first; it may just fix it.

— Martijn

Re: Performance Issue since Solr 7.7 with wt=javabin

2019-10-18 Thread Jan Høydahl
Hi,

Did you find a solution to your performance problem?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 17. jun. 2019 kl. 17:17 skrev Andy Reek :
> 
> Hi Solr team,
> 
> we are using Solr in version 7.1 as search engine in our online shop (SAP 
> Hybris). And as a task I needed to migrate to the most recent Solr in version 
> 7 (7.7). Doing this I faced extreme performance issues. After debugging and 
> testing different setups I found out, that they were caused by the parameter 
> wt=javabin. These issues begin to raise since version 7.7, in 7.6 it is still 
> working as fast as in 7.1.
> 
> Just an example: Doing a simple query for *.* and wt=javabin in 7.6: 0.2 
> seconds and in 7.7: 34 seconds!
> 
> The configuration of the schema.xml and solrconfig.xml are equal in both 
> versions. Version 8.1 has the same effect as 7.7. Using something other than 
> wt=javabin (e.g. wt=xml) will work fast in every version - which is our 
> current workaround.
> 
> 
> To reproduce this issue I have attached my used configsets folder plus some 
> test data. This all can be tested with docker and wget:
> 
> Solr 7.6:
> docker run -d --name solr7.6 -p 8983:8983 --rm -v 
> $PWD/configsets/default:/opt/solr/server/solr/configsets/myconfig:ro 
> solr:7.6-slim solr-create -c mycore -d 
> /opt/solr/server/solr/configsets/myconfig
> docker cp $PWD/data.json solr7.6:/opt/solr/data.json
> docker exec -it --user solr solr7.6 bin/post -c mycore data.json
> wget "http://localhost:8983/solr/mycore/select?q=*:*=javabin= 
> "
> (0.2s)
> 
> Solr 7.7:
> docker run -d --name solr7.7 -p 18983:8983 --rm -v 
> $PWD/configsets/default:/opt/solr/server/solr/configsets/myconfig:ro 
> solr:7.7-slim solr-create -c mycore -d 
> /opt/solr/server/solr/configsets/myconfig
> docker cp $PWD/data.json solr7.7:/opt/solr/data.json
> docker exec -it --user solr solr7.7 bin/post -c mycore data.json
> (34s)
> 
> For me it seems like a bug. But if not, then please let me know what I did 
> wrong ;-)
> 
> 
> Best Regards,
>  
> Andy Reek
> Principal Software Developer
>  
> diva-e Jena
> Mälzerstraße 3, 07745 Jena, Deutschland
> 
> T:   +49 (3641) 3678 (223)
> F:   +49 (3641) 3678 101
> andy.r...@diva-e.com 
>  
> www.diva-e.com  follow us: facebook 
> , twitter 
> , LinkedIn 
> , Xing 
> 
> 
>  
> diva-e AGETO GmbH
> Handelsregister: HRB 210399 Amtsgericht Jena
> Geschäftsführung: Sascha Sauer, Sirko Schneppe, Axel Jahn
> 



Re: Solr 8.2 - Added Field - can't facet using alias

2019-10-18 Thread Jan Høydahl
Are you querying across multiple collections? In that case you have to add the 
field to all those collections. You have not shown us your facet query.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 11. okt. 2019 kl. 22:37 skrev Joe Obernberger :
> 
> Hi All, I've added a field with:
> 
> curl -X POST -H 'Content-type:application/json' --data-binary 
> '{"add-field":{"name":"FaceCluster","type":"plongs","stored":false,"multiValued":true,"indexed":true}}'
>  http://miranda:9100/solr/UNCLASS_2019_8_5_36/schema
> 
> It returned success.  In the UI, when I examine the schema, it shows up but 
> does not list 'schema' with the check-boxes for Indexed/DocValues etc.  It 
> only lists Properties for FaceCluster.  Other plong fields that were added a 
> while back show both properties and schema.
> When I facet on this field using an alias, I get 'Error from server at 
> null: undefined field: FaceCluster'.  If I search an individual Solr 
> collection, I can facet on it.
> 
> Any ideas?
> 
> -Joe
> 



Help to learn synonymQueryStyle and FieldTypeSimilarity

2019-10-18 Thread Shubham Goswami
Hi Community

I am a beginner in Solr and I am trying to understand the working of
synonymQueryStyle and FieldTypeSimilarity, but I am still not clear on how
they exactly work.
Can somebody please help me to understand this with the help of an example?
Any help will be appreciated. Thanks in advance.

-- 
*Thanks & Regards*
Shubham Goswami
Enterprise Software Engineer
*HotWax Systems*
*Enterprise open source experts*
cell: +91-7803886288
office: 0731-409-3684
http://www.hotwaxsystems.com


Re: Query regarding positionIncrementGap

2019-10-18 Thread Shubham Goswami
Hi Erick

Thanks for the reply; your example is very helpful.
But I think we can only use this attribute if we are getting data from a
single field which has a copy of the data from every field.
Please correct me if I am wrong.
Thanks for your great support.

Shubham

On Thu, Oct 17, 2019 at 5:56 PM Erick Erickson 
wrote:

> First, it only counts if you add multiple entries for the field. Consider
> the following
> <doc>
>   <field name="f">a b c</field>
>   <field name="f">def</field>
> </doc>
>
> where the field has a positionIncrementGap of 100. The term positions of
> the entries are
> a:1
> b:2
> c:3
> d:103
> e:104
> f:105
>
> Now consider the doc where there’s only one field:
> <doc>
>   <field name="f">a b c d e f</field>
> </doc>
>
> The term positions are
> a:1
> b:2
> c:3
> d:4
> e:5
> f:6
>
> The use-case is if you, say, index individual sentences and want to match
> two or more words in the _same_ sentence. You can specify a phrase query
> where the slop is < the positionIncrementGap. So in the first case, if I
> search for “a b”~99 I’d get a match. But if I searched for “a d”~99 I
> wouldn’t.
>
> Best,
> Erick
>
> > On Oct 17, 2019, at 2:09 AM, Shubham Goswami 
> wrote:
> >
> > Hi Community
> >
> > I am a beginner in solr and i am trying to understand the working of
> > positionIncrementGap but i am still not clear how it exactly works for
> the
> > phrase queries and general queires.
> >   Can somebody please help me to understand this with the help fo an
> > example ?
> > Any help will be appreciated. Thanks in advance.
> >
> > --
> > *Thanks & Regards*
> > Shubham Goswami
> > Enterprise Software Engineer
> > *HotWax Systems*
> > *Enterprise open source experts*
> > cell: +91-7803886288
> > office: 0731-409-3684
> > http://www.hotwaxsystems.com
>
>

-- 
*Thanks & Regards*
Shubham Goswami
Enterprise Software Engineer
*HotWax Systems*
*Enterprise open source experts*
cell: +91-7803886288
office: 0731-409-3684
http://www.hotwaxsystems.com