Re: Advice on how to work with pure JSON data.

2017-04-20 Thread russell . lemaster
One thing I forgot to mention in my original post is that I wish to do this
using the SolrJ client. I have my own REST server that presents a common API
to our users, but the back-end can be anything I wish. I have been using
"that other Lucene based product" :), but I wish to stick to a product that
is more open and that perhaps I can contribute to.

I've searched for SolrJ examples for child documents and unfortunately there
are far too many references to implementations based on older versions of
Solr. Specifically, I would like to insert beans with multiple child
collections in them, but the latest I've read says this is not currently
possible. Is that still true?

In short, it isn't so important that REST-based requests / responses from
Solr are pure JSON, so long as I can do what I want from the Java client.

Do you know if there have been recent additions / enhancements up through
6.5 that make this more straightforward?
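
For what it's worth, nested children can at least be attached to plain
SolrInputDocuments in SolrJ, even if bean-style child collections are not.
A minimal, untested sketch (collection name and ZooKeeper address are made
up, and the calls would live in a method that throws SolrServerException /
IOException):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Placeholders: ZooKeeper address and "products" collection are assumptions.
CloudSolrClient client = new CloudSolrClient.Builder()
    .withZkHost("localhost:2181")
    .build();
client.setDefaultCollection("products");

SolrInputDocument product = new SolrInputDocument();
product.addField("id", "bb903493-55b0-421f-a83e-2199ea11e136");
product.addField("productName_s", "UsefulWidget");

SolrInputDocument supplier = new SolrInputDocument();
supplier.addField("id", "bb903493-55b0-421f-a83e-2199ea11e221");
supplier.addField("name_s", "Acme Tools");
product.addChildDocument(supplier); // children come back as anonymous _childDocuments_

client.add(product);
client.commit();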

Thanks 


- Original Message -

From: "Mikhail Khludnev"  
To: "solr-user"  
Sent: Thursday, April 20, 2017 3:38:11 PM 
Subject: Re: Advice on how to work with pure JSON data. 

This is one of the features of the epic 
https://issues.apache.org/jira/browse/SOLR-10144. 
Until it's done the only way to achieve this is to properly set many params 
for 
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents#TransformingResultDocuments-[subquery]
 

Note, here I assume that children mapping is static ie there is a limited 
list of optional scopes. 
Indexing and searching arbitrary JSON is esoteric (XML DB like) problem. 
Also, beware of https://issues.apache.org/jira/browse/SOLR-10500. I hope to 
fix it soon. 

On Thu, Apr 20, 2017 at 10:15 PM,  wrote: 

> 
> I have looked at many examples on how to do what I want, but they tend to 
> only show fragments or they 
> are based on older versions of Solr. I'm hoping there are new features 
> that make what I'm doing easier. 
> 
> I am running version 6.5 and am testing by running in cloud mode but only 
> on a single machine. 
> 
> Basically, I have a large number of documents stored as JSON in individual 
> files. I want to take that JSON 
> document and index it without having to do any pre-processing, etc. I also 
> need to be able to write newly indexed 
> JSON data back to individual files in the same format. 
> 
> For example, let's say I have a json document that looks like the 
> following: 
> 
> { 
> "id" : "bb903493-55b0-421f-a83e-2199ea11e136", 
> "productName_s" : "UsefulWidget", 
> "productCategory_s" : "tool", 
> "suppliers" : [ 
> { 
> "id" : " bb903493-55b0-421f-a83e-2199ea11e221", 
> "name_s" : "Acme Tools", 
> "productNumber_s" : "10342UW" 
> }, { 
> "id" : " bb903493-55b0-421a-a83e-2199ea11e445", 
> "name_s" : "Snappy Tools", 
> "productNumber_s" : "ST-X100023" 
> } 
> ], 
> "resellers" : [ 
> { 
> "id" : "cc 903493-55b0-421f-a83e-2199ea11e221", 
> "name_s" : "Target", 
> "productSKU_s" : "TA092310342UW" 
> }, { 
> "id" : "bc903493-55b0-421a-a83e-2199ea11e445", 
> "name_s" : "Wal-Mart", 
> "productSKU_s" : "029342ABLSWM" 
> } 
> ] 
> } 
> 
> I know I can use the /update/json/docs handler to insert the above but 
> from what I understand, I'd have to set up parameters 
> telling it how to split the children, etc. Though that is a bit of a pain, 
> I can make that happen. 
> 
> The problem is that, when I then try to query for the data, it comes back 
> with _childDocuments_ instead of the names of the 
> child document lists. So, how can I have Solr return the document as it 
> was originally indexed (I know it would be embedded 
> in the results structure, but I can deal with that)? 
> 
> I am running version 6.5 and I am hoping there is a method I haven't seen 
> documented that can do this. If not, can someone 
> point me to some examples of how to do this another way. 
> 
> If there is no easy way to do this with the current version, can someone 
> point me to a good resource for writing my own 
> handlers? 
> 
> Thank you. 
> 
> 
> 
> 
> 
> 
> 
> 
> 


-- 
Sincerely yours 
Mikhail Khludnev 



Data Changes Logging

2017-04-20 Thread Preeti Bhat
Hi All,

We got a peculiar requirement from a client, and I am not sure whether Solr
supports it. We would like to have alerts in place if the changes to a
particular field exceed a specific threshold on a given day.
For example, say we have a field "Name": if around 30% of its values change
in a day, we would like an alert. I am not sure whether there is some logging
mechanism to get this done. Any ideas would be appreciated.


Thanks and Regards,
Preeti







Re: Nodes goes down but never recovers.

2017-04-20 Thread Pranaya Behera
Hi Erick,
  Even when they use different solr.home directories, which I have also
tested in an AWS environment, the problem is the same.

Can someone verify the findings in my first message on their local setup?

On Fri, Apr 21, 2017 at 2:27 AM, Erick Erickson  wrote:
> Have you looked at the Solr logs on the node you try to bring back up?
> There are sometimes much more informative messages in the log files.
> The proverbial "smoking gun" would be messages about write locks.
>
> You say they are all using the same solr.home, which is probably the
> source of a lot of your issues. Take a look at the directory structure
> after you start up the example and you'll see different -s parameters
> for each of the instances started on the same machine, so the startup
> looks something like:
>
> bin/solr start -c -z localhost:2181 -p 898$1 -s example/cloud/node1/solr
> bin/solr start -c -z localhost:2181 -p 898$1 -s example/cloud/node2/solr
>
> and the like.
>
> Best,
> Erick
>
> On Thu, Apr 20, 2017 at 11:01 AM, Pranaya Behera
>  wrote:
>> Hi,
>>  Can someone from the mailing list also confirm the same findings
>> ? I am at wit's end on what to do to fix this. Please guide me to
>> create a patch for the same.
>>
>> On Thu, Apr 20, 2017 at 3:13 PM, Pranaya Behera
>>  wrote:
>>> Hi,
>>>  Through SolrJ I am trying to upload configsets and create
>>> collections in my solrcloud.
>>>
>>> Setup:
>>> 1 Standalone zookeeper listening on 2181 port. version 3.4.10
>>> -- bin/zkServer.sh start
>>> 3 Starting solr nodes. (All running from the same solr.home) version
>>> 6.5.0 and as well in 6.2.1
>>> -- bin/solr -c -z localhost:2181 -p 8983
>>> -- bin/solr -c -z localhost:2181 -p 8984
>>> -- bin/solr -c -z localhost:2181 -p 8985
>>>
>>> After first run of my java application to upload the config and create
>>> the collections in solr through zookeeper is seemless and working
>>> fine.
>>> Here is the clusterstatus after the first run.
>>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes-json
>>>
>>> Stopped one solr node via:
>>> -- bin/solr stop -p 8985
>>> clusterstatus changed to:
>>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes1down-json
>>>
>>> Till now everything is as expected.
>>>
>>> Here is the remaining part where it confuses me.
>>>
>>> Bring the down node back to life. Clusterstatus changed from 2 node
>>> down with 1 node not found to 3 node down including the new node that
>>> just brought up.
>>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes3down-json
>>> Expected result should be all the other nodes should be in active mode
>>> and this one would be recovery mode and then it would be active mode,
>>> as this node had data before i stopped it using the script.
>>>
>>> Now I added one more node to the cluster via
>>> -- bin/solr -c -z localhost:2181 -p 8986
>>> The clusterstatus changed to:
>>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-4node3down-json
>>> This one just retains the previous state and adds the node to the cluster.
>>>
>>>
>>> When bringing up the removed node which was previously in the cluster
>>> which was registered to the zookeeper and has data about the
>>> collections be registered as active rather than making every other
>>> node down ? If so what is the solution to this ?
>>>
>>> When we add more nodes to an existing cluster, how to ensure that it
>>> also gets the same collections/data i.e. basically synchronizes with
>>> the other nodes which are present in the node rather than manually
>>> create collection for that specific node ? As you can see from the
>>> lastly added node's clusterstate it is there in the live_nodes but
>>> never got the collections into its data dir.
>>> Is there any other way to add a node with the existing cluster with
>>> the cluster data ?
>>>
>>> For the completion here is the code that is used to upload config and
>>> create collection through CloudSolrClient in Solrj.(Not full code but
>>> part of it where the operation is happening.)
>>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-code-java
>>> Thats all there is for a collection to create: upload configsets to
>>> zookeeper, create collection and reload collection if required.
>>>
>>> This I have tried in my local Mac OS Sierra and also in AWS env which
>>> same effect.
>>>
>>>
>>>
>>> --
>>> Thanks & Regards
>>> Pranaya PR Behera
>>
>>
>>
>> --
>> Thanks & Regards
>> Pranaya PR Behera



-- 
Thanks & Regards
Pranaya PR Behera


Enable https for Solr

2017-04-20 Thread Zheng Lin Edwin Yeo
Hi,

I would like to find out how we can allow Solr to accept secure
connections via HTTPS.

I am using SolrCloud on Solr 6.4.2

Regards,
Edwin


Re: HttpSolrServer commit is taking more time

2017-04-20 Thread Shawn Heisey
On 4/20/2017 9:23 PM, Venkateswarlu Bommineni wrote:
> I am new to Solr so need your help in solving below issue.
>
> I am using SolrJ to add and commit the files into Solr.
>
> But Solr commit is taking a long time.
>
> for example: for 14000 records it is taking 4 min.

Usually, extreme commit times like this have one of two causes:

1) The caches are very large and have a large autowarmCount.

2) The Solr heap is way too small, and the JVM is doing constant garbage
collections.

If queries are not having slowness issues, I would bet on option 1,
although you may in fact be running into BOTH problems.

This is what a typical example config looks like for one of the Solr caches:

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>

This cache has a size of 512, but autowarmCount is zero.  This means
that when a new searcher is created by a commit, none of the entries in
the filterCache on the old searcher will make it to the cache on the new
searcher.  If you change the autowarmCount value to say 4, then the top
4 filter queries in the cache will be re-executed on the new searcher,
prepopulating the new cache with four entries.  If each of those four
filters takes ten seconds to run, then warming that cache will take 40
seconds.  I'm betting that somebody changed the autowarmCount values on
the Solr caches to a high number in your configuration. If that's the
case, lower the number and reload/restart.
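
If you are on SolrCloud, the reload half of that can be done with the
Collections API, for example (collection name is a placeholder):

curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=yourCollection"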

Thanks,
Shawn



HttpSolrServer commit is taking more time

2017-04-20 Thread Venkateswarlu Bommineni
Hi,

I am new to Solr, so I need your help in solving the issue below.

I am using SolrJ to add and commit the files into Solr.

But Solr commit is taking a long time.

for example: for 14000 records it is taking 4 min.

below is the code snippet.


static final HttpSolrServer server = new HttpSolrServer(urlString);
server.add();
server.commit();

Could you please help me in resolving this issue?


Thanks,
Venkat.


Re: Nodes goes down but never recovers.

2017-04-20 Thread Erick Erickson
Have you looked at the Solr logs on the node you try to bring back up?
There are sometimes much more informative messages in the log files.
The proverbial "smoking gun" would be messages about write locks.

You say they are all using the same solr.home, which is probably the
source of a lot of your issues. Take a look at the directory structure
after you start up the example and you'll see different -s parameters
for each of the instances started on the same machine, so the startup
looks something like:

bin/solr start -c -z localhost:2181 -p 898$1 -s example/cloud/node1/solr
bin/solr start -c -z localhost:2181 -p 898$1 -s example/cloud/node2/solr

and the like.

Best,
Erick

On Thu, Apr 20, 2017 at 11:01 AM, Pranaya Behera
 wrote:
> Hi,
>  Can someone from the mailing list also confirm the same findings
> ? I am at wit's end on what to do to fix this. Please guide me to
> create a patch for the same.
>
> On Thu, Apr 20, 2017 at 3:13 PM, Pranaya Behera
>  wrote:
>> Hi,
>>  Through SolrJ I am trying to upload configsets and create
>> collections in my solrcloud.
>>
>> Setup:
>> 1 Standalone zookeeper listening on 2181 port. version 3.4.10
>> -- bin/zkServer.sh start
>> 3 Starting solr nodes. (All running from the same solr.home) version
>> 6.5.0 and as well in 6.2.1
>> -- bin/solr -c -z localhost:2181 -p 8983
>> -- bin/solr -c -z localhost:2181 -p 8984
>> -- bin/solr -c -z localhost:2181 -p 8985
>>
>> After first run of my java application to upload the config and create
>> the collections in solr through zookeeper is seemless and working
>> fine.
>> Here is the clusterstatus after the first run.
>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes-json
>>
>> Stopped one solr node via:
>> -- bin/solr stop -p 8985
>> clusterstatus changed to:
>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes1down-json
>>
>> Till now everything is as expected.
>>
>> Here is the remaining part where it confuses me.
>>
>> Bring the down node back to life. Clusterstatus changed from 2 node
>> down with 1 node not found to 3 node down including the new node that
>> just brought up.
>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes3down-json
>> Expected result should be all the other nodes should be in active mode
>> and this one would be recovery mode and then it would be active mode,
>> as this node had data before i stopped it using the script.
>>
>> Now I added one more node to the cluster via
>> -- bin/solr -c -z localhost:2181 -p 8986
>> The clusterstatus changed to:
>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-4node3down-json
>> This one just retains the previous state and adds the node to the cluster.
>>
>>
>> When bringing up the removed node which was previously in the cluster
>> which was registered to the zookeeper and has data about the
>> collections be registered as active rather than making every other
>> node down ? If so what is the solution to this ?
>>
>> When we add more nodes to an existing cluster, how to ensure that it
>> also gets the same collections/data i.e. basically synchronizes with
>> the other nodes which are present in the node rather than manually
>> create collection for that specific node ? As you can see from the
>> lastly added node's clusterstate it is there in the live_nodes but
>> never got the collections into its data dir.
>> Is there any other way to add a node with the existing cluster with
>> the cluster data ?
>>
>> For the completion here is the code that is used to upload config and
>> create collection through CloudSolrClient in Solrj.(Not full code but
>> part of it where the operation is happening.)
>> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-code-java
>> Thats all there is for a collection to create: upload configsets to
>> zookeeper, create collection and reload collection if required.
>>
>> This I have tried in my local Mac OS Sierra and also in AWS env which
>> same effect.
>>
>>
>>
>> --
>> Thanks & Regards
>> Pranaya PR Behera
>
>
>
> --
> Thanks & Regards
> Pranaya PR Behera


Re: Advice on how to work with pure JSON data.

2017-04-20 Thread Mikhail Khludnev
This is one of the features of the epic
https://issues.apache.org/jira/browse/SOLR-10144.
Until it's done the only way to achieve this is to properly set many params
for
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents#TransformingResultDocuments-[subquery]

Note: here I assume that the children mapping is static, i.e. there is a
limited list of optional scopes.
Indexing and searching arbitrary JSON is an esoteric (XML-DB-like) problem.
Also, beware of https://issues.apache.org/jira/browse/SOLR-10500. I hope to
fix it soon.
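
For illustration, with the product/supplier/reseller example quoted below, a
[subquery] request would look roughly like the following. This is an untested
sketch: it assumes the children were block-indexed (so each child carries
_root_ pointing at the parent id) and that each scope has a field that
distinguishes it:

q=*:*
&fl=id,productName_s,suppliers:[subquery],resellers:[subquery]
&suppliers.q={!terms f=_root_ v=$row.id}
&suppliers.fq=productNumber_s:[* TO *]
&suppliers.rows=10
&resellers.q={!terms f=_root_ v=$row.id}
&resellers.fq=productSKU_s:[* TO *]
&resellers.rows=10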

On Thu, Apr 20, 2017 at 10:15 PM,  wrote:

>
> I have looked at many examples on how to do what I want, but they tend to
> only show fragments or they
> are based on older versions of Solr. I'm hoping there are new features
> that make what I'm doing easier.
>
> I am running version 6.5 and am testing by running in cloud mode but only
> on a single machine.
>
> Basically, I have a large number of documents stored as JSON in individual
> files. I want to take that JSON
> document and index it without having to do any pre-processing, etc. I also
> need to be able to write newly indexed
> JSON data back to individual files in the same format.
>
> For example, let's say I have a json document that looks like the
> following:
>
> {
> "id" : "bb903493-55b0-421f-a83e-2199ea11e136",
> "productName_s" : "UsefulWidget",
> "productCategory_s" : "tool",
> "suppliers" : [
> {
> "id" : " bb903493-55b0-421f-a83e-2199ea11e221",
> "name_s" : "Acme Tools",
> "productNumber_s" : "10342UW"
> }, {
> "id" : " bb903493-55b0-421a-a83e-2199ea11e445",
> "name_s" : "Snappy Tools",
> "productNumber_s" : "ST-X100023"
> }
> ],
> "resellers" : [
> {
> "id" : "cc 903493-55b0-421f-a83e-2199ea11e221",
> "name_s" : "Target",
> "productSKU_s" : "TA092310342UW"
> }, {
> "id" : "bc903493-55b0-421a-a83e-2199ea11e445",
> "name_s" : "Wal-Mart",
> "productSKU_s" : "029342ABLSWM"
> }
> ]
> }
>
> I know I can use the /update/json/docs handler to insert the above but
> from what I understand, I'd have to set up parameters
> telling it how to split the children, etc. Though that is a bit of a pain,
> I can make that happen.
>
> The problem is that, when I then try to query for the data, it comes back
> with _childDocuments_ instead of the names of the
> child document lists. So, how can I have Solr return the document as it
> was originally indexed (I know it would be embedded
> in the results structure, but I can deal with that)?
>
> I am running version 6.5 and I am hoping there is a method I haven't seen
> documented that can do this. If not, can someone
> point me to some examples of how to do this another way.
>
> If there is no easy way to do this with the current version, can someone
> point me to a good resource for writing my own
> handlers?
>
> Thank you.
>
>
>
>
>
>
>
>
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Loadbalance for SorCloud using SolrNet

2017-04-20 Thread Florian Gleixner
Hi,

I wrote a Solr Cloud proxy. The idea is to run a proxy on every client
machine that wants to connect to the SolrCloud. The client connects to
the proxy at localhost, and the proxy uses SolrJ to connect to the cloud.

It is not yet tested very well, but it works for me.

https://gitlab.lrz.de/a2814ad/SolrCloudProxy

A jar with dependencies is here:

https://gitlab.lrz.de/a2814ad/SolrCloudProxy/blob/master/target/SolrCloudProxy-0.0.1-SNAPSHOT-jar-with-dependencies.jar



On 20.04.2017 09:53, Vrinda Ashok wrote:
> Thanks  Shawn.
> 
> So do you suggest to have external load balance ? Something like HA proxy or 
> physical load balance.
> 
> -Original Message-
> From: Shawn Heisey [mailto:apa...@elyograg.org] 
> Sent: Thursday, April 20, 2017 12:36 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Loadbalance for SorCloud using SolrNet
> 
> On 4/20/2017 12:47 AM, Vrinda Ashok wrote:
>> I have application  sending request to Shard1 each time, making this single 
>> point of failure. Please suggest what can I use for Load balancing in 
>> SolrNet.
>>
>> Is there something like CloudSolrClient as in SolrJ ? Or will I have to go 
>> with HA proxy or physical load balance only ?
> 
> SolrNet was not built by the Solr project.  It was developed by somebody else.
> 
> Unless SolrNet has the capability of using more than one base URL to access 
> Solr and failing over if one of them becomes unusable, you will need a 
> separate load balancer.  I have no idea whether SolrNet has that capability.
> 
> Thanks,
> Shawn
> 






Advice on how to work with pure JSON data.

2017-04-20 Thread russell . lemaster

I have looked at many examples on how to do what I want, but they tend to only 
show fragments or they 
are based on older versions of Solr. I'm hoping there are new features that 
make what I'm doing easier. 

I am running version 6.5 and am testing by running in cloud mode but only on a 
single machine. 

Basically, I have a large number of documents stored as JSON in individual 
files. I want to take that JSON 
document and index it without having to do any pre-processing, etc. I also need 
to be able to write newly indexed 
JSON data back to individual files in the same format. 

For example, let's say I have a json document that looks like the following: 

{
    "id" : "bb903493-55b0-421f-a83e-2199ea11e136",
    "productName_s" : "UsefulWidget",
    "productCategory_s" : "tool",
    "suppliers" : [
        {
            "id" : "bb903493-55b0-421f-a83e-2199ea11e221",
            "name_s" : "Acme Tools",
            "productNumber_s" : "10342UW"
        }, {
            "id" : "bb903493-55b0-421a-a83e-2199ea11e445",
            "name_s" : "Snappy Tools",
            "productNumber_s" : "ST-X100023"
        }
    ],
    "resellers" : [
        {
            "id" : "cc903493-55b0-421f-a83e-2199ea11e221",
            "name_s" : "Target",
            "productSKU_s" : "TA092310342UW"
        }, {
            "id" : "bc903493-55b0-421a-a83e-2199ea11e445",
            "name_s" : "Wal-Mart",
            "productSKU_s" : "029342ABLSWM"
        }
    ]
}

I know I can use the /update/json/docs handler to insert the above but from 
what I understand, I'd have to set up parameters 
telling it how to split the children, etc. Though that is a bit of a pain, I 
can make that happen. 
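
For what it's worth, a request along these lines is roughly what that handler
expects. This is untested; the split paths are taken from the example above,
the collection and file names are placeholders, and additional f= field
mappings may also be needed:

curl 'http://localhost:8983/solr/yourCollection/update/json/docs?split=/|/suppliers|/resellers&commit=true' \
     -H 'Content-Type: application/json' \
     --data-binary @product.json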

The problem is that, when I then try to query for the data, it comes back with 
_childDocuments_ instead of the names of the 
child document lists. So, how can I have Solr return the document as it was 
originally indexed (I know it would be embedded 
in the results structure, but I can deal with that)? 

I am running version 6.5 and I am hoping there is a method I haven't seen
documented that can do this. If not, can someone point me to some examples of
how to do this another way?

If there is no easy way to do this with the current version, can someone point 
me to a good resource for writing my own 
handlers? 

Thank you. 










Re: Nodes goes down but never recovers.

2017-04-20 Thread Pranaya Behera
Hi,
 Can someone from the mailing list also confirm the same findings?
I am at my wit's end about what to do to fix this. Please guide me so I can
create a patch for it.

On Thu, Apr 20, 2017 at 3:13 PM, Pranaya Behera
 wrote:
> Hi,
>  Through SolrJ I am trying to upload configsets and create
> collections in my solrcloud.
>
> Setup:
> 1 Standalone zookeeper listening on 2181 port. version 3.4.10
> -- bin/zkServer.sh start
> 3 Starting solr nodes. (All running from the same solr.home) version
> 6.5.0 and as well in 6.2.1
> -- bin/solr -c -z localhost:2181 -p 8983
> -- bin/solr -c -z localhost:2181 -p 8984
> -- bin/solr -c -z localhost:2181 -p 8985
>
> After first run of my java application to upload the config and create
> the collections in solr through zookeeper is seemless and working
> fine.
> Here is the clusterstatus after the first run.
> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes-json
>
> Stopped one solr node via:
> -- bin/solr stop -p 8985
> clusterstatus changed to:
> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes1down-json
>
> Till now everything is as expected.
>
> Here is the remaining part where it confuses me.
>
> Bring the down node back to life. Clusterstatus changed from 2 node
> down with 1 node not found to 3 node down including the new node that
> just brought up.
> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes3down-json
> Expected result should be all the other nodes should be in active mode
> and this one would be recovery mode and then it would be active mode,
> as this node had data before i stopped it using the script.
>
> Now I added one more node to the cluster via
> -- bin/solr -c -z localhost:2181 -p 8986
> The clusterstatus changed to:
> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-4node3down-json
> This one just retains the previous state and adds the node to the cluster.
>
>
> When bringing up the removed node which was previously in the cluster
> which was registered to the zookeeper and has data about the
> collections be registered as active rather than making every other
> node down ? If so what is the solution to this ?
>
> When we add more nodes to an existing cluster, how to ensure that it
> also gets the same collections/data i.e. basically synchronizes with
> the other nodes which are present in the node rather than manually
> create collection for that specific node ? As you can see from the
> lastly added node's clusterstate it is there in the live_nodes but
> never got the collections into its data dir.
> Is there any other way to add a node with the existing cluster with
> the cluster data ?
>
> For the completion here is the code that is used to upload config and
> create collection through CloudSolrClient in Solrj.(Not full code but
> part of it where the operation is happening.)
> https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-code-java
> Thats all there is for a collection to create: upload configsets to
> zookeeper, create collection and reload collection if required.
>
> This I have tried in my local Mac OS Sierra and also in AWS env which
> same effect.
>
>
>
> --
> Thanks & Regards
> Pranaya PR Behera



-- 
Thanks & Regards
Pranaya PR Behera


Unstored uniqueKey

2017-04-20 Thread Chris Sun
Hi,

In Solr (6.5.0) cloud mode, is a string-based (docValues="true") uniqueKey
still required to be stored?

I set it to false and got a “uniqueKey is not stored - distributed search and 
MoreLikeThis will not work” warning.


Thanks,
--  
Chris



Re: BUILD FAILED solr 6.5.0

2017-04-20 Thread Steve Rowe
Hi Bernd,

Glad you got things working.

https://issues.apache.org/jira/browse/LUCENE-4960 is an existing issue to 
address this problem, but nobody has figured out how to do it yet I guess.

--
Steve
www.lucidworks.com

> On Apr 20, 2017, at 2:35 AM, Bernd Fehling  
> wrote:
> 
> Hi Steve,
> 
> thanks a lot for solving my problem.
> 
> Would it be possible to check for ivy >= 2.3 at start of build
> and give your hint as message to the user?
> 
> Regards
> Bernd
> 
> Am 19.04.2017 um 17:01 schrieb Steve Rowe:
>> Hi Bernd,
>> 
>> Your Ivy may be outdated - the project requires minimum 2.3.
>> 
>> Try removing all pre-2.3 ivy-*.jar files from ~/.ant/lib/, then running “ant 
>> ivy-bootstrap”.
>> 
>> --
>> Steve
>> www.lucidworks.com
>> 
>>> On Apr 19, 2017, at 10:55 AM, Bernd Fehling 
>>>  wrote:
>>> 
>>> Tried today to have a look at solr 6.5.0.
>>> - download solr-6.5.0-src.tgz from apache.org and extracted to workspace
>>> - ant eclipse
>>> - imported to eclipse neon as new project
>>> - from eclipse in lucene subdir clicked on build.xml and selected
>>> "Run As" --> "Ant Build..."
>>> - selected "package" and "Run"
>>> 
>>> Result:
>>> ...
>>> [javadoc] Loading source files for package org.apache.lucene.search.spans...
>>> [javadoc] Loading source files for package org.apache.lucene.store...
>>> [javadoc] Loading source files for package org.apache.lucene.util...
>>> [javadoc] Loading source files for package 
>>> org.apache.lucene.util.automaton...
>>> [javadoc] Loading source files for package org.apache.lucene.util.fst...
>>> [javadoc] Constructing Javadoc information...
>>> [javadoc] Standard Doclet version 1.8.0_121
>>> [javadoc] Building tree for all the packages and classes...
>>> [javadoc] Building index for all the packages and classes...
>>> [javadoc] Building index for all classes...
>>>[exec] Result: 128
>>> [jar] Building jar:
>>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/test-framework/lucene-test-framework-6.5.0-SNAPSHOT-javadoc.jar
>>> javadocs:
>>> changes-to-html:
>>>   [mkdir] Created dir: 
>>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/docs/changes
>>>  [delete] Deleting: 
>>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/doap.lucene.version.dates.csv
>>>[copy] Copying 3 files to 
>>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/docs/changes
>>> ivy-availability-check:
>>> ivy-fail:
>>> ivy-configure:
>>> [ivy:configure] :: loading settings :: file = 
>>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/top-level-ivy-settings.xml
>>> resolve-groovy:
>>> [ivy:cachepath] :: resolving dependencies :: 
>>> org.codehaus.groovy#groovy-all-caller;working
>>> [ivy:cachepath] confs: [default]
>>> [ivy:cachepath] found org.codehaus.groovy#groovy-all;2.4.8 in public
>>> [ivy:cachepath] :: resolution report :: resolve 10ms :: artifacts dl 0ms
>>> -
>>> |  |modules||   artifacts   |
>>> |   conf   | number| search|dwnlded|evicted|| number|dwnlded|
>>> -
>>> |  default |   1   |   0   |   0   |   0   ||   1   |   0   |
>>> -
>>> resolve-markdown:
>>> 
>>> BUILD FAILED
>>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/common-build.xml:2415:
>>>  ivy:cachepath doesn't support the nested "dependency" element.
>>> 
>>> 
>>> Any idea what is going wrong?
>>> 
>>> Something with ivy:dependency within ivy:cachepath, but how to fix it?
>>> 
>>> Regards
>>> Bernd
>> 



Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
I also opened https://issues.apache.org/jira/browse/SOLR-10532 to fix
this annoying and confusing behavior of SuggestComponent.

On Thu, Apr 20, 2017 at 8:40 PM, Andrea Gazzarini  wrote:
> Ah great, many thanks again!
>
>
>
> On 20/04/17 17:09, Shalin Shekhar Mangar wrote:
>>
>> Hi Andrea,
>>
>> Looks like I have you some bad information. I looked at the code and
>> ran a test locally. The suggest.build and suggest.reload params are in
>> fact distributed across to all shards but only to one replica of each
>> shard. This is still bad enough and you should use buildOnOptimize as
>> suggested but I just wanted to correct the wrong information I gave
>> earlier.
>>
>> On Thu, Apr 20, 2017 at 6:23 PM, Andrea Gazzarini 
>> wrote:
>>>
>>> Perfect, I don't need NRT at this moment so that fits perfectly
>>>
>>> Thanks,
>>> Andrea
>>>
>>>
>>> On 20/04/17 14:37, Shalin Shekhar Mangar wrote:

 Yeah, if it is just once a day then you can afford to do an optimize.
 For a more NRT indexing approach, I wouldn't recommend optimize at
 all.

 On Thu, Apr 20, 2017 at 5:29 PM, Andrea Gazzarini 
 wrote:
>
> Ok, many thanks
>
> I see / read that it should be better to rely on the background merging
> instead of issuing explicit optimizes, but I think in this case one
> optimize
> in a day it shouldn't be a problem.
>
> Did I get you correctly?
>
> Thanks again,
> Andrea
>
>
> On 20/04/17 13:17, Shalin Shekhar Mangar wrote:
>>
>> Can the client not send an optimize command explicitly after all
>> indexing/deleting is complete?
>
>

>>
>>
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini

Ah great, many thanks again!


On 20/04/17 17:09, Shalin Shekhar Mangar wrote:

Hi Andrea,

Looks like I have you some bad information. I looked at the code and
ran a test locally. The suggest.build and suggest.reload params are in
fact distributed across to all shards but only to one replica of each
shard. This is still bad enough and you should use buildOnOptimize as
suggested but I just wanted to correct the wrong information I gave
earlier.

On Thu, Apr 20, 2017 at 6:23 PM, Andrea Gazzarini  wrote:

Perfect, I don't need NRT at this moment so that fits perfectly

Thanks,
Andrea


On 20/04/17 14:37, Shalin Shekhar Mangar wrote:

Yeah, if it is just once a day then you can afford to do an optimize.
For a more NRT indexing approach, I wouldn't recommend optimize at
all.

On Thu, Apr 20, 2017 at 5:29 PM, Andrea Gazzarini 
wrote:

Ok, many thanks

I see / read that it should be better to rely on the background merging
instead of issuing explicit optimizes, but I think in this case one
optimize
in a day it shouldn't be a problem.

Did I get you correctly?

Thanks again,
Andrea


On 20/04/17 13:17, Shalin Shekhar Mangar wrote:

Can the client not send an optimize command explicitly after all
indexing/deleting is complete?











Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
Hi Andrea,

Looks like I gave you some bad information. I looked at the code and
ran a test locally. The suggest.build and suggest.reload params are in
fact distributed across to all shards but only to one replica of each
shard. This is still bad enough and you should use buildOnOptimize as
suggested but I just wanted to correct the wrong information I gave
earlier.

On Thu, Apr 20, 2017 at 6:23 PM, Andrea Gazzarini  wrote:
> Perfect, I don't need NRT at this moment so that fits perfectly
>
> Thanks,
> Andrea
>
>
> On 20/04/17 14:37, Shalin Shekhar Mangar wrote:
>>
>> Yeah, if it is just once a day then you can afford to do an optimize.
>> For a more NRT indexing approach, I wouldn't recommend optimize at
>> all.
>>
>> On Thu, Apr 20, 2017 at 5:29 PM, Andrea Gazzarini 
>> wrote:
>>>
>>> Ok, many thanks
>>>
>>> I see / read that it should be better to rely on the background merging
>>> instead of issuing explicit optimizes, but I think in this case one
>>> optimize
>>> in a day it shouldn't be a problem.
>>>
>>> Did I get you correctly?
>>>
>>> Thanks again,
>>> Andrea
>>>
>>>
>>> On 20/04/17 13:17, Shalin Shekhar Mangar wrote:

 Can the client not send an optimize command explicitly after all
 indexing/deleting is complete?
>>>
>>>
>>
>>
>



-- 
Regards,
Shalin Shekhar Mangar.


Issues with ingesting to Solr using Flume

2017-04-20 Thread Anantharaman, Srinatha (Contractor)
Hi all,

I am trying to ingest data into Solr 6.3 using Flume 1.5 on the Hortonworks 2.5
platform. I am facing the issue below while sinking the data:

19 Apr 2017 19:54:26,943 ERROR [lifecycleSupervisor-1-3] 
(org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:253)  - 
Unable to start SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@130344d7 counterGroup:{ 
name:null counters:{} } } - Exception follows.
org.kitesdk.morphline.api.MorphlineCompilationException: No command builder 
registered for name: detectMimeType near: {
# /etc/flume/conf/morphline.conf: 48
"detectMimeType" : {
# /etc/flume/conf/morphline.conf: 50
"includeDefaultMimeTypes" : true
}
}

The morphline config file is as below


id : morphline1

importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
#importCommands : ["com.cloudera.**", "org.kitesdk.**"]

commands :
[

  { detectMimeType { includeDefaultMimeTypes : true } }

  {

solrCell {

  solrLocator : ${solrLocator}

  captureAttr : true

  lowernames : true

  capture : [_attachment_body, _attachment_mimetype, basename, content, 
content_encoding, content_type, file, meta,text]

  parsers : [ # { parser : org.apache.tika.parser.txt.TXTParser }

# { parser : org.apache.tika.parser.AutoDetectParser }
  #{ parser : org.apache.tika.parser.asm.ClassParser }
  #{ parser : org.gagravarr.tika.FlacParser }
  #{ parser : 
org.apache.tika.parser.executable.ExecutableParser }
  #{ parser : org.apache.tika.parser.font.TrueTypeParser }
  #{ parser : org.apache.tika.parser.xml.XMLParser }
  #{ parser : org.apache.tika.parser.html.HtmlParser }
  #{ parser : org.apache.tika.parser.image.TiffParser }
  # { parser : org.apache.tika.parser.mail.RFC822Parser }
  #{ parser : org.apache.tika.parser.mbox.MboxParser, 
additionalSupportedMimeTypes : [message/x-emlx] }
  #{ parser : org.apache.tika.parser.microsoft.OfficeParser 
}
  #{ parser : org.apache.tika.parser.hdf.HDFParser }
  #{ parser : org.apache.tika.parser.odf.OpenDocumentParser 
}
  #{ parser : org.apache.tika.parser.pdf.PDFParser }
  #{ parser : org.apache.tika.parser.rtf.RTFParser }
  { parser : org.apache.tika.parser.txt.TXTParser }
  #{ parser : org.apache.tika.parser.chm.ChmParser }
]

 fmap : { content : text }
 }

  }
  { generateUUID { field : id } }

  { sanitizeUnknownSolrFields { solrLocator : ${solrLocator} } }


  { logDebug { format : "output record: {}", args : ["@{}"] } }

  { loadSolr: { solrLocator : ${solrLocator} } }

]

  }

]


I have copied all the required jar files to the Flume classpath. Kindly let me
know the solution for this issue.

Regards,
~Sri



Re: Solr Stream Content from URL

2017-04-20 Thread Alexandre Rafalovitch
Not that I know of.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 19 April 2017 at 14:30, Furkan KAMACI  wrote:
> Hi Alexandre,
>
> My content is protected via Basic Authentication. Is it possible to use
> Basic Authentication with Solr Content Streams?
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Apr 19, 2017 at 9:13 PM, Alexandre Rafalovitch 
> wrote:
>
>> Have you tried stream.url parameter after enabling the
>> enableRemoteStreaming flag?
>> https://cwiki.apache.org/confluence/display/solr/Content+Streams
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>>
>>
>> On 19 April 2017 at 13:27, Furkan KAMACI  wrote:
>> > Hi,
>> >
>> > Is it possible to stream a CSV content from URL to Solr?
>> >
>> > I've tried URLDataSource but could not figure out about what to use as
>> > document.
>> >
>> > Kind Regards,
>> > Furkan KAMACI
>>


Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini

Perfect, I don't need NRT at this moment so that fits perfectly

Thanks,
Andrea

On 20/04/17 14:37, Shalin Shekhar Mangar wrote:

Yeah, if it is just once a day then you can afford to do an optimize.
For a more NRT indexing approach, I wouldn't recommend optimize at
all.

On Thu, Apr 20, 2017 at 5:29 PM, Andrea Gazzarini  wrote:

Ok, many thanks

I see / read that it should be better to rely on the background merging
instead of issuing explicit optimizes, but I think in this case one optimize
in a day it shouldn't be a problem.

Did I get you correctly?

Thanks again,
Andrea


On 20/04/17 13:17, Shalin Shekhar Mangar wrote:

Can the client not send an optimize command explicitly after all
indexing/deleting is complete?









Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
Yeah, if it is just once a day then you can afford to do an optimize.
For a more NRT indexing approach, I wouldn't recommend optimize at
all.

On Thu, Apr 20, 2017 at 5:29 PM, Andrea Gazzarini  wrote:
> Ok, many thanks
>
> I see / read that it should be better to rely on the background merging
> instead of issuing explicit optimizes, but I think in this case one optimize
> in a day it shouldn't be a problem.
>
> Did I get you correctly?
>
> Thanks again,
> Andrea
>
>
> On 20/04/17 13:17, Shalin Shekhar Mangar wrote:
>>
>> Can the client not send an optimize command explicitly after all
>> indexing/deleting is complete?
>
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini

Ok, many thanks

I see / read that it is better to rely on the background merging
instead of issuing explicit optimizes, but I think in this case one
optimize a day shouldn't be a problem.


Did I get you correctly?

Thanks again,
Andrea

On 20/04/17 13:17, Shalin Shekhar Mangar wrote:

Can the client not send an optimize command explicitly after all
indexing/deleting is complete?




Re: Loadbalance for SorCloud using SolrNet

2017-04-20 Thread Mikhail Khludnev
Hello, Vrinda.

AFAIK, SolrCloud is supported by this fork:
https://github.com/dkrupnou/SolrNet/tree/cloud

On Thu, Apr 20, 2017 at 9:47 AM, Vrinda Ashok 
wrote:

> Hello,
>
> I have application  sending request to Shard1 each time, making this
> single point of failure. Please suggest what can I use for Load balancing
> in SolrNet.
>
> Is there something like CloudSolrClient as in SolrJ ? Or will I have to go
> with HA proxy or physical load balance only ?
>
> Please suggest.
>
> Thank you,
> Vrinda Davda



-- 
Sincerely yours
Mikhail Khludnev


Issue regarding range faceting inside pivot facets using solrj

2017-04-20 Thread Naman Asati
Hi

I am using Solr 6.3.0 with Fusion 1.2.8.

I am having an issue doing range faceting INSIDE the pivot faceting using
the solr-solrj-5.0.0.jar.

Let us consider 3 fields A, B and C.
I want to do a range facet for A and pivot faceting of (B, C, Range facet
of A).

Code :
ModifiableSolrParams params = new ModifiableSolrParams();
params.add("facet.range","{!tag=r1}A") ;
params.add("f.A.facet.range.start","0") ;
params.add("f.A.facet.range.end","100") ;
params.add("f.A.facet.range.gap","10") ;
params.add("f.A.facet.range.hardend","true") ;
params.add("facet.pivot","{!range=r1}B,C") ;

The query formed from this works fine when I copy it and run it in the
browser, but through SolrJ it throws a SolrServerException.

java.lang.RuntimeException: unknown key in pivot: ranges
[{A={counts={0=0,8=3,16=4,96=1},gap=8,start=0,end=100}}]

Please help me to figure out how to correctly use solrj to do range
faceting inside pivot faceting.


Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
On Thu, Apr 20, 2017 at 4:27 PM, Andrea Gazzarini  wrote:
> Hi Shalin,
> many thanks for your response. This is my scenario:
>
>  * I build my index once in a day, it could be a delta or a full
>re-index.In any case, that takes some time;
>  * I have an auto-commit (hard, no soft-commits) set to a given period
>and during the indexing cycle, several hard commits are executed. So
>the buildOnCommit (I guess) it's not an option because it will
>rebuild that suggest index several times.

Yes, you're right, multiple commits will cause the suggest index to be
rebuilt needlessly.

>
> But I have a doubt on the second point: the reference guide says:
>
> /"Use buildOnCommit to rebuild the dictionary with every soft-commit"/
>
> As I said, I have no soft-commits only hard-commits: does the rebuild happen
> after hard commits (with buildOnCommit=true)?

I peeked at the code and yes, actually the rebuild happens whenever a
new searcher is created which means that it happens on soft-commits or
on a hard commit with openSearcher=true.

>
> The other option, buildOnOptimize, makes me curious: in the scenario above,
> let's say documents are indexed / deleted every morning at 4am, in a window
> that takes 1 max 3 hours, how can I build the suggest index (more or less)
> just after that window? I'm ok if the build happens after a reasonable delay
> (e.g. 1, max 2 hours)

Can the client not send an optimize command explicitly after all
indexing/deleting is complete?
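
In SolrJ that is a one-liner, assuming a SolrClient instance named client and
a placeholder collection name:

client.optimize("suggestions");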

>
> Many thanks,
> Andrea
>
>
>
> On 20/04/17 11:11, Shalin Shekhar Mangar wrote:
>>
>> Comments inline:
>>
>>
>> On Wed, Apr 19, 2017 at 2:46 PM, Andrea Gazzarini 
>> wrote:
>>>
>>> Hi,
>>> any help out there?
>>>
>>> BTW I forgot the Solr version: 6.5.0
>>>
>>> Thanks,
>>> Andrea
>>>
>>>
>>> On 18/04/17 11:45, Andrea Gazzarini wrote:

 Hi,
 I have a project, with SolrCloud, where I'm going to use the Suggester
 component (BlendedInfixLookupFactory with DocumentDictionaryFactory).
 Some info:

* I will have a suggest-only collection, with no NRT requirements
  (indexes will be updated with a daily frequency)
* I'm not yet sure about the replication factor (I have to do some
  checks)
* I'm using Solrj on the client side

 After reading some documentation I have a couple of doubts:

* how the *suggest.build* command is working? Can I issue this
  command towards just one node, and have that node forward the
  request to the other nodes (so each of them can build its own
  suggester index portion)?
>>
>> The suggest.build only builds locally in the node to which you sent
>> the request. This makes it a bit tricky because if you send that
>> command with just the collection name, it will be resolved to a local
>> core and executed there. The safest/easiest way is to set
>> buildOnCommit or buildOnOptimize in the suggester configuration.
>>
* how things are working at query time? Can I use send a request
  with only suggest.q=... to my /suggest request handler and get
  back distributed suggestions?
>>
>> The SuggestComponent works in distributed mode and it will request and
>> merge results from all shards.
>>
 Thanks in advance
 Andrea
>>>
>>>
>>
>>
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini

Hi Shalin,
many thanks for your response. This is my scenario:

 * I build my index once in a day, it could be a delta or a full
   re-index.In any case, that takes some time;
 * I have an auto-commit (hard, no soft-commits) set to a given period
   and during the indexing cycle, several hard commits are executed. So
   the buildOnCommit (I guess) it's not an option because it will
   rebuild that suggest index several times.

But I have a doubt on the second point: the reference guide says:

/"Use buildOnCommit to rebuild the dictionary with every soft-commit"/

As I said, I have no soft-commits only hard-commits: does the rebuild 
happen after hard commits (with buildOnCommit=true)?


The other option, buildOnOptimize, makes me curious: in the scenario 
above, let's say documents are indexed / deleted every morning at 4am, 
in a window that takes 1 max 3 hours, how can I build the suggest index 
(more or less) just after that window? I'm ok if the build happens after 
a reasonable delay (e.g. 1, max 2 hours)


Many thanks,
Andrea


On 20/04/17 11:11, Shalin Shekhar Mangar wrote:

Comments inline:


On Wed, Apr 19, 2017 at 2:46 PM, Andrea Gazzarini  wrote:

Hi,
any help out there?

BTW I forgot the Solr version: 6.5.0

Thanks,
Andrea


On 18/04/17 11:45, Andrea Gazzarini wrote:

Hi,
I have a project, with SolrCloud, where I'm going to use the Suggester
component (BlendedInfixLookupFactory with DocumentDictionaryFactory).
Some info:

   * I will have a suggest-only collection, with no NRT requirements
 (indexes will be updated with a daily frequency)
   * I'm not yet sure about the replication factor (I have to do some
 checks)
   * I'm using Solrj on the client side

After reading some documentation I have a couple of doubts:

   * how the *suggest.build* command is working? Can I issue this
 command towards just one node, and have that node forward the
 request to the other nodes (so each of them can build its own
 suggester index portion)?

The suggest.build only builds locally in the node to which you sent
the request. This makes it a bit tricky because if you send that
command with just the collection name, it will be resolved to a local
core and executed there. The safest/easiest way is to set
buildOnCommit or buildOnOptimize in the suggester configuration.


   * how things are working at query time? Can I use send a request
 with only suggest.q=... to my /suggest request handler and get
 back distributed suggestions?

The SuggestComponent works in distributed mode and it will request and
merge results from all shards.


Thanks in advance
Andrea









Nodes goes down but never recovers.

2017-04-20 Thread Pranaya Behera
Hi,
 Through SolrJ I am trying to upload configsets and create
collections in my solrcloud.

Setup:
1 Standalone zookeeper listening on 2181 port. version 3.4.10
-- bin/zkServer.sh start
3 Starting solr nodes. (All running from the same solr.home) version
6.5.0 and as well in 6.2.1
-- bin/solr -c -z localhost:2181 -p 8983
-- bin/solr -c -z localhost:2181 -p 8984
-- bin/solr -c -z localhost:2181 -p 8985

The first run of my Java application, which uploads the configs and creates
the collections in Solr through ZooKeeper, is seamless and works
fine.
Here is the clusterstatus after the first run.
https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes-json

Stopped one solr node via:
-- bin/solr stop -p 8985
clusterstatus changed to:
https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes1down-json

Till now everything is as expected.

Here is the remaining part where it confuses me.

Bring the down node back to life. The clusterstatus changed from 2 nodes
down with 1 node not found, to 3 nodes down, including the new node that
was just brought up.
https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-3nodes3down-json
The expected result is that all the other nodes should be in active mode
and this one would be in recovery mode and then become active,
as this node had data before I stopped it using the script.

Now I added one more node to the cluster via
-- bin/solr -c -z localhost:2181 -p 8986
The clusterstatus changed to:
https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-4node3down-json
This one just retains the previous state and adds the node to the cluster.


When bringing up a removed node that was previously in the cluster,
was registered with ZooKeeper, and has data about the collections, can
it be registered as active rather than marking every other node as
down? If so, what is the solution to this?

When we add more nodes to an existing cluster, how do we ensure that a new
node also gets the same collections/data, i.e. basically synchronizes with
the other nodes already present in the cluster, rather than manually
creating a collection for that specific node? As you can see from the
last added node's clusterstate, it is there in live_nodes but
never got the collections into its data dir.
Is there any other way to add a node to the existing cluster along with
the cluster data?

For completeness, here is the code used to upload the config and
create the collection through CloudSolrClient in SolrJ (not the full code,
but the part where the operation happens).
https://gist.github.com/shadow-fox/5874f8b5de93fff0f5bcc8886be81d4d#file-code-java
That's all there is to creating a collection: upload configsets to
ZooKeeper, create the collection, and reload the collection if required.

I have tried this on my local macOS Sierra machine and also in an AWS
environment, with the same effect.



-- 
Thanks & Regards
Pranaya PR Behera


Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
Comments inline:


On Wed, Apr 19, 2017 at 2:46 PM, Andrea Gazzarini  wrote:
> Hi,
> any help out there?
>
> BTW I forgot the Solr version: 6.5.0
>
> Thanks,
> Andrea
>
>
> On 18/04/17 11:45, Andrea Gazzarini wrote:
>>
>> Hi,
>> I have a project, with SolrCloud, where I'm going to use the Suggester
>> component (BlendedInfixLookupFactory with DocumentDictionaryFactory).
>> Some info:
>>
>>   * I will have a suggest-only collection, with no NRT requirements
>> (indexes will be updated with a daily frequency)
>>   * I'm not yet sure about the replication factor (I have to do some
>> checks)
>>   * I'm using Solrj on the client side
>>
>> After reading some documentation I have a couple of doubts:
>>
>>   * how the *suggest.build* command is working? Can I issue this
>> command towards just one node, and have that node forward the
>> request to the other nodes (so each of them can build its own
>> suggester index portion)?

The suggest.build only builds locally in the node to which you sent
the request. This makes it a bit tricky because if you send that
command with just the collection name, it will be resolved to a local
core and executed there. The safest/easiest way is to set
buildOnCommit or buildOnOptimize in the suggester configuration.
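
For reference, those flags live in the suggester definition in solrconfig.xml.
A minimal sketch (suggester name, field, and analyzer type are made up):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">BlendedInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title_txt</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="buildOnCommit">false</str>
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>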

>>   * how things are working at query time? Can I use send a request
>> with only suggest.q=... to my /suggest request handler and get
>> back distributed suggestions?

The SuggestComponent works in distributed mode and it will request and
merge results from all shards.
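
So a request along these lines is enough, with the dictionary name matching
whatever is configured (collection and dictionary names here are placeholders):

/solr/yourCollection/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.q=sol&wt=json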

>>
>> Thanks in advance
>> Andrea
>
>



-- 
Regards,
Shalin Shekhar Mangar.


RE: Loadbalance for SorCloud using SolrNet

2017-04-20 Thread Vrinda Ashok
Thanks, Shawn.

So do you suggest having an external load balancer? Something like HAProxy or
a physical load balancer?

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Thursday, April 20, 2017 12:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Loadbalance for SorCloud using SolrNet

On 4/20/2017 12:47 AM, Vrinda Ashok wrote:
> I have application  sending request to Shard1 each time, making this single 
> point of failure. Please suggest what can I use for Load balancing in SolrNet.
>
> Is there something like CloudSolrClient as in SolrJ ? Or will I have to go 
> with HA proxy or physical load balance only ?

SolrNet was not built by the Solr project.  It was developed by somebody else.

Unless SolrNet has the capability of using more than one base URL to access 
Solr and failing over if one of them becomes unusable, you will need a separate 
load balancer.  I have no idea whether SolrNet has that capability.

Thanks,
Shawn



Re: Loadbalance for SorCloud using SolrNet

2017-04-20 Thread Shawn Heisey
On 4/20/2017 12:47 AM, Vrinda Ashok wrote:
> I have application  sending request to Shard1 each time, making this single 
> point of failure. Please suggest what can I use for Load balancing in SolrNet.
>
> Is there something like CloudSolrClient as in SolrJ ? Or will I have to go 
> with HA proxy or physical load balance only ?

SolrNet was not built by the Solr project.  It was developed by somebody
else.

Unless SolrNet has the capability of using more than one base URL to
access Solr and failing over if one of them becomes unusable, you will
need a separate load balancer.  I have no idea whether SolrNet has that
capability.

Thanks,
Shawn



Loadbalance for SorCloud using SolrNet

2017-04-20 Thread Vrinda Ashok
Hello,

I have an application sending requests to Shard1 each time, making this a
single point of failure. Please suggest what I can use for load balancing in
SolrNet.

Is there something like CloudSolrClient, as in SolrJ? Or will I have to go with
HAProxy or a physical load balancer only?

Please suggest.

Thank you,
Vrinda Davda


Re: BUILD FAILED solr 6.5.0

2017-04-20 Thread Bernd Fehling
Hi Steve,

thanks a lot for solving my problem.

Would it be possible to check for Ivy >= 2.3 at the start of the build
and give your hint as a message to the user?

Regards
Bernd

Am 19.04.2017 um 17:01 schrieb Steve Rowe:
> Hi Bernd,
> 
> Your Ivy may be outdated - the project requires minimum 2.3.
> 
> Try removing all pre-2.3 ivy-*.jar files from ~/.ant/lib/, then running “ant 
> ivy-bootstrap”.
> 
> --
> Steve
> www.lucidworks.com
> 
>> On Apr 19, 2017, at 10:55 AM, Bernd Fehling  
>> wrote:
>>
>> Tried today to have a look at solr 6.5.0.
>> - download solr-6.5.0-src.tgz from apache.org and extracted to workspace
>> - ant eclipse
>> - imported to eclipse neon as new project
>> - from eclipse in lucene subdir clicked on build.xml and selected
>>  "Run As" --> "Ant Build..."
>> - selected "package" and "Run"
>>
>> Result:
>> ...
>>  [javadoc] Loading source files for package org.apache.lucene.search.spans...
>>  [javadoc] Loading source files for package org.apache.lucene.store...
>>  [javadoc] Loading source files for package org.apache.lucene.util...
>>  [javadoc] Loading source files for package 
>> org.apache.lucene.util.automaton...
>>  [javadoc] Loading source files for package org.apache.lucene.util.fst...
>>  [javadoc] Constructing Javadoc information...
>>  [javadoc] Standard Doclet version 1.8.0_121
>>  [javadoc] Building tree for all the packages and classes...
>>  [javadoc] Building index for all the packages and classes...
>>  [javadoc] Building index for all classes...
>> [exec] Result: 128
>>  [jar] Building jar:
>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/test-framework/lucene-test-framework-6.5.0-SNAPSHOT-javadoc.jar
>> javadocs:
>> changes-to-html:
>>[mkdir] Created dir: 
>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/docs/changes
>>   [delete] Deleting: 
>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/doap.lucene.version.dates.csv
>> [copy] Copying 3 files to 
>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/build/docs/changes
>> ivy-availability-check:
>> ivy-fail:
>> ivy-configure:
>> [ivy:configure] :: loading settings :: file = 
>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/top-level-ivy-settings.xml
>> resolve-groovy:
>> [ivy:cachepath] :: resolving dependencies :: 
>> org.codehaus.groovy#groovy-all-caller;working
>> [ivy:cachepath]  confs: [default]
>> [ivy:cachepath]  found org.codehaus.groovy#groovy-all;2.4.8 in public
>> [ivy:cachepath] :: resolution report :: resolve 10ms :: artifacts dl 0ms
>>  -
>>  |  |modules||   artifacts   |
>>  |   conf   | number| search|dwnlded|evicted|| number|dwnlded|
>>  -
>>  |  default |   1   |   0   |   0   |   0   ||   1   |   0   |
>>  -
>> resolve-markdown:
>>
>> BUILD FAILED
>> /srv/www/solr/workspace_neon_solr_6_5_0/solr-6.5.0/lucene/common-build.xml:2415:
>>  ivy:cachepath doesn't support the nested "dependency" element.
>>
>>
>> Any idea what is going wrong?
>>
>> Something with ivy:dependency within ivy:cachepath, but how to fix it?
>>
>> Regards
>> Bernd
> 


Re: Questions about the read and write speed of solr

2017-04-20 Thread Shawn Heisey
On 4/19/2017 6:31 PM, hu.xiaod...@zte.com.cn wrote:
> What is the speed of reading and writing about solr?
> Can someone give me some data of Performance?

This question is too vague to answer.  What exactly are you wanting to
read and write?

Even if you ask a more detailed question, the answer is probably going
to be "it depends."  Performance depends on a lot of things -- the
hardware, the nature of the documents you're storing in Solr, how many
of those documents there are, the nature of the queries you're making,
and other things I haven't thought of right now.

A related but not precisely applicable discussion:

https://lucidworks.com/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Thanks,
Shawn