regarding cursorMark feature for deep pagination

2017-07-18 Thread suresh pendap
Hi,

This question is more about the implementation details of the cursorMark
feature.

I was reading about using the cursorMark feature for deep pagination in
Solr mentioned in this blog http://yonik.com/solr/paging-and-deep-paging/

It is not clear to me how it is more efficient compared to regular
pagination.

The blog says that there is no state maintained on the server side.

If there is no state maintained, then where does it get its efficiency from?

Assuming that it does maintain state on the server side, does the next
page request have to go to the same aggregator node that served the
first page?
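
For context, the cursorMark string itself encodes the sort values of the
last document returned, so a follow-up request can seek straight past that
point instead of collecting and discarding start+rows documents on every
shard; that is where the efficiency comes from, and since no server-side
session is involved, any node can serve any page. A minimal SolrJ sketch
(the ZK address, collection name, and query are hypothetical):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

SolrClient client = new CloudSolrClient.Builder()
        .withZkHost("zk1:2181/solr")           // hypothetical ensemble
        .build();

SolrQuery q = new SolrQuery("*:*");
q.setRows(100);
q.setSort(SolrQuery.SortClause.asc("id"));     // cursors need the uniqueKey as tie-breaker

String cursorMark = CursorMarkParams.CURSOR_MARK_START;   // "*"
boolean done = false;
while (!done) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
    QueryResponse rsp = client.query("collection1", q);
    // ... process rsp.getResults() ...
    String next = rsp.getNextCursorMark();
    done = cursorMark.equals(next);            // an unchanged mark means no more results
    cursorMark = next;
}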


Thanks
Suresh


Re: Copy field a source of copy field

2017-07-18 Thread Erick Erickson
OK, I take it back. Keepwords handle multiple words just fine. So I
have to rewind.

I'm having no trouble at all applying multiple, successive keepwords
filters, even when there are multiple words on a single line in the
keepwords file. Your use of shingles in here is probably going to
confuse things, so I'd probably recommend taking that out until you
work out what's happening with multiple keepwords filters, then add it
back in.

The images you pasted almost look like you're showing the contents of
elevate.xml, but I suspect that's bogus.

But I think this is an XY problem: you're asking about how to chain
copyFields, and we got off into talking about chaining keepwords and
the like. You state:

"So, the requirements here, are to be able to find all species in
species files (step one) and then make a facet with species in file
genus, step two."

Then you say:

"And the second one (genus), which contains genus that has to be for
facet purposes, like this"

How are those reconciled? Do you want facets on the genus+species? Or
just on the genus? Or both? So let's just start over.

What's also missing is why you think you need keepwords in the first
place. Is this a free-text field you're trying to extract
genus/species from? Or do you have the genus/species extracted
already?

Give us two docs, a sample search, and what you want as the outcome.
Because if you just want to facet on genus, then simply copyField to a
"genus" field that strips out everything but the genus (however you
implement that; tricky given sub-species, perhaps).

Ditto if you want to facet on species. Just a species_facet field that
you put whatever you want into. Or just use KeywordTokenizer for
species if you're guaranteed that you want the whole field.

You can then use copyField to copy as you wish.
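
As a sketch of that last suggestion (the field, type, and file names here
are illustrative, not taken from the poster's schema):

<field name="species_text" type="text_general" indexed="true" stored="true"/>
<field name="genus_facet" type="genus_keep" indexed="true" stored="false"/>
<copyField source="species_text" dest="genus_facet"/>

<fieldType name="genus_keep" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeepWordFilterFactory" words="genus.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>

Faceting on genus_facet then only ever sees the terms listed in genus.txt.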

Best,
Erick


On Tue, Jul 18, 2017 at 2:23 PM, tstusr  wrote:
> Well, for me it's kind of strange because it's only working with words that
> have blank spaces. It seems that maybe I'm not explaining it well.
>
> My field is defined as follows:
>
> <fieldType name="..." class="solr.TextField" positionIncrementGap="0">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
>     <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
>     <tokenizer class="..."/>
>     <filter class="solr.ShingleFilterFactory" outputUnigrams="true"/>
>     <filter class="solr.KeepWordFilterFactory" words="species" ignoreCase="true"/>
>     <filter class="solr.KeepWordFilterFactory" words="genus" ignoreCase="true"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="..."/>
>   </analyzer>
> </fieldType>
>
> We have 2 KWF files, "species" and then "genus". It seems that it is only
> working with genus.
>
> Since I'm not able to use copy fields, what choices do I have?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346665.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 6.6.0 - Indexing errors

2017-07-18 Thread Joe Obernberger
Thank you Shawn.  We will be adjusting solr.solr.home to point somewhere
else so that our puppet module will work.  We actually didn't lose any
data since the indexes are in HDFS.  Our configuration for our largest
collection is 100 shards with 3 replicas each on top of HDFS with 3x
replication.  Perhaps overkill.  It's just the core.properties files
that we lost.  I ended up writing a program that uses the
CloudSolrClient to get all the info from zookeeper and then rebuild the
core.properties files.  Looks like it is working.  For example, for a
collection called COL1 with a config called COL1:


File output;
Iterator<Slice> iSlice = mainServer.getZkStateReader().getClusterState()
        .getCollection("COL1").getActiveSlices().iterator();

while (iSlice != null && iSlice.hasNext()) {
    Slice s = iSlice.next();
    Iterator<Replica> replicaIt = s.getReplicas().iterator();
    while (replicaIt != null && replicaIt.hasNext()) {
        Replica r = replicaIt.next();
        System.out.println("Name: " + r.getCoreName());
        System.out.println("CoreNodeName: " + r.getName());
        System.out.println("Node name: " + r.getNodeName());
        System.out.println("Shard: " + s.getName());

        // one directory per core, named <node>/<core>, holding core.properties
        output = new File(r.getNodeName() + "/" + r.getCoreName());
        output.mkdirs();
        output = new File(r.getNodeName() + "/" + r.getCoreName()
                + "/" + "core.properties");

        StringBuilder buff = new StringBuilder();
        buff.append("collection.configName=COL1\n");
        buff.append("name=").append(r.getCoreName());
        buff.append("\nshard=").append(s.getName());
        buff.append("\ncollection=COL1");
        buff.append("\ncoreNodeName=").append(r.getName());
        try {
            setContents(output, buff.toString()); // local helper that writes the string to the file
        } catch (IOException ex) {
            System.out.println("Error writing: " + ex);
        }
    }
}


Then I copied the files to the 45 servers and restarted solr 6.6.0 on 
each.  It came back up OK, and it has been indexing all night long.


-Joe

On 7/17/2017 3:15 PM, Erick Erickson wrote:


On 7/18/2017 12:31 PM, Shawn Heisey wrote:

On 7/17/2017 11:39 AM, Joe Obernberger wrote:
We use puppet to deploy the solr instance to all the nodes.  I 
changed what was deployed to use the CDH jars, but our puppet module 
deletes the old directory and replaces it.  So, all the core 
configuration files under server/solr/ were removed. Zookeeper still 
has the configuration, but the nodes won't come up.


Is there a way around this?  Re-creating these files manually isn't 
realistic; do I need to re-index?


Put the solr home elsewhere so it's not under the program directory 
and doesn't get deleted when you re-deploy Solr.  When starting Solr 
manually with bin/solr, this is done with the -s option.


If you install Solr as a service, which works on operating systems 
with a strong GNU presence (such as Linux), then the solr home will 
typically not be in the program directory.  The configuration script 
(default filename is /etc/default/solr.in.sh) should not get deleted 
if Solr is reinstalled, but I have not confirmed that this is the 
case.  The service installer script is included in the Solr download.


With SolrCloud, deleting all the core data like that will NOT be 
automatically fixed by restarting Solr.  SolrCloud will have lost part 
of its data.  If you have enough replicas left after a loss like that
to remain fully operational, then you'll need to use the DELETEREPLICA 
and ADDREPLICA actions on the Collections API to rebuild the data on 
that server from the leader of each shard.


If the collection is incomplete after the solr home on a server gets 
deleted, you'll probably need to completely delete the collection, 
then recreate it, and reindex.  And you'll need to look into adding 
servers/replicas so the loss of a single server cannot take you offline.


Thanks,
Shawn







Re: Copy field a source of copy field

2017-07-18 Thread tstusr
Well, for me it's kind of strange because it's only working with words that
have blank spaces. It seems that maybe I'm not explaining it well.

My field is defined as follows:

<fieldType name="..." class="solr.TextField" positionIncrementGap="0">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
    <tokenizer class="..."/>
    <filter class="solr.ShingleFilterFactory" outputUnigrams="true"/>
    <filter class="solr.KeepWordFilterFactory" words="species" ignoreCase="true"/>
    <filter class="solr.KeepWordFilterFactory" words="genus" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="..."/>
  </analyzer>
</fieldType>

We have 2 KWF files, "species" and then "genus". It seems that it is only
working with genus.

Since I'm not able to use copy fields, what choices do I have?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346665.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: cpu utilization high

2017-07-18 Thread Erick Erickson
There isn't nearly enough information to say much of anything here. Saying
"Poi" makes me wonder if you're using the extracting request handler, in
which case I'd recommend you move it off Solr, see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/

you might review: https://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

On Tue, Jul 18, 2017 at 9:08 AM, Satya Marivada 
wrote:

> Hi All,
>
> We are using solr-6.3.0 with an external zookeeper. The setup is as below. Poi is
> the collection which is big, about 20G, with each shard at 10G. Each JVM
> has 3G, and the VMs have 70G of RAM. There are 6 processors per VM.
>
> The cpu utilization when running queries is reaching more than 100%. Any
> suggestions? Should I increase the number of cpus on each vm? Is it true
> that a shard can be searched by only one cpu, and that multiple cpus cannot
> be put to work on the same shard because it does not support multiple threading?
>
> Or should the shard be split further? Each poi shard is now at 10G with
> about 8 million documents.
>
>
> [image: image.png]
>


Re: Copy field a source of copy field

2017-07-18 Thread Erick Erickson
Multiple keepwords files work just fine for me.

One issue you're having is that multi-word keepwords aren't going to
do what you expect. The analysis chains work on _tokens_, and only see
one at a time. Plus (apparently) the keepwords input is broken up on
whitespace (the docs aren't entirely clear on this, but it can be
inferred from "one per line").

Even if there were multi-word keepwords, it wouldn't work as you
apparently expect. The problem is that the analysis chain first breaks
the input into tokens. So even if a "single" keepword were "a b", and
your input was "a b", by the time it gets to the keepword filter the
context would be lost. So the filter would see just "a" and say "nope
it doesn't match 'a b', throw it out". Ditto with "b".

Since keepwords are apparently split on whitespace, though, in the
example above both tokens would be kept: the keepword list becomes "a"
and "b", so both match and are kept.
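
A minimal sketch of that token-at-a-time behavior, using the Lucene
classes behind Solr's keepwords filter (the toy keepword set and input
are illustrative):

import java.io.StringReader;
import java.util.Arrays;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.miscellaneous.KeepWordFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.CharArraySet;

// a keepwords "file" containing the line "a b" ends up as the set {"a", "b"}
CharArraySet keep = new CharArraySet(Arrays.asList("a", "b"), true);

Tokenizer tok = new WhitespaceTokenizer();
tok.setReader(new StringReader("a b c"));
TokenStream ts = new KeepWordFilter(tok, keep);
CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);

ts.reset();
while (ts.incrementToken()) {
    // prints "a" then "b"; "c" is dropped, and the filter can never
    // see "a b" as a single token because the tokenizer already split it
    System.out.println(term);
}
ts.end();
ts.close();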

Best,
Erick

On Tue, Jul 18, 2017 at 9:49 AM, tstusr  wrote:
> Well, I have no idea why those images displayed the way they did.
>
> The correct order is:
>
> Field chain analyzer.
> 
>
> KWF-genus file
> 
>
> Test output.
> 
>
> Sorry for the mistake
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346602.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: default values for numRecordsToKeep and maxNumLogsToKeep

2017-07-18 Thread Erick Erickson
I'm going to punt on the rationale since I wasn't involved in that discussion.

numRecordsToKeep can be configured in the <updateLog> section of
solrconfig.xml if you want to change it, though.
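
A sketch of what that looks like (the values are illustrative, not
recommendations):

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
    <int name="numRecordsToKeep">10000</int>
    <int name="maxNumLogsToKeep">100</int>
  </updateLog>
</updateHandler>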

Best,
Erick

On Tue, Jul 18, 2017 at 10:53 AM, suresh pendap  wrote:
> Hi,
> After looking at the source code I see that the default values for
> numRecordsToKeep is 100 and maxNumLogsToKeep is 10.
>
> So it seems that by default a replica can lag by only 1000 document updates
> before it goes into a full recovery from the leader.
>
> I would like to know the rationale for keeping such low values for these
> configurations.
>
> If the idea behind these configuration params is to avoid full recovery for
> down replicas, then shouldn't the default values for these config params be
> higher?
>
> I understand that higher values would mean more disk space consumed by the
> update logs, but the current default values seem to be very low.
>
> Is my understanding of these configuration params correct?
>
> Thanks
> Suresh


default values for numRecordsToKeep and maxNumLogsToKeep

2017-07-18 Thread suresh pendap
Hi,
After looking at the source code I see that the default values for
numRecordsToKeep is 100 and maxNumLogsToKeep is 10.

So it seems that by default a replica can lag by only 1000 document updates
before it goes into a full recovery from the leader.

I would like to know the rationale for keeping such low values for these
configurations.

If the idea behind these configuration params is to avoid full recovery for
down replicas, then shouldn't the default values for these config params be
higher?

I understand that higher values would mean more disk space consumed by the
update logs, but the current default values seem to be very low.

Is my understanding of these configuration params correct?

Thanks
Suresh


Re: Copy field a source of copy field

2017-07-18 Thread tstusr
Well, I have no idea why those images displayed the way they did.

The correct order is:

Field chain analyzer.
 

KWF-genus file
 

Test output.
 

Sorry for the mistake



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346602.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Copy field a source of copy field

2017-07-18 Thread tstusr
It seems that it is just taking the last keep words file.


 

Now, for control purposes, I have in the genus file:


 

And it is just taking the composed term, abutilon aurantiacum.

By testing with
abutilon aurantiacum
abutilon bakerianum


 

It is not possible to put 2 tokenizers in a field, am I right? I think
there is a missing split between the 2 KWFs.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346601.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 6.6.0 - Indexing errors

2017-07-18 Thread Shawn Heisey

On 7/17/2017 11:39 AM, Joe Obernberger wrote:
We use puppet to deploy the solr instance to all the nodes.  I changed 
what was deployed to use the CDH jars, but our puppet module deletes 
the old directory and replaces it.  So, all the core configuration 
files under server/solr/ were removed. Zookeeper still has the 
configuration, but the nodes won't come up.


Is there a way around this?  Re-creating these files manually isn't 
realistic; do I need to re-index?


Put the solr home elsewhere so it's not under the program directory and 
doesn't get deleted when you re-deploy Solr.  When starting Solr 
manually with bin/solr, this is done with the -s option.
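
For example, something along these lines (the path is illustrative):

bin/solr start -s /var/solr/data

so a re-deploy can wipe the program directory without touching the cores.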


If you install Solr as a service, which works on operating systems with 
a strong GNU presence (such as Linux), then the solr home will typically 
not be in the program directory.  The configuration script (default 
filename is /etc/default/solr.in.sh) should not get deleted if Solr is 
reinstalled, but I have not confirmed that this is the case.  The 
service installer script is included in the Solr download.


With SolrCloud, deleting all the core data like that will NOT be 
automatically fixed by restarting Solr.  SolrCloud will have lost part 
of its data.  If you have enough replicas left after a loss like that to 
remain fully operational, then you'll need to use the DELETEREPLICA and 
ADDREPLICA actions on the Collections API to rebuild the data on that 
server from the leader of each shard.


If the collection is incomplete after the solr home on a server gets 
deleted, you'll probably need to completely delete the collection, then 
recreate it, and reindex.  And you'll need to look into adding 
servers/replicas so the loss of a single server cannot take you offline.


Thanks,
Shawn



cpu utilization high

2017-07-18 Thread Satya Marivada
Hi All,

We are using solr-6.3.0 with an external zookeeper. The setup is as below. Poi is
the collection which is big, about 20G, with each shard at 10G. Each JVM
has 3G, and the VMs have 70G of RAM. There are 6 processors per VM.

The cpu utilization when running queries is reaching more than 100%. Any
suggestions? Should I increase the number of cpus on each vm? Is it true
that a shard can be searched by only one cpu, and that multiple cpus cannot
be put to work on the same shard because it does not support multiple threading?

Or should the shard be split further? Each poi shard is now at 10G with
about 8 million documents.


[image: image.png]


Re: Copy field a source of copy field

2017-07-18 Thread Erick Erickson
The code is very simple; at a quick glance it looks like it just reads
the words in, then the "accept" method returns true or false based
on whether the text file contains the token.

Are you sure you reloaded your core/collection and pushed the changed
schema to the right place? The admin/analysis page is very helpful
here: your indexing side should have two keep word filters and you
should be able to see each transformation (uncheck the "verbose"
checkbox for more readability).

Best,
Erick

On Tue, Jul 18, 2017 at 8:49 AM, tstusr  wrote:
> Ok, I know shingling will join with "_".
>
> But that is the behaviour we want. Imagine we have these entries (contained in
> the species file):
>
> abarema idiopoda
> abutilon bakerianum
>
> Those become:
> abarema
> idiopoda
> abutilon
> bakerianum
> abarema_idiopoda
> abutilon_bakerianum
>
> But now my genus file may contain only the word abarema, so we end up with
> a field with only that word.
>
> So the requirements here are to be able to find all species in the species
> file (step one) and then make a facet with the species in the genus file (step two).
>
> It seems reasonable to just chain the fields; I just forgot Solr doesn't
> change the source field, as Shawn points out (thanks for it).
>
> So what we came up with is to make 2 fields, the first with species:
>
> <fieldType name="..." class="solr.TextField" positionIncrementGap="0">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
>     <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
>     <tokenizer class="..."/>
>     <filter class="solr.ShingleFilterFactory" outputUnigrams="true"/>
>     <filter class="solr.KeepWordFilterFactory" words="species" ignoreCase="true"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="..."/>
>     <filter class="solr.ShingleFilterFactory" outputUnigrams="false"/>
>   </analyzer>
> </fieldType>
>
> And the second one (genus), which contains the genus terms and has to be for facet
> purposes, like this:
>
> <fieldType name="..." class="solr.TextField" positionIncrementGap="0">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
>     <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
>     <tokenizer class="..."/>
>     <filter class="solr.ShingleFilterFactory" outputUnigrams="true"/>
>     <filter class="solr.KeepWordFilterFactory" words="species" ignoreCase="true"/>
>     <filter class="solr.KeepWordFilterFactory" words="genus" ignoreCase="true"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="..."/>
>   </analyzer>
> </fieldType>
>
> Nevertheless, there is no second keep word filter processing as I
> expect. Am I missing something?
>
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346593.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Joins in Parallel SQL?

2017-07-18 Thread Erick Erickson
bq: "Is it possible to contribute towards..."

Of course. "Developer documentation" is in short supply, mostly you
have to dive into the code and figure it out. See:
https://wiki.apache.org/solr/HowToContribute for getting the code,
setting up an IDE etc.

I often find the most useful approach is to look at the junit tests
and step through one that looks relevant in an IDE to start getting my
head around unfamiliar code.

Welcome to the world of open source ;)
Erick

On Tue, Jul 18, 2017 at 11:19 AM,   wrote:
> Is it possible to contribute towards building this capability? What part of 
> developer documentation would be suitable for this?
>
> Regards,
> Imran
>
> Sent from Mail for Windows 10
>
> From: Joel Bernstein
> Sent: Thursday, July 6, 2017 7:40 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Joins in Parallel SQL?
>
> Joins and OFFSET are not currently supported with Parallel SQL.
>
> The docs for parallel SQL cover all the supported features. Any syntax not
> covered in the docs is likely not supported.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Jul 6, 2017 at 2:40 PM,  wrote:
>
>>
>> Is it possible to join documents from different collections through
>> Parallel SQL?
>>
>> In addition to the LIMIT feature on Parallel SQL, can we use OFFSET to
>> implement paging?
>>
>> Thanks,
>> Imran
>>
>>
>> Sent from Mail for Windows 10
>>
>>
>


Re: Copy field a source of copy field

2017-07-18 Thread tstusr
Ok, I know shingling will join with "_".

But that is the behaviour we want. Imagine we have these entries (contained in
the species file):

abarema idiopoda
abutilon bakerianum

Those become:
abarema 
idiopoda
abutilon 
bakerianum
abarema_idiopoda
abutilon_bakerianum

But now my genus file may contain only the word abarema, so we end up with
a field with only that word.

So the requirements here are to be able to find all species in the species
file (step one) and then make a facet with the species in the genus file (step two).

It seems reasonable to just chain the fields; I just forgot Solr doesn't
change the source field, as Shawn points out (thanks for it).

So what we came up with is to make 2 fields, the first with species:

<fieldType name="..." class="solr.TextField" positionIncrementGap="0">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
    <tokenizer class="..."/>
    <filter class="solr.ShingleFilterFactory" outputUnigrams="true"/>
    <filter class="solr.KeepWordFilterFactory" words="species" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="..."/>
    <filter class="solr.ShingleFilterFactory" outputUnigrams="false"/>
  </analyzer>
</fieldType>

And the second one (genus), which contains the genus terms and has to be for facet
purposes, like this:

<fieldType name="..." class="solr.TextField" positionIncrementGap="0">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping/mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[0-9]+|(\-)(\s*)" replacement=""/>
    <tokenizer class="..."/>
    <filter class="solr.ShingleFilterFactory" outputUnigrams="true"/>
    <filter class="solr.KeepWordFilterFactory" words="species" ignoreCase="true"/>
    <filter class="solr.KeepWordFilterFactory" words="genus" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="..."/>
  </analyzer>
</fieldType>

Nevertheless, there is no second keep word filter processing as I
expect. Am I missing something?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346593.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Need guidance for distributing data base on date interval in a collection

2017-07-18 Thread Charlie Hull
Hi,

You should also consider how you should shard for best performance: for
example, if most of your queries are for recent documents, you could end up
with them all hitting only one shard. Here's an old blog we wrote on this
subject (it mentions another open source engine, Xapian, but ignore that as
the same principles apply to Solr).

HTH

Charlie

On 18 July 2017 at 09:16, Modassar Ather  wrote:

> Hi Rehman,
>
> You may want to look into how the documents are routed on different shards.
> For that you can look into following documentation.
> https://cwiki.apache.org/confluence/display/solr/
> Shards+and+Indexing+Data+in+SolrCloud
>
> Basically, it is the id of the document which, when prefixed with a certain
> attribute, helps decide which shard the document actually goes to.
> So a document id with a date prefix may be helpful (see the sketch at the
> end of this message).
>
> Best,
> Modassar
>
>
>
>
>
> On Tue, Jul 18, 2017 at 1:08 PM, Atita Arora  wrote:
>
> > Hi Rehman,
> > I am not sure about your use case,  but why wouldn't you consider
> creating
> > shard for a particular date range like within a week from current date,
> 15
> > days,  a month and so on and so forth.
> >
> > I have done a similar implementation elsewhere.
> > Can you tell more about your use case?
> >
> > Atita
> >
> > On Jul 18, 2017 1:04 PM, "rehman kahloon"  > invalid>
> > wrote:
> >
> > Hello Sir/Madam, I am new to SolrCloud, having ORACLE
> > technologies experience.
> > Nowadays, I am comparing Oracle and SolrCloud using big data.
> > So I want to know how I can create time-interval sharding.
> > E.g., I have 10 machines, each machine for one shard and one date's data. So
> > how can I make the next day's data go to the next shard, and so on?
> >
> > I have searched a lot but have not found any command/way that handles it from some
> > core/shard file.
> > So I request you to please guide me.
> > Thanks in advance.
> > Kind Regards, Muhammad Rehman Kahloon, mrehman_kahl...@yahoo.com
> >
>
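
As a concrete sketch of the id-prefix routing Modassar describes (the
document id, collection name, and ZK address are hypothetical):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

CloudSolrClient client = new CloudSolrClient.Builder()
        .withZkHost("zk1:2181/solr")      // hypothetical ensemble
        .build();

// With the default compositeId router, the part of the id before "!"
// is hashed to choose the shard, so all documents that share a day
// prefix land in the same hash range.
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "2017-07-18!98765");
client.add("daily_collection", doc);
client.commit("daily_collection");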


RE: StringIndexOutOfBoundsException "in" SpellCheckCollator.getCollation

2017-07-18 Thread Umoreno
Hi. Was this issue ever solved? I am facing a similar one.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/StringIndexOutOfBoundsException-in-SpellCheckCollator-getCollation-tp4312517p4346582.html
Sent from the Solr - User mailing list archive at Nabble.com.


Short Circuit Reads -

2017-07-18 Thread Joe Obernberger

Hi All - does SolrCloud support using Short Circuit Reads when using HDFS?

Thanks!

-Joe



Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Shawn Heisey
On 7/18/2017 5:10 AM, Markus Jelsma wrote:
> The problem was never resolved but Shawn asked for the stack trace, here it 
> is:

> Caused by: java.lang.IllegalStateException: Connection pool shut down 
> at org.apache.http.util.Asserts.check(Asserts.java:34) 

As I suspected, it is the connection pool inside HttpClient that is shut
down (closed).

Earlier today, before I came into the office, I asked the HttpClient user
list whether this could ever happen for a reason other than an explicit
close/shutdown.  They looked at the code and found that the exception
is only thrown if the "isShutDown" boolean flag is true, and the only
place it ever gets set to true is when an explicit shutdown is called
on the connection pool.

When a solr client is built without an external HttpClient, calling
close() on the solr client will shut down the internal HttpClient.  If
an external HttpClient is used, the user code would need to shut it down
for this to happen.  Recent versions of SolrJ are using
CloseableHttpClient, which will shut down the connection pool if close()
is called.

It's looking like this error has happened because the HttpClient object
inside the solr client has been shut down explicitly, which might have
happened because one of the outer layers had close() called.
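
A minimal sketch of the lifecycle in question (the ZK address and
collection name are hypothetical); the client, and therefore its internal
pool, has to live as long as the code that issues requests through it:

import org.apache.solr.client.solrj.impl.CloudSolrClient;

CloudSolrClient client = new CloudSolrClient.Builder()
        .withZkHost("zk1:2181,zk2:2181,zk3:2181/solr")
        .build();
client.setDefaultCollection("collection1");

// ... many getById()/query() calls over the life of the application ...

// After this, the internal HttpClient's connection pool is gone for good;
// any further request fails with "Connection pool shut down".
client.close();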

Thanks,
Shawn



Re: Get results in multiple orders (multiple boosts)

2017-07-18 Thread alessandro.benedetti
"I have different "sort preferences", so I can't build a index and use for
sorting.Maybe I have to sort by category then by source and by language or
by source, then by category and by date"

I would like to focus on this bit.
It is ok to go for a custom function and sort at query time, but I am
curious to explore why an index time solution should not be ok.
You can have these distinct fields :
source_priority
language_priority
category_priority 
ect

This values can be assigned at the documents at indexing time ( using for
example a custom update request processor).
Then at query time you can easily sort on those values in a multi layered
approach :
sort:source_priority desc, category_priority  desc
Of course, if the priority for a source changes quite often or if it's user
dependent, a query time solution would be preferred.
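
A sketch of wiring such an index-time processor into solrconfig.xml (the
chain name and script file are hypothetical; the script itself would set
the *_priority fields from a lookup table):

<updateRequestProcessorChain name="add-priorities">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">add-priorities.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>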





-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Get-results-in-multiple-orders-multiple-boosts-tp4346304p4346559.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-18 Thread Walter Underwood
The entire point of a Zookeeper cluster is that it continues to be available 
when one (or more) nodes are down.

If you want more failure tolerance, run a five node Zookeeper cluster instead 
of a three node cluster.

Hacking the client will not increase robustness. Right now, you are hurting 
robustness by being too clever with the client.

Hacking the client is not a last choice, it is a bad choice.

For queries, there is not much benefit in running the cloud-aware client. A 
regular load balancer works just about as well. We use the Amazon load 
balancers.
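
For example, a query-side client can simply point at the load balancer
(the URL is hypothetical):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

SolrClient queryClient =
        new HttpSolrClient.Builder("http://search-lb.example.com/solr/collection1")
                .build();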

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Jul 18, 2017, at 3:25 AM, wg85907  wrote:
> 
> I don't mean that my Zookeeper cluster is rebooting frequently; I just want to
> ensure my query service can stay stable when the Zookeeper cluster has issues
> or reboots. Will do some tests to check if there is an issue here. Maybe the
> current Zookeeper client can handle this case well. Hacking the client will
> always be the last choice.
> Regards,
> Geng, Wei
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Create-too-many-zookeeper-connections-when-recreate-CloudSolrServer-instance-tp4346040p4346528.html
> Sent from the Solr - User mailing list archive at Nabble.com.



RE: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Markus Jelsma
Hello Susheel,

Yes, the closing happens only at the end of the checking cycle. I asked my 
colleague about the firewall and he is positive everything is allowed between 
those nodes. I also cannot completely drop the firewall between those nodes to 
be sure, because the problem is very hard to reproduce; it pops up once in a 
while, sometimes not for weeks, today already a couple of times.

It is locally unreproducible but we're going to try to reproduce it in our 
development environment. So I have to get back to the problem in a few weeks 
from now.

The number of requests is sometimes, briefly, very high; usually it is very low. These 
specific checks are executed in order, not concurrently.

Thanks,
Markus
 
-Original message-
> From:Susheel Kumar 
> Sent: Tuesday 18th July 2017 15:17
> To: solr-user@lucene.apache.org
> Subject: Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace
> 
> Then most likely it's due to the closing of the connection as mentioned above,
> though you said it's not happening in that part of your code.  To rule out
> the firewall possibility, you can test in some other/local env.  Also, how many
> requests/clients/connections are happening concurrently?
> 
> Thanks,
> Susheel
> 
> On Tue, Jul 18, 2017 at 8:43 AM, Markus Jelsma 
> wrote:
> 
> > Hello Susheel,
> >
> > No, nothing at all. I've checked all six nodes; they are clean.
> >
> > Thanks,
> > Markus
> >
> >
> >
> > -Original message-
> > > From:Susheel Kumar 
> > > Sent: Tuesday 18th July 2017 14:30
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace
> > >
> > > Do you see any errors etc. in solr.log during this time?
> > >
> > > On Tue, Jul 18, 2017 at 7:10 AM, Markus Jelsma <
> > markus.jel...@openindex.io>
> > > wrote:
> > >
> > > > The problem was never resolved but Shawn asked for the stack trace,
> > here
> > > > it is:
> > > >
> > > > org.apache.solr.client.solrj.SolrServerException: java.lang.
> > IllegalStateException:
> > > > Connection pool shut down
> > > > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > > > doRequest(LBHttpSolrClient.java:485)
> > > > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(
> > > > LBHttpSolrClient.java:388)
> > > > at org.apache.solr.client.solrj.impl.CloudSolrClient.
> > > > sendRequest(CloudSolrClient.java:1383)
> > > > at org.apache.solr.client.solrj.impl.CloudSolrClient.
> > > > requestWithRetryOnStaleState(CloudSolrClient.java:1134)
> > > > at org.apache.solr.client.solrj.impl.CloudSolrClient.request(
> > > > CloudSolrClient.java:1073)
> > > > at org.apache.solr.client.solrj.SolrRequest.process(
> > SolrRequest.java:160)
> > > > at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
> > > > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.
> > java:1173)
> > > > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.
> > java:1090)
> > > > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.
> > java:1110)
> > > > ..internal method calling getById()..
> > > > at java.lang.Thread.run(Thread.java:748)
> > > > Caused by: java.lang.IllegalStateException: Connection pool shut down
> > > > at org.apache.http.util.Asserts.check(Asserts.java:34)
> > > > at org.apache.http.pool.AbstractConnPool.lease(
> > AbstractConnPool.java:184)
> > > > at org.apache.http.pool.AbstractConnPool.lease(
> > AbstractConnPool.java:217)
> > > > at org.apache.http.impl.conn.PoolingClientConnectionManager
> > > > .requestConnection(PoolingClientConnectionManager.java:184)
> > > > at org.apache.http.impl.client.DefaultRequestDirector.execute(
> > > > DefaultRequestDirector.java:415)
> > > > at org.apache.http.impl.client.AbstractHttpClient.doExecute(
> > > > AbstractHttpClient.java:882)
> > > > at org.apache.http.impl.client.CloseableHttpClient.execute(
> > > > CloseableHttpClient.java:82)
> > > > at org.apache.http.impl.client.CloseableHttpClient.execute(
> > > > CloseableHttpClient.java:55)
> > > > at org.apache.solr.client.solrj.impl.HttpSolrClient.
> > > > executeMethod(HttpSolrClient.java:515)
> > > > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> > > > HttpSolrClient.java:279)
> > > > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> > > > HttpSolrClient.java:268)
> > > > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > > > doRequest(LBHttpSolrClient.java:447)
> > > > ... 24 more
> > > >
> > > > So, to summarize, we have a program checking presence of documents in
> > Solr
> > > > using getById() and we don't want this exception to bubble up, we want
> > > > SolrJ to restore the connection pool just as CloudSolrClient would
> > move on
> > > > to another node if one went down in the mean time.
> > > >
> > > > Is this possible? How?
> > > >
> > > > Many thanks,
> > > > Markus
> > > >
> > > > -Original message-
> > > > > From:Markus Jelsma 

Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Susheel Kumar
Then most likely it's due to the closing of the connection as mentioned above,
though you said it's not happening in that part of your code.  To rule out
the firewall possibility, you can test in some other/local env.  Also, how many
requests/clients/connections are happening concurrently?

Thanks,
Susheel

On Tue, Jul 18, 2017 at 8:43 AM, Markus Jelsma 
wrote:

> Hello Susheel,
>
> No, nothing at all. I've checked all six nodes; they are clean.
>
> Thanks,
> Markus
>
>
>
> -Original message-
> > From:Susheel Kumar 
> > Sent: Tuesday 18th July 2017 14:30
> > To: solr-user@lucene.apache.org
> > Subject: Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace
> >
> > Do you see any errors etc. in solr.log during this time?
> >
> > On Tue, Jul 18, 2017 at 7:10 AM, Markus Jelsma <
> markus.jel...@openindex.io>
> > wrote:
> >
> > > The problem was never resolved but Shawn asked for the stack trace,
> here
> > > it is:
> > >
> > > org.apache.solr.client.solrj.SolrServerException: java.lang.
> IllegalStateException:
> > > Connection pool shut down
> > > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > > doRequest(LBHttpSolrClient.java:485)
> > > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(
> > > LBHttpSolrClient.java:388)
> > > at org.apache.solr.client.solrj.impl.CloudSolrClient.
> > > sendRequest(CloudSolrClient.java:1383)
> > > at org.apache.solr.client.solrj.impl.CloudSolrClient.
> > > requestWithRetryOnStaleState(CloudSolrClient.java:1134)
> > > at org.apache.solr.client.solrj.impl.CloudSolrClient.request(
> > > CloudSolrClient.java:1073)
> > > at org.apache.solr.client.solrj.SolrRequest.process(
> SolrRequest.java:160)
> > > at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
> > > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.
> java:1173)
> > > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.
> java:1090)
> > > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.
> java:1110)
> > > ..internal method calling getById()..
> > > at java.lang.Thread.run(Thread.java:748)
> > > Caused by: java.lang.IllegalStateException: Connection pool shut down
> > > at org.apache.http.util.Asserts.check(Asserts.java:34)
> > > at org.apache.http.pool.AbstractConnPool.lease(
> AbstractConnPool.java:184)
> > > at org.apache.http.pool.AbstractConnPool.lease(
> AbstractConnPool.java:217)
> > > at org.apache.http.impl.conn.PoolingClientConnectionManager
> > > .requestConnection(PoolingClientConnectionManager.java:184)
> > > at org.apache.http.impl.client.DefaultRequestDirector.execute(
> > > DefaultRequestDirector.java:415)
> > > at org.apache.http.impl.client.AbstractHttpClient.doExecute(
> > > AbstractHttpClient.java:882)
> > > at org.apache.http.impl.client.CloseableHttpClient.execute(
> > > CloseableHttpClient.java:82)
> > > at org.apache.http.impl.client.CloseableHttpClient.execute(
> > > CloseableHttpClient.java:55)
> > > at org.apache.solr.client.solrj.impl.HttpSolrClient.
> > > executeMethod(HttpSolrClient.java:515)
> > > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> > > HttpSolrClient.java:279)
> > > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> > > HttpSolrClient.java:268)
> > > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > > doRequest(LBHttpSolrClient.java:447)
> > > ... 24 more
> > >
> > > So, to summarize, we have a program checking presence of documents in
> Solr
> > > using getById() and we don't want this exception to bubble up, we want
> > > SolrJ to restore the connection pool just as CloudSolrClient would
> move on
> > > to another node if one went down in the mean time.
> > >
> > > Is this possible? How?
> > >
> > > Many thanks,
> > > Markus
> > >
> > > -Original message-
> > > > From:Markus Jelsma 
> > > > Sent: Thursday 29th June 2017 16:38
> > > > To: solr-user@lucene.apache.org
> > > > Subject: RE: SolrJ 6.6.0 Connection pool shutdown
> > > >
> > > > Thanks. I probably should have mentioned there is no firewall
> limiting
> > > connections between those hosts. Actually, the processes run on the
> same
> > > hosts as the Solr cluster is running on.
> > > >
> > > > Thanks,
> > > > Markus
> > > >
> > > >
> > > >
> > > > -Original message-
> > > > > From:Alexandre Rafalovitch 
> > > > > Sent: Thursday 29th June 2017 15:38
> > > > > To: solr-user 
> > > > > Subject: Re: SolrJ 6.6.0 Connection pool shutdown
> > > > >
> > > > > One thing to check is whether there is a firewall between the
> client
> > > > > and the server. They - sometimes - cut the silent connections in
> the
> > > > > _middle_ (at the firewall). The usual solution is keepAlive
> request of
> > > > > some kind or not using the connection pool.
> > > > >
> > > > > One way to check is with network tracer like Wireshark and checking
> > > > > whether the actual hardware at the 

RE: Enabling SSL

2017-07-18 Thread Miller, William K - Norman, OK - Contractor
Thank you all for your responses.  I finally got it straightened out.  I had 
forgotten to change my url from http to https.  Dumb mistake on my part.  
Consider this issue closed.




~~~
William Kevin Miller

ECS Federal, Inc.
USPS/MTSC
(405) 573-2158


-Original Message-
From: Nawab Zada Asad Iqbal [mailto:khi...@gmail.com] 
Sent: Wednesday, July 12, 2017 12:50 PM
To: solr-user@lucene.apache.org
Subject: Re: Enabling SSL

I guess your certificates are self-generated?  In that case, this is a browser 
nanny trying to protect you.
I also get the same error in Firefox; however, Chrome was a little forgiving. It 
showed me an option to choose my certificate (the client certificate), and then 
bypassed the safety barrier.
I should add that even Chrome didn't show me that 'select certificate'
option on the first attempt, so I don't know what caused it to trigger.

Here is a relevant thread about Firefox:
https://bugzilla.mozilla.org/show_bug.cgi?id=1255049


Let me know how it worked for you, as I am still learning this myself.


Regards
Nawab



On Wed, Jul 12, 2017 at 9:05 AM, Miller, William K - Norman, OK - Contractor 
 wrote:

> I am not using Zookeeper.  Is the urlScheme also used outside of Zookeeper?
>
>
>
>
> ~~~
> William Kevin Miller
>
> ECS Federal, Inc.
> USPS/MTSC
> (405) 573-2158
>
>
> -Original Message-
> From: esther.quan...@lucidworks.com 
> [mailto:esther.quan...@lucidworks.com]
> Sent: Wednesday, July 12, 2017 10:58 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Enabling SSL
>
> Hi William,
>
> You should be able to navigate to https://localhost:8983/solr (albeit 
> with your host:port) to access the admin UI, provided you updated the 
> urlScheme property in the Zookeeper cluster props.
>
> Did you complete that step?
>
> Esther
> Search Engineer
> Lucidworks
>
>
>
> > On Jul 12, 2017, at 08:20, Miller, William K - Norman, OK - 
> > Contractor <
> william.k.mil...@usps.gov.INVALID> wrote:
> >
> > I am trying to enable SSL and I have followed the instructions in 
> > the
> Solr 6.4 reference manual, but when I restart my Solr server and try 
> to access the Solr Admin page I am getting:
> >
> > “This page isn’t working”;
> >  sent an invalid response; ERR_INVALID_HTTP_RESPONSE
> >
> > Does the Solr server need to be on a secure server in order to 
> > enable
> SSL.
> >
> >
> > Additional Info:
> > Running Solr 6.5.1 on Linux OS
> >
> >
> >
> >
> > ~~~
> > William Kevin Miller
> >
> > ECS Federal, Inc.
> > USPS/MTSC
> > (405) 573-2158
> >
>


Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-18 Thread Shawn Heisey
On 7/17/2017 2:48 AM, wg85907 wrote:
> Thanks for your detailed explanation. The reason I want to shut down the 
> CloudSolrServer instance and create a new one is that I am concerned about 
> whether it can successfully reconnect to the Zookeeper server if the Zookeeper 
> cluster has some issue and reboots.

I know that as long as the zookeeper ensemble (which is three or more ZK
servers working together) does not lose quorum, and Solr is connected to
all of the servers in the ensemble, Solr will be fine.

I have heard someone on the list say that if ZK loses quorum (which
means that the number of running servers drops below a required minimum)
then Solr doesn't recover correctly when quorum is re-established.  If
you have three servers, then at least two of them must be working to
maintain quorum.  If there are five servers, then at least three of them
must be working.

I do not think that the problem described above has been confirmed as an
issue.  If it does turn out to be true, then the problem is not likely
to be in Solr, but in ZK -- Solr uses the ZK client, which completely
manages that communication.

Thanks,
Shawn



Re: Get results in multiple orders (multiple boosts)

2017-07-18 Thread Susheel Kumar
As Erick suggested, it's possible by sorting with a custom function. You may
have to use the if, sum, and exists functions, etc., to come up with a custom
score and sort on it. The if conditions would check for the conditions
mentioned and keep adding to the score.
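
A sketch of such a function sort, reusing Luca's categories and sources
(the weights are illustrative; line breaks added for readability only):

sort=sum(
      if(exists(query({!v='category:9500'})),200,0),
      if(exists(query({!v='source:5'})),30,0),
      if(exists(query({!v='source:9'})),20,0),
      if(exists(query({!v='source:7'})),10,0)
    ) desc, date desc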

Thanks,
Susheel

On Tue, Jul 18, 2017 at 6:23 AM, Luca Dall'Osto <
tenacious...@yahoo.it.invalid> wrote:

> Hello everyone, thanks for the prompt reply!
> In response to Florian: I can get a correct score only when boosting on 1
> field (for example category): the scores are correctly increased by the
> factor. But when I try to make a double boost, the scores are not as
> expected (for example, if the greatest boost factor for category is 3 and
> for source is 3, sometimes I've got documents boosted with category^3 and
> source^1 **before** documents boosted with category^3 and source^3).
> I tried your snippet code "^=[boost]" instead of "^[factor]" but it seems not
> to work for me: SyntaxError: Cannot parse 'category:9500^': Encountered
> \"\" at line 1 (...)
> In response to Erick: I have different "sort preferences", so I can't build
> an index and use it for sorting. Maybe I have to sort by category, then by
> source and by language; or by source, then by category and by date. The sort
> function that you are talking about, is it a custom sort function of Solr? I
> built a custom PHP method that sorts documents queried from Solr and it works
> well: the problem is that if I choose this way I have to get **all** the
> results in a single query and I can't paginate. Today I have millions of
> records and queries could be very large; I would like to paginate results.
> To paginate results I need Solr to give me the query results in the correct
> order. Thank you very much
>
> Luca
>
>
>
> On Monday, July 17, 2017 6:04 PM, Erick Erickson <
> erickerick...@gmail.com> wrote:
>
>
>  I don't think boosting is really what you want here. Boosting
> _influences_ the score, it does not impose an ordering.
>
> Sorting _does_ impose an ordering, the question is how to sort and the
> answer depends on how fixed (or not) the sorting criteria are. Do they
> change with different queries? If not, the very simplest thing to do
> is to index a field with a pre-computed sort value. IOW, if your
> ordering is _always_ source 5, 9, 7 index a source_sort field that
> orders things that way and sort on that. Then I'd have a secondary
> sort by score as a tie-breaker.
>
> If that's not the case, perhaps sorting by function (perhaps a custom
> function) would work.
>
> Best,
> Erick
>
> On Mon, Jul 17, 2017 at 4:30 AM, Florian Waltersdorfer
>  wrote:
> > Hi,
> >
> > I am quite the SolR newbie myself, but have you looked at the resulting
> scores, e.g. via fl=*,score (that way, you can see/test how your boosting
> affects the results)?
> > In a similar scenario, I am using fixed value boosts for specific field
> values; "^=[boost]" instead of "^[factor]", for example:
> >
> > category:9500^=20  source:(5^=20 OR 9^=10 OR 7^=5)
> >
> > (Actual fixed values open for experimentation.)
> >
> > Regards,
> > Florian
> >
> > -Ursprüngliche Nachricht-
> > Von: Luca Dall'Osto [mailto:tenacious...@yahoo.it.INVALID]
> > Gesendet: Montag, 17. Juli 2017 12:20
> > An: solr-user@lucene.apache.org
> > Betreff: Get results in multiple orders (multiple boosts)
> >
> > Hello,
> > I'm new to Solr (and to mailing lists...), and I have a question about
> > querying contents in multiple custom orders.
> > I'm trying to query some documents boosted by 2 (or more) fields. I'm
> > able to make a search over 2 days and return results boosted by the
> > category field, like this:
> >
> > ?indent=on
> > &defType=edismax
> > &q=(date:[2017-06-16T00:00:00Z TO 2017-06-18T23:59:59Z])
> > &bq=category:9500^2
> > &bq=category:1100^1
> > &rows=40
> > &wt=json
> >
> > This will return all documents of category 9500 first, and 1100 after.
> > Now I would like to get these documents with a second boost based on
> > another field, called source. I would like to have documents in this
> > order:
> > 1) category:9500 AND source:5
> > 2) category:9500 AND source:9
> > 3) category:9500 AND source:7
> > 4) category:1100 AND source:5
> > 5) category:1100 AND source:9
> > 6) category:1100 AND source:7
> > To get this order, I tried this query:
> >
> > ?indent=on
> > &defType=edismax
> > &q=(date:[2017-06-16T00:00:00Z TO 2017-06-18T23:59:59Z])
> > &bq=category:9500^2+source:(5^3 OR 9^2 OR 7^1)
> > &bq=category:1100^1+source:(5^3 OR 9^2 OR 7^1)
> > &rows=40
> > &wt=json
> >
> > How can I apply a double boost to get the documents in my correct
> > order? Is boost the correct tool for my purpose? Any help will be
> > greatly appreciated. Thanks, Luca
>
>
>


RE: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Markus Jelsma
Hello Susheel,

No, nothing at all. I've checked all six nodes; they are clean.

Thanks,
Markus

 
 
-Original message-
> From:Susheel Kumar 
> Sent: Tuesday 18th July 2017 14:30
> To: solr-user@lucene.apache.org
> Subject: Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace
> 
> Do you see any errors etc. in solr.log during this time?
> 
> On Tue, Jul 18, 2017 at 7:10 AM, Markus Jelsma 
> wrote:
> 
> > The problem was never resolved but Shawn asked for the stack trace, here
> > it is:
> >
> > org.apache.solr.client.solrj.SolrServerException: 
> > java.lang.IllegalStateException:
> > Connection pool shut down
> > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > doRequest(LBHttpSolrClient.java:485)
> > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(
> > LBHttpSolrClient.java:388)
> > at org.apache.solr.client.solrj.impl.CloudSolrClient.
> > sendRequest(CloudSolrClient.java:1383)
> > at org.apache.solr.client.solrj.impl.CloudSolrClient.
> > requestWithRetryOnStaleState(CloudSolrClient.java:1134)
> > at org.apache.solr.client.solrj.impl.CloudSolrClient.request(
> > CloudSolrClient.java:1073)
> > at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
> > at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
> > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1173)
> > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1090)
> > at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1110)
> > ..internal method calling getById()..
> > at java.lang.Thread.run(Thread.java:748)
> > Caused by: java.lang.IllegalStateException: Connection pool shut down
> > at org.apache.http.util.Asserts.check(Asserts.java:34)
> > at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:184)
> > at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:217)
> > at org.apache.http.impl.conn.PoolingClientConnectionManager
> > .requestConnection(PoolingClientConnectionManager.java:184)
> > at org.apache.http.impl.client.DefaultRequestDirector.execute(
> > DefaultRequestDirector.java:415)
> > at org.apache.http.impl.client.AbstractHttpClient.doExecute(
> > AbstractHttpClient.java:882)
> > at org.apache.http.impl.client.CloseableHttpClient.execute(
> > CloseableHttpClient.java:82)
> > at org.apache.http.impl.client.CloseableHttpClient.execute(
> > CloseableHttpClient.java:55)
> > at org.apache.solr.client.solrj.impl.HttpSolrClient.
> > executeMethod(HttpSolrClient.java:515)
> > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> > HttpSolrClient.java:279)
> > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> > HttpSolrClient.java:268)
> > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > doRequest(LBHttpSolrClient.java:447)
> > ... 24 more
> >
> > So, to summarize, we have a program checking presence of documents in Solr
> > using getById() and we don't want this exception to bubble up, we want
> > SolrJ to restore the connection pool just as CloudSolrClient would move on
> > to another node if one went down in the mean time.
> >
> > Is this possible? How?
> >
> > Many thanks,
> > Markus
> >
> > -Original message-
> > > From:Markus Jelsma 
> > > Sent: Thursday 29th June 2017 16:38
> > > To: solr-user@lucene.apache.org
> > > Subject: RE: SolrJ 6.6.0 Connection pool shutdown
> > >
> > > Thanks. I probably should have mentioned there is no firewall limiting
> > connections between those hosts. Actually, the processes run on the same
> > hosts as the Solr cluster is running on.
> > >
> > > Thanks,
> > > Markus
> > >
> > >
> > >
> > > -Original message-
> > > > From:Alexandre Rafalovitch 
> > > > Sent: Thursday 29th June 2017 15:38
> > > > To: solr-user 
> > > > Subject: Re: SolrJ 6.6.0 Connection pool shutdown
> > > >
> > > > One thing to check is whether there is a firewall between the client
> > > > and the server. They - sometimes - cut the silent connections in the
> > > > _middle_ (at the firewall). The usual solution is keepAlive request of
> > > > some kind or not using the connection pool.
> > > >
> > > > One way to check is with network tracer like Wireshark and checking
> > > > whether the actual hardware at the other end of the connection is a
> > > > normal server or some sort of unexpected hardware piece of equipment
> > > > (firewall). Yes, that's using the hammer to swat a fly :-)
> > > >
> > > > Regards,
> > > >Alex.
> > > > 
> > > > http://www.solr-start.com/ - Resources for Solr users, new and
> > experienced
> > > >
> > > >
> > > > On 29 June 2017 at 08:21, Markus Jelsma 
> > wrote:
> > > > > Hi,
> > > > >
> > > > > Everything is 6.6.0. I could include a stack trace (i don't print
> > them in my program), but that would only be the the trace from getById() to
> > 

Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Susheel Kumar
Do you see any errors etc. in solr.log during this time?

On Tue, Jul 18, 2017 at 7:10 AM, Markus Jelsma 
wrote:

> The problem was never resolved but Shawn asked for the stack trace, here
> it is:
>
> org.apache.solr.client.solrj.SolrServerException: 
> java.lang.IllegalStateException:
> Connection pool shut down
> at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> doRequest(LBHttpSolrClient.java:485)
> at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(
> LBHttpSolrClient.java:388)
> at org.apache.solr.client.solrj.impl.CloudSolrClient.
> sendRequest(CloudSolrClient.java:1383)
> at org.apache.solr.client.solrj.impl.CloudSolrClient.
> requestWithRetryOnStaleState(CloudSolrClient.java:1134)
> at org.apache.solr.client.solrj.impl.CloudSolrClient.request(
> CloudSolrClient.java:1073)
> at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
> at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
> at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1173)
> at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1090)
> at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1110)
> ..internal method calling getById()..
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Connection pool shut down
> at org.apache.http.util.Asserts.check(Asserts.java:34)
> at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:184)
> at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:217)
> at org.apache.http.impl.conn.PoolingClientConnectionManager
> .requestConnection(PoolingClientConnectionManager.java:184)
> at org.apache.http.impl.client.DefaultRequestDirector.execute(
> DefaultRequestDirector.java:415)
> at org.apache.http.impl.client.AbstractHttpClient.doExecute(
> AbstractHttpClient.java:882)
> at org.apache.http.impl.client.CloseableHttpClient.execute(
> CloseableHttpClient.java:82)
> at org.apache.http.impl.client.CloseableHttpClient.execute(
> CloseableHttpClient.java:55)
> at org.apache.solr.client.solrj.impl.HttpSolrClient.
> executeMethod(HttpSolrClient.java:515)
> at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> HttpSolrClient.java:279)
> at org.apache.solr.client.solrj.impl.HttpSolrClient.request(
> HttpSolrClient.java:268)
> at org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> doRequest(LBHttpSolrClient.java:447)
> ... 24 more
>
> So, to summarize, we have a program checking presence of documents in Solr
> using getById() and we don't want this exception to bubble up, we want
> SolrJ to restore the connection pool just as CloudSolrClient would move on
> to another node if one went down in the mean time.
>
> Is this possible? How?
>
> Many thanks,
> Markus
>
> -Original message-
> > From:Markus Jelsma 
> > Sent: Thursday 29th June 2017 16:38
> > To: solr-user@lucene.apache.org
> > Subject: RE: SolrJ 6.6.0 Connection pool shutdown
> >
> > Thanks. I probably should have mentioned there is no firewall limiting
> connections between those hosts. Actually, the processes run on the same
> hosts as the Solr cluster is running on.
> >
> > Thanks,
> > Markus
> >
> >
> >
> > -Original message-
> > > From:Alexandre Rafalovitch 
> > > Sent: Thursday 29th June 2017 15:38
> > > To: solr-user 
> > > Subject: Re: SolrJ 6.6.0 Connection pool shutdown
> > >
> > > One thing to check is whether there is a firewall between the client
> > > and the server. They - sometimes - cut the silent connections in the
> > > _middle_ (at the firewall). The usual solution is keepAlive request of
> > > some kind or not using the connection pool.
> > >
> > > One way to check is with network tracer like Wireshark and checking
> > > whether the actual hardware at the other end of the connection is a
> > > normal server or some sort of unexpected hardware piece of equipment
> > > (firewall). Yes, that's using the hammer to swat a fly :-)
> > >
> > > Regards,
> > >Alex.
> > > 
> > > http://www.solr-start.com/ - Resources for Solr users, new and
> experienced
> > >
> > >
> > > On 29 June 2017 at 08:21, Markus Jelsma 
> wrote:
> > > > Hi,
> > > >
> > > > Everything is 6.6.0. I could include a stack trace (i don't print
> them in my program), but that would only be the the trace from getById() to
> CloudSolrClient.requestWithRetryOnStaleState() and little deeper, that
> what you're looking for?
> > > >
> > > > We haven't called close() in that particular part of the program.
> > > >
> > > > Method requestWithRetryOnStaleState has some retry logic built-in
> but doesn't seem to work for the exception i got.
> > > >
> > > > I'll let it print the stack trace and get back if it happens again.
> > > >
> > > > Thanks,
> > > > Markus
> > > >
> > > > -Original message-
> > > >> From:Shawn Heisey 
> 

Re: Embedded documents in solr

2017-07-18 Thread Susheel Kumar
How many availabilities.day entries can there be for a single document? Is it for a
week/month/year?

On Tue, Jul 18, 2017 at 4:21 AM, Swapnil Pande 
wrote:

> Hi ,
> I am new to Solr. I am facing a problem embedding documents in Solr. I
> don't want to use Solr joins.
> The document is similar to
> {"name":string, availabilities:[{"day":Date,"status":0}..{}]}
> I want to index the array and search with queries like
>
> 1) where name = 'xyz' and availabilities.day = Date.current  and status =0
>
> Can you suggest an alternative way to index this type of document
> in Solr?
>


Re: Limit to the number of cores supported?

2017-07-18 Thread Pouliot, Scott
It doesn't seem to report anything at all, which is part of the problem.  No 
error for me to track down as of yet.

Get Outlook for iOS

From: Erick Erickson 
Sent: Monday, July 17, 2017 3:23:24 PM
To: solr-user
Subject: Re: Limit to the number of cores supported?

I know of thousands of cores on a single Solr instance. Operationally
there's no problem there, although there may be some practical issues
(e.g. startup time and the like).

What does your Solr log show? Two popular issues:
OutOfMemory issues
Not enough file handles (fix with ulimit)

 But without more specific info about what Solr reports in the log
it's impossible to say much.

Best,
Erick

On Mon, Jul 17, 2017 at 10:41 AM, Pouliot, Scott
 wrote:
> Hey guys.
>
> We're running SOLR 6.2.0 in a master/slave configuration and I was wondering 
> if there is a limit to the number of cores this setup can support? We're 
> having random issue where a core or 2 will stop responding to POSTS (GETS 
> work fine) until we restart SOLR.
>
> We've currently got 140+ cores on this setup and wondering if that could be 
> part of the problem?
>
> Anyone ever run into this before?
>
> Scott


RE: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Markus Jelsma
The problem was never resolved but Shawn asked for the stack trace, here it is:

org.apache.solr.client.solrj.SolrServerException: java.lang.IllegalStateException: Connection pool shut down
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:485)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1383)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
    at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
    at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1173)
    at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1090)
    at org.apache.solr.client.solrj.SolrClient.getById(SolrClient.java:1110)
    ..internal method calling getById()..
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Connection pool shut down
    at org.apache.http.util.Asserts.check(Asserts.java:34)
    at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:184)
    at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:217)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.requestConnection(PoolingClientConnectionManager.java:184)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:515)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
    ... 24 more

So, to summarize: we have a program checking the presence of documents in Solr 
using getById(), and we don't want this exception to bubble up; we want SolrJ to 
restore the connection pool just as CloudSolrClient would move on to another 
node if one went down in the meantime.

Is this possible? How?

Many thanks,
Markus
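
For what it's worth, one possible workaround, sketched only (nothing in this
thread confirms it; zkHost and collection are placeholders, and fetchById is an
invented helper name), is to catch the failure, discard the dead client, and
rebuild it before retrying:

import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrDocument;

// Sketch: retry getById() once with a rebuilt client when the pool is shut down.
SolrDocument fetchById(CloudSolrClient client, String zkHost, String collection, String id)
        throws SolrServerException, IOException {
    try {
        return client.getById(collection, id);
    } catch (SolrServerException e) {
        if (e.getCause() instanceof IllegalStateException) {
            client.close();  // pool is already shut down, discard this client
            CloudSolrClient fresh = new CloudSolrClient.Builder().withZkHost(zkHost).build();
            return fresh.getById(collection, id);
            // in real code, hand "fresh" back to the caller for reuse;
            // building a CloudSolrClient per request is expensive
        }
        throw e;
    }
}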
 
-Original message-
> From:Markus Jelsma 
> Sent: Thursday 29th June 2017 16:38
> To: solr-user@lucene.apache.org
> Subject: RE: SolrJ 6.6.0 Connection pool shutdown
> 
> Thanks. I probably should have mentioned there is no firewall limiting 
> connections between those hosts. Actually, the processes run on the same 
> hosts as the Solr cluster is running on.
> 
> Thanks,
> Markus
> 
>  
>  
> -Original message-
> > From:Alexandre Rafalovitch 
> > Sent: Thursday 29th June 2017 15:38
> > To: solr-user 
> > Subject: Re: SolrJ 6.6.0 Connection pool shutdown
> > 
> > One thing to check is whether there is a firewall between the client
> > and the server. They - sometimes - cut the silent connections in the
> > _middle_ (at the firewall). The usual solution is keepAlive request of
> > some kind or not using the connection pool.
> > 
> > One way to check is with network tracer like Wireshark and checking
> > whether the actual hardware at the other end of the connection is a
> > normal server or some sort of unexpected hardware piece of equipment
> > (firewall). Yes, that's using the hammer to swat a fly :-)
> > 
> > Regards,
> >    Alex.
> > 
> > http://www.solr-start.com/ - Resources for Solr users, new and experienced
> > 
> > 
> > On 29 June 2017 at 08:21, Markus Jelsma  wrote:
> > > Hi,
> > >
> > > Everything is 6.6.0. I could include a stack trace (I don't print them in 
> > > my program), but that would only be the trace from getById() to 
> > > CloudSolrClient.requestWithRetryOnStaleState() and a little deeper; is 
> > > that what you're looking for?
> > >
> > > We haven't called close() in that particular part of the program.
> > >
> > > Method requestWithRetryOnStaleState has some retry logic built-in but 
> > > doesn't seem to work for the exception I got.
> > >
> > > I'll let it print the stack trace and get back if it happens again.
> > >
> > > Thanks,
> > > Markus
> > >
> > > -Original message-
> > >> From:Shawn Heisey 
> > >> Sent: Tuesday 27th June 2017 23:02
> > >> To: solr-user@lucene.apache.org
> > >> Subject: Re: SolrJ 6.6.0 Connection pool shutdown
> > >>
> > >> On 6/27/2017 6:50 AM, Markus Jelsma wrote:
> > >> > We have a proces checking presence of many documents in a collection, 
> > >> > 

Re: Get results in multiple orders (multiple boosts)

2017-07-18 Thread Luca Dall'Osto
Hello everyone, thanks for the prompt replies!

In response to Florian: I get the correct score only when boosting a single
field (for example category): the scores are correctly increased by the
factor. But when I try to apply a double boost, the scores are not as high as
expected (for example, if the greatest boost factor for category is 3 and for
source is 3, sometimes I get documents boosted with category^3 and source^1
**before** documents boosted with category^3 and source^3).

I tried your snippet "^=[boost]" instead of "^[factor]", but it doesn't seem
to work for me: SyntaxError: Cannot parse 'category:9500^': Encountered \"\"
at line 1 (...)

In response to Erick: I have different "sort preferences", so I can't build an
index field and use it for sorting. Maybe I have to sort by category, then by
source, then by language; or by source, then by category, then by date. Is the
sort function you mention a custom Solr sort function? I built a custom PHP
method that sorts the documents queried from Solr, and it works well; the
problem is that this way I have to fetch **all** the results in a single query
and I can't paginate. Today I have millions of records and queries can be very
large, so I would like to paginate the results. To paginate, I need Solr to
return the query results in the correct order.

Thank you very much

Luca
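
Since the goal is an imposed ordering that still paginates, the sort-by-function
route Erick mentions may be worth a try: Solr then orders the whole result set
server-side, so ordinary start/rows pagination works. A sketch using the values
from this thread (the map() nesting and the weights are illustrative, and it
assumes category and source are single-valued numeric fields usable in function
queries):

sort=map(category,9500,9500,2,map(category,1100,1100,1,0)) desc,
     map(source,5,5,3,map(source,9,9,2,map(source,7,7,1,0))) desc,
     date desc

(Shown on several lines for readability; the sort parameter itself is a single
value.)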

 

On Monday, July 17, 2017 6:04 PM, Erick Erickson  
wrote:
 

 I don't think boosting is really what you want here. Boosting
_influences_ the score, it does not impose an ordering.

Sorting _does_ impose an ordering, the question is how to sort and the
answer depends on how fixed (or not) the sorting criteria are. Do they
change with different queries? If not, the very simplest thing to do
is to index a field with a pre-computed sort value. IOW, if your
ordering is _always_ source 5, 9, 7 index a source_sort field that
orders things that way and sort on that. Then I'd have a secondary
sort by score as a tie-breaker.

If that's not the case, perhaps sorting by function (perhaps a custom
function) would work.

Best,
Erick

On Mon, Jul 17, 2017 at 4:30 AM, Florian Waltersdorfer
 wrote:
> Hi,
>
> I am quite the SolR newbie myself, but have you looked at the resulting 
> scores, e.g. via fl=*,score (that way, you can see/test how your boosting 
> affects the results)?
> In a similar scenario, I am using fixed value boosts for specific field 
> values; "^=[boost]" instead of "^[factor]", for example:
>
> category:9500^=20  source:(5^=20 OR 9^=10 OR 7^=5)
>
> (Actual fixed values open for experimentation.)
>
> Regards,
> Florian
>
> -Ursprüngliche Nachricht-
> Von: Luca Dall'Osto [mailto:tenacious...@yahoo.it.INVALID]
> Gesendet: Montag, 17. Juli 2017 12:20
> An: solr-user@lucene.apache.org
> Betreff: Get results in multiple orders (multiple boosts)
>
> Hello,
> I'm new in Solr (and in mailing lists..), and I have a question about 
> querying contents in multiple custom orders.
> I'm trying to query some documents boosted by 2 (or more) fields: I'm able
> to make a search over 2 days and return results boosted by the category
> field, like this:
>
> ?indent=on
> =edismax
> =(date:[2017-06-16T00:00:00Z TO 2017-06-18T23:59:59Z])
> =category:9500^2
> =category:1100^1
> =40
> =json
>
> This will return all documents of category 9500 first, and 1100 after.
> Now I would like to get these documents with a second boost based on
> another field, called source. I would like to have the documents in this
> order:
> 1) category:9500 AND source:5
> 2) category:9500 AND source:9
> 3) category:9500 AND source:7
> 4) category:1100 AND source:5
> 5) category:1100 AND source:9
> 6) category:1100 AND source:7
>
> To get this order, I tried this query:
> ?indent=on
> =edismax
> =(date:[2017-06-16T00:00:00Z TO 2017-06-18T23:59:59Z])
> =category:9500^2+source:(5^3 OR 9^2 OR 7^1)
> =category:1100^1+source:(5^3 OR 9^2 OR 7^1)
> =40
> =json
>
> How can I apply a double boost to get the documents in my correct order?
> Is boost the correct tool for my purpose? Any help will be greatly
> appreciated.
> Thanks Luca

   

Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-18 Thread wg85907
I don't mean that my Zookeeper cluster is rebooting frequently; I just want to
ensure my query service stays stable when the Zookeeper cluster has an issue
or reboots. I will do some tests to check whether there is an issue here.
Maybe the current Zookeeper client handles this case well. Hacking the client
will always be the last choice.
Regards,
Geng, Wei
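
For reference, the usual way to avoid piling up ZooKeeper connections is to
build one client and share it across threads instead of recreating it per
request. A sketch assuming SolrJ 6.x, where CloudSolrServer has become
CloudSolrClient (class name, holder pattern, and ZK hosts are mine):

import org.apache.solr.client.solrj.impl.CloudSolrClient;

// One long-lived, thread-safe client per JVM. Its internal ZooKeeper
// client is designed to re-establish the session after a ZK restart,
// so recreating the Solr client should not be necessary.
public final class SolrHolder {
    public static final CloudSolrClient CLIENT =
            new CloudSolrClient.Builder()
                    .withZkHost("zk1:2181,zk2:2181,zk3:2181")
                    .build();
    private SolrHolder() {}
}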



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Create-too-many-zookeeper-connections-when-recreate-CloudSolrServer-instance-tp4346040p4346528.html
Sent from the Solr - User mailing list archive at Nabble.com.


Embedded documents in solr

2017-07-18 Thread Swapnil Pande
Hi,
I am new to Solr. I am facing a problem embedding documents in Solr, and I
don't want to use Solr joins.
The document is similar to:
{"name":string, availabilities:[{"day":Date,"status":0}..{}]}
I want to index the array and search with queries like:

1) where name = 'xyz' and availabilities.day = Date.current and status = 0

Can you suggest an alternative way to index this type of document in Solr?
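
One common alternative, if nested documents and joins are both out, is to
denormalize: index one flat Solr document per (name, day) availability entry
and query with plain ANDs. A sketch with SolrJ (collection and field names are
made up):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

// One flat document per availability entry; no joins needed at query time.
SolrClient solr = new CloudSolrClient.Builder().withZkHost("zk1:2181").build();

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "xyz_2017-07-18");                     // name + day keeps ids unique
doc.addField("name", "xyz");
doc.addField("availability_day", "2017-07-18T00:00:00Z");
doc.addField("availability_status", 0);
solr.add("availabilities", doc);
solr.commit("availabilities");

Query (1) then becomes, for example:
q=name:xyz AND availability_day:"2017-07-18T00:00:00Z" AND availability_status:0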


Re: Need guidance for distributing data base on date interval in a collection

2017-07-18 Thread Modassar Ather
Hi Rehman,

You may want to look into how the documents are routed on different shards.
For that you can look into following documentation.
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud

Basically it is the id of the document which, when prefixed with a certain
attribute, decides which shard the document actually goes to.
So a document id carrying a date prefix may be helpful.

Best,
Modassar
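
To make the prefix idea concrete, a small SolrJ sketch (collection, field
names, and ZK host are invented; this assumes the default compositeId router):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

CloudSolrClient client = new CloudSolrClient.Builder()
        .withZkHost("zk1:2181,zk2:2181,zk3:2181").build();

// With the compositeId router, every id sharing the "2017-07-18!" prefix
// hashes to the same shard, so one day's data stays together.
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "2017-07-18!doc-12345");
doc.addField("event_date", "2017-07-18T09:30:00Z");
client.add("mycollection", doc);

Note this co-locates a day's documents on one shard, but the shard is picked
by hash; if you need "day N goes to machine N" exactly, the implicit router
(see Atita's suggestion in this thread) is the closer fit.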





On Tue, Jul 18, 2017 at 1:08 PM, Atita Arora  wrote:

> Hi Rehman,
> I am not sure about your use case,  but why wouldn't you consider creating
> shard for a particular date range like within a week from current date,  15
> days,  a month and so on and so forth.
>
> I have done a similar implementation elsewhere.
> Can you tell more about your use case?
>
> Atita
>
> On Jul 18, 2017 1:04 PM, "rehman kahloon"  invalid>
> wrote:
>
> Hello Sir/MadamI am new to SolrCloud, Having ORACLE
> technologies experience.
> Now a days , i am comparing oracle and solrcloud using bigdata.
> So i want to know how can i create time interval sharding.
> e.g i have 10 machines, each machine for one shard and one date data, So
> how can i fix next day data go to next shard and so on?
>
> search too much but not found any command/way, that handle it from some
> core/shard file.
> So i request you please guide me.
> thanks in advanced.
> Kind Regards,Muhammad Rehman kahloonmrehman_kahl...@yahoo.com
>


Re: Need guidance for distributing data base on date interval in a collection

2017-07-18 Thread Atita Arora
Hi Rehman,
I am not sure about your use case, but why wouldn't you consider creating a
shard for a particular date range, like within a week from the current date,
15 days, a month, and so on?

I have done a similar implementation elsewhere.
Can you tell more about your use case?

Atita
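
Sketching that idea with the Collections API (collection, shard, and field
names are made up, and this is an illustration, not tested config): create the
collection with the implicit router, one named shard per day, and a routing
field:

http://host:8983/solr/admin/collections?action=CREATE&name=events
    &router.name=implicit&router.field=day_shard
    &shards=day_20170718,day_20170719
    &collection.configName=myconf

Each indexed document then sets day_shard to the target shard name and lands
on exactly that shard, and a new day's shard is added with
action=CREATESHARD (which works only with the implicit router).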

On Jul 18, 2017 1:04 PM, "rehman kahloon" 
wrote:

Hello Sir/MadamI am new to SolrCloud, Having ORACLE
technologies experience.
Now a days , i am comparing oracle and solrcloud using bigdata.
So i want to know how can i create time interval sharding.
e.g i have 10 machines, each machine for one shard and one date data, So
how can i fix next day data go to next shard and so on?

search too much but not found any command/way, that handle it from some
core/shard file.
So i request you please guide me.
thanks in advanced.
Kind Regards,Muhammad Rehman kahloonmrehman_kahl...@yahoo.com


Need guidance for distributing data base on date interval in a collection

2017-07-18 Thread rehman kahloon
Hello Sir/Madam,

I am new to SolrCloud, having ORACLE technologies experience. Nowadays I am
comparing Oracle and SolrCloud using big data, and I want to know how I can
create time-interval sharding.

E.g. I have 10 machines, one shard per machine, and one date's data per shard.
How can I make the next day's data go to the next shard, and so on?

I searched a lot but did not find any command/way to handle this from some
core/shard file. I request you to please guide me.
Thanks in advance.

Kind Regards,
Muhammad Rehman kahloon
mrehman_kahl...@yahoo.com

Re: Parent child documents partial update

2017-07-18 Thread Sujay Bawaskar
Yup, got it!

On Tue, Jul 18, 2017 at 12:22 PM, Amrit Sarkar 
wrote:

> Sujay,
>
> Lucene index is in flat-object document style, so I really not think nested
> documents at index / storage will ever be supported unless someone change
> the very intricacy of the index.
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Tue, Jul 18, 2017 at 8:11 AM, Sujay Bawaskar 
> wrote:
>
> > Thanks Amrit. So storage mechanism of parent child documents is limiting
> > the capability of partial update. It would be great to have flawless
> parent
> > child index support in solr.
> >
> > On 17-Jul-2017 11:14 PM, "Amrit Sarkar"  wrote:
> >
> > > Sujay,
> > >
> > > Not really. Parent-child documents are stored in a single block
> > > contiguously. Read more about parent-child relationship at:
> > > https://medium.com/@sarkaramrit2/multiple-
> documents-with-same-doc-id-in-
> > > index-in-solr-cloud-32c072db2164
> > >
> > > While we perform partial / atomic update, say {"id":"X",
> > > "fieldA":{"set":"Z"}, that particular doc with X will be fetched (all
> the
> > > "stored" fields), update will be performed and indexed, all happens in
> > > *DistributedUpdateProcessor* internally. So there is no way it will
> fetch
> > > the child documents along with it.
> > >
> > > I am not sure whether this can be done with current code or it will be
> > > fixed / improved in the future.
> > >
> > > Amrit Sarkar
> > > Search Engineer
> > > Lucidworks, Inc.
> > > 415-589-9269
> > > www.lucidworks.com
> > > Twitter http://twitter.com/lucidworks
> > > LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> > >
> > > On Mon, Jul 17, 2017 at 12:44 PM, Sujay Bawaskar <
> > sujaybawas...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Need a help to understand solr parent child document partial update
> > > > behaviour. Can we perform partial update on parent document without
> > > losing
> > > > its chiild documents? My observation is that parent child
> relationship
> > > > between documents get lost in case partial update is performed on
> > parent.
> > > > Any work around or solution to this issue?
> > > >
> > > > --
> > > > Thanks,
> > > > Sujay P Bawaskar
> > > > M:+91-77091 53669
> > > >
> > >
> >
>


Re: Parent child documents partial update

2017-07-18 Thread Amrit Sarkar
Sujay,

The Lucene index is in flat-object document style, so I really don't think
nested documents at the index / storage level will ever be supported unless
someone changes the very internals of the index.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
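
Given that storage model, the usual workaround (my assumption, not something
settled in this thread) is to re-send the whole block when the parent changes,
instead of an atomic update. A minimal SolrJ sketch with invented ids, fields,
and ZK host:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

CloudSolrClient client = new CloudSolrClient.Builder().withZkHost("zk1:2181").build();

// Re-index the parent together with all its children so the block stays contiguous.
SolrInputDocument parent = new SolrInputDocument();
parent.addField("id", "parent-1");
parent.addField("fieldA", "Z");           // the value we wanted to "update"

SolrInputDocument child = new SolrInputDocument();
child.addField("id", "child-1");
child.addField("fieldB", "unchanged");
parent.addChildDocument(child);

client.add("mycollection", parent);
client.commit("mycollection");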

On Tue, Jul 18, 2017 at 8:11 AM, Sujay Bawaskar 
wrote:

> Thanks Amrit. So storage mechanism of parent child documents is limiting
> the capability of partial update. It would be great to have flawless parent
> child index support in solr.
>
> On 17-Jul-2017 11:14 PM, "Amrit Sarkar"  wrote:
>
> > Sujay,
> >
> > Not really. Parent-child documents are stored in a single block
> > contiguously. Read more about parent-child relationship at:
> > https://medium.com/@sarkaramrit2/multiple-documents-with-same-doc-id-in-
> > index-in-solr-cloud-32c072db2164
> >
> > While we perform partial / atomic update, say {"id":"X",
> > "fieldA":{"set":"Z"}, that particular doc with X will be fetched (all the
> > "stored" fields), update will be performed and indexed, all happens in
> > *DistributedUpdateProcessor* internally. So there is no way it will fetch
> > the child documents along with it.
> >
> > I am not sure whether this can be done with current code or it will be
> > fixed / improved in the future.
> >
> > Amrit Sarkar
> > Search Engineer
> > Lucidworks, Inc.
> > 415-589-9269
> > www.lucidworks.com
> > Twitter http://twitter.com/lucidworks
> > LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> >
> > On Mon, Jul 17, 2017 at 12:44 PM, Sujay Bawaskar <
> sujaybawas...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Need a help to understand solr parent child document partial update
> > > behaviour. Can we perform partial update on parent document without
> > losing
> > > its chiild documents? My observation is that parent child relationship
> > > between documents get lost in case partial update is performed on
> parent.
> > > Any work around or solution to this issue?
> > >
> > > --
> > > Thanks,
> > > Sujay P Bawaskar
> > > M:+91-77091 53669
> > >
> >
>


RE: Joins in Parallel SQL?

2017-07-18 Thread imran
Is it possible to contribute towards building this capability? Which part of 
the developer documentation would be a good starting point for this?

Regards,
Imran

Sent from Mail for Windows 10

From: Joel Bernstein
Sent: Thursday, July 6, 2017 7:40 AM
To: solr-user@lucene.apache.org
Subject: Re: Joins in Parallel SQL?

Joins and OFFSET are not currently supported with Parallel SQL.

The docs for parallel SQL cover all the supported features. Any syntax not
covered in the docs is likely not supported.

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jul 6, 2017 at 2:40 PM,  wrote:

>
> Is it possible to join documents from different collections through
> Parallel SQL?
>
> In addition to the LIMIT feature on Parallel SQL, can we do use OFFSET to
> implement paging?
>
> Thanks,
> Imran
>
>
> Sent from Mail for Windows 10
>
>



Re: Highlighting words with special characters

2017-07-18 Thread Lasitha Wattaladeniya
Furthermore, the ngram field has the following tokenizer/filter chain at index
and query time:

UAX29URLEmailTokenizerFactory (only in index)
stopFilterFactory
LowerCaseFilterFactory
ASCIIFoldingFilterFactory
EnglishPossessiveFilterFactory
StemmerOverrideFilterFactory (only in query)
NgramTokenizerFactory (only in index)

Regards,
Lasitha

On 18 Jul 2017 14:11, "Lasitha Wattaladeniya"  wrote:

> Hi devs,
>
> I have setup solr highlighting with default setup (only changed the
> fragsize to 0 to match any field length). It worked fine but recently I
> discovered it doesn't highlight for words with special characters in the
> middle.
>
> For an example, let's say I have indexed email address test.f...@ran.com
> to a ngram field. And when I search for the partial text fsdg, I get the
> results but it's not highlighted. It works in all other scenarios as
> expected.
>
> The ngram field has termVectors, termPositions, termOffsets set to true.
>
> Can somebody please suggest me, what may be wrong here?
>
> (sorry for the unstructured text. Typed using a mobile phone )
>
> Regards
> Lasitha
>


Highlighting words with special characters

2017-07-18 Thread Lasitha Wattaladeniya
Hi devs,

I have set up Solr highlighting with the default configuration (only changed
the fragsize to 0 to match any field length). It worked fine, but recently I
discovered it doesn't highlight words with special characters in the middle.

For example, let's say I have indexed the email address test.f...@ran.com into
an ngram field. When I search for the partial text fsdg, I get the results but
the match is not highlighted. It works as expected in all other scenarios.

The ngram field has termVectors, termPositions, termOffsets set to true.

Can somebody please suggest me, what may be wrong here?

(sorry for the unstructured text. Typed using a mobile phone )

Regards
Lasitha