Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Erick Erickson
Maciej:

you really have two choices:
1> re-index the entire document with fields a, b, c, d, e, f. In that
case though, why bother indexing the first time ;)
2> use Atomic Updates:
https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
but note the restrictions.
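For 2>, a minimal SolrJ sketch (untested; the URL, id and field names are
placeholders for your setup, client construction is SolrJ 6.x style):

import java.util.Collections;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdateSketch {
  public static void main(String[] args) throws Exception {
    SolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/collectionX").build();

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1"); // id of the already-indexed document
    // The "set" modifier replaces just these fields; plain values would
    // replace the whole document instead.
    doc.addField("d", Collections.singletonMap("set", "valueD"));
    doc.addField("e", Collections.singletonMap("set", "valueE"));
    doc.addField("f", Collections.singletonMap("set", "valueF"));

    client.add(doc);
    client.commit();
    client.close();
  }
}

The big restriction from that page: all fields have to be stored (copyField
destinations excepted) so Solr can reconstruct the rest of the document.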

Best,
Erick

On Wed, Feb 15, 2017 at 3:45 AM, Emir Arnautovic
 wrote:
> Which version of Solr do you use? Is it always the same field? Again,
> without checking anything, see if it could be that the field is not
> multivalued but your value is.
>
> In any case, this is an inefficient way of indexing. If possible, stream both
> sources ordered by ID, merge them into one input doc, and send that to Solr.
>
> Emir
>
>
>
> On 15.02.2017 12:24, Maciej Ł. PCSS wrote:
>>
>> No, it's not the case. In both steps I'm indexing documents from the same
>> set of IDs (I mean the values of the 'id').
>>
>> Maciej
>>
>>
>> On 15.02.2017 at 11:07, Emir Arnautovic wrote:
>>>
>>> I did not have time to test it or look at the code, but can you check
>>> whether it could be the case that there is no document with the a, b, c
>>> fields yet and you are trying to update it with d, e, f using the
>>> partial-update syntax.
>>>
>>> Emir
>>>
>>>
>>> On 15.02.2017 09:25, Maciej Ł. PCSS wrote:

 Dear All,
 how should I handle the following scenario using SOLRJ?  Index a
 collection of documents (fill fields a, b, c). Then index the same
 collection but this time fill fields d, e, f.

 In pseudo-code it would be: step1(collectionX); step2(collectionX);
 solrCommit();

 See my observations below:
 - the first step is done by calling SolrInputDocument.addField(fieldName,
 value); and this works fine.
 - if I do the same for the second step, then all fields in my documents
 get removed;
 - for that reason I need to call SolrInputDocument.addField(fieldName,
 Collections.singletonMap("set", value)); and then it's fine;
 - but for some fields, if I make that call, the indexed
 values look like "{set=value}" instead of just "value".

 Can somebody explain this strange behaviour to me?

 Regards
 Maciej

>>>
>>
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>


Re: How to get the core index progress/status ?

2017-02-15 Thread Shawn Heisey
On 2/15/2017 4:06 PM, Ratan Servegar wrote:
> I've been looking everywhere to find a solution for the problem I've
> been having. I am starting an index on a core by making an ajax call from
> my page to the Solr server via URL. However, I want to know the index
> start time and completion time, and also how many records are
> processed and the rate of processing per second (the same
> details that are in the Solr admin screen). I want to bring this
> data back to my application so that I can show a visual progress bar of
> the indexing process. Can I do this? Is it possible? Sorry if this has
> already been asked.

This would depend on what mechanism you are using for indexing.

If you're using the dataimport handler, then the status response from
that handler (after an import is started) provides much of the
information needed to display or calculate what you have mentioned.  One
caveat -- the handler usually does not know how many documents are
available in the source system, so it typically cannot tell how far
along the import is.  If you can obtain that information from the source
system separately, then you would be able to calculate progress.

If you're indexing by using a program of your own design to send update
requests to Solr, then Solr will not have that information.  Your
program would need to provide it.
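
For example, if the handler is registered at /dataimport, a SolrJ sketch
like this can poll it (untested; the core URL is a placeholder, and the
exact status field names vary a bit between versions):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class DihStatusPoll {
  public static void main(String[] args) throws Exception {
    SolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/corename").build();

    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("command", "status");
    NamedList<Object> rsp = client.request(
        new GenericSolrRequest(SolrRequest.METHOD.GET, "/dataimport", params));

    // "statusMessages" typically carries entries such as "Total Rows Fetched",
    // "Total Documents Processed" and "Time Elapsed".
    System.out.println(rsp.get("status"));
    System.out.println(rsp.get("statusMessages"));
    client.close();
  }
}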

Thanks,
Shawn



How to get the core index progress/status ?

2017-02-15 Thread Ratan Servegar
Hello,


I've been looking everywhere to find a solution for the problem I've
been having. I am starting an index on a core by making an ajax call from my
page to the Solr server via URL. However, I want to know the index start
time and completion time, and also how many records are
processed and the rate of processing per second (the same details that are
in the Solr admin screen). I want to bring this data back to my application
so that I can show a visual progress bar of the indexing process.

Can I do this? Is it possible? Sorry if this has already been asked.


--
Regards,
Ratan Servegar

This message contains information that may be privileged or confidential.
It is intended only for the person to whom it is addressed. If you are not
the intended recipient, you are not authorized to read, print, retain,
copy, disseminate, distribute, or use this message or any part thereof. If
you receive this message in error, please notify the sender immediately and
delete all copies of this message.


[SECURITY] CVE-2017-3163 Apache Solr ReplicationHandler path traversal attack

2017-02-15 Thread Jan Høydahl
CVE-2017-3163: Apache Solr ReplicationHandler path traversal attack

Severity: Moderate

Vendor:
The Apache Software Foundation

Versions Affected:
Solr 1.4 to 6.4.0

Description:
When using the Index Replication feature, Solr nodes can pull index files from
a master/leader node using an HTTP API which accepts a file name. However,
Solr did not validate the file name, so it was possible to craft a special
request involving path traversal that exposed any file readable by the Solr
server process. Solr servers protected and restricted by firewall rules
and/or authentication would not be at risk, since only trusted clients and
users would gain direct HTTP access.

Mitigation:
6.x users should upgrade to 6.4.1
5.x users should upgrade to 5.5.4
4.x, 3.x and 1.4 users should upgrade to a supported version of Solr,
set up proper firewalling, or disable the ReplicationHandler if not in use.

Credit:
This issue was discovered by Hrishikesh Gadre of Cloudera Inc.

References:
https://issues.apache.org/jira/browse/SOLR-10031
https://wiki.apache.org/solr/SolrSecurity

The Lucene PMC




RE: Atomic updates to increase single field bulk updates?

2017-02-15 Thread Markus Jelsma
Hello Sebastian,

Except for the requirement to have all fields stored, there is, from
Solr/Lucene's point of view, not much difference between indexing a partial
update and a complete document. Under the hood a partial update becomes a
complete document anyway. With partial updates you save a little bandwidth
at the expense of the additional stored fields.

If your backend is the bottleneck, it would probably be very beneficial for
you to switch to atomic updates: less stress on your database and shorter
reindexing time.
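
Something like this in SolrJ, roughly (untested sketch; "myFlag" and the
batch size are placeholders):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

// ids: your 40,000 document ids; client: an already-built SolrClient
static void setFlag(SolrClient client, List<String> ids, boolean value)
    throws Exception {
  List<SolrInputDocument> batch = new ArrayList<>();
  for (String id : ids) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", id);
    // atomic "set": only this field changes, everything else is kept
    doc.addField("myFlag", Collections.singletonMap("set", value));
    batch.add(doc);
    if (batch.size() == 1000) { // send in chunks, not one huge request
      client.add(batch);
      batch.clear();
    }
  }
  if (!batch.isEmpty()) {
    client.add(batch);
  }
  client.commit();
}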

Regards,
Markus

-Original message-
> From:Sebastian Riemer 
> Sent: Wednesday 15th February 2017 19:31
> To: solr-user@lucene.apache.org
> Subject: Atomic updates to increase single field bulk updates?
> 
> Dear solr users,
> 
> When updating documents in bulk (i.e. 40,000 documents at once) and only 
> changing the value of a single boolean flag, I currently re-index all 
> 40,000 objects in full. However, obtaining all the relevant information 
> for each object from the database is relatively expensive.
> 
> I now wonder whether, in this situation, it would be a good idea to implement 
> a single-field update routine using atomic updates. In that case I could skip 
> the lookups in the relational database, since the only information needed 
> would be the new value for that boolean flag and the list of those 40,000 
> document ids.
> 
> I am aware of the requirements for using atomic updates, but as I understand 
> it, those have no big impact on performance, only a slight increase 
> in index size?
> 
> What is your opinion on that?
> 
> Thanks for your input, have a nice evening!
> 
> Sebastian
> 
> 


Atomic updates to increase single field bulk updates?

2017-02-15 Thread Sebastian Riemer
Dear solr users,

When updating documents in bulk (i.e. 40,000 documents at once) and only 
changing the value of a single boolean flag, I currently re-index all 
40,000 objects in full. However, obtaining all the relevant information for 
each object from the database is relatively expensive.

I now wonder whether, in this situation, it would be a good idea to implement a 
single-field update routine using atomic updates. In that case I could skip 
the lookups in the relational database, since the only information needed 
would be the new value for that boolean flag and the list of those 40,000 
document ids.

I am aware of the requirements for using atomic updates, but as I understand 
it, those have no big impact on performance, only a slight increase in 
index size?

What is your opinion on that?

Thanks for your input, have a nice evening!

Sebastian



[ANNOUNCE] Apache Solr 5.5.4 released

2017-02-15 Thread Adrien Grand
15 February 2017, Apache Solr™ 5.5.4 available

The Lucene PMC is pleased to announce the release of Apache Solr 5.5.4

Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., Word, PDF)
handling, and geospatial search. Solr is highly scalable, providing
fault tolerant distributed search and indexing, and powers the search
and navigation features of many of the world's largest internet sites.

The release is available for immediate download at:

  http://www.apache.org/dyn/closer.lua/lucene/solr/5.5.4

This release includes 2 bug fixes since the 5.5.3 release:
 * Better validation of filename params in ReplicationHandler
 * Upgraded commons-fileupload to 1.3.2, fixing a potential vulnerability
CVE-2016-3092

Please read CHANGES.txt for a detailed list of changes:

  https://lucene.apache.org/solr/5_5_4/changes/Changes.html

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring
network for distributing releases. It is possible that the mirror you
are using may not have replicated the release yet. If that is the
case, please try another mirror. This also goes for Maven access.

-- 
Adrien Grand


Re: Getting "Error getting file length for [segments_5]" warnings in Solr 6.4.0

2017-02-15 Thread Peter Matthew Eichman
Shawn,

We have confirmed that yes, this is just log noise, and possibly related to
the admin interface and not the actual indexing process. As for indexing
stopping, that was not actually the case. I got confused about how many
documents were in the collection I was indexing, and thought there should
have been more.

Thanks,
-Peter

On Tue, Feb 14, 2017 at 6:10 PM, Shawn Heisey  wrote:

> On 2/14/2017 9:57 AM, Peter Matthew Eichman wrote:
> > I am running Solr 6.4.0, and while I am attempting to index my Fedora
> > 4 data, I keep getting warning messages in my solr.log: "WARN
> > (qtp401424608-18) [ x:fedora4] o.a.s.h.a.LukeRequestHandler Error
> > getting file length for [segments_5]". And after that, the indexing
> > stops, and the core is left in a non-current state until I issue a
> > manual commit request to it.
>
> I have seen this warning frequently on newer Solr versions.  It doesn't
> seem to actually affect Solr's operation, it just results in a lot of
> excess logging.  It is unlikely to cause issues with indexing.
>
> In your statement, what does "the indexing stops" mean?  This is quite
> vague about what's actually happening.  Error messages related to the
> *indexing* have not been provided.  Are your indexing clients receiving
> any error messages?
>
> Thanks,
> Shawn
>
>


-- 
Peter Eichman
Senior Software Developer
University of Maryland Libraries
peich...@umd.edu


Re: indent parsedquery_toString

2017-02-15 Thread Alexandre Rafalovitch
That is not supported. However, you can add
debug.explain.structured=true for a more detailed breakdown of the
information.
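
For example (against a stock /select handler; collection name is a
placeholder):

http://localhost:8983/solr/yourcollection/select?q=hello&debug=true&debug.explain.structured=true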

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 15 February 2017 at 09:03, Gleb  wrote:
> How can I make parsedquery_toString more readable? I want to read it indented.
>
> +(
> (
> (name_text_ru:hello)~0.5
> (name_text_ru:word)~0.5
> (
> (name_text_ru:ложка name_text_ru:trump)
> )~0.5
> )~3
> )
>
>
>


indent parsedquery_toString

2017-02-15 Thread Gleb
How can I make parsedquery_toString more readable? I want to read it indented.

+(
(
(name_text_ru:hello)~0.5 
(name_text_ru:word)~0.5 
(
(name_text_ru:ложка name_text_ru:trump)
)~0.5
)~3
)





Re: Core replication, Slave not flipping to master

2017-02-15 Thread Shawn Heisey
On 2/15/2017 4:06 AM, philippa griggs wrote:
> Solr 5.4.1, multiple cores with two cores per shard. Zookeeper 3.4.6   (5 
> zookeeper ensemble).
>
> I have noticed an error with the replication between two cores in a shard. 
> I’m having to perform a schema update which means I have to stop and start 
> the cores.  I’m trying to do this in a way so I don’t get any down time. 
> Restarting one core in the shard, waiting for that to come back up before 
> restarting the second one.

You're talking about zookeeper, which means SolrCloud.  Just reload the
collection after you change the config/schema in zookeeper.  Solr will
handle reloading all shards and all replicas, no matter how many actual
servers are involved.  There will be no downtime.

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-RELOAD:ReloadaCollection
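
For example (collection name is a placeholder):

http://xxx:8983/solr/admin/collections?action=RELOAD&name=yourcollection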

> However when restarting the master, the slave isn’t flipping and becoming the 
> master itself.  Instead I’m getting errors in the log as follows:

We would need to see the *full* error in your logs.  It looks like
you've just pulled part of it out and not included the entire message,
which might be dozens of lines.

There are no masters and no slaves in normal SolrCloud operation.  One
of the replicas of each shard gets elected as leader.  The replication
feature is **NOT** used unless something goes wrong.  If a replica
requires complete replacement, then SolrCloud will use the old
master/slave replication feature to copy the leader's index to the bad
replica.

> When I run
>
> http://xxx:8983/solr/core_name/replication?command=details

If you're running SolrCloud (Solr plus zookeeper), why are you doing
anything at all with the replication handler?  As I said above,
SolrCloud only uses the replication feature in emergencies.  It doesn't
touch the replication handler's config as master or slave until the
precise moment that an index replication is actually needed.  A core's
status as a replication master or slave is meaningless to normal
SolrCloud operation.

Thanks,
Shawn



Re: Can SOLR-5730 patch be backported to Solr 5.5.3

2017-02-15 Thread Shawn Heisey
On 2/14/2017 2:45 AM, Sahil Agarwal wrote:
> Can the patch for jira issue SOLR-5730 be backported to Solr 5.5.3? I.e.,
> can Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector be
> made configurable in Solr 5.5.3 too?
>
> https://issues.apache.org/jira/browse/SOLR-5730
>
> The SortingMergePolicy and EarlyTerminatingSortingCollector are both available
> in the Lucene misc package.

That issue *adds* those capabilities to Solr.  It doesn't just expose
configuration for capability that's already present.  Before 6.0, Solr
did not have "SortingMergePolicyFactory" at all.  *Lucene* had the
capability, but Solr didn't.

Because it's no longer the active major release, 5.x is in maintenance
mode.  To ensure stability, new features are never added to code
branches in maintenance mode.  The only thing that project policy allows
is fixes for bugs where the bug is severe or the change is so small that
it can't break anything.  The patch for SOLR-5730 doesn't meet either of
these requirements.

The patch has not been applied to the 5.5 branch, and will not apply
cleanly to that branch.

The patch *has* been applied to branch_5x, which is what *would* produce
5.6, but at this point a 5.6 release is unlikely.  You can always
download the branch_5x source code and build a snapshot version of 5.6.
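
Roughly (untested; check the README in the checkout for the exact targets):

git clone https://github.com/apache/lucene-solr.git
cd lucene-solr
git checkout branch_5x
ant ivy-bootstrap
cd solr
ant package   # should produce a 5.6 snapshot distribution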

Thanks,
Shawn



Re: Arabic words search in solr

2017-02-15 Thread Steve Rowe
Hi Mohan,

When I said "the ICU folding filter should be the last filter, to allow the 
Arabic normalization and stemming filters to see the original words", I meant 
that no filter should follow it.

You did not make that change.

Here's what I mean:

   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
     <filter class="solr.ArabicNormalizationFilterFactory"/>
     <filter class="solr.ArabicStemFilterFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>

--
Steve
www.lucidworks.com

> On Feb 15, 2017, at 12:23 AM, mohanmca01  wrote:
> 
> Hi Steve,
> 
> As per your suggestion, I added ICUFoldingFilterFactory in schema.xml as
> below:
> 
> <analyzer>
>   <tokenizer class="solr.StandardTokenizerFactory"/>
>   <filter class="solr.LowerCaseFilterFactory"/>
>   <filter class="solr.ICUFoldingFilterFactory"/>
>   <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="lang/stopwords_ar.txt" />
>   <filter class="solr.ArabicNormalizationFilterFactory"/>
>   <filter class="solr.ArabicStemFilterFactory"/>
> </analyzer>
> 
> I attached expecting result document in previous mail thread for your
> references.
> 
> Kindly check and let me know.
> 
> Thanks
> 
> 
> 



Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Emir Arnautovic
Which version of Solr do you use? Is it always the same field? Again, 
without checking anything, see if it could be that the field is not 
multivalued but your value is.


In any case, this is an inefficient way of indexing. If possible, stream 
both sources ordered by ID, merge them into one input doc, and send that 
to Solr.
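
A toy sketch of that merge (untested; it uses in-memory maps keyed by id just 
to show the shape, for large collections you would stream both sources 
instead):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

// sourceA: id -> fields a, b, c;  sourceB: id -> fields d, e, f
static void mergeAndIndex(SolrClient client,
    Map<String, Map<String, Object>> sourceA,
    Map<String, Map<String, Object>> sourceB) throws Exception {
  List<SolrInputDocument> docs = new ArrayList<>();
  for (Map.Entry<String, Map<String, Object>> e : sourceA.entrySet()) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", e.getKey());
    e.getValue().forEach(doc::addField);            // fields a, b, c
    Map<String, Object> extra = sourceB.get(e.getKey());
    if (extra != null) {
      extra.forEach(doc::addField);                 // fields d, e, f
    }
    docs.add(doc);
  }
  client.add(docs); // one complete document per id, no partial updates needed
  client.commit();
}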


Emir


On 15.02.2017 12:24, Maciej Ł. PCSS wrote:
No, it's not the case. In both steps I'm indexing documents from the 
same set of IDs (I mean the values of the 'id').


Maciej


On 15.02.2017 at 11:07, Emir Arnautovic wrote:
I did not have time to test it or look at the code, but can you check 
whether it could be the case that there is no document with the a, b, c 
fields yet and you are trying to update it with d, e, f using the 
partial-update syntax.


Emir


On 15.02.2017 09:25, Maciej Ł. PCSS wrote:

Dear All,
how should I handle the following scenario using SOLRJ?  Index a 
collection of documents (fill fields a, b, c). Then index the same 
collection but this time fill fields d, e, f.


In pseudo-code it would be: step1(collectionX); 
step2(collectionX); solrCommit();


See my observations below:
- the first step is done by calling 
SolrInputDocument.addField(fieldName, value); and this works fine.
- if I do the same for the second step, then all fields in my 
documents get removed;
- for that reason I need to call 
SolrInputDocument.addField(fieldName, 
Collections.singletonMap("set", value)); and then it's fine;
- but for some fields, if I make that call, the indexed 
values look like "{set=value}" instead of just "value".


Can somebody explain this strange behaviour to me?

Regards
Maciej







--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: SSL using signed client certificate not working

2017-02-15 Thread Kevin Risden
It sounds like Edge, Firefox, and Chrome aren't set up on your computer to
do client authentication. You can set SOLR_SSL_NEED_CLIENT_AUTH=false and
SOLR_SSL_WANT_CLIENT_AUTH=true in solr.in.sh. This will allow browsers
that don't present a client certificate to work. Otherwise you need to
configure your browsers.

Client authentication is an extra part of SSL and not usually required.
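
That is, in solr.in.sh (flipping the two values you have now):

SOLR_SSL_NEED_CLIENT_AUTH=false
SOLR_SSL_WANT_CLIENT_AUTH=true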

Kevin Risden

On Feb 15, 2017 4:43 AM, "Espen Rise Halstensen"  wrote:

>
> Hi,
>
> I have some problems with client certificates. By the look of it, it works
> with curl, and Safari prompts for and accepts my certificate. It does not
> work with Edge, Firefox or Chrome. The certificates are requested from our CA.
>
> When requesting https://s02/solr in the browser, it doesn't
> prompt for a certificate, and I get the following error message in Chrome:
> >This site can't provide a secure connection
> >s02 didn't accept your login certificate, or one may not have been
> provided.
> >Try contacting the system admin.
>
> When debugging with Wireshark I can see the s01t9 certificate in the
> "certificate request" part of the handshake, but the browser answers
> without a certificate.
>
>
> Setup as follows:
>
> solr.in.sh:
> SOLR_SSL_KEY_STORE=etc/keystore.jks
> SOLR_SSL_KEY_STORE_PASSWORD=secret
> SOLR_SSL_TRUST_STORE=etc/truststore.jks
> SOLR_SSL_TRUST_STORE_PASSWORD=secret
> SOLR_SSL_NEED_CLIENT_AUTH=true
> SOLR_SSL_WANT_CLIENT_AUTH=false
>
> Content of truststore.jks:
> [solruser@s02 etc]# keytool -list -keystore 
> /opt/solr-6.4.0/server/etc/truststore.jks
> -storepass secret
>
> Keystore type: JKS
> Keystore provider: SUN
>
> Your keystore contains 1 entry
>
> s01t9, 15.feb.2017, trustedCertEntry,
> Certificate fingerprint (SHA1): CF:BD:02:71:64:F0:BA:65:71:10:
> A1:23:42:34:E0:3C:37:75:E1:BF
>
>
>
> Curl(returns html of admin page with -L option):
>
> curl -v -E  s01t9.pem:secret --cacert  rootca.pem 'https://vs02/solr'
> * Hostname was NOT found in DNS cache
> *   Trying 10.0.121.132...
> * Connected to s02 (10.0.121.132) port 443 (#0)
> * successfully set certificate verify locations:
> *   CAfile: rootca.pem
>   CApath: /etc/ssl/certs
> * SSLv3, TLS handshake, Client hello (1):
> * SSLv3, TLS handshake, Server hello (2):
> * SSLv3, TLS handshake, CERT (11):
> * SSLv3, TLS handshake, Request CERT (13):
> * SSLv3, TLS handshake, Server finished (14):
> * SSLv3, TLS handshake, CERT (11):
> * SSLv3, TLS handshake, Client key exchange (16):
> * SSLv3, TLS handshake, CERT verify (15):
> * SSLv3, TLS change cipher, Client hello (1):
> * SSLv3, TLS handshake, Finished (20):
> * SSLv3, TLS change cipher, Client hello (1):
> * SSLv3, TLS handshake, Finished (20):
> * SSL connection using AES256-SHA256
> * Server certificate:
> *subject: CN=s01t9
> *start date: 2017-01-09 11:31:49 GMT
> *expire date: 2022-01-08 11:31:49 GMT
> *subjectAltName: s02 matched
> *issuer: DC=local; DC=com; CN=Root CA
> *SSL certificate verify ok.
> > GET /solr HTTP/1.1
> > User-Agent: curl/7.35.0
> > Host: s02
> > Accept: */*
> >
> < HTTP/1.1 302 Found
> < Location: https://s02 /solr/
> < Content-Length: 0
> <
> * Connection #0 to host s02 left intact
>
> Thanks,
> Espen
>


Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Maciej Ł. PCSS
No, it's not the case. In both steps I'm indexing documents from the 
same set of IDs (I mean the values of the 'id').


Maciej


On 15.02.2017 at 11:07, Emir Arnautovic wrote:
I did not have time to test it or look at the code, but can you check 
whether it could be the case that there is no document with the a, b, c 
fields yet and you are trying to update it with d, e, f using the 
partial-update syntax.


Emir


On 15.02.2017 09:25, Maciej Ł. PCSS wrote:

Dear All,
how should I handle the following scenario using SOLRJ?  Index a 
collection of documents (fill fields a, b, c). Then index the same 
collection but this time fill fields d, e, f.


In pseudo-code it would be: step1(collectionX); step2(collectionX); 
solrCommit();


See my observations below:
- the first step is done by calling SolrInputDocument.addField(fieldName, 
value); and this works fine.
- if I do the same for the second step, then all fields in my 
documents get removed;
- for that reason I need to call 
SolrInputDocument.addField(fieldName, Collections.singletonMap("set", 
value)); and then it's fine;
- but for some fields, if I make that call, the indexed 
values look like "{set=value}" instead of just "value".


Can somebody explain this strange behaviour to me?

Regards
Maciej







Core replication, Slave not flipping to master

2017-02-15 Thread philippa griggs
Hello,



Solr 5.4.1, multiple cores with two cores per shard. Zookeeper 3.4.6 (a 
5-node zookeeper ensemble).


I have noticed an error with the replication between two cores in a shard. I’m 
having to perform a schema update, which means I have to stop and start the 
cores. I’m trying to do this in a way that avoids any downtime: restarting 
one core in the shard and waiting for it to come back up before restarting the 
second one.


However, when restarting the master, the slave isn’t flipping and becoming the 
master itself. Instead I’m getting errors in the log as follows:


Exception while invoking 'details' method for replication on master -Server 
refused connection at xxx


When I run


http://xxx:8983/solr/core_name/replication?command=details


I see




<lst name="slave">
  <str name="ERROR">invalid_master</str>
  <str name="masterUrl">http://xxx:8983/solr/core_name/</str>
  <str name="currentDate">Wed Feb 15 10:44:30 UTC 2017</str>
  <str name="isPollingDisabled">false</str>
  <str name="isReplicating">false</str>
</lst>






Once the old master comes back up again, it comes back in as a slave, which is 
what I would expect. However, as the other core hasn’t flipped into becoming 
the master, I am left with both cores thinking they are slaves.


I would expect that when the master goes down and is unreachable, the slave 
would flip rather than just throw a connection error. Does anyone have any 
ideas why this is happening, and could you point me in the direction of what 
to do to fix this issue?


Many thanks

Philippa


SSL using signed client certificate not working

2017-02-15 Thread Espen Rise Halstensen

Hi,

I have some problems with client certificates. By the look of it, it works with
curl, and Safari prompts for and accepts my certificate. It does not work with
Edge, Firefox or Chrome. The certificates are requested from our CA.

When requesting https://s02/solr in the browser, it doesn't
prompt for a certificate, and I get the following error message in Chrome:
>This site can't provide a secure connection
>s02 didn't accept your login certificate, or one may not have been provided.
>Try contacting the system admin.

When debugging with Wireshark I can see the s01t9 certificate in the
"certificate request" part of the handshake, but the browser answers without 
a certificate.


Setup as follows:

solr.in.sh:
SOLR_SSL_KEY_STORE=etc/keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=secret
SOLR_SSL_TRUST_STORE=etc/truststore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=secret
SOLR_SSL_NEED_CLIENT_AUTH=true
SOLR_SSL_WANT_CLIENT_AUTH=false

Content of truststore.jks:
[solruser@s02 etc]# keytool -list -keystore 
/opt/solr-6.4.0/server/etc/truststore.jks -storepass secret

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 1 entry

s01t9, 15.feb.2017, trustedCertEntry,
Certificate fingerprint (SHA1): 
CF:BD:02:71:64:F0:BA:65:71:10:A1:23:42:34:E0:3C:37:75:E1:BF



Curl(returns html of admin page with -L option):

curl -v -E  s01t9.pem:secret --cacert  rootca.pem 'https://vs02/solr'
* Hostname was NOT found in DNS cache
*   Trying 10.0.121.132...
* Connected to s02 (10.0.121.132) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: rootca.pem
  CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Request CERT (13):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS handshake, CERT verify (15):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using AES256-SHA256
* Server certificate:
*subject: CN=s01t9
*start date: 2017-01-09 11:31:49 GMT
*expire date: 2022-01-08 11:31:49 GMT
*subjectAltName: s02 matched
*issuer: DC=local; DC=com; CN=Root CA
*SSL certificate verify ok.
> GET /solr HTTP/1.1
> User-Agent: curl/7.35.0
> Host: s02
> Accept: */*
>
< HTTP/1.1 302 Found
< Location: https://s02 /solr/
< Content-Length: 0
<
* Connection #0 to host s02 left intact

Thanks,
Espen


Re: Continual garbage collection loop

2017-02-15 Thread Leon STRINGER
Thanks to all who replied, there is lots of information here to help us
improve our use and management of Solr!

> 
> On 15 February 2017 at 08:04 Michael Kuhlmann  wrote:
> 
> 
> The number of cores is not *that* important compared to the index
> size, but each core has its own memory overhead. For instance, caches are
> per core, so you have 36 individual caches per type.
> 
> Best,
> Michael
> 
> On 14.02.2017 at 16:39, Leon STRINGER wrote:
> >> On 14 February 2017 at 14:44 Michael Kuhlmann  wrote:
> >>
> >>
> >> Wow, running 36 cores with only half a gigabyte of heap memory is
> >> *really* optimistic!
> >>
> >> I'd raise the heap size to some gigabytes at least and see how it's
> >> working then.
> >>
> > I'll try increasing the heap size and see if I get the problem again.
> >
> > Is core quantity a big issue, as opposed to the size of the cores? Yes,
> > there are 36, but some relate to largely inactive web sites, so the
> > average size (assuming my "Master (Searching)" way of calculating this is
> > correct) is less than 4 MB. I naively assumed a heap-size-related issue
> > would result from larger data sets.
> >
> > Thanks for your recommendation,
> >
> > Leon Stringer
> >
> 

>

Re: Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Emir Arnautovic
I did not have time to test it or look at the code, but can you check 
whether it could be the case that there is no document with the a, b, c 
fields yet and you are trying to update it with d, e, f using the 
partial-update syntax.


Emir


On 15.02.2017 09:25, Maciej Ł. PCSS wrote:

Dear All,
how should I handle the following scenario using SOLRJ?  Index a 
collection of documents (fill fields a, b, c). Then index the same 
collection but this time fill fields d, e, f.


In pseudo-code it would be: step1(collectionX); step2(collectionX); 
solrCommit();


See my observations below:
- the first step is done by calling SolrInputDocument.addField(fieldName, 
value); and this works fine.
- if I do the same for the second step, then all fields in my documents 
get removed;
- for that reason I need to call SolrInputDocument.addField(fieldName, 
Collections.singletonMap("set", value)); and then it's fine;
- but for some fields, if I make that call, the indexed 
values look like "{set=value}" instead of just "value".


Can somebody explain this strange behaviour to me?

Regards
Maciej



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Can SOLR-5730 patch be backported to Solr 5.5.3

2017-02-15 Thread Sahil Agarwal
Would the Solr branch_5x (specifically, browsing the repo at the point of
SOLR-5730's last commit) be a good place to check with regard to this task?

The team is upgrading to Solr 5 from Solr 4 for now. I'm an intern and had
been asked to implement these features, and discovered that they already
existed as SortingMergePolicy and EarlyTerminatingSortingCollector. So I'm
just concentrating on that version, since that is the version the company is
migrating its Solr to.


Re: Can SOLR-5730 patch be backported to Solr 5.5.3

2017-02-15 Thread Daniel Collins
The other question is what do you hope to gain from SortingMergePolicy and
EarlyTerminatingSortingCollector, and why would you want to do that in Solr
5.5.3 and not upgrade to Solr 6?  What prevents you from upgrading I guess
is my real question?

On 15 February 2017 at 05:06, Erick Erickson 
wrote:

> Don't know, give it a try and see? But you're in uncharted/unsupported
> territory so
> it's really chancy.
>
> Best,
> Erick
>
> On Tue, Feb 14, 2017 at 1:45 AM, Sahil Agarwal
>  wrote:
> > Can the patch for jira issue SOLR-5730 be backported to Solr 5.5.3? I.e.,
> > can Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector be
> > made configurable in Solr 5.5.3 too?
> >
> > https://issues.apache.org/jira/browse/SOLR-5730
> >
> > The SortingMergePolicy and EarlyTerminatingSortingCollector are both
> > available in the Lucene misc package.
> >
> > Thank you,
> > Sahil
>


Indexing of documents in more than one step (SOLRJ)

2017-02-15 Thread Maciej Ł. PCSS

Dear All,
how should I handle the following scenario using SOLRJ?  Index a 
collection of documents (fill fields a, b, c). Then index the same 
collection but this time fill fields d, e, f.


In pseudo-code it would be: step1(collectionX); step2(collectionX); 
solrCommit();


See my observations below:
- the first step is done by calling SolrInputDocument.addField(fieldName, 
value); and this works fine.
- if I do the same for the second step, then all fields in my documents 
get removed;
- for that reason I need to call SolrInputDocument.addField(fieldName, 
Collections.singletonMap("set", value)); and then it's fine;
- but for some fields, if I make that call, the indexed 
values look like "{set=value}" instead of just "value".


Can somebody explain this strange behaviour to me?

Regards
Maciej



Re: Continual garbage collection loop

2017-02-15 Thread Michael Kuhlmann
The number of cores is not *that* important compared to the index
size, but each core has its own memory overhead. For instance, caches are
per core, so you have 36 individual caches per type.
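
Each core's caches are configured in its own solrconfig.xml, e.g. the stock
filterCache entry (the sizes here are just the shipped defaults):

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>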

Best,
Michael

On 14.02.2017 at 16:39, Leon STRINGER wrote:
>> On 14 February 2017 at 14:44 Michael Kuhlmann  wrote:
>>
>>
>> Wow, running 36 cores with only half a gigabyte of heap memory is
>> *really* optimistic!
>>
>> I'd raise the heap size to some gigabytes at least and see how it's
>> working then.
>>
> I'll try increasing the heap size and see if I get the problem again.
>
> Is core quantity a big issue, as opposed to the size of the cores? Yes,
> there are 36, but some relate to largely inactive web sites, so the average
> size (assuming my "Master (Searching)" way of calculating this is correct)
> is less than 4 MB. I naively assumed a heap-size-related issue would result
> from larger data sets.
>
> Thanks for your recommendation,
>
> Leon Stringer
>