Re: solr is using TLS1.0
Hi Anchal,

the IBM JVM behaves differently in its TLS setup than the Oracle JVM. If you search for "IBM Java TLS 1.2" you will find plenty of reports of problems with that. In most cases you can get around it by using the system property "com.ibm.jsse2.overrideDefaultTLS", as documented here:

https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.security.component.80.doc/security-component/jsse2Docs/matchsslcontext_tls.html

regards,
Hendrik

On 22.11.2018 07:25, Anchal Sharma2 wrote:
> [quoted text trimmed; see Anchal's full message below]
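Hendrik's suggested workaround can be wired into Solr's startup options. The fragment below is a hypothetical addition to bin/solr.in.sh; the property name and value are taken from the IBM documentation linked above, and it only has an effect on the IBM JVM:

```shell
# Make the IBM JVM's default SSLContext negotiate TLS 1.2 (no effect on
# Oracle/OpenJDK Java). Appended to the options Solr passes to the JVM.
SOLR_OPTS="$SOLR_OPTS -Dcom.ibm.jsse2.overrideDefaultTLS=true"
```

After restarting Solr, the Jetty listener should offer TLS 1.2, provided the rest of the chain (ciphers, certificates) allows it.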
Re: solr is using TLS1.0
Hi Shawn,

Thanks for your reply. Here are the details about the Java we are using:

java version "1.8.0_151"
IBM J9 VM (build 2.9, JRE 1.8.0 AIX ppc64-64 Compressed References 20171102_369060 (JIT enabled, AOT enabled)

I have already patched the policy jars. And I tried to comment out the cipher and protocol entries in jetty-ssl.xml, but it did not work for me. I also tried to use an "IncludeCipherSuites" entry to include a cipher I wanted, but that did not work either. I started getting SSL_ERROR_INTERNAL_ERROR_ALERT and ssl_error_no_cypher_overlap errors on my console URL. I tried this on Solr 7.3.1, so the Jetty version must also be relatively new.

Do you think Java might not be letting me enable TLS 1.2?

Thanks & Regards,
Anchal Sharma

From: Shawn Heisey
To: solr-user@lucene.apache.org
Date: 21-11-2018 05:28
Subject: Re: solr is using TLS1.0

On 11/20/2018 3:02 AM, Anchal Sharma2 wrote:
> I have enabled SSL for solr using steps mentioned on the Lucene
> website. And though the solr console URL is now secure (https), it is still
> using TLS v1.0.
> I have tried a few things to force SSL to use the TLS 1.2 protocol, but they
> have not worked for me.
>
> While trying to do the same, I have observed that solr itself does not offer
> any solr property to specify cipher, algorithm or TLS version.
>
> The following things have been tried:
> 1. key store / trust store for solr to enable SSL, with different key
> algorithm etc. combinations for the certificates
> 2. different solr versions for step 1 (solr 5.x, 6.x, 7.x -- we are using
> solr 5.3 currently)
> 3. using java version 1.8 and adding the solr certificate to the java
> keystore to enforce TLS 1.2

Solr lets Java and Jetty handle TLS. Solr itself doesn't get involved except to provide information to other software.

There are a whole lot of versions of Java 8, and at least three vendors for it. The big names are Oracle, IBM, and OpenJDK. What vendor and exact version of Java are you running? What OS is it on?

Do you have the "unlimited JCE" addition installed in your Java and enabled? If your Java version is new enough, you won't need to mess with JCE. See this page:

https://golb.hplar.ch/2017/10/JCE-policy-changes-in-Java-SE-8u151-and-8u152.html

Solr 5.3 ships with Jetty 9.2.11, which is considered very outdated by the Jetty project -- released well over three years ago. From the perspective of the Solr project, version 5.3 is also very old -- two major versions behind what's current, and also released three years ago. Jetty 9.2 is up to 9.2.26. The current version is Jetty 9.4.14. The latest version of Solr (7.5.0) is shipping with Jetty 9.4.11. I think Jetty will likely be upgraded to the latest release for Solr 7.6.0.

Have you made any changes to the Jetty config, particularly jetty-ssl.xml? One thing you might try, although I'll warn you that it may make no difference at all, is to remove the parts of that config file that exclude certain protocols and ciphers, letting Jetty decide for itself what it should use. Recent versions of Jetty and Java have very good defaults. I do not know whether Jetty 9.2.11 (included with Solr 5.3, as mentioned) has good defaults or not.

Thanks,
Shawn
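Shawn's point that TLS negotiation happens below Solr can be checked from outside the JVM entirely. The sketch below builds a client context that refuses anything older than TLS 1.2; pointed at the Solr port, it shows whether the server side offers TLS 1.2, independent of any browser cipher complaints. Hostname and port are hypothetical, and the live probe is commented out so the snippet stands alone:

```python
import socket
import ssl

# Build a client context that will only speak TLS 1.2 or newer.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.check_hostname = False          # self-signed certs are common in test setups
ctx.verify_mode = ssl.CERT_NONE

# Live probe against a hypothetical Solr node (not executed here):
# with socket.create_connection(("solr.example.com", 8443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="solr.example.com") as tls:
#         print(tls.version())      # "TLSv1.2" on success, handshake error otherwise

print(ctx.minimum_version == ssl.TLSVersion.TLSv1_2)
```

If the handshake fails here but succeeds without the minimum-version restriction, the server really is limited to TLS 1.0/1.1 and the JVM is the place to look.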
Re: SOLR Performance Statistics
On 11/21/2018 8:59 AM, Marc Schöchlin wrote:
> Is it possible to modify the log4j appender to also log other query
> attributes like response/request size in bytes and number of resulting
> documents?

Changing the log4j config might not do anything useful at all. In order for such a change to be useful, the application must have code that logs the information you're after. If you change the default logging level in your log4j config to DEBUG instead of INFO, you'll get a LOT more information in your logs. The information you're after *MIGHT* be logged, but it might not -- I really have no idea, without checking the source code, precisely what information Solr is logging.

The number of documents that match each query *IS* mentioned in solr.log, and you won't even have to change anything to get it. You'll see "hits=nn" when a query is logged.

> I think about snooping on the ethernet interface of a client or on the
> server to gather libpcap data. Is there a chance to analyze captured data
> if the format is e.g. "wt=javabin&version=2"?

The javabin format is binary. You would need something that understands it. If you added Solr jars to a custom program, you could probably feed the data to it and make sense of it. That would require some research into how Solr works at a low level, to learn how to take information gathered from a packet capture and decode it into an actual response.

Thanks,
Shawn
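Since "hits=" (and "QTime=") are already present in solr.log query lines, a few lines of scripting recover them without any config change. A minimal sketch; the sample line is a shortened, hypothetical solr.log entry, so adjust the regex to your actual log format:

```python
import re

# A shortened, hypothetical solr.log query line.
line = ('INFO  - 2018-11-21 ... o.a.s.c.S.Request [techproducts] '
        'webapp=/solr path=/select params={q=*:*&rows=10} hits=3542 status=0 QTime=12')

# Pull out the matched-document count and the query time in milliseconds.
m = re.search(r'hits=(\d+).*?QTime=(\d+)', line)
if m:
    hits, qtime = int(m.group(1)), int(m.group(2))
    print(hits, qtime)   # 3542 12
```

Aggregating these pairs over a day of logs gives a quick "most expensive queries" ranking without touching log4j at all.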
Re: Solr Cloud - Store Data using multiple drives
On 11/21/2018 7:32 AM, Tech Support wrote:
> As per your suggestion, I added the dataDir in the core.properties file.
> It creates the data directory in the new location, but only the newly
> added data is accessible. If I move the existing index files into the new
> location, only then can I read the old data. Is it possible to read both
> the existing and new data, without moving the existing data into the new
> data location?

No, that's not possible. The index must be contained all in the one index directory. It would take some significant effort at the Lucene layer to make that possible. High effort, low reward.

You can't combine the index directories, either. Doing so wouldn't work, and could cause serious index corruption problems. To achieve combined data, you'll need to move the original index into place and, once it's running, index the new data into it.

Thanks,
Shawn
Re: Solr Cloud - Store Data using multiple drives
You really have to split your index. The good news is that you can use aliases to search multiple cores at once. That's probably what you are looking for.

So, you can start with one index and, when it gets close to capacity, add the second one, the third one, etc. -- but have an alias that you keep redefining to include all of these indexes.

Regards,
   Alex.

On Wed, 21 Nov 2018 at 09:41, Tech Support wrote:
>
> Dear Shawn,
>
> As per your suggestion, I added the dataDir in the core.properties file.
> It creates the data directory in the new location, but only the newly
> added data is accessible. If I move the existing index files into the new
> location, only then can I read the old data.
>
> Is it possible to read both the existing and new data, without moving
> existing data into the new data location?
>
> Thanks,
> Karthick Ramu
>
> [...]
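Alex's rolling-alias approach uses the Collections API: CREATEALIAS (re)points an alias at a list of collections, and calling it again with a longer list replaces the previous definition. A sketch that only builds the request URL; the host, alias, and collection names are hypothetical:

```python
from urllib.parse import urlencode

# Redefine the alias "logs" to cover every collection created so far.
# Searching the alias then fans out across all of them.
collections = ["logs_2018_q3", "logs_2018_q4"]
params = urlencode({"action": "CREATEALIAS",
                    "name": "logs",
                    "collections": ",".join(collections)})
url = "http://localhost:8983/solr/admin/collections?" + params
print(url)
```

When a new collection fills up, append it to the list and issue the same call again; clients keep querying "logs" and never need to know how many underlying collections exist.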
Re: uniqueKey and docValues?
In SolrCloud there are a couple of places where it might be useful. In the first pass, each replica collects the top N ids for the aggregator to sort. If the uniqueKey isn't docValues, Solr needs to either decompress it off disk or build a structure on the heap. Last I knew, anyway.

Best,
Erick

On Wed, Nov 21, 2018 at 12:04 PM Walter Underwood wrote:
>
> Is it a good idea to store the uniqueKey as docValues? A great idea? A maybe
> or maybe not idea?
>
> It looks like it will speed up export and streaming. Otherwise, I can't find
> anything in the docs pro or con.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
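One concrete payoff behind Walter's "export and streaming" point: the /export handler streams entire result sets, and it requires docValues on every fl and sort field -- which includes the uniqueKey whenever you return or sort by it. A request-building sketch; the collection and field names are hypothetical:

```python
from urllib.parse import urlencode

# /export streams all matching docs sorted by the uniqueKey "id". This only
# works when "id" has docValues, since the handler reads fl/sort fields from
# docValues rather than stored fields.
params = urlencode({"q": "*:*", "fl": "id", "sort": "id asc"})
url = "http://localhost:8983/solr/mycoll/export?" + params
print(url)
```

Without docValues on the uniqueKey, a request like this is rejected, which is one practical argument for enabling it.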
Re: Solr Cloud - Store Data using multiple drives
In a word, "no". There's no provision in Solr for storing two different parts of the same index in two different directories, whether or not they're on the same drive.

Best,
Erick

On Wed, Nov 21, 2018 at 6:41 AM Tech Support wrote:
>
> Dear Shawn,
>
> As per your suggestion, I added the dataDir in the core.properties file.
> It creates the data directory in the new location, but only the newly
> added data is accessible. If I move the existing index files into the new
> location, only then can I read the old data.
>
> Is it possible to read both the existing and new data, without moving
> existing data into the new data location?
>
> Thanks,
> Karthick Ramu
>
> [...]
Re: Solr cache clear
On 11/21/2018 8:36 AM, Rajdeep Sahoo wrote:
> The problem is that we are using a master-slave solr configuration, and
> for similar types of query it sometimes takes 512 ms and sometimes 29 ms.
> We have been observing this issue since the query modification. As part
> of the modification we reduced a large number of facet.field params. In
> the solrconfig.xml file we changed the cache size and reloaded the cores
> as part of this modification. Reloading cores is not clearing the cache.
> Is there any reason why we are getting this much difference in response
> time for similar types of queries?

What *SPECIFIC* evidence do you have that reloading the cores isn't emptying caches? Have you gone into plugins/stats and checked to see what the current cache sizes are?

Often when a query fully utilizes Solr's caches, it will return in one or two milliseconds. A 29 millisecond query time probably means that the result did NOT come from Solr's caches, unless the machine is particularly slow.

The cache that is likely affecting query performance in your case is not part of Solr at all -- it's the operating system. All modern operating systems use unallocated memory to cache any data on disk that has been read or written. Solr is highly reliant on this OS feature for good performance -- if you do not have extra memory not allocated to programs, Solr performance is probably going to be very bad.

https://wiki.apache.org/solr/SolrPerformanceProblems#RAM

Thanks,
Shawn
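The plugins/stats numbers Shawn refers to are also exposed as JSON via the admin handler (roughly /solr/<core>/admin/mbeans?cat=CACHE&stats=true). The sketch below works over a trimmed, hypothetical payload -- the exact stat key names vary between Solr versions, so treat it as a shape, not a contract:

```python
import json

# After a core reload, a fresh searcher's cache ".size" entries should be at
# or near zero -- direct evidence of whether the caches were actually emptied.
payload = json.loads("""{
  "solr-mbeans": ["CACHE", {
    "queryResultCache": {"stats": {"CACHE.searcher.queryResultCache.size": 0}},
    "filterCache":      {"stats": {"CACHE.searcher.filterCache.size": 128}}
  }]
}""")
caches = payload["solr-mbeans"][1]
sizes = {name: next(v for k, v in info["stats"].items() if k.endswith(".size"))
         for name, info in caches.items()}
print(sizes)
```

Comparing these sizes immediately before and after a reload settles the "is the cache really cleared" question with data rather than timings.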
Re: Sort index by size
Just as a sanity check, is this getting replicated many times, or further scaled up? It sounds like about $3.50/mo of disk space on AWS, and it should all fit in RAM on any decent sized server (i.e. any server that looks like half or a quarter of a decent laptop). As a question it's interesting, but it doesn't yet sound like a problem worth sweating.

On Mon, Nov 19, 2018, 3:29 PM Edward Ribeiro wrote:
> One more tidbit: are you really sure you need all 20 fields to be indexed
> and stored? Do you really need all those 20 fields?
>
> See this blog post, for example:
> https://www.garysieling.com/blog/tuning-solr-lucene-disk-usage
>
> On Mon, Nov 19, 2018 at 1:45 PM Walter Underwood wrote:
> >
> > Worst case is 3X. That happens when there are no merges until the commit.
> >
> > With tlogs, the worst case is more than that. I've seen humongous tlogs
> > with a batch load and no hard commit until the end. If you do that several
> > times, then you have a few old humongous tlogs. Bleah.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/ (my blog)
> >
> > > On Nov 19, 2018, at 7:40 AM, David Hastings wrote:
> > >
> > > Also a full import, assuming the documents were already indexed, will
> > > just double your index size until a merge/optimize is run, since you are
> > > just marking a document as deleted, not taking back any space, and then
> > > adding another completely new document on top of it.
> > >
> > > On Mon, Nov 19, 2018 at 10:36 AM Shawn Heisey wrote:
> > >
> > >> On 11/19/2018 2:31 AM, Srinivas Kashyap wrote:
> > >>> I have a solr core with some 20 fields in it (all are stored and
> > >>> indexed). For one environment, the number of documents is around 0.29
> > >>> million. When I run the full import through DIH, indexing completes
> > >>> successfully. But it is occupying around 5 GB of disk space. Is there
> > >>> a possibility where I can go and check which document is consuming
> > >>> more memory? Put another way, can I sort the index based on size?
> > >>
> > >> I am not aware of any way to do that. There might be one that I don't
> > >> know about, but if there were a way, it seems like I would have come
> > >> across it before.
> > >>
> > >> It is not very likely that the large index size is due to a single
> > >> document or a handful of documents. It is more likely that most
> > >> documents are relatively large. I could be wrong about that, though.
> > >>
> > >> If you have 290,000 documents (which is how I interpreted 0.29 million)
> > >> and the total index size is about 5 GB, then the average size per
> > >> document in the index is about 18 kilobytes. This is in my view pretty
> > >> large. Typically I think that most documents are 1-2 kilobytes.
> > >>
> > >> Can we get your Solr version, a copy of your schema, and exactly what
> > >> Solr returns in search results for a typically sized document? You'll
> > >> need to use a paste website or a file-sharing website ... if you try to
> > >> attach these things to a message, the mailing list will most likely eat
> > >> them, and we'll never see them. If you need to redact the information
> > >> in search results, please do it in a way that we can still see the
> > >> exact size of the text -- don't just remove information, replace it
> > >> with information that's the same length.
> > >>
> > >> Thanks,
> > >> Shawn
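Shawn's arithmetic in the quoted reply, spelled out with the numbers from the thread:

```python
# ~5 GB spread over ~290,000 documents gives the per-document average Shawn
# quotes: roughly 18 kilobytes per document.
total_bytes = 5 * 1024**3
num_docs = 290_000
avg_kb = total_bytes / num_docs / 1024
print(f"{avg_kb:.1f} KB/doc")   # 18.1 KB/doc
```

Against a typical 1-2 KB per document, that average is what makes the "are all 20 fields really stored and indexed?" question worth asking.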
uniqueKey and docValues?
Is it a good idea to store the uniqueKey as docValues? A great idea? A maybe or maybe not idea?

It looks like it will speed up export and streaming. Otherwise, I can't find anything in the docs pro or con.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
SOLR Performance Statistics
Hello list,

I am using the pretty old Solr 4.7 *sigh* release, and I am currently investigating performance problems. The Solr instance currently runs very expensive queries with huge results, and I want to find the most promising queries for optimization.

I am currently using the Solr logfiles and a simple tool (enhanced by me) to analyze the queries: https://github.com/scoopex/solr-loganalyzer

Is it possible to modify the log4j appender to also log other query attributes like response/request size in bytes and the number of resulting documents?

#- File to log to and log format
log4j.appender.file.File=${solr.log}/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n
log4j.appender.file.bufferedIO=true

Is there a better way to create detailed query stats and to replay queries on a test system? I am thinking about snooping on the ethernet interface of a client or on the server to gather libpcap data. Is there a chance to analyze captured data if the format is e.g. "wt=javabin&version=2"?

I do similar things for MySQL to make non-intrusive performance analytics using pt-query-digest (Percona Toolkit). This works like that on MySQL:

1.) Capture data

# Capture all data on port 3306
tcpdump -s 65535 -x -nn -q -tttt -i any port 3306 > mysql.tcp.txt

# Capture only 1/7 of the connections, using a modulus of 7 on the source
# port, if you have a very busy network connection
tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' > mysql.tcp.txt

2.) Create statistics on another system using the tcpdump file

pt-query-digest --watch-server '127.0.0.1:3307' --limit 110 --type tcpdump mysql.tcp.txt

If I can extract the streams of the connections -- do you have an idea how to parse the binary data? (Can I use parts of the Solr client?) Is there a comparable tool out there?

Regards
Marc
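Before reaching for libpcap, the query strings Marc wants to replay can usually be lifted straight out of solr.log, since each request line carries a params={...} section. A sketch -- the sample line is hypothetical, so adjust the regex to the actual ConversionPattern in use:

```python
import re

# A hypothetical solr.log request line (format varies with the log4j pattern).
log_lines = [
    'INFO  - 2018-11-21 10:00:00.123; org.apache.solr.core.SolrCore; '
    '[core0] webapp=/solr path=/select '
    'params={q=title:foo&rows=10&wt=javabin&version=2} hits=7 status=0 QTime=3',
]

# Extract the raw query string of every request; these can then be replayed
# against a test system with curl or a small HTTP client.
queries = [m.group(1) for line in log_lines
           if (m := re.search(r'params=\{([^}]*)\}', line))]
print(queries)
```

This sidesteps the javabin problem entirely: the request side of the traffic is plain URL-encoded text, and only the responses are binary.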
Re: Solr cache clear
Hi all,

The problem is that we are using a master-slave solr configuration, and for similar types of query it sometimes takes 512 ms and sometimes 29 ms. We have been observing this issue since the query modification. As part of the modification we reduced a large number of facet.field params. In the solrconfig.xml file we changed the cache size and reloaded the cores as part of this modification. Reloading cores is not clearing the cache. Is there any reason why we are getting this much difference in response time for similar types of queries?

On Nov 21, 2018 4:16 AM, "Edward Ribeiro" wrote:
> Disabling or reducing autowarming can help too, in addition to cache size
> reduction.
>
> Edward
>
> On Tue, 20 Nov 2018 17:29, Erick Erickson wrote:
> > Why would you want to? This sounds like an XY problem: there's some
> > problem you think would be cured by clearing the cache. What is
> > that problem?
> >
> > Because I doubt this would do anything useful. Pretty soon the caches
> > would be filled up again and you'd be right back where you started, and
> > the real solution is to stop doing whatever you're doing that leads to
> > whatever the real problem is. Maybe reducing the cache sizes.
> >
> > Best,
> > Erick
> >
> > On Tue, Nov 20, 2018 at 9:05 AM Shawn Heisey wrote:
> > > On 11/20/2018 9:25 AM, Rajdeep Sahoo wrote:
> > > > Hi all,
> > > > Without restarting, is it possible to clear the cache?
> > >
> > > You'll need to clarify what cache you're talking about, but I think
> > > for the most part that if you reload the core (or collection if
> > > running SolrCloud) all caches should be rebuilt empty.
> > >
> > > Thanks,
> > > Shawn
RE: Solr Cloud - Store Data using multiple drives
@apa...@elyograg.org

Dear Shawn,

As per your suggestion, I added the dataDir in the core.properties file. It creates the data directory in the new location, but only the newly added data is accessible. If I move the existing index files into the new location, only then can I read the old data.

Is it possible to read both the existing and new data, without moving the existing data into the new data location?

Thanks,
Karthick Ramu

From: Tech Support [mailto:techsupp...@sardonyx.in]
Sent: Monday, November 19, 2018 7:15 PM
To: 'solr-user@lucene.apache.org'
Subject: Solr Cloud - Store Data using multiple drives

Hello Solr Team,

I am using Solr 7.5; indexed data is stored in the Solr installation directory. I need the following features. Is it possible to achieve the following scenarios in Solr Cloud?

1. If the disk free space is exhausted, is it possible to configure another drive? That means, if the C drive's free space is used up, I need to configure the D drive, and I need to read the data from both the C and D drives.

2. If D is getting full, I need to configure yet another drive. I mean, only when one drive's free space is full should the next drive be used to store data. At read time, data needs to come from all the drives.

Is it possible to configure this on Solr Cloud? Please point me in the right direction.

Thanks,
Karthick Ramu
Solr 7.3.0 Backup API not successfull
I used to take a backup of my core using the following API:

curl 'http://localhost:8080/solr/report/replication?command=backup&location=/backup/solr-backup/&numberToKeep=1'

But recently, we started to see exceptions like the following:

java.nio.file.NoSuchFileException: /var/solr/data/report_shard1_replica_n1/data/index/_2mp7v.si
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
        at java.nio.channels.FileChannel.open(FileChannel.java:287)
        at java.nio.channels.FileChannel.open(FileChannel.java:335)
        at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
        at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:182)
        at org.apache.lucene.store.Directory.copyFrom(Directory.java:159)
        at org.apache.solr.core.backup.repository.LocalFileSystemRepository.copyFileFrom(LocalFileSystemRepository.java:145)
        at org.apache.solr.handler.SnapShooter.createSnapshot(SnapShooter.java:239)
        at org.apache.solr.handler.SnapShooter.lambda$createSnapAsync$1(SnapShooter.java:206)
        at java.lang.Thread.run(Thread.java:748)

Hence the backup was not successful and snapshots aren't created. I checked the status with the backup details API:

curl --silent http://localhost:8080/solr/report/replication?command=details

(omitting the rest of the response)
"master":{
  "replicateAfter":["commit"],
  "replicationEnabled":"true",
  "replicableVersion":1542809383290,
  "replicableGeneration":3870332},
"backup":[
  "exception","/var/solr/data/report_shard1_replica_n1/data/index/segments_2aycw"]}}

This means the snapshot API is throwing exceptions. It seems like the index is changing regularly, which may have corrupted the backup. I would like a detailed explanation of this issue and some help in sorting it out.

Any help is highly appreciated.

Thanks,
teki

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
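The details response quoted above appears to report backup state as a flat list of alternating keys and values. A small sketch turns that into a dict so an "exception" entry can be detected programmatically; the payload here is trimmed and hypothetical, so treat the key names as illustrative:

```python
import json

# Trimmed, hypothetical replication-details response.
details = json.loads("""{
  "details": {
    "backup": ["exception",
               "/var/solr/data/report_shard1_replica_n1/data/index/segments_2aycw"]
  }
}""")

# The "backup" section is serialized as [key1, value1, key2, value2, ...];
# pair it up so individual entries can be inspected.
backup = details["details"]["backup"]
status = dict(zip(backup[::2], backup[1::2]))
print("exception" in status)
```

A monitoring job running this check after each backup call would have flagged the failure instead of silently skipping snapshots.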
Re: Able to search with indexed=false and docvalues=true
On Tue, 2018-11-20 at 21:17 -0700, Shawn Heisey wrote:
> Maybe the error condition should be related to a new schema
> property, something like allowQueryOnDocValues. This would default
> to true with current schema versions and false in the next schema
> version, which I think is 1.7. Then a user could choose to allow
> them on a field-by-field basis, by reading documentation that
> outlines the severe performance disadvantages.

I support the idea of making such choices explicit, and your suggestion sounds like a sensible way to do it.

I have toyed with a related idea: issue a query with debug=sanity and get a report from checks on both the underlying index and the issued query for indicators of problems:
https://github.com/tokee/lucene-solr/issues/54

- Toke Eskildsen, Royal Danish Library