Re: Reverse-engineering existing installation

2019-05-03 Thread Erick Erickson
Wait. I was recommending you diff the 4.2.1 solrconfig and the solrconfig you’re using. Ditto with the schema. If you’re trying to diff the 7x or 8x ones they’ll be totally different. But if you are getting massive differences between the stock 4.2.1 and what you’re using, then whoever set it up

Re: Reverse-engineering existing installation

2019-05-03 Thread Doug Reeder
Thanks! Diffs for solr.xml and zoo.cfg were easy, but it looks like we'll need to strip the comments before we can get a useful diff of solrconfig.xml or schema.xml. Can you recommend tools to normalize XML files? XMLStarlet is hosted on SourceForge, which I no longer trust, and hasn't been
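
[Editor's sketch] One way to get a comment-free, normalized form of solrconfig.xml or schema.xml for diffing, without XMLStarlet, is Python's standard library. This is a minimal sketch, not something proposed in the thread. It assumes Python 3.9+ (for `ET.indent`), and the function names `normalize_xml` and `diff_xml` are hypothetical. The default ElementTree parser discards comments, attributes are sorted so that ordering differences do not show up, and the tree is re-indented so the diff is line-oriented.

```python
import difflib
import xml.etree.ElementTree as ET


def normalize_xml(xml_text: str) -> str:
    """Return a comment-free, consistently indented form of an XML document."""
    # The default parser drops comments and processing instructions.
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Sort attributes so attribute order never causes a spurious diff.
        ordered = sorted(elem.attrib.items())
        elem.attrib.clear()
        elem.attrib.update(ordered)
        # Trim surrounding whitespace from real text content.
        if elem.text and elem.text.strip():
            elem.text = elem.text.strip()
    # Rewrite whitespace-only text/tails with uniform indentation.
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


def diff_xml(stock_text: str, local_text: str) -> list[str]:
    """Unified diff of two XML documents after normalization."""
    stock_lines = normalize_xml(stock_text).splitlines()
    local_lines = normalize_xml(local_text).splitlines()
    return list(
        difflib.unified_diff(
            stock_lines, local_lines, fromfile="stock", tofile="local", lineterm=""
        )
    )
```

To use it, read the stock and local config files into strings, pass them to `diff_xml`, and print the returned lines. Trimming text content is a simplification that suits config files but could hide whitespace that matters in mixed-content documents.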

Re: [collection create & delete] collection It is not created after several hundred times when it is repeatedly deleted and created. Resolved after restarting the service.

2019-05-03 Thread Shawn Heisey
On 4/30/2019 1:38 AM, 유정인 wrote: 2019-04-27 21:50:32.043 ERROR (OverseerThreadFactory-1184-thread-4- processing-n:211.60.221.94:9080_) [ ] o.a.s.c.a.c.OverseerCollectionMessageHandler [processResponse:880] Error from shard: http://x.x.x.x:8080 org.apache.solr.client.solrj.SolrServerException:

Solr RuleBasedAuthorizationPlugin question

2019-05-03 Thread Jérémy
Hi, I hope that this question wasn't answered already, but I couldn't find what I was looking for in the archives. I'm having a hard time using Solr with the BasicAuth and RuleBasedAuthorization plugins. The auth part works well but I have issues with the RuleBasedAuthorization part. I'd like

Re: Reverse-engineering existing installation

2019-05-03 Thread Shawn Heisey
On 5/3/2019 1:44 PM, Erick Erickson wrote: Then git will let you check out any previous branch. 4.2 is from before we switched to Git, so I’m not sure you can go that far back, but 4x is probably close enough for comparing configs. Git has all of Lucene's history, and most of Solr's history,

Re: Solr long q values

2019-05-03 Thread Walter Underwood
512M was the default heap for Java 1.1. We never changed the default. So no size was “chosen”. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 3, 2019, at 10:11 PM, Shawn Heisey wrote: > > On 5/3/2019 1:37 PM, Erick Erickson wrote: >> We

Re: Solr long q values

2019-05-03 Thread Shawn Heisey
On 5/3/2019 1:37 PM, Erick Erickson wrote: We already do warnings for ulimits, so memory seems reasonable. Along the same vein, does starting with 512M make sense either? Feel free to raise a JIRA, but I won’t have any time to work on it…. Done.

Re: Reverse-engineering existing installation

2019-05-03 Thread Erick Erickson
Doug: You can pull any version of Solr from Git. git clone https://gitbox.apache.org/repos/asf/lucene-solr.git some_local_dir Then git will let you check out any previous branch. 4.2 is from before we switched to Git, so I’m not sure you can go that far back, but 4x is probably close enough

Re: Solr long q values

2019-05-03 Thread Erick Erickson
Shawn: We already do warnings for ulimits, so memory seems reasonable. Along the same vein, does starting with 512M make sense either? Feel free to raise a JIRA, but I won’t have any time to work on it…. > On May 3, 2019, at 3:27 PM, Walter Underwood wrote: > > We run very long queries with

Re: Solr long q values

2019-05-03 Thread Walter Underwood
We run very long queries with an 8 GB heap. 30 million documents in 8 shards with an average query length of 25 terms. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 3, 2019, at 6:49 PM, Shawn Heisey wrote: > > On 5/3/2019 2:32 AM, solrnoobie

Solr 7.7.1 issue: TemplateTransformer doesn't take the value of static template attribute value

2019-05-03 Thread Irfan Nagoo
Hi, Recently we upgraded our Solr from 5.1 to 7.7.1. Here is an example of an entity in data-config.xml to illustrate the issue we are facing:

Re: Solr long q values

2019-05-03 Thread Shawn Heisey
On 5/3/2019 2:32 AM, solrnoobie wrote: So whenever we have long q values (from a sentence to a small paragraph), we encounter some heap problems (OOM) and I guess this is normal? So my question would be is how should we handle this type of problem? Of course we could always limit the size of

Re: Reverse-engineering existing installation

2019-05-03 Thread Doug Reeder
Thanks! Alexandre's presentation is helpful in understanding what's not essential. David's suggestion of comparing config files is good - I'll have to see if I can dig up the config files for version 4.2, which we're currently running. I'll also look into updating to a supported version. I guess

Re: Search using filter query on multivalued fields

2019-05-03 Thread David Hastings
another option is to index dynamically, so you would index in this case, or this is what i would do: INGREDIENT_SALT_i:40 INGREDIENT_EGG_i:20 etc and query INGREDIENT_SALT_i:[20 TO *] or an arbitrary max value, since these are percentages INGREDIENT_SALT_i:[20 TO 100] On Fri, May 3, 2019 at
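
[Editor's sketch] The per-ingredient dynamic-field idea above can be illustrated in Python. This is a rough sketch under stated assumptions: the schema is assumed to define a `*_i` dynamic integer field, the helper names `ingredient_fields` and `range_filter` are hypothetical, and replacing spaces with underscores in names such as "CANOLA OIL" is an assumption that does not come from the thread. The sample values are the three shown in the original question.

```python
def field_name(ingredient: str) -> str:
    """Build a dynamic field name such as INGREDIENT_SALT_i (naming is an assumption)."""
    return "INGREDIENT_" + "_".join(ingredient.upper().split()) + "_i"


def ingredient_fields(names: list[str], percentages: list[int]) -> dict[str, int]:
    """Turn the parallel multivalued lists into one integer field per ingredient."""
    return {field_name(name): pct for name, pct in zip(names, percentages)}


def range_filter(ingredient: str, low: int, high: int = 100) -> str:
    """Build a filter query on one ingredient's percentage, e.g. for the fq parameter."""
    return f"{field_name(ingredient)}:[{low} TO {high}]"


doc_fields = ingredient_fields(["EGG", "CANOLA OIL", "SALT"], [20, 60, 40])
salt_filter = range_filter("SALT", 20)
```

Here `doc_fields` would be merged into the document sent for indexing, and `salt_filter` matches documents whose salt percentage is at least 20. The trade-off is one field per distinct ingredient, which may or may not be acceptable depending on how many ingredients exist.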

Re: Solr Log rotation

2019-05-03 Thread Erick Erickson
Shouldn’t be happening like this, you should have 10, approximately 10M files. Did you by any chance upgrade to a Solr that uses Log4j2 and keep the old config files? log4j2.xml should be your config if so and it has a much different format than what you’re showing. The when to purge decision

Re: Search using filter query on multivalued fields

2019-05-03 Thread Erick Erickson
There is no way to do this with the setup you describe. That is, there’s no way to say “only use the third element of a multiValued field”. What I’d do is index (perhaps in a separate field) with payloads, so you have input like SALT|20, then use some of the payload functionality to make this

Re: Unresolved dependencies (io.dropwizard.metrics) while building Solr

2019-05-03 Thread Erick Erickson
Sometimes this is leftover files in the checkout tree, occasionally it has to do with checksum files, sometimes it’s gremlins. I have never had a problem if I try some combination of: 1> clone the repo into a new directory 2> clean the ivy cache 3> ant clean-jars jar-checksums But usually I

Re: Accessing Solr collections at different ports

2019-05-03 Thread Erick Erickson
This is not true. You can run as many separate JVMs on a single physical machine as you have available ports. There’s no capability to address a Solr _collection_ in the _same_ JVM by a different port though. But what you didn’t mention is having separate collections per client. A single Solr

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Shawn Heisey
On 5/3/2019 12:52 AM, Salmaan Rashid Syed wrote: I say that the nodes are limited to 4 because when I launch Solr in cloud mode, the first prompt that I get is to choose number of nodes [1-4]. When I tried to enter 7, it says that they are more than 4 and choose a smaller number. That's the

Re: Accessing Solr collections at different ports

2019-05-03 Thread Shawn Heisey
On 5/2/2019 11:47 PM, Salmaan Rashid Syed wrote: I am using Solr 7.6 in cloud mode with external zookeeper installed at ports 2181, 2182, 2183. Currently we have only one server allocated for Solr. We are planning to move to multiple servers for better sharing, replication etc in near future.

Help extracting text from PDF images when indexing files

2019-05-03 Thread Miguel Fernandes
Hi all, I'm new to Solr, i've recently downloaded solr 8.0.0 and have been following the tutorials. Using the 2 example instances created, i'm trying to create my own collection. I've done a copy of the _default configset and used it to create my collection. For my case, the files i want to

Re: Unable to tag queries (q) in SOLR >= 7.2

2019-05-03 Thread Fredrik Rodland
Thank you for a quick response David. Your suggestion works like a charm. (And you were of course right about the query being manually edited). Regards, Fredrik > On 30 Apr 2019, at 14:48, David Smiley wrote: > > Hi Frederik, > > In your example, I think you may have typed it manually

Re: Why did Solr stats min/max values were returned as float number for field of type="pint"?

2019-05-03 Thread Wendy2
Hi Joel, Thanks for your response. Regarding your response "This syntax is bringing back correct data types", I have a pint field, the stats returned the following min/max values. "min":0.0, "max":1356.0, But I was expecting min/max values like below. Is it possible?Thanks! "min":0

SSL in Solr 7.6.0

2019-05-03 Thread dinesh naik
Hi all, I am working on securing Solr and client communication by implementing SSL for a multi-node cluster (100+). The clients are connecting to Solr via CloudSolrClient through ZooKeeper and I am looking for the best way to create the certificate for making the connection secured. for a cluster of

Unresolved dependencies (io.dropwizard.metrics) while building Solr

2019-05-03 Thread Erlend Garåsen
I'm trying to build the latest Solr release from Git, but I'm stuck at this stage: [ivy:retrieve] :: [ivy:retrieve] :: UNRESOLVED DEPENDENCIES :: [ivy:retrieve] ::

Facetting heat map, too many cells

2019-05-03 Thread Markus Jelsma
Hello, With gridlevel set to 3 I have a map of 256 x 128. However, I would really like a higher resolution, preferably twice as high. But with any gridlevel higher than 3, or distErrPct 0.1 or lower, I get the IllegalArgumentException, saying it does not want to give me a 1024x1024 sized map.

Search using filter query on multivalued fields

2019-05-03 Thread Srinivas Kashyap
Hi, I have indexed data as shown below using DIH: "INGREDIENT_NAME": [ "EGG", "CANOLA OIL", "SALT" ], "INGREDIENT_NO": [ "550", "297", "314" ], "COMPOSITION PERCENTAGE": [ 20, 60, 40

Solr long q values

2019-05-03 Thread solrnoobie
So whenever we have long q values (from a sentence to a small paragraph), we encounter some heap problems (OOM) and I guess this is normal? So my question would be is how should we handle this type of problem? Of course we could always limit the size of the search term queries in the application
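
[Editor's sketch] The application-side limit mentioned in the question could look roughly like the following. This is a deliberately naive sketch: the function name `limit_query_terms` is hypothetical, the default cap of 25 terms is arbitrary, and splitting on whitespace ignores query syntax such as quoted phrases, field prefixes and boolean operators.

```python
def limit_query_terms(q: str, max_terms: int = 25) -> str:
    """Keep at most max_terms whitespace-separated terms of a user query."""
    terms = q.split()
    # Slicing leaves shorter queries unchanged apart from whitespace cleanup.
    return " ".join(terms[:max_terms])
```

Capping the term count only bounds the size of the query. It does not address heap sizing, which is what the replies in this thread go on to discuss.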

Solr Log rotation

2019-05-03 Thread shruti suri
Hi, My log size keeps growing and it takes up most of the disk space. Please suggest how to handle this. Also, is there a way to clean up logs other than on startup? My servers aren't restarted daily, so the size keeps increasing. log4j.properties # Logging level solr.log=/var/log/solr

Sort field values by client-specified order

2019-05-03 Thread Andreas Hubold
Hi, we have a fixed number of values in a String field (up to around 100), that should be used for sorting query results. Is there some way to let the client specify the sort order as part of its query? I was thinking about using a function query. Is it possible to specify the order of

Re: Status of solR / HDFS-v3 compatibility

2019-05-03 Thread Hendrik Haddorp
We have some Solr 7.6 setups connecting to HDFS 3 clusters. So far that did not show any compatibility problems. On 02.05.19 15:37, Kevin Risden wrote: For Apache Solr 7.x or older yes - Apache Hadoop 2.x was the dependency. Apache Solr 8.0+ has Hadoop 3 compatibility with SOLR-9515. I did some

Re: problem indexing GPS metadata for video upload

2019-05-03 Thread Where is Where
Thank you very much Tim, I wonder how to make the Tika change apply to Solr? I saw Tika core, parse and xml jar files tika-core.jar tika-parsers.jar tika-xml.jar in solr contrib/extraction/lib folder. Do we just replace these files? Thanks! On Thu, May 2, 2019 at 12:16 PM Where is Where wrote:

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Jörn Franke
This is just the setup for an experimental cluster (it also generally does not make sense to have many instances on the same server). Once you have got more experience take a look at https://lucene.apache.org/solr/guide/7_7/taking-solr-to-production.html to see how to set up clusters. > Am

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Salmaan Rashid Syed
Thanks Jorn for your reply. I say that the nodes are limited to 4 because when I launch Solr in cloud mode, the first prompt that I get is to choose number of nodes [1-4]. When I tried to enter 7, it says that they are more than 4 and choose a smaller number. *Thanks and Regards,* Salmaan

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Salmaan Rashid Syed
Thanks Walter, Since I am new to Solr and by looking at your suggestion, it looks like I am trying to do something very complicated and beyond the out-of-the-box capabilities of Solr. I really don't want to do that. I am not from a Computer Science background and my specialisation is in Analytics and AI. Let me

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Jörn Franke
BTW why do you think that SolrCloud is limited to 4 nodes? More are for sure possible. > Am 03.05.2019 um 07:54 schrieb Salmaan Rashid Syed > : > > Hi Solr Users, > > I am using Solr 7.6 in cloud mode with external zookeeper installed at > ports 2181, 2182, 2183. Currently we have only one

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Jörn Franke
You can have dedicated clusters per client and/or you can protect it via Kerberos or Basic Auth or write your own authorization plugin based on OAuth. I am not sure why you want to offer this on different ports to different clients. > Am 03.05.2019 um 07:54 schrieb Salmaan Rashid Syed > : >

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Walter Underwood
The best option is to run all the collections at the same port. Intra-cluster communication cannot be split over multiple ports, so this would require big internal changes to Solr. And what about communication that does not belong to a collection, like electing an overseer node? Why do you