Re: Wiki new user

2015-05-18 Thread Erick Erickson
Sergio: Done. Best, Erick Erickson (there are never too many Erik/Eric/Erick's in the world) 2015-05-18 2:38 GMT-07:00 Sergio Velasco ser...@mitula.com: Hi Erik, I had it created. My user is SergioVelasco. Thank you. Regards. www.mitula.com Sergio Velasco | Development Dept.

Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Todd Long
I'm having some normalization issues when trying to search decimal fields (i.e. TrieDoubleField copied to TextField). 1. Wildcard searching: I created a separate TextField field type (e.g. filter_decimal) which filters whole numbers to have at least one decimal place (i.e. dot zero) using the

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Upayavira
On Mon, May 18, 2015, at 01:42 PM, Steven White wrote: Hi Everyone, With regards to Solr 5.0+, can someone point me where I can find out more on: 1) Why WAR support is deprecated and is being dropped from Solr? and 2) How do I deploy Solr 5.1 in a servlet other than Jetty, such as

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Erick Erickson
At this point you can. At this point you have a war file anyway, so... Eventually this will be unsupported, I expect that _eventually_ there'll be a change that makes this no longer possible. So my question back to you is why you have to run Solr in a servlet container? Why not just use the

Re: Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Jack Krupansky
Maybe you should first disclose the nature of the business problem you are trying to solve. To be clear, patterns and wildcards are string processing operations, not numeric operations. Usually one searches for ranges of numeric values. So, again, what operation are you really trying to perform

Re: Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Erick Erickson
This feels like an XY problem. Either you're working with numbers or you're not. It's hard for me to imagine what purpose is served by a query on numerical data that would match 2.5, 20.5, 299.5, 299.5 etc, much less regexes. That may just be my limited imagination however. You could

please confirm: pseudo join queries can only be performed on fields of exactly the same type

2015-05-18 Thread Matteo Grolla
Hi, I tried performing a join query {!join from=fA to=fB} where fA was string and fB was text using keywordTokenizer. It doesn't work, but it does if the fields are both string or both text. If you confirm this is the correct behavior I'll

Re: please confirm: pseudo join queries can only be performed on fields of exactly the same type

2015-05-18 Thread Yonik Seeley
They should not have to be *exactly* the same type... just compatible types such that the indexed tokens match. When you used the keywordTokenizer, was there other analysis such as lowercasing going on? -Yonik On Mon, May 18, 2015 at 10:26 AM, Matteo Grolla matteo.gro...@gmail.com wrote: Hi,

Re: Error while creating collection

2015-05-18 Thread Erick Erickson
What's the rest of the stack trace? The fragment you provided isn't enough to say much. There should be a "caused by" bit somewhere that should provide a more informative message. Best, Erick On Mon, May 18, 2015 at 3:24 AM, Manohar Sripada manohar...@gmail.com wrote: I am using 4.7.2 version of

NPE when faceting with MLT Query from upgrade to Solr 5.1.0

2015-05-18 Thread Tim H
Hi everyone, Recently I upgraded to Solr 5.1.0. When trying to generate facets using the more like this handler, I now get a NullPointerException. I never got this exception while using Solr 4.10.0. Details are below: Stack Trace: at

Including new filters/analyzers/tokenizers (in jars) on SolrCloud

2015-05-18 Thread Bruno René Santos
Hello, I need to use the solr.ICUFoldingFilterFactory on my SolrCloud instances and so I need to know the best way to include this jar on the classpath of each node of the SolrCloud cluster. I tried to put it in the [collection_name]/lib folder and create the collection using solr create but

RE: Wiki new user

2015-05-18 Thread Sergio Velasco
Great!!! Many Thanks. Hehehe :) www.mitula.com Sergio Velasco | Development Dept. Contact me: ser...@mitula.com | Tel. +34 917 08 21 47 | Fax +34 917 08 21 56 Follow us on: Facebook.com/mitula.es.latam | @mitula_es | Linkedin.com/mitula | Blog El

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Steven White
Thanks Shawn and Upayavira. What about the second part of my question? I guess I was not clear on it, so let me try again. Let's say I want to use Solr in a container other than Jetty, such as WebSphere or Tomcat. If the WAR will be deprecated, is it a matter of copying other Solr files into

Re: Problem with solr.LengthFilterFactory

2015-05-18 Thread Charles Sanders
Jack, Thanks for the information. If I understand this correctly, the White space tokenizer will break a single token of size 300 into two tokens, one of size 256 and the other of size 44. If this is true, then for the single test document I have used, in the index in the portal_package field,

Solr 5.0, Jetty and WAR

2015-05-18 Thread Steven White
Hi Everyone, With regards to Solr 5.0+, can someone point me where I can find out more on: 1) Why WAR support is deprecated and is being dropped from Solr? and 2) How do I deploy Solr 5.1 in a servlet other than Jetty, such as WebSphere? Thanks Steve

Re: Index Image Features in Solr

2015-05-18 Thread chalitha udara Perera
thank you Ahmet ! On Mon, May 18, 2015 at 5:45 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi Chalitha, You may find these two relevant. http://www.semanticmetadata.net/lire/ https://bitbucket.org/dermotte/liresolr Ahmet On Monday, May 18, 2015 2:17 PM, chalitha udara Perera

Re: Index Image Features in Solr

2015-05-18 Thread Ahmet Arslan
Hi Chalitha, You may find these two relevant. http://www.semanticmetadata.net/lire/ https://bitbucket.org/dermotte/liresolr Ahmet On Monday, May 18, 2015 2:17 PM, chalitha udara Perera chalithaud...@gmail.com wrote: Hi Alex, Just watched the presentation, It is a very good starting point

NPE with faceting query on MoreLikeThis handler

2015-05-18 Thread Tim Hearn
Hi everyone, Recently I upgraded to Solr 5.1.0. When trying to generate facets using the more like this handler, I now get a NullPointerException. I never got this exception while using Solr 4.10.0. Details are below: Stack Trace: at

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Shawn Heisey
On 5/18/2015 6:42 AM, Steven White wrote: With regards to Solr 5.0+, can someone point me where I can find out more on: 1) Why WAR support is deprecated and is being dropped from Solr? and 2) How do I deploy Solr 5.1 in a servlet other than Jetty, such as WebSphere? As of 5.1, and probably

Re: Problem with solr.LengthFilterFactory

2015-05-18 Thread Charles Sanders
Shawn, Sorry about my mistake. I have rerun the test with the correct fieldType definition. Still have the same results. See test information below. <field name="portal_package" type="text_std" indexed="true" stored="true" multiValued="true"/> <fieldType name="text_std" class="solr.TextField"

Re: Verify a certain Replica contains a document

2015-05-18 Thread Anshum Gupta
I just tested out what you've mentioned and see the same behavior. I think it calls for a JIRA and a fix. distrib=false shouldn't consult zk in my opinion, else it makes no sense to have that param. I'm not sure but it might just be a regression. Can you create a JIRA? I'll take it up soon if no

Re: Problem with solr.LengthFilterFactory

2015-05-18 Thread Jack Krupansky
Sorry for not spotting that earlier. Lucene itself does have such a limit. No way around it - an individual term is limited to 32K-2 bytes. Lucene is designed for searching of terms, not large blob storage. Maybe you defined that field as a string originally and later updated your schema to a

Added As Editor

2015-05-18 Thread Bill Trembley
As large users of Solr/Lucene for many of our existing sites (BoatTrader.com, GetAuto.com, ForRent.com, etc) we would like the ability to contribute to the wiki as we come across items. My current wiki login is BillTrembley (bill.tremb...@gmail.com). Thanks, Bill Trembley Director of

Re: Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Erick Erickson
No, not using SynonymFilterFactory. Rather take that as a base for a custom Filter that doesn't use any input file. Rather it would normalize any numeric tokens and inject as many variants on the spot as you desire. Best, Erick On Mon, May 18, 2015 at 9:56 AM, Todd Long lon...@gmail.com wrote:

Relevancy Scoring

2015-05-18 Thread John Blythe
Background: I'm using Solr as a mechanism for search for users, but before even getting to that point as a means of intelligent inference more or less. Product data comes in and we're hoping to match it to the correct known product without having to use the user for confirmation/search. Problem:

boost field in my schema.xml

2015-05-18 Thread Jorge Luis Betancourt González
Does a boost field in Solr have any use in the core calculation? From what I can see in [1], if a boost attribute is used at the doc/field level it may be encoded in the norm field and then used to boost the specific match in the doc/field. But I've a schema.xml with a boost field defined and using

Re: [MASSMAIL]Re: High fieldNorm values causing really odd results

2015-05-18 Thread Jorge Luis Betancourt González
From what I'm seeing, I've defined a boost field in my docs. This field is defined as float, which has the following fieldType: <fieldType name="float" class="solr.TrieFloatField" precisionStep="6"/> Is a boost field used by default to boost a document? I couldn't find any reference to this behaviour

Help/Guidance Needed : To reload kstem protword hash without full core reload

2015-05-18 Thread Aman Tandon
Hi, *Problem Statement: *I want to reload a hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking of reloading the hash by passing a parameter like *r=1 *to the analysis URL request (to somehow pass the parameter via URL). And I am thinking

Re: Problem with solr.LengthFilterFactory

2015-05-18 Thread Charles Sanders
No, the field has always been text. And from the error, it's obviously passing a very large token to the index, regardless of the tokenizer and filter. So I guess I will have to tokenize and filter the text before I send it to Solr, since Solr is not able to properly handle a very large token.
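
A minimal sketch of the client-side pre-filtering described above, assuming whitespace-separated tokens and the per-term limit of 32K-2 bytes mentioned earlier in this thread. The function and constant names are illustrative and not part of Solr or Lucene.

```python
# Per-term limit quoted in this thread: 32K - 2 bytes (an assumption taken from the discussion).
MAX_TERM_BYTES = 32 * 1024 - 2


def drop_oversized_tokens(text: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Remove whitespace-separated tokens whose UTF-8 encoding exceeds max_bytes.

    The limit applies to bytes, not characters, so each token is measured
    after encoding.
    """
    kept = [tok for tok in text.split() if len(tok.encode("utf-8")) <= max_bytes]
    return " ".join(kept)
```

Running the text through `drop_oversized_tokens` before building the update request keeps very long tokens out of the document entirely, at the cost of silently discarding them.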

Re: Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Todd Long
Essentially, we have a grid of data (i.e. frequencies, baud rates, data rates, etc.) and we allow wildcard filtering on the various columns. As the user provides input, in a specific column, we simply filter the overall data by an implicit starts with query (i.e. 23 becomes 23*). In most cases,
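
The implicit starts-with behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the field names are hypothetical, and the escape set is the commonly documented list of Lucene query-parser special characters rather than anything taken from the poster's code.

```python
# Characters treated as special by the Lucene/Solr standard query parser
# (assumed list; whitespace is escaped too so the input stays one term).
_SPECIAL = set('+-&|!(){}[]^"~*?:\\/ ')


def to_prefix_query(field: str, user_input: str) -> str:
    """Turn raw column-filter input into an implicit starts-with clause."""
    escaped = "".join("\\" + ch if ch in _SPECIAL else ch
                      for ch in user_input.strip())
    return f"{field}:{escaped}*"
```

For example, `to_prefix_query("frequency", "23")` yields `frequency:23*`, which only behaves as intended if the indexed terms are strings in the form the user types, which is the normalization question this thread is about.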

Re: please confirm: pseudo join queries can only be performed on fields of exactly the same type

2015-05-18 Thread Matteo Grolla
Thanks for the quick and precise answer. fB is lowercased, so the indexed tokens don't match. On 18 May 2015, at 16:54, Yonik Seeley wrote: They should not have to be *exactly* the same type... just compatible types such that the indexed tokens match. When you

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Shawn Heisey
On 5/18/2015 8:07 AM, Steven White wrote: Thanks Shawn and Upayavira. What about the second part of my question? I guess I was not clear on it, so let me try again. Let's say I want to use Solr in a container other than Jetty, such as WebSphere or Tomcat. If the WAR will be deprecated, is

Re: Added As Editor

2015-05-18 Thread Erik Hatcher
Bill - done. Thanks in advance for your contributions! Erik - waves from not too far away in VA too On May 18, 2015, at 12:06 PM, Bill Trembley bill.tremb...@dominionenterprises.com wrote: As large users of Solr/Lucene for many of our existing sites (BoatTrader.com,

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Steven White
Hi Erick, This isn't about deploying Solr using the WAR file, this is about a business need in which I may not be able to use Jetty that comes with Solr 5.x. My organization has issues with Jetty (some customers don't want Jetty on their boxes, but are OK with WebSphere or Tomcat) so I'm trying

Re: Solr 5.0, Jetty and WAR

2015-05-18 Thread Upayavira
On Mon, May 18, 2015, at 05:11 PM, Steven White wrote: Hi Erick, This isn't about deploying Solr using the WAR file, this is about a business need in which I may not be able to use Jetty that comes with Solr 5.x. My organization has issues with Jetty (some customers don't want Jetty on

Re: Relevancy Scoring

2015-05-18 Thread Doug Turnbull
The maxScore is 772 when I remove the description. I suppose the actual question, then, is if a low relevancy score on one field hurts the rest of them / the cumulative score, This depends a lot on how you're searching over these fields. Is this a (e)dismax query? Or a lucene query? Something

Re: Relevancy Scoring

2015-05-18 Thread Doug Turnbull
Also, I wouldn't expect at all that query-to-query you'll get comparable scores. I'm not at all surprised that suddenly you get big swings in scoring. So many parts of the scoring equation can change query to query. On Mon, May 18, 2015 at 2:18 PM, Doug Turnbull

Re: Relevancy Scoring

2015-05-18 Thread John Blythe
Hey Doug, Thanks for the quick reply. No edismax just yet. Planning on getting there, but have been trying to fine tune the 3 primary fields we use over the last week or so before jumping into edismax and its nifty toolset to help push our accuracy and precision even further (aside: is this a

Re: Relevancy Scoring

2015-05-18 Thread Doug Turnbull
You might just need some syntax help. Not sure what the Solr admin escapes, but much of the text in your query actually has reserved meaning. Also, when a term appears without a fieldName:value directly in front of it, I believe it's going to search the default field (it's no longer attached to

Re: Relevancy Scoring

2015-05-18 Thread John Blythe
Thanks again for the speediness, Doug. Good to know on some of those things, not least of all the + indicating a mandatory field and the parentheses. It seems like the escaping is pretty robust in light of the product number. I'm thinking it has to be largely related to the analyzer. Check this

Re: Relevancy Scoring

2015-05-18 Thread Doug Turnbull
Hey John, I think you likely do need to think about escaping the query operators. I doubt the Solr admin could tell the difference. For analysis, have you looked at the handy analysis tool in the Solr Admin UI? It's pretty indispensable for figuring out if an analyzed query matches an analyzed

Issue with German search

2015-05-18 Thread Shamik Bandopadhyay
Hi, I'm having an issue with searching a term in German. Here's the keyword(s) I'm trying to search -- Software und Downloads I've a document indexed in German with the same title -- Software und Downloads I'm expecting that the search on Software und Downloads will return this document,

Re: Relevancy Scoring

2015-05-18 Thread John Blythe
Doug, A couple things quickly: - I'll check in to that. How would you go about testing things, direct URL? If so, how would you compose one of the examples above? - yup, I used it extensively before testing scores to ensure that I was getting things parsed appropriately (segmenting off the unit

Re: Relevancy Scoring

2015-05-18 Thread John Blythe
Doug, very very cool tool you've made there. thanks so much for sharing! i ended up removing the shinglefilterfactory and voila! things are back in good, working order with some great matching. i'm not 100% certain as to why shingling was so ineffective. i'm guessing the stacked terms created

Re: Problem with solr.LengthFilterFactory

2015-05-18 Thread Jack Krupansky
Just for our future reference: 1. Which schema field type analyzer combination was producing the exception? 2. Can you provide the full stack trace for the exception? It shouldn't be possible to get the exception when using the white space tokenizer, due to the noted issue, so which tokenizer

Re: Relevancy Scoring

2015-05-18 Thread Doug Turnbull
Glad you figured things out and found splainer useful! Pull requests, bugs, feature requests welcome! https://github.com/o19s/splainer Doug On Monday, May 18, 2015, John Blythe j...@curvolabs.com wrote: Doug, very very cool tool you've made there. thanks so much for sharing! i ended up

Re: PermGen space OutOfMemory error when Solr is running

2015-05-18 Thread Zheng Lin Edwin Yeo
Thanks for the info. I was afraid that there's a memory leak, although so far the problem hasn't occurred since I enlarged the PermGen size. Is there any way to check the current PermGen usage and act on it before the system crashes? I've read some articles and they recommend that I can include

Re: Is it possible to search for the empty string?

2015-05-18 Thread Shawn Heisey
On 5/18/2015 7:34 PM, Walter Underwood wrote: Not out of the box. Fields are parsed into tokens and queries search on tokens. An empty string has no tokens for that field and a missing field has no tokens for that field. If you really need to do this, then you’ll need to turn the empty

Re: Index Image Features in Solr

2015-05-18 Thread Bhawna Asnani
You can look into the Lire Solr plugin. Works very well while searching similar images. Sent from my iPhone On May 18, 2015, at 9:14 AM, chalitha udara Perera chalithaud...@gmail.com wrote: thank you Ahmet ! On Mon, May 18, 2015 at 5:45 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote:

Re: Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Erick Erickson
I really have in mind an index-time filter, not necessarily a query-time Filter. So at index time you have something like 123.4000. You index synonyms 123.4 and 123 (or whatever) and now your _queries_ should just work since the forms you need are in the index already and whether it's a regex or
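
As a rough illustration of the index-time idea above, the sketch below generates the variant forms for one numeric token. It is plain Python rather than a Lucene TokenFilter, and the choice of variants (original, trailing zeros trimmed, integer part) is an assumption based on the 123.4000 example; a real filter would emit these as extra tokens at the same position.

```python
from decimal import Decimal, InvalidOperation


def numeric_variants(token: str) -> list[str]:
    """Return the forms of a numeric token to index, most specific first.

    Non-numeric or non-finite tokens are returned unchanged.
    """
    try:
        value = Decimal(token)
    except InvalidOperation:
        return [token]
    if not value.is_finite():
        return [token]

    candidates = [
        token,                             # as indexed, e.g. 123.4000
        format(value.normalize(), "f"),    # trailing zeros trimmed, e.g. 123.4
        str(int(value)),                   # integer part only, e.g. 123
    ]
    # Drop duplicates while keeping order.
    return list(dict.fromkeys(candidates))
```

With the variants already in the index, a wildcard, regex, or exact query against any of the forms can match without query-time normalization.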

Re: Help/Guidance Needed : To reload kstem protword hash without full core reload

2015-05-18 Thread Aman Tandon
Please help, or am I not clear here? With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload a hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am

Re: Is it possible to search for the empty string?

2015-05-18 Thread Shawn Heisey
On 5/18/2015 7:16 PM, Lianyi Han wrote: Would field:"" work in your case? Best, On Mon, May 18, 2015 at 8:56 PM Shawn Heisey apa...@elyograg.org wrote: Can I search for the empty string? This is distinct from searching for documents that don't have a certain field at all, which I can

Re: Is it possible to search for the empty string?

2015-05-18 Thread Lianyi Han
Sounds interesting to test it out. I only played with 5.10 recently on empty string search; it seems that *field:""* (double quotes) works, but not *field:''* (single quotes). Best, -lianyi Unity of knowing and doing On Mon, May 18, 2015 at 9:21 PM, Shawn Heisey elyog...@elyograg.org wrote:

Re: Is it possible to search for the empty string?

2015-05-18 Thread Walter Underwood
Not out of the box. Fields are parsed into tokens and queries search on tokens. An empty string has no tokens for that field and a missing field has no tokens for that field. If you really need to do this, then you’ll need to turn the empty string in a special token that means “empty string”,
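
One way to picture the special-token approach suggested above is to rewrite empty strings on the client before indexing. This is a sketch only: the sentinel value `__EMPTY__` and both helper names are hypothetical, and it assumes the field's analysis chain leaves the sentinel intact (for example a string field).

```python
# Hypothetical marker meaning "this field was present but empty".
EMPTY_SENTINEL = "__EMPTY__"


def encode_empty_strings(doc: dict, fields: list[str]) -> dict:
    """Return a copy of doc with empty-string values replaced by the sentinel.

    Fields that are absent stay absent, so "missing" and "empty" remain
    distinguishable in the index.
    """
    encoded = dict(doc)
    for name in fields:
        if encoded.get(name) == "":
            encoded[name] = EMPTY_SENTINEL
    return encoded


def empty_string_query(field: str) -> str:
    """Query clause matching documents whose field was the empty string."""
    return f"{field}:{EMPTY_SENTINEL}"
```

Documents with the field missing would still be found with the negated range clause from the original question, while `empty_string_query("title")` targets only the explicitly empty ones.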

Re: Wildcard/Regex Searching with Decimal Fields

2015-05-18 Thread Todd Long
Erick Erickson wrote: No, not using SynonymFilterFactory. Rather take that as a base for a custom Filter that doesn't use any input file. OK, I just wanted to make sure I wasn't missing something that could be done with the SynonymFilterFactory itself. At one time, I started going down this

Re: Is it possible to search for the empty string?

2015-05-18 Thread Lianyi Han
Would field:"" work in your case? Best, On Mon, May 18, 2015 at 8:56 PM Shawn Heisey apa...@elyograg.org wrote: Can I search for the empty string? This is distinct from searching for documents that don't have a certain field at all, which I can already do with a clause of *:* -field:[* TO *] in

Is it possible to search for the empty string?

2015-05-18 Thread Shawn Heisey
Can I search for the empty string? This is distinct from searching for documents that don't have a certain field at all, which I can already do with a clause of *:* -field:[* TO *] in my query. Thanks, Shawn

Index Image Features in Solr

2015-05-18 Thread chalitha udara Perera
Hi all, I'm trying to index feature vectors extracted from images for similarity search. As a simple example, I have a color histogram of an image represented as a double array. Is there a way I can index a feature vector in Solr without using hashing? Thanks, Chalitha -- J.M Chalitha Udara

RE: Wiki new user

2015-05-18 Thread Sergio Velasco
Hi Erik, I had it created. My user is SergioVelasco. Thank you. Regards. www.mitula.com Sergio Velasco | Development Dept. Contact me: ser...@mitula.com | Tel. +34 917 08 21 47 | Fax +34 917 08 21 56 Follow us on: Facebook.com/mitula.es.latam | @mitula_es |

Re: PermGen space OutOfMemory error when Solr is running

2015-05-18 Thread Tomasz Borek
The error happens either when you have too large a codebase, or when your application (Solr included) is String-intensive, or when your previous process did not terminate cleanly. Can't say for certain what Solr usage scenarios are string intensive without a deep look at its code. Usually

Re: Index Image Features in Solr

2015-05-18 Thread Alexandre Rafalovitch
Have you looked at the Solr Revolution presentation yet? https://youtu.be/WjnLhtwp678?list=PLU6n9Voqu_1FM8nmVwiWWDRtsEjlPqhgP Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 18 May 2015 at 19:04, chalitha udara Perera

Error while creating collection

2015-05-18 Thread Manohar Sripada
I am using 4.7.2 version of Solr. Cluster contains 16 VMs for Solr and 3 Zookeeper VMs. I am creating a bunch of collections (total 54) sequentially. I am getting the below error during one of the collection creation. This error is intermittent. * Could not fully createcollection:

Re: Index Image Features in Solr

2015-05-18 Thread chalitha udara Perera
Hi Alex, Just watched the presentation. It is a very good starting point for me. I am looking for a more generic solution to indexing any given feature vector. Thanks a lot ! On Mon, May 18, 2015 at 2:42 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Have you looked at the Solr