If your goal is to search prefixes only, I'd move away from the _text_ field altogether and use a "string" type. This will mean you need to
1> make it multiValued=true
2> split this up (either on your client or with a FieldMutatingUpdateProcessor, probably RegexReplaceProcessorFactory) into separate entries, i.e. 'EO.1954.53.1', 'EO.1954.53.2', 'EO.1954.53.3' becomes three separate entries in the field:
'EO.1954.53.1'
'EO.1954.53.2'
'EO.1954.53.3'
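
For the client-side option in 2>, here is a rough sketch in Python. It assumes the raw value arrives as a single comma-separated string with optional quotes around each entry; the helper names (split_inventory_numbers, build_doc) and the "id" key are made up for illustration and are not part of Solr. The field name is the one from the schema quoted below.

```python
import re
from typing import Dict, List


def split_inventory_numbers(raw: str) -> List[str]:
    """Split one comma-separated string into separate, cleaned entries."""
    parts = re.split(r"\s*,\s*", raw.strip())
    # Strip surrounding quotes/whitespace and drop empty entries.
    cleaned = [p.strip(" '\"") for p in parts]
    return [p for p in cleaned if p]


def build_doc(doc_id: str, raw: str) -> Dict[str, object]:
    """Build a document dict; the list becomes a multi-valued field in a JSON update."""
    return {
        "id": doc_id,  # assumption: the uniqueKey field is called "id"
        "genormaliseerdInventarisnummer": split_inventory_numbers(raw),
    }


if __name__ == "__main__":
    print(build_doc("doc-1", "'EO.1954.53.1', 'EO.1954.53.2', 'EO.1954.53.3'"))
```

The resulting dict can be sent to Solr with whatever client you already use; the splitting is the only part that matters here.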
At that point, searches like 'EO.1954.53.*' will work just fine.

NOTE: String types do zero analysis, so you have to handle things like casing yourself. That is, 'eO.1954.53.*' would _not_ match. If you need case-insensitive matching, you can probably use a field type built from something like KeywordTokenizerFactory + LowerCaseFilterFactory instead.

All this makes _much_ more sense if you use the admin UI>>analysis page (probably uncheck the "verbose" checkbox, there'll be less clutter).

Best,
Erick

On Fri, Mar 16, 2018 at 8:35 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
> Hi Roel,
> As mentioned, the _text_ field probably does not contain the complete “EO.1954.53.1”
> but only its parts. You can verify that using the analysis screen in the admin
> console. What you can try is searching for the phrase without a wildcard,
> “EO.1954.53”, or, if you are using WordDelimiterTokenFilter in your analysis
> chain, you can set preserveOriginal=“1” and reindex.
>
> Can you share what your text_general looks like?
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
>> On 16 Mar 2018, at 14:05, Paesen Roel <roel.pae...@africamuseum.be> wrote:
>>
>> Hi,
>>
>> Unfortunately that also gives no results (and it would not be practical, as
>> for this example the numbering only goes up to 19, but others go up into
>> the thousands, etc.)
>>
>> Anybody with a pointer on this?
>>
>> Thanks already,
>> Roel
>>
>>
>> -----Original Message-----
>> From: jagdish vasani [mailto:jagdisht.vas...@gmail.com]
>> Sent: vrijdag 16 maart 2018 12:41
>> To: solr-user@lucene.apache.org
>> Subject: Re: question regarding wildcard-searches
>>
>> Hi paesen,
>>
>> Value - EO.1954.53.1 is indexed as below
>> Eo
>> 1954
>> 53
>> 1
>> Dot is removed. Try with wildcard -?
>> Like EO.1954.53.?? if you have only 2 digits in the last part.
>>
>> I have not tried it, but you can just check it.
>> Hope it will solve your problem.
>>
>> Thanks,
>> Jagdish
>> On 16-Mar-2018 3:51 pm, "Paesen Roel" <roel.pae...@africamuseum.be> wrote:
>>
>>> Hi everybody,
>>>
>>> We are experimenting with solr, and I have a (I think) basic-level
>>> question:
>>> we have multiple fields, all copied into a generic field so we can
>>> search everything at once.
>>> However, we have a (for us) strange situation doing wildcard searches
>>> for the contents of one specific field.
>>>
>>> Given in the schema:
>>>
>>> <field name="_text_" type="text_general" indexed="true" stored="false"
>>> multiValued="true"/>
>>>
>>> <field name="genormaliseerdInventarisnummer" type="string" indexed="true"
>>> stored="true"/>
>>> <copyField source="genormaliseerdInventarisnummer" dest="_text_" />
>>> and a lot of other fields exactly like 'genormaliseerdInventarisnummer'.
>>>
>>>
>>> Now, we are certain that the field 'genormaliseerdInventarisnummer'
>>> contains entries like 'EO.1954.53.1', 'EO.1954.53.2', 'EO.1954.53.3',
>>> all the way up to '.19'. We can query these directly by passing these
>>> exact texts to the query on field '_text_' (our default search field).
>>> Problem is: wildcard searches for these don't work; 'EO.1954.53.*',
>>> for example, returns zero results.
>>>
>>> Why is that?
>>> What needs to be adjusted? (and how?)
>>>
>>> Thanks already,
>>> Roel
>>>
>>>