Re: Solr 5.0 - uniqueKey case insensitive ?
Yes, thanks, it makes sense to me now too.

Daniel, my pn is always in uppercase and I always index it in uppercase. The problem (solved now after all your answers, thanks) was the request: if users queried in lowercase, Solr returned no results, which was not good.

But now the problem is solved: I renamed the pn field to id in my source file, and in my schema I use a copy field named pn. It works perfectly.

Thanks a lot!

On 06/05/2015 09:44, Daniel Collins wrote:
> [...]
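For readers following along, the setup Bruno describes in this message might look roughly like the schema.xml fragment below. It is only a sketch under assumptions: the thread does not show the final schema, so the string_ci type name is borrowed from the original question, and its analyzer (keyword tokenizer plus lowercase filter), the schema name, and the field attributes are illustrative guesses rather than Bruno's actual configuration.

```xml
<!-- Sketch of a schema.xml fragment; names and attributes are assumptions. -->
<schema name="example" version="1.5">
  <!-- Plain, untokenized string type for the uniqueKey. -->
  <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

  <!-- Assumed case-insensitive type: the whole value is kept as one token
       and lowercased at both index and query time. -->
  <fieldType name="string_ci" class="solr.TextField"
             sortMissingLast="true" omitNorms="true">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <!-- The uniqueKey stays an exact, case-sensitive string. -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>

  <!-- Search-only copy that existing clients keep querying as "pn". -->
  <field name="pn" type="string_ci" indexed="true" stored="false"/>

  <uniqueKey>id</uniqueKey>

  <!-- Every id value is also fed into the case-insensitive pn field. -->
  <copyField source="id" dest="pn"/>
</schema>
```

With a schema along these lines, updates are matched on the exact id value, while a query on pn should match regardless of the case the user types.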
Re: Solr 5.0 - uniqueKey case insensitive ?
Ah, I remember seeing this when we first started using Solr (which was 4.0, because we needed SolrCloud). I never got around to filing an issue for it (oops!), but we have a note in our schema to leave the key field as a normal string (like Bruno, we had tried to lowercase it, which failed).

We didn't really know Solr in those days, and hadn't really thought about it since then, but Hoss' and Erick's explanations make perfect sense now!

Since shard routing is (basically) done on hashes of the unique key, if I have two documents which are the "same" but have the values "HELLO" and "hello", they might well hash to completely different shards, so the update logistics would be horrible.

Bruno, why do you need to lowercase at all, then? You said in your example that your client application always supplies "pn" and that it is always uppercase, so presumably all adds/updates could be done directly on that field (as a normal string with no lowercasing). Where does the case insensitivity come in? Is that only for searching? If so, couldn't you add a search field (called id) and update your app to search using that (or make that your default search field; I guess it depends on whether your calling app explicitly uses the pn field name in its searches)?

On 6 May 2015 at 01:55, Erick Erickson wrote:
> [...]
Re: Solr 5.0 - uniqueKey case insensitive ?
Well, "working fine" may be a bit of an overstatement. That has never been officially supported, so it "just happened" to work in 3.6.

As Chris points out, if you're using SolrCloud then this will _not_ work, as routing happens early in the process, i.e. before the analysis chain gets the token, so various copies of the doc will exist on different shards.

Best,
Erick

On Mon, May 4, 2015 at 4:19 PM, Bruno Mannina wrote:
> [...]
Re: Solr 5.0 - uniqueKey case insensitive ?
Hello Chris,

Yes, I confirm that on my Solr 3.6 it has worked fine for several years: each doc added with the same code is updated, not added.

To be clearer, I receive docs with a field named "pn", which is the uniqueKey, and it is always in uppercase.

So I must define in my schema.xml:

    [...] required="true" stored="true"/>
    [...] indexed="true" stored="false"/>
    ...
    <uniqueKey>id</uniqueKey>
    ...

But the application that uses Solr already exists, so it queries the pn field, not id, and I cannot change that. Also, the docs I receive have no id field, just a pn field, and I cannot change that either.

So there is a problem, no? I must import an id field and query a pn field, but I only have a pn field at import...

On 05/05/2015 01:00, Chris Hostetter wrote:
> [...]
Re: Solr 5.0 - uniqueKey case insensitive ?
: On SOLR3.6, I defined a string_ci field like this:
: [...]

I'm really surprised that field would have worked for you (reliably) as a uniqueKey field even in Solr 3.6.

The best practice for something like what you describe has always (going back to Solr 1.x) been to use a copyField to create a case-insensitive copy of your uniqueKey for searching.

If, for some reason, you really want case-insensitive *updates* (so a doc with id "foo" overwrites a doc with id "FOO"), then the only reliable way to make something like that work is to do the lowercasing in an UpdateProcessor to ensure it happens *before* the docs are distributed to the correct shard, and so the correct existing doc is overwritten (even if you aren't using SolrCloud).

-Hoss
http://www.lucidworks.com/
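The UpdateProcessor approach Hoss mentions could be wired up roughly as follows in solrconfig.xml. This is a hedged sketch, not a tested configuration: the chain name and the script file lowercase-id.js are hypothetical, and the script itself (which would have to replace the uniqueKey value with its lowercase form when a document is added) is not shown. The point of the ordering is that the lowercasing step sits before the distributed step, so it would run before documents are routed to a shard.

```xml
<!-- Sketch of a solrconfig.xml fragment; chain and script names are hypothetical. -->
<updateRequestProcessorChain name="lowercase-id" default="true">
  <!-- Runs first: a script (not shown) that would lowercase the uniqueKey value. -->
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">lowercase-id.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- In SolrCloud, routing to the target shard happens at this step. -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <!-- Applies the update to the local index. -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Combined with a plain string uniqueKey, a chain like this would make "foo" and "FOO" resolve to the same stored id, at the cost of losing the original casing in that field.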
Solr 5.0 - uniqueKey case insensitive ?
Dear Solr users,

I have a problem with Solr 5.0 (that I did not have on Solr 3.6).

What kind of field can I use for my uniqueKey field named "code" if I want it to be case insensitive?

On Solr 3.6, I defined a string_ci field like this: [...] and it works fine:
- If I add a document with the same code, then the doc is updated.
- If I search for a document in lower or upper case, the doc is found.

But in Solr 5.0, if I use this definition, then:
- I can search in lower/upper case, that's OK.
- BUT if I add a doc with the same code, then the doc is added, not updated!?

I read that the problem could be that the field type is tokenized instead of being a string.

If I change from string_ci to string, then:
- I lose the ability to search in lower/upper case,
- but updating the doc works fine.

So, could you help me find the right field type to:
- search case-insensitively,
- update the old doc if I add a document with the same code?

Thanks a lot!