Hi Erick/Jack,

I agree that "Your code is violating the contract for the UUID update
processor", so the index could be in a bad state. I have already put the fix
in place, and no further action is needed. I was just curious about the
resulting behavior.

For completeness, here are my results.
I indexed 2 docs with these fields, using the Documents tab of the Solr admin console:


doc1: {"id":""}
doc2: {"id":""}

matchAllDocs query
q=*:*&sort=id+desc
"numFound": 2, "start": 0, "docs": [
  {"id": "9542901e-ede3-46dc-af6c-c30025c7b417"},
  {"id": "f29fcb97-ef5e-4c3e-b4fe-f50a963f894d"} ]

q=*:*&sort=id+asc - no change in order
"numFound": 2, "start": 0, "docs": [
  {"id": "9542901e-ede3-46dc-af6c-c30025c7b417"},
  {"id": "f29fcb97-ef5e-4c3e-b4fe-f50a963f894d"} ]

doc1: {"id":"whatever"}
got error:
"error": { "msg": "Invalid UUID String: 'whatever'",

doc1: {"_version_":-1} - id field is omitted, but at least one field is needed
to index
doc2: {"_version_":-1}

matchAllDocs query
q=*:*&sort=id+desc
"numFound": 2, "start": 0, "docs": [
  {"id": "c4e19489-fad1-42f4-b216-88ba550f3d16"},
  {"id": "99d652b8-3eb6-4a9f-a722-33246e8553d4"} ]

q=*:*&sort=id+asc - works
"numFound": 2, "start": 0, "docs": [
  {"id": "99d652b8-3eb6-4a9f-a722-33246e8553d4"},
  {"id": "c4e19489-fad1-42f4-b216-88ba550f3d16"} ]
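
P.S. For the archives, here is a rough sketch of the server-side alternative
Hoss suggested: put RemoveBlankFieldUpdateProcessorFactory (optionally
preceded by TrimFieldUpdateProcessorFactory) ahead of the UUID processor, so
that an "id" of "" is dropped and the field looks absent by the time the UUID
processor runs. It reuses the chain name "uuid" and the field name "id" from
the original solrconfig.xml. Treat it as an untested sketch: it assumes both
extra processors are left at their defaults, which as far as I know apply to
all fields in the document.

```xml
<!--solrconfig.xml (sketch, not tested against 5.3.x)-->
<updateRequestProcessorChain name="uuid">
    <!-- optional: trim values first, so whitespace-only strings become "" -->
    <processor class="solr.TrimFieldUpdateProcessorFactory" />
    <!-- remove fields whose value is an empty string, so "id" looks absent -->
    <processor class="solr.RemoveBlankFieldUpdateProcessorFactory" />
    <!-- generates a UUID for "id" only when the field is not present -->
    <processor class="solr.UUIDUpdateProcessorFactory">
      <str name="fieldName">id</str>
    </processor>
    <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The order matters here: the blank-field remover has to sit before the UUID
processor, otherwise the empty string is still seen as an explicit value.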

On Sat, Apr 16, 2016 at 8:01 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> I did a quick experiment (admittedly with 5.x, but even if this is a
> bug it won't be back-ported to 5.3) and this works exactly as I
> expect. I have three docs with IDs as follows
> doc1: <field name="id"></field>. This is equivalent to your ""
> doc2: <field name="id">whatever</field>
> doc3:
>
> As expected, when the output comes back doc1 has an empty field, doc2
> has "whatever" and doc3 has a newly-generated uuid that happens to
> start with "f".
>
> Adding &sort=id asc returns:
> doc1: (empty string)
> doc3: fblahblah
> doc2: whatever
>
> Adding &sort=id desc returns
> doc2: whatever
> doc3: fblahblah
> doc1:(empty string)
>
> So for about the third time, "what do you mean by 'doesn't work'?"
> Provide simple example data (just how you specify the "id" field is
> sufficient). Provide the requests you're using. Point out what's not
> as you expect.
>
> You might want to review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best,
> Erick
>
> On Sat, Apr 16, 2016 at 9:54 AM, Jack Krupansky
> <jack.krupan...@gmail.com> wrote:
> > Remove that line of code from your client, or... add the remove blank
> field
> > update processor as Hoss suggested. Your code is violating the contract
> for
> > the UUID update processor. An empty string is still a value, and the
> > presence of a value is an explicit trigger to suppress the UUID update
> > processor.
> >
> > -- Jack Krupansky
> >
> > On Sat, Apr 16, 2016 at 12:41 PM, Susmit Shukla <shukla.sus...@gmail.com
> >
> > wrote:
> >
> >> I am seeing the UUID getting generated when I set the field as empty
> string
> >> like this - solrDoc.addField("id", ""); with solr 5.3.1 and based on the
> >> above schema.
> >> The resulting documents in the index are searchable but not sortable.
> >> Someone could verify if this bug exists and file a jira.
> >>
> >> Thanks,
> >> Susmit
> >>
> >>
> >>
> >> On Sat, Apr 16, 2016 at 8:56 AM, Jack Krupansky <
> jack.krupan...@gmail.com>
> >> wrote:
> >>
> >> > "UUID processor factory is generating uuid even if it is empty."
> >> >
> >> > The processor will generate the UUID only if the id field is not
> >> specified
> >> > in the input document. Empty value and value not present are not the
> same
> >> > thing.
> >> >
> >> > So, please clarify your specific situation.
> >> >
> >> >
> >> > -- Jack Krupansky
> >> >
> >> > On Thu, Apr 14, 2016 at 7:20 PM, Susmit Shukla <
> shukla.sus...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Chris/Erick,
> >> > >
> >> > > Does not work in the sense the order of documents does not change on
> >> > > changing sort from asc to desc.
> >> > > This could be just a trivial bug where UUID processor factory is
> >> > generating
> >> > > uuid even if it is empty.
> >> > > This is on solr 5.3.0
> >> > >
> >> > > Thanks,
> >> > > Susmit
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Thu, Apr 14, 2016 at 2:30 PM, Chris Hostetter <
> >> > hossman_luc...@fucit.org
> >> > > >
> >> > > wrote:
> >> > >
> >> > > >
> >> > > > I'm also confused by what exactly you mean by "doesn't work" but a
> >> > > general
> >> > > > suggestion you can try is putting the
> >> > > > RemoveBlankFieldUpdateProcessorFactory before your UUID
> Processor...
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://lucene.apache.org/solr/6_0_0/solr-core/org/apache/solr/update/processor/RemoveBlankFieldUpdateProcessorFactory.html
> >> > > >
> >> > > > If you are also worried about strings that aren't exactly empty,
> but
> >> > > > consist only of whitespace, you can put
> >> TrimFieldUpdateProcessorFactory
> >> > > > before RemoveBlankFieldUpdateProcessorFactory ...
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://lucene.apache.org/solr/6_0_0/solr-core/org/apache/solr/update/processor/TrimFieldUpdateProcessorFactory.html
> >> > > >
> >> > > >
> >> > > > : Date: Thu, 14 Apr 2016 12:30:24 -0700
> >> > > > : From: Erick Erickson <erickerick...@gmail.com>
> >> > > > : Reply-To: solr-user@lucene.apache.org
> >> > > > : To: solr-user <solr-user@lucene.apache.org>
> >> > > > : Subject: Re: UUID processor handling of empty string
> >> > > > :
> >> > > > : What do you mean "doesn't work"? An empty string is
> >> > > > : different than not being present. The UUID update
> >> > > > : processor (I'm pretty sure) only adds a field if it
> >> > > > : is _absent_. Specifying it as an empty string
> >> > > > : fails that test so no value is added.
> >> > > > :
> >> > > > : At that point, if this uuid field is also the <uniqueKey>,
> >> > > > : then each doc that comes in with an empty field will replace
> >> > > > : the others.
> >> > > > :
> >> > > > : If it's _not_ the <uniqueKey>, the sorting will be confusing.
> >> > > > : All the empty string fields are equal, so the tiebreaker is
> >> > > > : the internal Lucene doc ID, which may change as merges
> >> > > > : happen. You can specify secondary sort fields to make the
> >> > > > : sort predictable (the <uniqueKey> field is popular for this).
> >> > > > :
> >> > > > : Best,
> >> > > > : Erick
> >> > > > :
> >> > > > : On Thu, Apr 14, 2016 at 12:18 PM, Susmit Shukla <
> >> > > shukla.sus...@gmail.com>
> >> > > > wrote:
> >> > > > : > Hi,
> >> > > > : >
> >> > > > : > I have configured solr schema to generate unique id for a
> >> > collection
> >> > > > using
> >> > > > : > UUIDUpdateProcessorFactory
> >> > > > : >
> >> > > > : > I am seeing a peculiar behavior - if the unique 'id' field is
> >> > > > explicitly
> >> > > > : > set as empty string in the SolrInputDocument, the document
> gets
> >> > > indexed
> >> > > > : > with UUID update processor generating the id.
> >> > > > : > However, sorting does not work if uuid was generated in this
> way.
> >> > > Also
> >> > > > : > cursor functionality that depends on unique id sort also does
> not
> >> > > work.
> >> > > > : > I guess the correct behavior would be to fail the indexing if
> >> user
> >> > > > provides
> >> > > > : > an empty string for a uuid field.
> >> > > > : >
> >> > > > : > The issues do not happen if I omit the id field from the
> >> > > > SolrInputDocument .
> >> > > > : >
> >> > > > : > SolrInputDocument
> >> > > > : >
> >> > > > : > solrDoc.addField("id", "");
> >> > > > : >
> >> > > > : > ...
> >> > > > : >
> >> > > > : > I am using schema similar to below-
> >> > > > : >
> >> > > > : > <!--schema.xml-->
> >> > > > : >
> >> > > > : > <fieldType name="uuid" class="solr.UUIDField" indexed="true"
> />
> >> > > > : >
> >> > > > : > <field name="id" type="uuid" indexed="true" stored="true"
> >> > > > required="true" />
> >> > > > : >
> >> > > > : > <uniqueKey>id</uniqueKey>
> >> > > > : >
> >> > > > : > <!--solrconfig.xml-->
> >> > > > : > <updateRequestProcessorChain name="uuid">
> >> > > > : >     <processor class="solr.UUIDUpdateProcessorFactory">
> >> > > > : >       <str name="fieldName">id</str>
> >> > > > : >     </processor>
> >> > > > : >     <processor class="solr.RunUpdateProcessorFactory" />
> >> > > > : > </updateRequestProcessorChain>
> >> > > > : >
> >> > > > : >
> >> > > > : >  <requestHandler name="/update"
> >> class="solr.UpdateRequestHandler">
> >> > > > : >        <lst name="defaults">
> >> > > > : >          <str name="update.chain">uuid</str>
> >> > > > : >        </lst>
> >> > > > : > </requestHandler>
> >> > > > : >
> >> > > > : >
> >> > > > : > Thanks,
> >> > > > : > Susmit
> >> > > > :
> >> > > >
> >> > > > -Hoss
> >> > > > http://www.lucidworks.com/
> >> > > >
> >> > >
> >> >
> >>
>
