Yes, the /export handler returns zero for numeric fields that aren't
present. String fields, though, should come back empty when not present.

We'll want to keep the zero while sorting in the /export handler, but
removing the zero when outputting the field should be OK. We'll just need to
add test cases that cover the numeric nulls. The RollupStream is one
place that might have a problem with this.
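
To make the kind of case those tests need to cover concrete, here is a
minimal sketch in plain Python (not Solr code; the "price" field, the sample
documents, and the facet_counts helper are all made up for illustration). It
shows how zero-filling a missing numeric field inflates the zero bucket,
while omitting the field keeps the two cases apart:

```python
from collections import Counter

def facet_counts(tuples, field):
    """Count values of `field`; tuples lacking the field land in the None bucket."""
    return Counter(t.get(field) for t in tuples)

# Hypothetical documents: ids 2 and 4 have no "price" field at all.
docs = [{"id": 1, "price": 0}, {"id": 2}, {"id": 3, "price": 5}, {"id": 4}]

# Behavior described in this thread: a missing numeric field is emitted as 0.
zero_filled = [{**d, "price": d.get("price", 0)} for d in docs]

with_zero = facet_counts(zero_filled, "price")  # zero bucket absorbs the missing docs
with_omit = facet_counts(docs, "price")         # missing docs counted separately
```

With the zero-filled tuples, a client cannot tell a real 0 from a missing
value; with the field omitted, it can.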

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, May 27, 2016 at 4:52 AM, Dennis Gove <dpg...@gmail.com> wrote:

> Is this true for non-numeric fields as well? I agree that this seems like
> a very bad thing.
>
> I can't imagine that a fix would cause a problem with Streaming
> Expressions, ParallelSQL, or the others, given that the /select handler
> is not returning 0 for these missing fields (the /select handler is the
> default handler for the Streaming API, so if nulls were a problem I
> imagine we'd have already seen it).
>
> That said, within Streaming Expressions there is a select(...) function
> which supports a replace(...) operation which allows you to replace one
> value (or null) with some other value. If a 0 were necessary one could use
> a select(...) to replace null with 0 using an expression like this:
>    select(<stream>, replace(fieldA, null, withValue=0)).
> The end result of that would be that the field fieldA would never have a
> null value and for all tuples where a null value existed it would be
> replaced with 0.
>
> Details on the select function can be found at
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61330338#StreamingExpressions-select
> .
>
> - Dennis
>
> On Thu, May 26, 2016 at 11:35 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> This seems to me to be A Bad Thing. Zero is different from not
>> existing. And let's claim that I want to process a stream and, say,
>> facet on an integer field over the result set. There's no way on the
>> client side to distinguish between a document that has a zero in the
>> field and one that didn't have the field in the first place so I'll
>> over-count the zero bucket.
>>
>> So before I raise a JIRA, my question is whether this is expected
>> behavior or not? I've found a mechanism that _shouldn't_ be very
>> expensive to omit the field if it doesn't exist in the returned
>> tuples.
>>
>> Now, how badly this would break Streaming Expressions, ParallelSQL and
>> the like I haven't looked into yet.
>>
>> So before I work up a trial patch am I going off in the weeds?
>>
>> Best,
>> Erick
>>
