Unifying encode/decode to a single class makes sense to me.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Sep 4, 2020 at 11:12 AM Gus Heck <[email protected]> wrote:

> In reviewing SOLR-14787 <https://issues.apache.org/jira/browse/SOLR-14787> ,
> with the author verbally, he pointed out there's some duplicated code where
> he needed to decode a payload into a float. We
> have org.apache.lucene.analysis.payloads.PayloadHelper#decodeFloat(byte[],
> int)  but lucene/queries doesn't have lucene/analysis-common on the
> classpath and he found no better option than to essentially cut and paste
> those methods into his code. Neither of us likes that solution very much
> because code duplication is not cool, so I went fishing for other options
> or examples in the code base (on 8.x) and ran into
> PayloadDecoder.FLOAT_DECODER... which looks quite broken, or at least
> wildly different:
>
> PayloadDecoder FLOAT_DECODER = bytes -> bytes == null ? 1 :
> bytes.bytes[bytes.offset];
>
> this coerces a single byte to a float, but analysis common is encoding via
> PayloadHelper like this:
>
>   public static byte[] encodeFloat(float payload, byte[] data, int offset){
>     return encodeInt(Float.floatToIntBits(payload), data, offset);
>   }
>
> Which can clearly write more than one byte... and decoding via
>
>   public static final float decodeFloat(byte [] bytes, int offset){
>     return Float.intBitsToFloat(decodeInt(bytes, offset));
>   }
>
>   public static final int decodeInt(byte [] bytes, int offset){
>     return ((bytes[offset] & 0xFF) << 24) | ((bytes[offset + 1] & 0xFF) <<
> 16)
>          | ((bytes[offset + 2] & 0xFF) <<  8) |  (bytes[offset + 3] &
> 0xFF);
>   }
>
> Which is clearly expecting 4 bytes... and nothing like FLOAT_DECODER
>
> The scary thing is PayloadDecoder.FLOAT_DECODER seems to be used in
> BoostingTermBuilder... and a whole bunch of unit tests, which I presume are
> passing on coincidence of handling floats that happen to qualify as small
> value integers but I haven't tested that theory yet.
>
> PayloadDecoder.FLOAT_DECODER has been around for a long time, so I thought
> I'd solicit opinions before attempting to clean it up. My concept of a
> cleanup here would include trying to find a home for the float decoding
> logic (from PayloadHelper) that can be seen by both areas and unifying
> encode/decode for float payloads to a single location, and also add a unit
> test that round trips the encode/decode cycle (which I'm not seeing).
>
> -Gus
>
>
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>

Reply via email to