Seems reasonable to me. +1
On 12/17/19 2:36 PM, Sloane, Brandon wrote:
> This is the option I think I am settling on.
>
> SBT allows us to provide a seperate implementation of a file based on
> compiler
> version, so we can incur the related overhead on only 2.11 builds.
>
> The only caveat is that in 2.11, we would need to incur a cost on every call
> to
> getAnyRef, as that function needs to change to:
>
> @inline def getAnyRef = {
> if(v.isInstanceOf[BoxedByteArray]){
> v.asInstanceOf[BoxedByteArray].v
> }else{
> v.asInstanceOf[AnyRef]
> }
>
> We could avoid this by going through all of our code that expects a ByteArray
> and handling BoxedByteArray, but I would like to keep the hackiness contained.
>
> Since we can contain this to only 2.11 builds, I think the performance hit
> would
> be acceptable.
>
> This would still impose a bit of a maintance burden, as we now have 2
> versions
> of the file to maintain, but hopefully this is not a file that requires much
> modifications.
> --------------------------------------------------------------------------------
> *From:* Steve Lawrence <slawre...@apache.org>
> *Sent:* Tuesday, December 17, 2019 2:11 PM
> *To:* Sloane, Brandon <bslo...@tresys.com>; dev@daffodil.apache.org
> <dev@daffodil.apache.org>
> *Subject:* Re: DataValue compiler bug under 2.11
> So this seems a bit hacky, but doing some simple testing I think
> wrapping the byte array works, e.g.:
>
> class WrappedByteArray(val ba: Array[Byte])
>
>
> object DataValue {
> ...
>
> type DataValueByteArray = DataValue[WrappedByteArray, ...]
>
> @inline
> implicit def toDataValue(v: Array[Byte]): DataValueByteArray =
> {
> val w = new WrappedByteArray(v)
> new DataValueByteArray(w)
> }
>
> ...
>
> }
>
> class DataValue ... {
>
> @inline
> def getByteArray = v.asInstanceOf[WrappedByteArray].array
>
> }
>
> We're now effectively boxing byte arrays, but those are only used for
> hexBinary data, which aren't all that common so might not be too big of
> a deal. The byte code looks like this:
>
> public WrappedByteArray toDataValue(byte[]);
> Code:
> 0: new #20 // class WrappedByteArray
> 3: dup
> 4: aload_1
> 5: invokespecial #23 // Method WrappedByteArray."<init>":([B)V
> 8: astore_2
> 9: aload_2
> 10: areturn
>
> Which is a bit more overhead, but is at least what we would actually
> expect, and it works on scala 2.11.
>
> The other option is probably to drop support for 2.11. The only use case
> I know of needing 2.11 is for using Daffodil in Apache Spark, which now
> has experimental support for 2.12. So it's not unreasonable to drop
> support, but would be nice to maintain, IMO.
>
> - Steve
>
> On 12/17/19 1:40 PM, Sloane, Brandon wrote:
>> That certainly looks like what I am seeing.
>>
>> If I am reading that issue correctly, it looks like it was fixed in
>> 2.12.0-M1.
>> --------------------------------------------------------------------------------
>> *From:* Steve Lawrence <slawre...@apache.org>
>> *Sent:* Tuesday, December 17, 2019 1:32 PM
>> *To:* dev@daffodil.apache.org <dev@daffodil.apache.org>
>> *Subject:* Re: DataValue compiler bug under 2.11
>> I believe this is scala bug 7521:
>>
>> https://github.com/scala/bug/issues/7521
>>
>> Skimming the comments and other referenced issues, I don't see anyone
>> mentioning a workaround. Appears the fix was added in 2.12, which makes
>> sense based on the behavior you're seeing.
>>
>> On 12/17/19 1:19 PM, Sloane, Brandon wrote:
>>> While debugging my latest DataValue PR under 2.11, I came across what I
>>> believe to be a compiler bug.
>>>
>>> Recall that we have source code along the lines of:
>>>
>>> ...
>>> @inline implicit def toDataValue(v: DFDLTime): DataValueTime = new
>>> DataValue(v)
>>> @inline implicit def toDataValue(v: Array[Byte]): DataValueByteArray =
>>> new DataValue(v)
>>> @inline implicit def toDataValue(v: JBoolean): DataValueBool = new
>>> DataValue(v)
>>> @inline implicit def toDataValue(v: JNumber): DataValueNumber = new
>>> DataValue(v)
>>> ...
>>>
>>> Under 2.11, this gets compiled to:
>>>
>>> ...
>>> public org.apache.daffodil.calendar.DFDLDate
>>> toDataValue(org.apache.daffodil.calendar.DFDLDate);
>>> Code:
>>> 0: aload_1
>>> 1: areturn
>>>
>>> public org.apache.daffodil.calendar.DFDLTime
>>> toDataValue(org.apache.daffodil.calendar.DFDLTime);
>>> Code:
>>> 0: aload_1
>>> 1: areturn
>>>
>>> public byte[] toDataValue(byte[]);
>>> Code:
>>> 0: getstatic #35 // Field
>>> scala/runtime/ScalaRunTime$.MODULE$:Lscala/runtime/ScalaRunTime$;
>>> 3: aload_1
>>> 4: invokevirtual #39 // Method
>>> scala/runtime/ScalaRunTime$.toObjectArray:(Ljava/lang/Object;)[Ljava/lang/Object;
>>> 7: checkcast #41 // class "[Ljava/lang/Byte;"
>>> 10: areturn
>>>
>>> public java.lang.Boolean toDataValue(java.lang.Boolean);
>>> Code:
>>> 0: aload_1
>>> 1: areturn
>>>
>>> public java.lang.Number toDataValue(java.lang.Number);
>>> Code:
>>> 0: aload_1
>>> 1: areturn
>>> ...
>>>
>>> Note that for every instant accept byte[], the function compiles to a
>>> simple identity function.
>>>
>>> When compiled under 2.12, the byte[] instance gets compiled to identity as
>>> well:
>>>
>>> public byte[] toDataValue(byte[]);
>>> Code:
>>> 0: aload_1
>>> 1: areturn
>>>
>>> The 2.11 version is not just inefficient; it is also wrong. It first
>>> converts to an array of Objects (which implies boxing), before attempting
>>> to cast said array to an array of (boxed) Bytes. This case fails , as
>>> Object[] does not extends Byte[].
>>>
>>> Even if this cast were to succeed, the function would then be attempting to
>>> return an array of boxed java.lang.Byte, while its signature indicates it
>>> should be returning an array of unboxed bytes.
>>>
>>> (As an asside, at the callpoint I looked at in 2.12 optimizes this function
>>> out entirely; the 2.11 version makes a standard function call to
>>> toDataValue)
>>>
>>> I'm at a bit of a loss as to how to proceed. This bug seems fatal to the
>>> idea of having an AnyVal DataValue class, and we do not have much
>>> wiggle-room to tweak our source code to avoid triggering it.
>>>
>>> Thoughts?
>>>
>>>
>>> Brandon T. Sloane
>>>
>>> Associate, Services
>>>
>>> bslo...@tresys.com | tresys.com
>>>
>>
>