Re: [DISCUSS] Concern regarding ColumnType.getJavaEquivalentClass()

Kasper Sørensen Sun, 06 Apr 2014 02:36:23 -0700

I was going through our issue list, and realized that changing ColumnType
to an interface instead of a enum would also have other benefits. For
example, it would make solving this issue a lot easier:


https://issues.apache.org/jira/browse/METAMODEL-8 -  Further details added
to ColumnType



2014-04-04 7:52 GMT+02:00 Kasper Sørensen <[email protected]>:
>
> I did post my patch just now, which converts ColumnType to an interface:
https://reviews.apache.org/r/20028/
>
> Would be interesting if someone can check it out and give it some
thought. Not a trivial change for backwards compatibility I think, but
might be the best from an API standpoint.
>
>
> 2014-04-01 20:15 GMT+02:00 Kasper Sørensen <[email protected]
>:
>
>> Hi Ankit,
>>
>> I think it is fair to assume that a CLOB will always be convertible to a
String - if memory on the client permits at least. Normally a java.sql.Clob
provides a getCharacterStream() method which returns a Reader. By setting
the system property 'metamodel.jdbc.convert.lobs' to "true" we offer the
service of eagerly reading that Reader to the end and returning a String.
The good thing about that is that the client avoids issues of the DataSet
is closed or even sometimes just moving the cursor to the next record may
invalidate the Reader. In other words: With the LOB conversion we offer a
way to make the MetaModel Row objects useable even though the connection to
the statement/resultset/connection is closed. It's of course optional.
>>
>> The same idea is applied to BLOBs. We offer to automatically open the
java.sql.Blob InputStream and load it into a byte-array.
>>
>> In my opinion the fact that something is a CLOB or a VARCHAR is a
different type of information than whether it's values are represented as
java.sql.Clob or String objects. The information on the column type is
valuable in it's own right and I would not want us to "make believe" that a
column which is really a CLOB is shown as a VARCHAR.
>>
>> Now that we're on the topic, we might even also consider if using the
JDBC data type names is even appropriate as a default behaviour. I mean,
we're acting in the Java world and a type like VARCHAR or TINYINT etc. is
only used in SQL, not in native Java apps. Other types of datastores have
different data types and I guess many of the datastores only have simple
data types like String, Integer, Date etc. We could choose to go for a
non-enum solution for the ColumnType and thereby also provide more
flexibility. In the CSV module we could thus provide just a single
ColumnType, e.g.:
>>
>>   public static final ColumnType CSV_COLUMN_TYPE = new
ColumnTypeImpl("String", String.class);
>>
>> Read the example constructor arguments as this: Name would be "String"
(or maybe "Text" is more appropriate for "text files"?) and expected Java
class is String.class
>>
>> In the JDBC module we could have other column types available, e.g.:
>>
>>   public static final ColumnType VARCHAR = new ColumnTypeImpl("VARCHAR",
String.class);
>>   public static final ColumnType CLOB = new ColumnTypeImpl("CLOB",
java.sql.Clob.class);
>>   public static final ColumnType CLOB_CONVERTED = new
ColumnTypeImpl("CLOB", String.class);
>>   public static final ColumnType BLOB = new ColumnTypeImpl("BLOB",
java.sql.Blob.class);
>>   public static final ColumnType BLOB_CONVERTED = new
ColumnTypeImpl("BLOB", byte[].class);
>>
>> (and there you then have my suggested solution for solving the Clob
issue - to use a String for the name and a Class for the expected Java type)
>>
>> I can try to make a patch for this solution. That would sort of validate
if it will work or not. But first ... what do you guys think?
>>
>> Only pitfall I see is that it breaks backwards compatibility. I think we
can retain most of the compile-time compatibility, but not runtime
compatibility (as in "drop in a new jar without recompile"). Also, old
objects serialized with the old enums would not be properly deserialized
into using these constants. Maybe that can be solved in the
LegacyDeserializationObjectInputStream [1] somehow.
>>
>> Best regards,
>> Kasper
>>
>>
>> [1] Described here:
http://wiki.apache.org/metamodel/MigratingFromEobjectsMetaModel
>>
>>
>> 2014-04-01 10:28 GMT+02:00 Ankit Kumar <[email protected]>:
>>
>>> Hi Kasper,
>>>
>>> Thanks for initiating this discussion as we are hit by this in MM right
now.
>>>
>>> Thinking just a bit from a client of MM perspective, I guess the CLOB
could
>>> contain a Object/File/Image/???/Large String text, so while getting
CLOB's
>>> from the real database we still can check the content and if it smells
more
>>> like a String then we return the VARCHAR otherwise an Object. May be
this
>>> is not the neatest solution but I guess for a client this provides the
>>> flexibility needed.
>>>
>>> An additional point worth mentioning with respect to the above remark,
>>> playing with different databases we see some databases allow large text
to
>>> be stored in VARCHAR kind of fields whereas some are still having
limits of
>>> 4000 characters like Oracle. So when using MM as the abstraction layer
>>> above the databases this feature might be quite handy for a client.
>>>
>>> Excuse me if I sound a bit too biased here.
>>>
>>> Regards
>>> Ankit
>>>
>>>
>>> On Tue, Mar 25, 2014 at 3:54 PM, Kasper Sørensen <
>>> [email protected]> wrote:
>>>
>>> > Hi all,
>>> >
>>> > I just came across a potential issue in MetaModel's design and want to
>>> > share it and maybe start thinking of ways to fix or work around it...
>>> >
>>> > In our ColumnType enum we have the method getJavaEquivalentClass()
which is
>>> > used to tell which java type to expect when querying a particular
column.
>>> > For instance, of I query a VARCHAR column, I can expect a
java.lang.String
>>> > value.
>>> >
>>> > Now the trouble is that in our JDBC module we have a system property
which
>>> > allows for eager loading of BLOBs and CLOBs so that they are
automatically
>>> > read into byte-arrays and Strings respectively. This is a great
utility
>>> > because otherwise the user has to do a lot of tedious work with
>>> > inputstreams etc which in most cases isn't particularly useful - in
most
>>> > cases you just want the byte[ ] or String.
>>> >
>>> > Now the trouble is that if you turn that system property on, you get
>>> > Strings or byte-arrays but the column type is still CLOB/BLOB and that
>>> > means that the "expected equivalent Java class" is still
java.sql.Clob or
>>> > java.sql.Blob! If you build your code to expect that, then it will
>>> > eventually break because you get a String or a byte-array instead.
>>> >
>>> > How to make this consistent? I can think of a few ways, but none that
I
>>> > really love:
>>> >
>>> > 1) Probably the easiest way to do it is to let the JDBC datacontext
give
>>> > the columns other ColumnTypes. But that sorta doesn't feel good now
that
>>> > the "real" datatype is CLOB, that we will then call it e.g. VARCHAR.
>>> >
>>> > 2) We can remove the getEquivalentJavaClass() from ColumnType and
instead
>>> > make it a direct member of the Column class. This will break backwards
>>> > compatibility of the API.
>>> >
>>> > 3) We can make ColumnType an interface or class instead of a enum.
Then the
>>> > behaviour can simply be plugged in by the JDBC DataContext. I do like
this
>>> > approach the best for many reasons, but it has the downside that it
also
>>> > breaks backwards compatibility of the API and that there will no
longer be
>>> > an enumerable and finite list of ColumnTypes. Maybe we could
alleviate that
>>> > problem by ALSO having an enum with the typical implementations or
>>> > something like that.
>>> >
>>> > Maybe there are other solutions that I didn't think of.
>>> >
>>> > Regards,
>>> > Kasper
>>> >
>>
>>
>

Re: [DISCUSS] Concern regarding ColumnType.getJavaEquivalentClass()

Reply via email to