I was going through our issue list, and realized that changing ColumnType to an interface instead of a enum would also have other benefits. For example, it would make solving this issue a lot easier:
https://issues.apache.org/jira/browse/METAMODEL-8 - Further details added to ColumnType 2014-04-04 7:52 GMT+02:00 Kasper Sørensen <[email protected]>: > > I did post my patch just now, which converts ColumnType to an interface: https://reviews.apache.org/r/20028/ > > Would be interesting if someone can check it out and give it some thought. Not a trivial change for backwards compatibility I think, but might be the best from an API standpoint. > > > 2014-04-01 20:15 GMT+02:00 Kasper Sørensen <[email protected] >: > >> Hi Ankit, >> >> I think it is fair to assume that a CLOB will always be convertible to a String - if memory on the client permits at least. Normally a java.sql.Clob provides a getCharacterStream() method which returns a Reader. By setting the system property 'metamodel.jdbc.convert.lobs' to "true" we offer the service of eagerly reading that Reader to the end and returning a String. The good thing about that is that the client avoids issues of the DataSet is closed or even sometimes just moving the cursor to the next record may invalidate the Reader. In other words: With the LOB conversion we offer a way to make the MetaModel Row objects useable even though the connection to the statement/resultset/connection is closed. It's of course optional. >> >> The same idea is applied to BLOBs. We offer to automatically open the java.sql.Blob InputStream and load it into a byte-array. >> >> In my opinion the fact that something is a CLOB or a VARCHAR is a different type of information than whether it's values are represented as java.sql.Clob or String objects. The information on the column type is valuable in it's own right and I would not want us to "make believe" that a column which is really a CLOB is shown as a VARCHAR. >> >> Now that we're on the topic, we might even also consider if using the JDBC data type names is even appropriate as a default behaviour. I mean, we're acting in the Java world and a type like VARCHAR or TINYINT etc. is only used in SQL, not in native Java apps. Other types of datastores have different data types and I guess many of the datastores only have simple data types like String, Integer, Date etc. We could choose to go for a non-enum solution for the ColumnType and thereby also provide more flexibility. In the CSV module we could thus provide just a single ColumnType, e.g.: >> >> public static final ColumnType CSV_COLUMN_TYPE = new ColumnTypeImpl("String", String.class); >> >> Read the example constructor arguments as this: Name would be "String" (or maybe "Text" is more appropriate for "text files"?) and expected Java class is String.class >> >> In the JDBC module we could have other column types available, e.g.: >> >> public static final ColumnType VARCHAR = new ColumnTypeImpl("VARCHAR", String.class); >> public static final ColumnType CLOB = new ColumnTypeImpl("CLOB", java.sql.Clob.class); >> public static final ColumnType CLOB_CONVERTED = new ColumnTypeImpl("CLOB", String.class); >> public static final ColumnType BLOB = new ColumnTypeImpl("BLOB", java.sql.Blob.class); >> public static final ColumnType BLOB_CONVERTED = new ColumnTypeImpl("BLOB", byte[].class); >> >> (and there you then have my suggested solution for solving the Clob issue - to use a String for the name and a Class for the expected Java type) >> >> I can try to make a patch for this solution. That would sort of validate if it will work or not. But first ... what do you guys think? >> >> Only pitfall I see is that it breaks backwards compatibility. I think we can retain most of the compile-time compatibility, but not runtime compatibility (as in "drop in a new jar without recompile"). Also, old objects serialized with the old enums would not be properly deserialized into using these constants. Maybe that can be solved in the LegacyDeserializationObjectInputStream [1] somehow. >> >> Best regards, >> Kasper >> >> >> [1] Described here: http://wiki.apache.org/metamodel/MigratingFromEobjectsMetaModel >> >> >> 2014-04-01 10:28 GMT+02:00 Ankit Kumar <[email protected]>: >> >>> Hi Kasper, >>> >>> Thanks for initiating this discussion as we are hit by this in MM right now. >>> >>> Thinking just a bit from a client of MM perspective, I guess the CLOB could >>> contain a Object/File/Image/???/Large String text, so while getting CLOB's >>> from the real database we still can check the content and if it smells more >>> like a String then we return the VARCHAR otherwise an Object. May be this >>> is not the neatest solution but I guess for a client this provides the >>> flexibility needed. >>> >>> An additional point worth mentioning with respect to the above remark, >>> playing with different databases we see some databases allow large text to >>> be stored in VARCHAR kind of fields whereas some are still having limits of >>> 4000 characters like Oracle. So when using MM as the abstraction layer >>> above the databases this feature might be quite handy for a client. >>> >>> Excuse me if I sound a bit too biased here. >>> >>> Regards >>> Ankit >>> >>> >>> On Tue, Mar 25, 2014 at 3:54 PM, Kasper Sørensen < >>> [email protected]> wrote: >>> >>> > Hi all, >>> > >>> > I just came across a potential issue in MetaModel's design and want to >>> > share it and maybe start thinking of ways to fix or work around it... >>> > >>> > In our ColumnType enum we have the method getJavaEquivalentClass() which is >>> > used to tell which java type to expect when querying a particular column. >>> > For instance, of I query a VARCHAR column, I can expect a java.lang.String >>> > value. >>> > >>> > Now the trouble is that in our JDBC module we have a system property which >>> > allows for eager loading of BLOBs and CLOBs so that they are automatically >>> > read into byte-arrays and Strings respectively. This is a great utility >>> > because otherwise the user has to do a lot of tedious work with >>> > inputstreams etc which in most cases isn't particularly useful - in most >>> > cases you just want the byte[ ] or String. >>> > >>> > Now the trouble is that if you turn that system property on, you get >>> > Strings or byte-arrays but the column type is still CLOB/BLOB and that >>> > means that the "expected equivalent Java class" is still java.sql.Clob or >>> > java.sql.Blob! If you build your code to expect that, then it will >>> > eventually break because you get a String or a byte-array instead. >>> > >>> > How to make this consistent? I can think of a few ways, but none that I >>> > really love: >>> > >>> > 1) Probably the easiest way to do it is to let the JDBC datacontext give >>> > the columns other ColumnTypes. But that sorta doesn't feel good now that >>> > the "real" datatype is CLOB, that we will then call it e.g. VARCHAR. >>> > >>> > 2) We can remove the getEquivalentJavaClass() from ColumnType and instead >>> > make it a direct member of the Column class. This will break backwards >>> > compatibility of the API. >>> > >>> > 3) We can make ColumnType an interface or class instead of a enum. Then the >>> > behaviour can simply be plugged in by the JDBC DataContext. I do like this >>> > approach the best for many reasons, but it has the downside that it also >>> > breaks backwards compatibility of the API and that there will no longer be >>> > an enumerable and finite list of ColumnTypes. Maybe we could alleviate that >>> > problem by ALSO having an enum with the typical implementations or >>> > something like that. >>> > >>> > Maybe there are other solutions that I didn't think of. >>> > >>> > Regards, >>> > Kasper >>> > >> >> >
