I don't think we disagree - we're just looking at the problem from very different perspectives.
I just don't like calling a database table or result-set a FeatureType. A feature is a representation of a real-world concept defined by a community. I'm happy to call it a "View", or maybe a better name. It seems the argument stems from my obstinacy in this matter.

I'm agnostic about the implementation specifics, but for simplicity across the range of issues I think we should aim at:

1) Ability to stream features during simple processing chains
2) Hence, default behaviour of reading all the attributes required to create the feature
3) Not necessarily post-processing attributes - e.g. converting strings into dates or vice versa - unless required; i.e. _allow_ but _do not force_ lazy instantiation of feature properties from the result set
4) Minimising actual retrieved data by taking into account the target FeatureType, not the underlying persistence store, which may contain much else

Perhaps we could start off with a default strategy of:

* read and resolve simple scalars eagerly (current strategy, where "implicit mapping" means we need to read all attributes)
* resolve expressions lazily
* read and resolve gml:FeatureTypeProperty properties lazily unless the mapping provides a hint
* read and resolve expensive properties (e.g. blobs, XML docs) lazily (not that we do this at the moment anyway)

Within this framework we can make things work as they do now, then start to optimise and support joins later. I suspect the reality is we won't pay too badly for reading lots of stuff because:

a) the implicit FeatureType is really a construct designed for playing with the technology, not production services
b) you can always put a view into the persistence layer to optimise WMS operations, where an implicit feature type is OK _provided_ you never, ever, intend to support GetFeatureInfo for that layer
c) we still get the pruning happening before the expensive bits - the serialisation and network transfers
d) if it streams, let it go - we don't burn too much memory.
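To make the "allow but do not force" idea concrete, here is a minimal sketch (hypothetical class and method names, not the real GeoTools API): simple scalars are copied eagerly from the result row while streaming, and expensive properties are handed a resolver that runs at most once, on first access.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Sketch of a feature whose cheap attributes are read eagerly from the
 * result row and whose expensive attributes (blobs, XML docs, derived
 * expressions) are resolved lazily on first access, then cached.
 */
class LazyFeature {
    private final Map<String, Object> eager = new HashMap<>();
    private final Map<String, Supplier<Object>> lazy = new HashMap<>();
    private final Map<String, Object> resolved = new HashMap<>();

    /** Simple scalar: read immediately while streaming the result set. */
    void putEager(String name, Object value) {
        eager.put(name, value);
    }

    /** Expensive property: defer resolution until somebody asks for it. */
    void putLazy(String name, Supplier<Object> resolver) {
        lazy.put(name, resolver);
    }

    Object getAttribute(String name) {
        if (eager.containsKey(name)) {
            return eager.get(name);
        }
        // Resolve at most once, then remember the value.
        return resolved.computeIfAbsent(name, n -> lazy.get(n).get());
    }
}
```

Clients that never touch the expensive attribute never pay for it, which is the pruning-before-serialisation property wanted above.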
This may not require any changes yet, but hopefully it will inform any changes planned.

Rob

Andrea Aime wrote:
> Rob Atkinson ha scritto:
>>>>>> Do you think that "schema-assisted" object model mapping is a
>>>>>> reasonable compromise - as long as we have the extension points to
>>>>>> extend the schema with object libraries - such as GML primitives,
>>>>>> coverage primitives, custom operations
>>>>>
>>>>> Guys, speaking from an OO and library point of view, can't we consider
>>>>> XML just an I/O format? Forcing XML schema right in our object model
>>>>> as ways to get extra information strikes me as something we should
>>>>> really want to avoid.
>>>>
>>>> Amen!
>>>
>> I agree too, but point out our code is doing the opposite of this...
>>
>> Consider this: we are reading the persistence schema to deduce a
>> "Feature Type" (old way) or reading the GML schema to define it
>> (complex datastore). The issue is how do we import the feature model
>> we want to work with? The idea of schema-assist is to read the schema
>> to describe the data structures we have to manage, then allow
>> matching these to object libraries to add behaviours.
>
> We are doing the opposite to keep our datastore complexity to a
> tolerable level. If you need to do feature type mapping you need to
> operate a level higher than this, in my opinion. Separate concerns:
> low-level datastores concern themselves with how to read stuff from
> persistent storage, a higher level with how to masquerade the real
> nature of the data and present it in a different format. I thought
> that's what the "complex data store" is all about.
>
>> OK - but with any caching strategy you have a situation where the
>> model (data viewed through the cache) knows about persistence (do I
>> have this cached or do I get it). What you don't have is the model
>> knowing about _how_ the persistence is managed by the provider.
>
> Nope, here I can disagree, I have direct experience of exactly the
> opposite.
> In Hibernate, my Java objects do not know about Hibernate at
> all (that's what transparent persistence is all about).
> Lazy loading is implemented by using proxies. Say class A has a field
> of class B that's marked as lazy loaded.
> I load object A1, Hibernate builds it up with a proxy instead of a
> reference to, say, B1. The moment you try to cross the proxy it checks
> if you're still in the transactional context that created A; if so,
> it loads B1 from the database, if not, it throws a
> LazyInitializationException, because it does not make any sense to try
> and load data after transaction isolation is lost (you could open
> a new connection, but you would load data at a different time, t2,
> which may be inconsistent with A1, which was loaded at t1).
>
>> And, if the reader gets the data for a feature (a row in a result set,
>> say), then a view exposes part of it, and then a client accesses one
>> or more attributes, then moves on to the next feature, you have still
>> got efficient streaming.
>
> If you lazy load attributes you're not efficiently streaming.
> If feature type FT has four attributes A1,... A4 and I do a:
>
>   select A1, A2
>   from FT
>   ...
>
> then I won't be able to access A3 and A4 without another query.
> This is not efficient streaming, quite the opposite (see my other mail
> to Bruce).
>
>> It's only if you build the whole feature collection and then decide you
>> need additional attributes that it would hurt. So Bryce's idea that you
>> get all the attributes needed to support a specific View, then simply
>> avoid processing the data unless you need it, seems reasonably efficient.
>
> Hem... sorry, but that's exactly what we are doing now. When I ask
> for a feature reader and select only A1, A2 I only get those, no way
> to get A3 and A4.
>
>>> If reality bites the user on the head because they are using an
>>> inappropriate data access strategy, that user needs to get smarter.
>>> There's no cure for dumb users.
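The session-scoped proxy behaviour Andrea describes can be sketched without Hibernate itself (hypothetical `Session`/`LazyRef` classes, standing in for Hibernate's proxies and persistence context): crossing the reference loads the target only while the originating session is still open; afterwards it fails, the way Hibernate's LazyInitializationException does.

```java
import java.util.function.Supplier;

/** Stand-in for Hibernate's persistence context / transaction scope. */
class Session {
    private boolean open = true;
    void close() { open = false; }
    boolean isOpen() { return open; }
}

/**
 * Hand-rolled sketch of a lazy-loading proxy: A1 would hold a
 * LazyRef<B> instead of a direct reference to B1.
 */
class LazyRef<T> {
    private final Session session;
    private final Supplier<T> loader; // stands in for a database fetch
    private T value;

    LazyRef(Session session, Supplier<T> loader) {
        this.session = session;
        this.loader = loader;
    }

    T get() {
        if (value == null) {
            if (!session.isOpen()) {
                // Mirrors LazyInitializationException: loading now would
                // fetch data at t2, possibly inconsistent with t1.
                throw new IllegalStateException("session closed - cannot lazily load");
            }
            value = loader.get(); // e.g. "select ... from B where id = ?"
        }
        return value;
    }
}
```

The key point survives the simplification: the domain object never imports the persistence machinery, only the proxy does.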
>>> We'd only need to populate the toolbox
>>> with "likely-to-be-useful" tools.
>>>
>>> This idea is actually the opposite of Rob's injecting/merging
>>> philosophy,
>>
>> I'm not sure it is - I was focused on injection of types during the
>> FeatureType creation, not during the resolving/reading process.
>
> I lost you again. What's type injection?
>
>> If you have an extra feature type, you have to define it. This is an
>> overhead, currently handled by only allowing subsets of the
>> persistence schema derived by black magic from the context of the query.
>
> Black magic? When did age-old SQL projection become black magic? :-)
>
>> Not a good place to be if you want to make something behave smarter
>> - for example support a coverage operation, or convert a linear
>> reference system position into a cartesian coordinate set, or average
>> a set of readings over a time period.
>
> Ah, now that's black magic :-) Seriously, it's processing, and by all
> means I want that. It would be nice if the datastores could advertise
> what expressions they can handle; (a + x * b) is something that
> every database can handle, no point in loading a, b and x separately,
> and the same goes for aggregation, though we don't have a way to express
> aggregates now afaik.
> So, once we have that, a "mapping" datastore does feature type mapping,
> from the real data schema to whatever community schema has been agreed
> upon, moving whatever computation is supported to the real datastore
> and doing the rest in-memory after stored features are loaded.
>
> Cheers
> Andrea
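The capability-advertising split Andrea suggests could look roughly like this (hypothetical names, not an existing datastore API): expressions the backend claims to support are pushed into the SQL projection, and the rest are queued for in-memory evaluation after the stored features are loaded.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of splitting requested expressions between the database and
 * in-memory post-processing, based on advertised backend capabilities.
 */
class ExpressionSplitter {
    /** What the backend claims it can evaluate, e.g. simple arithmetic. */
    interface Capabilities {
        boolean supports(String expression);
    }

    static String buildSelect(List<String> expressions, Capabilities caps,
                              List<String> inMemory, String table) {
        List<String> pushed = new ArrayList<>();
        for (String e : expressions) {
            if (caps.supports(e)) {
                pushed.add(e);   // evaluated by the database, e.g. a + x * b
            } else {
                inMemory.add(e); // evaluated after features are loaded
            }
        }
        return "select " + String.join(", ", pushed) + " from " + table;
    }
}
```

A "mapping" datastore built this way degrades gracefully: a backend that advertises nothing simply pushes every expression to the in-memory stage.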
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel
