On 7 Mar 2011, at 10:57, Jonathan Warren wrote:

> 
> On 7 Mar 2011, at 10:35, Thomas Down wrote:
> 
>> On Mon, Mar 7, 2011 at 10:04 AM, Andy Jenkinson 
>> <[email protected]> wrote:
>> 
>>> Hi Thomas,
>>> 
>>> Thanks for this. Regarding the option of whether to return just one feature
>>> per side or all overlapping features, the only other advantage that
>>> immediately springs to mind for the latter (in addition to some measure of
>>> consistency, as you mention) is that it allows the client to immediately
>>> render the exact region of that feature without triggering another request.
>>> It would generally mean changing zoom level. I can't say if clients are
>>> likely to follow this mechanism as opposed to, say, pan and centre on the
>>> feature, but if they wanted to it would be more efficient (and possibly a
>>> little bit more efficient anyway depending on how your client does its
>>> requests).
>>> 
>> 
>> Yep, I agree.  I'd be interested to learn whether there are any clients that
>> would seriously consider taking advantage of this.  My own thinking is that
>> even if we do adjust zoom level (as Dalliance sometimes does, e.g. in the
>> "jump to gene..." navigation op), clients are much more likely to zoom to a
>> view that contains the target feature plus a "sensible" amount of flanking
>> sequence, rather than a view where the target feature is perfectly framed.
>> 
>> Furthermore, this rather seems like optimizing for the case where only one
>> annotation source is active.  Surely we're talking about the *distributed*
>> annotation system, and clients will still have to go off and query all the
>> other annotation sources, even if they are able to skip the one which
>> responded to the "adjacent" query.  So long as there's some kind of query
>> parallelization in place, this probably isn't a performance issue.
> 
> My vote would ideally be to change feature_by_id to return one feature and to 
> have adjacent_feature return one feature as well. In my opinion this would 
> mean these capabilities on servers do "exactly as they say on the tin", would 
> be easier for data providers to implement, and would thus be more likely to 
> be implemented?
> If the feature_id capability as it stands is needed it could be changed to 
> something more akin to what it means like feature_id_region but I would bet 
> no one would bother to change it/use it?
> 
> However the reality is that we are too late to change the old feature_by_id, 
> but I don't think we need to make the same mistake twice by repeating it for 
> adjacent_features?

I agree with Jonathan: feature_by_id sounds like it gets the feature with the 
requested ID, and to be honest that is the way I have implemented it before, so 
if you ask me, the adjacent capability should just return one feature. I don't 
think we are too late to change the old feature_by_id behaviour, and we can 
take this as the opportunity to make such a change.
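
To make the "one feature per side" behaviour concrete, here is a rough sketch 
of what I have in mind. It is only an illustration: the Feature class, the 
adjacent_feature function and the include_overlapping flag are hypothetical 
names of my own, not anything from the DAS spec or Thomas's proposal, and it 
does not cover the fallback to other reference sequences discussed below.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Feature:
    """Hypothetical in-memory feature record (not the DAS wire format)."""
    id: str
    segment: str
    start: int
    end: int


def adjacent_feature(
    features: Iterable[Feature],
    segment: str,
    pos: int,
    direction: str,
    include_overlapping: bool = True,
) -> Optional[Feature]:
    """Return the single nearest feature to the point `pos` on `segment`.

    `direction` is "forward" or "backward". A feature that overlaps `pos`
    has distance 0 and is returned only if `include_overlapping` is set;
    that flag stands in for the question the thread leaves open.
    """
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")

    best: Optional[Feature] = None
    best_key = None
    for f in features:
        if f.segment != segment:
            continue
        if f.start <= pos <= f.end:
            # Feature overlaps the query point.
            if not include_overlapping:
                continue
            distance = 0
        elif direction == "forward" and f.start > pos:
            distance = f.start - pos
        elif direction == "backward" and f.end < pos:
            distance = pos - f.end
        else:
            continue
        # Break ties on id so the result is deterministic.
        key = (distance, f.id)
        if best_key is None or key < best_key:
            best, best_key = f, key
    return best


features = [
    Feature("a", "chr1", 100, 200),
    Feature("b", "chr1", 300, 400),
    Feature("c", "chr2", 50, 60),
]
nearest = adjacent_feature(features, "chr1", 150, "forward")
strict = adjacent_feature(features, "chr1", 150, "forward",
                          include_overlapping=False)
```

With include_overlapping left on, a query from inside feature "a" returns "a" 
itself; with it off, the same query skips ahead to "b". Either way the server 
returns at most one feature, which is the point.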
> 
> 
>> 
>> Do any other client developers feel differently?
>> 
>> 
>>> Disadvantages I can think of:
>>> - "adjacent" request takes marginally longer
>>> - not quite as obvious what clients should put in their UI controls - need
>>> to pick a feature to be able to do "jump to BRCA1"
>>> - risk of servers not implementing it correctly and only returning one
>>> feature anyway (although I don't think this is likely as the concept is
>>> different to "feature-by-id")
>>> 
>>> Some things to further define:
>>> - servers can't return a fake feature
>>> 
>> 
>> Yep, will clarify this.
>> 
>> 
>>> - should servers return features on different reference sequences if there
>>> are none on the current one?
>>> 
>> 
>> In my opinion, absolutely yes.  Otherwise the "10 features in the genome"
>> case remains a massive pain (and potentially a disaster, for
>> inhomogeneously-distributed data; won't someone think of the MHC tiling arrays?
>> :-).  And even worse for the "10 features in UniProt" case (where I can also
>> see this feature being quite interesting).
>> 
>> I've tried to be explicit about this in my proposal (see the penultimate
>> paragraph + example 3), but any suggestions for further clarifications are
>> welcome.
>> 
>> 
>>> - how should servers treat features that overlap the adjacent range? Treat
>>> them as the adjacent feature to return, or only include features completely
>>> outside the query range? What if the next feature completely outside the
>>> query range is part of the same feature hierarchy (e.g. an exon outside the
>>> current window).
>>> 
>> 
>> It's a point rather than a range, but yes I agree this is still an open
>> question.  I'd actually written the spec such that overlapping features do
>> get returned (on the assumption that clients will do "trivial" cases of
>> next/previous feature in-memory without a network round trip), but again if
>> other client developers do things differently, I'd like to know.
>> 
>> I think "include overlapping" will have fewer special cases to worry about,
>> though.  e.g. the PART/PARENT issue you allude to.  Let clients deal with
>> that ("dumb servers, smart clients").
>> 
>>                Thomas.
>> _______________________________________________
>> DAS mailing list
>> [email protected]
>> http://lists.open-bio.org/mailman/listinfo/das
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> blog: http://biodasman.wordpress.com/
> [email protected]
> Ext: 2314
> Telephone: 01223 492314
> 
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a 
> charity registered in England with number 1021457 and a company registered in 
> England with number 2742969, whose registered office is 215 Euston Road, 
> London, NW1 2BE.

