On Tue, Jun 30, 2009 at 17:38, Fan Dong <[email protected]> wrote:
> Thanks Stian for explaining the usage of the list, I will read them through.
>
> But can you state why you wouldn't suggest to expose file names through web
> services?
>
> In our case, we simply pass the file names into the web service, no path
> attached to those file names. The reason for that is we do not want the
> data physically being transferred among the processors - our data are huge -
> thereby we feed file names into the service and expect the service returns
> the names of the output, and so on.
Note that we are moving into some kind of 'best practice' discussion here - Taverna obviously does not care what it is you are passing around :)

I agree that not transferring large data is an obvious goal - and also something we have recommended to service providers.

The main problems I can see with file names are:

* Users of the web service would need to know the possible file names, or an operation would have to list them
* Security aspects - what happens if someone is evil and passes in /etc/passwd as the filename?
* If you are keeping results of users' jobs like this, and you return a filename like 'job14.txt' - you don't need to be a genius to figure out that your competitor (who told you about this service) might have some interesting data in job12.txt or job15
* Although it removes the need to transfer large data in the SOAP message, it only works within 'your' services

But I agree that it's a solution to avoid large transfers. What is confusing is if you expose it as 'filename' - novice users might think it's a filename local on their machine that the service wants. You could call it something like 'dataset' (and have a getDatasets() operation) - nobody needs to know that it's really a filename.

Another solution we have recommended - which is very similar, if you don't mind exposing the data to users who want them - is to use URLs instead of file names. If the user wants to, he can download the data. If he passes the URL up again to your service, you can recognize the prefix (http://host.uni.com/2009/datasets/), chop it off and treat the rest like a local filename.

The first advantage is that your identifier is now so much more than just a filename on some server the client does not know - it's now a universal identifier - at least if you don't change the data on the server! :-) When the user looks at his results in a year's time and sees the full URL there, it is much more valuable and referenceable (?) than something like 'data14.db'.
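To make the "recognize the prefix and chop it off" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the storage directory DATA_DIR and the function names are hypothetical, and only the URL prefix comes from the example above. It also rejects anything that does not look like a plain file name, which covers the /etc/passwd worry from the list of problems.

```python
import re
from pathlib import Path

# Prefix taken from the example URL in this mail.
DATASET_PREFIX = "http://host.uni.com/2009/datasets/"
# Hypothetical directory where the service keeps its datasets.
DATA_DIR = Path("/srv/datasets")

# Accept only a single plain file name: no slashes, no leading dot.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def dataset_url_to_path(url: str) -> Path:
    """Map one of 'our' dataset URLs back to a local file path."""
    if not url.startswith(DATASET_PREFIX):
        raise ValueError("not a dataset URL from this service")
    name = url[len(DATASET_PREFIX):]
    if not _SAFE_NAME.fullmatch(name):
        raise ValueError("invalid dataset identifier")
    return DATA_DIR / name


def path_to_dataset_url(path: Path) -> str:
    """Turn a local result file into the URL handed back to the client."""
    return DATASET_PREFIX + path.name
```

A client would then only ever see and pass around URLs such as http://host.uni.com/2009/datasets/job14.txt, while the service keeps working with local files internally.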
If you want some kind of pseudo-security/privacy without requiring passwords or certificates, you can use UUIDs like 30ad84fc-f485-48c3-a1f1-2ab39c528c13 for identifiers in the URLs.

If the user passes this URL to another 3rd party service that works in the same way, that service can download the data directly from your site - typically the link between two universities can be much better than going down and up again to wherever Taverna is running. This leads us to the last piece for your service: supporting downloads from 3rd party URLs as incoming parameters.

The only challenge here is when the user wants to provide some data that is not already world-wide accessible. For not-that-large data you can provide a twin operation that takes the data directly, you can use SOAP attachment support, or you can offer a separate REST-like HTTP-based PUT/POST interface for uploading. However, this again will force you to think about security - you don't want people to start uploading movies and music to share through your service, so you can do some checking of the data syntax, or not share uploaded URLs with the world (only accept them as inputs to your own services).

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

------------------------------------------------------------------------------
_______________________________________________
taverna-users mailing list
[email protected]
Web site: http://www.taverna.org.uk
Mailing lists: http://www.taverna.org.uk/taverna-mailing-lists/
