Re: [HACKYSTAT-DEV-L] Early Access: Hackystat Version 8 SensorBase REST API

Philip Johnson Mon, 23 Apr 2007 11:33:06 -0700

Hi Cedric,

I almost agree with you. :-) First, regarding what I mean by "web service",
I was perhaps a little imprecise. I view REST and WSDL as alternative
technologies/architectural styles for the implementation of web services.
I definitely don't "associate web services with a monolithic server"
(although that is certainly one possible implementation choice). It sounds
to me like you might be thinking of "web service" as "a system that uses
WSDL", which I think is too narrow a definition. However, after doing a
Google "define: web service" search, I found a variety of conflicting
definitions, some of which equated "web service" with the use of WSDL, some
with the use of SOAP, some with the use of XML, and some which simply said
there were "common protocols"! So, there is definitely a lack of consensus
in the community on this point!!

While it is certainly possible to use WSDL in a bunch of different ways,
the RPC-based, single-endpoint style of implementation is the primary way
WSDL is applied in the materials I encountered during my research.
Different technologies lead one in different design directions (for
example, you can program in Java in a functional style, but it is certainly
less convenient than using a real functional language). The Restlet
framework "leads" you to a REST architectural style of implementation, just
as WSDL "leads" you toward an RPC, single-endpoint style of implementation.
It's easy to google around and see all sorts of WSDL-based web services (at
Amazon and elsewhere) that use GET for both sending and retrieving
information from the same endpoint. That's anti-REST, but quite reasonable
from a SOAP/WSDL implementation point of view.

Your point about typing is a good one which I didn't discuss. My approach
is that each service needs to provide an XML Schema specification for the
XML payloads it returns as a response to GET requests (and/or expects as
the payload for a POST request). This provides, I believe, just as much
type-level structure as WSDL, but in a more technology-neutral fashion. As
a bonus, it makes conversion between XML and Java objects quite simple
using JAXB (thus enabling us to get rid of JDOM in Version 8).
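
To make the idea of typed payloads concrete, here is a minimal hand-written
sketch in Python of the kind of XML-to-object conversion that a schema plus
a binding tool such as JAXB would automate on the Java side. The element
names (Timestamp, SensorDataType, Owner) and the SensorData class are
hypothetical; the real structure would come from each service's schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SensorData:
    # Hypothetical fields; the real ones would be defined by the schema.
    timestamp: int
    sensor_data_type: str
    owner: str


def parse_sensor_data(xml_text):
    """Convert a <SensorData> payload into a typed object, rejecting bad input."""
    root = ET.fromstring(xml_text)
    if root.tag != "SensorData":
        raise ValueError(f"unexpected root element: {root.tag}")
    fields = {}
    for name in ("Timestamp", "SensorDataType", "Owner"):
        text = root.findtext(name)
        if text is None:
            raise ValueError(f"missing required element: {name}")
        fields[name] = text.strip()
    return SensorData(
        timestamp=int(fields["Timestamp"]),  # int() raises ValueError if not numeric
        sensor_data_type=fields["SensorDataType"],
        owner=fields["Owner"],
    )
```

The point of the sketch is only that a receiver can reject a malformed
payload instead of assuming the sender got it right; a schema states the
same constraints declaratively.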


Cheers,
Philip



--On April 22, 2007 2:19:07 PM -0700 Qin Zhang <[EMAIL PROTECTED]> wrote:

First, my apologies that it took me a while to respond to this message.

Your description of web services is not entirely correct. You are still
thinking of it as an alternative way to do remote procedure calls. It's
more appropriate to view it in terms of document/information exchange. Of
course, the widely used approach is for a thread to send a request and
block while waiting for a response (i.e. synchronous / RPC style). There
are other approaches, such as one-way messages and asynchronous message
exchange.

Your description of one end point is not correct either. One end point is
just a remnant of the now-deprecated SOAP library. I deliberately
retrofitted it to have one end point when I updated the underlying
library from SOAP to Axis one year ago, because I did not want to break
sensorshell client code. In fact, apart from the familiar
    http://hackystat.ics.hawaii.edu/hackystat/rpcrouter
end point, there is an unpublished
    http://hackystat.ics.hawaii.edu/hackystat/AxisServlet
end point that accepts exactly the same thing. There is absolutely
nothing to prevent you from having as many end points as you want. You
can have one for GET, another one for POST, and yet another one for
DELETE.

It seems that you are associating web services with a monolithic server.
In fact, the very idea of web services is to allow you to stay away from a
monolithic server: one service running on one server, another service
running on another server, and the binding contract between them is a
WSDL document, which not only specifies the communication end point and
protocol, but also strongly types the XML messages.

Without WSDL, what mechanism are you going to use to define the exchanged
XML messages? Our current approach has no type information at all, and we
assume whoever receives the XML knows how to parse it. But this assumption
no longer fits well with the loosely coupled Version 8 architecture you
proposed. I think you have to address this question before going to REST.

I agree that Glassfish or any full-blown J2EE stack implementation is
overkill for Hackystat. Tomcat plus Axis is good enough.

Cheers,

Cedric


----- Original Message -----
From: Philip Johnson <[EMAIL PROTECTED]>
Date: Wednesday, April 18, 2007 2:24 pm
Subject: Re: [HACKYSTAT-DEV-L] Early Access: Hackystat Version 8 SensorBase REST API
To: [email protected]

--On Tuesday, April 17, 2007 10:19 PM -0700 Qin Zhang <[EMAIL PROTECTED]> wrote:

> Probably my question is too late since you have already decided to
> use REST, but I want to know the rationale behind it.
>
> Since you are still returning data in xml format, what makes you
> decide not to publish a collection of WSDL and go along with more
> industrial standard web service calls?

Excellent question! No, it's not too late at all. This is exactly the
right time to be discussing this kind of thing.

It turns out that when I started the Version 8 design process, I was still
thinking in terms of a monolithic server and was heading down the SOAP/WSDL
route. I was, for example, investigating Glassfish as an alternative to
Tomcat due to its purportedly better support for web services.

Then the Version 8 design process took an unexpected turn, and the
monolithic server fragmented into a set of communicating services:
SensorBase services for raw sensor data, Analysis services that would
request data from SensorBases and provide higher level abstractions, and UI
services that would request data from SensorBases and Analyses and display
it with a user interface.

What worried me about this design initially was that every Analysis
service would have to be able to both produce and consume data (kind of
like being a web server and a web browser at the same time), and that
Glassfish might be overkill for this situation. So, I started looking for a
lightweight Java-based framework for producing/consuming web services, and
came upon the Restlet Framework (http://www.restlet.org/), which then got
me thinking more deeply about REST.

It's hard to quickly sum up the differences between REST and WSDL, but
here are a few thoughts to get you started. WSDL is basically based upon
the remote procedure call architectural style, with HTTP used as a
"tunnel". As a result, you generally have a single "endpoint", or URL, such
as <host>/soap/servlet/messagerouter, that is used for all communication.
Every single communication with the service, whether it is to "get" data
from the service, "put" data to the service, or modify existing data, is
always implemented (from an HTTP perspective) in exactly the same way: an
HTTP POST to a single URL. From the perspective of HTTP, the "meaning" of
the request is completely opaque.

In REST, in contrast, you design your system so that your URLs actually
"mean" something: they name a "resource". Furthermore, the type of HTTP
method also "means" something: GET means "get" a representation of the
resource named by the URL, POST means create a new resource which will have
a unique URL as its name, DELETE means "delete" the resource named by the
URL, and so forth.

For example, in Hackystat Version 7, to send sensor data to the server, we
use Axis, SOAP, and WSDL to send an HTTP POST to
http://hackystat.ics.hawaii.edu/hackystat/soap/rpcrouter, and the content
of the message indicates that we want to create some sensor data. All
sensor data, of all types, for all users, is sent to the same URL in the
same way. If we wanted to enable programmatic access to sensor data in
Version 7, we would tell clients to continue to use HTTP POST to
http://hackystat.ics.hawaii.edu/hackystat/soap/rpcrouter, but tell them
that the content of the POST could now invoke a method in the server to
obtain data.

A RESTful interface does it differently: to request data, you use GET with
a URL that identifies the data you want. To put data, you use POST with a
URL that identifies the resource you are creating on the server. For
example:

 GET http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170

might return the Commit sensor data with timestamp 1176759070170 for user
x3fhU784vcEW. Similarly,

 POST http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170

would contain a payload with the actual Commit data contents that should
be created on the server. And

 DELETE http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170

would delete that resource. (There are authentication issues, of course.)
In fact, REST asserts a direct correspondence between the CRUD
(create/read/update/delete) DB operations and the POST, GET, PUT, and
DELETE methods for resources named by URLs.
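
As a rough illustration of that correspondence, the following Python sketch
builds (without sending) one request per CRUD operation against the example
URL pattern above. The helper names are hypothetical, and the URL layout is
only the one suggested by the examples in this message, not a finalized API.

```python
from urllib.request import Request

# Base URL taken from the example URLs in this message (assumed, not final).
BASE = "http://hackystat.ics.hawaii.edu/hackystat/sensordata"

# CRUD operation -> HTTP method, as described above.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def sensor_data_url(user, data_type, timestamp):
    """Name one sensor data resource: <base>/<user>/<type>/<timestamp>."""
    return f"{BASE}/{user}/{data_type}/{timestamp}"


def build_request(operation, user, data_type, timestamp, payload=None):
    """Build (but do not send) the HTTP request for a CRUD operation."""
    method = CRUD_TO_HTTP[operation]
    data = payload.encode("utf-8") if payload is not None else None
    url = sensor_data_url(user, data_type, timestamp)
    return Request(url, data=data, method=method)


if __name__ == "__main__":
    # Print the method and URL for each operation on the same resource.
    for op in CRUD_TO_HTTP:
        req = build_request(op, "x3fhU784vcEW", "Commit", 1176759070170)
        print(req.get_method(), req.full_url)
```

Note that the URL is identical for all four operations; only the HTTP
method changes, which is what makes the request meaningful to HTTP itself.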

Now, why do we care? What's so good about REST anyway? In the case of
Hackystat, I think there are two really significant advantages of a
RESTfully designed system over an RPC/SOAP/WSDL designed system:

(1) Caching can be done by the Internet. If you obey a few more principles
when designing your system, then you can use HTTP techniques as a way to
cache data rather than build in your own caching system. It's exactly the
same way that your browser avoids going back to Amazon to get the logo
files and so forth when you move between pages. In the case of Hackystat,
when someone invokes a GET on the SensorBase with a specific URL, the
results can be transparently cached to speed up future GETs of the same
URL, since that represents the same resource. (There are cache expiration
issues, which I'm pretty sure we can deal with.)
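
The message does not say which HTTP caching techniques are meant. As one
possible illustration, the Python sketch below models validation-based
caching with an ETag and a conditional GET (If-None-Match / 304 Not
Modified). The ValidatingCache class, the fetch hook, and the fake server
are all hypothetical stand-ins so the example runs without a network.

```python
import hashlib


class ValidatingCache:
    """Client-side cache that revalidates entries using ETags."""

    def __init__(self, fetch):
        # fetch(url, etag) -> (status, etag, body); hypothetical transport
        # hook that would send If-None-Match: <etag> when etag is not None.
        self._fetch = fetch
        self._entries = {}  # url -> (etag, body)

    def get(self, url):
        etag, body = self._entries.get(url, (None, None))
        status, new_etag, new_body = self._fetch(url, etag)
        if status == 304:
            return body  # Not Modified: reuse the cached representation.
        self._entries[url] = (new_etag, new_body)
        return new_body


def make_fake_server(resources):
    """Stand-in for a SensorBase: answers 304 when the client's ETag matches."""
    calls = {"full": 0, "not_modified": 0}

    def fetch(url, etag):
        body = resources[url]
        current = hashlib.sha1(body.encode("utf-8")).hexdigest()
        if etag == current:
            calls["not_modified"] += 1
            return 304, current, None
        calls["full"] += 1
        return 200, current, body

    return fetch, calls
```

In a real deployment the same exchange could be handled by a browser or an
intermediary proxy rather than application code, which is the sense in
which the caching comes "for free" with HTTP.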

In Hackystat Version 7, there is a huge amount of code that is devoted to
caching, and this code is also a huge source of bugs and concurrency
issues. With a REST architecture, it is possible that most, perhaps all, of
this code can be completely eliminated without a performance hit. Indeed,
performance might actually be significantly better in Version 8.

(2) A REST API is substantially more "accessible" than a WSDL API. One
thing I want from Hackystat Version 8 is a substantially simpler, more
accessible interface that enables outsiders to quickly learn how to extend
Hackystat for their own purposes with new services and/or extract low-level
or high-level data from Hackystat for their own analyses. To do this with a
RESTful API, it's straightforward: here are some URLs, here's how they
translate into resources, invoke GET and you are on your way. Pretty much
every programming language has library support for invoking an HTTP GET
with a URL. One could expect a first semester programming student to be
able to write a program to do that. Shoots, you can do it in a browser. The
"barrier to entry" for this kind of API is really, really low.
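
To give a sense of how small such a client can be, here is a minimal sketch
using only Python's standard library. The function name is hypothetical,
and the URL in the usage comment is the illustrative one from earlier in
this message, not a live service.

```python
from urllib.request import urlopen


def get_representation(url, timeout=10):
    """Issue a GET for the resource named by url and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage (illustrative URL, assumes a running SensorBase):
# xml = get_representation(
#     "http://hackystat.ics.hawaii.edu/hackystat/sensordata/"
#     "x3fhU784vcEW/Commit/1176759070170"
# )
```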

Now consider a WSDL API. All of a sudden, you need to learn about SOAP, and
you need to find out how to do Web Services in your chosen programming
language, and you have to study the remote procedure calls that are
available, and so forth. The "barrier to entry" is suddenly much higher:
there are incompatible versions of SOAP, there's way more to learn, and I
bet more than a few people will quickly decide to just bail and request
direct access to the database, which cuts them out of 90% of the cool stuff
in Hackystat.

So, from my reckoning, if we decided to use Axis/SOAP/WSDL in Version 8,
we'd (1) continue to need to do all our own caching with all of the
headaches that entails, and (2) be stuck with a relatively complex
interface to the data.

I want to emphasize that a RESTful architecture is more subtle than simply
using GET, POST, PUT, and DELETE. For example, the following is probably
not RESTful:

 GET http://foo/bar/baz&action=delete

For more details,
<http://en.wikipedia.org/wiki/Representational_State_Transfer> has a good
intro with pointers to other readings.

Your email made another interesting assertion:

> what makes you decide not to publish a collection of WSDL and go
> along with more industrial standard web service calls?

Although I agree that WSDL is an "industry standard", this doesn't mean
that REST isn't one as well. Indeed, my sense after a few weeks of research
on the topic is that most significant industrial players have already moved
to REST or offer REST as an alternative to WSDL: eBay, Google, Yahoo,
Flickr, and Amazon all have REST-based services. I recall reading that the
REST API gets far more traffic than the corresponding WSDL API for at least
some of these services.

Finally, no architecture is a silver bullet, and REST is no exception. For
example, if you can't effectively model your domain as a set of resources,
or if the CRUD operations aren't a good fit with the kinds of manipulations
you want to do, then REST isn't right. Another REST requirement is
statelessness, which can be a problem for some applications. So far in my
design process, however, I haven't run into any showstoppers for the case
of Hackystat.

Version 8 is still in the early stages, and the advantages of REST are
still hypothetical, so I'm really happy to have this conversation. There
are no hard commitments to anything yet, and if there turns out to be a
showstopping problem with REST, then we can of course make a change. The
more we talk about it, the greater the odds we'll figure out the right
thing.

Cheers,
Philip
