Re: Evaluate Suitable Scientific Workflow Language for Airavata.

Bruce Barkstrom Fri, 19 Sep 2014 09:30:28 -0700

One factor that should be included in the group's deliberations
on adding a workflow language to the other things in OODT
is the impact on long-term maintenance.  While there's a lot
of enthusiasm in the developer community right now, we need
to think about what happens when development turns into
maintenance.  The account that follows is based on my experience
with trying to resurrect a W3C-related project to visualize RDF
graphs.

The project is called IsaViz.  It's even got a W3 web site:
http://www.w3.org/2001/11/IsaViz/
IsaViz identifies itself as a visual authoring tool for RDF.
Right up near the top are two dates that should serve as
a cautionary note for people who want to pick up this tool:
Current Stable Version: October 2007 and Current Development Version:
May 2007.  It also looks like the site was maintained by a single
developer who did the development as a postdoc project.
The overview page was last modified on Oct. 21, 2007.
The installation instructions were last modified on Oct. 18, 2004.

IsaViz uses a number of tools.  The Installation Instructions
identify the following:

   - A Java Virtual Machine version 1.3.0 or later (1.4 strongly
   recommended - see Known problems
   <http://www.w3.org/2001/11/IsaViz/overview.html#bugs>)
   - A distribution of IsaViz, which contains the following Java JAR files:
      - IsaViz itself *(isaviz.jar)*
      - Zoomable Visual Transformation Machine *(zvtm.jar)*
      - Jena 2.1 for IsaViz 2.1, Jena 1.6.1 for IsaViz 1.2
      - Xerces-J version 2 *(xercesImpl.jar,xmlParserAPIs.jar)* for IsaViz
      2, Xerces 1.4.4 for IsaViz 1.2
   - GraphViz from AT&T version 1.8.9 or later (version 1.7.x is no longer
   supported in IsaViz 2, and has actually only been tested with
   version 1.9). *Note: some instances of version 1.10.0 had a bug that
   produced incomplete SVG files, but it has been corrected in subsequent
   releases* (newer versions can be obtained on the graphviz.org site
   <http://www.graphviz.org/pub/graphviz/CURRENT>).

So, what complications ensue?
1.  Java has moved way beyond version 1.3 or 1.4.  Since Java can
deprecate code and since there's Oracle and OpenJDK, there may be some
unpleasantries that might need fixes.  I haven't seen comments from the
community on whether or not these might be significant.  The IsaViz
documentation refers to the ancient time when Sun controlled the
language.  Apparently, the IsaViz code was only tested with Sun's
j2se/1.3 or 1.4.
2.  I suppose the jar files from IsaViz version 2 would be the place to
start in reconstructing this piece of software.  However, one should be
careful about this when getting into the installation scripts from the
Installation Instructions.
3.  The Zoomable Visual Transformation Machine project is on Sourceforge.
It's apparently done in Java.  However, the IsaViz code used version
0.9.0, while the current Sourceforge project (at
http://zvtm.sourceforge.net/) is now up to 0.11.1 for the stable version
(Aug. 2013) with a more recent development version (0.11.2 - snapshot;
June 2014).  No idea if there would be any serious ramifications from
this change.
4.  The Installation Instructions have a link to the HP Jena site.  If you
follow it, the page says "Oops! ..."  Jena was moved from HP to Apache.
So if you want to do Jena, you now need to consult
<https://jena.apache.org/>.  I'm not sure exactly how the Apache Jena
source code or binary installations compare with what IsaViz is
expecting.  As a note, Jena is a BIG chunk of software.  I think the
tutorials on RDF (including OWL and related reasoners) are going to take
a novice user (including many scientists) a month or two of dedicated
time to work through.  I don't know how easy IsaViz would be to install
without at least a basic understanding of RDF and of the related triple
store database.
5.  Xerces-J is the Java XML parser (see <Xerces.apache.org>), which is
now up to version 2.11.0.  Again, it isn't clear what kinds of
difficulties one would encounter in using this library.
6.  GraphViz (at <http://www.graphviz.org/>) is now at version 2.38.0-1.
AT&T seems to be maintaining a lot of installation options.  I was
interested in Ubuntu - and then there are different versions of that.
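
One way to make this kind of version churn visible is to tabulate it.
The following is only a minimal Python sketch, not anything that IsaViz
or its dependencies provide: the helper names (parse_version, drift,
report) are hypothetical, the version pairs are the ones quoted above,
and the Xerces-J baseline is taken as plain "2" because the Installation
Instructions give nothing more specific.  Java and Jena are left out
because no current version number is given for them above.

```python
def parse_version(text):
    """Turn '2.38.0-1' into (2, 38, 0); a packaging suffix after '-' is dropped."""
    return tuple(int(part) for part in text.split("-")[0].split("."))


def drift(expected, current):
    """Classify the gap between the version a tool expects and the current one."""
    old, new = parse_version(expected), parse_version(current)
    if new[0] != old[0]:
        return "major"
    if new[:2] != old[:2]:
        return "minor"
    if new != old:
        return "patch"
    return "none"


# (component, version IsaViz 2 expects, current version as quoted in this message)
DEPENDENCIES = [
    ("ZVTM", "0.9.0", "0.11.1"),
    ("Xerces-J", "2", "2.11.0"),      # assumption: baseline given only as "version 2"
    ("GraphViz", "1.8.9", "2.38.0-1"),
]


def report(dependencies):
    """Map each component name to its drift classification."""
    return {name: drift(expected, current)
            for name, expected, current in dependencies}


if __name__ == "__main__":
    for name, level in report(DEPENDENCIES).items():
        print(f"{name}: {level} version drift")
```

A major-version jump (GraphViz here) is the one I would expect to need
the most checking, although the label alone says nothing about whether
the interfaces IsaViz uses actually changed.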

As an additional note, Linux has developed a bunch of variants.  A
particularly active area of development is the creation of automated
package managers - often with centralized control over installation
procedures and source code libraries.  The packages have dependencies
on the libraries -- and there's no guarantee that an RPM package has the
same dependencies as a Debian package.  This is a bit like the DOI
guarantee of providing a unique location to obtain original items -
although publishers have been known to substitute new versions of the
unique object for the "true" original.

At the same time, software packages with complex networks of
dependencies are not exactly easy to maintain with Linux (or Unix)
scripts.  Exploring the integrity of the whole package requires a fair
amount of work by experienced system administrators.

If the intent is to produce data archives (or data production
facilities) that have long-term maintainability, they need to handle
replication [see Barkstrom and Mattmann, 2010, ESI] of objects, as well
as transparency.  The key attributes of such systems need to be
simplicity, provenance integrity, and reliability.  They aren't easy
attributes to maintain over the long haul.  The article on "being
digital" in the current CACM has a useful perspective on how our
enthusiasm for "rupture talk" plays out:
Haigh, T., 2014: We Have Never Been Digital, CACM, 57, No. 09, 24-28.
Peter Denning's article that follows immediately in the print version
[Denning, P. J., 2014: The Profession of IT: Learning for the New
Digital Age, CACM, 57, No. 09, 29-31] offers some additional perspective
that's probably relevant to the issue of the learning curve for new
technologies.  That curve is usually underestimated.  While everyone
wants "user friendly" tools, it isn't easy for developers to get an
accurate idea of how many person-hours of work it will require to make
a user proficient enough to use new tools, particularly in the presence
of "version churn" like we can see in the IsaViz example.

Bruce B.

On Thu, Sep 18, 2014 at 2:54 PM, Lavanya Ramakrishnan <lramakrish...@lbl.gov> wrote:

> Here is my 2c -
>
> I think it is important to try and understand what your users are going to
> do with workflow and what kind of language they are used to
> (domain-specific, functional, etc). There are processes, called
> user-centered design processes, that you can use to do this; at a minimum,
> you can do an informal study.
>
> A couple of years ago, we did an introspection on why the existing
> workflow tools didn't have the uptake we had assumed they would. I have been
> part of a half dozen different tools over my career. We have since launched
> a project called Tigres - http://tigres.lbl.gov/ where we have learned a
> lot due to using a user-centered design approach. We have an IEEE eScience
> paper on our initial work - which you might find interesting. I am also
> happy to share more details on Tigres and/or the process.
>
> Lavanya
>
>
>
>
>
> On Thu, Sep 18, 2014 at 10:53 AM, BW <bw...@mysoftcloud.com> wrote:
>
> > Is there a list of graphical BEL workflow tools?
> >
> > On Thursday, September 18, 2014, Mattmann, Chris A (3980) <
> > chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> > > Hi Guys,
> > >
> > > > I've been interested in this too - we don't per se have a specific
> > > > OODT workflow language, but we specify workflows using XML and
> > > > other configuration (we are also thinking of moving to JSON for
> > > > this).
> > >
> > > In the past I've also looked at YAWL and BPEL - both seem complex
> > > to me.
> > >
> > > I wonder at the end of the day if we should adopt something more
> > > modern like PIG or some other data flow type of language (PIG
> > > is really neat).
> > >
> > > Cheers,
> > > Chris
> > >
> > >
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Chris Mattmann, Ph.D.
> > > Chief Architect
> > > Instrument Software and Science Data Systems Section (398)
> > > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > > Office: 168-519, Mailstop: 168-527
> > > > Email: chris.a.mattm...@nasa.gov
> > > WWW:  http://sunset.usc.edu/~mattmann/
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Adjunct Associate Professor, Computer Science Department
> > > University of Southern California, Los Angeles, CA 90089 USA
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >
> > >
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > > From: Shameera Rathnayaka <shameerai...@gmail.com>
> > > > Reply-To: "architect...@airavata.apache.org"
> > > > <architect...@airavata.apache.org>
> > > > Date: Thursday, September 18, 2014 8:26 AM
> > > > To: "architect...@airavata.apache.org"
> > > > <architect...@airavata.apache.org>,
> > > > dev <dev@airavata.apache.org>
> > > Subject: Evaluate Suitable Scientific Workflow Language for Airavata.
> > >
> > > >Hi All,
> > > >
> > > > >As we all know, Airavata has its own workflow language called XWF.
> > > > >When XWF was introduced, the main focus points were interoperability
> > > > >and convertibility. But with years of experience we are convinced
> > > > >that these requirements are not really useful in real-world use
> > > > >cases. Also, XWF is a bulky XML-based language in which we attach
> > > > >WSDLs and the workflow image itself. With the recent changes, though,
> > > > >the WSDL part is being removed from XWF.
> > > > >
> > > > >It is worth evaluating handy scientific workflow languages in
> > > > >industry to find their pros and cons. At the end of this evaluation
> > > > >we need to come up with an idea of how we should improve the Airavata
> > > > >workflow language: we can improve the existing XWF language, change
> > > > >entirely to a new language available in industry, or write a new
> > > > >lightweight language. The basic requirements we expect from the
> > > > >improvement are high usability, flexibility, light weight, and
> > > > >real-time monitoring support. As you can see, these requirements do
> > > > >not come directly from a workflow language, but we need a workflow
> > > > >language that helps to support them.
> > > >
> > > > >After reading a few papers and googling, I have initially come up
> > > > >with the following three existing languages:
> > > >1. YAWL <http://www.yawlfoundation.org/>
> > > >2. WS-BPEL
> > > >3. SIDL
> > > ><http://computation.llnl.gov/casc/components/index.html#page=home>
> > > >
> > > > >In my opinion SIDL is more familiar to the scientific domain;
> > > > >Radical-SAGA also uses a slightly modified version of SIDL. Besides
> > > > >these three languages, we could come up with a simple workflow
> > > > >language based on JSON (or YAML) that supports all our requirements
> > > > >to some extent.
> > > > >
> > > > >It would be great if I could get more input regarding the $Subject
> > > > >from the Airavata community. You are all more than welcome to provide
> > > > >any type of suggestions.
> > > >
> > > >Thanks,
> > > >Shameera.
> > > >
> > > >
> > > >
> > > >--
> > > >Best Regards,
> > > >Shameera Rathnayaka.
> > > >
> > > >email: shameera AT apache.org , shameerainfo AT gmail.com
> > > >Blog : http://shameerarathnayaka.blogspot.com/
> > >
> > >
> >
>
