Re: [PROPOSAL] Climate Model Diagnostic Analyzer

Henry Saputra Wed, 25 Mar 2015 08:25:00 -0700

Hi Chris,

Great proposal.


Looks like the people from CMU are excluded from the list of initial committers?
They are mentioned in the affiliations section but not in the
committers section.


- Henry

On Sun, Mar 22, 2015 at 10:55 PM, Mattmann, Chris A (3980)
<chris.a.mattm...@jpl.nasa.gov> wrote:
> Hi Everyone,
>
> I am pleased to submit for consideration to the Apache Incubator
> the Climate Model Diagnostic Analyzer proposal. We are actively
> soliciting interested mentors in this project related to climate
> science and analytics and big data.
>
> Please find the wiki text of the proposal below and the link up
> on the wiki here:
>
> https://wiki.apache.org/incubator/ClimateModelDiagnosticAnalyzerProposal
>
> Thank you for your consideration!
>
> Cheers,
> Chris
> (on behalf of the Climate Model Diagnostic Analyzer community)
>
> = Apache ClimateModelDiagnosticAnalyzer Proposal =
>
> == Abstract ==
>
> The Climate Model Diagnostic Analyzer (CMDA) provides web services for
> multi-aspect physics-based and phenomenon-oriented climate model
> performance evaluation and diagnosis through the comprehensive and
> synergistic use of multiple observational data, reanalysis data, and model
> outputs.
>
> == Proposal ==
>
> The proposed web-based tools let users display, analyze, and download
> earth science data interactively. These tools help scientists quickly
> examine data to identify specific features, e.g., trends, geographical
> distributions, etc., and determine whether a further study is needed. All
> of the tools are designed and implemented to be general so that data from
> models, observation, and reanalysis are processed and displayed in a
> unified way to facilitate fair comparisons. The services prepare and
> display data as a colored map or an X-Y plot and allow users to download
> the analyzed data. Basic visual capabilities include 1) displaying a
> two-dimensional variable as a map, zonal mean, or time series, and 2)
> displaying a three-dimensional variable’s zonal mean, a two-dimensional
> slice at a specific altitude, or a vertical profile. General analysis can
> be done using the difference, scatter plot, and conditional sampling
> services. All the tools support display options for using linear or
> logarithmic scales and allow users to specify a temporal range and months
> in a year. The source/input datasets for these tools are CMIP5 model
> outputs, Obs4MIP observational datasets, and ECMWF reanalysis datasets.
> They are stored on the server and are selectable by a user through the web
> services.
>
> === Service descriptions ===
>
> 1. '''Two dimensional variable services'''
>
> * Map of two-dimensional variable:  This service displays a
> two-dimensional variable as a colored longitude and latitude map with values
> represented by a color scheme. Longitude and latitude ranges can be
> specified to magnify a specific region.
>
> * Two dimensional variable zonal mean:  This service plots the zonal mean
> value of a two-dimensional variable as a function of the latitude in terms
> of an X-Y plot.
>
> * Two dimensional variable time series:  This service displays the average
> of a two-dimensional variable over the specified region as a function of
> time as an X-Y plot.
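
The zonal mean and regional time series described above can be sketched in a few lines of NumPy. This is only an illustration under assumed conventions: a (time, lat, lon) array layout, cosine-of-latitude area weights, and hypothetical function names. The proposal does not say how CMDA implements these services.

```python
import numpy as np

def zonal_mean(field):
    """Average a (..., lat, lon) field over longitude, ignoring NaNs.

    Returns one value per latitude (hypothetical helper, not CMDA code).
    """
    return np.nanmean(field, axis=-1)

def regional_time_series(data, lats, lons, lat_range, lon_range):
    """Area-weighted mean of a (time, lat, lon) array over a lat/lon box.

    Assumes a regular grid with no missing values inside the box.
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    box = data[:, lat_mask][:, :, lon_mask]
    # cos(latitude) approximates the relative area of each grid cell
    weights = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return (box * weights).sum(axis=(1, 2)) / weights.sum()
```

Plotting `zonal_mean(field)` against latitude, or the returned series against time, would give the X-Y plots the two services describe.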
>
> 2. '''Three dimensional variable services'''
>
> * Map of a two dimensional slice of a three-dimensional variable:  This
> service displays a two-dimensional slice of a three-dimensional variable
> at a specific altitude as a colored longitude and latitude map with values
> represented by a color scheme.
>
> * Three dimensional zonal mean:  Zonal mean of the specified
> three-dimensional variable is computed and displayed as a colored
> altitude-latitude map.
>
> * Vertical profile of a three-dimensional variable:  This service computes
> the area-weighted average of a three-dimensional variable over the
> specified region and displays the average as a function of pressure level
> (altitude) as an X-Y plot.
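
As a rough sketch of the vertical-profile computation, the snippet below averages a (level, lat, lon) array at each level with cosine-of-latitude weights and skips missing values. The array layout, the weighting scheme, and the function name are assumptions made for illustration, not details taken from the proposal.

```python
import numpy as np

def vertical_profile(data, lats):
    """Area-weighted mean of a (level, lat, lon) array at each level.

    NaNs (e.g. pressure levels below the surface) are excluded from both
    the weighted sum and the weight total. A level with no valid cells
    yields NaN. Hypothetical helper, not CMDA code.
    """
    weights = np.cos(np.deg2rad(lats))[None, :, None]  # vary with latitude only
    weights = np.broadcast_to(weights, data.shape)
    valid = np.isfinite(data)
    weighted_sum = np.where(valid, data * weights, 0.0).sum(axis=(1, 2))
    weight_total = np.where(valid, weights, 0.0).sum(axis=(1, 2))
    return weighted_sum / weight_total
```

The result has one value per pressure level and could be plotted against the level coordinate to give the X-Y profile.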
>
> 3. '''General services'''
>
> * Difference of two variables:  This service displays the differences
> between two variables, each of which can be either a two-dimensional
> variable or a slice of a three-dimensional variable at a specified
> altitude, as colored longitude and latitude maps.
>
> * Scatter and histogram plots of two variables:  This service displays the
> scatter plot (X-Y plot) between two specified variables and the histograms
> of the two variables. The number of samples can be specified and the
> correlation is computed. The two variables can be either a two-dimensional
> variable or a slice of a three-dimensional variable at a specific altitude.
>
> * Conditional sampling:  This service lets the user sort a physical
> quantity of two or three dimensions according to the values of another
> variable (environmental condition, e.g. SST), which may be a
> two-dimensional variable or a slice of a three-dimensional variable at a
> specific altitude. For a two-dimensional quantity, the result is displayed
> as an X-Y plot, and for a three-dimensional quantity, it is displayed as a
> colored map.
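
One way to read the conditional sampling service is as a binned average: group the samples of one quantity by the value of the environmental variable, then average within each bin. The sketch below shows that reading. The function name, the bin-edge argument, and the choice of the mean as the per-bin statistic are all assumptions; the proposal does not specify the method.

```python
import numpy as np

def conditional_sample(target, condition, bin_edges):
    """Mean of `target` within each bin of `condition` (e.g. SST).

    Bins are half-open: [edge[i], edge[i+1]). Samples outside the edges,
    or with a NaN in either array, are dropped. Returns (means, counts);
    empty bins get a mean of NaN. Hypothetical helper, not CMDA code.
    """
    target = np.asarray(target, dtype=float).ravel()
    condition = np.asarray(condition, dtype=float).ravel()
    valid = np.isfinite(target) & np.isfinite(condition)
    target, condition = target[valid], condition[valid]
    # np.digitize returns i such that bin_edges[i-1] <= x < bin_edges[i]
    idx = np.digitize(condition, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    means = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = idx == b
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            means[b] = target[in_bin].mean()
    return means, counts
```

Plotting the per-bin means against the bin centers gives an X-Y curve of the sampled quantity as a function of the environmental condition.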
>
>
> == Background and Rationale ==
>
> The latest Intergovernmental Panel on Climate Change (IPCC) Fourth
> Assessment Report stressed the need for the comprehensive and innovative
> evaluation of climate models with newly available global observations. The
> traditional approach to climate model evaluation, which is the comparison
> of a single parameter at a time, identifies symptomatic model biases and
> errors but fails to diagnose the model problems. The model diagnosis
> process requires physics-based multi-variable comparisons, which typically
> involve large-volume and heterogeneous datasets, and computationally
> demanding and data-intensive operations. We propose to develop a
> computationally efficient information system to enable the physics-based
> multi-variable model performance evaluations and diagnoses through the
> comprehensive and synergistic use of multiple observational data,
> reanalysis data, and model outputs.
>
> Satellite observations have been widely used in model-data
> inter-comparisons and model evaluation studies. These studies normally
> involve the comparison of a single parameter at a time using a time and
> space average. For example, modeling cloud-related processes in global
> climate models requires cloud parameterizations that provide quantitative
> rules for expressing the location, frequency of occurrence, and intensity
> of the clouds in terms of multiple large-scale model-resolved parameters
> such as temperature, pressure, humidity, and wind. One can evaluate the
> performance of the cloud parameterization by comparing the cloud water
> content with satellite data and can identify symptomatic model biases or
> errors. However, in order to understand the cause of the biases and
> errors, one has to simultaneously investigate several parameters that are
> integrated in the cloud parameterization.
>
> Such studies, aimed at a multi-parameter model diagnosis, require
> locating, understanding, and manipulating multi-source observation
> datasets, model outputs, and (re)analysis outputs that are physically
> distributed, massive in volume, heterogeneous in format, and provide
> little information on data quality and production legacy. Additionally,
> these studies involve various data preparation and processing steps that
> can easily become computationally demanding since many datasets have to be
> combined and processed simultaneously. It is well known that scientists
> spend more than 60% of their research time just preparing datasets
> before they can be analyzed.
>
> To address these challenges, we propose to build the Climate Model
> Diagnostic Analyzer (CMDA), which will enable a streamlined and structured
> preparation
> of multiple large-volume and heterogeneous datasets, and provide a
> computationally efficient approach to processing the datasets for model
> diagnosis. We will leverage the existing information technologies and
> scientific tools that we developed in our current NASA ROSES COUND, MAP,
> and AIST projects. We will utilize the open-source Web-service technology.
> We will make CMDA complementary to other climate model analysis tools
> currently available to the research community (e.g., PCMDI’s CDAT and
> NCAR’s CCMVal) by focusing on the missing capabilities such as conditional
> sampling, and probability distribution function and cluster analysis of
> multiple-instrument datasets. The users will be able to use a web browser
> to interface with CMDA.
>
> == Current Status ==
>
> The current version of ClimateModelDiagnosticAnalyzer was developed by a
> team at The Jet Propulsion Laboratory (JPL). The project was initiated as
> a NASA-sponsored project (ROSES-CMAC) in 2011.
>
> == Meritocracy ==
>
> The current developers are not familiar with meritocratic open source
> development at Apache, but would like to encourage this style of
> development for the project.
>
> == Community ==
>
> While ClimateModelDiagnosticAnalyzer started as a JPL research project, it
> has been used in The 2014 Caltech Summer School sponsored by the JPL
> Center for Climate Sciences. Some 23 students from different institutions
> around the world participated. We deployed the tool to the Amazon Cloud and
> gave each student his or her own virtual machine. Students gave
> positive feedback mostly on the usability and speed of our web services.
> We also collected a number of enhancement requests. We seek to further
> grow the developer and user communities using the Apache open source
> venue. During incubation we will explicitly seek increased academic
> collaborations (e.g., with Carnegie Mellon University) as well as
> industrial participation.
>
> One instance of our web services can be found at:
> http://cmacws.jpl.nasa.gov:8080/cmac/
>
> == Core Developers ==
>
> The core developers of the project are JPL scientists and software
> developers.
>
> == Alignment ==
>
> Apache is the most natural home for taking the
> ClimateModelDiagnosticAnalyzer project forward. It is well-aligned with
> some Apache projects such as Apache Open Climate Workbench.
> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style
> development model; it is seeking a broader community of contributors and
> users in order to achieve its full potential and value to the Climate
> Science and Big Data community.
>
> There are also a number of dependencies that will be mentioned below in
> the Relationships with Other Apache products section.
>
>
> == Known Risks ==
>
> === Orphaned products ===
>
> Given the current level of intellectual investment in
> ClimateModelDiagnosticAnalyzer, the risk of the project being abandoned is
> very small. Carnegie Mellon University and JPL are collaborating
> (2014-2015) to build a service for climate analytics workflow
> recommendation using funding from NASA. A two-year NASA AIST project
> (2015-2016) will soon start to add diagnostic analysis methodologies such
> as conditional sampling method, conditional probability density function,
> data co-location, and random forest. We will also infuse the provenance
> technology into CMDA so that the history of the data products and
> workflows will be automatically collected and saved. This information will
> also be indexed so that the products and workflows are searchable by
> the community of climate scientists and students.
>
> === Inexperience with Open Source ===
>
> The current developers of ClimateModelDiagnosticAnalyzer are inexperienced
> with Open Source. However, our Champion Chris Mattmann is experienced
> (Champion of Apache Open Climate Workbench and AsterixDB) and will be
> working closely with us, also as the Chief Architect of our JPL section.
>
> === Relationships with Other Apache Products ===
>
> Clearly there is a direct relationship between this project and Apache
> Open Climate Workbench, already a top-level Apache project and also brought
> to the ASF by its Champion (and ours), Chris Mattmann. We plan on directly
> collaborating with the Open Climate Workbench community via our Champion
> and we also welcome ASF mentors familiar with the OCW project to help
> mentor our project. In addition our team is extremely welcoming of ASF
> projects and if there are synergies with them we invite participation in
> the proposal and in the discussion.
>
> === Homogeneous Developers ===
>
> The current community is within JPL but we would like to increase the
> heterogeneity.
>
> === Reliance on Salaried Developers ===
>
> The initial committers are full-time JPL staff from 2013 to 2014. The
> other committers from 2014 to 2015 are a mix of CMU faculty, students and
> JPL staff.
>
> === An Excessive Fascination with the Apache Brand ===
>
> We believe in the processes, systems, and framework Apache has put in
> place. Apache is also known to foster a great community around their
> projects and provide exposure. While brand is important, our fascination
> with it is not excessive. We believe that the ASF is the right home for
> ClimateModelDiagnosticAnalyzer and that having
> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a better
> long-term outcome for the Climate Science and Big Data community.
>
> === Documentation ===
>
> The ClimateModelDiagnosticAnalyzer services and documentation can be found
> at: http://cmacws.jpl.nasa.gov:8080/cmac/.
>
> === Initial Source ===
>
> Current source resides in ...
>
> === External Dependencies ===
>
> ClimateModelDiagnosticAnalyzer depends on a number of open source projects:
>
>  * Flask
>  * Gunicorn
>  * Tornado Web Server
>  * GNU Octave
>  * EPD Python
>  * NOAA Ferret
>  * gnuplot
>
> == Required Resources ==
>
> === Developer and user mailing lists ===
>
>  * priv...@cmda.incubator.apache.org (with moderated subscriptions)
>  * comm...@cmda.incubator.apache.org
>  * d...@cmda.incubator.apache.org
>  * us...@cmda.incubator.apache.org
>
> A git repository
>
> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
>
> A JIRA issue tracker
>
> https://issues.apache.org/jira/browse/CMDA
>
> === Initial Committers ===
>
> The following is a list of the planned initial Apache committers (the
> active subset of the committers for the current repository at Google Code).
>
>  * Seungwon Lee (seungwon....@jpl.nasa.gov)
>  * Lei Pan (lei....@jpl.nasa.gov)
>  * Chengxing Zhai (chengxing.z...@jpl.nasa.gov)
>  * Benyang Tang (benyang.t...@jpl.nasa.gov)
>
>
> === Affiliations ===
>
> JPL
>
>  * Seungwon Lee
>  * Lei Pan
>  * Chengxing Zhai
>  * Benyang Tang
>
> CMU
>
>  * Jia Zhang
>  * Wei Wang
>  * Chris Lee
>  * Xing Wei
>
> == Sponsors ==
>
> NASA
>
> === Champion ===
>
> Chris Mattmann (NASA/JPL)
>
> === Nominated Mentors ===
>
> TBD
>
> === Sponsoring Entity ===
>
> The Apache Incubator
>
>
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org

