[service-orientated-architecture] Goodson & Jason on Building a Data Services Layer

Gervas Douglas Sun, 29 Jun 2008 03:22:40 -0700

<<Best Practices for SOA: Building a Data Services Layer

These days nearly every sizable organization has either implemented some form of SOA or has it on its roadmap. These organizations quickly find that SOA efforts tend to expand like spider webs, eventually touching every corner of IT as well as the business itself. Due to the vital role that data plays in both business and systems operations, database architects, information specialists, data integration experts, and anyone responsible for data persistence in an organization are increasingly being called on to contribute to their organization’s SOA initiatives, whether or not this was intended at the outset.

Information locked away inside monolithic application silos has proven to be a stubborn obstacle to the flexibility that modern businesses require. If businesses are to have any hope of building flexible services that offer the performance and agility needed to succeed with SOA, those businesses must solve the technical challenge of accessing information – that is, data – across application platforms and their organization as a whole.

System architects who fail to devote sufficient planning to data access issues and attempt to layer a service-oriented approach on top of their existing data sources often find that providing flexibility above the service abstraction requires complex changes at the data source level, impeding the agility they sought and thereby undermining one of the core rationales for implementing SOAs in the first place.

In traditional distributed architectures, developers write data access code, which they might then seek to make reusable. However, if that data access code contains a problem, the problem propagates to every application that requires access to that particular data source. Furthermore, whenever anything changes (including the underlying database, the data model, or the version of the coding environment being used), the data access code must be updated everywhere it appears.

Data sources range from all kinds of structured data stores (such as relational databases, mainframe data sources, and enterprise applications) to semi-structured or unstructured data such as Web pages, PDF documents, office application files, XML documents, e-mail, media content, print streams, and a wide variety of content and data feeds and formats. Accessing and processing all these disparate types of information from so many disparate sources via the tightly coupled approach of traditional distributed data access would clearly constitute a technical support challenge of monumental proportions.

Properly architected SOA presents both business functionality and data as abstracted services. Applied to data access, if access is abstracted as data services and the access code moved into supporting infrastructure, then problems can be addressed and changes can be supported throughout the environment in a much more loosely coupled and flexible manner. In essence, the data services layer provides a single abstracted point of access for all data access, update, and creation operations, providing a holistic view of the data models that the underlying persistence layer relies on. It acts as a bridge between the business services and the underlying data persistence layer; business users needn’t concern themselves with whether the data they are consuming originated in the database, an enterprise application, a file system, another company, or anywhere else for that matter. This promise of ubiquitously accessible data freed from the constraints of its sources is liberating for companies as they struggle with the challenges of systems integration.

A data services layer must provide an interface that exposes a standard set of reusable data services for reading and writing data, independent from the underlying data sources. The loose coupling it enables between applications using the services and the underlying data source providers lets database architects modify, combine, relocate, or even remove underlying data sources from the data services layer without requiring changes to the interfaces that the data services expose. As a result, the architects can retain control over the structure of their data while providing relevant information to the applications that need it. Over time, this increased flexibility eases the maintenance of enterprise applications.
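The loose coupling described above can be illustrated with a minimal sketch. It is not taken from the article: the names CustomerDataService, InMemoryCustomerService, SqliteCustomerService, and rename_customer are hypothetical, and SQLite merely stands in for whatever data source sits behind the service. The point is only that the consuming code depends on the service interface, so the data source can be swapped without changing that code.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class CustomerDataService(ABC):
    """Hypothetical data service contract; consumers see only these methods."""

    @abstractmethod
    def get_customer(self, customer_id: int) -> Optional[dict]:
        ...

    @abstractmethod
    def save_customer(self, customer: dict) -> None:
        ...


class InMemoryCustomerService(CustomerDataService):
    """Backs the service with a plain dictionary (assumed data source)."""

    def __init__(self) -> None:
        self._rows = {}

    def get_customer(self, customer_id: int) -> Optional[dict]:
        row = self._rows.get(customer_id)
        return dict(row) if row else None

    def save_customer(self, customer: dict) -> None:
        self._rows[customer["id"]] = dict(customer)


class SqliteCustomerService(CustomerDataService):
    """Backs the same service with a relational table (assumed schema)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers "
            "(id INTEGER PRIMARY KEY, name TEXT)"
        )

    def get_customer(self, customer_id: int) -> Optional[dict]:
        cur = self._conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        )
        row = cur.fetchone()
        return {"id": row[0], "name": row[1]} if row else None

    def save_customer(self, customer: dict) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)",
            (customer["id"], customer["name"]),
        )
        self._conn.commit()


def rename_customer(service: CustomerDataService, customer_id: int,
                    new_name: str) -> dict:
    """Consumer code: written against the interface, not a data source."""
    customer = service.get_customer(customer_id)
    if customer is None:
        raise KeyError(customer_id)
    customer["name"] = new_name
    service.save_customer(customer)
    return service.get_customer(customer_id)
```

Passing either InMemoryCustomerService() or SqliteCustomerService(sqlite3.connect(":memory:")) to rename_customer exercises the same consumer code against two different stores, which is the kind of substitution the interface is meant to permit.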

Before the advent of SOA, developers built the capabilities that the abstracted data service layer could provide using manual coding, tightly embedding that code into the application under construction. Embedding such data access and data abstraction code directly into applications limits the flexibility and reusability of the resulting applications and, as a result, enterprises looked to traditional middleware such as Extract, Transform, Load (ETL), and Enterprise Application Integration (EAI) products to provide the capabilities of the data service layer from a middleware perspective. The ETL approach is best suited to static applications that don’t require the flexibility that SOA can provide. It can be costly, as well, and requires a high management overhead. The EAI approach centralizes data interaction logic, but still fails to provide the flexibility many organizations require from their data services layer.

Even when an organization is implementing true SOA, however, a poorly architected data services layer can lead to performance issues. In many cases, each application has its own database which contains a duplicate copy of business reference data such as customer information, product information, and inventory levels, as shown in Figure 1.

The organization must synchronize such databases on a regular schedule, which can leave data stale or inconsistent between applications until the next synchronization. In this case, even when an organization has exposed application functionality as loosely coupled services, the persistence tier still limits the flexibility and reusability of the services. Incorporating a data services layer into the SOA implementation addresses this problem. Data services offer transaction and connection management to multiple applications, as shown in Figure 2.

The data services layer manages the relationships between the data services and ensures that each application is aware of all data changes, regardless of the cause of the change. Leveraging data services as infrastructure services beneath the business service abstraction increases reusability and flexibility, and shortens development and rollout time for new services. A well-architected data services layer relies on consistent, high-performance data access middleware that leverages industry-standard APIs such as ODBC, JDBC, ADO.NET, and SDO.

To sum it up, building a data services layer as part of an SOA implementation provides the infrastructure necessary to reap the full benefits of SOA. Specifically, it provides the following solutions to common data problems:

Reduced reliance on hand-coded persistence logic: By building abstracted, shared data services, it’s possible to leverage best-of-breed data access middleware to support the data services as part of an architected plan.
More flexible integration: Instead of the point-to-point or hard-coded integration of traditional integration approaches, the data services layer provides for loose coupling at the persistence tier.
Fewer data query bottlenecks: The data services layer enables organizations to implement content-based routing techniques to avoid bottlenecks.
Improved data consistency: The data services layer functions as a single locus for data access in the enterprise, helping an organization drive toward a single source of truth for its data. Implementing a data services layer helps ensure that applications always pull data from the correct sources and consistently provide appropriate information back to all applications.
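
The content-based routing mentioned in the list above can be sketched roughly as follows. This is an illustrative assumption rather than anything the article specifies: DataRequest, ContentBasedRouter, and the target names such as "catalog-replica" are all hypothetical. The router inspects the content of each request and picks a data source, so that, for example, read-heavy catalog queries do not compete with writes on the primary store.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DataRequest:
    """Hypothetical request envelope seen by the data services layer."""
    entity: str          # e.g. "customer" or "product"
    operation: str       # "read" or "write"
    region: str = "any"


class ContentBasedRouter:
    """Chooses a target data source by inspecting the request content."""

    def __init__(self, default: str) -> None:
        self._rules: List[Tuple[Callable[[DataRequest], bool], str]] = []
        self._default = default

    def add_rule(self, predicate: Callable[[DataRequest], bool],
                 target: str) -> None:
        # Rules are evaluated in the order they were added.
        self._rules.append((predicate, target))

    def route(self, request: DataRequest) -> str:
        for predicate, target in self._rules:
            if predicate(request):
                return target
        return self._default


# Example configuration; the target names are placeholders.
router = ContentBasedRouter(default="primary-db")
router.add_rule(
    lambda r: r.entity == "product" and r.operation == "read",
    "catalog-replica",
)
router.add_rule(lambda r: r.region == "eu", "eu-db")
```

Because rules are checked in insertion order, the first matching rule wins; a production router would likely also need rule priorities and health checks, which this sketch leaves out.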

Without doubt, architects who focus on SOA must take a hard look at solving the data access challenges in order to plan a successful, loosely coupled architecture. Accessing data in the various stores across the enterprise in the most efficient, flexible manner is the basis for building data services, and is thus critical for building all the services in the SOA implementation.

The Importance of Best-of-Breed Data Access in Building a Data Services Layer
Ask just about any business executive about the core value IT provides to the business, and they’re likely to respond that data (in the form of useful information), rather than applications, forms the lifeblood of their business. In essence they’re correct; information technology is by definition all about information. But the two, data and applications, can be separated only in theory, not in fact. Data by itself is often inaccessible and/or unintelligible without the applications that process it, and most applications in turn serve no real business purpose without data.

While establishing a data services layer is essential if you want to fully apply SOA best practices to data, it’s equally important to realize that the foundation of that data services layer consists of data access. Data access, in turn, depends on standards such as ODBC, JDBC, ADO.NET, and SDO. Even if you’re using a persistence layer like one built with the open source Hibernate toolkit, for example, high-quality data access middleware is critical. A slow JDBC driver under Hibernate or a slow ADO.NET provider under Microsoft’s Language Integrated Query (LINQ) will inevitably impede the services. The results can be far reaching. Accessing the data in the various data stores across your organization in the most efficient, flexible manner is a core reason for building data services. It’s therefore critical for building all the services in your SOA implementation.

Performance is only one of the issues that data access middleware can affect. Because the data services layer essentially virtualizes data access, those services can hide a plethora of other potential data access pitfalls, including:

Scalability issues
Database platform, application, and version incompatibilities
Differences in how the various data sources handle the details of standard data access operations such as create, read, search, update, and delete
Data source security priorities, which might vary from database to database, table to table, or even row to row
Data mapping challenges arising from semantic differences among heterogeneous data
Problems arising from the mixture of structured and unstructured data — such as file formatting issues
Differences in versions of SQL
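
As one small illustration of the SQL differences listed above, the sketch below builds a "first N rows" query for a few dialects. The function name first_rows_query and the dialect labels are hypothetical. The three syntaxes are the commonly documented ones: LIMIT in PostgreSQL, MySQL, and SQLite; TOP in SQL Server; and FETCH FIRST from the SQL:2008 standard. Which form a given database version accepts should be checked against that product's documentation.

```python
from typing import Sequence


def first_rows_query(dialect: str, table: str,
                     columns: Sequence[str], limit: int) -> str:
    """Build a 'first N rows' SELECT for the given (hypothetical) dialect label.

    Table and column names are interpolated directly, so they must come
    from trusted configuration, never from user input.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    cols = ", ".join(columns)
    if dialect in ("postgresql", "mysql", "sqlite"):
        return f"SELECT {cols} FROM {table} LIMIT {limit}"
    if dialect == "sqlserver":
        return f"SELECT TOP ({limit}) {cols} FROM {table}"
    if dialect == "sql2008":
        # Standard row-limiting clause introduced in SQL:2008.
        return f"SELECT {cols} FROM {table} FETCH FIRST {limit} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")
```

Hiding this kind of branching inside the data access layer is what keeps it out of every business service that needs the first few rows of a table.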

SOA can actually magnify limitations in scalability, performance, and interoperability imposed by data access, because organizations are more likely both to reuse the underlying code and to leverage existing data services across many services and composite applications.

Your first step in resolving these data-related issues, then, is to build shared, centralized data services, which will result in an adaptable and easily maintained SOA implementation. This way, data access logic appears in only one place, no matter how many applications consume it. Instead of scattering data-related service invocation code throughout each business service, you centralize data access and create an environment in which a best-of-breed data access solution can address all those data access issues.

This last point – the use of best-of-breed access solutions – can’t be stressed strongly enough. Even many people involved in data management rarely give data access middleware more than a passing thought. One reason for this is that commercial databases all include connectivity drivers bundled with the database software, which are often used by default. The fact that these “free” drivers may be less than optimal for a given IT environment typically doesn’t come up unless and until either a technical or a maintenance issue becomes serious enough to be traced back to the data connectivity layer. For reasons we’ve described, such issues can be compounded in an SOA and can also be extremely difficult to diagnose.

Quite simply, it is ill advised to rely on database-bundled data access middleware for your data services layer in an enterprise SOA. The same holds true for freely available open source drivers. As the very foundation of your data services layer, the type of data access you use deserves serious consideration and careful planning. Dynamic environments where multiple services reuse data access code and new services regularly go into production require that such code meet rigorous requirements.

Your best bet is to go with third-party data access middleware, preferably from a supplier whose core business and expertise consist of data connectivity. Look for specific attributes in your data access middleware, including capabilities for boosting query performance. Such attributes can include connection pooling as well as support for tunable data access performance such as adjusting network packet size. For scalability and high availability, data access middleware should be multithreaded and thread-safe, and offer client load balancing and failover to alternate servers.
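
Two of the attributes named above, connection pooling and failover to alternate servers, can be sketched in a few lines. This is a simplified illustration, not how any particular driver implements them: ConnectionPool, connect_with_failover, and the factory and connect callables are hypothetical stand-ins for whatever a real driver provides.

```python
import queue
from contextlib import contextmanager
from typing import Callable, Iterable, Tuple


class ConnectionPool:
    """Fixed-size, thread-safe pool that reuses pre-opened connections."""

    def __init__(self, factory: Callable[[], object], size: int) -> None:
        # queue.Queue handles the locking needed for multithreaded callers.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout: float = None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller failed.
            self._idle.put(conn)


def connect_with_failover(servers: Iterable[str],
                          connect: Callable[[str], object]) -> Tuple[str, object]:
    """Try each server in order; return the first that accepts a connection."""
    errors = []
    for server in servers:
        try:
            return server, connect(server)
        except ConnectionError as exc:
            errors.append(f"{server}: {exc}")
    raise ConnectionError("all servers failed: " + "; ".join(errors))
```

A caller would write `with pool.connection() as conn:` around each unit of work, so opening a connection is paid for once per pool slot rather than once per request. Real middleware adds validation of idle connections, load balancing, and retry policies that this sketch omits.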

It’s also important for data access middleware to support different types and versions of databases as well as all the subtle variations in SQL they support. Such support for heterogeneity should also extend to multiple computing platforms, chipsets, and operating systems. In addition, for the best performance and flexibility, data access middleware should offer wire protocol drivers to avoid the overhead and maintenance issues of off-the-shelf database drivers. Wire protocol drivers obviate the need for database client software and libraries, simplifying installation and administration as well as offering a much more efficient and high-performing operation. Such drivers must support the full panoply of relevant standards, including JDBC, ODBC, ADO.NET, and – over time – SDO.

Finally, you should incorporate data access middleware in a comprehensive IT security strategy that covers both network security and database security, with secure communications and secure code. It should integrate with multiple solutions for authentication and authorization, too.

It’s important to choose best-of-breed data access middleware as a critical building block for any SOA initiative you undertake in your business (see sidebar). Data access is a fundamental building block of SOA, and if you fail to make good choices there, the entire services infrastructure will suffer for it. Fundamentally, regardless of the SOA infrastructure that runs above the data services layer, there’s no question that data access remains a key building block technology for SOA.>>

You can read this at:

http://soa.sys-con.com/read/584308.htm

Gervas
