On 7/15/07, Tony Gravagno <[EMAIL PROTECTED]> wrote:
Dawn Wolthuis wrote:
> I searched for some appropriate benchmarks in 2002, but it looked like
> all industry standards, such as the TPC benchmarks, would compare apples
> and oranges unless comparing only SQL-based functionality and
> performance of SQL-based RDBMS tools.

TPC-B allowed any sort of database to take part in a transaction-based
benchmark.  It was sometime around -C or -D that they changed the
requirements and the definition of the test itself to conform to relational
structures, thus closing the door for any non-relational database to even
take part.  Talk about stacking the deck...

Agreed.

To compare apples to apples I think you need a number of simple and
progressive tests which can be configured to ensure some amount of equity.
Examples:
1) A single table/file has 10 fields.  Record the time to add 1000 records
to the file.

Even with tests like these, there are issues comparing apples and
oranges when a problem is modeled differently in different
environments. A data modeler approaches a set of functional
requirements differently depending on whether the target is SQL Server
or U2, for example. Those modeling for first normal form and with
tools employing three-valued logic are apt to work with the
requirements and users to move them away from some nuances in the
requirements, such as those that might prompt multivalues or null
values.

So, even if you were to get a benchmark that compares the performance
of 1000 inserts into a single table/file with 10 single-valued
columns/fields, when it comes to a real application, if we were to
model the same requirements to compare performance, we might need to
compare the insert of 1000 records in one file in UniData to the
insertion of 1000 rows in a table plus additional rows in other tables
related to the modeling of the multivalues.

Do that 100 times and get an average.  Do the same for 10k
records, 100k, 1M, and 10M records.  Scale as desired to higher numbers.
2) Using that same table with 1M records, perform a non-indexed sort by
field1, by field2, by field3, etc.  Do the same test with indexes.
3) Using the same 1M records, time how long it takes to update each field.
That is, field 1 for all million records, then field2 for all million
records, etc.
4) Create another table/file and perform join/translates to do similar
sorting and retrieval.  Join to file3, file4, etc, where similar files are
used both in the MVDBMS and the RDBMS.
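
Just to make the shape of such a harness concrete, here is a minimal
sketch of tests 1-3 in Python, using an in-memory SQLite database
purely as a stand-in so the example is self-contained; the table and
column names are invented, and a real comparison would drive the same
steps against each MV and relational product through its own
interface:

# Rough harness for tests 1-3 above.  SQLite is only a placeholder target.
import sqlite3
import time

N_RECORDS = 1_000   # scale to 10k, 100k, 1M, ... as suggested above

def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.3f}s")

conn = sqlite3.connect(":memory:")
cols = ", ".join(f"field{i} TEXT" for i in range(1, 11))
conn.execute(f"CREATE TABLE bench (id INTEGER PRIMARY KEY, {cols})")

# Test 1: time the inserts (a real run would repeat this and average).
rows = [tuple(f"value{i}-{n}" for i in range(1, 11)) for n in range(N_RECORDS)]
col_list = ", ".join(f"field{i}" for i in range(1, 11))
placeholders = ", ".join("?" for _ in range(10))
timed("insert records", lambda: conn.executemany(
    f"INSERT INTO bench ({col_list}) VALUES ({placeholders})", rows))
conn.commit()

# Test 2: non-indexed sort by a field, then the same sort with an index.
timed("sort by field1, no index",
      lambda: conn.execute("SELECT * FROM bench ORDER BY field1").fetchall())
conn.execute("CREATE INDEX idx_field1 ON bench (field1)")
timed("sort by field1, indexed",
      lambda: conn.execute("SELECT * FROM bench ORDER BY field1").fetchall())

# Test 3: update one field across every record.
timed("update field1 for all records",
      lambda: conn.execute("UPDATE bench SET field1 = field1 || '-x'"))
conn.commit()

The same skeleton extends to test 4 by adding the second and third
tables and timing the joins (or, on the MV side, the equivalent
translates).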

I liked the facts that Stephen O'Neal posted a while back about how
many users were actively using a database as well.  All of this
information would be helpful, but it would be more useful, I think, to
implement the same requirements in multiple environments, then look at
both performance and functionality (and user satisfaction) of usable
software.

5) Do similar tests using common ODBC/OLEDB tools.

You will be hard-pressed to get products like OpenQM to participate in
that test ;-) Such tests can yield useful information, but if the
products are never or rarely used with such tools, then those products
can be written out of the performance testing altogether.  Oracle
might do poorly with ODBC (it used to, at least), but since almost
every 3rd party was building Oracle-specific drivers, how it did with
ODBC was almost irrelevant.  My point is that each test can favor one
platform or another unless you look at actual solutions to problems
built for each environment and ask the quality-requirement questions
about them, including performance.

Use custom code,
Crystal Reports, or any other tools that might be used in the field to
retrieve and update data.  This test factors in how long it takes to
transmit data through the native interfaces, and is thus more of a
real-world benchmark resembling end-user transaction time.
6) Do similar tests with N users all posting similar transactions at the
same time - tests must be designed to initialize all N users, then get
bench numbers, then cycle down.  Test user numbers 50, 100, 200, 500, 1000,
5000, 10000, and 20000.  A clear bell curve should arise, and it would
be interesting to see where that curve peaks for each DBMS type (see
the bare-bones skeleton sketched after this list).
7) Test processing of data using simple subroutines / stored procedures.
8) Use triggers to simulate referential integrity and note performance of
varying data volumes, transaction volumes, and concurrent users.
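
Going back to test 6, a bare skeleton of the concurrency measurement
might look like the following; post_transaction() is a made-up
placeholder (here just a no-op), and in a real test it would submit
the same transaction to the DBMS under test through whatever client
interface that platform normally uses:

# Skeleton for test 6: N concurrent workers each posting similar
# transactions while we time aggregate throughput.
import time
from concurrent.futures import ThreadPoolExecutor

TXNS_PER_USER = 100

def post_transaction(user_id: int, txn_no: int) -> None:
    # Placeholder for "insert an order", "post a payment", etc.
    _ = (user_id, txn_no)

def run_with_users(n_users: int) -> float:
    """Return transactions per second with n_users concurrent workers."""
    def worker(user_id: int) -> None:
        for txn_no in range(TXNS_PER_USER):
            post_transaction(user_id, txn_no)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        list(pool.map(worker, range(n_users)))
    elapsed = time.perf_counter() - start
    return (n_users * TXNS_PER_USER) / elapsed

# Extend toward 1000, 5000, 10000, 20000 users as the hardware allows.
for n in (50, 100, 200, 500):
    print(f"{n:>5} users: {run_with_users(n):,.0f} txn/s")

The interesting output is where the transactions-per-second figure
peaks as the user count climbs, which is the curve Tony describes.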

All tests must be performed some number of times on newly initialized
systems to avoid the benefits of memory caching which will remove much of
the burden of disk IO in tests 2-N.  Tests should be performed on the same
hardware with equal memory, CPU, disk IO controllers, etc.

So, I think we can devise our own tests

Sure we could.

and coordinate with RDBMS DBA types

heh heh

to ensure they're conducted fairly.  The only thing I'm concerned about is
funding.  There are very few organizations who want to pay for the time to
do this sort of thing - outside of the DBMS vendors themselves, and I'm
afraid anything they come up with won't be portable to other MV or RDBMS
platforms, and they would keep the exact tests a deep dark secret.  If
anyone does want to pay for it, I'll be happy to make it happen.

When I focused on comparisons a few years back and tried to come up
with a way to make them fair, including on performance, I wanted to
gather better empirical data so that a company could make a good
choice among the various tools to be used, particularly database
management tools, for software development.  I became convinced we
needed the bake-off approach, starting with actual requirements.  We
need to eat the results (the baked goods) and get a full range of
comparisons of the various implementations of a solution to the same
problem in order to compare them in a way that would be helpful toward
making a decision.

If you look at an application, such as those provided by Datatel, an
RDBMS data modeler would be able to tell that the data modeling was
done without thinking strictly in RDBMS terms.  There are many
nullable fields, multivalues where an RDBMS data modeler might have
suggested to the "owner" of the app that a single value would be
sufficient, and other telltale signs that this application ought to
perform better in an MV environment. I haven't kept up, so I don't
know if the UniData implementation still outperforms the SQL Server
environment, in general, for this app from the standpoint of
transactions or not.  SQL Server does provide an environment that is
most handily used with ODBC/OLEDB tools.  UniData, on the other hand,
does not require the use of ODBC tools (you can use Entrinsik
Informer, for example, which is easier than using SQL-based tools, in
general, since it "thinks like" the underlying data model).  I would
guess that the UniData environment in this case will continue to
outperform the SQL Server implementation, unless the VAR opts to break
UniData (maybe by trying to make UniData function in two tiers ;-),
for example, should they decide they want everyone to migrate to SQL
Server.

I think that in order for SQL Server to be on equal footing in a
performance comparison in that particular case, the original modeling
for the application would need to be done with SQL Server as the
target, at which point performance and functionality of related
applications must both be taken into consideration or the performance
comparisons could be irrelevant.

Similarly, if the data model was originally designed for an RDBMS, MV
could be at a disadvantage.  The best performance tests, I think,
would be related to the performance of a particular set of real
requirements in each environment.

By the way, Tony, I did come up with a business model for doing such
bake-offs that would be sustainable once off the ground, but the only
one I could come up with required prohibitive up-front dollars, plus
the process would need to be fair (unbiased) and also perceived as
fair.  As I am sure our politicians know, it is difficult to be
unbiased if dollars are coming from here and not there, and impossible
for others to believe you are unbiased if you are getting dollars from
here and not there.  --dawn

--
Dawn M. Wolthuis
Tincat Group, Inc.

Take and give some delight today