Chris,
first of all, thanks for your interest in this subject.
Please see my comments inline:
Chris Howe wrote:
Jacopo,
I still don't think I'm understanding the benefit of this being native
in OFBiz as opposed to a cube style integration that
mondrian/pentaho/etc might provide?
Obviously additional services will need to be created to denormalize
and pick out relevant data regardless of the approach, but that's not
really the meat and potatoes that the pentaho stuff is demonstrating.
Building ETL services and an initial star schema (that I guess is what
you refer to with denormalized entities) is my primary goal now.
Mondrian is essentially taking relational database views (that can be
based on star schemas) and creating the multi-dimensional cube that
you're going to be ending up with ultimately if you're planning on
using time as a dimension, only mondrian is prepared to take on many
more dimensions as well.
The star schema will be built in a relational database but essentially
is composed of:
a fact table (e.g. "Sales Transactions")
a set of dimension tables (e.g. "Products", "Date", "Time", "Stores" etc..)
Usually the fact table is only useful inside one star schema, while
dimensions are shared among many star schemas.
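To make the structure concrete, here is a minimal sketch of one star schema, using Python and an in-memory SQLite database purely for illustration. The table and column names (sales_fact, product_dim, date_dim, store_dim) and the sample rows are assumptions made up for this example; they are not existing OFBiz entities.

```python
import sqlite3

def build_star_schema(conn):
    # One fact table that references three dimension tables (hypothetical names).
    conn.executescript("""
        CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, date_value TEXT);
        CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, store_name TEXT);
        CREATE TABLE sales_fact (
            product_key INTEGER REFERENCES product_dim(product_key),
            date_key INTEGER REFERENCES date_dim(date_key),
            store_key INTEGER REFERENCES store_dim(store_key),
            quantity INTEGER,
            amount REAL
        );
    """)

def load_sample_rows(conn):
    # Made-up sample data, only there to have something to report on.
    conn.executemany("INSERT INTO product_dim VALUES (?, ?)",
                     [(1, "Widget"), (2, "Gadget")])
    conn.executemany("INSERT INTO date_dim VALUES (?, ?)",
                     [(20070215, "2007-02-15"), (20070216, "2007-02-16")])
    conn.executemany("INSERT INTO store_dim VALUES (?, ?)", [(1, "Web Store")])
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
        (1, 20070215, 1, 2, 20.0),
        (2, 20070215, 1, 1, 15.0),
        (1, 20070216, 1, 3, 30.0),
    ])

def sales_by_product(conn):
    # A typical report: join the fact table to a dimension and aggregate.
    return conn.execute("""
        SELECT p.product_name, SUM(f.quantity), SUM(f.amount)
        FROM sales_fact f
        JOIN product_dim p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY p.product_name
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
report = sales_by_product(conn)
```

Any reporting tool that can read relational tables could run the same kind of join and aggregation against such a schema.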
I'd like to build a common set of dimensions, based on best practices
(and a few fact tables that use them), derived from the OFBiz data model.
You can then use the tool you want to run your reports/analysis etc...
I don't know how Mondrian works, but from what you say here it seems that
you can use Mondrian on top of these star schemas.
I understand wanting to benefit from the screen and form widgets, but
to utilize multiple dimensions, I imagine you would be creating forms
that could only be reused by multidimensional data anyway which is the
same place you get with a pentaho integration except with pentaho
you're also getting the on the fly query instead of being predefined.
Yes, I'd also like to provide some simple tools, based on form widgets,
to create some simple reports based on the star schemas above, but mainly
as a PoC... you can definitely use more advanced reporting tools for this.
Jacopo
I'm really excited about this topic and see my calendar clearing up to
be able to help but am afraid my toolbox may not be deep enough to
provide help in the direction you're talking about. Which is fine. It
just seems like a lot of time spent on something that appears to
already exist and have minds that are in the industry of BI
contributing to it.
I'm still learning the Pentaho stuff and am by no means an expert, just
curious as to what you're seeing on the surface that makes you want to
stay inside OFBiz for the solution.
--- Jacopo Cappellato <[EMAIL PROTECTED]> wrote:
David,
first of all, thanks for the insight, much appreciated.
David E. Jones wrote:
Jacopo,
I like the idea of this approach, and it's something I've been thinking
about for a while.
A good next step might be to do a PoC implementation with maybe just one
star schema using a simple-method to do the ETL, and a form widget to
report on the data. In that process we'll probably find some things that
are ugly and could use some tool extensions.
I agree with you that we should start with a pretty simple PoC
implementation. However, while reading the book "The Data Warehouse
Toolkit" in my (spare?) time I'm taking notes about tool extensions that
will help to build the datawarehouse; here are a few examples of these
very preliminary requirements:
1) add support in the entity engine for view-entities that are stored in
the db as SQL VIEWs (it seems that we will need many different views
over the same shared dimensions; for example, the "Date" dimension could
play different roles in the datawarehouse)
2) implement util methods to populate the Date dimension (that will
store all the days of the years of interest for the analysis) and the
Time dimension (all the minutes of a day)
etc...
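A rough sketch of both requirements, written in Python with SQLite only to illustrate the idea (this is not entity engine code). The names date_dim, time_dim, order_date_dim and ship_date_dim, the column layout and the YYYYMMDD style key are all assumptions chosen for the example.

```python
import sqlite3
from datetime import date, timedelta

def date_dimension_rows(first_year, last_year):
    """Yield one row per calendar day between Jan 1 and Dec 31 of the given years."""
    day = date(first_year, 1, 1)
    end = date(last_year, 12, 31)
    while day <= end:
        # Key in YYYYMMDD form: an assumed convention, not a requirement.
        key = day.year * 10000 + day.month * 100 + day.day
        yield (key, day.isoformat(), day.year, day.month, day.day, day.isoweekday())
        day += timedelta(days=1)

def time_dimension_rows():
    """Yield one row per minute of a day (1440 rows)."""
    for minute_of_day in range(24 * 60):
        hour, minute = divmod(minute_of_day, 60)
        yield (minute_of_day, hour, minute)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (
        date_key INTEGER PRIMARY KEY, date_value TEXT,
        year INTEGER, month INTEGER, day INTEGER, iso_weekday INTEGER
    );
    CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, hour INTEGER, minute INTEGER);

    -- The same Date dimension exposed under two roles through SQL views.
    CREATE VIEW order_date_dim AS
        SELECT date_key AS order_date_key, date_value AS order_date FROM date_dim;
    CREATE VIEW ship_date_dim AS
        SELECT date_key AS ship_date_key, date_value AS ship_date FROM date_dim;
""")
conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?, ?, ?, ?)",
                 date_dimension_rows(2007, 2008))
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)", time_dimension_rows())

date_count = conn.execute("SELECT COUNT(*) FROM date_dim").fetchone()[0]
time_count = conn.execute("SELECT COUNT(*) FROM time_dim").fetchone()[0]
```

A fact table with both an order date and a ship date could then join to each view separately, while the days themselves are stored only once.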
BTW, there are some interesting generic star schemas in the Data Model
Resource Book. There is a sales one with some requirements in the form
of questions on page 370 of volume 1, and a model diagram on page 371.
I will definitely have a look at them.
The data warehouse entities should go into their own entity group, a new
group, with its own datasource in the OOTB entityengine.xml file. A
good group name might be something like org.ofbiz.olap (as opposed to
oltp, which characterizes most of the current entities).
I agree with everything you write here; by the way, I'd like to clarify
that the name "olap" here will be used in its original (generic) meaning
of "Online Analytic Processing" and not to designate the underlying
technology of the db (that in my initial plan will be a relational db
with star schemas and not an olap db with cubes).
Anyway, yeah this would be cool, and we already have a lot of tools that
would work really well here.
I will try to write down some notes in the Confluence doc site and see
if we can get this ball rolling... do you think that the upcoming
conference would be a good chance to speed up things?
Jacopo
-David
On Feb 16, 2007, at 12:15 AM, Jacopo Cappellato wrote:
Christopher,
still nothing official or concrete, as far as I know.
I know that Chris Howe did some integration tests with OpenI:
http://docs.ofbiz.org/x/0AI
I am seriously considering a different approach; while I'm studying
the book "The Data Warehouse Toolkit"
(http://ofbiz.apache.org/documents.html), I'm trying to draft out a
proposal for the implementation of base datawarehousing features in
OFBiz (a separate set of entities for dimensions, facts and star
schemas; ETL services based on minilang, tools to manage the
dimensions tables and synchronization).
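As a rough illustration of what such an ETL step does, here is a small Python sketch (not minilang, and not an existing OFBiz service). The input rows, field names and the surrogate_key helper are hypothetical; the point is only to show natural ids being mapped to dimension keys while fact rows are built.

```python
from datetime import date

# Hypothetical order-item rows standing in for data read from the
# normalized (OLTP) entities; the field names are made up for this sketch.
ORDER_ITEMS = [
    {"product_id": "PROD-A", "order_date": date(2007, 2, 15), "quantity": 2, "unit_price": 10.0},
    {"product_id": "PROD-B", "order_date": date(2007, 2, 15), "quantity": 1, "unit_price": 15.0},
    {"product_id": "PROD-A", "order_date": date(2007, 2, 16), "quantity": 3, "unit_price": 10.0},
]

def surrogate_key(dimension, natural_key):
    """Return the dimension key for natural_key, adding a new entry if needed."""
    if natural_key not in dimension:
        dimension[natural_key] = len(dimension) + 1
    return dimension[natural_key]

def run_etl(rows):
    """Extract source rows, transform them, and load them into in-memory tables."""
    product_dim = {}   # natural product id -> surrogate key
    sales_fact = []    # one denormalized row per order item
    for row in rows:
        d = row["order_date"]
        sales_fact.append({
            "product_key": surrogate_key(product_dim, row["product_id"]),
            "date_key": d.year * 10000 + d.month * 100 + d.day,
            "quantity": row["quantity"],
            "amount": row["quantity"] * row["unit_price"],
        })
    return product_dim, sales_fact

product_dim, sales_fact = run_etl(ORDER_ITEMS)
```

A real service would also have to deal with synchronization (rows that change after they were first loaded), which this sketch leaves out.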
You'll find some of my notes here:
http://docs.ofbiz.org/x/2QI
My goal is this: once we have a set of star schemas (facts and
dimensions) derived from OFBiz entities and based on best practices,
and the tools to manage the data in them, we could integrate a visual
reporting tool to run reports against them (or just use, with some
improvements, the form widgets).
It may seem an ambitious plan, but I think that many of the building
blocks to complete it are already in the framework; we'll just have to
improve and fine tune them.
If you are interested in helping with this we could try to create a
work group for this...
Jacopo
Christopher Snow wrote:
Is there any work going on at the moment to build an ofbiz data
warehouse and reporting infrastructure, for example the integration
of the Pentaho toolset?
Many thanks ...