Re: [HACKERS] Re: OLAP, Aggregates, and order of operations

2001-08-23 Thread Tom Lane

mlw <[EMAIL PROTECTED]> writes:
> So, what would the order of operation be?

> I assume "my_array_constructor()" would be called first, and the return value
> then be passed to "my_aggregate()" along with the state value being set to the
> initial state, then subsequent calls to "my_array_constructor()", followed by
> "my_aggregate()" for each additional row in the group?

Check.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[HACKERS] Re: OLAP, Aggregates, and order of operations

2001-08-23 Thread mlw

Tom Lane wrote:
> 
> mlw <[EMAIL PROTECTED]> writes:
> >> If the needed parameters are all the same datatype, maybe you could put
> >> them into an array and pass the array as a single argument to the
> >> aggregate.
> 
> > How would you do this without having to make multiple SQL calls?
> 
> I was thinking something like
> 
> select my_aggregate(my_array_constructor(foo, bar, baz)) from ...
> 
> where my_array_constructor is a quick hack C routine to build a
> 3-element array from 3 input arguments (s/3/whatever you need/).
> Someday we ought to have SQL syntax to build an array value from
> a list of scalars, but in the meantime an auxiliary function is the
> only way to do it.

Interesting. Kind of ugly, but interesting. 

So, what would the order of operation be?

I assume "my_array_constructor()" would be called first, and the return value
then be passed to "my_aggregate()" along with the state value being set to the
initial state, then subsequent calls to "my_array_constructor()", followed by
"my_aggregate()" for each additional row in the group?

I need to think about that.

> 
> The overhead of constructing and then interpreting the temporary
> array value is slightly annoying, but I don't think it'll be horribly
> expensive.  See the existing aggregate-related routines in numeric.c
> if you need some help with the C coding.


I can do the C stuff, I have tons of C and C++ functions written for Postgres
already, when I get the time to make them clean enough to contribute to the
Postgres project, I will. (Text manipulation, search engine, date manipulation,
xmcd, analysis functions, decode, and others) If you are interested in seeing a
half Oracle, half Postgres site, take a look at http://www.dotclick.com. (You
will need a Windows box)

It is pretty evenly split between postgres and oracle. All "member" related
data is on Oracle. All music related data is in Postgres. It has saved us
probably $50K to $100 in Oracle database licenses and hardware to do it this
way.

We have three postgres boxes. One master, and two slaves. The master gets
updated with new information from various sites. The program which does the
updating, on the master, creates a SQL log script of everything it does. The
script is then run against the slaves to maintain consistency. A web farm is
split evenly between the two slaves.

It is pretty cool. 

(As a side note, we are using Oracle for session management across a bunch of
servers. Sadly we can not use postgres for this (we would love too), sessions
are mostly updates and deletes, maybe when 7.2 comes out, but I'm still not
sure about that.)



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[HACKERS] Re: OLAP, Aggregates, and order of operations

2001-08-23 Thread mlw

Tom Lane wrote:

> mlw <[EMAIL PROTECTED]> writes:
> > I need to do some OLAP stuff, and I asked previously if there were a way
> > to pass multiple parameters to an aggrigate function. i.e.:
> > I looked through the code and it is non-trivial to do,
>
> Offhand I don't know of any fundamental reason why it couldn't be done,
> but you're right that it'd take a fair amount of work.

I understand the implications of the work, but it would be VERY cool to be
able to do this for statistical stuff.

>
> > If I do this:
> > select mycube(value1) as d1, dimention(value2) as d2, dimention(value3)
> > as d3 group by value1;
> > Can I safely assume the following:
> > (1) mycube() will be called first
> > (2) Assuming dimention() has no final func, that final func of mycube()
> > will be called last.
>
> That might be true in the present code, but it strikes me as an awfully
> risky set of assumptions.  Also, it sounds like what you have in mind is
> to have some hidden state that all the aggregate functions will access;
> how then will you work if there are more than one set of these
> aggregates being used in a query?

What I was thinking is that I could use the state to hold a pointer returned
by palloc. I don't think I can handle multiple mycube() calls, but short of
reworking aggregates, I don't see any other way.

>
> If the needed parameters are all the same datatype, maybe you could put
> them into an array and pass the array as a single argument to the
> aggregate.

How would you do this without having to make multiple SQL calls?


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



[HACKERS] Re: OLAP, Aggregates, and order of operations

2001-08-23 Thread Tom Lane

mlw <[EMAIL PROTECTED]> writes:
>> If the needed parameters are all the same datatype, maybe you could put
>> them into an array and pass the array as a single argument to the
>> aggregate.

> How would you do this without having to make multiple SQL calls?

I was thinking something like

select my_aggregate(my_array_constructor(foo, bar, baz)) from ...

where my_array_constructor is a quick hack C routine to build a
3-element array from 3 input arguments (s/3/whatever you need/).
Someday we ought to have SQL syntax to build an array value from
a list of scalars, but in the meantime an auxiliary function is the
only way to do it.

The overhead of constructing and then interpreting the temporary
array value is slightly annoying, but I don't think it'll be horribly
expensive.  See the existing aggregate-related routines in numeric.c
if you need some help with the C coding.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])