Update: it seems to work with JOIN_OPERATOR_IMPLEMENTATION = IMPROVED.
This is the plan:
table col or idx strategy pagecount
TA2 TABLE SCAN 36154
TA1 ABORTCODEID JOIN VIA KEY COLUMN 1
TABLE HASHED
NO TEMPORARY RESULTS CREATED
RESULT IS COPIED , COSTVALUE IS 32918
Kind regards
robert
2005/11/24, Robert Klemme <[EMAIL PROTECTED]>:
> 2005/11/24, Becker, Holger <[EMAIL PROTECTED]>:
> > Hi,
> >
> > what does the explain plan looks like for the group by statement?
>
> JOIN_OPERATOR_IMPLEMENTATION = YES
>
> table col or idx strategy
> pagecount
> TA1 EQUAL CONDITION FOR KEY 1
> ABORTCODEID (USED KEY COLUMN)
> TA2 IDX_HOURLY_APPID JOIN VIA INDEXED COLUMN 36154
> NO TEMPORARY RESULTS CREATED
> RESULT IS COPIED , COSTVALUE IS 1783
>
>
> JOIN_OPERATOR_IMPLEMENTATION = NO
>
> table col or idx strategy pagecount
> TA2 TABLE SCAN 36154
> TA1 ABORTCODEID JOIN VIA KEY COLUMN 1
> RESULT IS COPIED , COSTVALUE IS 45460
>
> (also attached as txt for easier reading)
>
> > Additionally I would like to ask you to switch back to old join
> > implementation by setting JOIN_OPERATOR_IMPLEMENTATION to NO.
> > Perhaps we can encircle the reason for the problem.
>
> When doing the group by query with this option set to NO I get 261185
> as result which is an order of magnitude off. If I remove just one
> group by column I get the expected 4502345. Now this is getting
> stranger and stranger..
>
> Kind regards
>
> robert
>
> > Kind regards
> > Holger
> >
> > > -----Original Message-----
> > > From: Robert Klemme [mailto:[EMAIL PROTECTED]
> > > Sent: Donnerstag, 24. November 2005 10:26
> > > To: Becker, Holger
> > > Cc: maxdb
> > > Subject: Re: Strange results with join and group by
> > >
> > > 2005/11/24, Becker, Holger <[EMAIL PROTECTED]>:
> > > > > From: Robert Klemme wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > we discovered a strange phenomenon with MaxDB 7.5.00.30
> > > on Linux (RHEE
> > > > > 4). Consider these two queries:
> > > > >
> > > > > SELECT TA1.abortcode AS COL1, TA1.description AS COL2,
> > > > > SUM(TA2.refcount) AS COL3
> > > > > FROM ag_abortcode TA1, v_ag_hourlymaster TA2
> > > > > WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )
> > > > > GROUP BY TA1.abortcode, TA1.description
> > > > >
> > > > > SELECT count(*) AS COL3, SUM(TA2.refcount)
> > > > > FROM ag_abortcode TA1, v_ag_hourlymaster TA2
> > > > > WHERE TA1.abortcodeid=TA2.abortcodeid AND ( TA2.applianceid=1 )
> > > > >
> > > > > Same tables, same join condition, same filter criteria
> > > and since the
> > > > > grouping just returns one row in the first case comparing
> > > of results
> > > > > is easy. Now here's the strange thing: the second query
> > > always returns
> > > > > the same (correct) result for SUM(): 4502345. The first
> > > query returns
> > > > > 4436366 as well as 2921189 and probably other values,
> > > too, if we try
> > > > > longer - without any change to the data! Earlier 7.5 versions on
> > > > > Windows do not exhibit this behavior. Our tests have shown that as
> > > > > soon as we add a group by clause with two (!) columns the
> > > results are,
> > > > > um, unpredictable.
> > > > >
> > > > > Did anybody see something like this before? I didn't
> > > find anything
> > > > > similar in the bug database... We'll try to come up with
> > > a smaller
> > > > > test case (several million rows are involved) but I
> > > thought I might
> > > > > ask in the meantime whether this is a known issue.
> > > >
> > > > Hi,
> > > >
> > > > you could check the setting of the database parameter
> > > HASHED_RESULTSET.
> > > > This parameter enables a new feature for set functions in joins.
> > > > Perhaps there is a problem with this new code.
> > > > So if the parameter is set to 'YES' in your database you should
> > > > disable the feature by setting the parameter to 'NO' and test
> > > > your group by query again.
> > > > Please let me kown if this was the reason.
> > >
> > > No luck here: it is already set to 'NO':
> > >
> > > fox:CR> param_getvalue HASHED_RESULTSET
> > > OK
> > > NO
> > > fox:CR> param_getfull HASHED_RESULTSET
> > > OK
> > > string
> > > NO
> > > NO
> > > CHANGE YES
> > > INTERN NO
> > > MANDATORY YES
> > > CLEAR NO
> > > DYNAMIC NO
> > > CASESENSITIVE NO
> > > DEVSPACE NO
> > > MODIFY YES
> > > GROUP EXTENDED
> > > DISPLAYNAME
> > > VALUESET
> > > MAX
> > > MIN
> > > INSTANCES
> > > LASTKNOWNGOOD NO
> > > HELP
> > > Enables the use of hashed resultsets (YES/NO)
> > > EXPLAIN
> > > HASHED_RESULTSET 'YES' or 'NO'
> > >
> > > 'YES': Hashed resultsets are used for aggregates
> > >
> > > 'NO' : Hashed resultsets will not be used
> > >
> > > Hm, what now? This seems like a serious issue to me as one cannot rely
> > > on query results being correct...
> > >
> > > Kind regards
> > >
> > > robert
> > >
> >
>
>
>
--
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]