[web2py] Re: Wiki with Oracle?

Jason Phillips Sun, 07 Apr 2013 12:42:32 -0700

The resolution of 
issue#1432<https://code.google.com/p/web2py/issues/detail?id=1432> (submitted 
under my other account, I need to unify these) has now solved the first 
error you listed (*ORA-00904: "WIKI_TAG"."WIKI_PAGE": invalid identifier*), 
which was a IS_NOT_IN_DB query error of the type I described in the issue.


So I tested the wiki again in Oracle to see if I would still run into the 
second error you posted; I did, but can now add the full SQL queries that 
were generated.

For the Search page, the query looks like this:

SELECT DISTINCT COUNT(wiki_tag.wiki_page), wiki_page.id, wiki_page.slug, 
wiki_page.title, wiki_page.tags, wiki_page.can_read, wiki_page.body 
FROM (
  SELECT w_tmp.*, ROWNUM w_row FROM (
    SELECT COUNT(wiki_tag.wiki_page), wiki_page.id, wiki_page.slug, 
wiki_page.title, wiki_page.tags, wiki_page.can_read, wiki_page.body 
    FROM wiki_tag, wiki_page 
    WHERE (((wiki_page.id = wiki_tag.wiki_page) AND (wiki_tag.name IN 
('stuff'))) OR (wiki_page.title LIKE (('%' || (REPLACE('stuff','%','%%')) 
|| '%')))) 
    GROUP BY wiki_page.id, wiki_page.slug, wiki_page.title, wiki_page.tags, 
wiki_page.can_read, wiki_page.body 
    ORDER BY COUNT(wiki_tag.wiki_page) DESC
  ) w_tmp WHERE ROWNUM<=100
) wiki_tag, wiki_page  
WHERE (((wiki_page.id = wiki_tag.wiki_page) AND (wiki_tag.name IN 
('stuff'))) OR (wiki_page.title LIKE (('%' || (REPLACE('stuff','%','%%')) 
|| '%')))) AND w_row > 0  
GROUP BY wiki_page.id, wiki_page.slug, wiki_page.title, wiki_page.tags, 
wiki_page.can_read, wiki_page.body 
ORDER BY COUNT(wiki_tag.wiki_page) DESC

There are two problems here. The larger problem not surprisingly occurs due 
to pagination, which I know has openly been acknowledged as a limitation 
with Oracle and the DAL:

Oracle also does not support pagination. It does not support neither the 
> OFFSET nor the LIMIT keywords. Web2py achieves pagination by translating a 
> db(...).select(limitby=(a,b)) into a complex three-way nested select (as 
> suggested by official Oracle documentation). *This works for simple 
> select but may break for complex selects involving aliased fields and or 
> joins*.


Indeed we do have a JOIN and pagination here. I've been looking at problems 
with the existing pagination solution and hope to find another way forward 
eventually (still using subqueries, but slightly modifying which conditions 
get written into the inner and outer queries); if I get somewhere, I'll 
certainly post my attempt.

But putting that aside for the moment, even when I remove the pagination 
(manually commented out to test), the query for this page gets generated 
incorrectly for Oracle:

SELECT DISTINCT COUNT(wiki_tag.wiki_page), wiki_page.id, 
wiki_page.slug,wiki_page
.title, wiki_page.tags, wiki_page.can_read, wiki_page.body 
FROM wiki_tag, wiki_page 
WHERE (((wiki_page.id = wiki_tag.wiki_page) AND (wiki_tag.name IN ('stuff'
))) OR (wiki_page.title LIKE (('%' || (REPLACE('stuff','%','%%')) || '%')))) 
GROUP BY wiki_page.id, wiki_page.slug, wiki_page.title, wiki_page.tags,wiki_page
.can_read, wiki_page.body 
ORDER BY COUNT(wiki_tag.wiki_page) DESC

The error returned is: *ORA-00932: inconsistent datatypes: expected - got 
CLOB*

The problem is that Oracle does not permit GROUP BY on columns of type 
CLOB; nor does it permit DISTINCT on these, for that is essentially another 
type of grouping. The CLOB columns in this table as defined include:
 
wiki_page.body
wiki_page.tags
wiki_page.html
wiki_page.can_read
wiki_page.can_edit

The query made for the page attempts to simultaneously fetch all the 
important information from the WIKI_PAGE table (including the full text of 
the post) *and* the count of matches in the WIKI_TAG table; to achieve 
this, it uses GROUP BY on all the WIKI_PAGE columns and then uses COUNT() 
on the WIKI_TAG entries. 

Here is the relevant wiki code:

            count = db.wiki_tag.wiki_page.count()
            fields = [db.wiki_page.id, db.wiki_page.slug,
                      db.wiki_page.title, db.wiki_page.tags,
                      db.wiki_page.can_read]
            if preview:
                fields.append(db.wiki_page.body)
            if query is None:
                query = (db.wiki_page.id == db.wiki_tag.wiki_page) &\
                    (db.wiki_tag.name.belongs(tags))
                query = query | db.wiki_page.title.contains(request.vars.q)
            if self.settings.restrict_search and not self.manage():
                query = query & (db.wiki_page.created_by == self.auth.
user_id)
            *pages **= db(query).select(count,
                *fields, **dict(orderby=orderby or ~count,
                               groupby=reduce(lambda a, b: a | b, fields),
                               distinct=True,
                               limitby=limitby))*

Since the *wiki_page.tags* and *wiki_page.can_read *columns are also CLOBs, 
it won't help to disable preview and thereby skip the *body* column; 
grouping across CLOBs will persist as written.

>From a general perspective of database efficiency, I wonder if that kind of 
query is a good approach in the first place. Grouping across all columns of 
a table that include potentially enormous unlimited-text fields is not 
something I would typically do; I didn't even realize other databases would 
allow it, since I've avoided it so long in Oracle on CLOBs. 

I don't know the cleanest or shortest way to rewrite it and avoid fetching 
the count alongside all the grouped text data, but here's my initial 
attempt that still works in SQLITE but now also runs correctly in ORACE 
(even with pagination). Replace the bolded section in the above code with 
this:

            page_ids = db(query).select(count, db.wiki_page.id,
                **dict(orderby=orderby or ~count,
                       groupby=db.wiki_page.id,
                       distinct=True,
                       limitby=limitby))
            if page_ids:
                page_data = db(
                    db.wiki_page.id.belongs(row.wiki_page.id for row inpage_ids
)
                    ).select(*fields).group_by_value(db.wiki_page.id)
                pages=[page_data[row.wiki_page.id][0] for row in page_ids]
            else:
                pages=[]

So I first fetched only the ordered and paginated list of ids, then used 
these with the belongs() function to fetch the actual page data, also 
taking advantage of group_by_value() to create a keyed set for accessing 
when iterating over the sorted ids.  

But one caveat is that the results returned by group_by_value() appear to 
behave different from normal row objects (I may be missing something here), 
so I had to change all field references in the output loop from *
p.table.field* to *p['table.field']* as highlighted below:

                         items = [DIV(H3(A(*p['wiki_page.title']*, 
_href=URL(
                                    args=*p['wiki_page.slug']*))),
                             MARKMIN(self.first_paragraph(*p*))
                                 if preview else '',
                             DIV(_class='w2p_wiki_tags',
                                 *[link(t.strip()) for t in
                                       *p['wiki_page.tags']* or [] if 
t.strip()]),
                             _class='w2p_wiki_search_item')
                         for p in pages]

These changes now have the wiki working fully for me on Oracle (I can 
provide the proper diff if desired), since solving the CLOB-grouping error 
by rewriting in this manner also sidestepped the problem with LIMITBY that 
was experienced in the single-query approach.

Thoughts?  




On Thursday, March 7, 2013 12:22:44 PM UTC-6, chicks wrote:
>
> Has anyone been able to get the wiki feature working with an Oracle DB?  
> The tables get created successfully with the current version, but any 
> attempt to create a page or search results in a ticket:
>  
> Create:
> <class 'cx_Oracle.DatabaseError'> ORA-00904: "WIKI_PAGE"."SLUG": invalid 
> identifier 
>
>  
>
> Search:
> <class 'cx_Oracle.DatabaseError'> ORA-00904: "WIKI_TAG"."WIKI_PAGE": 
> invalid identifier 
>
>  
>
> Looks like a nice feature, if it could be made to work.
>
> Thanks
>
>  
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

[web2py] Re: Wiki with Oracle?

Reply via email to