Re: [PATCHES] next draft of list rewrite

2004-01-25 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 I've attached the latest header and implementation files (pg_list.h
 and list.c, respectively) for the new linked list implementation. I'm
 pretty satisfied with the new API: now would be the ideal time to
 suggest any changes you think I should make to it. I decided to use a
 new naming convention: all public functions and macros are prefixed
 with list_ or cell_, as appropriate, and type-specific functions are
 suffixed with _int or _oid, as appropriate.

I'm fairly unhappy with that idea.  I thought the plan was to keep the
API exactly the same as before, with the single exception of having to
use ListCell* rather than List* as the type of pointers used to iterate
through a list.  Your apparent intention to rename every single list
routine makes the change far more invasive than it has any need to be.
It will undoubtedly break a fair amount of user-written code that we
don't have control over; and to what benefit?  Everyone who works on
the backend is already accustomed to the existing names, which are
mostly derived from ancient Lisp tradition.  I don't think we should
change them.  If you want to push forward on it, we had better have
a vote in pghackers, rather than assuming everyone concerned will see
this discussion in -patches.

Other than the naming issues, the code looks reasonable.  I noticed one
trivial bug, which is that the Asserts in list_member_int and
list_member_oid seem backwards.  A bigger problem is that list_length,
list_head, and list_tail must *not* be macros --- we cannot afford
double-evaluation bugs associated with operations as widely used as these.
Since these functions are not used within loop bodies, I don't think
that avoiding function call overhead for them is conceivably worth the
risks involved.

 ISTM that the allocate the list itself and the head node in one
 palloc() idea that Tom suggested actually won't work: since the head
 node can change over the life of the list, we need to be able to
 deallocate a list cell via pfree() -- which would not be the case if a
 particular node was allocated via the same palloc() as the list
 itself.

No, you'd simply leave that cell-space wasted if the head member were to
change.  It's doable, though it would complicate the deletion routine
and probably nconc as well.  It's entirely likely that it'd be more
complexity than it's worth, though, so if you don't want to do it that's
fine with me.  Certainly it makes sense not to do it in the first pass.

 BTW, one nice thing about the new List API is that it lets us get rid
 of the WRITE_INTLIST_FIELD() and related cruft in nodes/, because each
 type of list (T_List, T_IntList, or T_OidList) is now a legitimate
 Node in its own right. So we can now use WRITE_NODE_FIELD() to output
 a list of integers, for example.

Right, that was a benefit I was expecting ...

 (The implementation is a little off the cuff, and I haven't reviewed
 it; also, since the tree is totally broken with this patch applied, I
 haven't tested it either.

This is one reason I'd suggest refraining from renaming everything.
Without the rename, it'd be possible to change the implementation with
little or no change to the rest of the tree.  The idea I had was to
temporarily place (ListCell*) casts in foreach, lfirst, and lnext,
so that they would work without complaint if the iteration variable is
misdeclared as List* rather than ListCell*.  Then it should be possible
to test and commit the reimplemented list.c with few if any source-code
changes elsewhere.  After that, you can incrementally go through the
rest of the tree and change iteration variables to ListCell*, removing
the casts in the macros when all is done.  This is a lot more pleasant
way to proceed than a big bang changeover.  (Removal of the FastList
code could also be done incrementally.)

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PATCHES] next draft of list rewrite

2004-01-24 Thread Neil Conway
I've attached the latest header and implementation files (pg_list.h
and list.c, respectively) for the new linked list implementation. I'm
pretty satisfied with the new API: now would be the ideal time to
suggest any changes you think I should make to it. I decided to use a
new naming convention: all public functions and macros are prefixed
with list_ or cell_, as appropriate, and type-specific functions are
suffixed with _int or _oid, as appropriate.

ISTM that the allocate the list itself and the head node in one
palloc() idea that Tom suggested actually won't work: since the head
node can change over the life of the list, we need to be able to
deallocate a list cell via pfree() -- which would not be the case if a
particular node was allocated via the same palloc() as the list
itself.

BTW, one nice thing about the new List API is that it lets us get rid
of the WRITE_INTLIST_FIELD() and related cruft in nodes/, because each
type of list (T_List, T_IntList, or T_OidList) is now a legitimate
Node in its own right. So we can now use WRITE_NODE_FIELD() to output
a list of integers, for example.

(The implementation is a little off the cuff, and I haven't reviewed
it; also, since the tree is totally broken with this patch applied, I
haven't tested it either. That said, I'd appreciate anyone mentioning
any implementation bugs they might happen to spot.)

Incidentally, does anyone have any thoughts on the best way to commit
this? It requires changing pretty much _every single_ place in the
code that uses the List API. Although most of those modifications
should be trivial, it will still require touching a lot of
files. Should I just make the changes locally and commit them all in
one fell swoop? Create a private branch in CVS? Any other suggestions?

FWIW, I'm hoping to have this done (and committed to CVS HEAD) by the
end of next weekend.

-Neil

/*-
 *
 * list.c
 *	  implementation for PostgreSQL generic linked list package
 *
 *
 * Portions Copyright (c) 1996-2003, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  $PostgreSQL: pgsql-server/src/backend/nodes/list.c,v 1.55 2003/11/29 19:51:49 pgsql Exp $
 *
 *-
 */
#include postgres.h

#include nodes/pg_list.h

/*
 * Return a freshly allocated List. Since empty non-NIL lists are
 * invalid, new_list() also allocates the head cell of the new list:
 * the caller should be sure to fill in that cell's data.
 */
static List *
new_list(NodeTag type)
{
	List		*new_list;
	ListCell	*new_head;

	new_head = palloc(sizeof(*new_head));
	new_head-next = NULL;

	new_list = palloc(sizeof(*new_list));
	new_list-type = type;
	new_list-length = 1;
	new_list-head = new_head;
	new_list-tail = new_head;

	return new_list;
}

/*
 * Allocate a new cell and make it the head of the specified list. The
 * data in the new head cell is undefined; the caller should be sure
 * to fill it in
 */
static void
new_head_cell(List *list)
{
	ListCell *new_head;

	new_head = palloc(sizeof(*new_head));
	new_head-next = list-head;

	list-head = new_head;
	list-length++;
}

/*
 * Allocate a new cell and make it the tail of the specified list. The
 * data in the new tail cell is undefined; the caller should be sure
 * to fill it in
 */
static void
new_tail_cell(List *list)
{
	ListCell *new_tail;

	new_tail = palloc(sizeof(*new_tail));
	new_tail-next = NULL;

	list-tail-next = new_tail;
	list-tail = new_tail;
	list-length++;
}

/*
 * Append a pointer to the list. A pointer to the modified list is
 * returned. Note that this function may or may not destructively
 * modify the list; callers should always use this function's return
 * value, rather than continuing to use the pointer passed as the
 * first argument.
 */
List *
list_append(List *list, void *datum)
{
	if (list == NIL)
		list = new_list(T_List);
	else
		new_tail_cell(list);

	cell_value(list_tail(list)) = datum;
	return list;
}

/*
 * Append an integer to the specified list. See list_append()
 */
List * 
list_append_int(List *list, int datum)
{
	if (list == NIL)
		list = new_list(T_IntList);
	else
		new_tail_cell(list);

	cell_value_int(list_tail(list)) = datum;
	return list;
}

/*
 * Append an OID to the specified list. See list_append()
 */
List * 
list_append_oid(List *list, Oid datum)
{
	if (list == NIL)
		list = new_list(T_OidList);
	else
		new_tail_cell(list);

	cell_value_oid(list_tail(list)) = datum;
	return list;
}

/*
 * Private convenience routine: allocate a new tail cell and place the
 * data from the specified cell into it. The data is copied via a
 * shallow copy, not a deep copy.
 */
static List *
list_append_auto(List *list, ListCell *cell)
{
	/* XXX: broken if list == NIL */
	switch (list-type)
	{
		case T_List:
			return list_append(list, cell_value(cell));
		case T_IntList:
			return