I've been reading the bison source code and came up with some changes
I would like to be considered.
The for_all_rules.patch changes the double loop over all rules deriving
all variables to a single loop over all rules.
Tested with make check on Cygwin with 1 expected failure:
49: Output file name: [EMAIL PROTECTED]&*()-=_+{}[]|\:;<>, .' FAILED
(output.at:188)
failed because the file system (NTFS) does not allow \/:?*"<>|
see http://lists.gnu.org/archive/html/bug-bison/2008-07/msg00008.html
2008-09-27 Di-an Jan <[EMAIL PROTECTED]>
* src/closure.c (set_firsts): Use single loop over all rules.
diff --git a/src/closure.c b/src/closure.c
index ff3109c..4c1d40c 100755
--- a/src/closure.c
+++ b/src/closure.c
@@ -123,17 +123,16 @@ print_fderives (void)
static void
set_firsts (void)
{
- symbol_number i, j;
+ rule_number r;
firsts = bitsetv_create (nvars, nvars, BITSET_FIXED);
- for (i = ntokens; i < nsyms; i++)
- for (j = 0; derives[i - ntokens][j]; ++j)
- {
- item_number sym = derives[i - ntokens][j]->rhs[0];
- if (ISVAR (sym))
- bitset_set (FIRSTS (i), sym - ntokens);
- }
+ for (r = 0; r < nrules; r++)
+ {
+ item_number sym = rules[r].rhs[0];
+ if (ISVAR (sym))
+ bitset_set (FIRSTS (rules[r].lhs->number), sym - ntokens);
+ }
if (trace_flag & trace_sets)
bitsetv_matrix_dump (stderr, "RTC: Firsts Input", firsts);
The shift_symbol-bitset.patch changes an array of NSYMS integers that's
sorted after adding all the shift symbols of a state to a bitset of NSYMS
bits. Obviously there's space saving. For speed, the trade off is between
an insertion sort verses possibly slower clear, set, count and iterate.
The changes to state.* allows transitions states to be filled in in place
rather than copied over.
I'm not sure whether BITSET_FOR_EACH or loop and bitset_test is more
efficient, or whether the original code is faster. I couldn't get good
timing on Cygwin. So I attached patches for both. In general, someone
should figure out which is more efficient for each bitset and document it.
The bitset_test version is tested with make check on Cygwin with 1 expected
failure. The bitset_iterator version is tested in combination with other
stuff.
2008-09-27 Di-an Jan <[EMAIL PROTECTED]>
* src/LR0.c (shift_symbol): Change to bitset.
(nshifts, shiftset): Remove.
(allocate_storage): Initialize shift_symbol, remove shiftset.
(free_storage): Free shift_symbol, remove shiftset.
(new_itemsets): Clear and update shift_symbol.
(append_states): Iterate over shift_symbol already sorted.
Allocate and fill in state transitions in place.
Don't use shiftset.
(set_states): Adjust call to state_transitions_set.
(generate_states): Move state_transitions_set to append_states.
* src/state.h (state_transitions_set): Remove states parameter.
* src/state.c (state_transitions_set): Remove states parameter.
* src/state.c (transitions_new): Remove states parameter.
Other things I'm thinking of working on:
Rework the line wrapping logic in src/print.c (print_grammar).
A straight forward version is attached as print_grammar.patch,
but it seems to be slower than with string buffers. Maybe I could
do a version with a printf-like interface rather than END_TEST.
In src/reader.c, instead of using a single symbol_list for all rules,
use a list of rules, each containing a symbol_list. At the cost of an
extra structure, this would move the per-rule fields out of each node
of symbol_list and make the code clearer. Should be at least as efficient
since all iterations over the one symbol_list stops after each rule.
Use macros/inline functions for accessing ritem. Makes the code easy
to understand without having to remember the special ritem structure.
Introduce the typedef var_number for 0..nvars-1. More self-documenting
and makes some code simpler.
Do something like
struct transitions { int num; state** states; };
struct state { ... struct transitions transitions; ... };
This uses just one pointer, as before, but probably more efficient
and doesn't use "state *states[1];" to allocate 0 or more states.
Similarly for reductions and errs.
Consistent style.
i++ vs ++i vs i+=1 statements,
and put them in expressions as in dst[i++] = src[j++].
Remove semicolons after {}.
Use XNMALLOC(n, type) or xnmalloc(n, s) and xmemdup from gnulib.
Di-an Jandiff --git a/src/closure.c b/src/closure.c
index ff3109c..f32f7f6 100755
--- a/src/closure.c
+++ b/src/closure.c
@@ -1,6 +1,6 @@
/* Closures for Bison
- Copyright (C) 1984, 1989, 2000, 2001, 2002, 2004, 2005, 2007 Free
+ Copyright (C) 1984, 1989, 2000, 2001, 2002, 2004, 2005, 2007, 2008 Free
Software Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -123,17 +123,16 @@ print_fderives (void)
static void
set_firsts (void)
{
- symbol_number i, j;
+ rule_number r;
firsts = bitsetv_create (nvars, nvars, BITSET_FIXED);
- for (i = ntokens; i < nsyms; i++)
- for (j = 0; derives[i - ntokens][j]; ++j)
- {
- item_number sym = derives[i - ntokens][j]->rhs[0];
- if (ISVAR (sym))
- bitset_set (FIRSTS (i), sym - ntokens);
- }
+ for (r = 0; r < nrules; r++)
+ {
+ item_number sym = rules[r].rhs[0];
+ if (ISVAR (sym))
+ bitset_set (FIRSTS (rules[r].lhs->number), sym - ntokens);
+ }
if (trace_flag & trace_sets)
bitsetv_matrix_dump (stderr, "RTC: Firsts Input", firsts);
diff --git a/src/LR0.c b/src/LR0.c
index efda69f..667ad3f 100755
--- a/src/LR0.c
+++ b/src/LR0.c
@@ -1,7 +1,7 @@
/* Generate the nondeterministic finite state machine for Bison.
- Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007
- Free Software Foundation, Inc.
+ Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007,
+ 2008 Free Software Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -77,11 +77,9 @@ state_list_append (symbol_number sym, size_t core_size,
item_number *core)
return s;
}
-static int nshifts;
-static symbol_number *shift_symbol;
+static bitset shift_symbol;
static rule **redset;
-static state **shiftset;
static item_number **kernel_base;
static int *kernel_size;
@@ -136,19 +134,17 @@ allocate_storage (void)
{
allocate_itemsets ();
- shiftset = xnmalloc (nsyms, sizeof *shiftset);
redset = xnmalloc (nrules, sizeof *redset);
state_hash_new ();
- shift_symbol = xnmalloc (nsyms, sizeof *shift_symbol);
+ shift_symbol = bitset_create (nsyms, BITSET_FIXED);
}
static void
free_storage (void)
{
- free (shift_symbol);
+ bitset_free (shift_symbol);
free (redset);
- free (shiftset);
free (kernel_base);
free (kernel_size);
free (kernel_items);
@@ -183,17 +179,14 @@ new_itemsets (state *s)
memset (kernel_size, 0, nsyms * sizeof *kernel_size);
- nshifts = 0;
+ bitset_zero (shift_symbol);
for (i = 0; i < nitemset; ++i)
if (item_number_is_symbol_number (ritem[itemset[i]]))
{
symbol_number sym = item_number_as_symbol_number (ritem[itemset[i]]);
if (!kernel_size[sym])
- {
- shift_symbol[nshifts] = sym;
- nshifts++;
- }
+ bitset_set (shift_symbol, sym);
kernel_base[sym][kernel_size[sym]] = itemset[i] + 1;
kernel_size[sym]++;
@@ -231,33 +224,25 @@ get_state (symbol_number sym, size_t core_size,
item_number *core)
| Use the information computed by new_itemsets to find the state |
| numbers reached by each shift transition from S. |
| |
-| SHIFTSET is set up as a vector of those states. |
+| Record in the transitions structure. |
`---------------------------------------------------------------*/
static void
append_states (state *s)
{
int i;
+ symbol_number sym;
+ size_t nshifts = bitset_count (shift_symbol);
if (trace_flag & trace_automaton)
fprintf (stderr, "Entering append_states, state = %d\n", s->number);
- /* First sort shift_symbol into increasing order. */
-
- for (i = 1; i < nshifts; i++)
- {
- symbol_number sym = shift_symbol[i];
- int j;
- for (j = i; 0 < j && sym < shift_symbol[j - 1]; j--)
- shift_symbol[j] = shift_symbol[j - 1];
- shift_symbol[j] = sym;
- }
-
- for (i = 0; i < nshifts; i++)
- {
- symbol_number sym = shift_symbol[i];
- shiftset[i] = get_state (sym, kernel_size[sym], kernel_base[sym]);
- }
+ state_transitions_set (s, nshifts);
+ i = 0;
+ for (sym = 0; sym < nsyms; sym++)
+ if (bitset_test (shift_symbol, sym))
+ s->transitions->states[i++]
+ = get_state (sym, kernel_size[sym], kernel_base[sym]);
}
@@ -314,7 +299,7 @@ set_states (void)
computed later, but set_conflicts. */
state *s = this->state;
if (!s->transitions)
- state_transitions_set (s, 0, 0);
+ state_transitions_set (s, 0);
if (!s->reductions)
state_reductions_set (s, 0, 0);
@@ -360,12 +345,9 @@ generate_states (void)
save_reductions (s);
/* Find the itemsets of the states that shifts can reach. */
new_itemsets (s);
- /* Find or create the core structures for those states. */
+ /* Find or create the core structures for those states.
+ Record the shifts allowed out of this state. */
append_states (s);
-
- /* Create the shifts structures for the shifts to those states,
- now that the state numbers transitioning to are known. */
- state_transitions_set (s, nshifts, shiftset);
}
/* discard various storage */
diff --git a/src/state.c b/src/state.c
index d3460c1..bf4ad3d 100755
--- a/src/state.c
+++ b/src/state.c
@@ -1,6 +1,6 @@
/* Type definitions for nondeterministic finite state machine for Bison.
- Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+ Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -39,12 +39,11 @@
`-----------------------------------------*/
static transitions *
-transitions_new (int num, state **the_states)
+transitions_new (int num)
{
- size_t states_size = num * sizeof *the_states;
+ size_t states_size = num * sizeof (state *);
transitions *res = xmalloc (offsetof (transitions, states) + states_size);
res->num = num;
- memcpy (res->states, the_states, states_size);
return res;
}
@@ -169,15 +168,15 @@ state_free (state *s)
}
-/*---------------------------.
-| Set the transitions of S. |
-`---------------------------*/
+/*--------------------------------.
+| Allocate the transitions of S. |
+`--------------------------------*/
void
-state_transitions_set (state *s, int num, state **trans)
+state_transitions_set (state *s, int num)
{
aver (!s->transitions);
- s->transitions = transitions_new (num, trans);
+ s->transitions = transitions_new (num);
}
diff --git a/src/state.h b/src/state.h
index 4afc1f0..422555d 100755
--- a/src/state.h
+++ b/src/state.h
@@ -1,6 +1,6 @@
/* Type definitions for nondeterministic finite state machine for Bison.
- Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007 Free
+ Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007, 2008 Free
Software Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -223,8 +223,8 @@ extern state *final_state;
state *state_new (symbol_number accessing_symbol,
size_t core_size, item_number *core);
-/* Set the transitions of STATE. */
-void state_transitions_set (state *s, int num, state **trans);
+/* Allocate the transitions of STATE. */
+void state_transitions_set (state *s, int num);
/* Set the reductions of STATE. */
void state_reductions_set (state *s, int num, rule **reds);
diff --git a/src/LR0.c b/src/LR0.c
index efda69f..7e4c1d8 100755
--- a/src/LR0.c
+++ b/src/LR0.c
@@ -1,7 +1,7 @@
/* Generate the nondeterministic finite state machine for Bison.
- Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007
- Free Software Foundation, Inc.
+ Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007,
+ 2008 Free Software Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -77,11 +77,9 @@ state_list_append (symbol_number sym, size_t core_size,
item_number *core)
return s;
}
-static int nshifts;
-static symbol_number *shift_symbol;
+static bitset shift_symbol;
static rule **redset;
-static state **shiftset;
static item_number **kernel_base;
static int *kernel_size;
@@ -136,19 +134,17 @@ allocate_storage (void)
{
allocate_itemsets ();
- shiftset = xnmalloc (nsyms, sizeof *shiftset);
redset = xnmalloc (nrules, sizeof *redset);
state_hash_new ();
- shift_symbol = xnmalloc (nsyms, sizeof *shift_symbol);
+ shift_symbol = bitset_create (nsyms, BITSET_FIXED);
}
static void
free_storage (void)
{
- free (shift_symbol);
+ bitset_free (shift_symbol);
free (redset);
- free (shiftset);
free (kernel_base);
free (kernel_size);
free (kernel_items);
@@ -183,17 +179,14 @@ new_itemsets (state *s)
memset (kernel_size, 0, nsyms * sizeof *kernel_size);
- nshifts = 0;
+ bitset_zero (shift_symbol);
for (i = 0; i < nitemset; ++i)
if (item_number_is_symbol_number (ritem[itemset[i]]))
{
symbol_number sym = item_number_as_symbol_number (ritem[itemset[i]]);
if (!kernel_size[sym])
- {
- shift_symbol[nshifts] = sym;
- nshifts++;
- }
+ bitset_set (shift_symbol, sym);
kernel_base[sym][kernel_size[sym]] = itemset[i] + 1;
kernel_size[sym]++;
@@ -231,32 +224,26 @@ get_state (symbol_number sym, size_t core_size,
item_number *core)
| Use the information computed by new_itemsets to find the state |
| numbers reached by each shift transition from S. |
| |
-| SHIFTSET is set up as a vector of those states. |
+| Record in the transitions structure. |
`---------------------------------------------------------------*/
static void
append_states (state *s)
{
int i;
+ bitset_iterator biter;
+ symbol_number sym;
+ size_t nshifts = bitset_count (shift_symbol);
if (trace_flag & trace_automaton)
fprintf (stderr, "Entering append_states, state = %d\n", s->number);
- /* First sort shift_symbol into increasing order. */
-
- for (i = 1; i < nshifts; i++)
+ state_transitions_set (s, nshifts);
+ i = 0;
+ BITSET_FOR_EACH (biter, shift_symbol, sym, 0)
{
- symbol_number sym = shift_symbol[i];
- int j;
- for (j = i; 0 < j && sym < shift_symbol[j - 1]; j--)
- shift_symbol[j] = shift_symbol[j - 1];
- shift_symbol[j] = sym;
- }
-
- for (i = 0; i < nshifts; i++)
- {
- symbol_number sym = shift_symbol[i];
- shiftset[i] = get_state (sym, kernel_size[sym], kernel_base[sym]);
+ s->transitions->states[i] = get_state (sym, kernel_size[sym],
kernel_base[sym]);
+ i++;
}
}
@@ -314,7 +301,7 @@ set_states (void)
computed later, but set_conflicts. */
state *s = this->state;
if (!s->transitions)
- state_transitions_set (s, 0, 0);
+ state_transitions_set (s, 0);
if (!s->reductions)
state_reductions_set (s, 0, 0);
@@ -360,12 +347,9 @@ generate_states (void)
save_reductions (s);
/* Find the itemsets of the states that shifts can reach. */
new_itemsets (s);
- /* Find or create the core structures for those states. */
+ /* Find or create the core structures for those states.
+ Record the shifts allowed out of this state. */
append_states (s);
-
- /* Create the shifts structures for the shifts to those states,
- now that the state numbers transitioning to are known. */
- state_transitions_set (s, nshifts, shiftset);
}
/* discard various storage */
diff --git a/src/state.c b/src/state.c
index d3460c1..bf4ad3d 100755
--- a/src/state.c
+++ b/src/state.c
@@ -1,6 +1,6 @@
/* Type definitions for nondeterministic finite state machine for Bison.
- Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+ Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -39,12 +39,11 @@
`-----------------------------------------*/
static transitions *
-transitions_new (int num, state **the_states)
+transitions_new (int num)
{
- size_t states_size = num * sizeof *the_states;
+ size_t states_size = num * sizeof (state *);
transitions *res = xmalloc (offsetof (transitions, states) + states_size);
res->num = num;
- memcpy (res->states, the_states, states_size);
return res;
}
@@ -169,15 +168,15 @@ state_free (state *s)
}
-/*---------------------------.
-| Set the transitions of S. |
-`---------------------------*/
+/*--------------------------------.
+| Allocate the transitions of S. |
+`--------------------------------*/
void
-state_transitions_set (state *s, int num, state **trans)
+state_transitions_set (state *s, int num)
{
aver (!s->transitions);
- s->transitions = transitions_new (num, trans);
+ s->transitions = transitions_new (num);
}
diff --git a/src/state.h b/src/state.h
index 4afc1f0..422555d 100755
--- a/src/state.h
+++ b/src/state.h
@@ -1,6 +1,6 @@
/* Type definitions for nondeterministic finite state machine for Bison.
- Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007 Free
+ Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007, 2008 Free
Software Foundation, Inc.
This file is part of Bison, the GNU Compiler Compiler.
@@ -223,8 +223,8 @@ extern state *final_state;
state *state_new (symbol_number accessing_symbol,
size_t core_size, item_number *core);
-/* Set the transitions of STATE. */
-void state_transitions_set (state *s, int num, state **trans);
+/* Allocate the transitions of STATE. */
+void state_transitions_set (state *s, int num);
/* Set the reductions of STATE. */
void state_reductions_set (state *s, int num, rule **reds);
diff --git a/src/print.c b/src/print.c
index ddd76a6..a36c369 100755
--- a/src/print.c
+++ b/src/print.c
@@ -370,14 +370,15 @@ print_state (FILE *out, state *s)
| Print information on the whole grammar. |
`-----------------------------------------*/
-#define END_TEST(End) \
+#define WRAP_PUT(Str) \
do { \
- if (column + strlen(buffer) > (End)) \
+ if (column > 65) \
{ \
- fprintf (out, "%s\n ", buffer); \
+ fputs ("\n ", out); \
column = 3; \
- buffer[0] = 0; \
} \
+ fputs ((Str), out); \
+ column += strlen(Str); \
} while (0)
@@ -386,7 +387,7 @@ print_grammar (FILE *out)
{
symbol_number i;
char buffer[90];
- int column = 0;
+ int column;
grammar_rules_print (out);
@@ -399,21 +400,20 @@ print_grammar (FILE *out)
rule_number r;
item_number *rhsp;
- buffer[0] = 0;
- column = strlen (tag);
- fputs (tag, out);
- END_TEST (65);
+ column = 0;
+ WRAP_PUT (tag);
sprintf (buffer, " (%d)", i);
+ WRAP_PUT (buffer);
for (r = 0; r < nrules; r++)
for (rhsp = rules[r].rhs; *rhsp >= 0; rhsp++)
if (item_number_as_symbol_number (*rhsp) == token_translations[i])
{
- END_TEST (65);
- sprintf (buffer + strlen (buffer), " %d", r);
+ sprintf (buffer, " %d", r);
+ WRAP_PUT (buffer);
break;
}
- fprintf (out, "%s\n", buffer);
+ fputc ('\n', out);
}
fputs ("\n\n", out);
@@ -438,23 +438,19 @@ print_grammar (FILE *out)
}
}
- buffer[0] = 0;
- fputs (tag, out);
- column = strlen (tag);
- sprintf (buffer, " (%d)", i);
- END_TEST (0);
+ fprintf (out, "%s (%d)\n ", tag, i);
+ column = 3;
if (left_count > 0)
{
- END_TEST (65);
- sprintf (buffer + strlen (buffer), _(" on left:"));
+ WRAP_PUT (_(" on left:"));
for (r = 0; r < nrules; r++)
{
if (rules[r].lhs->number == i)
{
- END_TEST (65);
- sprintf (buffer + strlen (buffer), " %d", r);
+ sprintf (buffer, " %d", r);
+ WRAP_PUT (buffer);
}
}
}
@@ -462,22 +458,24 @@ print_grammar (FILE *out)
if (right_count > 0)
{
if (left_count > 0)
- sprintf (buffer + strlen (buffer), ",");
- END_TEST (65);
- sprintf (buffer + strlen (buffer), _(" on right:"));
+ {
+ fputc (',', out);
+ column++;
+ }
+ WRAP_PUT (_(" on right:"));
for (r = 0; r < nrules; r++)
{
item_number *rhsp;
for (rhsp = rules[r].rhs; *rhsp >= 0; rhsp++)
if (item_number_as_symbol_number (*rhsp) == i)
{
- END_TEST (65);
- sprintf (buffer + strlen (buffer), " %d", r);
+ sprintf (buffer, " %d", r);
+ WRAP_PUT (buffer);
break;
}
}
}
- fprintf (out, "%s\n", buffer);
+ fputc ('\n', out);
}
}