Hello all,
I'd like to request some advice/thoughts/help on selecting a name for a
module.
The working name is 'Regexp::Query', but as it happens the realization
of the original idea has moved slightly beyond that, so I'm no longer
confident of its suitability...
The Scratched Itch:
I have a large number of command-line tools, and quite frequently I want
the user to be able to express, with some flag(s), a selection among
something.
Example: the user gives the command
SomeCommand /some/path
and it will scan the path and do something useful for every file it
finds. Now, I also want to provide flags for the command so that they
can say
SomeCommand -exclude 'some_regexp' /some/path
Obviously not a problem, and I also provide the reverse if that is more
convenient:
SomeCommand -include 'another_regexp' /some/path
and this can be extended so flags can be given multiple times and
interleaved:
SomeCommand -include 'rx1' -exclude 'rx2' -include 'rx3' /some/path
coupled with rules that shrink the target set based on rx1, then shrink
that set using rx2, and so on. I think you get the idea.
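For illustration only, the successive narrowing could look roughly like
this. It is a minimal sketch, not code from the module; the helper name
apply_rules, the rule layout and the sample file names are all made up
for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: apply an ordered list of [mode, pattern] rules,
# narrowing the candidate set one rule at a time.
sub apply_rules {
    my ($rules, @items) = @_;
    for my $rule (@$rules) {
        my ($mode, $pattern) = @$rule;
        my $rx = qr/$pattern/;
        if ($mode eq 'include') {
            @items = grep { $_ =~ $rx } @items;   # keep only matches
        }
        else {
            @items = grep { $_ !~ $rx } @items;   # drop matches
        }
    }
    return @items;
}

# Made-up sample data standing in for the files found under /some/path.
my @files = qw(src/a.pl src/a.pm t/a.t src/b.pl doc/b.txt);

# Equivalent of: -include '^src/' -exclude '\.pm$' -include 'a\.'
my @rules = (
    [ include => '^src/' ],
    [ exclude => '\.pm$' ],
    [ include => 'a\.'   ],
);

my @selected = apply_rules(\@rules, @files);
print "$_\n" for @selected;
```

Because every rule can only shrink the set, the order of the flags
matters, which is part of what makes this approach awkward.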
What I found, however, is that it becomes hard to string together
regexps to find the exact subset you want. In fact, while regexps are
powerful, they are not well suited to being mixed, and some expressions
are basically impossible, especially when they have to be given through
a command-line interface...
Thus, instead I'd like to provide a more capable way for a user to
provide a more complex query, i.e. where it'd be possible to use
AND/OR/NOT, including parenthesized, e.g. something very contrived:
(
    REGEXP/some_rx_1/ AND REGEXP/some_rx_2/
) OR
(
    REGEXP/some_rx_3/ AND NOT REGEXP/some_rx_4/
) OR
NOT (
    REGEXP/some_rx_5/ OR NOT REGEXP/some_rx_6/
)
Basically, feed 'something' the query and a list of scalars and get back
the subset of scalars that fulfills the query. In short, it behaves like
a grep, you might say.
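To make the grep-like behaviour concrete, here is a rough sketch that
builds such a query out of closures instead of parsing a query string.
The combinator names (q_rx, q_and, q_or, q_not), select_matching and the
sample data are invented for this example and are not the module's
interface.

```perl
use strict;
use warnings;

# Hypothetical combinators. Each returns a predicate: a code ref that
# takes one scalar and returns 1 or 0.
sub q_rx {
    my ($pattern) = @_;
    my $re = qr/$pattern/;
    return sub { $_[0] =~ $re ? 1 : 0 };
}

sub q_not {
    my ($pred) = @_;
    return sub { $pred->($_[0]) ? 0 : 1 };
}

sub q_and {
    my @preds = @_;
    return sub {
        my ($value) = @_;
        for my $p (@preds) { return 0 unless $p->($value) }
        return 1;
    };
}

sub q_or {
    my @preds = @_;
    return sub {
        my ($value) = @_;
        for my $p (@preds) { return 1 if $p->($value) }
        return 0;
    };
}

# Feed the query and a list of scalars, get back the matching subset.
sub select_matching {
    my ($query, @items) = @_;
    return grep { $query->($_) } @items;
}

# (REGEXP/^src\// AND REGEXP/\.pl$/) OR (REGEXP/^t\// AND NOT REGEXP/slow/)
my $query = q_or(
    q_and( q_rx('^src/'), q_rx('\.pl$') ),
    q_and( q_rx('^t/'),   q_not( q_rx('slow') ) ),
);

my @items = qw(src/a.pl src/a.pm t/fast.t t/slow.t doc/readme);
my @hits  = select_matching($query, @items);
print "$_\n" for @hits;
```

A real implementation would presumably parse the query text into a
structure like this; the closures only show what evaluating it amounts
to.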
This was my original goal, and stopping there, arguably calling such a
module "Regexp::Query" could make sense IMHO.
===
However, moving beyond this I realized two things:
1) It would be generically useful to allow simple numerical comparisons
   on the scalars, i.e. the usual suspects ==, !=, >, >=, <, <=
2) Why only scalars? With some additions it can handle lists of raw
   hashes, or even arbitrary objects
The first now provides the opportunity to write queries like this:
EQ(42) OR GTE(99) OR REGEXP(1\d\d\d2)
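As a sketch of how such tests might be evaluated: the table below maps
operator names to comparison subs. Only EQ, GT, GTE and REGEXP appear in
this post; the other names (NE, LT, LTE), the helper make_test and the
sample values are assumptions made for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Operator table. NE, LT and LTE are assumed names, by analogy with
# EQ, GT and GTE.
my %compare = (
    EQ  => sub { $_[0] == $_[1] },
    NE  => sub { $_[0] != $_[1] },
    GT  => sub { $_[0] >  $_[1] },
    GTE => sub { $_[0] >= $_[1] },
    LT  => sub { $_[0] <  $_[1] },
    LTE => sub { $_[0] <= $_[1] },
);

# Hypothetical helper: turn an operator name and argument into a
# predicate on a single scalar.
sub make_test {
    my ($op, $arg) = @_;
    if ($op eq 'REGEXP') {
        my $re = qr/$arg/;
        return sub { $_[0] =~ $re ? 1 : 0 };
    }
    my $cmp = $compare{$op} or die "unknown operator '$op'\n";
    # Non-numeric scalars simply fail numeric tests instead of warning.
    return sub { looks_like_number($_[0]) && $cmp->($_[0], $arg) ? 1 : 0 };
}

# EQ(42) OR GTE(99) OR REGEXP(1\d\d\d2)
my @tests = (
    make_test(EQ     => 42),
    make_test(GTE    => 99),
    make_test(REGEXP => '1\d\d\d2'),
);

my @values = (7, 42, 98, 99, 'abc', 'id-12342');
my @hits = grep {
    my $v = $_;
    grep { $_->($v) } @tests;    # true if any test accepts the value
} @values;
print "$_\n" for @hits;
```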
This is not so interesting with a list of plain scalars, but adding the
second capability, I introduce the notion of 'fields'. With plain
hashes, this equates to plainly looking up the keys, but with objects we
can provide a special hash which maps each field name to a small
anonymous sub that knows how to dig out the data we want. Given a query
like this:
name.REGEXP(^A) and age.GT(30)
and having any kind of Perl objects that have these 'fields', we can do
this:
my $fa = FieldAccessor(
    {
        name => sub { $_[0]->getName() },
        age  => sub { $_[0]->calculateAge() },
    });
which, passed along with the list of objects, will perform the query.
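A self-contained sketch of the field idea follows. The Person class, its
data and the hard-coded year are invented so the example runs on its
own, and the query is written out by hand rather than parsed; only the
getName/calculateAge accessor subs mirror the snippet above.

```perl
use strict;
use warnings;

# Made-up class so that there is something to query.
{
    package Person;
    sub new {
        my ($class, %args) = @_;
        my $self = { %args };
        return bless $self, $class;
    }
    sub getName      { return $_[0]{name} }
    # Fixed reference year keeps the example deterministic.
    sub calculateAge { return 2024 - $_[0]{born} }
}

# Field name => sub that digs the value out of one object. For plain
# hashes the subs would just be e.g. sub { $_[0]{name} }.
my %field = (
    name => sub { $_[0]->getName() },
    age  => sub { $_[0]->calculateAge() },
);

my @people = (
    Person->new(name => 'Alice', born => 1980),
    Person->new(name => 'Adam',  born => 2010),
    Person->new(name => 'Bob',   born => 1970),
);

# Hand-written equivalent of: name.REGEXP(^A) and age.GT(30)
my @hits = grep {
    $field{name}->($_) =~ /^A/ && $field{age}->($_) > 30
} @people;

print $_->getName(), "\n" for @hits;
```

The point of the indirection is that the query only ever sees field
names, so the same query text can run against hashes or against any
objects for which accessor subs are supplied.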
Given the last features, I feel I'm slightly beyond just something in
the Regexp:: namespace, but I have no good feel for what it could be
instead. It is pretty generic in a way, so just plain 'Data::Query'
perhaps (ignoring for a second that that name appears to be taken by
some module whose purpose I don't quite understand at present...perhaps
there's a fit...?)
Maybe I should just stick with the R::Q naming...:-)
Any feedback helpful.
TIA,
ken1 (CPAN id: KNTH)