Re: Help on module name (working name 'Regexp::Query')

David Mertens Mon, 25 Jul 2016 18:20:07 -0700

Hello Kenneth,

My background: I have been working on a generic greedy pattern matching
engine for many years. It is called Scrooge. The original itch was to be able
to write "regexes" for numerical data (i.e. PDL). It took a long time
before that idea grew into an implementation (starting around 2012), which
then morphed into a generic sequential greedy pattern matching engine for
arrays of any type. I don't know if it handles hashes yet, but they will be
easy to weave in as arrays of zero length that are tested with zero-width
assertions. (If anybody is interested, my current development is at
https://github.com/run4flat/perl-Scrooge.)


So, first of all, please do not take Scrooge. I am rather fond of that
name, and plan to publish my module with that name some day.

As to your idea, it looks like you're working with something like grep. The
key difference is that you provide some sort of string parsing for queries,
along with a way to add keywords to those queries. In that case, I think
that *Grep::Query* might be the best name. Another possibility, if you want
to some day add map-like capabilities as well, would be a top-level
namespace. The name Matchbox comes to mind, for some odd reason, with
Matchbox::Grep and Matchbox::Map being your two major namespaces.

That's just my two cents. I hope that helps.
David

On Mon, Jul 25, 2016 at 6:20 AM, <kenn...@olwing.se> wrote:

> Hello all,
>
> I'd like to request some advice/thoughts/help on selecting a name for a
> module.
> The working name is 'Regexp::Query', but as it happens the realization of
> the
> original idea has moved slightly ahead of that so I'm no longer confident
> on the
> suitability...
>
> The Scratched Itch:
>
> I have a large number of commandline tools, and quite frequently I want
> the user
> to be able to express, with some flag(s), a selection among something.
>
> Example: the user gives the command
>
>   SomeCommand /some/path
>
> and it will scan the path and for all files it finds it will do something
> useful. Now, I also want to provide flags for the command such that they
> can say
>
>   SomeCommand -exclude 'some_regexp' /some/path
>
> Obviously not a problem, and I also provide the reverse if that is more
> convenient:
>
>   SomeCommand -include 'another_regexp' /some/path
>
> and this can be extended so flags can be given multiple times and
> interweaved:
>
>   SomeCommand -include 'rx1' -exclude 'rx2' -include 'rx3' /some/path
>
> coupled with rules that then shrink the target set based on rx1, shrink
> that set using rx2 etc etc. I think you get the idea.
>
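
The successive shrinking described above can be sketched in a few lines
of Perl. This is only an illustration, not code from the module under
discussion: apply_rules, the rule layout and the file names are all made
up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: apply [kind, regexp] rules left to right, each
# one shrinking the set that survived the previous rule.
sub apply_rules {
    my ($rules, @items) = @_;
    for my $rule (@$rules) {
        my ($kind, $rx) = @$rule;
        if ($kind eq 'include') {
            @items = grep { $_ =~ $rx } @items;
        }
        else {
            @items = grep { $_ !~ $rx } @items;
        }
    }
    return @items;
}

# Invented sample data standing in for the files found under /some/path.
my @files = qw(src/a.pl src/a.t lib/b.pm lib/b.t);

# Equivalent of: -include 'rx1' -exclude 'rx2' -include 'rx3'
my @rules = (
    [ include => qr/\.(?:pl|pm|t)$/ ],
    [ exclude => qr/\.t$/ ],
    [ include => qr/^lib\// ],
);

my @kept = apply_rules(\@rules, @files);
print "@kept\n";    # prints "lib/b.pm"
```

The order of the rules matters here, which is part of why this style gets
awkward once the selection is more than a simple chain.
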
> What I found however is that it becomes hard to string together regexps to
> find
> the exact subset you want. In fact, while regexps are powerful, they're
> not that
> suited to easily mix multiple of them, and some expressions are basically
> impossible, especially to provide a commandline interface to them...
>
> Thus, instead I'd like to provide a more capable way for a user to provide
> a
> more complex query, i.e. where it'd be possible to use AND/OR/NOT,
> including
> parenthesized, e.g. something very contrived:
>
>   (
>     REGEXP/some_rx_1/ AND REGEXP/some_rx_2/
>   ) OR
>   (
>     REGEXP/some_rx_3/ AND NOT REGEXP/some_rx_4/
>   ) OR
>   NOT (
>         REGEXP/some_rx_5/ OR NOT REGEXP/some_rx_6/
>       )
>
> Basically, feed 'something' the query and a list of scalars and get back a
> list
> of the subset of scalars that fulfills the query. In short, behaving like a
> grep, you might say.
>
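
One way to picture what such a query boils down to, leaving the string
parsing aside entirely, is a predicate built from small closures and
handed to grep. The names rx_test, q_and, q_or and q_not are invented
for this sketch and are not part of any existing module.

```perl
use strict;
use warnings;

# Hypothetical combinators. Each returns a predicate: a code ref that
# takes one scalar and returns 1 or 0.
sub rx_test {
    my ($rx) = @_;
    return sub { $_[0] =~ $rx ? 1 : 0 };
}

sub q_and {
    my @preds = @_;
    return sub {
        my ($value) = @_;
        for my $p (@preds) {
            return 0 unless $p->($value);
        }
        return 1;
    };
}

sub q_or {
    my @preds = @_;
    return sub {
        my ($value) = @_;
        for my $p (@preds) {
            return 1 if $p->($value);
        }
        return 0;
    };
}

sub q_not {
    my ($pred) = @_;
    return sub { $pred->($_[0]) ? 0 : 1 };
}

# (REGEXP/^a/ AND REGEXP/e$/) OR (REGEXP/^b/ AND NOT REGEXP/\d/)
my $query = q_or(
    q_and( rx_test(qr/^a/), rx_test(qr/e$/) ),
    q_and( rx_test(qr/^b/), q_not( rx_test(qr/\d/) ) ),
);

my @hits = grep { $query->($_) } qw(apple avocado banana b52 cherry);
print "@hits\n";    # prints "apple banana"
```

A query parser would only need to produce a tree of these closures; the
filtering itself stays an ordinary grep.
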
> This was my original goal, and stopping there, arguably calling such a
> module
> "Regexp::Query" could make sense IMHO.
>
> ===
>
> However, moving beyond this I realized two things:
>
> 1) It would be generically useful to allow simple numerical comparisons on
> the
> scalars, i.e. the usual suspects ==, !=, >, >=, <, <=
> 2) Why only scalars? With some additions it can handle lists of raw
> hashes, or
> even arbitrary objects
>
> The first now provides the opportunity to write queries like this:
>
>   EQ(42) OR GTE(99) OR REGEXP(1\d\d\d2)
>
> This is not so interesting with a list of plain scalars, but adding the
> second
> capability, I introduce the notion of 'fields'. With plain hashes, this
> equates
> to plainly looking up the keys, but with objects we can provide a special
> hash
> which has keys corresponding to small anonymous subs that know how to
> dig out
> the data we want. Given a query like this:
>
>   name.REGEXP(^A) and age.GT(30)
>
> and having any kind of Perl objects that have these 'fields', we can do
> this:
>
>   my $fa = FieldAccessor(
>              {
>                name => sub { $_[0]->getName() },
>                age => sub { $_[0]->calculateAge() },
>              });
>
> which, passed along with the list of objects, will perform the query.
>
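
To make the field idea concrete, here is a rough sketch that uses a plain
hash of accessor subs and writes the query out by hand. The Person class
and its data are stand-ins invented for the example; FieldAccessor itself
is not reimplemented here, and the real module may work quite differently.

```perl
use strict;
use warnings;

# Stand-in class for illustration only. getName and calculateAge mirror
# the method names used in the quoted example.
package Person;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub getName      { return $_[0]{name} }
sub calculateAge { return $_[0]{now} - $_[0]{born} }

package main;

# Field name => small sub that knows how to dig the value out of an object.
my %field = (
    name => sub { $_[0]->getName() },
    age  => sub { $_[0]->calculateAge() },
);

# Hand-written equivalent of: name.REGEXP(^A) and age.GT(30)
sub matches {
    my ($obj) = @_;
    return $field{name}->($obj) =~ /^A/ && $field{age}->($obj) > 30;
}

my @people = (
    Person->new(name => 'Alice', born => 1970, now => 2016),
    Person->new(name => 'Adam',  born => 1990, now => 2016),
    Person->new(name => 'Bob',   born => 1960, now => 2016),
);

my @names = map { $_->getName() } grep { matches($_) } @people;
print "@names\n";    # prints "Alice"
```

With plain hashes the accessor subs collapse to key lookups, so the same
query text could serve both kinds of data.
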
> Given the last features, I feel I'm slightly beyond just something in the
> Regexp:: namespace, but I have no good feel for what it could be instead.
> Pretty
> generic in a way, so just plain 'Data::Query' perhaps (ignoring for a
> second
> that that name appears to be taken for some module I at present don't quite
> understand what it's for...perhaps there's a fit...?)
>
> Maybe I should just stick with the R::Q naming...:-)
>
> Any feedback helpful.
>
> TIA,
>
> ken1 (CPAN id: KNTH)
>
>
>


-- 
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan
