For a one-off task like this, I would assume that it would take more
work to find and evaluate a module that solves my problem than it
would take to roll my own solution.

Heck, in this case you can arrange it as a map-reduce.  Your initial
map takes each file, and spits out key/value pairs where the key is
the meaningful identifier of a node, and the value is the meaningful
identifier of a predecessor.  Your reduce takes the key and dedupes
the values to get a node and its unique predecessors.

With Hadoop you can now distribute this calculation across multiple
machines and parallelize the work.
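
Here is a minimal in-memory sketch of that map and reduce, in Python
for brevity. The input shape is my assumption, since the real file
format is not described: each file is taken as already parsed into a
dict from local integer ID to (three-part key, list of predecessor
IDs). The names map_file, reduce_pairs and merge_graphs and the sample
keys are made up for illustration, and node attributes are left out.

```python
from collections import defaultdict


def map_file(nodes):
    """Map step: emit (node key, predecessor key) pairs for one file.

    Local integer IDs are translated to three-part keys here, because
    the integers only mean something within a single file.
    """
    for key, pred_ids in nodes.values():
        # Emit the node by itself so nodes without predecessors survive.
        yield key, None
        for pred_id in pred_ids:
            yield key, nodes[pred_id][0]


def reduce_pairs(pairs):
    """Reduce step: group by node key; a set dedupes the predecessors."""
    merged = defaultdict(set)
    for key, pred_key in pairs:
        preds = merged[key]  # creates the entry even if pred_key is None
        if pred_key is not None:
            preds.add(pred_key)
    return dict(merged)


def merge_graphs(files):
    """Run the map over every file, then one reduce over all pairs."""
    pairs = (pair for nodes in files for pair in map_file(nodes))
    return reduce_pairs(pairs)


# Hypothetical data: both files reuse IDs 1 and 2 for different nodes.
file_a = {
    1: (("db", "sales", "orders"), []),
    2: (("db", "sales", "report"), [1]),
}
file_b = {
    1: (("db", "sales", "report"), [2]),
    2: (("db", "hr", "staff"), []),
}

merged = merge_graphs([file_a, file_b])
```

In this sample the "report" node appears in both files, so it ends up
with the union of its predecessors from each. The same two functions
translate directly to Perl hashes, or to a mapper and reducer if you
do want to run it on Hadoop.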

On Tue, Jun 25, 2013 at 4:47 PM, Steve Tolkin <stevetol...@comcast.net> wrote:
> Summary: I went looking for a CPAN graph module so I could merge multiple
> directed graphs.  I found two that looked good, by "famous" Perl authors.
> Unfortunately both have "issues".
>
> 1. Graph::Easy looks good, but it has not changed in years and has a bug list
> https://rt.cpan.org/Public/Dist/Display.html?Status=Active;Name=Graph-Easy
> with a bunch of Open and Important bugs.  See also
> http://search.cpan.org/~shlomif/Graph-Easy-0.73/lib/Graph/Easy.pm   and
> http://bloodgate.com/perl/graph/index.html
> This was originally by tels and is now maintained by Shlomi Fish.
>
> 2. Graph now says: <q>
> UNSUPPORTED
> Unfortunately, as of release 0.95, this module is unsupported, and will no
> more be maintained. Sorry about that. </q>
> Its bug list at https://rt.cpan.org/Public/Dist/Display.html?Name=Graph is
> short but it includes this Important one: "find_a_cycle and has_cycle are
> broken" https://rt.cpan.org/Public/Bug/Display.html?id=78465
> See also http://search.cpan.org/~jhi/Graph-0.96/lib/Graph.pod
> This is by Jarkko Hietaniemi.
>
> 3. Graph::Simple is just v0.03
>
> Are there other good modules?
>
> A summary of what I MIGHT want to do:
> Merge separate directed graphs into one, by combining equivalent nodes and
> creating the union of their predecessor sets.
>
> In more detail: There are several existing directed graphs, each in its own
> file.  Sometimes a node in one file is equivalent to a node in another file.
> Nodes have associated attributes.  The "meaningful" identifier for a node is
> a three part key.  However, in each file each node is assigned an arbitrary
> integer ID starting with 1, so the same integers appear in many files,
> referring to different nodes.  In each file a node's predecessors are
> identified just by a set of those integers.
>
> --
> Thanks,
> Steve Tolkin
>
>
>
>
> _______________________________________________
> Boston-pm mailing list
> Boston-pm@mail.pm.org
> http://mail.pm.org/mailman/listinfo/boston-pm
