I agree that the job could be performed on a single machine given your
relatively small amount of data.

Are you locked into Perl, or are you open to libraries in other languages?
If Java libraries are an option, you might take a look at
JGraphT <http://jgrapht.org/>.
I haven't used it, but it has good reviews. If you need to call the library
from Perl code, you could use Inline::Java.
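
For what it's worth, here is a rough single-machine sketch of the merge
described further down the thread (key each node by its "meaningful"
identifier, union the predecessor sets), plus a cycle check as a sanity test.
It is in Python only for brevity; the same shape maps onto Perl hashes. The
input layout is purely an assumption, since the file format was never shown:
each file is taken to be already parsed into a dict of
local integer ID -> (three-part key, list of predecessor IDs). The names
merge_graphs and has_cycle are made up for illustration.

```python
from collections import defaultdict


def merge_graphs(files):
    """Merge per-file graphs into one graph keyed by the meaningful identifier.

    `files` is a hypothetical in-memory form of the input: one dict per file,
    mapping local integer ID -> (three-part key, iterable of predecessor IDs).
    """
    merged = defaultdict(set)  # key -> set of predecessor keys
    for nodes in files:
        for local_id, (key, pred_ids) in nodes.items():
            merged[key]  # touch the entry so nodes without predecessors appear
            for pred_id in pred_ids:
                # Translate the file-local integer into the meaningful key;
                # the set dedupes predecessors seen in several files.
                merged[key].add(nodes[pred_id][0])
    return dict(merged)


def has_cycle(graph):
    """Return True if the predecessor graph has a cycle (Kahn-style pass)."""
    remaining = {node: set(preds) for node, preds in graph.items()}
    for preds in graph.values():
        for pred in preds:
            remaining.setdefault(pred, set())  # predecessors that are not keys
    successors = defaultdict(set)
    for node, preds in graph.items():
        for pred in preds:
            successors[pred].add(node)
    ready = [node for node, preds in remaining.items() if not preds]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for succ in successors[node]:
            remaining[succ].discard(node)
            if not remaining[succ]:
                ready.append(succ)
    # Any node never removed sits on, or downstream of, a cycle.
    return removed != len(remaining)


# Two made-up files that share the node ("db", "t", "y") under different IDs.
file_a = {1: (("db", "t", "x"), []), 2: (("db", "t", "y"), [1])}
file_b = {1: (("db", "t", "y"), [2]), 2: (("db", "t", "z"), [])}
merged = merge_graphs([file_a, file_b])
```

With under 10,000 nodes this all fits comfortably in memory, so nothing here
needs a graph library unless you later want drawing or weighted-edge
algorithms.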

--

David


On Wed, Jun 26, 2013 at 8:12 AM, Steve Tolkin <stevetol...@comcast.net> wrote:

> Thanks, thinking of it as map reduce is helpful.  However, the graphs are
> small (<10,000 nodes) so I conjecture running serially on a PC would be
> fast enough.
>
> I am still interested in feedback on the modules partly because I am in an
> "exploratory" mode and don't know exactly what I will later need.  E.g.
> Graph::Simple can create SVG and other "pictures" of a graph. Graph
> supports weighted edges, which is relevant.  Both modules 1 & 2 can detect
> a cycle, and that seems like a good sanity check.  If one is detected it
> would mean there is a problem in the data, or more likely in my code, or in
> the module itself (which is the bug I cited).
>
> Steve
>
> -----Original Message-----
> From: Ben Tilly [mailto:bti...@gmail.com]
> Sent: Tuesday, June 25, 2013 8:16 PM
> To: Steve Tolkin
> Cc: Boston Perl Mongers
> Subject: Re: [Boston.pm] directed Graph modules in perl and CPAN
>
> For a one-off, specific task like this, I would assume that it would
> take more work to find and evaluate a module that solves my problem
> than it would take to roll my own solution.
>
> Heck, in this case you can arrange it as a map-reduce.  Your initial
> map takes each file, and spits out key/value pairs where the key is
> the meaningful identifier of a node, and the value is the meaningful
> identifier of a predecessor.  Your reduce takes the key and dedupes
> the values to get a node and its unique predecessors.
>
> With Hadoop you can now distribute this calculation across multiple
> machines and parallelize the work.
>
> On Tue, Jun 25, 2013 at 4:47 PM, Steve Tolkin <stevetol...@comcast.net> wrote:
> > Summary: I went looking for a CPAN graph module so I could merge multiple
> > directed graphs.  I found two that looked good, by "famous" Perl authors.
> > Unfortunately both have "issues".
> >
> > 1. Graph::Easy looks good, but it has not changed in years and has a bug
> > list
> > https://rt.cpan.org/Public/Dist/Display.html?Status=Active;Name=Graph-Easy
> > with a bunch of Open and Important bugs.  See also
> > http://search.cpan.org/~shlomif/Graph-Easy-0.73/lib/Graph/Easy.pm and
> > http://bloodgate.com/perl/graph/index.html
> > This was originally by Tels and is now maintained by Shlomi Fish.
> >
> > 2. Graph now says: <q>
> > UNSUPPORTED
> > Unfortunately, as of release 0.95, this module is unsupported, and will
> > no more be maintained. Sorry about that. </q>
> > Its bug list at https://rt.cpan.org/Public/Dist/Display.html?Name=Graph
> > is short but it includes this Important one: "find_a_cycle and has_cycle
> > are broken" https://rt.cpan.org/Public/Bug/Display.html?id=78465
> > See also http://search.cpan.org/~jhi/Graph-0.96/lib/Graph.pod
> > This is by Jarkko Hietaniemi.
> >
> > 3. Graph::Simple is just v0.03
> >
> > Are there other good modules?
> >
> > A summary of what I MIGHT want to do:
> > Merge separate directed graphs into one, by combining equivalent nodes
> > and creating the union of their predecessor sets.
> >
> > In more detail: There are several existing directed graphs, each in its
> > own file.  Sometimes a node in one file is equivalent to a node in
> > another file.  Nodes have associated attributes.  The "meaningful"
> > identifier for a node is a three part key.  However, in each file each
> > node is assigned an arbitrary integer ID starting with 1, so the same
> > integers appear in many files, referring to different nodes.  In each
> > file a node's predecessors are identified just by a set of those
> > integers.
> >
> > --
> > Thanks,
> > Steve Tolkin
> >
> >
> >
> >
> > _______________________________________________
> > Boston-pm mailing list
> > Boston-pm@mail.pm.org
> > http://mail.pm.org/mailman/listinfo/boston-pm