Re: [graph] Why the Vertex and Edge interfaces?

Oliver Heger Sat, 03 Mar 2012 08:04:08 -0800

Hi, if I am allowed to throw in some (unfinished) thoughts into thisdiscussion:

My preference would also be to keep the actual graph data structure leanand simple. As James pointed out, if algorithms need additionalinformation, a function from the vertex/edge type to the target type isrequired. This could indeed be implemented in a very generic way using afunctor interface (maybe with an underlying map as a defaultimplementation).

The advantage over a pack of interfaces as I see it is that suchfunctors could be added dynamically to a graph. Thus it could be adaptedfor the execution of different algorithms at runtime.


Just my 2 cents.
Oliver

Am 03.03.2012 16:47, schrieb Simone Tripodi:

Cloud.io,

back to James' first email: any type could be immediately used as
edge/vertex without wrappers, while specific data related to the domain of
graphs (weights, labels) would be handled separately. I actually think this
is useful: back to my days of "DIY Java graphs" I implemented something
similar to what we have now, just to think every time: "why should I wrap my
objects with these markers? I already know my Router is a Vertex in the
graph..."


I already showed be open on dropping elements, do I have to copy my
first reply as well so we start again? :)
OK, I started collecting various thoughts and trying to converge them
to a common path: Vertex is something we can safety drop because we
know its nature at priori, markers are unnecessary.This is fine.


Here's the way I see it. A Graph<V,E>  implementing HasWeightedEdges would
have something like this inside:

Map<E, W>  edgeWeights = new HashMap<E, W>();

[... populate map, other methods, etc ...]

// implementing HasWeightedEdges#getEdgeWeight
public W getEdgeWeight(E edge)
{
    [... check null etc...]
    return edgeWeights.get(edge);

}


what is the sense, at that point, on keeping the Edge?!! It would be
more than enough to know which is the Head and which is the Tail in
the Edge to get the W!

then let's find a better name, but why *OrderedMonoid?


maybe because they implement OrderedMonoid? :)

Assume we replace the
whole set of current interfaces (see my comment to the next paragraph below)
with just Addition and Comparable (the latter already exists of course).
There is no need to create another interface to merge the two
(ComparableAddition? OrderedAddition?). Then we have:


how much would Addition and Multiplication interface differ each other?

public class DoubleWeightOperations
    implements Addition, Comparator
{
    [...]
}

I would not rename the class to DoubleWeightAddition or even
DoubleWeightComparableAddition. What if later we need to e.g. add a function
that normalizes weights by some factor, or returns the reciprocal of a
weight, etc? With the above code it would suffice to add new interfaces:

public class DoubleWeightOperations
    implements Addition, Comparator, Normalization, Reciprocal
{
    [...]

}


that would be fine, what is not clear is that Monoids expose the same
interface, so *Operations class implementation canot declare same
method multiple times


That is fine for me. I simply followed pure theory while implementing that
and used all possible granularity.


questionable, that is why we are still speaking about it

Now looking at our implementations I
think it is save enough to even merge Zero, Semigroup and Monoid (so that's
one step further than your example below) and call the result Addition so
that its role is clear to everybody. Does that sound good? :)


Sounds more than good, it is what I already proposed messages ago:

Zero, name of an element, contains `zero` method to get the zero (it
is still confusing to me), Monoid  extends Zero and Semigroup - given
the use inside graph math, Zero#zero and Semigroup#append can be moved
directly to Monoid and rename it as WeightOperation


despite the rename, I still like the Monoid :P

enough talk IMHO, time to code and make concrete proposals

best,
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Sat, Mar 3, 2012 at 2:58 PM, Claudio Squarcella
<squar...@dia.uniroma3.it>  wrote:

Hi,

Apologize but I still don't understand what the benefit is on storing
nodes/arcs data in the Graph data structure



back to James' first email: any type could be immediately used as
edge/vertex without wrappers, while specific data related to the domain of
graphs (weights, labels) would be handled separately. I actually think this
is useful: back to my days of "DIY Java graphs" I implemented something
similar to what we have now, just to think every time: "why should I wrap my
objects with these markers? I already know my Router is a Vertex in the
graph..."

- sounds to me like a
Collection<Integer>    where, to know the int value of its elements, have
to query the data structure, i.e.

Collection<Integer>    integerCollection = ...;
for ( Integer ptr : integerCollection )
{
     int value = integerCollection.getInt( ptr );
}

It is weird to me even reading it.

When I started modeling Graph, I had collections in mind - above all
to simplify its adoption - I would be more than pleased to drop
Weighted* version of graphs and come back to the previous situation,
but with the annoying  type inference issue fixed.



Here's the way I see it. A Graph<V,E>  implementing HasWeightedEdges would
have something like this inside:

Map<E, W>  edgeWeights = new HashMap<E, W>();

[... populate map, other methods, etc ...]

// implementing HasWeightedEdges#getEdgeWeight
public W getEdgeWeight(E edge)
{
    [... check null etc...]
    return edgeWeights.get(edge);

}

no, trying to be clearer: you propose to rename Monoid into
WeightOperation,
which is like renaming Addition into Operation. Or alternatively to call
the
current *WeightBaseOperations something like *WeightMonoid. In both cases
I
disagree because I would prefer to keep a clear distinction between
single
well-defined properties/operations (like Addition or Comparator) and the
comprehensive implementation (e.g. DoubleWeightBaseOperations) that
implements all the operations it can implement with Doubles.


OK, concept is clear, I still don't agree on the nomenclature, anyway.
Actually *WeightBaseOperations describe
*WeightAdditionalOrderedMonoid, so *BaseOperations doesn't sound self
explaining.



then let's find a better name, but why *OrderedMonoid? Assume we replace the
whole set of current interfaces (see my comment to the next paragraph below)
with just Addition and Comparable (the latter already exists of course).
There is no need to create another interface to merge the two
(ComparableAddition? OrderedAddition?). Then we have:

public class DoubleWeightOperations
    implements Addition, Comparator
{
    [...]
}

I would not rename the class to DoubleWeightAddition or even
DoubleWeightComparableAddition. What if later we need to e.g. add a function
that normalizes weights by some factor, or returns the reciprocal of a
weight, etc? With the above code it would suffice to add new interfaces:

public class DoubleWeightOperations
    implements Addition, Comparator, Normalization, Reciprocal
{
    [...]

}


Moreover, The Zero interface is the *additional* monoid identity
element, if someone has to implement the Multiplication Monoid I
wouldn't expect he implements the One interface wich declares O one()
method.
This is why IMHO in the current algebra model, Zero has to be dropped
and has to be modified in:



That is fine for me. I simply followed pure theory while implementing that
and used all possible granularity. Now looking at our implementations I
think it is save enough to even merge Zero, Semigroup and Monoid (so that's
one step further than your example below) and call the result Addition so
that its role is clear to everybody. Does that sound good? :)

Claudio


/**
  * semigroup is an algebraic structure consisting of a set together
with an associative binary operation.
  */
interface Semigroup<E>
{

     E append( E s1, E s2 );

}

/**
  * A {@link Monoid} is a {@link Semigroup} with an identity value.
  */
public interface Monoid<E>
     extends Semigroup<M>
{

    E identity();

    E inverse( E element );

}

that would continue working for every Monoid specialization. Or not?
thoughts?
Thanks and best,
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Sat, Mar 3, 2012 at 1:43 PM, Claudio Squarcella
<squar...@dia.uniroma3.it>    wrote:


Hi,


On 03/03/2012 02:21, Simone Tripodi wrote:


first of all: yes, I will play with this stuff as soon as I find the
time
:)


this is scaring... go Orb.io, go! :)

constant), but still there is one more step of indirection. So we would
need
to test and compare performances, hopefully with acceptable results.


sounds it would be better if you implement that stuff in a branch to
keep both implementations, IMHO



sure :)

  * with the current approach: one interface for edge-weighted graphs
   (EdgeWeightedGraph, renaming the current WeightedGraph?), one for
   vertex-weighted graphs (VertexWeightedGraph) and maybe even one for
   weights on both edges and vertices (EdgeAndVertexWeightedGraph?) --
   not to talk about their counterparts with labels, etc;
  * with the proposed approach: a Graph would implement
   HasWeightsOnEdges and/or HasWeightsOnVertices -- and maybe also
   HasLabelsOnEdges if needed.


do you remember that we reintroduced the WeightedGraph just for the
type inference issue on fluent APIs in Eclipse, do you? ;) It would
have been worked otherwise as well dropping the WeightedGraph and
expressing types only on Vertex/Edges



that is right. On the other hand with "HasWeightedEdges" we could drop
"WeightedEdge" and so on: one interface in, one out.

Or we could even save both approaches as alternative implementations.
That
is:

  * a set of interfaces: e.g. HasWeightedEdges#getWeight(edge),
   HasWeightedVertices#getWeight(vertex), etc.
  * #1 implementation with external properties: the graph keeps the
   mapping between edge/vertex and weight, so that any type can be used
   for edges/vertices
  * #2 classical implementation: we keep markers like WeightedEdge and
   WeightedVertex but only the graph knows about them. algorithms keep
   asking the graph for weights of edges/vertices, and the graph in
   turn asks the edge/vertex itself (passed as parameter).

WDYT?

I know that this kind of "mismatch" is not your favourite ;) but would
you
really call "Operations" something which is just an "Addition" -- or
viceversa "DoubleWeightAddition" something that might later be expanded
with
other operations?


I am confused now: this is what you did in the concrete implementation!



no, trying to be clearer: you propose to rename Monoid into
WeightOperation,
which is like renaming Addition into Operation. Or alternatively to call
the
current *WeightBaseOperations something like *WeightMonoid. In both cases
I
disagree because I would prefer to keep a clear distinction between
single
well-defined properties/operations (like Addition or Comparator) and the
comprehensive implementation (e.g. DoubleWeightBaseOperations) that
implements all the operations it can implement with Doubles.

Hoping we'll converge somewhere, maybe asymptotically ;)
Claudio

time to sleep, cannot reply anymore, read tomorrow

-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Sat, Mar 3, 2012 at 1:37 AM, Claudio Squarcella
<squar...@dia.uniroma3.it>      wrote:

Hi,

what if that mapping function becomes a responsibility of
WeightedGraph
itself?
And more generally, what if any property of vertices and/or edges is
moved to the containing graph?


that would imply that Graph implementations have to implement vertices
and/or edges metadata indexing, that would be anyway less performant
than accessing directly on metadata contained in current node/arc -
just count the number of time you should have to touch the adapted
data structures, of course will be at least one more than the actual.



that is absolutely right. Not asymptotically if the implementation is
good
(hashmaps are already good enough with their read time which is
basically
constant), but still there is one more step of indirection. So we would
need
to test and compare performances, hopefully with acceptable results.

We could externalize all different graph properties to appropriate
interfaces (HasWeightsOnEdges, HasLabelsOnVertices, etc) and then
each
algorithm specifies the needed input graph including the subset of
interfaces it needs to implement. We do something like that with
weight
operations already.


I am worried that with that approach the number of interfaces would
proliferate like pollen during Spring, users - and devs - would get
easily lost



but that would happen anyway as soon as we implement an algorithm with
weights on vertices, right? Here are the options I see:

  * with the current approach: one interface for edge-weighted graphs
   (EdgeWeightedGraph, renaming the current WeightedGraph?), one for
   vertex-weighted graphs (VertexWeightedGraph) and maybe even one for
   weights on both edges and vertices (EdgeAndVertexWeightedGraph?) --
   not to talk about their counterparts with labels, etc;
  * with the proposed approach: a Graph would implement
   HasWeightsOnEdges and/or HasWeightsOnVertices -- and maybe also
   HasLabelsOnEdges if needed.

weights are something already complicated for being a simple concept,
please apologize for the little offtopic:

Zero, name of an element, contains `zero` method to get the zero (it
is still confusing to me), Monoid  extends Zero and Semigroup - given
the use inside graph math, Zero#zero and Semigroup#append can be moved
directly to Monoid and rename it as WeightOperation - does it remind
you something? :P



I can agree with most of what you say but I would still call the result
Monoid, or maybe even better Addition -- because that is what it is, a
Monoid representing the sum operation. "WeightOperation" sounds more
like
a
general "container" which can include Addition, Comparator, and maybe
in
the
near future Multiplication or who knows what -- which again is pretty
much
what happens now with the wrappers for Double, Integer, etc.

I know that this kind of "mismatch" is not your favourite ;) but would
you
really call "Operations" something which is just an "Addition" -- or
viceversa "DoubleWeightAddition" something that might later be expanded
with
other operations?

Ciao and thanks for your feedback!
Claudio

best,
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Fri, Mar 2, 2012 at 10:22 PM, Claudio Squarcella
<squar...@dia.uniroma3.it>        wrote:

Hi,

The weights can be external, too.  It's only a function from edge to
weight.  Your algorithm can take a function for its weights.  The
files
library does it similar to this.



what if that mapping function becomes a responsibility of
WeightedGraph
itself? And more generally, what if any property of vertices and/or
edges
is
moved to the containing graph?

We could externalize all different graph properties to appropriate
interfaces (HasWeightsOnEdges, HasLabelsOnVertices, etc) and then
each
algorithm specifies the needed input graph including the subset of
interfaces it needs to implement. We do something like that with
weight
operations already.

Claudio

On Mar 2, 2012 3:08 PM, "Ted Dunning"<ted.dunn...@gmail.com>
  wrote:

Having weights on vertices is quite common.  Consider any
probability
transition network.  The weight on each node is the probability of
being
in
that state and the weights on the edges are conditional
probabilties.

Page rank is a related example of having weights on nodes.

On Fri, Mar 2, 2012 at 12:40 AM, Claudio Squarcella<
squar...@dia.uniroma3.it>          wrote:

Hi all,

  Claudio is aware also about algorithms where weights are
associated
to


Vertex - he's preparing his PhD research on graphes - maybe he
can
show us a more long-vision roadmap and evaluate benefits on
simplifying the design.

yes there are algorithms with weights on vertices. Of course those
with
weighted edges (like the ones already implemented) are much more


widespread


and frequently used, but still we cannot forget about that. Also,


although


on a secondary level, labels on vertices/edges are kind of
important
in
many situations (including testing, debugging) where I think it is
good

to


keep them distinct from the standard "toString" method (you might
want
to
represent only a subset of info in the label, etc).

Matthew Pocock suggested an alternative approach back in the days
of
weight abstraction:

  * the graph itself is extremely simple and naked: no
weights/labels
on
   vertices/edges;
  * all properties are stored in some external structure, which I
   imagine composed of associative maps (Map<Edge, Weight>, etc
etc).

He motivated the idea with a "personal use case": often graphs are
used
and reused with the same structure but different weights (and/or
labels,
etc). Now if James' question becomes a second use case, maybe it's
the
right time to exhume that idea ;)

Ciao,
Claudio

--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~**squarcel<


http://www.dia.uniroma3.it/~squarcel>


http://squarcella.com/






------------------------------**------------------------------**---------
To unsubscribe, e-mail: dev-unsubscribe@commons.**apache.org<


dev-unsubscr...@commons.apache.org>


For additional commands, e-mail: dev-h...@commons.apache.org

--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://squarcella.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org

For additional commands, e-mail: dev-h...@commons.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://squarcella.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://squarcella.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org


--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://squarcella.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [graph] Why the Vertex and Edge interfaces?

Reply via email to