Re: [Graph] On graph weight type(s)

2011-12-23 Thread Matthew Pocock
On 23 December 2011 10:38, Simone Tripodi  wrote:

> Hi Matthew!
>


> Usually algorithms (like Dijkstra) already have clear statements and
> known steps, so we could safely expose APIs that hide all those
> details from clients by wrapping your proposals.
>

That's what façades are for. I totally agree with providing users with
utility methods that hide all the guts.


>
> WDYT? Thanks again!!!
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/

Re: [Graph] On graph weight type(s)

2011-12-23 Thread Matthew Pocock
Hi Simo,

I guess the 5 minutes to run example would be:

ShortestPathFunction.dijkstra
  .makeResult(graph, EdgeWeight.forWeightedEdge, Monotonic.sumDouble)
  .findShortestPath(source, target);

I was assuming that there would be standard palettes of all the strategies
available statically in the obvious places. Actually, now I see the code
written out in full like that, I'd perhaps consider renaming makeResult to
`calculate` or `prepare` or some other verb.

Matthew

On 23 December 2011 08:47, Simone Tripodi  wrote:

> Hi Matthew!
>
> at a first looks it is really interesting, just give me the time to
> digest because at the same time I had the feeling of a little
> over-engineering activity, I am worried that "5 minutes to run" users
> would find it not so immediate.
>
> Thanks for providing stuff to learn from!
> All the best,
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/

Re: [Graph] On graph weight type(s)

2011-12-22 Thread Matthew Pocock
Hi,

Just thought I'd throw something out here. My experience is that I often
take the same graph (as in the exact same data, same objects) but at
different times want to use different weights. So, rather than having Edge
extend Weighted, I'd factor weights out into their own interface:

/**
 * An edge weight function.
 *
 * note: perhaps this should more generally just be a Function1<A, B>, if
 * we have such a thing handy.
 *
 * @param <E> edge type
 * @param <W> weight type
 */
public interface EdgeWeight<E, W> {
  public W getWeight(E edge);
}
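
A minimal sketch of the "same graph, different weights" idea, not code from
[graph]. The Road class, its fields, and the BY_DISTANCE / BY_TIME constants
are hypothetical names made up for illustration. The same edge objects are
weighed by distance in one call and by travel time in the next, without the
edge type implementing Weighted.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for the EdgeWeight interface proposed above.
interface EdgeWeight<E, W> {
    W getWeight(E edge);
}

// Hypothetical edge type: it carries data but does not implement Weighted.
final class Road {
    final String from;
    final String to;
    final double km;
    final double minutes;

    Road(String from, String to, double km, double minutes) {
        this.from = from;
        this.to = to;
        this.km = km;
        this.minutes = minutes;
    }
}

final class RoadWeights {
    // Two weightings over the very same edge objects.
    static final EdgeWeight<Road, Double> BY_DISTANCE = r -> r.km;
    static final EdgeWeight<Road, Double> BY_TIME = r -> r.minutes;

    // Sums the weights of a sequence of edges under the given weighting.
    static double total(List<Road> path, EdgeWeight<Road, Double> weighting) {
        double sum = 0.0;
        for (Road r : path) {
            sum += weighting.getWeight(r);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Road> path = Arrays.asList(
            new Road("A", "B", 10.0, 12.0),
            new Road("B", "C", 5.0, 20.0));
        System.out.println(total(path, BY_DISTANCE)); // 15.0
        System.out.println(total(path, BY_TIME));     // 32.0
    }
}
```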

/**
 * A combination of a monoid and comparator that satisfy monotonicity of
 * the addition operation with respect to the comparator.
 *
 * ∀a: m.compare(m.zero(), a) <= 0
 * ∀a,b: m.compare(a, m.append(a, b)) <= 0
 */
public interface Monotonic<W> extends Monoid<W>, Comparator<W> {}
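
To make the two laws concrete, here is a small sketch of one possible
instance: addition over doubles. The names Monotonics, SUM_DOUBLE and
obeysLaws are assumptions for illustration only, and the Monoid interface is
a minimal local stand-in. The check only spot-tests one pair of values; it
is not a proof that an instance is lawful.

```java
import java.util.Comparator;

// Minimal local stand-in for a Monoid interface.
interface Monoid<M> {
    M zero();
    M append(M a, M b);
}

// Monoid plus Comparator, as in the interface above.
interface Monotonic<W> extends Monoid<W>, Comparator<W> {}

final class Monotonics {
    // Addition over doubles: zero is 0.0, append is +.
    // The laws only hold when every weight is non-negative.
    static final Monotonic<Double> SUM_DOUBLE = new Monotonic<Double>() {
        public Double zero() { return 0.0; }
        public Double append(Double a, Double b) { return a + b; }
        public int compare(Double a, Double b) { return Double.compare(a, b); }
    };

    // Spot-checks both laws for a single pair (a, b).
    static <W> boolean obeysLaws(Monotonic<W> m, W a, W b) {
        return m.compare(m.zero(), a) <= 0
            && m.compare(a, m.append(a, b)) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(obeysLaws(SUM_DOUBLE, 2.0, 3.5));  // true
        System.out.println(obeysLaws(SUM_DOUBLE, 2.0, -1.0)); // false
    }
}
```

The second call fails because a negative weight makes a |+| b smaller than a,
which is exactly the case Dijkstra-style algorithms cannot handle.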

Also, some algorithms calculate all shortest paths at once, while others
calculate them individually and independently. It's probably even possible
to calculate some lazily. So, the interfaces for shortest paths should
decouple setting up a strategy for all shortest paths from an object that
can be used to fetch a specific shortest path.

/**
 * An algorithm for finding shortest paths between vertices of a graph,
 * given some edge weighting function and
 * a well-behaved combinator for edges between connected vertices.
 */
public interface ShortestPathFunction<V extends Vertex, E extends Edge<V>,
    G extends DirectedGraph<V, E>, W> {
  public ShortestPathResult<V, E, W> makeResult(G graph,
      EdgeWeight<E, W> weighting, Monotonic<W> combineWith);
}

/**
 * The shortest paths between vertices in a graph.
 */
public interface ShortestPathResult<V extends Vertex, E extends Edge<V>, W>
{
  public WeightedPath findShortestPath(V source, V target);
}

How does that look? You can then have standard implementations of these
things in some static utility class or a spring-friendly resource. The
brute-force algorithms that compute all paths at once would do all the work
in makeResult() and simply store this in some state within the returned
ShortestPathResult. Those that calculate individual pairs on the fly (or
all shortest paths from some vertex) would capture state in makeResult()
and perform the actual computation in findShortestPath().
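
A rough sketch of the "capture state in makeResult(), compute in
findShortestPath()" variant, under heavy simplifying assumptions: vertices
are plain Strings, the graph is just a list of hypothetical Arc objects, and
the result returns only the total weight (null when the target is
unreachable) rather than a WeightedPath. LazyDijkstra and SUM_DOUBLE are
made-up names. An all-pairs algorithm would instead do its work inside
makeResult() and have findShortestPath() look the answer up.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

interface Monoid<M> { M zero(); M append(M a, M b); }
interface Monotonic<W> extends Monoid<W>, Comparator<W> {}
interface EdgeWeight<E, W> { W getWeight(E edge); }

// Simplified result: only the total weight, or null when unreachable.
interface ShortestPathResult<W> { W findShortestPath(String source, String target); }

// Hypothetical minimal edge type standing in for the [graph] one.
final class Arc {
    final String tail;
    final String head;
    final double cost;
    Arc(String tail, String head, double cost) {
        this.tail = tail;
        this.head = head;
        this.cost = cost;
    }
}

final class LazyDijkstra {
    // makeResult() only captures its arguments; each query runs its own search.
    static <W> ShortestPathResult<W> makeResult(final List<Arc> arcs,
            final EdgeWeight<Arc, W> weighting, final Monotonic<W> m) {
        return new ShortestPathResult<W>() {
            public W findShortestPath(String source, String target) {
                Map<String, W> best = new HashMap<String, W>();
                Set<String> settled = new HashSet<String>();
                PriorityQueue<Map.Entry<String, W>> queue =
                    new PriorityQueue<Map.Entry<String, W>>(11,
                        (x, y) -> m.compare(x.getValue(), y.getValue()));
                best.put(source, m.zero());
                queue.add(new AbstractMap.SimpleEntry<String, W>(source, m.zero()));
                while (!queue.isEmpty()) {
                    Map.Entry<String, W> top = queue.poll();
                    String v = top.getKey();
                    if (!settled.add(v)) continue;        // stale queue entry
                    if (v.equals(target)) return top.getValue();
                    for (Arc a : arcs) {
                        if (!a.tail.equals(v)) continue;  // linear scan keeps the sketch short
                        W candidate = m.append(top.getValue(), weighting.getWeight(a));
                        W known = best.get(a.head);
                        if (known == null || m.compare(candidate, known) < 0) {
                            best.put(a.head, candidate);
                            queue.add(new AbstractMap.SimpleEntry<String, W>(a.head, candidate));
                        }
                    }
                }
                return null;
            }
        };
    }

    static final Monotonic<Double> SUM_DOUBLE = new Monotonic<Double>() {
        public Double zero() { return 0.0; }
        public Double append(Double a, Double b) { return a + b; }
        public int compare(Double a, Double b) { return Double.compare(a, b); }
    };

    public static void main(String[] args) {
        List<Arc> arcs = new ArrayList<Arc>();
        arcs.add(new Arc("A", "B", 1.0));
        arcs.add(new Arc("B", "C", 2.0));
        arcs.add(new Arc("A", "C", 5.0));
        ShortestPathResult<Double> result = makeResult(arcs, a -> a.cost, SUM_DOUBLE);
        System.out.println(result.findShortestPath("A", "C")); // 3.0
        System.out.println(result.findShortestPath("C", "A")); // null
    }
}
```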

Matthew

On 22 December 2011 16:39, Claudio Squarcella wrote:

> Hi,
>
>
>  I highly appreciated the last contributions (thanks guys!) but I also
>> agree on this point, so let's start from the end.
>> I think that, no matter what underlying structure we come up with, the
>> user should be able to specify e.g. a weighted edge with something like
>>
>> public class MyEdge implements Edge, Weighted { ... }
>>
>> and be able to immediately use it as an input for all the algorithms,
>> without extra steps required. So the average user is happy, while "graph
>> geeks" can dig into advanced capabilities and forge their personalized
>> weights :)
>> I hope we all agree on this as a first step. Complexity comes after.
>>
>> I'll take my time as well to re-think.
>>
>
> I did think and code a bit more. First of all please take a look at the
> updated code: Weighted is an interface (weight W can be any type) and
> all the algorithms require edges to implement Weighted for now --
> we did not break it that much ;)
>
> About the "HasProperty-vs-Property" question (as in Comparable vs
> Comparator, MonoidElement vs Monoid, etc) I would go for the second one
> only. That is, external classes handle all operations on weights. Downside:
> the # of method parameters would increase linearly with the number of
> properties, but I can live with that (how many properties would weights
> have anyway?). On the other hand we have a neat interface for each
> property/class (Zero, Semigroup, Monoid, Ordering or Comparator, etc) and
> one clean, generic implementation for each algorithm. Dijkstra's signature
> becomes something like:
>
> public static , G extends
> DirectedGraph> WeightedPath findShortestPath( G graph, V
> source, V target, Monoid weightMonoid, Comparator weightComparator )
>
> Scary uh? But wait, default implementations for Double, Integer, etc. are
> way easier. E.g. Dijkstra's shortcut for Double:
>
> public static , G
> extends DirectedGraph> WeightedPath findShortestPath(
> G graph, V source, V target )
> {
>return findShortestPath(graph, source, target, new DoubleMonoid(), new
> DoubleComparator());
> }
>
> where DoubleMonoid and DoubleComparator are part of the library.
>
>
> If you guys are fine with this, I'm ready to try and patch [graph] with a
> Christmas gift :)
> Claudio
>
>
> --
> Claudio Squarcella
> PhD student at Roma Tre University
> E-mail address: squar...@dia.uniroma3.it
> Phone: +39-06-57333215
> Fax: +39-06-57333612
> http://www.dia.uniroma3.it/~squarcel

Re: [Graph] On graph weight type(s)

2011-12-15 Thread Matthew Pocock
Hi,

On 15 December 2011 11:35, James Carman  wrote:


> public interface BinaryOperation<T>
> {
>  public T execute(T operand1, T operand2);
> }
>
> Perhaps we can come up with an interface that combines the two
> aspects.  I'm trying to think of mathematically what that would be
> called.  By the way, what do you need to know "HasZero"?  A sum
> operation has to have a "zero", doesn't it?


The mathematical hierarchy goes: semigroup -> monoid.

http://en.wikipedia.org/wiki/Semigroup
http://en.wikipedia.org/wiki/Monoid

You don't need a full group here as you are only interested in a single
operation, not a pair of interacting operations. There are several
JVM-hosted libraries that model this hierarchy to a greater or lesser
degree. scalaz uses:

trait Semigroup[S] {
  def append(s1: S, s2: => S): S
}

trait Zero[Z] {
  val zero: Z
}

trait Monoid[M] extends Zero[M] with Semigroup[M]

http://scalaz.googlecode.com/svn/continuous/latest/browse.sxr/scalaz/Semigroup.scala.html
http://scalaz.googlecode.com/svn/continuous/latest/browse.sxr/scalaz/Zero.scala.html
http://scalaz.googlecode.com/svn/continuous/latest/browse.sxr/scalaz/Monoid.scala.html

If you're not used to reading scala, here's the essentially equivalent
definitions in Java:

public interface Semigroup<S> {
  public S append(S s1, S s2);
}

public interface Zero<Z> {
  public Z zero();
}

public interface Monoid<M> extends Zero<M>, Semigroup<M> {}



So, given that you have Ordering already (or is that Comparator?),
I'd say that Weight is defined as:

// insert comments here about the consistency of append and compare
public interface Weight<W> extends Monoid<W>, Ordered<W> {}

Of course, a sensible default implementation of Weight would delegate to
Monoid and Ordered instances and you can then pray to the gods of hotspot
to inline this all away.

public <W> Weight<W> weight(final Monoid<W> m, final Ordered<W> o) {
  return new Weight<W>() {
    public W zero() { return m.zero(); }
    ...
  };
}
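
For anyone who wants to try the delegation idea, here is a self-contained
sketch. It assumes java.util.Comparator in place of the Ordered interface
(which is not spelled out in this thread), and Weights and INT_SUM are
hypothetical names. All three methods simply forward to the supplied
instances.

```java
import java.util.Comparator;

interface Semigroup<S> { S append(S s1, S s2); }
interface Zero<Z> { Z zero(); }
interface Monoid<M> extends Zero<M>, Semigroup<M> {}

// Assumption: Comparator stands in for the Ordered interface.
interface Weight<W> extends Monoid<W>, Comparator<W> {}

final class Weights {
    // Builds a Weight that forwards every call to the given instances.
    static <W> Weight<W> weight(final Monoid<W> m, final Comparator<W> o) {
        return new Weight<W>() {
            public W zero() { return m.zero(); }
            public W append(W s1, W s2) { return m.append(s1, s2); }
            public int compare(W a, W b) { return o.compare(a, b); }
        };
    }

    // Integer addition as a monoid: zero is 0, append is +.
    static final Monoid<Integer> INT_SUM = new Monoid<Integer>() {
        public Integer zero() { return 0; }
        public Integer append(Integer a, Integer b) { return a + b; }
    };

    public static void main(String[] args) {
        Weight<Integer> w = weight(INT_SUM, Comparator.<Integer>naturalOrder());
        System.out.println(w.append(w.zero(), 7));            // 7
        System.out.println(w.compare(3, w.append(3, 4)) < 0); // true
    }
}
```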

Matthew

-- 
Dr Matthew Pocock
Integrative Bioinformatics Group, School of Computing Science, Newcastle
University
mailto: turingatemyhams...@gmail.com
gchat: turingatemyhams...@gmail.com
msn: matthew_poc...@yahoo.co.uk
irc.freenode.net: drdozer
skype: matthew.pocock
tel: (0191) 2566550
mob: +447535664143


Re: [Graph] On graph weight type(s)

2011-12-12 Thread Matthew Pocock
Hi,

I have tended to find that edge weights always follow the following laws:

They have a monoid:
  There is a zero (0) constant and a |+| operator for combining two weights.
They have an equivalence and (compatible) ordering relations (>, =):
  The ordering is compatible with the monoid. For example,
  a |+| 0 = a
  a |+| b >= a
  a |+| b = b |+| a
  a >= 0
  a != 0, b != 0 => a |+| b > a

Taken together, the algorithms for things like shortest-path or
weighted-k-neighbourhood can all be expressed, abstracted away from the
weight datatype, the operations for combining weights, and the operations
for comparing weights.

If you choose your ordering then you can derive the compatible min monoid
where a |+| b = min(a, b). If you use the natural ordering on numbers then
you commonly use the monoids (0, +) or (1, *).

However, I've had cases where the individual weights and accumulated
path-traversal weights are complex structures. This isn't a problem, as
long as there's a zero and |+| for these 'weight' structures, and a
well-behaved ordering over these structures.
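
As an illustration of a structured weight (a made-up example, not one of the
real cases I mentioned): a hypothetical PathCost pairs a toll count with a
distance. The |+| operation adds component-wise, and the ordering compares
tolls first, then distance. With non-negative components this satisfies
a >= 0 and a |+| b >= a.

```java
import java.util.Comparator;

// Hypothetical structured weight: number of tolls plus distance in km.
final class PathCost {
    final int tolls;
    final double km;

    PathCost(int tolls, double km) {
        this.tolls = tolls;
        this.km = km;
    }
}

// Zero, |+| and the ordering for PathCost, kept outside the weight class itself.
final class PathCostWeight implements Comparator<PathCost> {
    PathCost zero() {
        return new PathCost(0, 0.0);
    }

    // Component-wise addition.
    PathCost append(PathCost a, PathCost b) {
        return new PathCost(a.tolls + b.tolls, a.km + b.km);
    }

    // Lexicographic: fewer tolls wins, distance breaks ties.
    public int compare(PathCost a, PathCost b) {
        int byTolls = Integer.compare(a.tolls, b.tolls);
        return byTolls != 0 ? byTolls : Double.compare(a.km, b.km);
    }

    public static void main(String[] args) {
        PathCostWeight w = new PathCostWeight();
        PathCost a = new PathCost(1, 10.0);
        PathCost b = new PathCost(0, 2.5);
        System.out.println(w.compare(w.zero(), a) <= 0);       // true
        System.out.println(w.compare(a, w.append(a, b)) <= 0); // true
        // A long toll-free route still orders before a short route with a toll.
        System.out.println(w.compare(new PathCost(0, 100.0), new PathCost(1, 1.0)) < 0); // true
    }
}
```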

On 12 December 2011 04:39, James Carman  wrote:

> Sorry, I was on my phone before when I sent that.  Let me elaborate a
> bit more.  I would just allow the weights to be of any type.  However,
> you can create two different types of scenarios where you either use a
> Comparable derivative or you use whatever you want, but you have to
> supply a custom Comparator.
>
> On Sun, Dec 11, 2011 at 8:01 PM, James Carman
>  wrote:
> > I wouldn't restrict the weight to Comparable.  What if the user wanted to
> > provide their own Comparator?
> >
> > On Dec 11, 2011 7:07 PM, "Claudio Squarcella" 
> > wrote:
> >>
> >> Hi all,
> >>
> >> I explored a bit more the (rather philosophical) dilemma that came from a
> >> thread from last week, quoted below
> >>>
> >>> One step further. A weight is not necessarily a double: in some cases not
> >>> even a number, but rather a "comparable" of some sort. So I would suggest to
> >>> make use of generics in some way, possibly the smartest. Suggestions are
> >>> welcome :-)
> >>
> >>
> >> The question is: *what do we mean by weight when dealing with graphs?*
> >>
> >> "Real number" is a standard answer in graph theory: see, e.g.,
> >> http://www.math.jussieu.fr/~jabondy/books/gtwa/pdf/chapter1.pdf (pag. 15).
> >> What we have now in the code is a {{getWeight()}} method that returns a
> >> double. That serves well for all the algorithms currently implemented, and
> >> probably for many more to come. However it is also true that:
> >>
> >>  * some domains of interest and/or algorithms might be more restrictive
> >>   on the type and sign of "real number" for the weights: integers,
> >>   non-negative rationals, etc.
> >>  * strictly speaking, the basic operations associated with weights are
> >>   usually just a few. Comparison and sum are enough at least for the
> >>   algorithms implemented so far in the project (please correct me if I
> >>   am wrong). Maybe scaling? Additive inverse?
> >>  * each algorithm is aware of the subset of required operations. E.g.
> >>   Prim's algorithm for minimum spanning trees only requires edge
> >>   weights to be comparable, so they could even be Strings or whatever...
> >>  * some very abstract user might want to use a new class (not
> >>   necessarily a number) as a weight, provided that it meets the
> >>   requirements of the domain.
> >>
> >> So here is a high-level view of what I propose:
> >>
> >>  * the basic weight is nothing more than a {{Comparable}}, which is
> >>   hopefully generic enough;
> >>  * where needed, algorithms define more specific constraints on the
> >>   input graph in their signature (e.g. Dijkstra can use {{Double}}).
> >>
> >>
> >> Looking forward for comments,
> >> Claudio
> >>
> >> --
> >> Claudio Squarcella
> >> PhD student at Roma Tre University
> >> E-mail address: squar...@dia.uniroma3.it
> >> Phone: +39-06-57333215
> >> Fax: +39-06-57333612
> >> http://www.dia.uniroma3.it/~squarcel
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >> For additional commands, e-mail: dev-h...@commons.apache.org
> >>
> >


Re: [VOTE][Codec] Release Commons Codec 1.6-RC2

2011-11-23 Thread Matthew Pocock
Hi Henri,


On 23 November 2011 07:14, Henri Yandell  wrote:

> I get the following when I 'mvn clean package':
>
>
>  testSpeedCheck(org.apache.commons.codec.language.bm.BeiderMorseEncoderTest):
> Java heap space
>
> My MAVEN_OPTS are:
>
>  -Xmx2048m -XX:MaxPermSize=128m
>
> Not sure if that's expected (2G felt big enough) or if I did something
> wrong.
>

It surprises me (to put it mildly) that it should need that much memory. It
points to something perhaps going wrong. Can you tell me anything more?
For example: your java -version, anything I should know about your hardware,
and whether this happens every time or sporadically? Do you know which test
method it is failing in, or whether it fails in different methods at
different times?

Thanks,

Matthew


>
> Otherwise everything looked good.
>
> Hen
>
> On Sun, Nov 20, 2011 at 10:25 AM, Gary Gregory 
> wrote:
> > Good day to you all:
> >
> > I have prepared Commons Codec 1.6-RC2.
> >
> > The changes from RC1 are what Sebb found:
> > - EOL in sources
> > - Cruft from a dirty build, so I built this RC as I should have the first
> > time around with:
> >- mvn clean
> >- mvn deploy -Prelease
> >
> > Tag:
> >
> >
> https://svn.apache.org/repos/asf/commons/proper/codec/tags/commons-codec-1.6-RC2
> >
> > Site:
> >
> > https://people.apache.org/builds/commons/codec/1.6/RC2/
> >
> > Binaries:
> >
> > https://repository.apache.org/content/repositories/orgapachecommons-224/
> >
> > [ ] +1 release it
> > [ ] +0 go ahead, I cannot take the time
> > [ ] -1 no, do not release it because:
> >
> > This VOTE is open for 72 hours, until November 23 2011, 14:00 EST.
> >
> > Fixed Bugs:
> > o Use standard Maven directory layout.  Issue: CODEC-129. Thanks to
> > ggregory.
> > o Documentation spelling fixes.  Issue: CODEC-128. Thanks to
> > ville.sky...@iki.fi.
> > o Fix various character encoding issues in comments and test cases.
>  Issue:
> > CODEC-127.
> > o ColognePhonetic Javadoc should use HTML entities for special
> characters.
> > Issue: CODEC-123.
> >
> > Changes:
> > o Implement a Beider-Morse phonetic matching codec.  Issue: CODEC-125.
> > Thanks to Matthew Pocock.
> > o Migrate to Java 5.  Issue: CODEC-119.
> > o Migrate to JUnit 4.  Issue: CODEC-120.
> >
> > Heads up: the Beider-Morse encoder tests take a long time to run (5
> > minutes). The code has been optimized.
> >
> > Thank you,
> > Gary
> >
> > --
> > E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
> > JUnit in Action, 2nd Ed: http://bit.ly/ECvg0
> > Spring Batch in Action: http://bit.ly/bqpbCK
> > Blog: http://garygregory.wordpress.com
> > Home: http://garygregory.com/
> > Tweet! http://twitter.com/GaryGregory <http://twitter.com/GaryGregory>
> >


Re: [functor] Method 'XXX' is not designed for extension

2011-08-26 Thread Matthew Pocock
Hi,

On 26 August 2011 20:06, Simone Tripodi  wrote:

> Hi Matt,
> sorry for the late and for (maybe) silly question, but what's your PoV
> about making classes Vs methods as 'final'?
>

If I lived in a world without dependency injection, then I'd favour final
classes where I wanted to prevent sub-classing, and as a fringe benefit,
give hotspot extra hints that it can safely start inlining things. It's
possible that there's no sane reason ever to sub-class some of your classes
and also no sane reason for them to have bits over-ridden by DI. Removing
final in the future is a safe change with respect to other people's code.
Adding final is not.

I personally use final methods when classes are designed for extension and
are a mix of abstract and final methods. The sub-class is meant to fill in
the missing behaviour but not over-write the skeleton I've already provided.
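
A small sketch of that pattern, with invented class names (Report and
SalesReport are not from [functor]): the skeleton methods are final, the
gaps are abstract, and a sub-class can only fill in the gaps.

```java
// Designed for extension: a mix of final and abstract methods.
abstract class Report {
    // Final skeleton: sub-classes cannot reorder or skip the steps.
    public final String render() {
        return header() + "\n" + body() + "\n-- end --";
    }

    // Final helper that is part of the fixed skeleton.
    protected final String header() {
        return "== " + title() + " ==";
    }

    // The missing behaviour that each sub-class must supply.
    protected abstract String title();
    protected abstract String body();
}

// A leaf class with no sane reason to be sub-classed, so it is final.
final class SalesReport extends Report {
    protected String title() { return "Sales"; }
    protected String body() { return "42 units"; }
}

final class ReportDemo {
    public static void main(String[] args) {
        System.out.println(new SalesReport().render());
    }
}
```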

Matthew



> Many thanks in advance, have a nice day!!!
> Simo
>
> http://people.apache.org/~simonetripodi/
> http://www.99soft.org/
>
>
>
> On Thu, Aug 25, 2011 at 6:11 PM, Matt Benson  wrote:
> > On Wed, Aug 24, 2011 at 3:14 AM, Simone Tripodi
> >  wrote:
> >> Hi Matthew!
> >>
> >> agreed on such 3rd parties integrations you are speaking about, Google
> >> Guice would suffer the same (I'm not a fan of Spring :P)
> >>
> >> Anyway, as you already mentioned, it is a matter of design, IMHO
> >> subclassing those classes wouldn't have a lot of sense, since they are
> >> used to implement a kind of "expression language" - I would be scared
> >> if in my language I could change the semantic of my syntax...
> >>
> >> At the same time I wonder if it would make sense intercepting such
> >> calls... didn't think to any valid example, do you have one?
> >>
> >> Since I'm not the original author of [functor] and I'm just providing
> >> help to get it in a state to be released, better if more people are
> >> involved before doing any action :P
> >
> > Disclaimer:  I am also not the original author, nor am I any master of
> > FP... on the one hand, many of the complete
> > algorithm/comparator/composite implementations provided by [functor]
> > could probably be sensibly made final.  On the other hand, applying
> > this check to #equals(), #hashCode(), etc., seems pretty stupid.
> > Maybe we should just turn it off.
> >
> > Matt
> >
> >>
> >> Thanks for your feedbacks, have a nice day!!!
> >> All the best,
> >> Simo
> >>
> >> http://people.apache.org/~simonetripodi/
> >> http://www.99soft.org/
> >>
> >>
> >>

Re: [math] Consistent use of ExceptionContext [was "using the ExceptionContext facility"]

2011-08-26 Thread Matthew Pocock
Hi,

2011/8/26 Sébastien Brisard 

>
> In other words, having this kind of context with documented keys would
> help the end-user debug his own code. I hope I'm making my point
> clearly, there.
>

More info attached to exceptions is great. I often find that the first half
of fixing a bug is adding more explicit exception throw/catch handling and
ever-more explicit messages, chasing the fault back to a root cause.
However, you have to ask who is going to be making use of the exception. Is
it someone who is debugging the library, or some user who's called into it
and somehow got broken behaviour?

For debugging, you are familiar with your own library, and can capture info
about the state around where the exception was raised using your IDE in
debug mode. Your users aren't familiar with the library, and should be able
to tell from the exception message if it is most likely their fault (and if
so, how to fix it), or the library's (in which case they may send you the
stack-trace and we all know that better annotated exceptions make it easier
to interpret these even across different builds).

So, personally I would lean on the side of as much explicit info in the
message as possible. "It was parameter A that borked me because it was XXX
and I was expecting YYY". Don't rely upon line numbers, because these change
as ppl edit the file and your users can understand 'A is borked' but can't
understand a line number. I wouldn't usually bother putting objects
capturing state describing the faulty environment into the exception as I
can get that from the debugger, given a test case. In my experience, a good
proportion of the real causes of exceptions aren't co-located with where the
exception is raised. Perhaps with the right cascades of exception handlers,
all of which capture their relevant local state, you can then serialize the
result out and have a pre-canned test-case for the failure. I'm not sure how
practical this would be in the real world - I expect you'd drown in
try/catch eventualities. You may be better logging the hinkey state very
close to where the exception is raised rather than storing references to it.
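
A sketch of the kind of message I mean, using only the standard
IllegalArgumentException. The Checks class and requireInRange method are
made-up names, not part of [math] or its ExceptionContext facility. The
message names the parameter, the offending value, and what was expected.

```java
final class Checks {
    // Returns value unchanged when it lies in [min, max]; otherwise throws
    // with a message that says which parameter was wrong and why.
    static double requireInRange(String name, double value, double min, double max) {
        if (Double.isNaN(value) || value < min || value > max) {
            throw new IllegalArgumentException(
                "parameter '" + name + "' was " + value
                + " but expected a value in [" + min + ", " + max + "]");
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            requireInRange("probability", 1.5, 0.0, 1.0);
        } catch (IllegalArgumentException e) {
            // prints: parameter 'probability' was 1.5 but expected a value in [0.0, 1.0]
            System.out.println(e.getMessage());
        }
    }
}
```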

Happy to be proved wrong though. Perhaps this is the beginning of an era of
code that spits out bug unit tests whenever there are exceptions.

Matthew


> Best regards,
> Sébastien


-- 
Dr Matthew Pocock
Visitor, School of Computing Science, Newcastle University
mailto: turingatemyhams...@gmail.com
gchat: turingatemyhams...@gmail.com
msn: matthew_poc...@yahoo.co.uk
irc.freenode.net: drdozer
tel: (0191) 2566550
mob: +447535664143


Re: [functor] Method 'XXX' is not designed for extension

2011-08-23 Thread Matthew Pocock
Final classes don't always play well with things like aspects and dependency
injection and other things that mangle bytecode or dynamically introduce
subclasses/proxies (I'm thinking of Spring). Perhaps this is not an issue here.

Should these classes be final? Taking the example of FoldLeft - are there
circumstances where it would make sense to sub-class FoldLeft? Can it even
be subclassed in a way that would produce something that behaved as a
FoldLeft but overrode these flagged methods?

Matthew

On 23 August 2011 20:00, Simone Tripodi  wrote:

> Hi all guys,
> in [functor] component there are several classes with checkstyle
> errors[1] of the type
>
>Method '' is not designed for extension - needs to be
> abstract, final or empty.
>
> My opinion is that such classes should be final - but what someone
> else thinks about it?
>
> TIA, all the best!!!
> Simo
>
> [1] http://commons.apache.org/sandbox/functor/checkstyle.html
>
> http://people.apache.org/~simonetripodi/
> http://www.99soft.org/


Re: [codec] next releases

2011-08-23 Thread Matthew Pocock
My vote (not that I have one) would be for 1.6, and to keep 2.0 as the
release when the breaking changes are introduced.

Matthew

On 23 August 2011 09:18, Simone Tripodi  wrote:

> Hi all guys,
> I'd suggest to go through 1.6 too, even if we have a precedence in the
> past (before I joined as committer) when the Digester version was
> promoted from 1.8 to 2.0 just switching to JVM and added Generics...
> So my "concern" is just make sure we adopt a common policy for every
> component and understand if the Digester case was just an exception
> (or not).
> Have a nice day, all the best!!!
> Simo
>
> http://people.apache.org/~simonetripodi/
> http://www.99soft.org/
>
>
>
> On Tue, Aug 23, 2011 at 4:42 AM, sebb  wrote:
> > On 23 August 2011 03:32, Gary Gregory  wrote:
> >> Hi All:
> >>
> >> After the last round of discussion WRT generics, a 2.0 version, and the
> >> new BM encoder, it seems the consensus is:
> >>
> >> - Release a version based on trunk. Trunk requires Java 5 and includes
> >> the new BM encoder.
> >>
> >> - Revert the trunk changes that break binary compatibility,
> >> specifically, based on Clirr:
> >>
> >> Severity | Message | Class | Method / Field
> >> Error | Method 'public StringEncoderComparator()' has been removed | org.apache.commons.codec.StringEncoderComparator | public StringEncoderComparator()
> >> Error | Method 'public boolean isArrayByteBase64(byte[])' has been removed | org.apache.commons.codec.binary.Base64 | public boolean isArrayByteBase64(byte[])
> >> Error | Class org.apache.commons.codec.language.Caverphone removed | org.apache.commons.codec.language.Caverphone | -
> >> Error | Method 'public int getMaxLength()' has been removed | org.apache.commons.codec.language.Soundex | public int getMaxLength()
> >> Error | Method 'public void setMaxLength(int)' has been removed | org.apache.commons.codec.language.Soundex | public void setMaxLength(int)
> >> Error | Field charset is now final | org.apache.commons.codec.net.URLCodec | charset
> >> Error | Method 'public java.lang.String getEncoding()' has been removed | org.apache.commons.codec.net.URLCodec | public java.lang.String getEncoding()
> >>
> >> - Continue the generics discussion toward a major release which would
> >> likely require a package name change.
> >>
> >> Question: Because the code now requires Java 5, should the new version
> >> be 1.6 or 2.0?
> >>
> >> 1.6 feels right because we are adding an encoder.
> >> The only reason now for a 2.0 label is because we are using Java 5.
> >>
> >> Thoughts?
> >
> > A major version bump is required for API breaks; it is not required
> > for changes in base Java level. [1]
> >
> > Though if we were suddenly to require Java 7 it might make sense to go
> > to 2.0.
> >
> > Given that Java 1.5 has been out for some years now, and most users
> > will probably be on at least Java 1.5 now, it seems to me that it's
> > not necessary to have a major version bump; 1.6 is fine by me.
> >
> > [1] http://commons.apache.org/releases/versioning.html
> >
> >> Thank you,
> >> Gary
> >> --
> >> http://garygregory.wordpress.com/
> >> http://garygregory.com/
> >> http://people.apache.org/~ggregory/
> >> http://twitter.com/GaryGregory
> >>
> >


Re: [collections] Iterate over sublists of an original list

2011-08-18 Thread Matthew Pocock
The scala collections library has the grouped() method. From the scaladoc:

def grouped(size: Int): Iterator[List[A]]

Partitions elements in fixed size lists.

size     the number of elements per group
returns  An iterator producing lists of size size, except the last will be
         truncated if the elements don't divide evenly.

I'm not suggesting you should jump to scala to use its collections library,
or even use scala-library.jar from your Java app, but I do find this method
together with groupBy() one of the more useful bits of functionality. Pity
something like it isn't supported by other 3rd party collections for Java.
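
For comparison, a grouped()-style helper is only a few lines of plain Java. This is a minimal sketch, assuming a hypothetical class name Grouped that is not part of commons-collections or any other library. Unlike the Scala method it builds all the groups eagerly rather than returning an iterator, and it copies each chunk so the result does not depend on the source list afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not an existing commons-collections class. */
final class Grouped {

    private Grouped() {
    }

    /**
     * Splits source into consecutive lists of at most size elements.
     * The last list is shorter when the elements don't divide evenly.
     */
    static <T> List<List<T>> grouped(List<T> source, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive: " + size);
        }
        List<List<T>> groups = new ArrayList<List<T>>();
        for (int from = 0; from < source.size(); from += size) {
            // Clamp the upper index here so callers never deal with bounds.
            int to = Math.min(from + size, source.size());
            // Copy the subList view so later changes to source don't leak in.
            groups.add(Collections.unmodifiableList(
                    new ArrayList<T>(source.subList(from, to))));
        }
        return groups;
    }
}
```

The "load 100 ids at a time" case from the thread would then be a for-each loop over Grouped.grouped(allIds, 100).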

Matthew

On 18 August 2011 13:48, Sébastien Lorber wrote:

> Hello
>
> Actually, if you look at my implementation, I use that List.subList()
> method.
>
> It's a bit of a pain to use it for splitting because you must always take
> care of out-of-bounds indexes...
> IMHO, people do not really like to play with array/list indexes... If they
> just want to split a big list into a couple of small sublists,
> they probably do not want to deal with calculating the right index
> positions, they just want the list to be split :)
>
> I've looked at some Guava classes but wasn't able to find anything to do
> this: a convenient way to split a list
>
>
>
> 2011/8/18 David Karlsen 
>
> > Guava also has a lot of handy classes for working on collections.
> >
> > 2011/8/18 Simone Tripodi 
> >
> > > Salut Sébastien,
> > > wouldn't the List#subList(int, int)[1] method be helpful for your
> > > purposes?
> > > HTH,
> > > Simo
> > >
> > > [1] http://download.oracle.com/javase/1.5.0/docs/api/java/util/List.html#subList(int, int)
> > >
> > >
> > > http://people.apache.org/~simonetripodi/
> > > http://www.99soft.org/
> > >
> > >
> > >
> > > On Thu, Aug 18, 2011 at 2:26 PM, Sébastien Lorber
> > >  wrote:
> > > > Hello,
> > > >
> > > >
> > > > It's not the first time I have had to split a big list of Hibernate
> > > > entity IDs into sublists of 100 items, for example, so that I could
> > > > load these entities 100 at a time in a single request (with a
> > > > "where id in (")
> > > >
> > > > Thus I want to iterate easily on sublists of a list, with the
> > > > possibility to give the sublist a size...
> > > > I thought I would find the tool in Apache Collections but I didn't
> > > > find it.
> > > > Perhaps I've missed the class...
> > > >
> > > >
> > > >
> > > > If there's no tool to do that yet, I think it would be great to make
> > > > one in Apache Collections.
> > > >
> > > > The fairly simple implementation I use at work is the following:
> > > > http://pastebin.com/CRitkWTG
> > > >
> > > >
> > > > And you use it like that:
> > > >
> > > >     // We load vehicles 100 by 100
> > > >     for ( List idSublist : new SublistIterable(allIds,100) ) {
> > > >         List vehiclesSublist = vehicleDAO.findByIds(idSublist);
> > > >         // blablabla
> > > >     }
> > > >
> > >
> >
> >
> > --
> > --
> > David J. M. Karlsen - http://www.linkedin.com/in/davidkarlsen
> >
>





Re: [Math] "iterator" and "sparseIterator" in "RealVector" hierarchy

2011-08-17 Thread Matthew Pocock
> > > > > > > right operands and do the right thing.  Obviously, a type test
> > > > > > > will not tell you whether this matrix is sparse or not.
> > > > > > >
> > > > > > > This matrix and siblings is very important in compressed sensing
> > > > > > > and stochastic projection algorithms.
> > > > > > >
> > > > > > > On Tue, Aug 16, 2011 at 1:55 PM, Phil Steitz
> > > > > > > <phil.ste...@gmail.com> wrote:
> > > > > > >
> > > > > > > > On 8/16/11 4:46 AM, Gilles Sadowski wrote:
> > > > > > > > > Hi.
> > > > > > > > >
> > > > > > > > >> I understood what he was suggesting.  I still disagree.
> > > > > > > > >> Dynamic dispatch and non-lattice typing structure is still
> > > > > > > > >> required to make this all work.  Java doesn't really do
> > > > > > > > >> that.  Pretending that what Java does is sufficient is
> > > > > > > > >> hammer-looking-for-a-nail, not solving the problems at hand.
> > > > > > > > > Maybe that *I* don't understand what you are hinting at.
> > > > > > > > > Sorry for being dense. [Although that seems appropriate in
> > > > > > > > > this discussion :-).]
> > > > > > > > >
> > > > > > > > > Polymorphism provides dynamic dispatch, overloading does not;
> > > > > > > > > that's why my proposition is that when you manipulate
> > > > > > > > > "unknown" types, those should come as "this", not as the
> > > > > > > > > argument of the method.
> > > > > > > > >
> > > > > > > > > What's wrong with that?
> > > > > > > > >
> > > > > > > > > As for "hammer-looking-for-a-nail", I also don't see what you
> > > > > > > > > mean: What is the problem? I guess that there are lots of
> > > > > > > > > applications who never need to know about sparse
> > > > > > > > > vectors/matrices. In those cases, the added complexity is not
> > > > > > > > > a "feature". The issue reported contends that the current
> > > > > > > > > design in CM can cause problems for dense implementations.
> > > > > > > > > I'm not even sure that the current design is usable for the
> > > > > > > > > type of applications that make heavy use of sparseness. Those
> > > > > > > > > are problems, IMHO.
> > > > > > > >
> > > > > > > > I have been out of pocket the last couple of days and may not
> > > > > > > > have time to dig into this until late tonight, but I agree with
> > > > > > > > Gilles that we need to get the conversation here more concrete.
> > > > > > > > I know we discussed this before and Ted and others had good
> > > > > > > > examples justifying the current setup.  Can we revisit these,
> > > > > > > > please?  What would be great would be some examples both from
> > > > > > > > the perspective of the [math] developer looking to add a new or
> > > > > > > > specialized class and [math] users writing code that leverages
> > > > > > > > the setup.
> > > > > > > >
> > > > > > > > Phil
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Gilles
> > > > > > > > >
> > > > > > > > >> On Mon, Aug 15, 2011 at 6:52 PM, Greg Sterijevski
> > > > > > > > >> <gsterijev...@gmail.com> wrote:
> > > > > > > > >>
> > > > > > > > >>> Forgive me for pushing my nose under the tent... I couldn't
> > > > > > > > >>> resist.

Re: [codec] Encoder / Decoder interface

2011-08-17 Thread Matthew Pocock
On 17 August 2011 13:58, Stephen Colebourne  wrote:

> The Object encode(Object) approach is still valid if the primary use
> case of the interface is for frameworks. In a framework, objects are
> generally treated as of type Object, so the API is fine. User code
> should use concrete versions.
>

In the case of ENC encode(DEC), the framework sees Object encode(Object), as
this is the erased type. The framework is happy and Java code referring to
the Encoder interface (directly or indirectly) is type-safe (barring casts
which we can't do much about).
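
A minimal sketch of the erasure point, under stated assumptions: GenericEncoder and UpperCaseEncoder are hypothetical names used only for illustration, not the commons-codec Encoder, and the type parameters are assumed to be unbounded. Reflection shows the method a framework would see, while typed callers keep compile-time checking.

```java
import java.lang.reflect.Method;

/** Hypothetical generified interface; not the actual commons-codec Encoder. */
interface GenericEncoder<DEC, ENC> {
    ENC encode(DEC decoded);
}

/** Toy implementation used only to exercise the interface. */
final class UpperCaseEncoder implements GenericEncoder<String, String> {
    public String encode(String decoded) {
        return decoded.toUpperCase();
    }
}

final class ErasureDemo {

    private ErasureDemo() {
    }

    /** Looks up encode(Object) reflectively and reports its erased return type. */
    static Class<?> erasedReturnType() {
        try {
            Method m = GenericEncoder.class.getMethod("encode", Object.class);
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** What a framework does: treat the encoder and its data as plain Objects. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Object encodeUntyped(Object encoder, Object value) {
        return ((GenericEncoder) encoder).encode(value);
    }
}
```

Typed code calls new UpperCaseEncoder().encode("abc") and gets a String back without a cast; the untyped path still works but, as noted, moves the type check to run time.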

Matthew


>
> Stephen
>


Re: [codec] Encoder / Decoder interface

2011-08-17 Thread Matthew Pocock
> >> Error | Field charset is now final | org.apache.commons.codec.net.URLCodec | charset
> >> Error | Method 'public java.lang.String getEncoding()' has been removed | org.apache.commons.codec.net.URLCodec | public java.lang.String getEncoding()
> >
> > DIfficult to read, but looks like the Clirr report I generated.
>
> Yep, that's what the build generates.
>
> Gary
>
> >>>>>>>>> Here is one thought in favour of removing them, at least from
> >>>>>>>>> Base64: sometimes I copy Base64.java into my own projects as a
> >>>>>>>>> copy/paste.  I change the namespace.  Then I remove references to
> >>>>>>>>> other parts of commons-codec that I am not bringing in, but that
> >>>>>>>>> Base64.java refers to (typically Encoder, Decoder, and
> >>>>>>>>> EncoderException).  The smaller my delta after the copy/paste,
> >>>>>>>>> the easier it is for me to copy the newest version in the future
> >>>>>>>>> to keep my fork up to date.
> >>>>>>>>>
> >>>>>>>>> I like doing this because it can make the difference between
> >>>>>>>>> needing a jar dependency and having no dependencies at all in
> >>>>>>>>> some of my other work.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Of course I am pretty focused on Base64.  I have never used the
> >>>>>>>>> soundex stuff.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I'm torn.  On the one hand, I suspect the Encoder/Decoder
> >>>>>>>>> interfaces have been mostly unused, and analyzing the Maven2
> >>>>>>>>> repository could shed light on that.  Removing the interfaces
> >>>>>>>>> makes sense if they are not really used, but on the other hand,
> >>>>>>>>> improving them, making them actually useful, also makes sense.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> yours,
> >>>>>>>>>
> >>>>>>>>> Julius Davies
> >>>>>>>>> 604-222-3310 (Home)
> >>>>>>>>>
> >>>>>>>>> $ sudo apt-get install cowsay
> >>>>>>>>> $ echo "Moo." | cowsay | cowsay -n | cowsay -n
> >>>>>>>>> http://juliusdavies.ca/cowsay/
> >>>>>>>>>


Re: [codec] getting the bmpm code out there

2011-08-11 Thread Matthew Pocock
Hi Sebb,


> The reason I raised the issue was that the API seems to be currently
> in a state of flux.
>

The BMPM code has not appeared in a previous release. It is a discrete
addition that doesn't alter any existing code, and as far as I know,
currently no 3rd party code relies upon it. Right now on trunk, it is a
StringEncoder.


> In this case, because the BMPM code is new, it might be possible to
> relax the requirement somewhat, so long as the code API is documented
> as being unstable.
>

I've no problem with marking it as new or unstable or whatever the right
word is. While it extends StringEncoder, the API is stable. Although there
may be more flux with the finer details of the string you get out for the
string you put in as we fix bugs and update the rule tables, this shouldn't
alter how clients (users of the API) call this code, only the quality of the
results they get back.


>
> If we do have to change BMPM in a way that is not binary compatible,
> then all code that uses the BMPM classes will need to be updated.
>

Understood. I think this only becomes an issue if/when Encoder becomes
generified, and at that point clearly we need a big version bump, with all
the associated changes, and all encoders and their clients would be equally
affected.

Does that help, or have I further muddied the waters?

Matthew



[codec] getting the bmpm code out there

2011-08-11 Thread Matthew Pocock
Hi,

As those of you who've been following the CODEC-125 ticket will know, with
Greg's help I've got a port of the Beider-Morse Phonetic Matching (BMPM)
algorithm in as a string encoder. As far as I can tell, it's
ready for people to use and abuse. It ideally needs more test-case words,
but to the best of my knowledge it doesn't have any horrendous bugs or
performance issues.

The discussion on the ticket started to stray off bmpm and on to policy for
releases and changing APIs, and Sebb said we should discuss it on the list.
So, here we are.

Ideally, I'd like there to be a release of commons-codec some time soon so
that users can start to try out bmpm right away, and so that we can start
the process of adding it to the list of supported indexing methods in Solr.
What do people think?

Matthew



CODEC-125

2011-07-22 Thread Matthew Pocock
Hi Gary,

I'm having no joy reproducing your test failures against my patch to
CODEC-125. It's possibly something to do with our different platforms,
locales or tool-chains. However, I doubt we can make any progress with it by
us ping-ponging messages through the bug tracker. If it is OK with you,
could we hook up for an IM session to try to get this sorted?

https://issues.apache.org/jira/browse/CODEC-125

Thanks,

Matthew



Re: (MATH-608) Remove methods from RealMatrix Interface

2011-07-02 Thread Matthew Pocock
You may get more mileage by having a matrix operation interface that is
parameterised over the two matrix types. It would have things like multiply
once and you would have different concrete implementations for different
pairs of matrix types. The implementations can even be provided via one of
the matrix classes to allow it to take advantage of the matrix internal
structure without exposing it.
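
A minimal sketch of that idea, with hypothetical names throughout (MatrixMultiply, Dense and Diagonal are illustrative only and are not commons-math types). The operation interface is parameterised over the left and right matrix types, and the diagonal-times-dense implementation is handed out by Diagonal itself so that it can read the private storage.

```java
/** Hypothetical operation interface; none of these names exist in commons-math. */
interface MatrixMultiply<L, R, OUT> {
    OUT multiply(L left, R right);
}

/** Toy dense matrix holding its values as rows. */
final class Dense {
    final double[][] values;

    Dense(double[][] values) {
        this.values = values;
    }
}

/** Toy diagonal matrix that keeps its storage private. */
final class Diagonal {
    private final double[] diagonal;

    Diagonal(double[] diagonal) {
        this.diagonal = diagonal.clone();
    }

    /**
     * The implementation for the (Diagonal, Dense) pair is provided by the
     * matrix class, so it can use the private array without exposing it.
     */
    static MatrixMultiply<Diagonal, Dense, Dense> timesDense() {
        return new MatrixMultiply<Diagonal, Dense, Dense>() {
            public Dense multiply(Diagonal left, Dense right) {
                if (left.diagonal.length != right.values.length) {
                    throw new IllegalArgumentException("dimension mismatch");
                }
                double[][] out = new double[right.values.length][];
                for (int row = 0; row < out.length; row++) {
                    out[row] = right.values[row].clone();
                    // Row i of (D * M) is row i of M scaled by d[i].
                    for (int col = 0; col < out[row].length; col++) {
                        out[row][col] *= left.diagonal[row];
                    }
                }
                return new Dense(out);
            }
        };
    }
}
```

Each pair of matrix types gets its own implementation, chosen at compile time by the static types, so no type tests or double dispatch are involved in this sketch.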

On 1 Jul 2011 22:49, "Ted Dunning"  wrote:

Double dispatch was the wrong term.  I should have said double argument
polymorphism.  Double dispatch is a sub-optimal answer to the problem of
double polymorphism.

Apologies for polluting the discussion with a silly error.

On Fri, Jul 1, 2011 at 2:35 PM, Greg Sterijevski wrote:


> Ted,
>
> I am not sure why you think there will be double dispatch. If we remove the
> multiplica...


Re: [codec] submitting a StringEncoder

2011-06-24 Thread Matthew Pocock
Hi,

I've submitted an issue:

https://issues.apache.org/jira/browse/CODEC-125

I have BMPM implemented now, with good (but not complete) test coverage.
It's certainly ready for someone else to try to break it for me. So,

1. The original authors are very keen to see this integrated with
commons-codec. Could somebody contact me directly so that I can hook them up
with you so that we can sort out the licensing issues?

2. How do I go about uploading the code? I can make one big svn patch file,
if you would like.

Thanks,

Matthew



Re: [LANG] Is a Range a kind of Pair?

2011-05-18 Thread Matthew Pocock
Range is not a sub-type of pair. You can think of a pair as being an ordered
set of 2 items. A Range is a contiguous set defined by a lower and upper
bound (which may or may not be inclusive). Given some flag
Clusive=Inclusive|Exclusive, then every range is uniquely identified by a
single Pair>. The in-memory representation of the
data defining a pair and a range may be the same, but they are not at all
the same kind of thing.
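
A minimal sketch of the distinction, using hypothetical types (Clusive, Bound, SketchPair and SketchRange are illustrative only, not the commons-lang classes, and the exact generic shape of the identifying pair is an assumption). The range holds a pair of bounds as its identifying data, but its behaviour is that of a set.

```java
/** Whether a bound itself belongs to the range. */
enum Clusive { INCLUSIVE, EXCLUSIVE }

/** One end of a range: a value plus its inclusive/exclusive flag. */
final class Bound<T> {
    final T value;
    final Clusive clusive;

    Bound(T value, Clusive clusive) {
        this.value = value;
        this.clusive = clusive;
    }
}

/** A plain ordered pair: two slots, no further meaning attached. */
final class SketchPair<A, B> {
    final A left;
    final B right;

    SketchPair(A left, B right) {
        this.left = left;
        this.right = right;
    }
}

/** A contiguous set. It is identified by a pair of bounds but is not a pair. */
final class SketchRange<T extends Comparable<T>> {
    private final SketchPair<Bound<T>, Bound<T>> limits;

    SketchRange(Bound<T> lower, Bound<T> upper) {
        this.limits = new SketchPair<Bound<T>, Bound<T>>(lower, upper);
    }

    /** The identifying data, exposed as a pair for anyone who wants it. */
    SketchPair<Bound<T>, Bound<T>> limits() {
        return limits;
    }

    /** Set membership: an operation that makes no sense on a bare pair. */
    boolean contains(T candidate) {
        int vsLower = candidate.compareTo(limits.left.value);
        int vsUpper = candidate.compareTo(limits.right.value);
        boolean aboveLower =
                limits.left.clusive == Clusive.INCLUSIVE ? vsLower >= 0 : vsLower > 0;
        boolean belowUpper =
                limits.right.clusive == Clusive.INCLUSIVE ? vsUpper <= 0 : vsUpper < 0;
        return aboveLower && belowUpper;
    }
}
```

Composition keeps the two concepts apart: the pair has no contains() and the range does not inherit left/right accessors it would then have to give meaning to.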

Matthew

On 18 May 2011 17:46, Matt Benson  wrote:

> On Wed, May 18, 2011 at 11:32 AM, Gary Gregory 
> wrote:
> > Why doesn't a Range extend Pair? It's pretty clear (to me at least)
> > that a range is a pair of values.
> >
> > Because the Pair is in our tuple package, it means that it should follow
> > tuple logic and be thought of as an ordered list of elements, in this
> > case two elements.
> >
> > The methods that Range has that are not in Pair could be moved there.
> >
>
> IMHO a Range is not precisely a Pair, though it could define its
> _limits_ in those terms.
>
> Matt
>
> > --
> > Thank you,
> > Gary
> >
> > http://garygregory.wordpress.com/
> > http://garygregory.com/
> > http://people.apache.org/~ggregory/
> > http://twitter.com/GaryGregory
> >
>


Re: [codec] submitting a StringEncoder

2011-05-16 Thread Matthew Pocock
Hi Gary,

I will see how much extra work is involved in also implementing D-M. I'm
only contracted to port B-M, but if it's not oodles of extra work I'll do
D-M also.

Matthew

On 16 May 2011 12:50, Gary Gregory  wrote:

> Hi Mathew,
>
> Will you also provide a Daitch–Mokotoff Soundex for comparison?
>
>
> Gary
>
> On May 16, 2011, at 7:17, Matthew Pocock 
> wrote:
>
> Beider-Morse
>





Re: [codec] submitting a StringEncoder

2011-05-16 Thread Matthew Pocock
Hi,

Sorry to be slow getting back to you. Thanks for all the helpful feedback.
The codec implements Beider-Morse Phonetic Matching which is optimised for
sounds-like in family names, particularly Central/Eastern European and
Jewish names. I'm in the process of contacting all the relevant people about
relicensing.

Matthew

On 12 May 2011 16:34, Henri Yandell  wrote:

> On Thu, May 12, 2011 at 8:31 AM, Matt Benson  wrote:
> > On Thu, May 12, 2011 at 10:08 AM, Henri Yandell 
> wrote:
> >> "Original code under GPL" is an IP issue. Namely that your Java code
> >> must also be under GPL.
> >>
> >> That subsequently can't be included in the Apache project, unless you
> >> can get the original PHP author to change the license to something
> >> listed here:
> >>
> >> http://www.apache.org/legal/resolved.html#category-a
> >>
> >
> > Does that hold if the original author straight up grants us... something?
>
> Good question.
>
> Either:
>
> a) original author signs an Apache CLA and submits the PHP code as
> "please add xyz feature; here is a php example that can be ported"
> or,
> b) original author relicenses under AL 2.0 or similar license.
>
> Both of these assume that the original author is the only author on
> the product. I've seen relicensing fall over because other's code was
> also in there.
>
> Hen
>


[codec] submitting a StringEncoder

2011-05-12 Thread Matthew Pocock
Hi,

How would I go about submitting a new StringEncoder to the language package
of codec? I have been contracted to port an existing sounds-like coding from
PHP to Java and ideally to contribute the result to the codec project. To
the best of my knowledge, there are no IP issues as the original code is
under an open-source license (GPL) and in addition my client has obtained
permission for the port from the copyright holder.

Thanks,

Matthew
