Yes, it would be nice! Any other opinions? Will you open a Jira for this improvement?

Thank you,
William

2014-04-27 21:59 GMT-03:00 Mark G <ma...@apache.org>:

> In my local copy I have these methods in the interface:
>
>   Map<String, Double> scoreMap(String text);
>   SortedMap<Double, Set<String>> sortedScoreMap(String text);
>
> and these impls of them in the ME impl
>
>   public Map<String, Double> scoreMap(String text) {
>     Map<String, Double> probDist = new HashMap<String, Double>();
>
>     double[] categorize = categorize(text);
>     int catSize = getNumberOfCategories();
>     for (int i = 0; i < catSize; i++) {
>       String category = getCategory(i);
>       probDist.put(category, categorize[getIndex(category)]);
>     }
>     return probDist;
>   }
>
>   public SortedMap<Double, Set<String>> sortedScoreMap(String text) {
>     SortedMap<Double, Set<String>> descendingMap =
>         new TreeMap<Double, Set<String>>().descendingMap();
>     double[] categorize = categorize(text);
>     int catSize = getNumberOfCategories();
>     for (int i = 0; i < catSize; i++) {
>       String category = getCategory(i);
>       double score = categorize[getIndex(category)];
>       if (descendingMap.containsKey(score)) {
>         descendingMap.get(score).add(category);
>       } else {
>         Set<String> newset = new HashSet<>();
>         newset.add(category);
>         descendingMap.put(score, newset);
>       }
>     }
>     return descendingMap;
>   }
>
> They are pretty simple, but if everyone agrees I can commit them (with some
> java docs)
>
> On Sat, Apr 26, 2014 at 8:39 AM, Jörn Kottmann <kottm...@gmail.com> wrote:
>
> > On Thu, 2014-04-24 at 19:54 -0300, William Colen wrote:
> > > Yes, it looks nice. Maybe we should redo all the DocumentCategorizer
> > > interface. It is different from other tools, for example, we can't get
> > > the best category of one document with only one call, we need to use
> > > two methods.
> >
> > Yes that is right. +1 to change it. Can we deprecate the old methods and
> > just add new ones to not break backward compatibility?
> >
> > Jörn
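
For reference, here is a minimal sketch of how a caller might read the best category in a single step from a descending score map like the one quoted above. The class name ScoreMapSketch and the bestCategories helper are hypothetical and not part of OpenNLP, and the scores are passed in as a plain map instead of coming from categorize(text).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not use or extend any OpenNLP class.
class ScoreMapSketch {

    // Groups categories by score, highest score first (same idea as the quoted sortedScoreMap).
    static SortedMap<Double, Set<String>> sortedScoreMap(Map<String, Double> scores) {
        SortedMap<Double, Set<String>> descending =
                new TreeMap<Double, Set<String>>().descendingMap();
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            Set<String> bucket = descending.get(entry.getValue());
            if (bucket == null) {
                bucket = new HashSet<String>();
                descending.put(entry.getValue(), bucket);
            }
            bucket.add(entry.getKey());
        }
        return descending;
    }

    // Returns every category that shares the top score, or an empty set if there are no scores.
    static Set<String> bestCategories(Map<String, Double> scores) {
        SortedMap<Double, Set<String>> sorted = sortedScoreMap(scores);
        if (sorted.isEmpty()) {
            return Collections.<String>emptySet();
        }
        // The map is in descending order, so firstKey() is the highest score.
        return sorted.get(sorted.firstKey());
    }

    public static void main(String[] args) {
        // Made-up scores for illustration only.
        Map<String, Double> scores = new HashMap<String, Double>();
        scores.put("sports", 0.7);
        scores.put("politics", 0.2);
        scores.put("tech", 0.1);
        System.out.println(bestCategories(scores));
    }
}
```

Because the map is keyed by score, categories with equal scores end up in the same set, so the "best category" is a set and may contain more than one entry.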