svn commit: r1674746 - in /commons/proper/dbutils/trunk/src: changes/ main/java/org/apache/commons/dbutils/ test/java/org/apache/commons/dbutils/handlers/columns/ test/java/org/apache/commons/dbutils/
Author: thecarlhall Date: Mon Apr 20 06:32:39 2015 New Revision: 1674746 URL: http://svn.apache.org/r1674746 Log: DBUTILS-124 Add tests for new column and property handlers. Added: commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/BooleanColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ByteColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ColumnHandlerTestBase.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/DoubleColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/FloatColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/IntegerColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/LongColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/SQLXMLColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ShortColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/StringColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/TimestampColumnHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/properties/DatePropertyHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/properties/StringEnumPropertyHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/properties/TestEnum.java Modified: commons/proper/dbutils/trunk/src/changes/changes.xml commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/BeanProcessor.java Modified: 
commons/proper/dbutils/trunk/src/changes/changes.xml URL: http://svn.apache.org/viewvc/commons/proper/dbutils/trunk/src/changes/changes.xml?rev=1674746&r1=1674745&r2=1674746&view=diff == --- commons/proper/dbutils/trunk/src/changes/changes.xml (original) +++ commons/proper/dbutils/trunk/src/changes/changes.xml Mon Apr 20 06:32:39 2015 @@ -20,13 +20,13 @@ This file is used by the maven-changes-plugin to generate the release notes. Useful ways of finding items to add to this file are: -1. Add items when you fix a bug or add a feature (this makes the +1. Add items when you fix a bug or add a feature (this makes the release process easy :-). 2. Do a bugzilla search for tickets closed since the previous release. 3. Use the report generated by the maven-changelog-plugin to see all -CVS commits. Set the project.properties' maven.changelog.range +CVS commits. Set the project.properties' maven.changelog.range property to the number of days since the last release. To regenerate the RELEASE-NOTES.txt: @@ -68,6 +68,9 @@ The type attribute can be add,u Support CallableStatement "out" parameters + +Implement column and property handlers using Java's service interfaces. 
+ @@ -126,7 +129,7 @@ The type attribute can be add,u Enhance BasicRowProcessor to have row mapping easier to configure -Updated pom.xml: Java 1.6 now required, clirr and compiler plugin removed +Updated pom.xml: Java 1.6 now required, clirr and compiler plugin removed BeanProcessor method processColumn should take SQLXML in consideration @@ -208,7 +211,7 @@ The type attribute can be add,u NullPointerException occured at rethrow method - + Object with Long or Decimal got initial zero value while database field is null @@ -222,7 +225,7 @@ The type attribute can be add,u -Tests fail to build under 1.6, and warning while compiling source +Tests fail to build under 1.6, and warning while compiling source BeanListHandler and BeanHandler fail to support java.sql.Date() @@ -234,7 +237,7 @@ The type attribute can be add,u Setting bean properties fails silently -MockResultSet needs to handle equals and hashCode +MockResultSet needs to handle equals and hashCode MockResultSet: Throw UnsupportedOperationException for not implemented methods @@ -243,16 +246,16 @@ The type attribute can be add,u Implement Pluggable Adaptors to Make BeanHandler Smarter -Patch for extending BasicRowProcessor +Patch for extending BasicRowProcessor -Protected QueryRunner.close() methods +
svn commit: r1674745 - in /commons/proper/dbutils/trunk/src: main/java/org/apache/commons/dbutils/ main/java/org/apache/commons/dbutils/handlers/columns/ main/java/org/apache/commons/dbutils/handlers/
Author: thecarlhall Date: Mon Apr 20 06:32:36 2015 New Revision: 1674745 URL: http://svn.apache.org/r1674745 Log: DBUTILS-124 Find column, property handlers using java's spi Move out the hard coded values of column and property types that dbutils can handle into spi files. This should also allow users of this library to add their own handlers later. Added: commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/ColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/PropertyHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/ commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/BooleanColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/ByteColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/DoubleColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/FloatColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/IntegerColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/LongColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/SQLXMLColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/ShortColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/StringColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/columns/TimestampColumnHandler.java commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/properties/ commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/properties/DatePropertyHandler.java 
commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/handlers/properties/StringEnumPropertyHandler.java commons/proper/dbutils/trunk/src/main/resources/ commons/proper/dbutils/trunk/src/main/resources/META-INF/ commons/proper/dbutils/trunk/src/main/resources/META-INF/services/ commons/proper/dbutils/trunk/src/main/resources/META-INF/services/org.apache.commons.dbutils.ColumnHandler commons/proper/dbutils/trunk/src/main/resources/META-INF/services/org.apache.commons.dbutils.PropertyHandler commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/ServiceLoaderTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/TestColumnHandler.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/properties/ commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/properties/PropertyHandlerTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/properties/TestPropertyHandler.java commons/proper/dbutils/trunk/src/test/resources/META-INF/ commons/proper/dbutils/trunk/src/test/resources/META-INF/services/ commons/proper/dbutils/trunk/src/test/resources/META-INF/services/org.apache.commons.dbutils.ColumnHandler commons/proper/dbutils/trunk/src/test/resources/META-INF/services/org.apache.commons.dbutils.PropertyHandler Modified: commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/BeanProcessor.java Modified: commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/BeanProcessor.java URL: http://svn.apache.org/viewvc/commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/BeanProcessor.java?rev=1674745&r1=1674744&r2=1674745&view=diff == --- commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/BeanProcessor.java (original) +++ 
commons/proper/dbutils/trunk/src/main/java/org/apache/commons/dbutils/BeanProcessor.java Mon Apr 20 06:32:36 2015 @@ -20,18 +20,18 @@ import java.beans.BeanInfo; import java.beans.IntrospectionException; import java.beans.Introspector; import java.beans.PropertyDescriptor; +import java.lang.reflect.Field; import java.lang.reflect.InvocationTargetException; import java.lang.reflect.Method; import java.sql.ResultSet; import java.sql.ResultSetMetaData; import java.sql.SQLException; -import java.sql.SQLXML; -import java.sql.Timestamp; import java.util.ArrayList; import java.util.Arrays; import java.util.HashMap; import java.util.List; import java.util.Map; +import java.util.ServiceLoader; /** * @@ -66,6 +66,20 @@ public class BeanProcessor { private static final Map, Object> primitiveDefaults = new HashMap, Object>(); /** +
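The log message for r1674745 describes moving the hard-coded type handling into SPI files so that library users can register their own handlers. A minimal sketch of that lookup with java.util.ServiceLoader follows. The ValueHandler interface, its two methods, and the IntegerValueHandler provider are hypothetical stand-ins chosen for illustration; they are not the ColumnHandler or PropertyHandler interfaces added by the commit, whose signatures are not shown in this excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class HandlerLookupSketch {

    /** Hypothetical stand-in for a handler interface; not the dbutils API. */
    public interface ValueHandler {
        boolean match(Class<?> targetType);

        Object apply(String rawValue);
    }

    /** Example provider that a META-INF/services file could list by class name. */
    public static final class IntegerValueHandler implements ValueHandler {
        @Override
        public boolean match(Class<?> targetType) {
            return targetType == Integer.class || targetType == Integer.TYPE;
        }

        @Override
        public Object apply(String rawValue) {
            return Integer.valueOf(rawValue);
        }
    }

    /** Collects every provider registered for the interface on the classpath. */
    public static List<ValueHandler> discover() {
        List<ValueHandler> handlers = new ArrayList<>();
        for (ValueHandler handler : ServiceLoader.load(ValueHandler.class)) {
            handlers.add(handler);
        }
        return handlers;
    }

    /** Uses the first handler that matches; returns the raw value if none does. */
    public static Object convert(Iterable<ValueHandler> handlers, Class<?> targetType, String rawValue) {
        for (ValueHandler handler : handlers) {
            if (handler.match(targetType)) {
                return handler.apply(rawValue);
            }
        }
        return rawValue;
    }

    public static void main(String[] args) {
        // No services file ships with this sketch, so discover() finds nothing here;
        // one handler is added by hand to exercise the lookup loop.
        List<ValueHandler> handlers = discover();
        handlers.add(new IntegerValueHandler());
        System.out.println(convert(handlers, Integer.class, "42"));
        System.out.println(convert(handlers, String.class, "42"));
    }
}
```

In a real deployment the provider class name would be listed in a resource file named after the interface under META-INF/services/, which is the role of the two files this commit adds under src/main/resources.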
svn commit: r1674747 - in /commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils: StatementConfigurationTest.java handlers/columns/ColumnHandlerTestBase.java
Author: thecarlhall Date: Mon Apr 20 06:32:40 2015 New Revision: 1674747 URL: http://svn.apache.org/r1674747 Log: Clean up tests and add missing source header Modified: commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/StatementConfigurationTest.java commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ColumnHandlerTestBase.java Modified: commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/StatementConfigurationTest.java URL: http://svn.apache.org/viewvc/commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/StatementConfigurationTest.java?rev=1674747&r1=1674746&r2=1674747&view=diff == --- commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/StatementConfigurationTest.java (original) +++ commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/StatementConfigurationTest.java Mon Apr 20 06:32:40 2015 @@ -17,7 +17,8 @@ package org.apache.commons.dbutils; import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertNull; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; import org.junit.Test; @@ -29,11 +30,11 @@ public class StatementConfigurationTest public void testEmptyBuilder() { StatementConfiguration config = new StatementConfiguration.Builder().build(); -assertNull(config.getFetchDirection()); -assertNull(config.getFetchSize()); -assertNull(config.getMaxFieldSize()); -assertNull(config.getMaxRows()); -assertNull(config.getQueryTimeout()); +assertFalse(config.isFetchDirectionSet()); +assertFalse(config.isFetchSizeSet()); +assertFalse(config.isMaxFieldSizeSet()); +assertFalse(config.isMaxRowsSet()); +assertFalse(config.isQueryTimeoutSet()); } /** @@ -49,10 +50,19 @@ public class StatementConfigurationTest .queryTimeout(5); StatementConfiguration config = builder.build(); +assertTrue(config.isFetchDirectionSet()); assertEquals(Integer.valueOf(1), config.getFetchDirection()); + 
+assertTrue(config.isFetchSizeSet()); assertEquals(Integer.valueOf(2), config.getFetchSize()); + +assertTrue(config.isMaxFieldSizeSet()); assertEquals(Integer.valueOf(3), config.getMaxFieldSize()); + +assertTrue(config.isMaxRowsSet()); assertEquals(Integer.valueOf(4), config.getMaxRows()); + +assertTrue(config.isQueryTimeoutSet()); assertEquals(Integer.valueOf(5), config.getQueryTimeout()); } Modified: commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ColumnHandlerTestBase.java URL: http://svn.apache.org/viewvc/commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ColumnHandlerTestBase.java?rev=1674747&r1=1674746&r2=1674747&view=diff == --- commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ColumnHandlerTestBase.java (original) +++ commons/proper/dbutils/trunk/src/test/java/org/apache/commons/dbutils/handlers/columns/ColumnHandlerTestBase.java Mon Apr 20 06:32:40 2015 @@ -1,3 +1,19 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ package org.apache.commons.dbutils.handlers.columns; import static org.junit.Assert.assertFalse;
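The StatementConfigurationTest change in r1674747 replaces null checks on getters with isXxxSet() checks. The pattern can be sketched as below, assuming a setting is stored as a nullable Integer. The Config class is a hypothetical illustration and not the StatementConfiguration class itself, whose implementation is not part of this diff.

```java
public class OptionalSettingSketch {

    /** Hypothetical configuration holder; not the dbutils StatementConfiguration. */
    public static final class Config {
        // null means the setting was never configured
        private final Integer fetchSize;

        public Config(Integer fetchSize) {
            this.fetchSize = fetchSize;
        }

        /** Lets callers ask whether a value exists without comparing against null. */
        public boolean isFetchSizeSet() {
            return fetchSize != null;
        }

        public Integer getFetchSize() {
            return fetchSize;
        }
    }

    public static void main(String[] args) {
        Config empty = new Config(null);
        Config sized = new Config(2);
        System.out.println(empty.isFetchSizeSet());
        System.out.println(sized.isFetchSizeSet() + " " + sized.getFetchSize());
    }
}
```

Tests written against the isXxxSet() methods keep passing if the getter's return type later changes, which may be one reason to prefer them over assertNull.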
[text] SANDBOX-498 Add parser options and initialise regular expressions once
Repository: commons-text Updated Branches: refs/heads/SANDBOX-498-OPTIONS [created] 331f80bfc SANDBOX-498 Add parser options and initialise regular expressions once Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/331f80bf Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/331f80bf Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/331f80bf Branch: refs/heads/SANDBOX-498-OPTIONS Commit: 331f80bfcf0380fcc35a6d18a327aef4a9e844e4 Parents: bf8bfb0 Author: Bruno P. Kinoshita Authored: Mon Apr 20 15:41:05 2015 +1200 Committer: Bruno P. Kinoshita Committed: Mon Apr 20 15:41:09 2015 +1200 -- .../commons/text/names/HumanNameParser.java | 73 .../commons/text/names/ParserOptions.java | 59 2 files changed, 102 insertions(+), 30 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/331f80bf/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index 5407d15..e7a3927 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -17,8 +17,6 @@ */ package org.apache.commons.text.names; -import java.util.Arrays; -import java.util.List; import java.util.Objects; import org.apache.commons.lang3.StringUtils; @@ -100,22 +98,51 @@ import org.apache.commons.lang3.StringUtils; */ public final class HumanNameParser { -private final List suffixes; -private final List prefixes; +/** + * The options used by the parser. + */ +private final ParserOptions options; + +/* + * Regular expressions used by the parser. + */ +// The regex use is a bit tricky. *Everything* matched by the regex will be replaced, +// but you can select a particular parenthesized submatch to be returned. 
+// Also, note that each regex requres that the preceding ones have been run, and matches chopped out. +// names that starts or end w/ an apostrophe break this +private final static String NICKNAMES_REGEX = "(?i) ('|\\\"|\\(\\\"*'*)(.+?)('|\\\"|\\\"*'*\\)) "; +// note the lookahead, which isn't returned or replaced +private final static String LEADING_INIT_REGEX = "(?i)(^(.\\.*)(?= \\p{L}{2}))"; +private final static String FIRST_NAME_REGEX = "(?i)^([^ ]+)"; +private final String suffixRegex; +private final String lastRegex; + /** * Creates a new parser. */ public HumanNameParser() { -// TODO make this configurable -this.suffixes = Arrays.asList( -"esq", "esquire", "jr", -"sr", "2", "ii", "iii", "iv"); -this.prefixes = Arrays.asList( -"bar", "ben", "bin", "da", "dal", -"de la", "de", "del", "der", "di", "ibn", "la", "le", -"san", "st", "ste", "van", "van der", "van den", "vel", -"von" ); +this(ParserOptions.DEFAULT_OPTIONS); +} + +/** + * Creates a new parser by providing options. + */ +public HumanNameParser(ParserOptions options) { +this.options = options; +final String suffixes = StringUtils.join(options.getSuffixes(), "\\.*|") + "\\.*"; +final String prefixes = StringUtils.join(options.getPrefixes(), " |") + " "; +suffixRegex = "(?i),* *((" + suffixes + ")$)"; +lastRegex = "(?i)(?!^)\\b([^ ]+ y |" + prefixes + ")*[^ ]+$"; +} + +/** + * Gets the parser options. + * + * @return parser options + */ +public ParserOptions getOptions() { +return options; } /** @@ -129,23 +156,9 @@ public final class HumanNameParser { Objects.requireNonNull(name, "Parameter 'name' must not be null."); NameString nameString = new NameString(name); -// TODO compile regexes only once when the parser is created -String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; -String prefixes = StringUtils.join(this.prefixes, " |") + " "; - -// The regex use is a bit tricky. 
*Everything* matched by the regex will be replaced, -// but you can select a particular parenthesized submatch to be returned. -// Also, note that each regex requres that the preceding ones have been run, and matches chopped out. -// names that starts or end w/ an apostrophe break this -String nicknamesRegex = "(?i) ('|\\\"|\\(\\\"*'*)(.+?)('|\\\"|\\\"*'*\\)) "; -String suffixRegex = "(?i),* *((" + suffixes + ")$)";
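The commit above builds the suffix and last-name expressions once in the constructor instead of on every parse call. A short sketch of the same idea follows. It goes one step further than the diff and stores a compiled java.util.regex.Pattern rather than a String; that step, the class name SuffixMatcherSketch, and the findSuffix method are assumptions made for illustration. The suffix expression has the same shape as suffixRegex in the diff.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SuffixMatcherSketch {

    // Compiled once per instance, then reused by every findSuffix call.
    private final Pattern suffixPattern;

    public SuffixMatcherSketch(List<String> suffixes) {
        // Each suffix may be followed by any number of dots, e.g. "jr" or "jr.".
        String alternatives = String.join("\\.*|", suffixes) + "\\.*";
        // Optional commas and spaces, then one suffix anchored at the end of the input.
        this.suffixPattern = Pattern.compile("(?i),* *((" + alternatives + ")$)");
    }

    /** Returns the trailing suffix, or an empty string when none is present. */
    public String findSuffix(String name) {
        Matcher matcher = suffixPattern.matcher(name);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        SuffixMatcherSketch matcher = new SuffixMatcherSketch(Arrays.asList("jr", "sr", "iii"));
        System.out.println(matcher.findSuffix("John Smith, Jr."));
        System.out.println(matcher.findSuffix("Jane Doe").isEmpty());
    }
}
```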
Git Push Summary
Repository: commons-text
Updated Branches:
  refs/heads/SANDBOX-498-KINOW [deleted] c5785647e
[text] Renaming variables to simpler names
Repository: commons-text Updated Branches: refs/heads/SANDBOX-498-KINOW [created] c5785647e Renaming variables to simpler names Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/c5785647 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/c5785647 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/c5785647 Branch: refs/heads/SANDBOX-498-KINOW Commit: c5785647efbb802cdfc5fc74dc1ee4457ca525d0 Parents: c4e8a3e Author: Bruno P. Kinoshita Authored: Sun Apr 19 22:17:15 2015 +1200 Committer: Bruno P. Kinoshita Committed: Sun Apr 19 22:17:15 2015 +1200 -- .../commons/text/names/HumanNameParser.java | 26 ++-- 1 file changed, 13 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/c5785647/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index 843685a..659ec95 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -75,7 +75,7 @@ public class HumanNameParser { /** * First name. */ -private String first; +private String firstName; /** * Single nickname found in the name input. */ @@ -83,11 +83,11 @@ public class HumanNameParser { /** * Middle name. */ -private String middle; +private String middleName; /** * Last name. */ -private String last; +private String lastName; /** * Name suffix. 
*/ @@ -119,10 +119,10 @@ public class HumanNameParser { this.name = name; this.leadingInit = ""; -this.first = ""; +this.firstName = ""; this.nickname = ""; -this.middle = ""; -this.last = ""; +this.middleName = ""; +this.lastName = ""; this.suffix = ""; this.suffixes = Arrays.asList(new String[] { @@ -162,7 +162,7 @@ public class HumanNameParser { * @return first name */ public String getFirst() { -return first; +return firstName; } /** @@ -180,7 +180,7 @@ public class HumanNameParser { * @return the middle name */ public String getMiddle() { -return middle; +return middleName; } /** @@ -189,7 +189,7 @@ public class HumanNameParser { * @return the last name */ public String getLast() { -return last; +return lastName; } /** @@ -249,19 +249,19 @@ public class HumanNameParser { this.name.flip(","); // get the last name -this.last = this.name.chopWithRegex(lastRegex, 0); +this.lastName = this.name.chopWithRegex(lastRegex, 0); // get the first initial, if there is one this.leadingInit = this.name.chopWithRegex(leadingInitRegex, 1); // get the first name -this.first = this.name.chopWithRegex(firstRegex, 0); -if (StringUtils.isBlank(this.first)) { +this.firstName = this.name.chopWithRegex(firstRegex, 0); +if (StringUtils.isBlank(this.firstName)) { throw new NameParseException("Couldn't find a first name in '{" + this.name.getStr() + "}'"); } // if anything's left, that's the middle name -this.middle = this.name.getStr(); +this.middleName = this.name.getStr(); } }
[08/13] [text] Make classes in the name package final.
Make classes in the name package final.

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/9e340643
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/9e340643
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/9e340643

Branch: refs/heads/master
Commit: 9e340643cfebd7b4088fd9946b3e92fc9f8cd394
Parents: a942b4c
Author: Benedikt Ritter
Authored: Sun Apr 19 16:32:31 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 16:32:31 2015 +0200

--
 src/main/java/org/apache/commons/text/names/HumanNameParser.java | 2 +-
 .../java/org/apache/commons/text/names/NameParseException.java | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/9e340643/src/main/java/org/apache/commons/text/names/HumanNameParser.java
--
diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java
index c47abde..a29e375 100644
--- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java
+++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java
@@ -63,7 +63,7 @@ import org.apache.commons.lang3.StringUtils;
  *
  * This class is immutable.
  */
-public class HumanNameParser {
+public final class HumanNameParser {
 
     /**
      * Suffixes found.

http://git-wip-us.apache.org/repos/asf/commons-text/blob/9e340643/src/main/java/org/apache/commons/text/names/NameParseException.java
--
diff --git a/src/main/java/org/apache/commons/text/names/NameParseException.java b/src/main/java/org/apache/commons/text/names/NameParseException.java
index b09c2d6..4fe5eda 100644
--- a/src/main/java/org/apache/commons/text/names/NameParseException.java
+++ b/src/main/java/org/apache/commons/text/names/NameParseException.java
@@ -19,7 +19,7 @@ package org.apache.commons.text.names;
 /**
  * Name parse exception.
  */
-public final class NameParseException extends RuntimeException {
+public final class NameParseException extends RuntimeException {
 
     /**
      * Serial UID.
[09/13] [text] Drop unused code from NameString and clean up NameStringTest
Drop unused code from NameString and clean up NameStringTest Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/ed985cd5 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/ed985cd5 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/ed985cd5 Branch: refs/heads/master Commit: ed985cd51220e956f516acecf1039defd0141d34 Parents: 9e34064 Author: Benedikt Ritter Authored: Sun Apr 19 16:44:32 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:44:32 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 5 +- .../apache/commons/text/names/NameString.java | 24 ++- .../commons/text/names/NameStringTest.java | 67 ++-- 3 files changed, 30 insertions(+), 66 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/ed985cd5/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index a29e375..b5c0aa3 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -100,6 +100,7 @@ public final class HumanNameParser { Objects.requireNonNull(name, "Parameter 'name' must not be null."); NameString nameString = new NameString(name); +// TODO compile regexes only once when the parser is created String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; @@ -132,11 +133,11 @@ public final class HumanNameParser { // get the first name String first = nameString.chopWithRegex(firstRegex, 0); if (StringUtils.isBlank(first)) { -throw new NameParseException("Couldn't find a first name in '{" + nameString.getStr() + "}'"); +throw new NameParseException("Couldn't find a first name in '{" + nameString.getWrappedString() + "}'"); } // if anything's 
left, that's the middle name -String middle = nameString.getStr(); +String middle = nameString.getWrappedString(); return new Name(leadingInit, first, nickname, middle, last, suffix); } http://git-wip-us.apache.org/repos/asf/commons-text/blob/ed985cd5/src/main/java/org/apache/commons/text/names/NameString.java -- diff --git a/src/main/java/org/apache/commons/text/names/NameString.java b/src/main/java/org/apache/commons/text/names/NameString.java index 8f606f2..54e2753 100644 --- a/src/main/java/org/apache/commons/text/names/NameString.java +++ b/src/main/java/org/apache/commons/text/names/NameString.java @@ -37,30 +37,20 @@ final class NameString { * * @param str encapsulated string. */ -public NameString(String str) { +NameString(String str) { this.str = str; } /** - * Gets the encapsulated string. + * Gets the wrapped string. * - * @return encapsulated string + * @return wrapped string */ -public String getStr() { +String getWrappedString() { return str; } /** - * Sets the encapsulated string value. - * - * @param str string value - */ -public void setStr(String str) { -this.str = str; -this.norm(); -} - -/** * Uses a regex to chop off and return part of the namestring. * There are two parts: first, it returns the matched substring, * and then it removes that substring from the encapsulated @@ -70,7 +60,7 @@ final class NameString { * @param submatchIndex which of the parenthesized submatches to use * @return the part of the namestring that got chopped off */ -public String chopWithRegex(String regex, int submatchIndex) { +String chopWithRegex(String regex, int submatchIndex) { String chopped = ""; Pattern pattern = Pattern.compile(regex); Matcher matcher = pattern.matcher(this.str); @@ -106,7 +96,7 @@ final class NameString { * @param flipAroundChar the character(s) demarcating the two halves you want to flip. 
* @throws NameParseException if a regex fails or a condition is not expected */ -public void flip(String flipAroundChar) { +void flip(String flipAroundChar) { String[] parts = this.str.split(flipAroundChar); if (parts != null) { if (parts.length == 2) { @@ -125,7 +115,7 @@ final class NameString { * Strips whitespace chars from ends, strips redundant whitespace, converts * whitespace
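The chopWithRegex Javadoc above describes a two-part operation: return a chosen submatch, then remove the matched text from the wrapped string. Since the method body is cut off in this excerpt, the following is only a sketch of that behaviour under stated assumptions: the whole match (not just the submatch) is removed, and the remainder is whitespace-normalised afterwards. The class name ChopSketch and the method names chop and remaining are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChopSketch {

    private String text;

    public ChopSketch(String text) {
        this.text = text;
    }

    /**
     * Returns the requested submatch and removes the entire match from the text.
     * Returns an empty string and leaves the text untouched when nothing matches.
     */
    public String chop(String regex, int submatchIndex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find()) {
            return "";
        }
        String chopped = matcher.group(submatchIndex);
        // Assumption: cut out group 0, then collapse the whitespace left behind.
        String rest = text.substring(0, matcher.start()) + " " + text.substring(matcher.end());
        text = rest.trim().replaceAll("\\s+", " ");
        return chopped == null ? "" : chopped;
    }

    public String remaining() {
        return text;
    }

    public static void main(String[] args) {
        ChopSketch name = new ChopSketch("John Smith, Jr.");
        System.out.println(name.chop("(?i),* *((jr\\.*)$)", 1));
        System.out.println(name.chop("(?i)^([^ ]+)", 0));
        System.out.println(name.remaining());
    }
}
```

Because each call shortens the text, later expressions only see what earlier ones left behind, which matches the note in the parser source that each regex requires the preceding ones to have run.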
[05/13] [text] Remove state from HumanNameParser, making it immutable
Remove state from HumanNameParser, making it immutable Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/1f6c5dae Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/1f6c5dae Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/1f6c5dae Branch: refs/heads/master Commit: 1f6c5daecded67a17c07371a564f74ef623b3f29 Parents: 685f9a8 Author: Benedikt Ritter Authored: Sun Apr 19 16:28:37 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:28:37 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 141 +++ .../org/apache/commons/text/names/Name.java | 32 + 2 files changed, 51 insertions(+), 122 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/1f6c5dae/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index df8e55c..c47abde 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -61,135 +61,32 @@ import org.apache.commons.lang3.StringUtils; * This implementation is based on the Java implementation, with additions * suggested in SANDBOX-487 (https://issues.apache.org/jira/browse/SANDBOX-487). * - * This class is not thread-safe. + * This class is immutable. */ public class HumanNameParser { /** - * Leading init part. - */ -private String leadingInit; -/** - * First name. - */ -private String first; -/** - * Single nickname found in the name input. - */ -private String nickname; -/** - * Middle name. - */ -private String middle; -/** - * Last name. - */ -private String last; -/** - * Name suffix. - */ -private String suffix; -/** * Suffixes found. */ -private List suffixes; +private final List suffixes; /** * Prefixes found. 
*/ -private List prefixes; +private final List prefixes; /** * Creates a parser given a string name. */ public HumanNameParser() { -this.leadingInit = ""; -this.first = ""; -this.nickname = ""; -this.middle = ""; -this.last = ""; -this.suffix = ""; - -this.suffixes = Arrays.asList(new String[]{ +// TODO make this configurable +this.suffixes = Arrays.asList( "esq", "esquire", "jr", -"sr", "2", "ii", "iii", "iv"}); -this.prefixes = Arrays -.asList(new String[] { +"sr", "2", "ii", "iii", "iv"); +this.prefixes = Arrays.asList( "bar", "ben", "bin", "da", "dal", "de la", "de", "del", "der", "di", "ibn", "la", "le", "san", "st", "ste", "van", "van der", "van den", "vel", -"von" }); -} - -/** - * Gets the leading init part of the name. - * - * @return the leading init part of the name - */ -public String getLeadingInit() { -return leadingInit; -} - -/** - * Gets the first name. - * - * @return first name - */ -public String getFirst() { -return first; -} - -/** - * Gets the nickname. - * - * @return the nickname - */ -public String getNickname() { -return nickname; -} - -/** - * Gets the middle name. - * - * @return the middle name - */ -public String getMiddle() { -return middle; -} - -/** - * Gets the last name. - * - * @return the last name - */ -public String getLast() { -return last; -} - -/** - * Gets the suffix part of the name. - * - * @return the name suffix - */ -public String getSuffix() { -return suffix; -} - -/** - * Gets the name suffixes. - * - * @return the name suffixes - */ -public List getSuffixes() { -return suffixes; -} - -/** - * Gets the name prefixes. 
- * - * @return the name prefixes - */ -public List getPrefixes() { -return prefixes; +"von" ); } /** @@ -218,28 +115,28 @@ public class HumanNameParser { String firstRegex = "(?i)^([^ ]+)"; // get nickname, if there is one -this.nickname = nameString.chopWithRegex(nicknamesRegex, 2); +String nickname = nameString.chopWithRegex(nicknamesRegex, 2); // get suffix, if there is one -this.suffix = nameString.chopWithRegex(suffixRe
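The commit above removes the per-parse fields (first, middle, last and so on) from the parser and has parse work with local variables instead, which is what allows the class comment to change from "not thread-safe" to "immutable". A reduced sketch of that design follows. The SimpleParser and Parsed classes and the split-on-last-space rule are hypothetical simplifications and do not reproduce the HumanNameParser logic.

```java
public class StatelessParserSketch {

    /** Immutable result object; all fields are set once in the constructor. */
    public static final class Parsed {
        private final String first;
        private final String last;

        Parsed(String first, String last) {
            this.first = first;
            this.last = last;
        }

        public String getFirst() {
            return first;
        }

        public String getLast() {
            return last;
        }
    }

    /** Holds no per-call state, so one instance can be shared between threads. */
    public static final class SimpleParser {

        public Parsed parse(String name) {
            // Everything derived from the input lives in local variables.
            String trimmed = name.trim();
            int lastSpace = trimmed.lastIndexOf(' ');
            if (lastSpace < 0) {
                return new Parsed(trimmed, "");
            }
            return new Parsed(trimmed.substring(0, lastSpace).trim(), trimmed.substring(lastSpace + 1));
        }
    }

    public static void main(String[] args) {
        SimpleParser parser = new SimpleParser();
        Parsed a = parser.parse("Ada Lovelace");
        Parsed b = parser.parse("Alan Mathison Turing");
        // The first result is unaffected by the second parse call.
        System.out.println(a.getFirst() + " / " + a.getLast());
        System.out.println(b.getFirst() + " / " + b.getLast());
    }
}
```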
[07/13] [text] Fix typo
Fix typo Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/a942b4c0 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/a942b4c0 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/a942b4c0 Branch: refs/heads/master Commit: a942b4c02194a6f544f129e89e0f399d51c5c01a Parents: bbba0a3 Author: Benedikt Ritter Authored: Sun Apr 19 16:31:01 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:31:01 2015 +0200 -- .../commons/text/names/HumanNameParserTest.java | 16 1 file changed, 8 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/a942b4c0/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index 314a949..f6c9ba6 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -71,29 +71,29 @@ public class HumanNameParserTest { * @param record a CSVRecord representing one record in the input file. 
*/ private void validateRecord(CSVRecord record) { -Name result = nameParser.parse(record.get(Colums.Name)); +Name result = nameParser.parse(record.get(Columns.Name)); long recordNum = record.getRecordNumber(); assertThat("Wrong LeadingInit in record " + recordNum, -result.getLeadingInitial(), equalTo(record.get(Colums.LeadingInit))); +result.getLeadingInitial(), equalTo(record.get(Columns.LeadingInit))); assertThat("Wrong FirstName in record " + recordNum, -result.getFirstName(), equalTo(record.get(Colums.FirstName))); +result.getFirstName(), equalTo(record.get(Columns.FirstName))); assertThat("Wrong NickName in record " + recordNum, -result.getNickName(), equalTo(record.get(Colums.NickName))); +result.getNickName(), equalTo(record.get(Columns.NickName))); assertThat("Wrong MiddleName in record " + recordNum, -result.getMiddleName(), equalTo(record.get(Colums.MiddleName))); +result.getMiddleName(), equalTo(record.get(Columns.MiddleName))); assertThat("Wrong LastName in record " + recordNum, -result.getLastName(), equalTo(record.get(Colums.LastName))); +result.getLastName(), equalTo(record.get(Columns.LastName))); assertThat("Wrong Suffix in record " + recordNum, -result.getSuffix(), equalTo(record.get(Colums.Suffix))); +result.getSuffix(), equalTo(record.get(Columns.Suffix))); } -private enum Colums { +private enum Columns { Name,LeadingInit,FirstName,NickName,MiddleName,LastName,Suffix } }
[13/13] [text] Merge remote-tracking branch 'remotes/origin/SANDBOX-498' for issue SANDBOX-498
Merge remote-tracking branch 'remotes/origin/SANDBOX-498' for issue SANDBOX-498 Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/bf8bfb0a Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/bf8bfb0a Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/bf8bfb0a Branch: refs/heads/master Commit: bf8bfb0a46c0e6d7f9e3d3416bf2f147c9b81074 Parents: e8e85d9 c1372c1 Author: Bruno P. Kinoshita Authored: Mon Apr 20 14:58:27 2015 +1200 Committer: Bruno P. Kinoshita Committed: Mon Apr 20 14:58:27 2015 +1200 -- src/changes/changes.xml | 1 + .../commons/text/names/HumanNameParser.java | 279 +++ .../org/apache/commons/text/names/Name.java | 141 +- .../commons/text/names/NameParseException.java | 2 +- .../apache/commons/text/names/NameString.java | 122 .../commons/text/names/HumanNameParserTest.java | 43 +-- .../commons/text/names/NameStringTest.java | 77 + .../org/apache/commons/text/names/NameTest.java | 104 --- 8 files changed, 381 insertions(+), 388 deletions(-) --
[02/13] [text] Pass the name to parse as parameter to the parse method
Pass the name to parse as parameter to the parse method Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/df7e7a7b Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/df7e7a7b Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/df7e7a7b Branch: refs/heads/master Commit: df7e7a7b0aba73a1bf09c41dbd32e913252a8707 Parents: aa29350 Author: Benedikt Ritter Authored: Sun Apr 19 16:02:55 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:02:55 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 52 ++-- .../commons/text/names/HumanNameParserTest.java | 4 +- 2 files changed, 16 insertions(+), 40 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/df7e7a7b/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index 5088bba..bf8f9ed 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -65,10 +65,6 @@ import org.apache.commons.lang3.StringUtils; public class HumanNameParser { /** - * Name parsed. - */ -private Name name; -/** * Leading init part. */ private String leadingInit; @@ -103,21 +99,8 @@ public class HumanNameParser { /** * Creates a parser given a string name. - * - * @param name string name - */ -public HumanNameParser(String name) { -this(new Name(name)); -} - -/** - * Creates a parser given a {@code Name} object. 
- * - * @param name {@code Name} */ -public HumanNameParser(Name name) { -this.name = name; - +public HumanNameParser() { this.leadingInit = ""; this.first = ""; this.nickname = ""; @@ -125,9 +108,9 @@ public class HumanNameParser { this.last = ""; this.suffix = ""; -this.suffixes = Arrays.asList(new String[] { +this.suffixes = Arrays.asList(new String[]{ "esq", "esquire", "jr", -"sr", "2", "ii", "iii", "iv" }); +"sr", "2", "ii", "iii", "iv"}); this.prefixes = Arrays .asList(new String[] { "bar", "ben", "bin", "da", "dal", @@ -137,15 +120,6 @@ public class HumanNameParser { } /** - * Gets the {@code Name} object. - * - * @return the {@code Name} object - */ -public Name getName() { -return name; -} - -/** * Gets the leading init part of the name. * * @return the leading init part of the name @@ -220,9 +194,11 @@ public class HumanNameParser { /** * Consumes the string and creates the name parts. * + * @param nameStr the name to parse. * @throws NameParseException if the parser fails to retrieve the name parts */ -public void parse() { +public void parse(String nameStr) { +Name name = new Name(nameStr); String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; @@ -238,28 +214,28 @@ public class HumanNameParser { String firstRegex = "(?i)^([^ ]+)"; // get nickname, if there is one -this.nickname = this.name.chopWithRegex(nicknamesRegex, 2); +this.nickname = name.chopWithRegex(nicknamesRegex, 2); // get suffix, if there is one -this.suffix = this.name.chopWithRegex(suffixRegex, 1); +this.suffix = name.chopWithRegex(suffixRegex, 1); // flip the before-comma and after-comma parts of the name -this.name.flip(","); +name.flip(","); // get the last name -this.last = this.name.chopWithRegex(lastRegex, 0); +this.last = name.chopWithRegex(lastRegex, 0); // get the first initial, if there is one -this.leadingInit = this.name.chopWithRegex(leadingInitRegex, 1); +this.leadingInit = 
name.chopWithRegex(leadingInitRegex, 1); // get the first name -this.first = this.name.chopWithRegex(firstRegex, 0); +this.first = name.chopWithRegex(firstRegex, 0); if (StringUtils.isBlank(this.first)) { -throw new NameParseException("Couldn't find a first name in '{" + this.name.getStr() + "}'"); +throw new NameParseException("Couldn't find a first name in '{" + name.getStr() + "}'"); } // if anything's left, that's the middle name -this.middle = this.name.getStr(); +this.middle = name.getS
[03/13] [text] Check for null inputs
Check for null inputs Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/9a0cc85a Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/9a0cc85a Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/9a0cc85a Branch: refs/heads/master Commit: 9a0cc85ad01dcf1f468736984cdd5dec0a7a3bf3 Parents: df7e7a7 Author: Benedikt Ritter Authored: Sun Apr 19 16:06:09 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:06:09 2015 +0200 -- .../java/org/apache/commons/text/names/HumanNameParser.java | 8 ++-- .../org/apache/commons/text/names/HumanNameParserTest.java | 6 ++ 2 files changed, 12 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/9a0cc85a/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index bf8f9ed..fa2433a 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -19,6 +19,7 @@ package org.apache.commons.text.names; import java.util.Arrays; import java.util.List; +import java.util.Objects; import org.apache.commons.lang3.StringUtils; @@ -194,10 +195,13 @@ public class HumanNameParser { /** * Consumes the string and creates the name parts. * - * @param nameStr the name to parse. - * @throws NameParseException if the parser fails to retrieve the name parts + * @param nameStr the name to parse. Must not be null. + * @throws NameParseException if the parser fails to retrieve the name parts. + * @throws NullPointerException if nameStr is null. 
*/ public void parse(String nameStr) { +Objects.requireNonNull(nameStr, "Parameter 'nameStr' must not be null."); + Name name = new Name(nameStr); String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; http://git-wip-us.apache.org/repos/asf/commons-text/blob/9a0cc85a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index 478d19c..d43d2be 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -50,6 +50,12 @@ public class HumanNameParserTest { } } +@Test(expected = NullPointerException.class) +public void shouldThrowNullPointerException_WhenNullIsParsed() throws Exception { +HumanNameParser parser = new HumanNameParser(); +parser.parse(null); +} + @Test public void testInputs() { for (CSVRecord record : parser) {
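The fail-fast null check added in this commit can be tried in isolation. The following is a minimal sketch, not the commons-text class: NullCheckSketch and its parse method are hypothetical stand-ins, and only Objects.requireNonNull and the message text come from the diff above.

```java
import java.util.Objects;

/** Hypothetical stand-in for a parser entry point; not the commons-text class. */
final class NullCheckSketch {

    /** Rejects null before any work is done, as the commit does for parse(String). */
    static String parse(String name) {
        Objects.requireNonNull(name, "Parameter 'name' must not be null.");
        // Placeholder for the real parsing work (assumption: just trim the input).
        return name.trim();
    }

    public static void main(String[] args) {
        System.out.println(parse("  Ana M. de la Cruz "));
        try {
            parse(null);
        } catch (NullPointerException e) {
            // The message passed to requireNonNull becomes the exception message.
            System.out.println(e.getMessage());
        }
    }
}
```

Checking the argument on entry means the caller sees a NullPointerException naming the parameter, rather than one raised later from inside the regex handling.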
[10/13] [text] Condition will always be true
Condition will always be true Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/b1c7e564 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/b1c7e564 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/b1c7e564 Branch: refs/heads/master Commit: b1c7e564251e7a404aa3d021c282349150fd4061 Parents: ed985cd Author: Benedikt Ritter Authored: Sun Apr 19 16:45:49 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:45:49 2015 +0200 -- .../org/apache/commons/text/names/NameString.java | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/b1c7e564/src/main/java/org/apache/commons/text/names/NameString.java -- diff --git a/src/main/java/org/apache/commons/text/names/NameString.java b/src/main/java/org/apache/commons/text/names/NameString.java index 54e2753..21898d3 100644 --- a/src/main/java/org/apache/commons/text/names/NameString.java +++ b/src/main/java/org/apache/commons/text/names/NameString.java @@ -98,14 +98,12 @@ final class NameString { */ void flip(String flipAroundChar) { String[] parts = this.str.split(flipAroundChar); -if (parts != null) { -if (parts.length == 2) { -this.str = String.format("%s %s", parts[1], parts[0]); -this.norm(); -} else if (parts.length > 2) { -throw new NameParseException( -"Can't flip around multiple '" + flipAroundChar + "' characters in namestring."); -} +if (parts.length == 2) { +this.str = String.format("%s %s", parts[1], parts[0]); +this.norm(); +} else if (parts.length > 2) { +throw new NameParseException( +"Can't flip around multiple '" + flipAroundChar + "' characters in namestring."); } }
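The removed guard was redundant because String.split never returns null; for input without the separator it returns a one-element array. A minimal sketch of the flip logic under that observation follows. FlipSketch is a hypothetical name, IllegalArgumentException stands in for NameParseException, and the whitespace normalization is an assumption about what norm() does.

```java
/** Hypothetical re-creation of the flip step; not the commons-text NameString class. */
final class FlipSketch {

    /** Swaps the parts before and after a single separator, e.g. "Last, First". */
    static String flip(String str, String flipAroundChar) {
        // split() always returns an array (never null), so no null check is needed.
        String[] parts = str.split(flipAroundChar);
        if (parts.length == 2) {
            String flipped = String.format("%s %s", parts[1], parts[0]);
            // Assumed normalization: trim and collapse runs of whitespace.
            return flipped.trim().replaceAll("\\s+", " ");
        } else if (parts.length > 2) {
            // Stand-in for NameParseException in the original code.
            throw new IllegalArgumentException(
                "Can't flip around multiple '" + flipAroundChar + "' characters in namestring.");
        }
        return str;
    }

    public static void main(String[] args) {
        System.out.println(flip("de la Cruz, Ana M.", ","));
        System.out.println(flip("Madonna", ","));
    }
}
```

Note that the separator is interpreted as a regular expression by split, so a caller passing a character such as "." or "|" would need to escape it.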
[12/13] [text] Add SANDBOX-498 to the list of fixed issues
Add SANDBOX-498 to the list of fixed issues Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/c1372c1f Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/c1372c1f Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/c1372c1f Branch: refs/heads/master Commit: c1372c1f9754434995c9a91fe47508946ff5744f Parents: 6d047a4 Author: Benedikt Ritter Authored: Sun Apr 19 17:14:22 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 17:14:22 2015 +0200 -- src/changes/changes.xml | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/c1372c1f/src/changes/changes.xml -- diff --git a/src/changes/changes.xml b/src/changes/changes.xml index fbb60b9..0a77677 100644 --- a/src/changes/changes.xml +++ b/src/changes/changes.xml @@ -22,6 +22,7 @@ +Improve HumanNameParser IP clearance for the names package Write user guide Work on the string metric, distance, and similarity definitions for the project
[11/13] [text] Better JavaDoc for HumanNameParser
Better JavaDoc for HumanNameParser Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/6d047a46 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/6d047a46 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/6d047a46 Branch: refs/heads/master Commit: 6d047a461f83017c8b723f4b28c0ad10f3f1dc36 Parents: b1c7e56 Author: Benedikt Ritter Authored: Sun Apr 19 17:13:11 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 17:13:11 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 99 +--- 1 file changed, 64 insertions(+), 35 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/6d047a46/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index b5c0aa3..5407d15 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -24,58 +24,87 @@ import java.util.Objects; import org.apache.commons.lang3.StringUtils; /** - * A parser capable of parsing name parts out of a single string. + * A parser capable of parsing name parts out of a single string. * + * Parsing examples + * * The code works by basically applying several Regexes in a certain order - * and removing (chopping) tokens off the original string. The parser consumes - * the tokens during its creation. + * and removing (chopping) tokens off the original string. The parser creates + * a {@link Name} object representing the parse result. Note that passing null + * to the {@link #parse(String)} method will result in an exception. * - * - * J. Walter Weatherman - * de la Cruz, Ana M. - * James C. ('Jimmy') O'Dell, Jr. - * - * - * and parses out the: - * - * - * leading initial (Like "J." in "J. 
Walter Weatherman") - * first name (or first initial in a name like 'R. Crumb') - * nicknames (like "Jimmy" in "James C. ('Jimmy') O'Dell, Jr.") - * middle names - * last name (including compound ones like "van der Sar' and "Ortega y Gasset"), and - * suffix (like 'Jr.', 'III') - * + * + * + * input + * Leading initial + * First name + * Nick name + * Middle name + * Last Name + * Suffix + * + * + * J. Walter Weatherman + * J. + * Walter + * + * + * Weatherman + * + * + * + * de la Cruz, Ana M. + * + * Ana + * + * M. + * de la Cruz + * + * + * + * James C. ('Jimmy') O'Dell, Jr. + * + * James + * Jimmy + * C. + * O'Dell + * Jr. + * + * * + * Sample usage + * + * HumanNameParser instances are immutable and can be reused for parsing multiple names: + * * - * Name name = new Name("Sérgio Vieira de Mello"); - * HumanNameParser parser = new HumanNameParser(name); - * String firstName = parser.getFirst(); - * String nickname = parser.getNickname(); + * HumanNameParser parser = new HumanNameParser(); + * Name parsedName = parser.parse("Sérgio Vieira de Mello") + * String firstName = parsedName.getFirstName(); + * String nickname = parsedName.getNickName(); * // ... + * + * Name nextName = parser.parse("James C. ('Jimmy') O'Dell, Jr.") + * String firstName = nextName.getFirstName(); + * String nickname = nextName.getNickName(); * * + * Further notes + * * The original code was written in http://jasonpriem.com/human-name-parse";>PHP - * and ported to http://tupilabs.github.io/HumanNameParser.java/";>Java. - * - * This implementation is based on the Java implementation, with additions - * suggested in https://issues.apache.org/jira/browse/SANDBOX-487";>SANDBOX-487. + * and ported to http://tupilabs.github.io/HumanNameParser.java/";>Java. This + * implementation is based on the Java implementation, with additions + * suggested in https://issues.apache.org/jira/browse/SANDBOX-487";>SANDBOX-487 + * and https://issues.apache.org/jira/browse/SANDBOX-498";>SANDBOX-498. 
* * This class is immutable. */ public final class HumanNameParser { -/** - * Suffixes found. - */ private final List suffixes; -/** - * Prefixes found. - */ private final List prefixes; /** - * Creates a parser given a string name. + * Creates a new parser. */ public HumanNameParser() { // TODO make this configurable @@ -90,7 +119,7 @@ public final class HumanNameParser { } /** - * Consumes the string and creates the name parts. + * Parses a name from the given string. * * @param name the name to parse. Must not be null. * @throws NameParseException if the parser fails to retrieve
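The JavaDoc above describes the approach as applying regexes in a fixed order and chopping matched tokens off the input. A minimal sketch of that idea follows. ChopSketch and its methods are hypothetical, the last-name regex is a simplified assumption, and only the first-name regex "(?i)^([^ ]+)" is taken from the diffs in this thread; the real chopWithRegex may differ in detail.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of regex chopping; not the commons-text NameString class. */
final class ChopSketch {

    private String remaining;

    ChopSketch(String input) {
        this.remaining = normalize(input);
    }

    /** Assumed normalization: trim and collapse runs of whitespace. */
    private static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    /** Returns the requested group of the first match and removes the whole match. */
    String chop(String regex, int group) {
        Matcher matcher = Pattern.compile(regex).matcher(remaining);
        if (!matcher.find()) {
            return "";
        }
        String token = matcher.group(group);
        remaining = normalize(
            remaining.substring(0, matcher.start()) + " " + remaining.substring(matcher.end()));
        return token == null ? "" : token;
    }

    String remaining() {
        return remaining;
    }

    public static void main(String[] args) {
        ChopSketch name = new ChopSketch("Ana M. Cruz");
        String last = name.chop("(?i)([^ ]+)$", 0);   // simplified last-name regex (assumption)
        String first = name.chop("(?i)^([^ ]+)", 0);  // first-name regex from the diff
        String middle = name.remaining();             // whatever is left over
        System.out.println(first + " | " + middle + " | " + last);
    }
}
```

The order of the chop calls matters: each call shrinks the remaining string, so a later regex only ever sees what earlier steps left behind.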
[06/13] [text] Use a shared parser instance for tests
Use a shared parser instance for tests Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/bbba0a32 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/bbba0a32 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/bbba0a32 Branch: refs/heads/master Commit: bbba0a327b7ad8873d176254ec2a550757911bda Parents: 1f6c5da Author: Benedikt Ritter Authored: Sun Apr 19 16:30:00 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:30:00 2015 +0200 -- .../commons/text/names/HumanNameParserTest.java | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/bbba0a32/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index d059ed4..314a949 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -33,32 +33,33 @@ import org.junit.Test; */ public class HumanNameParserTest { -private CSVParser parser; +private CSVParser inputParser; +private HumanNameParser nameParser; @Before public void setUp() throws Exception { -parser = CSVParser.parse( +inputParser = CSVParser.parse( HumanNameParserTest.class.getResource("testNames.txt"), Charset.forName("UTF-8"), CSVFormat.DEFAULT.withDelimiter('|').withHeader()); +nameParser = new HumanNameParser(); } @After public void tearDown() throws Exception { -if (parser != null) { -parser.close(); +if (inputParser != null) { +inputParser.close(); } } @Test(expected = NullPointerException.class) public void shouldThrowNullPointerException_WhenNullIsParsed() throws Exception { -HumanNameParser parser = new HumanNameParser(); -parser.parse(null); +nameParser.parse(null); } @Test public void testInputs() { -for 
(CSVRecord record : parser) { +for (CSVRecord record : inputParser) { validateRecord(record); } } @@ -70,8 +71,7 @@ public class HumanNameParserTest { * @param record a CSVRecord representing one record in the input file. */ private void validateRecord(CSVRecord record) { -HumanNameParser parser = new HumanNameParser(); -Name result = parser.parse(record.get(Colums.Name)); +Name result = nameParser.parse(record.get(Colums.Name)); long recordNum = record.getRecordNumber(); assertThat("Wrong LeadingInit in record " + recordNum,
[04/13] [text] Make HumanNameParser return a name object. Introduce a new wrapper object for strings to be parsed called NameString.
Make HumanNameParser return a name object. Introduce a new wrapper object for strings to be parsed called NameString. Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/685f9a86 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/685f9a86 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/685f9a86 Branch: refs/heads/master Commit: 685f9a864d46cc526b14e3a7476465c49d991478 Parents: 9a0cc85 Author: Benedikt Ritter Authored: Sun Apr 19 16:22:45 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:22:45 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 36 ++--- .../org/apache/commons/text/names/Name.java | 141 ++- .../apache/commons/text/names/NameString.java | 134 ++ .../commons/text/names/HumanNameParserTest.java | 24 ++-- .../commons/text/names/NameStringTest.java | 104 ++ .../org/apache/commons/text/names/NameTest.java | 104 -- 6 files changed, 315 insertions(+), 228 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/685f9a86/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index fa2433a..df8e55c 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -195,14 +195,14 @@ public class HumanNameParser { /** * Consumes the string and creates the name parts. * - * @param nameStr the name to parse. Must not be null. + * @param name the name to parse. Must not be null. * @throws NameParseException if the parser fails to retrieve the name parts. - * @throws NullPointerException if nameStr is null. + * @throws NullPointerException if name is null. 
*/ -public void parse(String nameStr) { -Objects.requireNonNull(nameStr, "Parameter 'nameStr' must not be null."); +public Name parse(String name) { +Objects.requireNonNull(name, "Parameter 'name' must not be null."); -Name name = new Name(nameStr); +NameString nameString = new NameString(name); String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; @@ -218,28 +218,30 @@ public class HumanNameParser { String firstRegex = "(?i)^([^ ]+)"; // get nickname, if there is one -this.nickname = name.chopWithRegex(nicknamesRegex, 2); +this.nickname = nameString.chopWithRegex(nicknamesRegex, 2); // get suffix, if there is one -this.suffix = name.chopWithRegex(suffixRegex, 1); +this.suffix = nameString.chopWithRegex(suffixRegex, 1); -// flip the before-comma and after-comma parts of the name -name.flip(","); +// flip the before-comma and after-comma parts of the nameString +nameString.flip(","); -// get the last name -this.last = name.chopWithRegex(lastRegex, 0); +// get the last nameString +this.last = nameString.chopWithRegex(lastRegex, 0); // get the first initial, if there is one -this.leadingInit = name.chopWithRegex(leadingInitRegex, 1); +this.leadingInit = nameString.chopWithRegex(leadingInitRegex, 1); -// get the first name -this.first = name.chopWithRegex(firstRegex, 0); +// get the first nameString +this.first = nameString.chopWithRegex(firstRegex, 0); if (StringUtils.isBlank(this.first)) { -throw new NameParseException("Couldn't find a first name in '{" + name.getStr() + "}'"); +throw new NameParseException("Couldn't find a first name in '{" + nameString.getStr() + "}'"); } -// if anything's left, that's the middle name -this.middle = name.getStr(); +// if anything's left, that's the middle nameString +this.middle = nameString.getStr(); + +return new Name(leadingInit, first, nickname, middle, last, suffix); } } 
http://git-wip-us.apache.org/repos/asf/commons-text/blob/685f9a86/src/main/java/org/apache/commons/text/names/Name.java -- diff --git a/src/main/java/org/apache/commons/text/names/Name.java b/src/main/java/org/apache/commons/text/names/Name.java index 0dd2560..3067ba5 100644 --- a/src/main/java/org/apache/commons/text/names/Name.java +++ b/src/main/java/org/apache/commons/text/names/Name.java @@ -16,119 +16,70 @@ */ package org.apache.co
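Returning a result object from parse, instead of storing the parts in parser fields, is what lets one parser instance be reused safely. A minimal sketch of that design follows. ParsedNameSketch and StatelessParserSketch are hypothetical, and the whitespace-based splitting is a deliberate simplification, not the regex logic of HumanNameParser.

```java
/** Hypothetical immutable result type; a simplified stand-in for Name. */
final class ParsedNameSketch {
    private final String first;
    private final String last;

    ParsedNameSketch(String first, String last) {
        this.first = first;
        this.last = last;
    }

    String getFirst() {
        return first;
    }

    String getLast() {
        return last;
    }
}

/** Hypothetical stateless parser: no fields are written during parse(). */
final class StatelessParserSketch {

    ParsedNameSketch parse(String name) {
        // Simplification: first token is the first name, last token the last name.
        String[] tokens = name.trim().split("\\s+");
        String first = tokens[0];
        String last = tokens.length > 1 ? tokens[tokens.length - 1] : "";
        return new ParsedNameSketch(first, last);
    }

    public static void main(String[] args) {
        StatelessParserSketch parser = new StatelessParserSketch();
        ParsedNameSketch a = parser.parse("Walter Weatherman");
        ParsedNameSketch b = parser.parse("Ana Cruz");
        // The first result is unaffected by the second parse call.
        System.out.println(a.getFirst() + " " + a.getLast());
        System.out.println(b.getFirst() + " " + b.getLast());
    }
}
```

With the earlier field-based design, the second parse call would have overwritten the values read through the parser's getters.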
[01/13] [text] Make parse method public
Repository: commons-text Updated Branches: refs/heads/master e8e85d9de -> bf8bfb0a4 Make parse method public Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/aa293500 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/aa293500 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/aa293500 Branch: refs/heads/master Commit: aa293500080d6872b3ac653dcf74a50cf8223ae5 Parents: e8e85d9 Author: Benedikt Ritter Authored: Sun Apr 19 15:58:16 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 15:58:16 2015 +0200 -- src/main/java/org/apache/commons/text/names/HumanNameParser.java | 4 +--- .../java/org/apache/commons/text/names/HumanNameParserTest.java | 1 + 2 files changed, 2 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/aa293500/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index 843685a..5088bba 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -134,8 +134,6 @@ public class HumanNameParser { "de la", "de", "del", "der", "di", "ibn", "la", "le", "san", "st", "ste", "van", "van der", "van den", "vel", "von" }); - -this.parse(); } /** @@ -224,7 +222,7 @@ public class HumanNameParser { * * @throws NameParseException if the parser fails to retrieve the name parts */ -private void parse() { +public void parse() { String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; http://git-wip-us.apache.org/repos/asf/commons-text/blob/aa293500/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java 
b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index 90e1dfa..5ff7805 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -65,6 +65,7 @@ public class HumanNameParserTest { */ private void validateRecord(CSVRecord record) { HumanNameParser parser = new HumanNameParser(record.get(Colums.Name)); +parser.parse(); long recordNum = record.getRecordNumber(); assertThat("Wrong LeadingInit in record " + recordNum,
svn commit: r1674710 - /commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/CloseShieldOutputStream.java
Author: ggregory Date: Mon Apr 20 00:25:55 2015 New Revision: 1674710 URL: http://svn.apache.org/r1674710 Log: Javadoc 8 fix. Modified: commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/CloseShieldOutputStream.java Modified: commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/CloseShieldOutputStream.java URL: http://svn.apache.org/viewvc/commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/CloseShieldOutputStream.java?rev=1674710&r1=1674709&r2=1674710&view=diff == --- commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/CloseShieldOutputStream.java (original) +++ commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/CloseShieldOutputStream.java Mon Apr 20 00:25:55 2015 @@ -24,7 +24,7 @@ import java.io.OutputStream; * This class is typically used in cases where an output stream needs to be * passed to a component that wants to explicitly close the stream even if * other components would still use the stream for output. - * + * * @version $Id$ * @since 1.4 */
svn commit: r1674708 - /commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/NullOutputStream.java
Author: ggregory Date: Mon Apr 20 00:02:45 2015 New Revision: 1674708 URL: http://svn.apache.org/r1674708 Log: Javadoc 8 fix. Modified: commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/NullOutputStream.java Modified: commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/NullOutputStream.java URL: http://svn.apache.org/viewvc/commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/NullOutputStream.java?rev=1674708&r1=1674707&r2=1674708&view=diff == --- commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/NullOutputStream.java (original) +++ commons/proper/io/trunk/src/main/java/org/apache/commons/io/output/NullOutputStream.java Mon Apr 20 00:02:45 2015 @@ -24,7 +24,7 @@ import java.io.OutputStream; * * This output stream has no destination (file/socket etc.) and all * bytes written to it are ignored and lost. - * + * * @version $Id$ */ public class NullOutputStream extends OutputStream {
svn commit: r1674633 - /commons/proper/csv/trunk/pom.xml
Author: britter Date: Sun Apr 19 15:57:54 2015 New Revision: 1674633 URL: http://svn.apache.org/r1674633 Log: SCM connection should point to trunk Modified: commons/proper/csv/trunk/pom.xml Modified: commons/proper/csv/trunk/pom.xml URL: http://svn.apache.org/viewvc/commons/proper/csv/trunk/pom.xml?rev=1674633&r1=1674632&r2=1674633&view=diff == --- commons/proper/csv/trunk/pom.xml (original) +++ commons/proper/csv/trunk/pom.xml Sun Apr 19 15:57:54 2015 @@ -99,9 +99,9 @@ CSV files of various types. - scm:svn:http://svn.apache.org/repos/asf/commons/proper/csv/tags/CSV_1.1 - scm:svn:https://svn.apache.org/repos/asf/commons/proper/csv/tags/CSV_1.1 -http://svn.apache.org/viewvc/commons/proper/csv/tags/CSV_1.1 + scm:svn:http://svn.apache.org/repos/asf/commons/proper/csv/trunk + scm:svn:https://svn.apache.org/repos/asf/commons/proper/csv/trunk +http://svn.apache.org/viewvc/commons/proper/csv/trunk
[05/12] [text] Remove state from HumanNameParser, making it immutable
Remove state from HumanNameParser, making it immutable Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/1f6c5dae Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/1f6c5dae Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/1f6c5dae Branch: refs/heads/SANDBOX-498 Commit: 1f6c5daecded67a17c07371a564f74ef623b3f29 Parents: 685f9a8 Author: Benedikt Ritter Authored: Sun Apr 19 16:28:37 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:28:37 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 141 +++ .../org/apache/commons/text/names/Name.java | 32 + 2 files changed, 51 insertions(+), 122 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/1f6c5dae/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index df8e55c..c47abde 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -61,135 +61,32 @@ import org.apache.commons.lang3.StringUtils; * This implementation is based on the Java implementation, with additions * suggested in https://issues.apache.org/jira/browse/SANDBOX-487";>SANDBOX-487. * - * This class is not thread-safe. + * This class is immutable. */ public class HumanNameParser { /** - * Leading init part. - */ -private String leadingInit; -/** - * First name. - */ -private String first; -/** - * Single nickname found in the name input. - */ -private String nickname; -/** - * Middle name. - */ -private String middle; -/** - * Last name. - */ -private String last; -/** - * Name suffix. - */ -private String suffix; -/** * Suffixes found. */ -private List suffixes; +private final List suffixes; /** * Prefixes found. 
*/ -private List prefixes; +private final List prefixes; /** * Creates a parser given a string name. */ public HumanNameParser() { -this.leadingInit = ""; -this.first = ""; -this.nickname = ""; -this.middle = ""; -this.last = ""; -this.suffix = ""; - -this.suffixes = Arrays.asList(new String[]{ +// TODO make this configurable +this.suffixes = Arrays.asList( "esq", "esquire", "jr", -"sr", "2", "ii", "iii", "iv"}); -this.prefixes = Arrays -.asList(new String[] { +"sr", "2", "ii", "iii", "iv"); +this.prefixes = Arrays.asList( "bar", "ben", "bin", "da", "dal", "de la", "de", "del", "der", "di", "ibn", "la", "le", "san", "st", "ste", "van", "van der", "van den", "vel", -"von" }); -} - -/** - * Gets the leading init part of the name. - * - * @return the leading init part of the name - */ -public String getLeadingInit() { -return leadingInit; -} - -/** - * Gets the first name. - * - * @return first name - */ -public String getFirst() { -return first; -} - -/** - * Gets the nickname. - * - * @return the nickname - */ -public String getNickname() { -return nickname; -} - -/** - * Gets the middle name. - * - * @return the middle name - */ -public String getMiddle() { -return middle; -} - -/** - * Gets the last name. - * - * @return the last name - */ -public String getLast() { -return last; -} - -/** - * Gets the suffix part of the name. - * - * @return the name suffix - */ -public String getSuffix() { -return suffix; -} - -/** - * Gets the name suffixes. - * - * @return the name suffixes - */ -public List getSuffixes() { -return suffixes; -} - -/** - * Gets the name prefixes. 
- * - * @return the name prefixes - */ -public List getPrefixes() { -return prefixes; +"von" ); } /** @@ -218,28 +115,28 @@ public class HumanNameParser { String firstRegex = "(?i)^([^ ]+)"; // get nickname, if there is one -this.nickname = nameString.chopWithRegex(nicknamesRegex, 2); +String nickname = nameString.chopWithRegex(nicknamesRegex, 2); // get suffix, if there is one -this.suffix = nameString.chopWithRegex(suf
[04/12] [text] Make HumanNameParser return a name object. Introduce a new wrapper object for strings to be parsed called NameString.
Make HumanNameParser return a name object. Introduce a new wrapper object for strings to be parsed called NameString. Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/685f9a86 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/685f9a86 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/685f9a86 Branch: refs/heads/SANDBOX-498 Commit: 685f9a864d46cc526b14e3a7476465c49d991478 Parents: 9a0cc85 Author: Benedikt Ritter Authored: Sun Apr 19 16:22:45 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:22:45 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 36 ++--- .../org/apache/commons/text/names/Name.java | 141 ++- .../apache/commons/text/names/NameString.java | 134 ++ .../commons/text/names/HumanNameParserTest.java | 24 ++-- .../commons/text/names/NameStringTest.java | 104 ++ .../org/apache/commons/text/names/NameTest.java | 104 -- 6 files changed, 315 insertions(+), 228 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/685f9a86/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index fa2433a..df8e55c 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -195,14 +195,14 @@ public class HumanNameParser { /** * Consumes the string and creates the name parts. * - * @param nameStr the name to parse. Must not be null. + * @param name the name to parse. Must not be null. * @throws NameParseException if the parser fails to retrieve the name parts. - * @throws NullPointerException if nameStr is null. + * @throws NullPointerException if name is null. 
*/ -public void parse(String nameStr) { -Objects.requireNonNull(nameStr, "Parameter 'nameStr' must not be null."); +public Name parse(String name) { +Objects.requireNonNull(name, "Parameter 'name' must not be null."); -Name name = new Name(nameStr); +NameString nameString = new NameString(name); String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; @@ -218,28 +218,30 @@ public class HumanNameParser { String firstRegex = "(?i)^([^ ]+)"; // get nickname, if there is one -this.nickname = name.chopWithRegex(nicknamesRegex, 2); +this.nickname = nameString.chopWithRegex(nicknamesRegex, 2); // get suffix, if there is one -this.suffix = name.chopWithRegex(suffixRegex, 1); +this.suffix = nameString.chopWithRegex(suffixRegex, 1); -// flip the before-comma and after-comma parts of the name -name.flip(","); +// flip the before-comma and after-comma parts of the nameString +nameString.flip(","); -// get the last name -this.last = name.chopWithRegex(lastRegex, 0); +// get the last nameString +this.last = nameString.chopWithRegex(lastRegex, 0); // get the first initial, if there is one -this.leadingInit = name.chopWithRegex(leadingInitRegex, 1); +this.leadingInit = nameString.chopWithRegex(leadingInitRegex, 1); -// get the first name -this.first = name.chopWithRegex(firstRegex, 0); +// get the first nameString +this.first = nameString.chopWithRegex(firstRegex, 0); if (StringUtils.isBlank(this.first)) { -throw new NameParseException("Couldn't find a first name in '{" + name.getStr() + "}'"); +throw new NameParseException("Couldn't find a first name in '{" + nameString.getStr() + "}'"); } -// if anything's left, that's the middle name -this.middle = name.getStr(); +// if anything's left, that's the middle nameString +this.middle = nameString.getStr(); + +return new Name(leadingInit, first, nickname, middle, last, suffix); } } 
http://git-wip-us.apache.org/repos/asf/commons-text/blob/685f9a86/src/main/java/org/apache/commons/text/names/Name.java -- diff --git a/src/main/java/org/apache/commons/text/names/Name.java b/src/main/java/org/apache/commons/text/names/Name.java index 0dd2560..3067ba5 100644 --- a/src/main/java/org/apache/commons/text/names/Name.java +++ b/src/main/java/org/apache/commons/text/names/Name.java @@ -16,119 +16,70 @@ */ package org.apac
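The commit above changes parse to hand back a result object instead of storing the parts on the parser. A minimal sketch of such an immutable value class follows. The class name NameSketch is hypothetical; the six constructor arguments and the getter names are taken from the calls visible elsewhere in this thread (new Name(leadingInit, first, nickname, middle, last, suffix), getLeadingInitial(), getFirstName() and so on). It is not the committed Name.java, whose diff is truncated here.

```java
// Hypothetical stand-in for the Name result object; not the committed class.
final class NameSketch {

    // All fields are final and set once, so instances can be shared freely.
    private final String leadingInitial;
    private final String firstName;
    private final String nickName;
    private final String middleName;
    private final String lastName;
    private final String suffix;

    NameSketch(String leadingInitial, String firstName, String nickName,
               String middleName, String lastName, String suffix) {
        this.leadingInitial = leadingInitial;
        this.firstName = firstName;
        this.nickName = nickName;
        this.middleName = middleName;
        this.lastName = lastName;
        this.suffix = suffix;
    }

    String getLeadingInitial() { return leadingInitial; }
    String getFirstName() { return firstName; }
    String getNickName() { return nickName; }
    String getMiddleName() { return middleName; }
    String getLastName() { return lastName; }
    String getSuffix() { return suffix; }

    public static void main(String[] args) {
        // Parts of "James C. ('Jimmy') O'Dell, Jr." as listed in the Javadoc table.
        NameSketch name = new NameSketch("", "James", "Jimmy", "C.", "O'Dell", "Jr.");
        System.out.println(name.getFirstName() + " " + name.getLastName()); // prints "James O'Dell"
    }
}
```

Because the result carries all the state, the parser itself no longer needs per-parse fields, which is what allows the later commits to make it immutable.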
[12/12] [text] Add SANDBOX-498 to the list of fixed issues
Add SANDBOX-498 to the list of fixed issues Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/c1372c1f Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/c1372c1f Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/c1372c1f Branch: refs/heads/SANDBOX-498 Commit: c1372c1f9754434995c9a91fe47508946ff5744f Parents: 6d047a4 Author: Benedikt Ritter Authored: Sun Apr 19 17:14:22 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 17:14:22 2015 +0200 -- src/changes/changes.xml | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/c1372c1f/src/changes/changes.xml -- diff --git a/src/changes/changes.xml b/src/changes/changes.xml index fbb60b9..0a77677 100644 --- a/src/changes/changes.xml +++ b/src/changes/changes.xml @@ -22,6 +22,7 @@ +Improve HumanNameParser IP clearance for the names package Write user guide Work on the string metric, distance, and similarity definitions for the project
[06/12] [text] Use a shared parser instance for tests
Use a shared parser instance for tests Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/bbba0a32 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/bbba0a32 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/bbba0a32 Branch: refs/heads/SANDBOX-498 Commit: bbba0a327b7ad8873d176254ec2a550757911bda Parents: 1f6c5da Author: Benedikt Ritter Authored: Sun Apr 19 16:30:00 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:30:00 2015 +0200 -- .../commons/text/names/HumanNameParserTest.java | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/bbba0a32/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index d059ed4..314a949 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -33,32 +33,33 @@ import org.junit.Test; */ public class HumanNameParserTest { -private CSVParser parser; +private CSVParser inputParser; +private HumanNameParser nameParser; @Before public void setUp() throws Exception { -parser = CSVParser.parse( +inputParser = CSVParser.parse( HumanNameParserTest.class.getResource("testNames.txt"), Charset.forName("UTF-8"), CSVFormat.DEFAULT.withDelimiter('|').withHeader()); +nameParser = new HumanNameParser(); } @After public void tearDown() throws Exception { -if (parser != null) { -parser.close(); +if (inputParser != null) { +inputParser.close(); } } @Test(expected = NullPointerException.class) public void shouldThrowNullPointerException_WhenNullIsParsed() throws Exception { -HumanNameParser parser = new HumanNameParser(); -parser.parse(null); +nameParser.parse(null); } @Test public void testInputs() { -for 
(CSVRecord record : parser) { +for (CSVRecord record : inputParser) { validateRecord(record); } } @@ -70,8 +71,7 @@ public class HumanNameParserTest { * @param record a CSVRecord representing one record in the input file. */ private void validateRecord(CSVRecord record) { -HumanNameParser parser = new HumanNameParser(); -Name result = parser.parse(record.get(Colums.Name)); +Name result = nameParser.parse(record.get(Colums.Name)); long recordNum = record.getRecordNumber(); assertThat("Wrong LeadingInit in record " + recordNum,
[02/12] [text] Pass the name to parse as parameter to the parse method
Pass the name to parse as parameter to the parse method Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/df7e7a7b Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/df7e7a7b Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/df7e7a7b Branch: refs/heads/SANDBOX-498 Commit: df7e7a7b0aba73a1bf09c41dbd32e913252a8707 Parents: aa29350 Author: Benedikt Ritter Authored: Sun Apr 19 16:02:55 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:02:55 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 52 ++-- .../commons/text/names/HumanNameParserTest.java | 4 +- 2 files changed, 16 insertions(+), 40 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/df7e7a7b/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index 5088bba..bf8f9ed 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -65,10 +65,6 @@ import org.apache.commons.lang3.StringUtils; public class HumanNameParser { /** - * Name parsed. - */ -private Name name; -/** * Leading init part. */ private String leadingInit; @@ -103,21 +99,8 @@ public class HumanNameParser { /** * Creates a parser given a string name. - * - * @param name string name - */ -public HumanNameParser(String name) { -this(new Name(name)); -} - -/** - * Creates a parser given a {@code Name} object. 
- * - * @param name {@code Name} */ -public HumanNameParser(Name name) { -this.name = name; - +public HumanNameParser() { this.leadingInit = ""; this.first = ""; this.nickname = ""; @@ -125,9 +108,9 @@ public class HumanNameParser { this.last = ""; this.suffix = ""; -this.suffixes = Arrays.asList(new String[] { +this.suffixes = Arrays.asList(new String[]{ "esq", "esquire", "jr", -"sr", "2", "ii", "iii", "iv" }); +"sr", "2", "ii", "iii", "iv"}); this.prefixes = Arrays .asList(new String[] { "bar", "ben", "bin", "da", "dal", @@ -137,15 +120,6 @@ public class HumanNameParser { } /** - * Gets the {@code Name} object. - * - * @return the {@code Name} object - */ -public Name getName() { -return name; -} - -/** * Gets the leading init part of the name. * * @return the leading init part of the name @@ -220,9 +194,11 @@ public class HumanNameParser { /** * Consumes the string and creates the name parts. * + * @param nameStr the name to parse. * @throws NameParseException if the parser fails to retrieve the name parts */ -public void parse() { +public void parse(String nameStr) { +Name name = new Name(nameStr); String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; @@ -238,28 +214,28 @@ public class HumanNameParser { String firstRegex = "(?i)^([^ ]+)"; // get nickname, if there is one -this.nickname = this.name.chopWithRegex(nicknamesRegex, 2); +this.nickname = name.chopWithRegex(nicknamesRegex, 2); // get suffix, if there is one -this.suffix = this.name.chopWithRegex(suffixRegex, 1); +this.suffix = name.chopWithRegex(suffixRegex, 1); // flip the before-comma and after-comma parts of the name -this.name.flip(","); +name.flip(","); // get the last name -this.last = this.name.chopWithRegex(lastRegex, 0); +this.last = name.chopWithRegex(lastRegex, 0); // get the first initial, if there is one -this.leadingInit = this.name.chopWithRegex(leadingInitRegex, 1); +this.leadingInit = 
name.chopWithRegex(leadingInitRegex, 1); // get the first name -this.first = this.name.chopWithRegex(firstRegex, 0); +this.first = name.chopWithRegex(firstRegex, 0); if (StringUtils.isBlank(this.first)) { -throw new NameParseException("Couldn't find a first name in '{" + this.name.getStr() + "}'"); +throw new NameParseException("Couldn't find a first name in '{" + name.getStr() + "}'"); } // if anything's left, that's the middle name -this.middle = this.name.getStr(); +this.middle = name
[10/12] [text] Condition will always be true
Condition will always be true

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/b1c7e564
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/b1c7e564
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/b1c7e564

Branch: refs/heads/SANDBOX-498
Commit: b1c7e564251e7a404aa3d021c282349150fd4061
Parents: ed985cd
Author: Benedikt Ritter
Authored: Sun Apr 19 16:45:49 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 16:45:49 2015 +0200

--
 .../org/apache/commons/text/names/NameString.java | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/b1c7e564/src/main/java/org/apache/commons/text/names/NameString.java
--
diff --git a/src/main/java/org/apache/commons/text/names/NameString.java b/src/main/java/org/apache/commons/text/names/NameString.java
index 54e2753..21898d3 100644
--- a/src/main/java/org/apache/commons/text/names/NameString.java
+++ b/src/main/java/org/apache/commons/text/names/NameString.java
@@ -98,14 +98,12 @@ final class NameString {
      */
     void flip(String flipAroundChar) {
         String[] parts = this.str.split(flipAroundChar);
-        if (parts != null) {
-            if (parts.length == 2) {
-                this.str = String.format("%s %s", parts[1], parts[0]);
-                this.norm();
-            } else if (parts.length > 2) {
-                throw new NameParseException(
-                        "Can't flip around multiple '" + flipAroundChar + "' characters in namestring.");
-            }
+        if (parts.length == 2) {
+            this.str = String.format("%s %s", parts[1], parts[0]);
+            this.norm();
+        } else if (parts.length > 2) {
+            throw new NameParseException(
+                    "Can't flip around multiple '" + flipAroundChar + "' characters in namestring.");
         }
     }
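The null check could be dropped because String.split never returns null; it always returns an array, possibly of length one. A standalone sketch of the flip step is shown below, under some assumptions: FlipSketch is a hypothetical class, it returns the flipped string rather than mutating a wrapped field, it throws IllegalArgumentException in place of NameParseException, and it quotes the separator and trims the parts instead of calling norm(). None of these details are claimed to match NameString.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating the flip step; not the committed NameString code.
final class FlipSketch {

    /** Swaps the parts before and after a single separator, e.g. "Last, First" to "First Last". */
    static String flip(String input, String separator) {
        // Quote the separator so it is matched literally rather than as a regex.
        // split never returns null, so no null check is needed on the result.
        String[] parts = input.split(Pattern.quote(separator));
        if (parts.length == 2) {
            return (parts[1].trim() + " " + parts[0].trim()).trim();
        } else if (parts.length > 2) {
            throw new IllegalArgumentException(
                    "Can't flip around multiple '" + separator + "' characters in: " + input);
        }
        // Zero or one part: nothing to flip.
        return input;
    }

    public static void main(String[] args) {
        System.out.println(flip("de la Cruz, Ana M.", ",")); // prints "Ana M. de la Cruz"
    }
}
```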
[07/12] [text] Fix typo
Fix typo Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/a942b4c0 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/a942b4c0 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/a942b4c0 Branch: refs/heads/SANDBOX-498 Commit: a942b4c02194a6f544f129e89e0f399d51c5c01a Parents: bbba0a3 Author: Benedikt Ritter Authored: Sun Apr 19 16:31:01 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:31:01 2015 +0200 -- .../commons/text/names/HumanNameParserTest.java | 16 1 file changed, 8 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/a942b4c0/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index 314a949..f6c9ba6 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -71,29 +71,29 @@ public class HumanNameParserTest { * @param record a CSVRecord representing one record in the input file. 
*/ private void validateRecord(CSVRecord record) { -Name result = nameParser.parse(record.get(Colums.Name)); +Name result = nameParser.parse(record.get(Columns.Name)); long recordNum = record.getRecordNumber(); assertThat("Wrong LeadingInit in record " + recordNum, -result.getLeadingInitial(), equalTo(record.get(Colums.LeadingInit))); +result.getLeadingInitial(), equalTo(record.get(Columns.LeadingInit))); assertThat("Wrong FirstName in record " + recordNum, -result.getFirstName(), equalTo(record.get(Colums.FirstName))); +result.getFirstName(), equalTo(record.get(Columns.FirstName))); assertThat("Wrong NickName in record " + recordNum, -result.getNickName(), equalTo(record.get(Colums.NickName))); +result.getNickName(), equalTo(record.get(Columns.NickName))); assertThat("Wrong MiddleName in record " + recordNum, -result.getMiddleName(), equalTo(record.get(Colums.MiddleName))); +result.getMiddleName(), equalTo(record.get(Columns.MiddleName))); assertThat("Wrong LastName in record " + recordNum, -result.getLastName(), equalTo(record.get(Colums.LastName))); +result.getLastName(), equalTo(record.get(Columns.LastName))); assertThat("Wrong Suffix in record " + recordNum, -result.getSuffix(), equalTo(record.get(Colums.Suffix))); +result.getSuffix(), equalTo(record.get(Columns.Suffix))); } -private enum Colums { +private enum Columns { Name,LeadingInit,FirstName,NickName,MiddleName,LastName,Suffix } }
[09/12] [text] Drop unused code from NameString and clean up NameStringTest
Drop unused code from NameString and clean up NameStringTest Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/ed985cd5 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/ed985cd5 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/ed985cd5 Branch: refs/heads/SANDBOX-498 Commit: ed985cd51220e956f516acecf1039defd0141d34 Parents: 9e34064 Author: Benedikt Ritter Authored: Sun Apr 19 16:44:32 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:44:32 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 5 +- .../apache/commons/text/names/NameString.java | 24 ++- .../commons/text/names/NameStringTest.java | 67 ++-- 3 files changed, 30 insertions(+), 66 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/ed985cd5/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index a29e375..b5c0aa3 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -100,6 +100,7 @@ public final class HumanNameParser { Objects.requireNonNull(name, "Parameter 'name' must not be null."); NameString nameString = new NameString(name); +// TODO compile regexes only once when the parser is created String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; @@ -132,11 +133,11 @@ public final class HumanNameParser { // get the first name String first = nameString.chopWithRegex(firstRegex, 0); if (StringUtils.isBlank(first)) { -throw new NameParseException("Couldn't find a first name in '{" + nameString.getStr() + "}'"); +throw new NameParseException("Couldn't find a first name in '{" + nameString.getWrappedString() + "}'"); } // if 
anything's left, that's the middle name -String middle = nameString.getStr(); +String middle = nameString.getWrappedString(); return new Name(leadingInit, first, nickname, middle, last, suffix); } http://git-wip-us.apache.org/repos/asf/commons-text/blob/ed985cd5/src/main/java/org/apache/commons/text/names/NameString.java -- diff --git a/src/main/java/org/apache/commons/text/names/NameString.java b/src/main/java/org/apache/commons/text/names/NameString.java index 8f606f2..54e2753 100644 --- a/src/main/java/org/apache/commons/text/names/NameString.java +++ b/src/main/java/org/apache/commons/text/names/NameString.java @@ -37,30 +37,20 @@ final class NameString { * * @param str encapsulated string. */ -public NameString(String str) { +NameString(String str) { this.str = str; } /** - * Gets the encapsulated string. + * Gets the wrapped string. * - * @return encapsulated string + * @return wrapped string */ -public String getStr() { +String getWrappedString() { return str; } /** - * Sets the encapsulated string value. - * - * @param str string value - */ -public void setStr(String str) { -this.str = str; -this.norm(); -} - -/** * Uses a regex to chop off and return part of the namestring. * There are two parts: first, it returns the matched substring, * and then it removes that substring from the encapsulated @@ -70,7 +60,7 @@ final class NameString { * @param submatchIndex which of the parenthesized submatches to use * @return the part of the namestring that got chopped off */ -public String chopWithRegex(String regex, int submatchIndex) { +String chopWithRegex(String regex, int submatchIndex) { String chopped = ""; Pattern pattern = Pattern.compile(regex); Matcher matcher = pattern.matcher(this.str); @@ -106,7 +96,7 @@ final class NameString { * @param flipAroundChar the character(s) demarcating the two halves you want to flip. 
* @throws NameParseException if a regex fails or a condition is not expected */ -public void flip(String flipAroundChar) { +void flip(String flipAroundChar) { String[] parts = this.str.split(flipAroundChar); if (parts != null) { if (parts.length == 2) { @@ -125,7 +115,7 @@ final class NameString { * Strips whitespace chars from ends, strips redundant whitespace, converts * white
[11/12] [text] Better JavaDoc for HumanNameParser
Better JavaDoc for HumanNameParser Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/6d047a46 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/6d047a46 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/6d047a46 Branch: refs/heads/SANDBOX-498 Commit: 6d047a461f83017c8b723f4b28c0ad10f3f1dc36 Parents: b1c7e56 Author: Benedikt Ritter Authored: Sun Apr 19 17:13:11 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 17:13:11 2015 +0200 -- .../commons/text/names/HumanNameParser.java | 99 +--- 1 file changed, 64 insertions(+), 35 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/6d047a46/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index b5c0aa3..5407d15 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -24,58 +24,87 @@ import java.util.Objects; import org.apache.commons.lang3.StringUtils; /** - * A parser capable of parsing name parts out of a single string. + * A parser capable of parsing name parts out of a single string. * + * Parsing examples + * * The code works by basically applying several Regexes in a certain order - * and removing (chopping) tokens off the original string. The parser consumes - * the tokens during its creation. + * and removing (chopping) tokens off the original string. The parser creates + * a {@link Name} object representing the parse result. Note that passing null + * to the {@link #parse(String)} method will result in an exception. * - * - * J. Walter Weatherman - * de la Cruz, Ana M. - * James C. ('Jimmy') O'Dell, Jr. - * - * - * and parses out the: - * - * - * leading initial (Like "J." in "J. 
Walter Weatherman") - * first name (or first initial in a name like 'R. Crumb') - * nicknames (like "Jimmy" in "James C. ('Jimmy') O'Dell, Jr.") - * middle names - * last name (including compound ones like "van der Sar' and "Ortega y Gasset"), and - * suffix (like 'Jr.', 'III') - * + * + * + * input + * Leading initial + * First name + * Nick name + * Middle name + * Last Name + * Suffix + * + * + * J. Walter Weatherman + * J. + * Walter + * + * + * Weatherman + * + * + * + * de la Cruz, Ana M. + * + * Ana + * + * M. + * de la Cruz + * + * + * + * James C. ('Jimmy') O'Dell, Jr. + * + * James + * Jimmy + * C. + * O'Dell + * Jr. + * + * * + * Sample usage + * + * HumanNameParser instances are immutable and can be reused for parsing multiple names: + * * - * Name name = new Name("Sérgio Vieira de Mello"); - * HumanNameParser parser = new HumanNameParser(name); - * String firstName = parser.getFirst(); - * String nickname = parser.getNickname(); + * HumanNameParser parser = new HumanNameParser(); + * Name parsedName = parser.parse("Sérgio Vieira de Mello") + * String firstName = parsedName.getFirstName(); + * String nickname = parsedName.getNickName(); * // ... + * + * Name nextName = parser.parse("James C. ('Jimmy') O'Dell, Jr.") + * String firstName = nextName.getFirstName(); + * String nickname = nextName.getNickName(); * * + * Further notes + * * The original code was written in <a href="http://jasonpriem.com/human-name-parse">PHP</a> - * and ported to <a href="http://tupilabs.github.io/HumanNameParser.java/">Java</a>. - * - * This implementation is based on the Java implementation, with additions - * suggested in <a href="https://issues.apache.org/jira/browse/SANDBOX-487">SANDBOX-487</a>. + * and ported to <a href="http://tupilabs.github.io/HumanNameParser.java/">Java</a>. This + * implementation is based on the Java implementation, with additions + * suggested in <a href="https://issues.apache.org/jira/browse/SANDBOX-487">SANDBOX-487</a> + * and <a href="https://issues.apache.org/jira/browse/SANDBOX-498">SANDBOX-498</a>. 
* * This class is immutable. */ public final class HumanNameParser { -/** - * Suffixes found. - */ private final List suffixes; -/** - * Prefixes found. - */ private final List prefixes; /** - * Creates a parser given a string name. + * Creates a new parser. */ public HumanNameParser() { // TODO make this configurable @@ -90,7 +119,7 @@ public final class HumanNameParser { } /** - * Consumes the string and creates the name parts. + * Parses a name from the given string. * * @param name the name to parse. Must not be null. * @throws NameParseException if the parser fails to retr
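The Javadoc above says the parser applies several regexes in order, and an earlier commit in this series leaves a TODO to compile them only once. A minimal sketch of how the suffix list could be turned into a single precompiled pattern at construction time follows. SuffixSketch and findSuffix are hypothetical names, String.join stands in for StringUtils.join to avoid the dependency, and the surrounding regex (optional comma, spaces, word boundary, end of input) is an assumption rather than the regex the parser actually uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of precompiling the suffix regex; not the committed parser.
final class SuffixSketch {

    // Compiled once in the constructor and never changed, so the instance is immutable.
    private final Pattern suffixPattern;

    SuffixSketch(List<String> suffixes) {
        // Same join as in the diff: yields "esq\.*|esquire\.*|...|iv\.*".
        String alternation = String.join("\\.*|", suffixes) + "\\.*";
        // Assumed shape: optional comma and spaces, then one suffix at the end of the input.
        this.suffixPattern = Pattern.compile("(?i),? *\\b(" + alternation + ")$");
    }

    /** Returns the trailing suffix of the name, or an empty string if there is none. */
    String findSuffix(String name) {
        Matcher matcher = suffixPattern.matcher(name);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        SuffixSketch sketch = new SuffixSketch(
                Arrays.asList("esq", "esquire", "jr", "sr", "2", "ii", "iii", "iv"));
        System.out.println(sketch.findSuffix("James C. ('Jimmy') O'Dell, Jr.")); // prints "Jr."
    }
}
```

Since a compiled Pattern is itself immutable and safe to share, one such instance could serve many parse calls, which fits the "immutable and can be reused" note in the Javadoc.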
[08/12] [text] Make classes in the name package final.
Make classes in the name package final. Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/9e340643 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/9e340643 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/9e340643 Branch: refs/heads/SANDBOX-498 Commit: 9e340643cfebd7b4088fd9946b3e92fc9f8cd394 Parents: a942b4c Author: Benedikt Ritter Authored: Sun Apr 19 16:32:31 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:32:31 2015 +0200 -- src/main/java/org/apache/commons/text/names/HumanNameParser.java | 2 +- .../java/org/apache/commons/text/names/NameParseException.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/9e340643/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index c47abde..a29e375 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -63,7 +63,7 @@ import org.apache.commons.lang3.StringUtils; * * This class is immutable. */ -public class HumanNameParser { +public final class HumanNameParser { /** * Suffixes found. http://git-wip-us.apache.org/repos/asf/commons-text/blob/9e340643/src/main/java/org/apache/commons/text/names/NameParseException.java -- diff --git a/src/main/java/org/apache/commons/text/names/NameParseException.java b/src/main/java/org/apache/commons/text/names/NameParseException.java index b09c2d6..4fe5eda 100644 --- a/src/main/java/org/apache/commons/text/names/NameParseException.java +++ b/src/main/java/org/apache/commons/text/names/NameParseException.java @@ -19,7 +19,7 @@ package org.apache.commons.text.names; /** * Name parse exception. 
*/ -public class NameParseException extends RuntimeException { +public final class NameParseException extends RuntimeException { /** * Serial UID.
[03/12] [text] Check for null inputs
Check for null inputs Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/9a0cc85a Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/9a0cc85a Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/9a0cc85a Branch: refs/heads/SANDBOX-498 Commit: 9a0cc85ad01dcf1f468736984cdd5dec0a7a3bf3 Parents: df7e7a7 Author: Benedikt Ritter Authored: Sun Apr 19 16:06:09 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 16:06:09 2015 +0200 -- .../java/org/apache/commons/text/names/HumanNameParser.java | 8 ++-- .../org/apache/commons/text/names/HumanNameParserTest.java | 6 ++ 2 files changed, 12 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/9a0cc85a/src/main/java/org/apache/commons/text/names/HumanNameParser.java -- diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java index bf8f9ed..fa2433a 100644 --- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java +++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java @@ -19,6 +19,7 @@ package org.apache.commons.text.names; import java.util.Arrays; import java.util.List; +import java.util.Objects; import org.apache.commons.lang3.StringUtils; @@ -194,10 +195,13 @@ public class HumanNameParser { /** * Consumes the string and creates the name parts. * - * @param nameStr the name to parse. - * @throws NameParseException if the parser fails to retrieve the name parts + * @param nameStr the name to parse. Must not be null. + * @throws NameParseException if the parser fails to retrieve the name parts. + * @throws NullPointerException if nameStr is null. 
*/ public void parse(String nameStr) { +Objects.requireNonNull(nameStr, "Parameter 'nameStr' must not be null."); + Name name = new Name(nameStr); String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*"; String prefixes = StringUtils.join(this.prefixes, " |") + " "; http://git-wip-us.apache.org/repos/asf/commons-text/blob/9a0cc85a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java index 478d19c..d43d2be 100644 --- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -50,6 +50,12 @@ public class HumanNameParserTest { } } +@Test(expected = NullPointerException.class) +public void shouldThrowNullPointerException_WhenNullIsParsed() throws Exception { +HumanNameParser parser = new HumanNameParser(); +parser.parse(null); +} + @Test public void testInputs() { for (CSVRecord record : parser) {
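For readers unfamiliar with it, Objects.requireNonNull(value, message) throws a NullPointerException carrying the given message when the value is null, and otherwise returns the value. A small standalone sketch of the fail-fast pattern added in this commit is below; NullCheckSketch and its parse method are hypothetical and only trim the input instead of parsing it.

```java
import java.util.Objects;

// Hypothetical illustration of the fail-fast null check; not the committed parser.
final class NullCheckSketch {

    static String parse(String name) {
        // Fails immediately with a descriptive message instead of a bare NPE deeper in the code.
        Objects.requireNonNull(name, "Parameter 'name' must not be null.");
        return name.trim();
    }

    public static void main(String[] args) {
        try {
            parse(null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage()); // prints "Parameter 'name' must not be null."
        }
    }
}
```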
[01/12] [text] Make parse method public
Repository: commons-text
Updated Branches:
  refs/heads/SANDBOX-498 [created] c1372c1f9

Make parse method public

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/aa293500
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/aa293500
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/aa293500

Branch: refs/heads/SANDBOX-498
Commit: aa293500080d6872b3ac653dcf74a50cf8223ae5
Parents: e8e85d9
Author: Benedikt Ritter
Authored: Sun Apr 19 15:58:16 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 15:58:16 2015 +0200

--
 src/main/java/org/apache/commons/text/names/HumanNameParser.java | 4 +---
 .../java/org/apache/commons/text/names/HumanNameParserTest.java | 1 +
 2 files changed, 2 insertions(+), 3 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/aa293500/src/main/java/org/apache/commons/text/names/HumanNameParser.java
--
diff --git a/src/main/java/org/apache/commons/text/names/HumanNameParser.java b/src/main/java/org/apache/commons/text/names/HumanNameParser.java
index 843685a..5088bba 100644
--- a/src/main/java/org/apache/commons/text/names/HumanNameParser.java
+++ b/src/main/java/org/apache/commons/text/names/HumanNameParser.java
@@ -134,8 +134,6 @@ public class HumanNameParser {
 "de la", "de", "del", "der", "di", "ibn", "la", "le", "san", "st", "ste", "van", "van der", "van den", "vel", "von"
 });
-
-this.parse();
 }

 /**
@@ -224,7 +222,7 @@ public class HumanNameParser {
  *
  * @throws NameParseException if the parser fails to retrieve the name parts
  */
-private void parse() {
+public void parse() {
 String suffixes = StringUtils.join(this.suffixes, "\\.*|") + "\\.*";
 String prefixes = StringUtils.join(this.prefixes, " |") + " ";

http://git-wip-us.apache.org/repos/asf/commons-text/blob/aa293500/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
--
diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
index 90e1dfa..5ff7805 100644
--- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
+++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
@@ -65,6 +65,7 @@ public class HumanNameParserTest {
  */
 private void validateRecord(CSVRecord record) {
 HumanNameParser parser = new HumanNameParser(record.get(Colums.Name));
+parser.parse();

 long recordNum = record.getRecordNumber();
 assertThat("Wrong LeadingInit in record " + recordNum,
[3/3] [text] Merge branch 'SANDBOX-497'
Merge branch 'SANDBOX-497'

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/e8e85d9d
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/e8e85d9d
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/e8e85d9d

Branch: refs/heads/master
Commit: e8e85d9deff253808bb6ba6ec3be8a1f874be27f
Parents: 65852f8 db197ab
Author: Benedikt Ritter
Authored: Sun Apr 19 15:51:55 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 15:51:55 2015 +0200

--
 NOTICE.txt | 4
 src/changes/changes.xml | 1 +
 2 files changed, 5 insertions(+)
--
[2/3] [text] SANDBOX-497 IP clearance for the names package
SANDBOX-497 IP clearance for the names package

Make clear that Commons Text only includes ported code from the
HumanNameParser PHP library. HumanNameParser library is licensed under MIT.

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/db197ab1
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/db197ab1
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/db197ab1

Branch: refs/heads/master
Commit: db197ab199281d65ca338f8f47b6099223a9cf8b
Parents: be2bcda
Author: Benedikt Ritter
Authored: Sun Apr 19 15:49:54 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 15:49:54 2015 +0200

--
 NOTICE.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/db197ab1/NOTICE.txt
--
diff --git a/NOTICE.txt b/NOTICE.txt
index f6f4633..80d91a0 100644
--- a/NOTICE.txt
+++ b/NOTICE.txt
@@ -4,6 +4,6 @@ Copyright 2001-2015 The Apache Software Foundation
 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).

-This product includes software from the HumanNameParser.php
-(https://github.com/jasonpriem/HumanNameParser.php) library,
-under the Apache License 2.0 (see: the o.a.c.t.names package).
+This product includes software ported from the HumanNameParser PHP library
+(https://github.com/jasonpriem/HumanNameParser.php), which is licensed under
+the MIT License (MIT) (see: the o.a.c.t.names package).
[1/3] [text] SANDBOX-497 IP clearance for the names package
Repository: commons-text
Updated Branches:
  refs/heads/master 65852f808 -> e8e85d9de

SANDBOX-497 IP clearance for the names package

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/be2bcda2
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/be2bcda2
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/be2bcda2

Branch: refs/heads/master
Commit: be2bcda218bdaf0bce78dac8bb60efb46027f74f
Parents: c4e8a3e
Author: Bruno P. Kinoshita
Authored: Sun Apr 19 21:59:12 2015 +1200
Committer: Bruno P. Kinoshita
Committed: Sun Apr 19 21:59:12 2015 +1200

--
 NOTICE.txt | 4
 src/changes/changes.xml | 1 +
 2 files changed, 5 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/be2bcda2/NOTICE.txt
--
diff --git a/NOTICE.txt b/NOTICE.txt
index 283320b..f6f4633 100644
--- a/NOTICE.txt
+++ b/NOTICE.txt
@@ -3,3 +3,7 @@ Copyright 2001-2015 The Apache Software Foundation

 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).
+
+This product includes software from the HumanNameParser.php
+(https://github.com/jasonpriem/HumanNameParser.php) library,
+under the Apache License 2.0 (see: the o.a.c.t.names package).

http://git-wip-us.apache.org/repos/asf/commons-text/blob/be2bcda2/src/changes/changes.xml
--
diff --git a/src/changes/changes.xml b/src/changes/changes.xml
index 2768608..fbb60b9 100644
--- a/src/changes/changes.xml
+++ b/src/changes/changes.xml
@@ -22,6 +22,7 @@
+IP clearance for the names package
 Write user guide
 Work on the string metric, distance, and similarity definitions for the project
 Human name parser
[text] SANDBOX-497 IP clearance for the names package
Repository: commons-text
Updated Branches:
  refs/heads/SANDBOX-497 [created] be2bcda21

SANDBOX-497 IP clearance for the names package

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/be2bcda2
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/be2bcda2
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/be2bcda2

Branch: refs/heads/SANDBOX-497
Commit: be2bcda218bdaf0bce78dac8bb60efb46027f74f
Parents: c4e8a3e
Author: Bruno P. Kinoshita
Authored: Sun Apr 19 21:59:12 2015 +1200
Committer: Bruno P. Kinoshita
Committed: Sun Apr 19 21:59:12 2015 +1200

--
 NOTICE.txt | 4
 src/changes/changes.xml | 1 +
 2 files changed, 5 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/be2bcda2/NOTICE.txt
--
diff --git a/NOTICE.txt b/NOTICE.txt
index 283320b..f6f4633 100644
--- a/NOTICE.txt
+++ b/NOTICE.txt
@@ -3,3 +3,7 @@ Copyright 2001-2015 The Apache Software Foundation

 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).
+
+This product includes software from the HumanNameParser.php
+(https://github.com/jasonpriem/HumanNameParser.php) library,
+under the Apache License 2.0 (see: the o.a.c.t.names package).

http://git-wip-us.apache.org/repos/asf/commons-text/blob/be2bcda2/src/changes/changes.xml
--
diff --git a/src/changes/changes.xml b/src/changes/changes.xml
index 2768608..fbb60b9 100644
--- a/src/changes/changes.xml
+++ b/src/changes/changes.xml
@@ -22,6 +22,7 @@
+IP clearance for the names package
 Write user guide
 Work on the string metric, distance, and similarity definitions for the project
 Human name parser
[1/5] [text] Use Commons CSV to parse input data for ParserTest
Repository: commons-text Updated Branches: refs/heads/master c4e8a3e0e -> 65852f808 Use Commons CSV to parse input data for ParserTest Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/49ae4553 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/49ae4553 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/49ae4553 Branch: refs/heads/master Commit: 49ae4553c1599ab82d77cf29684878e4b11b610f Parents: c4e8a3e Author: Benedikt Ritter Authored: Sun Apr 19 11:38:48 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 11:38:48 2015 +0200 -- pom.xml | 7 ++ .../apache/commons/text/names/ParserTest.java | 100 +-- .../org/apache/commons/text/names/testNames.txt | 63 ++-- 3 files changed, 87 insertions(+), 83 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/49ae4553/pom.xml -- diff --git a/pom.xml b/pom.xml index cdfa9c2..f26a8a5 100644 --- a/pom.xml +++ b/pom.xml @@ -84,6 +84,13 @@ 1.3 test + + + org.apache.commons + commons-csv + 1.1 + test + http://git-wip-us.apache.org/repos/asf/commons-text/blob/49ae4553/src/test/java/org/apache/commons/text/names/ParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/ParserTest.java b/src/test/java/org/apache/commons/text/names/ParserTest.java index 3e4c9d8..6a3371f 100644 --- a/src/test/java/org/apache/commons/text/names/ParserTest.java +++ b/src/test/java/org/apache/commons/text/names/ParserTest.java @@ -22,10 +22,17 @@ import java.io.BufferedReader; import java.io.File; import java.io.FileReader; import java.io.IOException; +import java.nio.charset.Charset; import java.util.logging.Logger; +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; import org.apache.commons.lang3.StringUtils; +import org.junit.After; +import org.junit.Before; import org.junit.BeforeClass; +import org.junit.Ignore; import 
org.junit.Test; /** @@ -33,70 +40,59 @@ import org.junit.Test; */ public class ParserTest { -private static final Logger LOGGER = Logger.getLogger(ParserTest.class.getName()); +private CSVParser parser; -private static File testNames = null; +@Before +public void setUp() throws Exception { +parser = CSVParser.parse( +ParserTest.class.getResource("testNames.txt"), +Charset.forName("UTF-8"), +CSVFormat.DEFAULT.withDelimiter('|').withHeader()); +} -@BeforeClass -public static void setUp() { -testNames = new File(ParserTest.class.getResource("/org/apache/commons/text/names/testNames.txt").getFile()); +@After +public void tearDown() throws Exception { +if (parser != null) { +parser.close(); +} } @Test -public void testAll() throws IOException { -BufferedReader buffer = null; -FileReader reader = null; - -try { -reader = new FileReader(testNames); -buffer = new BufferedReader(reader); - -String line = null; -while ((line = buffer.readLine()) != null) { -if (StringUtils.isBlank(line)) { -LOGGER.warning("Empty line in testNames.txt"); -continue; -} - -String[] tokens = line.split("\\|"); -if (tokens.length != 7) { -LOGGER.warning(String.format("Invalid line in testNames.txt: %s", line)); -continue; -} - -validateLine(tokens); -} -} finally { -if (reader != null) -reader.close(); -if (buffer != null) -buffer.close(); +public void testInputs() { +for (CSVRecord record : parser) { +validateRecord(record); } } + /** * Validates a line in the testNames.txt file. * - * @param tokens the tokens with leading spaces + * @param record the tokens with leading spaces */ -private void validateLine(String[] tokens) { -String name = tokens[0].trim(); - -String leadingInit = tokens[1].trim(); -String first = tokens[2].trim(); -String nickname = tokens[3].trim(); -String middle = tokens[4].trim(); -String last = tokens[5].trim(); -String suffix = tokens[6].trim(); - -HumanNameParser parser = new HumanNameParser(name); - -assertEquals(
[2/5] [text] Rename ParserTest to match the name of the class under test
Rename ParserTest to match the name of the class under test Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/b6dbc7ae Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/b6dbc7ae Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/b6dbc7ae Branch: refs/heads/master Commit: b6dbc7ae77721054f1f7fe87c9913fe34a36278a Parents: 49ae455 Author: Benedikt Ritter Authored: Sun Apr 19 11:39:39 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 11:39:39 2015 +0200 -- .../commons/text/names/HumanNameParserTest.java | 90 ++ .../apache/commons/text/names/ParserTest.java | 98 2 files changed, 90 insertions(+), 98 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/b6dbc7ae/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java new file mode 100644 index 000..59f98a1 --- /dev/null +++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.commons.text.names; + +import static org.junit.Assert.assertEquals; + +import java.nio.charset.Charset; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +/** + * Tests the {@code HumanNameParser} class. + */ +public class HumanNameParserTest { + +private CSVParser parser; + +@Before +public void setUp() throws Exception { +parser = CSVParser.parse( +HumanNameParserTest.class.getResource("testNames.txt"), +Charset.forName("UTF-8"), +CSVFormat.DEFAULT.withDelimiter('|').withHeader()); +} + +@After +public void tearDown() throws Exception { +if (parser != null) { +parser.close(); +} +} + +@Test +public void testInputs() { +for (CSVRecord record : parser) { +validateRecord(record); +} +} + + +/** + * Validates a line in the testNames.txt file. + * + * @param record the tokens with leading spaces + */ +private void validateRecord(CSVRecord record) { +HumanNameParser parser = new HumanNameParser(record.get(Colums.Name)); + +assertEquals("Wrong LeadingInit in record " + record.getRecordNumber(), +record.get(Colums.LeadingInit), parser.getLeadingInit()); + +assertEquals("Wrong FirstName in record " + record.getRecordNumber(), +record.get(Colums.FirstName), parser.getFirst()); + +assertEquals("Wrong NickName in record " + record.getRecordNumber(), +record.get(Colums.NickName), parser.getNickname()); + +assertEquals("Wrong MiddleName in record " + record.getRecordNumber(), +record.get(Colums.MiddleName), parser.getMiddle()); + +assertEquals("Wrong LastName in record " + record.getRecordNumber(), +record.get(Colums.LastName), parser.getLast()); + +assertEquals("Wrong Suffix in record " + record.getRecordNumber(), +record.get(Colums.Suffix), parser.getSuffix()); +} + +private enum Colums { +Name,LeadingInit,FirstName,NickName,MiddleName,LastName,Suffix +} +} 
http://git-wip-us.apache.org/repos/asf/commons-text/blob/b6dbc7ae/src/test/java/org/apache/commons/text/names/ParserTest.java -- diff --git a/src/test/java/org/apache/commons/text/names/ParserTest.java b/src/test/java/org/apache/commons/text/names/ParserTest.java deleted file mode 100644 index 6a3371f..000 --- a/src/tes
[5/5] [text] Merge branch 'refactor-nameparsertest'
Merge branch 'refactor-nameparsertest'

Use Commons CSV to parse the testNames.txt file instead of implementing a
parsing algorithm in the test. The input file has been adjusted for easier use
in the test: a header record has been added for easier retrieval of the fields
during validation. Furthermore, the unnecessary spaces in testNames.txt have
been removed. This way we don't need to call trim on every field, which makes
the test code more readable.

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/65852f80
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/65852f80
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/65852f80

Branch: refs/heads/master
Commit: 65852f808ce5b7d0652b479cc237e7531a7eb395
Parents: c4e8a3e 1ccc7b3
Author: Benedikt Ritter
Authored: Sun Apr 19 11:45:00 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 11:45:00 2015 +0200

--
 pom.xml | 7 ++
 .../commons/text/names/HumanNameParserTest.java | 92 +
 .../apache/commons/text/names/ParserTest.java | 102 ---
 .../org/apache/commons/text/names/testNames.txt | 63 ++--
 4 files changed, 131 insertions(+), 133 deletions(-)
--
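The idea described in this merge message (a header record plus pipe-delimited fields, so that values can be looked up by column name without trimming) can be illustrated without any dependency. This is only a plain-Java sketch and does not use Commons CSV, which is what the commits themselves rely on; the PipeRecordsSketch class and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: maps pipe-delimited lines to records keyed by header name. */
final class PipeRecordsSketch {

    private PipeRecordsSketch() {
    }

    /** Treats the first line as the header and every later line as one record. */
    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> records = new ArrayList<>();
        if (lines.isEmpty()) {
            return records;
        }
        // A limit of -1 keeps trailing empty fields such as "Cher||".
        String[] header = lines.get(0).split("\\|", -1);
        for (int i = 1; i < lines.size(); i++) {
            String[] fields = lines.get(i).split("\\|", -1);
            if (fields.length != header.length) {
                throw new IllegalArgumentException("Wrong field count in record " + i);
            }
            Map<String, String> record = new LinkedHashMap<>();
            for (int c = 0; c < header.length; c++) {
                record.put(header[c], fields[c]);
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample data, not the contents of testNames.txt.
        List<String> lines = Arrays.asList(
                "Name|FirstName|LastName",
                "Jane Doe|Jane|Doe");
        for (Map<String, String> record : parse(lines)) {
            System.out.println(record.get("FirstName") + " / " + record.get("LastName"));
        }
    }
}
```

Unlike a real CSV parser, this sketch does not handle quoting or escaped delimiters, which is one reason to prefer a library such as Commons CSV in the actual test.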
[3/5] [text] Use hamcrest matchers for checking results
Use hamcrest matchers for checking results

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/734b777c
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/734b777c
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/734b777c

Branch: refs/heads/master
Commit: 734b777c09b6a637dbfa2eb341eeaa573dbe39bb
Parents: b6dbc7a
Author: Benedikt Ritter
Authored: Sun Apr 19 11:43:13 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 11:43:13 2015 +0200

--
 .../commons/text/names/HumanNameParserTest.java | 27 ++--
 1 file changed, 14 insertions(+), 13 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/734b777c/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
--
diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
index 59f98a1..86cd304 100644
--- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
+++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
@@ -16,7 +16,8 @@
  */
 package org.apache.commons.text.names;

-import static org.junit.Assert.assertEquals;
+import static org.hamcrest.Matchers.equalTo;
+import static org.junit.Assert.assertThat;

 import java.nio.charset.Charset;

@@ -65,23 +66,23 @@ public class HumanNameParserTest {
 private void validateRecord(CSVRecord record) {
 HumanNameParser parser = new HumanNameParser(record.get(Colums.Name));

-assertEquals("Wrong LeadingInit in record " + record.getRecordNumber(),
-record.get(Colums.LeadingInit), parser.getLeadingInit());
+assertThat("Wrong LeadingInit in record " + record.getRecordNumber(),
+parser.getLeadingInit(), equalTo(record.get(Colums.LeadingInit)));

-assertEquals("Wrong FirstName in record " + record.getRecordNumber(),
-record.get(Colums.FirstName), parser.getFirst());
+assertThat("Wrong FirstName in record " + record.getRecordNumber(),
+parser.getFirst(), equalTo(record.get(Colums.FirstName)));

-assertEquals("Wrong NickName in record " + record.getRecordNumber(),
-record.get(Colums.NickName), parser.getNickname());
+assertThat("Wrong NickName in record " + record.getRecordNumber(),
+parser.getNickname(), equalTo(record.get(Colums.NickName)));

-assertEquals("Wrong MiddleName in record " + record.getRecordNumber(),
-record.get(Colums.MiddleName), parser.getMiddle());
+assertThat("Wrong MiddleName in record " + record.getRecordNumber(),
+parser.getMiddle(), equalTo(record.get(Colums.MiddleName)));

-assertEquals("Wrong LastName in record " + record.getRecordNumber(),
-record.get(Colums.LastName), parser.getLast());
+assertThat("Wrong LastName in record " + record.getRecordNumber(),
+parser.getLast(), equalTo(record.get(Colums.LastName)));

-assertEquals("Wrong Suffix in record " + record.getRecordNumber(),
-record.get(Colums.Suffix), parser.getSuffix());
+assertThat("Wrong Suffix in record " + record.getRecordNumber(),
+parser.getSuffix(), equalTo(record.get(Colums.Suffix)));
 }

 private enum Colums {
[4/5] [text] Extract local variable
Extract local variable

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/1ccc7b3d
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/1ccc7b3d
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/1ccc7b3d

Branch: refs/heads/master
Commit: 1ccc7b3d1348cf3b981f0a8a4ee885a0a6e8c3f2
Parents: 734b777
Author: Benedikt Ritter
Authored: Sun Apr 19 11:44:39 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 11:44:39 2015 +0200

--
 .../commons/text/names/HumanNameParserTest.java | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/1ccc7b3d/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
--
diff --git a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
index 86cd304..90e1dfa 100644
--- a/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
+++ b/src/test/java/org/apache/commons/text/names/HumanNameParserTest.java
@@ -61,27 +61,28 @@ public class HumanNameParserTest {
 /**
  * Validates a line in the testNames.txt file.
  *
- * @param record the tokens with leading spaces
+ * @param record a CSVRecord representing one record in the input file.
  */
 private void validateRecord(CSVRecord record) {
 HumanNameParser parser = new HumanNameParser(record.get(Colums.Name));

-assertThat("Wrong LeadingInit in record " + record.getRecordNumber(),
+long recordNum = record.getRecordNumber();
+assertThat("Wrong LeadingInit in record " + recordNum,
 parser.getLeadingInit(), equalTo(record.get(Colums.LeadingInit)));

-assertThat("Wrong FirstName in record " + record.getRecordNumber(),
+assertThat("Wrong FirstName in record " + recordNum,
 parser.getFirst(), equalTo(record.get(Colums.FirstName)));

-assertThat("Wrong NickName in record " + record.getRecordNumber(),
+assertThat("Wrong NickName in record " + recordNum,
 parser.getNickname(), equalTo(record.get(Colums.NickName)));

-assertThat("Wrong MiddleName in record " + record.getRecordNumber(),
+assertThat("Wrong MiddleName in record " + recordNum,
 parser.getMiddle(), equalTo(record.get(Colums.MiddleName)));

-assertThat("Wrong LastName in record " + record.getRecordNumber(),
+assertThat("Wrong LastName in record " + recordNum,
 parser.getLast(), equalTo(record.get(Colums.LastName)));

-assertThat("Wrong Suffix in record " + record.getRecordNumber(),
+assertThat("Wrong Suffix in record " + recordNum,
 parser.getSuffix(), equalTo(record.get(Colums.Suffix)));
 }
[2/6] [text] Move classes from the internal package into the package where they are used and make them package private.
Move classes from the internal package into the package where they are used and make them package private. Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/df681238 Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/df681238 Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/df681238 Branch: refs/heads/master Commit: df681238bf5bcb2fece950b644a7d00a712d0cc8 Parents: 75db6de Author: Benedikt Ritter Authored: Sun Apr 19 10:32:13 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 10:37:50 2015 +0200 -- .../commons/text/similarity/CosineDistance.java | 6 -- .../apache/commons/text/similarity/Counter.java | 60 .../commons/text/similarity/RegexTokenizer.java | 50 .../commons/text/similarity/Tokenizer.java | 34 +++ .../text/similarity/internal/Counter.java | 60 .../similarity/internal/RegexTokenizer.java | 50 .../text/similarity/internal/Tokenizer.java | 34 --- .../text/similarity/internal/package-info.java | 23 .../commons/text/similarity/package-info.java | 2 +- 9 files changed, 145 insertions(+), 174 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/df681238/src/main/java/org/apache/commons/text/similarity/CosineDistance.java -- diff --git a/src/main/java/org/apache/commons/text/similarity/CosineDistance.java b/src/main/java/org/apache/commons/text/similarity/CosineDistance.java index 2fa4515..c5e8853 100644 --- a/src/main/java/org/apache/commons/text/similarity/CosineDistance.java +++ b/src/main/java/org/apache/commons/text/similarity/CosineDistance.java @@ -18,17 +18,11 @@ package org.apache.commons.text.similarity; import java.util.Map; -import org.apache.commons.text.similarity.internal.Counter; -import org.apache.commons.text.similarity.internal.RegexTokenizer; -import org.apache.commons.text.similarity.internal.Tokenizer; - /** * Measures the cosine distance between two character sequences. 
* * It utilizes the CosineSimilarity to compute the distance. Character sequences * are converted into vectors through a simple tokenizer that works with - * - * @see org.apache.commons.text.similarity.internal.RegexTokenizer */ public class CosineDistance implements EditDistance { /** http://git-wip-us.apache.org/repos/asf/commons-text/blob/df681238/src/main/java/org/apache/commons/text/similarity/Counter.java -- diff --git a/src/main/java/org/apache/commons/text/similarity/Counter.java b/src/main/java/org/apache/commons/text/similarity/Counter.java new file mode 100644 index 000..5eefc51 --- /dev/null +++ b/src/main/java/org/apache/commons/text/similarity/Counter.java @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.commons.text.similarity; + +import java.util.HashMap; +import java.util.Map; + +/** + * Java implementation of Python's collections Counter module. + * + * It counts how many times each element provided occurred in an array and + * returns a dict with the element as key and the count as value. 
+ * + * @see https://docs.python.org/dev/library/collections.html#collections.Counter";> + * https://docs.python.org/dev/library/collections.html#collections.Counter + */ +final class Counter { + +/** + * Hidden constructor. + */ +private Counter() { +super(); +} + +/** + * It counts how many times each element provided occurred in an array and + * returns a dict with the element as key and the count as value. + * + * @param tokens array of tokens + * @return dict, where the elements are key, and the count the value + */ +public static Map of(CharSequence[] tokens) { +final Map innerCounter = new HashMap(); +for (CharSequence token : tokens) { +if (innerCounter
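The Counter diff above is cut off in this archive, so its remaining lines are not reproduced here. As a rough illustration of the behaviour its Javadoc describes (counting how often each token occurs and returning a map from token to count), a minimal sketch could look like the following. CounterSketch is a hypothetical name, and keying the map by String via toString() is an assumption of this sketch, not necessarily what the committed class does.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical token counter, loosely modelled on the Javadoc in the diff above. */
final class CounterSketch {

    /** Hidden constructor, as this is a utility class. */
    private CounterSketch() {
    }

    /**
     * Counts how many times each token occurs.
     *
     * @param tokens array of tokens
     * @return map with the token text as key and its number of occurrences as value
     */
    static Map<String, Integer> of(CharSequence[] tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (CharSequence token : tokens) {
            // Assumption: keys are normalised to String so equals/hashCode are well defined.
            counts.merge(token.toString(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = of(new CharSequence[] {"a", "b", "a"});
        System.out.println(counts.get("a"));
        System.out.println(counts.get("b"));
    }
}
```

Map.merge requires Java 8 or later; on older runtimes the same effect needs an explicit containsKey check followed by put.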
[1/6] [text] Remove since tags for release 1.0
Repository: commons-text Updated Branches: refs/heads/master d39dbb548 -> c4e8a3e0e Remove since tags for release 1.0 Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/75db6def Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/75db6def Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/75db6def Branch: refs/heads/master Commit: 75db6defc919d4113a246a5832b4815b14b7a524 Parents: d39dbb5 Author: Benedikt Ritter Authored: Sun Apr 19 10:28:46 2015 +0200 Committer: Benedikt Ritter Committed: Sun Apr 19 10:37:41 2015 +0200 -- src/main/java/org/apache/commons/text/diff/CommandVisitor.java | 1 - src/main/java/org/apache/commons/text/diff/DeleteCommand.java | 1 - src/main/java/org/apache/commons/text/diff/EditCommand.java| 1 - src/main/java/org/apache/commons/text/diff/EditScript.java | 1 - src/main/java/org/apache/commons/text/diff/InsertCommand.java | 1 - src/main/java/org/apache/commons/text/diff/KeepCommand.java| 1 - src/main/java/org/apache/commons/text/diff/ReplacementsFinder.java | 1 - .../java/org/apache/commons/text/diff/ReplacementsHandler.java | 1 - src/main/java/org/apache/commons/text/diff/StringsComparator.java | 2 -- src/main/java/org/apache/commons/text/names/HumanNameParser.java | 2 -- src/main/java/org/apache/commons/text/names/Name.java | 2 -- .../java/org/apache/commons/text/names/NameParseException.java | 2 -- .../java/org/apache/commons/text/similarity/CosineDistance.java| 1 - .../java/org/apache/commons/text/similarity/CosineSimilarity.java | 2 -- src/main/java/org/apache/commons/text/similarity/EditDistance.java | 1 - .../java/org/apache/commons/text/similarity/EditDistanceFrom.java | 1 - src/main/java/org/apache/commons/text/similarity/FuzzyScore.java | 2 -- .../java/org/apache/commons/text/similarity/HammingDistance.java | 2 -- .../org/apache/commons/text/similarity/JaroWrinklerDistance.java | 2 -- 
.../org/apache/commons/text/similarity/LevenshteinDistance.java| 2 -- .../java/org/apache/commons/text/similarity/internal/Counter.java | 1 - .../apache/commons/text/similarity/internal/RegexTokenizer.java| 2 -- .../org/apache/commons/text/similarity/internal/Tokenizer.java | 1 - src/test/java/org/apache/commons/text/names/NameTest.java | 2 -- src/test/java/org/apache/commons/text/names/ParserTest.java| 2 -- .../org/apache/commons/text/similarity/CosineDistanceTest.java | 2 -- 26 files changed, 39 deletions(-) -- http://git-wip-us.apache.org/repos/asf/commons-text/blob/75db6def/src/main/java/org/apache/commons/text/diff/CommandVisitor.java -- diff --git a/src/main/java/org/apache/commons/text/diff/CommandVisitor.java b/src/main/java/org/apache/commons/text/diff/CommandVisitor.java index 7e5f40f..c0daed4 100644 --- a/src/main/java/org/apache/commons/text/diff/CommandVisitor.java +++ b/src/main/java/org/apache/commons/text/diff/CommandVisitor.java @@ -118,7 +118,6 @@ package org.apache.commons.text.diff; * * * @param object type - * @since 1.0 */ public interface CommandVisitor { http://git-wip-us.apache.org/repos/asf/commons-text/blob/75db6def/src/main/java/org/apache/commons/text/diff/DeleteCommand.java -- diff --git a/src/main/java/org/apache/commons/text/diff/DeleteCommand.java b/src/main/java/org/apache/commons/text/diff/DeleteCommand.java index 8173718..3e0d8df 100644 --- a/src/main/java/org/apache/commons/text/diff/DeleteCommand.java +++ b/src/main/java/org/apache/commons/text/diff/DeleteCommand.java @@ -30,7 +30,6 @@ package org.apache.commons.text.diff; * @see EditScript * * @param object type - * @since 1.0 */ public class DeleteCommand extends EditCommand { http://git-wip-us.apache.org/repos/asf/commons-text/blob/75db6def/src/main/java/org/apache/commons/text/diff/EditCommand.java -- diff --git a/src/main/java/org/apache/commons/text/diff/EditCommand.java b/src/main/java/org/apache/commons/text/diff/EditCommand.java index 7920206..49f795c 100644 --- 
a/src/main/java/org/apache/commons/text/diff/EditCommand.java +++ b/src/main/java/org/apache/commons/text/diff/EditCommand.java @@ -49,7 +49,6 @@ package org.apache.commons.text.diff; * @see EditScript * * @param object type - * @since 1.0 */ public abstract class EditCommand { http://git-wip-us.apache.org/repos/asf/commons-text/blob/75db6def/src/main/java/org/apache/commons/text/diff/EditScript.java -- diff --git a/src/main/java/org/apache/commons/t
[6/6] [text] Merge branch 'prerelease-cleanups'
Merge branch 'prerelease-cleanups'

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/c4e8a3e0
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/c4e8a3e0
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/c4e8a3e0

Branch: refs/heads/master
Commit: c4e8a3e0e00b41a2ad6075933304a6cc4cd7a71d
Parents: d39dbb5 a7e4c8a
Author: Benedikt Ritter
Authored: Sun Apr 19 11:02:01 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 11:02:01 2015 +0200

--
 NOTICE.txt | 5 +-
 pom.xml | 10 ++--
 .../commons/text/diff/CommandVisitor.java | 1 -
 .../apache/commons/text/diff/DeleteCommand.java | 1 -
 .../apache/commons/text/diff/EditCommand.java | 1 -
 .../apache/commons/text/diff/EditScript.java | 1 -
 .../apache/commons/text/diff/InsertCommand.java | 1 -
 .../apache/commons/text/diff/KeepCommand.java | 1 -
 .../commons/text/diff/ReplacementsFinder.java | 1 -
 .../commons/text/diff/ReplacementsHandler.java | 1 -
 .../commons/text/diff/StringsComparator.java | 2 -
 .../commons/text/names/HumanNameParser.java | 2 -
 .../org/apache/commons/text/names/Name.java | 2 -
 .../commons/text/names/NameParseException.java | 2 -
 .../commons/text/similarity/CosineDistance.java | 7 ---
 .../text/similarity/CosineSimilarity.java | 2 -
 .../apache/commons/text/similarity/Counter.java | 60 +++
 .../commons/text/similarity/EditDistance.java | 1 -
 .../text/similarity/EditDistanceFrom.java | 1 -
 .../commons/text/similarity/FuzzyScore.java | 2 -
 .../text/similarity/HammingDistance.java | 2 -
 .../text/similarity/JaroWrinklerDistance.java | 2 -
 .../text/similarity/LevenshteinDistance.java | 2 -
 .../commons/text/similarity/RegexTokenizer.java | 50
 .../commons/text/similarity/Tokenizer.java | 34 +++
 .../text/similarity/internal/Counter.java | 61
 .../similarity/internal/RegexTokenizer.java | 52 -
 .../text/similarity/internal/Tokenizer.java | 35 ---
 .../text/similarity/internal/package-info.java | 23
 .../commons/text/similarity/package-info.java | 2 +-
 .../org/apache/commons/text/names/NameTest.java | 2 -
 .../apache/commons/text/names/ParserTest.java | 2 -
 .../text/similarity/CosineDistanceTest.java | 2 -
 .../text/similarity/StringMetricFromTest.java | 5 -
 34 files changed, 151 insertions(+), 227 deletions(-)
--
[3/6] [text] Tests should not write to std out
Tests should not write to std out

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/0ecf9afc
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/0ecf9afc
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/0ecf9afc

Branch: refs/heads/master
Commit: 0ecf9afc8473fae8537c76178b4044d85c2b743e
Parents: df68123
Author: Benedikt Ritter
Authored: Sun Apr 19 10:39:31 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 10:39:31 2015 +0200

--
 .../apache/commons/text/similarity/StringMetricFromTest.java | 5 -
 1 file changed, 5 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/0ecf9afc/src/test/java/org/apache/commons/text/similarity/StringMetricFromTest.java
--
diff --git a/src/test/java/org/apache/commons/text/similarity/StringMetricFromTest.java b/src/test/java/org/apache/commons/text/similarity/StringMetricFromTest.java
index de59452..117b3bc 100644
--- a/src/test/java/org/apache/commons/text/similarity/StringMetricFromTest.java
+++ b/src/test/java/org/apache/commons/text/similarity/StringMetricFromTest.java
@@ -54,11 +54,6 @@ public class StringMetricFromTest {
                 mostSimilar = test;
             }
         }
-
-        System.out.println("The string most similar to \"" + target + "\" "
-            + "is \"" + mostSimilar + "\" because "
-            + "its distance is only " + shortestDistance + ".");
-
         assertThat(mostSimilar, equalTo("a patchy"));
         assertThat(shortestDistance, equalTo(4));
     }
[4/6] [text] Move main developer to top of developers section
Move main developer to top of developers section

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/656e217a
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/656e217a
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/656e217a

Branch: refs/heads/master
Commit: 656e217ac0748a3ec2827169a74a27cb5cfd2914
Parents: 0ecf9af
Author: Benedikt Ritter
Authored: Sun Apr 19 10:41:15 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 10:41:15 2015 +0200

--
 pom.xml | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/656e217a/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 6cf4dcf..cdfa9c2 100644
--- a/pom.xml
+++ b/pom.xml
@@ -49,15 +49,15 @@
-  Benedikt Ritter
-  britter
-  brit...@apache.org
-
-
   Bruno P. Kinoshita
   kinow
   ki...@apache.org
+
+  Benedikt Ritter
+  britter
+  brit...@apache.org
+
[5/6] [text] Cleanup NOTICE file
Cleanup NOTICE file

Project: http://git-wip-us.apache.org/repos/asf/commons-text/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-text/commit/a7e4c8a8
Tree: http://git-wip-us.apache.org/repos/asf/commons-text/tree/a7e4c8a8
Diff: http://git-wip-us.apache.org/repos/asf/commons-text/diff/a7e4c8a8

Branch: refs/heads/master
Commit: a7e4c8a8a4d473aa83f8a40d8ac25bb3d4c78853
Parents: 656e217
Author: Benedikt Ritter
Authored: Sun Apr 19 10:42:29 2015 +0200
Committer: Benedikt Ritter
Committed: Sun Apr 19 10:42:29 2015 +0200

--
 NOTICE.txt | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/commons-text/blob/a7e4c8a8/NOTICE.txt
--
diff --git a/NOTICE.txt b/NOTICE.txt
index 5dd8ae2..283320b 100644
--- a/NOTICE.txt
+++ b/NOTICE.txt
@@ -1,8 +1,5 @@
 Apache Commons Text
-Copyright 2001-2014 The Apache Software Foundation
+Copyright 2001-2015 The Apache Software Foundation
 
 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).
-
-This product includes software from the Spring Framework,
-under the Apache License 2.0 (see: StringUtils.containsWhitespace())