[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-18 Thread britter
Github user britter commented on the pull request:

https://github.com/apache/commons-lang/pull/75#issuecomment-102940510
  
@rikles thank you for your thorough feedback. Give me some time to go 
through your comments. I'll have time to have a look later this week.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-16 Thread rikles
Github user rikles commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30461235
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefghij, 2, 4, 1)  = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 5) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdef, 2, 4, 1)  = [ab, cdef, 
null]
+ *
+ * StringUtils.splitByLength( abcdef, 2, 4, 1) = [ a, bcde, 
f]
+ * StringUtils.splitByLength(abcdef , 2, 4, 1) = [ab, cdef, 
 ]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 0, 1)  = [ab, cdef, 
, g]
+ * StringUtils.splitByLength(abcdefg, -1)  = {@link 
IllegalArgumentException}
--- End diff --

It's true that negative numbers could easily be promoted to 0.
But this is not like `StringUtils.split(String, String, int)`, where
the int value indicates the maximum number of results we want. There, it is
very common for a zero or negative value to mean _no limit_.
Here, we request several column lengths. I thought it was natural and
simpler to treat negative lengths as a coding mistake.

I even wondered whether we could interpret negative values as:
  * _unlimited length_. But what about
`StringUtils.splitByLength("abcdefg", -1, 2, -1, -1)` ... ?
  * _backward move_, like `StringUtils.splitByLength("abcde", 3, -2, 4) =
["abc", "bcde"]`. But what about `StringUtils.splitByLength("abcdef", 1, -4,
3)` ... ?

I don't know what to think about it...
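
A minimal sketch of the "negative length is a coding mistake" policy. The class and the helper `requireNonNegative` are hypothetical names used only for illustration; they are not taken from the pull request.

```java
class LengthValidationSketch {

    // Hypothetical helper: rejects any negative column length up front.
    static void requireNonNegative(int... lengths) {
        if (lengths == null) {
            return; // a null lengths array is documented to yield an empty result
        }
        for (int i = 0; i < lengths.length; i++) {
            if (lengths[i] < 0) {
                throw new IllegalArgumentException(
                        "Negative length " + lengths[i] + " at index " + i);
            }
        }
    }

    public static void main(String[] args) {
        requireNonNegative(2, 4, 0, 1); // passes: zero is a valid column length
        try {
            requireNonNegative(2, -1);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast this way sidesteps the ambiguous inputs listed above, at the cost of being stricter than `StringUtils.split(String, String, int)`.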




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-16 Thread rikles
Github user rikles commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30461086
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefghij, 2, 4, 1)  = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 5) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdef, 2, 4, 1)  = [ab, cdef, 
null]
--- End diff --

Good point.
My idea was to signal that there are no more characters left to fill the
explicitly requested column length.
On the other hand, this can cause a `NullPointerException` if the
returned array is used without a check...

Why I used this approach: with `null` values and hard-coded lengths, we can
simply check the returned array with a _for each_ loop, even later in another
piece of code:
```java
String[] cols = StringUtils.splitByLength(input, 2, 3, 1);
// ...
for (String col : cols) {
    if (col == null) {
        break; // the input ran out before this column
    }
    // Do something
}
```

Without `null` values, we must keep a reference to the lengths array:
```java
int[] LENGTHS = { 2, 3, 0, 1 };
String[] cols = StringUtils.splitByLength(input, LENGTHS);
int index = 0;
for (String col : cols) {
    // An empty result for a non-zero requested length means the input ran out
    if (col.length() == 0 && LENGTHS[index] > 0) {
        break;
    }
    index++;
    // Do something
}
```

Of course, we can also check the input string length before calling
`StringUtils.splitByLength`, but then we have to compute the sum of the
lengths. And what about this case: `StringUtils.splitByLength("abcd", 1, 2, 2)`?

I don't know which is best... What do you think?
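
For the third option (checking the input length up front), a rough sketch could look like the following. `firstIncompleteColumn` is a hypothetical helper, not part of the pull request, and it assumes a non-null input and non-negative lengths.

```java
class ColumnCoverageSketch {

    /**
     * Hypothetical helper: returns the index of the first column the input
     * cannot fill completely, or -1 if every requested column fits.
     */
    static int firstIncompleteColumn(String input, int... lengths) {
        long needed = 0; // running sum of the requested lengths
        for (int i = 0; i < lengths.length; i++) {
            needed += lengths[i];
            if (needed > input.length()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIncompleteColumn("abcdefg", 2, 4, 1)); // all columns fit
        System.out.println(firstIncompleteColumn("abcd", 1, 2, 2));    // third column is only partly filled
    }
}
```

For `"abcd"` with lengths `1, 2, 2` the helper reports index 2: the last column would receive only `"d"`, which is exactly the case where neither a `null` nor an empty string describes the result well.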




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-16 Thread rikles
Github user rikles commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30460736
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
--- End diff --

As the next line shows, `StringUtils.splitByLength("abcdefg", 2, 2)`
will return `["ab", "cd"]`:
`StringUtils.splitByLength("abcdefghij", 2, 4, 1) = ["ab", "cdef", "g"]`

I asked myself this question during development: do we discard the extra
characters?
I think it would be nice to let users decide. Moreover, depending on the use
case, it could be useful to keep or discard the leading extra characters (for
example when parsing a single-line, commented-out string).

I propose to:
  * add a private `splitByLengthWorker(String string, boolean splitFromEnd,
boolean discardExtraChar, int ... lengths)`
  * keep this `splitByLength(String, int ...)` method logic as the default:
`return splitByLengthWorker(string, false, true, lengths)`. So, by default, the
returned array has the same size as the `int ... lengths` array param, which is
useful when parsing strings with fixed column lengths.
  * add a `splitByLengthKeepExtraChar(String, int ...)`: `return
splitByLengthWorker(string, false, false, lengths)`
  * add a `splitByLengthFromEnd(String, int ...)`: `return
splitByLengthWorker(string, true, true, lengths)`
  * add a `splitByLengthFromEndKeepExtraChar(String, int ...)`: `return
splitByLengthWorker(string, true, false, lengths)`

A question: for the _split from end_ methods, which result do you think is more
logical: _right-aligned_ (RA) or _end-to-start_ (E2S) lengths, _reversed_ (R) or
_not reversed_ (NR) result?
  * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3) =
["__", "a", "bc", "def"]` - (RA, NR)
  * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3) =
["def", "bc", "a", "__"]` - (RA, R)
  * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3) =
["f", "de", "abc", "__"]` - (E2S, R)
  * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3) =
["__", "abc", "de", "f"]` - (E2S, NR)

I think the first one is more readable, since we can visually understand the
splitting, but it may be less intuitive:
```
StringUtils.splitByLengthFromEnd("ABCDEFGHIJKLM", 3, 4, 5) = ["BCD", "EFGH", "IJKLM"]
 [3][4_][_5_]
ABCDEFGHIJKLM
```
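
To make the proposal concrete, here is one possible shape for the worker. It is only a sketch under several assumptions that the thread has not settled: the _from end_ mode uses the right-aligned, not reversed (RA, NR) reading, negative lengths are rejected, and an input shorter than the sum of the lengths is rejected instead of producing `null` entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitByLengthWorkerSketch {

    // Hypothetical worker following the signature proposed above.
    static String[] splitByLengthWorker(String str, boolean splitFromEnd,
                                        boolean discardExtraChar, int... lengths) {
        if (str == null) {
            return null;
        }
        if (lengths == null || lengths.length == 0) {
            return new String[0];
        }
        long total = 0;
        for (int len : lengths) {
            if (len < 0) {
                throw new IllegalArgumentException("Negative length: " + len);
            }
            total += len;
        }
        // Assumption of this sketch: short inputs are not handled.
        if (str.length() < total) {
            throw new IllegalArgumentException("Input shorter than sum of lengths");
        }
        int extra = (int) (str.length() - total);
        int pos = splitFromEnd ? extra : 0; // right-aligned columns start after the extra prefix
        List<String> parts = new ArrayList<>();
        if (splitFromEnd && !discardExtraChar && extra > 0) {
            parts.add(str.substring(0, extra)); // leading extra characters
        }
        for (int len : lengths) {
            parts.add(str.substring(pos, pos + len));
            pos += len;
        }
        if (!splitFromEnd && !discardExtraChar && extra > 0) {
            parts.add(str.substring(pos)); // trailing extra characters
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                splitByLengthWorker("ABCDEFGHIJKLM", true, true, 3, 4, 5)));
        System.out.println(Arrays.toString(
                splitByLengthWorker("__abcdef", true, false, 1, 2, 3)));
    }
}
```

With these assumptions, the first call yields `[BCD, EFGH, IJKLM]` and the second `[__, a, bc, def]`, matching the (RA, NR) example above.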




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-16 Thread rikles
Github user rikles commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30467016
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefghij, 2, 4, 1)  = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 5) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdef, 2, 4, 1)  = [ab, cdef, 
null]
+ *
+ * StringUtils.splitByLength( abcdef, 2, 4, 1) = [ a, bcde, 
f]
+ * StringUtils.splitByLength(abcdef , 2, 4, 1) = [ab, cdef, 
 ]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 0, 1)  = [ab, cdef, 
, g]
+ * StringUtils.splitByLength(abcdefg, -1)  = {@link 
IllegalArgumentException}
--- End diff --

Another way to deal with negative values could be to treat them as _discard
# characters_:
`StringUtils.splitByLength("abcdef", 2, -3, 1) = ["ab", "f"]`
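
A small sketch of this "negative means skip" reading. `splitWithSkips` is a hypothetical name, and the sketch assumes that columns running past the end of the input are simply truncated (returned shorter or empty) instead of `null`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkipNegativeSketch {

    // Hypothetical variant: a negative entry discards |entry| characters.
    static String[] splitWithSkips(String str, int... lengths) {
        if (str == null) {
            return null;
        }
        if (lengths == null) {
            return new String[0];
        }
        List<String> parts = new ArrayList<>();
        int pos = 0;
        for (int len : lengths) {
            if (len < 0) {
                // Skip |len| characters, clamped to the end of the input
                pos = (int) Math.min(str.length(), (long) pos - len);
            } else {
                int end = (int) Math.min(str.length(), (long) pos + len);
                parts.add(str.substring(pos, end));
                pos = end;
            }
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWithSkips("abcdef", 2, -3, 1)));
    }
}
```

Note that under this reading the returned array is shorter than the lengths array, which conflicts with the documented "same size as the input lengths array" behaviour.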




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-16 Thread rikles
Github user rikles commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30460281
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
--- End diff --

I followed the same logic as the other `StringUtils.split*(...)` methods:
return `null` if the input string is `null`.
So `StringUtils.splitByLength(null, *)` returns `null`.




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-11 Thread britter
Github user britter commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30060436
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
--- End diff --

For consistency with ```StringUtils.split(String, char)```, this should
rather return an empty array.




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-11 Thread britter
Github user britter commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30061214
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefghij, 2, 4, 1)  = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 5) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdef, 2, 4, 1)  = [ab, cdef, 
null]
+ *
+ * StringUtils.splitByLength( abcdef, 2, 4, 1) = [ a, bcde, 
f]
+ * StringUtils.splitByLength(abcdef , 2, 4, 1) = [ab, cdef, 
 ]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 0, 1)  = [ab, cdef, 
, g]
+ * StringUtils.splitByLength(abcdefg, -1)  = {@link 
IllegalArgumentException}
--- End diff --

I'm undecided about this. The ```StringUtils.split(String, String, int)```
method will not fail if negative values are provided. But I'm not sure how the
method should behave for an input like ```[3, -3, 3]```. I think we should
handle this the way the ```ArrayUtils``` class does: it promotes negative
indices to 0. So a negative number should add an empty string to the resulting
array.
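
A minimal sketch of that "promote negatives to 0" behaviour. `splitClamping` is a hypothetical name, and the sketch assumes columns past the end of the input come back as empty strings.

```java
import java.util.Arrays;

class ClampNegativeSketch {

    // Hypothetical variant: negative lengths are treated as 0.
    static String[] splitClamping(String str, int... lengths) {
        if (str == null) {
            return null;
        }
        if (lengths == null) {
            return new String[0];
        }
        String[] out = new String[lengths.length];
        int pos = 0;
        for (int i = 0; i < lengths.length; i++) {
            int len = Math.max(0, lengths[i]); // promote negative lengths to 0
            int end = (int) Math.min(str.length(), (long) pos + len);
            out[i] = str.substring(pos, end);
            pos = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // The middle entry is an empty string
        System.out.println(Arrays.toString(splitClamping("abcdefghi", 3, -3, 3)));
    }
}
```

For `[3, -3, 3]` applied to `"abcdefghi"` this gives `"abc"`, an empty string, and `"def"`, so the result keeps the same size as the lengths array.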




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-11 Thread thiagoh
Github user thiagoh commented on the pull request:

https://github.com/apache/commons-lang/pull/75#issuecomment-101048576
  
Is this finished?




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-11 Thread britter
Github user britter commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30061452
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
--- End diff --

What would be the result of ```StringUtils.splitByLength("abcdefg", 2,
2)```? I think it should be `["ab", "cd", "efg"]`.




[GitHub] commons-lang pull request: LANG-1124: Add StringUtils split by len...

2015-05-11 Thread britter
Github user britter commented on a diff in the pull request:

https://github.com/apache/commons-lang/pull/75#discussion_r30060493
  
--- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
@@ -3277,6 +3277,164 @@ public static String substringBetween(final String 
str, final String open, final
 return list.toArray(new String[list.size()]);
 }
 
+/**
+ * pSplit a String into an array, using an array of fixed string 
lengths./p
+ *
+ * pIf not null String input, the returned array size is same as the 
input lengths array./p
+ *
+ * pA null input String returns {@code null}.
+ * A {@code null} or empty input lengths array returns an empty array.
+ * A {@code 0} in the input lengths array results in en empty 
string./p
+ *
+ * pExtra characters are ignored (ie String length greater than sum 
of split lengths).
+ * All empty substrings other than zero length requested, are returned 
{@code null}./p
+ *
+ * pre
+ * StringUtils.splitByLength(null, *)  = null
+ * StringUtils.splitByLength(abc)= []
+ * StringUtils.splitByLength(abc, null)  = []
+ * StringUtils.splitByLength(abc, [])= []
+ * StringUtils.splitByLength(, 2, 4, 1)  = [null, null, null]
+ *
+ * StringUtils.splitByLength(abcdefg, 2, 4, 1) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefghij, 2, 4, 1)  = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdefg, 2, 4, 5) = [ab, cdef, 
g]
+ * StringUtils.splitByLength(abcdef, 2, 4, 1)  = [ab, cdef, 
null]
--- End diff --

I'm not sure whether it is a good idea to add null entries to the result.

