To clarify: by "completely" I mean wiping out the token cache
and resetting the iterator state, as well as resetting
the input.  The configured parameters are not changed on a reset.
This has HUGE performance implications when working with a large
file that you're trying to tokenize.
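
A minimal sketch of that reset idea, for illustration only.  The class
and method names below (ResettableTokenizer, reset) are hypothetical and
are not taken from the patch attached to the bug; the sketch only assumes
a single-character delimiter as the "configured parameter".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration; not the tokenizer attached to the bug report. */
class ResettableTokenizer {
    private final char delimiter;   // configured parameter, untouched by reset
    private String input;           // current input string
    private List<String> tokens;    // token cache, built lazily on first use
    private int position;           // iterator state

    ResettableTokenizer(char delimiter) {
        this.delimiter = delimiter;
        reset("");
    }

    /** Drops the token cache and iterator position and swaps in new input. */
    ResettableTokenizer reset(String newInput) {
        this.input = newInput;
        this.tokens = null;
        this.position = 0;
        return this;
    }

    boolean hasNext() {
        return position < cache().size();
    }

    String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return cache().get(position++);
    }

    /** Tokenizes the current input once and remembers the result. */
    private List<String> cache() {
        if (tokens == null) {
            tokens = new ArrayList<>();
            int start = 0;
            for (int i = 0; i <= input.length(); i++) {
                if (i == input.length() || input.charAt(i) == delimiter) {
                    tokens.add(input.substring(start, i));
                    start = i + 1;
                }
            }
        }
        return tokens;
    }
}
```

With something like this, a loop over the lines of a large file can create
one tokenizer up front and call reset(line) for each line instead of
constructing a new tokenizer every time.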

-----Original Message-----
From: Inger, Matthew [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 7:19 PM
To: 'Jakarta Commons Developers List'
Subject: RE: [lang] unexpected StringUtils.split behavior (was RE:
suggestion for new StringUtils.method)


I've submitted patches for what I was working on.
I had provided methods to completely reset the tokenizer,
including the input string, so that the same tokenizer can
be re-used when parsing a large file, and doesn't have to
be constantly recreated.  See the bug for the patch files.

Also the test cases have been updated.

-----Original Message-----
From: Arun Thomas [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 6:30 PM
To: Jakarta Commons Developers List
Subject: RE: [lang] unexpected StringUtils.split behavior (was RE:
suggestion for new StringUtils.method)


It's been checked in....

The most recent changes you made haven't been checked in because I believe
Stephen adapted your original a little as he was checking it in....  Further
changes should be made to the checked in version, rather than the original.

-AMT  

-----Original Message-----
From: Inger, Matthew [mailto:[EMAIL PROTECTED] 
Sent: Monday, November 24, 2003 3:31 PM
To: 'Jakarta Commons Developers List'
Subject: RE: [lang] unexpected StringUtils.split behavior (was RE:
suggestion for new StringUtils.method)


There's a new tokenizer attached to defect 22692 which does
CSV style tokenizing.  Test class is attached as well.  Just waiting on
someone to commit it.


-----Original Message-----
From: Al Chou [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 2:02 PM
To: Jakarta Commons Developers List
Subject: RE: [lang] unexpected StringUtils.split behavior (was RE:
suggestion for new StringUtils.method)


I am only skimming [lang] posts for stuff pretty closely related to my
issue, so I hadn't given much thought to the new StringTokenizer replacement
-- perhaps I felt it was a little too new for me to be adding even more new
stuff.  Whoever wants to is welcome to incorporate this new method there, if
it gets committed to StringUtils at all in the first place.

Al


--- Arun Thomas <[EMAIL PROTECTED]> wrote:
> I also know that this is more than you intended, but any thought of
> incorporating the split on string into the new StringTokenizer
> replacement as well?  I think that would be pretty useful.  (It's
> behaviour in two places, but implementation could certainly delegate....)
> 
> -AMT
> 
> -----Original Message-----
> From: Al Chou [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 19, 2003 10:22 PM
> To: Jakarta Commons Developers List
> Subject: [lang] unexpected StringUtils.split behavior (was RE: suggestion
> for new StringUtils.method)
> 
> I guess my previous post got lost in the noise, so I'm reposting.  I
> have two new StringUtils.split methods that can split a string at
> occurrences of a substring rather than splitting at the individual
> characters in the specified delimiter string.
> 
> While testing, I discovered that my expectations for the behavior of
> the split( *, ..., int max ) methods didn't match their actual
> behavior.  I expected to get a maximum of "max" substrings, all of
> which were delimited in the parent string by the specified delimiters.
> Instead, what you get is "max - 1" such substrings, plus the rest of
> the parent string as the final result substring.
> This behavior seems counter to what StringTokenizer would do, which is
> surprising, given the Javadoc comments about using the split methods as
> alternatives to StringTokenizer.
> 
> Currently, my tests reflect my expectations for the behavior, and I
> modified the existing split( String, String, int ) method to match my
> expectations.  I didn't want to submit such a change as a proposed patch
> without first getting feedback from the community about whether my
> expectations are wrong.  I am happy to submit only code that does not
> change the behavior of the existing methods, if need be.
> 
> 
> Al
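
The difference between the two splitting styles described above can be
sketched with plain JDK calls.  This is only an illustration: the helper
names are hypothetical, and details such as whether empty substrings are
kept are assumptions, not the behaviour of the proposed StringUtils
methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/** Hypothetical helpers contrasting whole-string and per-character delimiters. */
class SeparatorSplit {

    /** Splits at each occurrence of the whole separator string (empty parts are kept). */
    static String[] splitBySeparator(String text, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = text.indexOf(separator, start)) >= 0) {
            parts.add(text.substring(start, hit));
            start = hit + separator.length();
        }
        parts.add(text.substring(start)); // whatever follows the last separator
        return parts.toArray(new String[0]);
    }

    /** Splits at any single character of the delimiter string, as StringTokenizer does. */
    static String[] splitByChars(String text, String delimiterChars) {
        StringTokenizer tokenizer = new StringTokenizer(text, delimiterChars);
        List<String> parts = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            parts.add(tokenizer.nextToken());
        }
        return parts.toArray(new String[0]);
    }
}
```

For the input "a--b-c--d" with "--" as the delimiter, splitBySeparator
yields "a", "b-c", "d", while splitByChars yields "a", "b", "c", "d"
because every "-" counts as a delimiter on its own.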
> 
> 
> --- Al Chou <[EMAIL PROTECTED]> wrote:
> > This thread is a good entree for my question.  I was adding a new
> > StringUtils.split method that can split a string using a whole string 
> > as the delimiter, rather than the characters within that string.  In 
> > running my JUnit tests, I discovered unexpected behavior in the 
> > existing method:
> > 
> > String stringToSplitOnNulls = "ab   de fg" ;
> > String[] splitOnNullExpectedResults = { "ab", "de" } ;
> > 
> > String[] splitOnNullResults =
> >     StringUtils.split( stringToSplitOnNulls, null, 2 ) ;
> > assertEquals( splitOnNullExpectedResults.length,
> >     splitOnNullResults.length ) ;
> > for ( int i = 0 ; i < splitOnNullExpectedResults.length ; i += 1 ) {
> >     assertEquals( splitOnNullExpectedResults[i], splitOnNullResults[i] ) ;
> > }
> > 
> > 
> > The result of the split call is
> > 
> > "ab", "de fg"
> > 
> > and it doesn't look to me like StringTokenizer's documentation implies
> > this behavior....
> > 
> > 
> > Al
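
The two readings of "max" discussed in this thread can be compared using
only the JDK.  The sketch below does not call StringUtils; it relies on
String.split( regex, limit ), which also puts the unsplit remainder in
the last element, and on StringTokenizer for the "first max tokens only"
reading.  The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/** Hypothetical demo of the two interpretations of a "max" argument. */
class MaxSplitDemo {

    /** At most max elements; the last element holds the rest of the string. */
    static String[] remainderInLast(String text, int max) {
        return text.split("\\s+", max);
    }

    /** At most max tokens; anything after them is dropped. */
    static String[] firstTokensOnly(String text, int max) {
        StringTokenizer tokenizer = new StringTokenizer(text); // default delimiters are whitespace
        List<String> tokens = new ArrayList<>();
        while (tokens.size() < max && tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens.toArray(new String[0]);
    }
}
```

For "ab   de fg" and max = 2, remainderInLast gives "ab", "de fg" (the
result reported above for StringUtils.split), while firstTokensOnly gives
"ab", "de" (the result the test expected).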

=====
Albert Davidson Chou

    Get answers to Mac questions at http://www.Mac-Mgrs.org/ .


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
