Re: commons.text.CaseUtils
Gary, thank you for your response.

I initiated the pull request (#528) and already received some very constructive feedback from user mbenson. I am modifying the code to contain fewer methods, leaving out those whose result a user can produce externally with something as simple as .toLowerCase().

I also noticed some recent discussion of this, which you commented on, in pull request #450, "Cases API + 4 implementations (Pascal, Camel, Kebab, Snake)". Once I have finished the edits and new tests and pushed them to my fork, I may join the conversation on #450.

My Jira account has been approved (after an initial disapproval). I haven't looked at Jira yet, but I will look for similar topics there.

I also uncovered an issue with my code when I devised some tests specifically designed to break it, and I need to fix this:

    assertThat(CaseUtils.toTitleCase(" ' \u2019 Titl'e Case \u2019 ' ")).isEqualTo("Title_Case"); // todo fix this failure.

    org.opentest4j.AssertionFailedError:
    expected: "Title_Case"
     but was: "Title_Case_’_'"
    Expected :"Title_Case"
    Actual   :"Title_Case_’_'"

This is because of the way I handle apostrophes, so that "That's good!" will return "Thats_Good".

Again, thank you for your response.

Stephan Peters

On Tue, Apr 9, 2024 at 5:56 PM Stephan Peters wrote:

> OK, I will initiate a PR.
> Some of the added methods will be more useful than others.
> The PR will come from speters33w.
>
> Thank you,
> Stephan Peters
>
> On Tue, Apr 9, 2024 at 5:31 PM Gary Gregory wrote:
>
>> Hello Stephan,
>>
>> The best way to see what you are proposing is a PR; it's a bit painful to
>> see differences otherwise, at least for me.
>>
>> That said, anything new should solve a real-world use case, not merely
>> something that might be useful (or not).
>>
>> I think seeing tests in a PR will help clarify what it is you are
>> proposing that the current code doesn't do.
>>
>> See also https://github.com/apache/commons-text/pull/450
>>
>> TY,
>> Gary
>>
>> On Tue, Apr 9, 2024, 4:37 PM Stephan Peters wrote:
>>
>> > I added several methods to the org.apache.commons.text.CaseUtils class
>> > that I think would be very useful, for example for normalized naming
>> > conventions for file paths, file names, URLs, etc.
>> >
>> > I'm planning on initiating a pull request.
>> >
>> > I would like to discuss it here.
>> >
>> > I've posted it in a fork here:
>> >
>> > https://github.com/speters33w/commons-text/blob/master/src/main/java/org/apache/commons/text/CaseUtils.java
>> >
>> > and written new tests for all the methods, which pass, here:
>> >
>> > https://github.com/speters33w/commons-text/blob/master/src/test/java/org/apache/commons/text/CaseUtilsTest.java
>> >
>> > There is an example of the method return values at the top of the
>> > revised CaseUtils.java.
>> >
>> > The methods behave a little differently from the existing
>> > toCamelCase(String, boolean, char[]) (which I left intact) in that they
>> > normalize the input before processing, so
>> > toCamelSnakeCase("The café’s piñata gave me déjà vu.") will return
>> > "the_Cafes_Pinata_Gave_Me_Deja_Vu".
>> >
>> > The main driver engine is in the toTitleCase() method; the rest of the
>> > methods piggyback on that engine and make minor changes to the return
>> > value.
>> >
>> > If anyone feels like taking a look, I'd appreciate any feedback.
>> >
>> > Thank you.
>> >
>> > Stephan Peters
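For readers following the normalization discussion, here is a minimal sketch of the behavior described in this thread: accents are stripped, apostrophes are dropped, and the remaining words are title-cased and joined with underscores. It is not the code from the PR. The class `TitleCaseSketch` and the method `toTitleSnakeCase` are hypothetical names, and removing every apostrophe before splitting is only one possible way to avoid the stray `_’_'` suffix reported in the failing test.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class TitleCaseSketch {

    /** Hypothetical helper (not part of Commons Text): returns "Title_Case" style output. */
    static String toTitleSnakeCase(String input) {
        // Decompose accented letters into base letter + combining mark, then drop the marks.
        String unaccented = Normalizer.normalize(input, Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "");
        // Assumption: drop straight and typographic apostrophes everywhere,
        // so "That's" becomes "Thats" and isolated apostrophes leave nothing behind.
        String withoutApostrophes = unaccented.replaceAll("['\u2019]", "");

        List<String> words = new ArrayList<>();
        // Any run of characters that are neither letters nor digits separates words.
        for (String token : withoutApostrophes.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading separator produces one empty token
            }
            words.add(token.substring(0, 1).toUpperCase(Locale.ROOT)
                    + token.substring(1).toLowerCase(Locale.ROOT));
        }
        return String.join("_", words);
    }

    public static void main(String[] args) {
        System.out.println(toTitleSnakeCase("That's good!"));
        System.out.println(toTitleSnakeCase(" ' \u2019 Titl'e Case \u2019 ' "));
        System.out.println(toTitleSnakeCase("The caf\u00e9\u2019s pi\u00f1ata gave me d\u00e9j\u00e0 vu."));
    }
}
```

In this sketch the apostrophes are removed before the text is split into words, which is why the input made only of quotes and spaces around "Titl'e Case" collapses to two words. Letters that have no canonical decomposition (for example "ø") are kept as they are, which may or may not be what a file-naming use case wants.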
Re: [IO] Change in behavior in Commons FileUpload after upgrade to Commons IO 2.16.1
Stephan,

Thank you for your message.

This is more of a design defect IMO. If there is a desire to disable a feature like caching, then there should be a toggle for that, rather than relying on a side effect of a magic number.

PRs welcome! :-)

Gary

On Wed, Apr 10, 2024, 9:24 AM Stephan Markwalder wrote:

> Hi,
>
> Today, I found the following questions in
> https://github.com/apache/commons-io/pull/609:
>
> > The behavior for a negative threshold should be the same as 0 IMO. WDYT?
> > What does it even mean that a threshold is negative?
> > Writing zero bytes writes nothing, so there is nothing to reach until
> > you at least write one byte.
> > Am I missing something?
>
> I would like to highlight a "use case" for a negative threshold, and how
> the change to disallow a negative threshold might impact existing code.
>
> I upgraded from Commons IO 2.15.1 to 2.16.1 yesterday and found a small
> change in the behavior of Commons FileUpload when uploading and processing
> empty files. The effect is visible only when passing a negative value as
> file size threshold to Commons FileUpload. Here is a small extract from the
> Javadoc of Commons FileUpload, class `DefaultFileItemFactory`, constructor
> parameter `sizeThreshold`:
>
> > sizeThreshold - The threshold, in bytes, below which items will be
> > retained in memory and above which they will be stored as a file.
>
> (source:
> https://javadoc.io/doc/commons-fileupload/commons-fileupload/latest/index.html)
>
> By passing a negative value for `sizeThreshold`, Commons FileUpload can be
> configured to disable the in-memory caching for all uploaded files,
> including empty files with a size of 0 bytes. As a result, `DiskFileItem`
> objects created by Commons FileUpload will always have a `File` instance
> set internally, even for empty files.
>
> `DiskFileItem` in Commons FileUpload internally makes use of
> `DeferredFileOutputStream`, and therefore `ThresholdingOutputStream`. At
> some point it calls `isThresholdExceeded()` to check whether the size of
> the uploaded file exceeds the given threshold. By disallowing a negative
> threshold, empty files will now be treated differently by Commons
> FileUpload. With a size of 0 bytes, they will not exceed the enforced
> minimum threshold of 0 bytes anymore, and their data will therefore be kept
> in memory. This can break follow-up code which relies on the previous
> behavior and expects a `File` instance to be created for every uploaded
> file, even empty files.
>
> I know that this is a very specific use case. I don't know whether the
> developers of Commons FileUpload ever intended a negative threshold to be
> used. Still, the question was asked whether a negative threshold could have
> any meaning. I assume the answer is "yes". But I don't know whether this
> qualifies as a bug or a regression. I also don't know whether there are
> other similar use cases in other libraries depending on Commons IO.
>
> Best,
> Stephan
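To make the "toggle instead of a magic number" suggestion concrete, here is a minimal sketch of what such a switch could look like. It is purely illustrative: `UploadStoragePolicy`, `memoryCachingEnabled`, and `storeOnDisk` are hypothetical names and do not exist in Commons IO or Commons FileUpload.

```java
final class UploadStoragePolicy {

    private final int sizeThreshold;
    private final boolean memoryCachingEnabled;

    /**
     * Hypothetical policy object: the threshold only describes a size,
     * and a separate flag says whether in-memory caching is allowed at all.
     */
    UploadStoragePolicy(int sizeThreshold, boolean memoryCachingEnabled) {
        if (sizeThreshold < 0) {
            // With an explicit flag available, a negative size carries no meaning.
            throw new IllegalArgumentException("sizeThreshold must not be negative: " + sizeThreshold);
        }
        this.sizeThreshold = sizeThreshold;
        this.memoryCachingEnabled = memoryCachingEnabled;
    }

    /** Returns true when an item of the given size should be backed by a file. */
    boolean storeOnDisk(long itemSize) {
        return !memoryCachingEnabled || itemSize > sizeThreshold;
    }

    public static void main(String[] args) {
        UploadStoragePolicy cached = new UploadStoragePolicy(0, true);
        UploadStoragePolicy diskOnly = new UploadStoragePolicy(0, false);
        System.out.println("empty upload, caching on:  disk=" + cached.storeOnDisk(0));
        System.out.println("empty upload, caching off: disk=" + diskOnly.storeOnDisk(0));
    }
}
```

With a flag like this, code that needs a `File` for every upload, including empty ones, states that intent directly instead of depending on how a negative threshold happens to compare against zero bytes written.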
[IO] Change in behavior in Commons FileUpload after upgrade to Commons IO 2.16.1
Hi,

Today, I found the following questions in https://github.com/apache/commons-io/pull/609:

> The behavior for a negative threshold should be the same as 0 IMO. WDYT?
> What does it even mean that a threshold is negative?
> Writing zero bytes writes nothing, so there is nothing to reach until you at least write one byte.
> Am I missing something?

I would like to highlight a "use case" for a negative threshold, and how the change to disallow a negative threshold might impact existing code.

I upgraded from Commons IO 2.15.1 to 2.16.1 yesterday and found a small change in the behavior of Commons FileUpload when uploading and processing empty files. The effect is visible only when passing a negative value as file size threshold to Commons FileUpload. Here is a small extract from the Javadoc of Commons FileUpload, class `DefaultFileItemFactory`, constructor parameter `sizeThreshold`:

> sizeThreshold - The threshold, in bytes, below which items will be
> retained in memory and above which they will be stored as a file.

(source: https://javadoc.io/doc/commons-fileupload/commons-fileupload/latest/index.html)

By passing a negative value for `sizeThreshold`, Commons FileUpload can be configured to disable the in-memory caching for all uploaded files, including empty files with a size of 0 bytes. As a result, `DiskFileItem` objects created by Commons FileUpload will always have a `File` instance set internally, even for empty files.

`DiskFileItem` in Commons FileUpload internally makes use of `DeferredFileOutputStream`, and therefore `ThresholdingOutputStream`. At some point it calls `isThresholdExceeded()` to check whether the size of the uploaded file exceeds the given threshold. By disallowing a negative threshold, empty files will now be treated differently by Commons FileUpload. With a size of 0 bytes, they will not exceed the enforced minimum threshold of 0 bytes anymore, and their data will therefore be kept in memory. This can break follow-up code which relies on the previous behavior and expects a `File` instance to be created for every uploaded file, even empty files.

I know that this is a very specific use case. I don't know whether the developers of Commons FileUpload ever intended a negative threshold to be used. Still, the question was asked whether a negative threshold could have any meaning. I assume the answer is "yes". But I don't know whether this qualifies as a bug or a regression. I also don't know whether there are other similar use cases in other libraries depending on Commons IO.

Best,
Stephan

Email Disclaimer
FNZ (UK) Ltd registered in England and Wales (05435760) 10th Floor, 135 Bishopsgate, London EC2M 3TP, FNZ (UK) Ltd is authorised and regulated by the Financial Conduct Authority (438687); FNZ TA Services Ltd registered in England and Wales (09571767) 10th Floor, 135 Bishopsgate, London EC2M 3TP, FNZ TA Services Ltd is authorised and regulated by the Financial Conduct Authority (932253); FNZ Securities Ltd registered in England and Wales (09486463) 10th Floor, 135 Bishopsgate, London EC2M 3TP, FNZ Securities Ltd, is authorised and regulated by the Financial Conduct Authority (733400); JHC Systems Limited registered in England and Wales (08729370) Temple Point 6th Floor, 1 Temple Row, Birmingham, West Midlands, B2 5LG; FNZ (Europe) DAC registered in Ireland (657886) Block C, Irish Life Centre, Lower Abbey Street, Dublin 1, D01V9F5, Ireland; FNZ SA (Pty) Ltd registered under the laws of South Africa (2018/547997/07), 1st floor, Newport House, Prestwich Street, Greenpoint, western Cape, 8001; FNZ Limited registered in New Zealand (1797706) FNZ House, Level 3, 29A Brandon Street, Wellington, 6011 New Zealand; FNZ (Australia) Pty Ltd registered in Australia (138 819 119) Level 1, 99 Elizabeth St, Sydney 2000; FNZ (Hong Kong) Limited registered in Hong Kong (1305362) 6A-1, Koshun House, 331 Nathan Road, Hong Kong; FNZ (Singapore) Services Pte. Ltd.
registered in Singapore (201307199R) 61 Robinson Road, #13-03A, Robinson Centre, Singapore (068893); and FNZ (China) Ltd registered in China (91310115MA1K3G4K6T) [中国(上海)自由贸易试验区世纪大道1196 号二 座20 层. This message is intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to us, and immediately and permanently delete it. Do not use, copy or disclose the information contained in this message or in any attachment. Emails sent to and from FNZ may be monitored and read for legitimate business purposes. Emails cannot be guaranteed to be secure or error-free, and you should protect your systems. FNZ does not accept any liability arising from interception, corruption, loss or destruction of this email, or if it arrives late or incomplete or with errors or viruses. For more information about the FNZ group please visit our website here where you can also find links to our policies including our Privacy policy which
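The change in behavior described in the message above can be reproduced with a small stand-alone model. This is a simplified sketch, not the Commons IO source: `ThresholdModel` and `storageForUpload` are hypothetical names, and the only assumption taken from the message is that the check is "bytes written exceed the threshold" and that a negative threshold is now raised to 0.

```java
final class ThresholdModel {

    private final int threshold;
    private long written;

    /**
     * @param requestedThreshold  the threshold passed in by the caller
     * @param clampNegativeToZero true models the newer behavior (negative becomes 0),
     *                            false models the older behavior (negative kept as is)
     */
    ThresholdModel(int requestedThreshold, boolean clampNegativeToZero) {
        this.threshold = clampNegativeToZero ? Math.max(0, requestedThreshold) : requestedThreshold;
    }

    void write(int byteCount) {
        written += byteCount;
    }

    /** Same shape of check as the one named in the message: strictly greater than the threshold. */
    boolean isThresholdExceeded() {
        return written > threshold;
    }

    /** Where the upload ends up in this model: "file" once the threshold is exceeded, else "memory". */
    static String storageForUpload(int threshold, boolean clampNegativeToZero, int uploadSize) {
        ThresholdModel model = new ThresholdModel(threshold, clampNegativeToZero);
        model.write(uploadSize);
        return model.isThresholdExceeded() ? "file" : "memory";
    }

    public static void main(String[] args) {
        // Empty upload, threshold -1, negative value kept: 0 > -1, so a file is used.
        System.out.println("unclamped: " + storageForUpload(-1, false, 0));
        // Empty upload, threshold -1 clamped to 0: 0 > 0 is false, so data stays in memory.
        System.out.println("clamped:   " + storageForUpload(-1, true, 0));
    }
}
```

Only the empty upload changes category in this model; any upload of one byte or more exceeds both -1 and 0, which matches the observation that the effect is visible only for empty files.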
Re: [collections]
On Tue, Apr 9, 2024 at 11:09 PM Rodion Efremov wrote:
>
> Hello,
>
> Fair enough. However, why do we have CursorableLinkedList and
> NodeCachingLinkedList around when my previous benchmarking showed that they
> are inferior compared to both TreeList and IndexedLinkedList?

We have a lot of things we don't need and that shouldn't be used. It happens sometimes on long-lived bazaar-style projects without a clear vision and maintainer. If those two classes are demonstrably inferior, it might be worthwhile deprecating them. Meanwhile, I'd prefer not to make the situation worse. We already have more code than we can maintain, and are wasting a lot of dev cycles on idiosyncratic churn to no good end.

> Also, note that TreeList requires 3 references, 2 ints and 2 booleans per
> node. IndexedLinkedList requires only 3 references per node.
>
> If you need benchmarking on small lists, just tell me and I will arrange
> that.

Lies, damned lies, and benchmarks. :-)

Benchmarking is hard and rarely matches reality. By coincidence, I spent last week learning about the damage the TPC-H benchmarks do in the database space. The benchmarks that matter are profiles of real-world applications, and every application is different. Better algorithms are sometimes discovered, even for well-trod territory like lists, but typically they only improve performance in the limit and often decrease performance in real-world applications.

Looking at the repo, this seems to be a newly constructed data structure. I suggest cleaning up the blog post, submitting it to an appropriate peer-reviewed journal in the field, and posting the preprint on arXiv so true experts can take a look. (I'm just a practitioner.) If the data structure proves out over time in real-world use cases, then it should be considered for Apache Commons. However, I don't think Commons is the right place for bleeding-edge research.
--
Elliotte Rusty Harold
elh...@ibiblio.org
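As a rough illustration of the per-node figures quoted in this thread (3 references, 2 ints and 2 booleans versus 3 references), the arithmetic can be sketched as follows. The sizes used here are assumptions, not measurements: a 12-byte object header, 4-byte compressed references, and 8-byte alignment, which is a common 64-bit HotSpot configuration but is not guaranteed. `NodeFootprintSketch` and `nodeBytes` are hypothetical names, and the element objects themselves are not counted.

```java
final class NodeFootprintSketch {

    // Assumed layout parameters (typical 64-bit HotSpot with compressed references).
    static final int HEADER_BYTES = 12;
    static final int REFERENCE_BYTES = 4;
    static final int INT_BYTES = 4;
    static final int BOOLEAN_BYTES = 1;
    static final int ALIGNMENT_BYTES = 8;

    /** Estimated size of one node object, rounded up to the assumed alignment. */
    static int nodeBytes(int references, int ints, int booleans) {
        int raw = HEADER_BYTES
                + references * REFERENCE_BYTES
                + ints * INT_BYTES
                + booleans * BOOLEAN_BYTES;
        return (raw + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    public static void main(String[] args) {
        // Field counts are the ones quoted in the thread.
        System.out.println("3 refs + 2 ints + 2 booleans: " + nodeBytes(3, 2, 2) + " bytes per node");
        System.out.println("3 refs only:                  " + nodeBytes(3, 0, 0) + " bytes per node");
    }
}
```

Under these assumptions the estimate is 40 bytes versus 24 bytes per node. Whether that difference matters in practice depends on list sizes and access patterns in the real application, which is the point made in the reply about profiling real workloads.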