[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082932#comment-15082932 ] ASF GitHub Bot commented on CASSANDRA-8755: --- Github user alshopov commented on the pull request: https://github.com/apache/cassandra/pull/60#issuecomment-168982417 This has been pulled, reformatted and closed. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Assignee: Alexander Shopov >Priority: Trivial > Labels: lhf > Fix For: 3.2 > > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082933#comment-15082933 ] ASF GitHub Bot commented on CASSANDRA-8755: --- Github user alshopov closed the pull request at: https://github.com/apache/cassandra/pull/60 > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Assignee: Alexander Shopov >Priority: Trivial > Labels: lhf > Fix For: 3.2 > > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073932#comment-15073932 ] Alexander Shopov commented on CASSANDRA-8755: - Current state is in pull request https://github.com/apache/cassandra/pull/60 Two commits - first is unit tests to verify initial behavior. Second is the patch itself - recheck with unit tests. Command to check created tests: -- ant testsome -Dtest.name=org.apache.cassandra.config.CFMetaDataTest -Dtest.methods=testIsNameValidPositive,testIsNameValidNegative ant testsome -Dtest.name=org.apache.cassandra.config.DatabaseDescriptorTest -Dtest.methods=testTokensFromString ant testsome -Dtest.name=org.apache.cassandra.cql3.ColumnIdentifierTest -Dtest.methods=testMaybeQuote ant testsome -Dtest.name=org.apache.cassandra.cql3.statements.PropertyDefinitionsTest -Dtest.methods=testGetBooleanExistant,testGetBooleanNonexistant ant testsome -Dtest.name=org.apache.cassandra.db.marshal.AbstractCompositeTypeTest -Dtest.methods=testEscape,testUnescape ant testsome -Dtest.name=org.apache.cassandra.metrics.CassandraMetricsRegistryTest -Dtest.methods=testChooseType,testMetricName ant testsome -Dtest.name=org.apache.cassandra.cql3.ColumnIdentifierTest -Dtest.methods=testMaybeQuote ant testsome -Dtest.name=org.apache.cassandra.schema.IndexMetadataTest -Dtest.methods=testIsNameValidPositive,testIsNameValidNegative,testGetDefaultIndexName ant testsome -Dtest.name=org.apache.cassandra.utils.CassandraVersionTest -Dtest.methods=testParseIdentifiersPositive,testParseIdentifiersNegative -- No unit tests for: src/java/org/apache/cassandra/db/marshal/TupleType.java src/java/org/apache/cassandra/index/internal/CassandraIndex.java src/java/org/apache/cassandra/service/StorageService.java These seem hard and I have to do quite a bit of mocking an mucking around. Hints welcome. No unit tests for: src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java Testing changes there will need refactoring these classes. They seem very central to Cassandra so I am waiting for advice whether and how to proceed. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073924#comment-15073924 ] ASF GitHub Bot commented on CASSANDRA-8755: --- GitHub user alshopov opened a pull request: https://github.com/apache/cassandra/pull/60 Pull request for fixing lhf #CASSANDRA-8755 The discussion for this patch is in JIRA: https://issues.apache.org/jira/browse/CASSANDRA-8755 Currently pull request is for review. You can merge this pull request into a Git repository by running: $ git pull https://github.com/alshopov/cassandra 8755 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cassandra/pull/60.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #60 commit 0d929cbc552a9f76fb34e5976d6094ea18093a83 Author: Alexander Shopov Date: 2015-12-28T12:38:03Z Add unit tests that verify current behavior in order to check changes Signed-off-by: Alexander Shopov commit 2685a2cdb85e9a2f6ae4303e645acafedd77c890 Author: Alexander Shopov Date: 2015-12-29T12:14:50Z Trivial speedups of string operations String.split has fast path for some single char and two char combinations. They fail in cases when the char is regex meta (mainly the '.' character). Use StringUtils.split then. Convert String.replaceAll to static final Pattern-s usage. Less throwaway j.u.r.Patterns by using precompiled versions Replace front and back whitespace removal with trim Remove unused imports from changed files. Signed-off-by: Alexander Shopov > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005267#comment-15005267 ] Robert Stupp commented on CASSANDRA-8755: - Thanks for your work so far! Some comments on your changes: * the changes in {{CommitLogReplayer}}, {{StorageService}} should use {{StringUtils.countMatches}} * the changes that remove a single char from a string should use {{StringUtils.remove(String,char)}} * you can omit all the changes in the {{tools}} and {{stress}} packages and to {{Client}}, {{CQLTester}}, {{Sample}} classes. These are not on a hot path and only affect the initialization of these tools or are just used very rarely. Just don’t want to change something that buys us basically nothing. * also prefer not to change the hadoop classes * the constant {{SLASH}} in {{PropertyFileSnitch}} should be at the beginning of the class * the name {{PATTERN_FINAL_DOLLAR}} in {{CassandraMetricsRegistry}} is incorrect Generally we require unit tests that ensure the changes work as expected. You can use the old code in the unit tests to verify the new production code against a bunch of input parameters. I’ve triggered a CI run against your changes. Unit tests ([here|http://cassci.datastax.com/job/snazy-8755-testall/]) look good, but some dtests failed ([here|http://cassci.datastax.com/job/snazy-8755-dtest/]) (more than currently on trunk) - but probably not caused by your patch. I recommend that you create a separate branch for this change off of trunk. You can safely squash these 4 commits into a single one - also the changes mentioned above. Having a separate branch also has the advantage that you can rebase and/or base on another branch. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004023#comment-15004023 ] Alexander Shopov commented on CASSANDRA-8755: - Here you go: https://github.com/alshopov/cassandra > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003895#comment-15003895 ] Robert Stupp commented on CASSANDRA-8755: - [~al_shopov] can you point me to some branches on github (forked from https://github.com/apache/cassandra)? That's much easier to review and we can spawn CI from them. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000379#comment-15000379 ] Alexander Shopov commented on CASSANDRA-8755: - I tried installing IntelliJ IDEA but could not figure out how to get the String.replace warning. I mainly use NetBeans and perhaps I missed some IDEA configuration. Still I have rebased and reworked the patch to include the classes you mention. Since these reduce the number of temporary, throw away j.u.r.Pattern-s., I have went through the code to find more instances of such usages. These are the calls: 1. String.replace(CharSequence, CharSequence) 2. String.replaceAll(String, String) 3. String.replaceFirst(String, String) 4. String.matches 5. String.split I have reworked the patch into a series of patches that I am sending as a tarball. 1st initial patch + changes you wanted. 2nd patch is adding more changes to reduce the throwaway regexps 3rd patch substitutes usage of regexp with trimming. This can be incompatible, so double check. 4th patch is cosmetic - since I am touching a lot of classes - remove the unused imports. As we only remove and not change - git annotate will still show proper history. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996413#comment-14996413 ] Alexander Shopov commented on CASSANDRA-8755: - > WRT to String.replace I'd prefer to use a precompiled Pattern instance I will update the patch to reflect your suggestion and will rebase on current trunk. Will ping you when I am ready. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972510#comment-14972510 ] Robert Stupp commented on CASSANDRA-8755: - The changes to use StringUtils are fine IMO. WRT to {{String.replace}} I'd prefer to use a precompiled {{Pattern}} instance for those usages that are on a "hot path" - these are {{ColumnIdentifier}}, {{CommitLogArchiver}}, {{AbstractCompositeType}}, {{TupleType}}, {{PropertyFileSnitch}}, {{CassandraIndex}}, {{Snapshot}}. (IntelliJ can warn on uses of {{String.replace}} and offers an automated refactoring on it.) I don't expect a measurable improvement by this patch, but it's worth to be included. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948175#comment-14948175 ] Alexander Shopov commented on CASSANDRA-8755: - Hello, I am Alexander Shopov and would like to lend a hand with this patch. I have updated it and I am attaching it to this bug report. Only 11 files are changed. I have incorporated Oded's recommendation and extended them as the fast path also works in cases like "\n". I have also replaced String.replaceAll with String.replace in cases where regex literals rather than meta characters are used - the regex is still constructed but this goes via special and faster 'LITERAL' path. I will be very surprised if this change buys measurable performance. If not - I think this issue can be closed. In case you decide to actually merge - please also add Mengyu Zhang in the commit message as he started this. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: trunk-8755.patch, trunk-8755.txt > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632508#comment-14632508 ] Robert Stupp commented on CASSANDRA-8755: - [~mzhang], sorry for the late response to your patch. Can you rebase the patch, incorporate Oded's recommendation and remove the reorganization of imports (grouping changes)? I can then setup a perf test to see whether this buys something. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: trunk-8755.patch > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526566#comment-14526566 ] Oded Peer commented on CASSANDRA-8755: -- Java's String.split() method has a fast-path for single character input, avoiding costly regexp creation. See http://stackoverflow.com/a/11002374/248656. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > Attachments: trunk-8755.patch > > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346616#comment-14346616 ] Jaroslav Kamenik commented on CASSANDRA-8755: - Hi, I have meant simple cases as String zone_parts[] = zone.split("-"); String res = input.replaceAll(":", ":"); where regexp is normal string/even one char, so usage of regexp is overkill. > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327335#comment-14327335 ] Lovekesh Garg commented on CASSANDRA-8755: -- Hello, I want to work on this issue can you give me some example of what to replace with what. Thanks > Replace trivial uses of String.replace/replaceAll/split with StringUtils > methods > > > Key: CASSANDRA-8755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 > Project: Cassandra > Issue Type: Improvement >Reporter: Jaroslav Kamenik >Priority: Trivial > Labels: lhf > > There are places in the code where those regex based methods are used with > plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)