[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2023-04-21 Thread via GitHub
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1517460306 I would like to inquire whether the patch I submitted is eligible for merging into the codebase. I understand that there may be concerns or issues that need to be addressed before

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2023-04-17 Thread via GitHub
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1511047557 > https://user-images.githubusercontent.com/8748814/204439049-53f0bd4f-9ea0-4289-8268-d16aef5b4334.png > > @lyy-pineapple Would you share the test sql pattern? I test some

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2023-04-17 Thread via GitHub
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1511045850 > https://user-images.githubusercontent.com/8748814/204439049-53f0bd4f-9ea0-4289-8268-d16aef5b4334.png > > @lyy-pineapple Would you share the test sql pattern? I test some

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2023-04-13 Thread via GitHub
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1507832785 > Any new developments in this PR? I reworked the unit tests to make it easier to compare results between the two regular expression engines. -- This is an automated message from the

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2023-03-30 Thread via GitHub
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1489987177 > https://user-images.githubusercontent.com/8748814/204439049-53f0bd4f-9ea0-4289-8268-d16aef5b4334.png > > @lyy-pineapple Would you share the test sql pattern? I test some

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2023-03-30 Thread via GitHub
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1489985307 > `joni` seems to be used only in the HBase client, not in the HBase server or HBase common. > > * https://mvnrepository.com/artifact/org.apache.hbase/hbase-client/2.5.3 >

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-12-26 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1365029419 > https://user-images.githubusercontent.com/8748814/204439049-53f0bd4f-9ea0-4289-8268-d16aef5b4334.png > > @lyy-pineapple Would you share the test sql pattern? I test some

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-11-02 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1299657792 > How much confidence do we have in joni? Is it widely adopted by other open-source projects? I'm a bit concerned about moving away from JDK regex and picking a project that I just

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-11-01 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1299505071 Added a new benchmark that compares results with Java 11 and Java 17. cc @cloud-fan @LuciferYang -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-10-13 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1277442557 Does Spark have any data that is suitable for a regular expression matching benchmark? @LuciferYang @cloud-fan

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-10-11 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1274563298 > sql test is better, but simple test is OK Hi, I ran two benchmarks: one on simple data and one on https://github.com/mariomka/regex-benchmark/blob/master/input-text.txt. cc @LuciferYang

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-10-10 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1272891669 > sql test is better, but simple test is OK ![image](https://user-images.githubusercontent.com/46274164/194816709-980e5062-2d05-4e95-b0bc-d83e37a86555.png) Can I add this

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-10-10 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1272834965 > Can you also add a related micro-benchmark for Spark? If I use SqlBasedBenchmark for the test, I don't know how to create a dataset and override the regular expression matching rules. Do you

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-10-09 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1272746644 > @lyy-pineapple please run `./dev/test-dependencies.sh --replace-manifest` locally and add the changed `spark-deps-hadoop-x-hive-2.3` files to this pr Thanks, I have done it.