[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2021-02-04 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279285#comment-17279285
 ] 

Yun Tang commented on FLINK-15318:
--

[~maguowei] These tests were dropped in FLINK-18373, so I will close this ticket; 
it is fixed as of 1.12.0.

> RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le
> ---
>
> Key: FLINK-15318
> URL: https://issues.apache.org/jira/browse/FLINK-15318
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks, Runtime / State Backends
> Environment: arch: ppc64le
> os: rhel7.6, ubuntu 18.04
> jdk: 8, 11
> mvn: 3.3.9, 3.6.2
>Reporter: Siddhesh Ghadi
>Priority: Major
> Attachments: surefire-report.txt
>
>
> RocksDBWriteBatchPerformanceTest.benchMark fails due to TestTimedOut; however, 
> when the test timeout is increased from 2s to 5s in 
> org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java:75,
> it passes. Is this an acceptable solution?
> Note: Tests are run inside a container.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2021-02-03 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278547#comment-17278547
 ] 

Guowei Ma commented on FLINK-15318:
---

Another case:

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12891=logs=f0ac5c25-1168-55a5-07ff-0e88223afed9=39a61cac-5c62-532f-d2c1-dea450a66708



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-06-18 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139349#comment-17139349
 ] 

Stephan Ewen commented on FLINK-15318:
--

Fair enough, I agree with your conclusion, Yun Tang.

+1 to drop these benchmark unit tests.



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-06-18 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139243#comment-17139243
 ] 

Yun Tang commented on FLINK-15318:
--

[~sewen] After digging into those tests, I think all of them could be dropped.

{{RocksDBListStatePerformanceTest}} targets the performance of the 
"stringappendtest" merge operator, which is already covered by 
[ListStateBenchmark#add|https://github.com/apache/flink-benchmarks/blob/8b449865cf733dbb3c01e997fe44b1a5b6f82cdc/src/main/java/org/apache/flink/state/benchmark/ListStateBenchmark.java#L118].

{{RocksDBWriteBatchPerformanceTest}} targets the performance of WriteBatch, 
which should be covered by 
[MapStateBenchmark#mapPutAll|https://github.com/apache/flink-benchmarks/blob/8b449865cf733dbb3c01e997fe44b1a5b6f82cdc/src/main/java/org/apache/flink/state/benchmark/MapStateBenchmark.java#L160].

{{RocksDBPerformanceTest}} targets the performance of merge and of iterator seek 
and next, which are already covered by 
[ListStateBenchmark#add|https://github.com/apache/flink-benchmarks/blob/8b449865cf733dbb3c01e997fe44b1a5b6f82cdc/src/main/java/org/apache/flink/state/benchmark/ListStateBenchmark.java#L118]
 and 
[MapStateBenchmark#mapIterator|https://github.com/apache/flink-benchmarks/blob/8b449865cf733dbb3c01e997fe44b1a5b6f82cdc/src/main/java/org/apache/flink/state/benchmark/MapStateBenchmark.java#L143].

Most importantly, a unit test cannot reveal performance issues clearly. If the 
execution time grows from 2 seconds to 2.5 seconds, the performance regression 
is about 25%, yet a timeout limit of 3 seconds cannot detect such a 
regression.
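
The arithmetic in the last paragraph can be made concrete with a small sketch. The class and method names are hypothetical and the 10% tolerance is an assumed value, not something taken from Flink or flink-benchmarks; the only point is that a fixed timeout lets through a run that a comparison against a baseline would flag.

```java
// Hypothetical sketch: a fixed timeout guard versus a baseline-relative check.
final class RegressionCheck {

    /** Relative slowdown versus the baseline, e.g. 0.25 for a 25% regression. */
    static double relativeSlowdown(long baselineMillis, long observedMillis) {
        return (observedMillis - baselineMillis) / (double) baselineMillis;
    }

    /** What a unit test with a timeout effectively checks. */
    static boolean passesTimeout(long observedMillis, long timeoutMillis) {
        return observedMillis <= timeoutMillis;
    }

    /** What a benchmark comparison checks; the tolerance is an assumed value. */
    static boolean passesRelative(long baselineMillis, long observedMillis, double tolerance) {
        return relativeSlowdown(baselineMillis, observedMillis) <= tolerance;
    }

    public static void main(String[] args) {
        long baseline = 2000, observed = 2500, timeout = 3000;
        System.out.println("slowdown = " + relativeSlowdown(baseline, observed));
        System.out.println("timeout guard passes: " + passesTimeout(observed, timeout));
        System.out.println("10% relative check passes: "
                + passesRelative(baseline, observed, 0.10));
    }
}
```

With the numbers from the comment (2s baseline, 2.5s observed, 3s timeout), the timeout guard passes while the relative check does not.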



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-06-16 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136697#comment-17136697
 ] 

Stephan Ewen commented on FLINK-15318:
--

The purpose of this test is to guard against the "quadratic concatenation 
complexity bug" that RocksDB had a few versions ago.
In that case, the benchmark took 50s or so. We can probably increase the timeout to 5s 
without a problem.

How about this?
  - we add it to the benchmarks suite to monitor regressions more precisely
  - we keep it in the codebase with a timeout of 5 seconds or so, as a rough 
guard
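
To illustrate why a coarse timeout is enough to catch this class of bug, here is a toy model. It is not RocksDB's merge code: it only counts the bytes copied when every append re-copies the whole accumulated value, compared with copying each element once. All names and the element size are made up for illustration.

```java
// Toy model only; this is not how RocksDB implements its merge operator.
final class ConcatCost {

    /** Bytes copied if every append rebuilds the whole value: O(n^2) overall. */
    static long quadraticCopies(int appends, int elementBytes) {
        long copied = 0;
        long currentSize = 0;
        for (int i = 0; i < appends; i++) {
            currentSize += elementBytes;
            copied += currentSize; // old content plus the new element is copied again
        }
        return copied;
    }

    /** Bytes copied if every element is written exactly once: O(n) overall. */
    static long linearCopies(int appends, int elementBytes) {
        return (long) appends * elementBytes;
    }

    public static void main(String[] args) {
        int appends = 10_000;
        int elementBytes = 8;
        System.out.println("quadratic: " + quadraticCopies(appends, elementBytes));
        System.out.println("linear:    " + linearCopies(appends, elementBytes));
    }
}
```

The gap grows with the number of appends, so a bug of this kind shows up as a slowdown by a large factor rather than by a few hundred milliseconds, which is why a generous limit can still serve as a rough guard.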



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-06-09 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128915#comment-17128915
 ] 

Robert Metzger commented on FLINK-15318:


I don't really have an opinion here, because I'm not very familiar with the 
RocksDB code. Let's wait for Stephan's response.

Another case in {{[ERROR]   
RocksDBPerformanceTest.testRocksDbRangeGetPerformance:146 » TestTimedOut test 
...}}: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2973=logs=3b6ec2fd-a816-5e75-c775-06fb87cb6670=2aff8966-346f-518f-e6ce-de64002a5034



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-06-08 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127940#comment-17127940
 ] 

Yun Tang commented on FLINK-15318:
--

Since we already have the [https://github.com/apache/flink-benchmarks] repo, I 
question the value of those {{RocksDB*PerformanceTest}}s; unit test 
performance is also easily affected by the state of the host they run on.

I would prefer to remove all of them ({{RocksDBListStatePerformanceTest}}, 
{{RocksDBWriteBatchPerformanceTest}} and {{RocksDBPerformanceTest}}); we 
could add cases to flink-benchmarks if we think some area is covered only 
by those tests.

What do you think of this [~sewen], [~rmetzger]?



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-06-02 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124616#comment-17124616
 ] 

Robert Metzger commented on FLINK-15318:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2587=logs=f0ac5c25-1168-55a5-07ff-0e88223afed9=39a61cac-5c62-532f-d2c1-dea450a66708



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-05-05 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099879#comment-17099879
 ] 

Robert Metzger commented on FLINK-15318:


Another instance: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=610=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=4ed44b66-cdd6-5dcf-5f6a-88b07dda665d



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-13 Thread Ronald O. Edmark (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014379#comment-17014379
 ] 

Ronald O. Edmark commented on FLINK-15318:
--

[~yunta] thanks again for helping.

 

This is a Red Hat 7.6 KVM hosted virtual machine, ppc64le with 16 GB of memory, 
8 GB swap, 4 cpus.

 

Currently our only solution is to change the performance write time-out from 2 
seconds to 3 seconds in 

```
./flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/benchmark/RocksDBWriteBatchPerformanceTest.java

@Test(timeout = 2000) change to @Test(timeout = 3000)
```
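
For context, a rough sketch of what a test timeout amounts to: the work runs on another thread and the caller waits at most the given number of milliseconds. This is an illustration built only on `java.util.concurrent`, not JUnit's actual implementation; `TimeoutGuard`, `finishesWithin` and `sleeper` are hypothetical names.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; illustrates a timeout guard, not JUnit internals.
final class TimeoutGuard {

    /** Returns true if the task completes within timeoutMillis, false otherwise. */
    static boolean finishesWithin(Runnable task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = executor.submit(task);
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the task was still running when the limit expired
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // The task itself failed; surface the original cause.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow(); // interrupts the task if it is still running
        }
    }

    /** Stand-in workload that simply sleeps for the given time. */
    static Runnable sleeper(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("fast task within 2000 ms: " + finishesWithin(sleeper(10), 2000));
        System.out.println("slow task within 50 ms:   " + finishesWithin(sleeper(2000), 50));
    }
}
```

A run that takes 2.1 seconds fails a 2000 ms limit and passes a 3000 ms one even though the code under test did not change, which is the behaviour described in this comment.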

 



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-13 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014361#comment-17014361
 ] 

Yun Tang commented on FLINK-15318:
--

[~redmark-ibm] thanks for your feedback. 

What's your hardware information?

It seems no one has ever compared the performance of RocksDB on comparable 
hardware on amd64 vs. ppc64le; we could open an issue in the RocksDB community. As for 
Flink, we could increase the timeout if a RocksDB performance issue on ppc64le 
is confirmed.
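
If such a slowdown were confirmed, one conceivable way to raise the limit only on that platform would be to pick it from the JVM's `os.arch` system property. The sketch below is hypothetical: the class name, the 2000/5000 values, and the assumption that the JVM reports `ppc64le` for this platform are illustrative only and are not something Flink does.

```java
// Hypothetical sketch: choose a test timeout based on the CPU architecture.
final class ArchTimeout {

    static final long DEFAULT_TIMEOUT_MILLIS = 2000;
    static final long PPC64LE_TIMEOUT_MILLIS = 5000; // assumed value

    /** Returns the longer limit for ppc64le and the default for everything else. */
    static long timeoutFor(String osArch) {
        // equalsIgnoreCase returns false for null, so a missing property gets the default.
        return "ppc64le".equalsIgnoreCase(osArch)
                ? PPC64LE_TIMEOUT_MILLIS
                : DEFAULT_TIMEOUT_MILLIS;
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        System.out.println(arch + " -> " + timeoutFor(arch) + " ms");
    }
}
```

Note that an annotation value such as the one in `@Test(timeout = ...)` must be a compile-time constant, so a computed limit like this would have to be applied by other means, for example a JUnit 4 `Timeout` rule.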



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-13 Thread Ronald O. Edmark (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014313#comment-17014313
 ] 

Ronald O. Edmark commented on FLINK-15318:
--

Yun, thank you for the commit changes.

I've modified the `pom` file, applied the commit changes and ran a clean test 
build of RocksDB, but we still see the failure. Changing the timeout from 2 to 3 
seconds does work around the problem.

 

Is 3 sec an acceptable timeout?

 

-Ron

```
[redmark@p006vm23 flink]$ mvn clean test -rf :flink-statebackend-rocksdb_2.11
[INFO] Scanning for projects...
..
[INFO]
[INFO] Building flink-statebackend-rocksdb 1.8.3
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:3.1.0:clean (default-clean) @ flink-statebackend-rocksdb_2.11 ---
[INFO] Deleting /root/flink/flink-state-backends/flink-statebackend-rocksdb/target
..
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.473 s <<< FAILURE! - in org.apache.flink.contrib.streaming.state.benchmark.RocksDBWriteBatchPerformanceTest
[ERROR] benchMark(org.apache.flink.contrib.streaming.state.benchmark.RocksDBWriteBatchPerformanceTest) Time elapsed: 2.073 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 2000 milliseconds
    at org.apache.flink.contrib.streaming.state.benchmark.RocksDBWriteBatchPerformanceTest.benchMarkHelper(RocksDBWriteBatchPerformanceTest.java:118)
    at org.apache.flink.contrib.streaming.state.benchmark.RocksDBWriteBatchPerformanceTest.benchMark(RocksDBWriteBatchPerformanceTest.java:96)
```

 



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-12 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013995#comment-17013995
 ] 

Yun Tang commented on FLINK-15318:
--

[~redmark-ibm] you can try this 
[commit|https://github.com/Myasuka/flink/commit/484fffb08620ab177175405c53d64faaeb585d01].



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-09 Thread Ronald O. Edmark (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012283#comment-17012283
 ] 

Ronald O. Edmark commented on FLINK-15318:
--

Yun,

I would like to point out that the test only just fails; changing the timeout from 2 
seconds to 3 seconds works. Most of the time the failures are just over 2 
seconds, e.g. *Time elapsed: 2.095*

I made these changes to the pom. Current Flink 1.8.3 has

   <dependency>
     <groupId>com.data-artisans</groupId>
     <artifactId>frocksdbjni</artifactId>
     <version>5.17.2-artisans-1.0</version>
   </dependency>

Changed to

   <dependency>
     <groupId>org.rocksdb</groupId>
     <artifactId>rocksdbjni</artifactId>
     <version>5.17.2</version>
   </dependency>

I worked on removing *org.rocksdb.FlinkCompactionFilter* and 
*org.rocksdb.FlinkCompactionFilter.FlinkCompactionFilterFactory*, but I was 
hitting issues getting them cleanly removed. If you can provide a modified 
version of *RocksDbTtlCompactFiltersManager.java*, that will help. Otherwise I'll 
work on it tomorrow when I have more time.

 

Thanks for helping,

Ron

 



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-08 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010885#comment-17010885
 ] 

Yun Tang commented on FLINK-15318:
--

[~redmark-ibm] can you retry with the official RocksDB to see whether the 
problem still occurs:
 * Edit {{flink-state-backends/flink-statebackend-rocksdb/pom.xml}} to use 
RocksDB instead of FRocksDB

{code:xml}
<dependency>
  <groupId>org.rocksdb</groupId>
  <artifactId>rocksdbjni</artifactId>
  <version>5.17.2</version>
</dependency>
{code}
 * Edit 
{{flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/ttl/RocksDbTtlCompactFiltersManager.java}}
 to drop all usage of {{org.rocksdb.FlinkCompactionFilter}} and 
{{org.rocksdb.FlinkCompactionFilter.FlinkCompactionFilterFactory}}. Removing them 
will not prevent you from running that test.

By doing so, you can verify whether the problem still exists with the official 
RocksDB.
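
After swapping the dependency, one quick way to check which library ended up on the classpath is to probe for the FRocksDB-only class named above. This is a sketch under the assumption, taken from this thread, that {{org.rocksdb.FlinkCompactionFilter}} is absent from the official RocksDB jar; {{ClasspathProbe}} is a hypothetical helper, not part of Flink.

```java
// Hypothetical helper: checks whether a class can be found on the classpath.
final class ClasspathProbe {

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption from this thread: this class exists only in FRocksDB.
        boolean frocksdb = isClassPresent("org.rocksdb.FlinkCompactionFilter");
        System.out.println(frocksdb
                ? "FlinkCompactionFilter found: FRocksDB appears to be on the classpath"
                : "FlinkCompactionFilter not found");
    }
}
```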



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2020-01-07 Thread Ronald O. Edmark (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009686#comment-17009686
 ] 

Ronald O. Edmark commented on FLINK-15318:
--

 I'm hitting the same problem in Flink 1.8.3. Did anyone find a fix for this 
issue?  I have a ppc64le environment to help debug the issue.

 

Red Hat 7.6 Linux ppc64le

Java 1.8.0.232

Maven 3.2.5



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2019-12-24 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002716#comment-17002716
 ] 

Yun Tang commented on FLINK-15318:
--

[~siddheshghadi] I noticed that you also came across this in release-1.8, which 
uses an older version of FRocksDB. In my experience, FRocksDB performs worse 
on the ppc64le platform than on other platforms, and I have not actually met 
anyone using Flink in production in a ppc64le environment.

In a nutshell, the timeout for FRocksDB is not long enough on the ppc64le 
platform. Do you use Flink in production on ppc64le? I am afraid the Flink 
community lacks experience with ppc64le, especially regarding FRocksDB 
performance. By the way, can you try RocksDB instead of FRocksDB to run the 
tests? (Remember to remove all usage of {{org.rocksdb.FlinkCompactionFilter}} 
so that you can build with the official RocksDB.)



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2019-12-19 Thread Siddhesh Ghadi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17000605#comment-17000605
 ] 

Siddhesh Ghadi commented on FLINK-15318:


I verified it with the master, release-1.10, release-1.9 and release-1.8 branches; 
RocksDBWriteBatchPerformanceTest.benchMark fails on all of them. Also, I 
came across this error for the first time when I tried it on ppc64le.



[jira] [Commented] (FLINK-15318) RocksDBWriteBatchPerformanceTest.benchMark fails on ppc64le

2019-12-18 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999114#comment-16999114
 ] 

Yun Tang commented on FLINK-15318:
--

Which version of Flink did you verify? Did you observe stable performance 
before and then a sudden failure after syncing with new commits, or did you 
come across this error for the first time when you tried it on the ppc64le 
platform?

Actually, the Flink community lacks benchmarks for the ppc64le environment, and I 
have noticed that RocksDB on ppc64le does not perform as well as on linux64.
