[jira] [Created] (KUDU-2236) org.apache.kudu.client.TestKuduClient flaky
Edward Fancher created KUDU-2236:

             Summary: org.apache.kudu.client.TestKuduClient flaky
                 Key: KUDU-2236
                 URL: https://issues.apache.org/jira/browse/KUDU-2236
             Project: Kudu
          Issue Type: Bug
          Components: test
    Affects Versions: 1.6.0
            Reporter: Edward Fancher

Last seen in org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen

DEBUG - Could not login via JAAS. Using no credentials: Unable to obtain Principal Name for authentication
DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
DEBUG - Learned about tablet Kudu Master for table 'Kudu Master' with partition [, )
DEBUG - Releasing all remaining resources
DEBUG - [peer master-127.63.177.1:64030] cleaning up while in state READY due to: connection disconnected
INFO - W1206 07:14:39.727399 16334 connection.cc:511] server connection from 127.63.177.1:43497 recv error: Network error: recv error: Connection reset by peer (error 104)
DEBUG - [peer master-127.63.177.1:64034] cleaning up while in state NEGOTIATING due to: connection disconnected
WARN - Error receiving response from 127.63.177.1:64030
org.apache.kudu.client.RecoverableException: connection disconnected
  at org.apache.kudu.client.Connection.channelDisconnected(Connection.java:244)
  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelDisconnected(SimpleChannelUpstreamHandler.java:208)
  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
  at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
  at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
  at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
  at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
  at org.jboss.netty.channel.Channels$4.run(Channels.java:386)
  at org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40)
  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
  at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
  at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
  at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
  at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
  at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
INFO - W1206 07:14:39.741600 17527 negotiation.cc:311] Failed RPC negotiation. Trace:
INFO - 1206 07:14:39.695892 (+ 0us) reactor.cc:499] Submitting negotiation task for server connection from 127.63.177.1:48551
INFO - 1206 07:14:39.722215 (+ 26323us) server_negotiation.cc:173] Beginning negotiation
INFO - 1206 07:14:39.722236 (+21us) server_negotiation.cc:361] Waiting for connection header
DEBUG - [peer master-127.63.177.1:64032] cleaning up while in state READY due to: connection disconnected
WARN - Error receiving response from 127.63.177.1:64034
org.apache.kudu.client.RecoverableException: connection disconnected
  at org.apache.kudu.client.Connection.channelDisconnected(Connection.java:244)
  at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
  at org.apache.kudu.client.Co

[jira] [Created] (KUDU-2216) Post process gtest generated xml to include the output from the *.txt files
Edward Fancher created KUDU-2216:

             Summary: Post process gtest generated xml to include the output from the *.txt files
                 Key: KUDU-2216
                 URL: https://issues.apache.org/jira/browse/KUDU-2216
             Project: Kudu
          Issue Type: Improvement
          Components: test
            Reporter: Edward Fancher
            Assignee: Edward Fancher

Match each .txt file with its corresponding .xml file and insert a section that Jenkins can parse (the system-out node, see https://github.com/junit-team/junit5/blob/master/platform-tests/src/test/resources/jenkins-junit.xsd). Given test output such as:

[ RUN ] HybridClockTest.TestIsAfter
[ OK ] HybridClockTest.TestIsAfter (0 ms)

place the "..." output into the JUnit XML file accordingly.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
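The matching step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script the issue eventually produced: the function names `split_log_by_test` and `attach_output` are hypothetical, the XML is assumed to use the usual gtest `testcase` elements with `classname` and `name` attributes, and the text copied into `system-out` is assumed to run from a test's RUN marker through its OK or FAILED marker.

```python
import re
import xml.etree.ElementTree as ET

# gtest prints "[ RUN      ] Suite.Test" before a test and
# "[       OK ] Suite.Test (0 ms)" or "[  FAILED  ] Suite.Test (0 ms)" after it.
RUN_RE = re.compile(r"^\[\s*RUN\s*\]\s+([^\s,]+)")
END_RE = re.compile(r"^\[\s*(?:OK|FAILED)\s*\]\s+([^\s,]+)")


def split_log_by_test(log_text):
    """Map 'Suite.Test' to the log lines between its RUN and OK/FAILED markers (inclusive)."""
    sections = {}
    current, lines = None, []
    for line in log_text.splitlines():
        started = RUN_RE.match(line)
        if started:
            current, lines = started.group(1), [line]
            continue
        if current is None:
            continue  # output outside any test (hypothetical choice: ignore it)
        lines.append(line)
        ended = END_RE.match(line)
        if ended and ended.group(1) == current:
            sections[current] = "\n".join(lines)
            current = None
    if current is not None:
        # The test never printed OK/FAILED (for example, it crashed); keep what was logged.
        sections[current] = "\n".join(lines)
    return sections


def attach_output(xml_text, log_text):
    """Return the XML with a <system-out> child added to every testcase found in the log."""
    root = ET.fromstring(xml_text)
    sections = split_log_by_test(log_text)
    for case in root.iter("testcase"):
        key = "{}.{}".format(case.get("classname"), case.get("name"))
        if key in sections and case.find("system-out") is None:
            ET.SubElement(case, "system-out").text = sections[key]
    return ET.tostring(root, encoding="unicode")
```

Calling `attach_output(xml_text, log_text)` with the contents of a matching .xml and .txt pair returns the merged XML as a string; reading and writing the files is left to the caller.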
[jira] [Closed] (KUDU-2171) Add IWYU to kudu/build-support/jenkins/build-and-test.sh
[ https://issues.apache.org/jira/browse/KUDU-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher closed KUDU-2171.
    Resolution: Not A Bug

My mistake. I was running on an older version.

> Add IWYU to kudu/build-support/jenkins/build-and-test.sh
>
>                 Key: KUDU-2171
>                 URL: https://issues.apache.org/jira/browse/KUDU-2171
>             Project: Kudu
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 1.4.0
>            Reporter: Edward Fancher
>            Assignee: Edward Fancher
>            Priority: Minor
>
> IWYU isn't present in build and test, which makes it difficult to run
> consistently from Jenkins.
> It should be added in a way similar to lint builds.

[jira] [Updated] (KUDU-2171) Add IWYU to kudu/build-support/jenkins/build-and-test.sh
[ https://issues.apache.org/jira/browse/KUDU-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher updated KUDU-2171:
    Description:
IWYU isn't present in build and test, which makes it difficult to run consistently from Jenkins.
It should be added in a way similar to lint builds.

> Add IWYU to kudu/build-support/jenkins/build-and-test.sh
>
>                 Key: KUDU-2171
>                 URL: https://issues.apache.org/jira/browse/KUDU-2171
>             Project: Kudu
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 1.4.0
>            Reporter: Edward Fancher
>            Assignee: Edward Fancher
>            Priority: Minor
>
> IWYU isn't present in build and test, which makes it difficult to run
> consistently from Jenkins.
> It should be added in a way similar to lint builds.

[jira] [Created] (KUDU-2171) Add IWYU to kudu/build-support/jenkins/build-and-test.sh
Edward Fancher created KUDU-2171:

             Summary: Add IWYU to kudu/build-support/jenkins/build-and-test.sh
                 Key: KUDU-2171
                 URL: https://issues.apache.org/jira/browse/KUDU-2171
             Project: Kudu
          Issue Type: Improvement
          Components: build
    Affects Versions: 1.4.0
            Reporter: Edward Fancher
            Assignee: Edward Fancher
            Priority: Minor

[jira] [Commented] (KUDU-2105) Create single node stress test framework
[ https://issues.apache.org/jira/browse/KUDU-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133505#comment-16133505 ]

Edward Fancher commented on KUDU-2105:

Ah, after our last 1-1 I was under the impression that this was something that was needed. If not, I'll close this.

On Fri, Aug 18, 2017 at 11:51 AM, Todd Lipcon (JIRA)

--
Ed

> Create single node stress test framework
>
>                 Key: KUDU-2105
>                 URL: https://issues.apache.org/jira/browse/KUDU-2105
>             Project: Kudu
>          Issue Type: Test
>          Components: perf
>            Reporter: Edward Fancher
>            Assignee: Edward Fancher
>
> It would be useful to have single node stress test support.
> 1. To allow node density testing without multi node cluster support.
> 2. To allow general perf testing without multi node cluster support.
> 3. To be able to run performance tests which have less variance (due to
> network noise).
> 4. (Nice, but not needed) To be able to simulate various characteristics,
> such as network reliability and RPC call failures, under stress without
> needing to modify processes on other nodes.

[jira] [Created] (KUDU-2105) Create single node stress test framework
Edward Fancher created KUDU-2105:

             Summary: Create single node stress test framework
                 Key: KUDU-2105
                 URL: https://issues.apache.org/jira/browse/KUDU-2105
             Project: Kudu
          Issue Type: Test
          Components: perf
            Reporter: Edward Fancher
            Assignee: Edward Fancher

It would be useful to have single node stress test support.
1. To allow node density testing without multi node cluster support.
2. To allow general perf testing without multi node cluster support.
3. To be able to run performance tests which have less variance (due to network noise).
4. (Nice, but not needed) To be able to simulate various characteristics, such as network reliability and RPC call failures, under stress without needing to modify processes on other nodes.

[jira] [Assigned] (KUDU-2100) Verify Java client's behavior for tserver and master fail-over scenario
[ https://issues.apache.org/jira/browse/KUDU-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher reassigned KUDU-2100:
    Assignee: Edward Fancher

> Verify Java client's behavior for tserver and master fail-over scenario
>
>                 Key: KUDU-2100
>                 URL: https://issues.apache.org/jira/browse/KUDU-2100
>             Project: Kudu
>          Issue Type: Test
>            Reporter: Alexey Serbin
>            Assignee: Edward Fancher
>
> This is to introduce a scenario where both the leader tserver and leader
> master 'unexpectedly crash' during the run. The idea is to verify that the
> client automatically updates its metacache even if the leader master changes
> and manages to send the data to the destination server eventually.
> Mike suggested the following test scenario:
> # Have a configuration with 3 master servers, 6 tablet servers, and a table
> consisting of 1 tablet with replication factor of 3. Let's assume the
> tablet's replicas are hosted by tablet servers TS1, TS2, and TS3.
> # Start the Kudu cluster.
> # Run the client to insert at least one row into the table.
> # Stop the client's activity, but keep the client object alive to keep it
> ready for the next steps.
> # 3 times: permanently kill the leader of the tablet, so the tablet
> eventually migrates to and is hosted by tablet servers TS4, TS5, TS6.
> # Kill the leader master (after the configuration change is committed).
> # Run the pre-warmed client to insert some data into the table again. Doing
> so, the client should refresh its metadata from the new leader master and be
> able to send the data to the right destination.
> # Count the number of rows in the table to make sure it matches the
> expectation.
> There was a discussion on when to kill the leader master: before or after
> moving the table to the new set of tablet servers. It seems the latter case
> (the sequence suggested above) allows covering a situation when no master
> server recognizes itself as a leader. The client should retry in that case
> as well and eventually receive the tablet location info from the established
> leader master. If possible, let's implement the sequence for the former case
> as well, as an additional test.
> The general idea is to make sure that, during fail-over events, the Java
> client:
> * Retries write and read operations automatically after an error caused by
> a fail-over event.
> * Does not silently lose any data: if the client cannot send the data due to
> a timeout or running out of retry attempts, it should report that.

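The two expectations listed at the end of the issue (retry automatically, never drop data silently) can be illustrated with a small Python sketch. It is not Kudu client code: `RecoverableError`, `WriteNotDelivered`, and `write_with_retries` are hypothetical names chosen for this example, and the attempt limit, timeout, and backoff values are arbitrary assumptions.

```python
import time


class RecoverableError(Exception):
    """Hypothetical stand-in for a transient error caused by a fail-over event."""


class WriteNotDelivered(Exception):
    """Raised so the caller learns that the data was NOT written."""


def write_with_retries(send, row, max_attempts=5, timeout_s=10.0, backoff_s=0.01,
                       clock=time.monotonic, sleep=time.sleep):
    """Call send(row), retrying on RecoverableError until attempts or time run out."""
    deadline = clock() + timeout_s
    last_error = None
    attempts = 0
    while attempts < max_attempts and clock() < deadline:
        attempts += 1
        try:
            return send(row)
        except RecoverableError as err:
            last_error = err
            # Exponential backoff, capped so we never sleep past the deadline.
            remaining = max(0.0, deadline - clock())
            sleep(min(backoff_s * 2 ** (attempts - 1), remaining))
    # Never return quietly here: giving up must be visible to the caller.
    raise WriteNotDelivered(
        "row {!r} not written after {} attempt(s): {}".format(row, attempts, last_error))
```

A test built on this idea would count only the rows for which `write_with_retries` returned normally, then compare that count with the number of rows found in the table after the fail-over sequence.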
[jira] [Assigned] (KUDU-2033) Add a 'torture' scenario to verify Java client's behavior during fail-over
[ https://issues.apache.org/jira/browse/KUDU-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher reassigned KUDU-2033:
    Assignee: Edward Fancher

> Add a 'torture' scenario to verify Java client's behavior during fail-over
>
>                 Key: KUDU-2033
>                 URL: https://issues.apache.org/jira/browse/KUDU-2033
>             Project: Kudu
>          Issue Type: Test
>          Components: client, java
>            Reporter: Alexey Serbin
>            Assignee: Edward Fancher
>              Labels: newbie, newbie++
>
> For the Kudu Java client we have the {{TestLeaderFailover}} test, which
> verifies how the client handles the tablet server fail-over scenario.
> However, the test covers only one fail-over event and mainly performs write
> operations while the backend handles the 'unexpected crash' of the tablet
> server.
> It would be nice to add more tests which cover the client's fail-over
> behavior:
> * add the mixed workload scenario, i.e. combine inserts/scans during the
> fail-over
> * induce more fail-over events while running the scenario, i.e. pause and
> then resume the tserver processes many more times and run the test longer
> * add the multi-master scenario, where both the leader tserver and leader
> master 'unexpectedly crash' during the run
> * in the mixed workload scenarios, run scan operations in READ_AT_TIMESTAMP
> mode to exercise RYW (Read-Your-Writes) behavior and add assertions to make
> sure the RYW behavior is observed as expected

[jira] [Updated] (KUDU-1932) Run at least one tablet-level test against all block managers
[ https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher updated KUDU-1932:
    Affects Version/s: 1.4.0

> Run at least one tablet-level test against all block managers
>
>                 Key: KUDU-1932
>                 URL: https://issues.apache.org/jira/browse/KUDU-1932
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Adar Dembo
>            Assignee: Edward Fancher
>             Fix For: 1.5.0
>
> Even though the block manager tests provide good overall coverage (and
> block_manager-stress-test even approximates a flush-like workload), it would
> still be good to make sure all block managers are exposed to flush/compact
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is
> an example of an FBM-only bug that was hidden due to a lack of appropriate
> coverage in the block manager tests.
> mt-tablet-test is a good candidate.

[jira] [Updated] (KUDU-1932) Run at least one tablet-level test against all block managers
[ https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher updated KUDU-1932:
    Code Review: https://gerrit.cloudera.org/#/c/7252/

> Run at least one tablet-level test against all block managers
>
>                 Key: KUDU-1932
>                 URL: https://issues.apache.org/jira/browse/KUDU-1932
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Adar Dembo
>            Assignee: Edward Fancher
>             Fix For: 1.5.0
>
> Even though the block manager tests provide good overall coverage (and
> block_manager-stress-test even approximates a flush-like workload), it would
> still be good to make sure all block managers are exposed to flush/compact
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is
> an example of an FBM-only bug that was hidden due to a lack of appropriate
> coverage in the block manager tests.
> mt-tablet-test is a good candidate.

[jira] [Resolved] (KUDU-1932) Run at least one tablet-level test against all block managers
[ https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher resolved KUDU-1932.
       Resolution: Fixed
    Fix Version/s: 1.5.0

> Run at least one tablet-level test against all block managers
>
>                 Key: KUDU-1932
>                 URL: https://issues.apache.org/jira/browse/KUDU-1932
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.3.0
>            Reporter: Adar Dembo
>            Assignee: Edward Fancher
>             Fix For: 1.5.0
>
> Even though the block manager tests provide good overall coverage (and
> block_manager-stress-test even approximates a flush-like workload), it would
> still be good to make sure all block managers are exposed to flush/compact
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is
> an example of an FBM-only bug that was hidden due to a lack of appropriate
> coverage in the block manager tests.
> mt-tablet-test is a good candidate.

[jira] [Assigned] (KUDU-1932) Run at least one tablet-level test against all block managers
[ https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Fancher reassigned KUDU-1932:
    Assignee: Edward Fancher

> Run at least one tablet-level test against all block managers
>
>                 Key: KUDU-1932
>                 URL: https://issues.apache.org/jira/browse/KUDU-1932
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.3.0
>            Reporter: Adar Dembo
>            Assignee: Edward Fancher
>
> Even though the block manager tests provide good overall coverage (and
> block_manager-stress-test even approximates a flush-like workload), it would
> still be good to make sure all block managers are exposed to flush/compact
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is
> an example of an FBM-only bug that was hidden due to a lack of appropriate
> coverage in the block manager tests.
> mt-tablet-test is a good candidate.