[jira] [Created] (KUDU-2236) org.apache.kudu.client.TestKuduClient flaky

2017-12-06 Thread Edward Fancher (JIRA)
Edward Fancher created KUDU-2236:


 Summary: org.apache.kudu.client.TestKuduClient flaky
 Key: KUDU-2236
 URL: https://issues.apache.org/jira/browse/KUDU-2236
 Project: Kudu
  Issue Type: Bug
  Components: test
Affects Versions: 1.6.0
Reporter: Edward Fancher


Last seen in org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen

DEBUG - Could not login via JAAS. Using no credentials: Unable to obtain Principal Name for authentication
DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
DEBUG - SASL mechanism PLAIN chosen for peer 127.63.177.1
DEBUG - Learned about tablet Kudu Master for table 'Kudu Master' with partition [, )
DEBUG - Releasing all remaining resources
DEBUG - [peer master-127.63.177.1:64030] cleaning up while in state READY due to: connection disconnected
INFO - W1206 07:14:39.727399 16334 connection.cc:511] server connection from 127.63.177.1:43497 recv error: Network error: recv error: Connection reset by peer (error 104)
DEBUG - [peer master-127.63.177.1:64034] cleaning up while in state NEGOTIATING due to: connection disconnected
WARN - Error receiving response from 127.63.177.1:64030
org.apache.kudu.client.RecoverableException: connection disconnected
 at org.apache.kudu.client.Connection.channelDisconnected(Connection.java:244)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
 at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelDisconnected(SimpleChannelUpstreamHandler.java:208)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
 at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
 at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
 at org.jboss.netty.channel.Channels$4.run(Channels.java:386)
 at org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40)
 at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
 at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
 at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
 at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
 at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
 at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

INFO - W1206 07:14:39.741600 17527 negotiation.cc:311] Failed RPC negotiation. Trace:
INFO - 1206 07:14:39.695892 (+ 0us) reactor.cc:499] Submitting negotiation task for server connection from 127.63.177.1:48551
INFO - 1206 07:14:39.722215 (+ 26323us) server_negotiation.cc:173] Beginning negotiation
INFO - 1206 07:14:39.722236 (+21us) server_negotiation.cc:361] Waiting for connection header
DEBUG - [peer master-127.63.177.1:64032] cleaning up while in state READY due to: connection disconnected
WARN - Error receiving response from 127.63.177.1:64034
org.apache.kudu.client.RecoverableException: connection disconnected
 at org.apache.kudu.client.Connection.channelDisconnected(Connection.java:244)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
 at org.apache.kudu.client.Co

[jira] [Created] (KUDU-2216) Post process gtest generated xml to include the output from the *.txt files

2017-11-15 Thread Edward Fancher (JIRA)
Edward Fancher created KUDU-2216:


 Summary: Post process gtest generated xml to include the output 
from the *.txt files
 Key: KUDU-2216
 URL: https://issues.apache.org/jira/browse/KUDU-2216
 Project: Kudu
  Issue Type: Improvement
  Components: test
Reporter: Edward Fancher
Assignee: Edward Fancher


So, match up each .txt file with its corresponding .xml file and insert its 
output as a section parseable by Jenkins (system-out node, see 
https://github.com/junit-team/junit5/blob/master/platform-tests/src/test/resources/jenkins-junit.xsd):
[ RUN  ] HybridClockTest.TestIsAfter

[   OK ] HybridClockTest.TestIsAfter (0 ms)

and then shove the "..." output into the junit xml file accordingly:
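The XML example that followed is missing from this archived copy. As a rough illustration of the idea only, here is a minimal Python sketch, not the script actually written for this issue. It assumes the .txt log uses gtest's usual "[ RUN ]" and "[ OK ]" / "[ FAILED ]" markers, that the XML uses gtest's testcase elements with name and classname attributes, and that system-out is attached to each testcase; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# gtest prints "[ RUN      ] Suite.Test" before each test and
# "[       OK ] Suite.Test (N ms)" or "[  FAILED  ] Suite.Test (N ms)" after it.
RUN_RE = re.compile(r"^\[\s*RUN\s*\]\s+(\S+)\s*$")
END_RE = re.compile(r"^\[\s*(?:OK|FAILED)\s*\]\s+(\S+)")


def split_log_by_test(log_text):
    """Map 'Suite.Test' to the text logged between its RUN and OK/FAILED lines."""
    outputs = {}
    current = None
    for line in log_text.splitlines():
        run = RUN_RE.match(line)
        end = END_RE.match(line)
        if run:
            current = run.group(1)
            outputs[current] = []
        elif end and end.group(1) == current:
            current = None
        elif current is not None:
            outputs[current].append(line)
    return {name: "\n".join(lines) for name, lines in outputs.items()}


def add_system_out(xml_text, log_text):
    """Return the XML with a <system-out> child added to each matching <testcase>."""
    outputs = split_log_by_test(log_text)
    root = ET.fromstring(xml_text)
    for case in root.iter("testcase"):
        # Assumption: classname + "." + name equals the name printed in the log.
        key = "{}.{}".format(case.get("classname"), case.get("name"))
        if key in outputs:
            ET.SubElement(case, "system-out").text = outputs[key]
    return ET.tostring(root, encoding="unicode")
```

A wrapper would read each foo.xml and foo.txt pair from the test output directory, call add_system_out, and write the result back before Jenkins collects the reports.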





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (KUDU-2171) Add IWYU to kudu/build-support/jenkins/build-and-test.sh

2017-10-05 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher closed KUDU-2171.

Resolution: Not A Bug

My mistake. I was running on an older version. 

> Add IWYU to kudu/build-support/jenkins/build-and-test.sh
> 
>
> Key: KUDU-2171
> URL: https://issues.apache.org/jira/browse/KUDU-2171
> Project: Kudu
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 1.4.0
>Reporter: Edward Fancher
>Assignee: Edward Fancher
>Priority: Minor
>
> IWYU isn't present in build-and-test.sh, which makes it difficult to run 
> consistently from Jenkins.
> It should be added in a way similar to lint builds.





[jira] [Updated] (KUDU-2171) Add IWYU to kudu/build-support/jenkins/build-and-test.sh

2017-10-05 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher updated KUDU-2171:
-
Description: 
IWYU isn't present in build-and-test.sh, which makes it difficult to run 
consistently from Jenkins.

It should be added in a way similar to lint builds.

> Add IWYU to kudu/build-support/jenkins/build-and-test.sh
> 
>
> Key: KUDU-2171
> URL: https://issues.apache.org/jira/browse/KUDU-2171
> Project: Kudu
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 1.4.0
>Reporter: Edward Fancher
>Assignee: Edward Fancher
>Priority: Minor
>
> IWYU isn't present in build-and-test.sh, which makes it difficult to run 
> consistently from Jenkins.
> It should be added in a way similar to lint builds.





[jira] [Created] (KUDU-2171) Add IWYU to kudu/build-support/jenkins/build-and-test.sh

2017-10-05 Thread Edward Fancher (JIRA)
Edward Fancher created KUDU-2171:


 Summary: Add IWYU to kudu/build-support/jenkins/build-and-test.sh
 Key: KUDU-2171
 URL: https://issues.apache.org/jira/browse/KUDU-2171
 Project: Kudu
  Issue Type: Improvement
  Components: build
Affects Versions: 1.4.0
Reporter: Edward Fancher
Assignee: Edward Fancher
Priority: Minor








[jira] [Commented] (KUDU-2105) Create single node stress test framework

2017-08-18 Thread Edward Fancher (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133505#comment-16133505
 ] 

Edward Fancher commented on KUDU-2105:
--

Ah, after our last 1-1 I was under the impression that this was something
that was needed. If not, I'll close this.

On Fri, Aug 18, 2017 at 11:51 AM, Todd Lipcon (JIRA) 




-- 
Ed


> Create single node stress test framework
> 
>
> Key: KUDU-2105
> URL: https://issues.apache.org/jira/browse/KUDU-2105
> Project: Kudu
>  Issue Type: Test
>  Components: perf
>Reporter: Edward Fancher
>Assignee: Edward Fancher
>
> It would be useful to have single node stress test support.
> 1. To allow node density testing without multi node cluster support.
> 2. To allow general perf testing without multi node cluster support.
> 3. To be able to run performance tests which have less variance (by avoiding 
> network noise).
> 4. (nice, but not needed) be able to simulate various characteristics such as 
> network reliability, rpc call failures, under stress without needing to 
> modify processes on other nodes.





[jira] [Created] (KUDU-2105) Create single node stress test framework

2017-08-18 Thread Edward Fancher (JIRA)
Edward Fancher created KUDU-2105:


 Summary: Create single node stress test framework
 Key: KUDU-2105
 URL: https://issues.apache.org/jira/browse/KUDU-2105
 Project: Kudu
  Issue Type: Test
  Components: perf
Reporter: Edward Fancher
Assignee: Edward Fancher


It would be useful to have single node stress test support.
1. To allow node density testing without multi node cluster support.
2. To allow general perf testing without multi node cluster support.
3. To be able to run performance tests which have less variance (by avoiding 
network noise).
4. (nice, but not needed) be able to simulate various characteristics such as 
network reliability, rpc call failures, under stress without needing to modify 
processes on other nodes.





[jira] [Assigned] (KUDU-2100) Verify Java client's behavior for tserver and master fail-over scenario

2017-08-15 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher reassigned KUDU-2100:


Assignee: Edward Fancher

> Verify Java client's behavior for tserver and master fail-over scenario
> ---
>
> Key: KUDU-2100
> URL: https://issues.apache.org/jira/browse/KUDU-2100
> Project: Kudu
>  Issue Type: Test
>Reporter: Alexey Serbin
>Assignee: Edward Fancher
>
> This is to introduce a scenario where both the leader tserver and leader 
> master 'unexpectedly crash' during the run. The idea is to verify that the 
> client automatically updates its metacache even if the leader master changes 
> and manages to send the data to the destination server eventually.
> Mike suggested the following test scenario:
> # Have a configuration with 3 master servers, 6 tablet servers, and a table 
> consisting of 1 tablet with replication factor of 3.  Let's assume the tablet's 
> replicas are hosted by tablet servers TS1, TS2, and TS3.
> # Start the Kudu cluster.
> # Run the client to insert at least one row into the table.
> # Stop the client's activity, but keep the client object alive to keep it 
> ready for the next steps.
> # 3 times: permanently kill the leader of the tablet, so the tablet 
> eventually migrates to and is hosted by tablet servers TS4, TS5, TS6.
> # Kill the leader master (after the configuration change is committed).
> # Run the pre-warmed client to insert some data into the table again.  Doing 
> so, the client should refresh its metadata from the new leader master and be 
> able to send the data to the right destination.
> # Count the number of rows in the table to make sure it matches the 
> expectation.
> There was a discussion on when to kill the leader master: before or after 
> moving the table to the new set of tablet servers.  It seems the latter case 
> (the sequence suggested above) allows covering a situation when no master 
> server recognizes itself as a leader.  The client should retry in that case 
> as well and eventually receive the tablet location info from the established 
> leader master.  If possible, let's implement the sequence for the former case 
> as well as an additional test.
> The general idea is to make sure the Java client during fail-over events:
> * Retries write and read operations automatically after an error caused by 
> a fail-over event.
> * Does not silently lose any data: if the client cannot send the data due to 
> timeout or running out of retry attempts, it should report that.
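
The two requirements above (retry automatically, never lose data silently) can be sketched in a few lines. This is a minimal illustration in Python, not the Kudu Java client's actual retry logic: RecoverableError, WriteFailed, write_with_retry and all of its parameters are hypothetical names chosen for the example.

```python
import time


class RecoverableError(Exception):
    """Hypothetical stand-in for an error caused by a fail-over event."""


class WriteFailed(Exception):
    """Raised so the caller learns that the row was NOT written."""


def write_with_retry(send, row, max_attempts=5, timeout_s=10.0, backoff_s=0.0):
    """Call send(row) until it succeeds; report failure instead of dropping the row."""
    deadline = time.monotonic() + timeout_s
    last_error = None
    for attempt in range(1, max_attempts + 1):
        if time.monotonic() > deadline:
            raise WriteFailed(
                "timed out after {} attempts: {!r}".format(attempt - 1, last_error))
        try:
            return send(row)
        except RecoverableError as err:
            # Fail-over in progress: remember the error and try again.
            last_error = err
            time.sleep(backoff_s)
    raise WriteFailed(
        "gave up after {} attempts: {!r}".format(max_attempts, last_error))
```

A test for the scenario would then assert one of two outcomes for every row: either the write returned normally and the row is counted at the end, or WriteFailed was raised and the row is excluded from the expected count.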





[jira] [Assigned] (KUDU-2033) Add a 'torture' scenario to verify Java client's behavior during fail-over

2017-07-07 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher reassigned KUDU-2033:


Assignee: Edward Fancher

> Add a 'torture' scenario to verify Java client's behavior during fail-over 
> ---
>
> Key: KUDU-2033
> URL: https://issues.apache.org/jira/browse/KUDU-2033
> Project: Kudu
>  Issue Type: Test
>  Components: client, java
>Reporter: Alexey Serbin
>Assignee: Edward Fancher
>  Labels: newbie, newbie++
>
> For the Kudu Java client we have {{TestLeaderFailover}} test which verifies 
> how the client handles the tablet server fail-over scenario.  However, the 
> test covers only one fail-over event and mainly performs write operations 
> while the backend handles the 'unexpected crash' of the tablet server.
> It would be nice to add more tests which cover the client's fail-over 
> behavior:
>   * add the mixed workload scenario, i.e. combine inserts/scans during the 
> fail-over
>   * induce more fail-over events while running the scenario, i.e. pause and 
> then resume the tservers processes many more times and run the test longer
>   * add the multi-master scenario, where both the leader tserver and leader 
> master 'unexpectedly crash' during the run
>   * in the mixed workload scenarios, run scan operations in READ_AT_TIMESTAMP 
> mode to exercise RYW (Read-Your-Writes) behavior and add assertions to make 
> sure the RYW behavior is observed as expected
>





[jira] [Updated] (KUDU-1932) Run at least one tablet-level test against all block managers

2017-06-30 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher updated KUDU-1932:
-
Affects Version/s: 1.4.0

> Run at least one tablet-level test against all block managers
> -
>
> Key: KUDU-1932
> URL: https://issues.apache.org/jira/browse/KUDU-1932
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Adar Dembo
>Assignee: Edward Fancher
> Fix For: 1.5.0
>
>
> Even though the block manager tests provide good overall coverage (and 
> block_manager-stress-test even approximates a flush-like workload), it would 
> still be good to make sure all block managers are exposed to flush/compact 
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is 
> an example of an FBM-only bug that was hidden due to lack of appropriate coverage 
> in the block manager tests.
> mt-tablet-test is a good candidate.
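
One way to picture the request is a small driver that runs the same test binary once per block manager. The Python sketch below only builds the command lines and is not the change that was made for this issue. It assumes the block manager is selected with the --block_manager flag taking the values "file" (FBM) and "log" (LBM); the helper name and the binary path in the usage note are hypothetical.

```python
# Assumption: --block_manager selects the implementation under test,
# with "file" for the file block manager and "log" for the log block manager.
BLOCK_MANAGERS = ("file", "log")


def commands_for_all_block_managers(test_binary, extra_args=()):
    """Return one argv list per block manager for the given test binary."""
    return [
        [test_binary, "--block_manager={}".format(bm), *extra_args]
        for bm in BLOCK_MANAGERS
    ]
```

For example, commands_for_all_block_managers("./bin/mt-tablet-test") yields two command lines that a runner could pass to subprocess.run, failing the build if either exits non-zero.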





[jira] [Updated] (KUDU-1932) Run at least one tablet-level test against all block managers

2017-06-30 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher updated KUDU-1932:
-
Code Review: https://gerrit.cloudera.org/#/c/7252/

> Run at least one tablet-level test against all block managers
> -
>
> Key: KUDU-1932
> URL: https://issues.apache.org/jira/browse/KUDU-1932
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Adar Dembo
>Assignee: Edward Fancher
> Fix For: 1.5.0
>
>
> Even though the block manager tests provide good overall coverage (and 
> block_manager-stress-test even approximates a flush-like workload), it would 
> still be good to make sure all block managers are exposed to flush/compact 
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is 
> an example of an FBM-only bug that was hidden due to lack of appropriate coverage 
> in the block manager tests.
> mt-tablet-test is a good candidate.





[jira] [Resolved] (KUDU-1932) Run at least one tablet-level test against all block managers

2017-06-30 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher resolved KUDU-1932.
--
   Resolution: Fixed
Fix Version/s: 1.5.0

> Run at least one tablet-level test against all block managers
> -
>
> Key: KUDU-1932
> URL: https://issues.apache.org/jira/browse/KUDU-1932
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.3.0
>Reporter: Adar Dembo
>Assignee: Edward Fancher
> Fix For: 1.5.0
>
>
> Even though the block manager tests provide good overall coverage (and 
> block_manager-stress-test even approximates a flush-like workload), it would 
> still be good to make sure all block managers are exposed to flush/compact 
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is 
> an example of an FBM-only bug that was hidden due to lack of appropriate coverage 
> in the block manager tests.
> mt-tablet-test is a good candidate.





[jira] [Assigned] (KUDU-1932) Run at least one tablet-level test against all block managers

2017-06-13 Thread Edward Fancher (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Fancher reassigned KUDU-1932:


Assignee: Edward Fancher

> Run at least one tablet-level test against all block managers
> -
>
> Key: KUDU-1932
> URL: https://issues.apache.org/jira/browse/KUDU-1932
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.3.0
>Reporter: Adar Dembo
>Assignee: Edward Fancher
>
> Even though the block manager tests provide good overall coverage (and 
> block_manager-stress-test even approximates a flush-like workload), it would 
> still be good to make sure all block managers are exposed to flush/compact 
> operations; right now only the LBM gets that kind of coverage. KUDU-1931 is 
> an example of an FBM-only bug that was hidden due to lack of appropriate coverage 
> in the block manager tests.
> mt-tablet-test is a good candidate.


