[jira] [Assigned] (KUDU-2375) Can't parse message of type "kudu.master.SysTablesEntryPB" because it is missing required fields: schema.columns[5].type

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2375:
-

Assignee: (was: Grant Henke)

> Can't parse message of type "kudu.master.SysTablesEntryPB" because it is 
> missing required fields: schema.columns[5].type
> 
>
> Key: KUDU-2375
> URL: https://issues.apache.org/jira/browse/KUDU-2375
> Project: Kudu
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.7.0
>Reporter: Michael Brown
>Priority: Major
>
> When tables with decimal columns are created in 1.7.0, a downgrade from 1.7.0 
> to 1.6 results in a DCHECK failure when 1.6 starts, and Kudu isn't usable in 
> its downgraded version.
> {noformat}
> F0324 17:45:10.681808 105716 catalog_manager.cc:935] Loading table and tablet 
> metadata into memory failed: Corruption: Failed while visiting tables in sys 
> catalog: unable to parse metadata field for row 
> 467d365fffbe4485a3249079c48f42a9: Error parsing msg: Can't parse message of 
> type "kudu.master.SysTablesEntryPB" because it is missing required fields: 
> schema.columns[5].type
> {noformat}
> {noformat}
> #0  0x003355e32625 in raise () from /lib64/libc.so.6
> #1  0x003355e33e05 in abort () from /lib64/libc.so.6
> #2  0x01cea129 in ?? ()
> #3  0x009268cd in google::LogMessage::Fail() ()
> #4  0x0092878d in google::LogMessage::SendToLog() ()
> #5  0x00926409 in google::LogMessage::Flush() ()
> #6  0x0092922f in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x008f05de in ?? ()
> #8  0x008f6039 in 
> kudu::master::CatalogManager::PrepareForLeadershipTask() ()
> #9  0x01d297d7 in kudu::ThreadPool::DispatchThread() ()
> #10 0x01d20151 in kudu::Thread::SuperviseThread(void*) ()
> #11 0x003356207aa1 in start_thread () from /lib64/libpthread.so.0
> #12 0x003355ee893d in clone () from /lib64/libc.so.6
> {noformat}
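The likely mechanism is that proto2 treats an unrecognized enum value (here the DECIMAL type added in 1.7) like an unknown field, which leaves the required `type` field unset when the 1.6 master parses the entry. A toy Python model of that failure mode (not Kudu source; the names are invented):

```python
# Toy model (not Kudu code) of the downgrade failure: a 1.6-era reader
# drops an enum value it doesn't recognize, so the required "type" field
# ends up missing and the parse fails.
TYPES_KNOWN_TO_1_6 = {"INT8", "INT32", "INT64", "STRING", "BOOL"}

def parse_column_as_1_6(index, column):
    column = dict(column)
    if column.get("type") not in TYPES_KNOWN_TO_1_6:
        # proto2 moves an unknown enum value to the unknown-field set,
        # which is equivalent to the field never having been set.
        column.pop("type", None)
    if "type" not in column:
        raise ValueError(
            "Can't parse message of type \"kudu.master.SysTablesEntryPB\" "
            f"because it is missing required fields: schema.columns[{index}].type")
    return column
```

For example, `parse_column_as_1_6(5, {"name": "price", "type": "DECIMAL"})` raises the error quoted in the log, while an INT32 column parses fine.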



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-3135) Add Client Metadata Tokens

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3135:
-

Assignee: (was: Grant Henke)

> Add Client Metadata Tokens
> --
>
> Key: KUDU-3135
> URL: https://issues.apache.org/jira/browse/KUDU-3135
> Project: Kudu
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 1.12.0
>Reporter: Grant Henke
>Priority: Major
>  Labels: roadmap-candidate, scalability
>
> Currently, when a distributed job is run using the Kudu client, the 
> driver/coordinator client needs to open the table to request its current 
> metadata and locations. Then it can distribute the work to tasks/executors on 
> remote nodes. In the case of reading data, ScanTokens are often used to 
> distribute the work; in the case of writing data, perhaps just the table 
> name is required.
> The problem is that each parallel task then also needs to open the table to 
> request the metadata for the table. Using Spark as an example, this happens 
> when deserializing the scan tokens in KuduRDD 
> ([here|https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala#L107-L108])
>  or when writing rows using the KuduContext 
> ([here|https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala#L466]).
>  This results in a large burst of metadata requests to the leader Kudu master 
> all at once. Given that the leader master is a single server and requests can't 
> be served by the follower masters, this effectively limits the number of 
> parallel tasks that can run in a large Kudu deployment. Even if the follower 
> masters could service the requests, that would still limit scalability in very 
> large clusters, given most deployments have only 3-5 masters.
> Adding a metadata token, similar to a scan token, would be a useful way to 
> allow the single driver to fetch all the metadata required for the parallel 
> tasks. The tokens can be serialized and then passed to each task in a similar 
> fashion to scan tokens.
> Of course in a pessimistic case, something may change between generation of 
> the token and the start of the task. In that case a request would need to be 
> sent to get the updated metadata. However, that scenario should be rare and 
> likely would not result in all of the requests happening at the same time.
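The token pattern described above can be sketched as follows. This is illustrative Python with invented names (MockMaster, create_metadata_token), not the Kudu client API; the point is that the driver pays for one metadata RPC and every task deserializes the token instead of contacting the master:

```python
import json

class MockMaster:
    """Stand-in for the leader master; counts the metadata RPCs it serves."""
    def __init__(self):
        self.requests = 0

    def get_table_metadata(self, table):
        self.requests += 1
        return {"table": table, "schema": ["key", "val"],
                "locations": ["ts-1", "ts-2"]}

def create_metadata_token(master, table):
    # Driver side: one RPC, then serialize the result for shipping to tasks.
    return json.dumps(master.get_table_metadata(table))

def open_table_from_token(token):
    # Task side: rebuild the metadata with no master RPC at all.
    return json.loads(token)

master = MockMaster()
token = create_metadata_token(master, "metrics")
tables = [open_table_from_token(token) for _ in range(200)]  # 200 "executors"
```

With 200 parallel tasks the master serves one request instead of 200, which is exactly the burst this issue wants to eliminate.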





[jira] [Assigned] (KUDU-3245) Provide Client API to set verbose logging filtered by vmodule

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3245:
-

Assignee: (was: Grant Henke)

> Provide Client API to set verbose logging filtered by vmodule 
> --
>
> Key: KUDU-3245
> URL: https://issues.apache.org/jira/browse/KUDU-3245
> Project: Kudu
>  Issue Type: Improvement
>  Components: client
>Reporter: Hao Hao
>Priority: Major
>
> Similar to 
> [{{client::SetVerboseLogLevel}}|https://github.com/apache/kudu/blob/master/src/kudu/client/client.h#L164]
>  API, it would be nice to add another API that allows enabling verbose 
> logging filtered by module.
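For reference, per-module filtering in the glog style means a level map consulted before the process-wide verbosity. A small sketch of those semantics (invented helper names, not a proposed Kudu API):

```python
def parse_vmodule(spec):
    """Parse a glog-style vmodule spec, e.g. "tablet=3,rpc=1"."""
    out = {}
    for part in spec.split(","):
        module, level = part.split("=")
        out[module.strip()] = int(level)
    return out

def vlog_is_on(vmodule, default_v, module, level):
    # A module-specific level overrides the global verbose level.
    return level <= vmodule.get(module, default_v)
```

So with `tablet=3,rpc=1`, VLOG(3) fires in the tablet module but VLOG(2) in the rpc module stays silent, regardless of the global level.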





[jira] [Assigned] (KUDU-2619) Track Java test failures in the flaky test dashboard

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2619:
-

Assignee: (was: Grant Henke)

> Track Java test failures in the flaky test dashboard
> 
>
> Key: KUDU-2619
> URL: https://issues.apache.org/jira/browse/KUDU-2619
> Project: Kudu
>  Issue Type: Improvement
>  Components: java, test
>Affects Versions: n/a
>Reporter: Adar Dembo
>Priority: Major
>
> Right now our flaky test tracking infrastructure only incorporates C++ tests 
> using GTest; we should extend it to include Java tests too.
> I spent some time on this recently and I wanted to collect my notes in one 
> place.
> For reference, here's how C++ test reporting works:
> # The build-and-test.sh script rebuilds thirdparty dependencies, builds Kudu, 
> and invokes all test suites, optionally using dist-test. After all tests have 
> been run, it also collects any dist-test artifacts and logs so that all of 
> the test results are available in one place.
> # The run-test.sh script is effectively the C++ "test runner". It is 
> responsible for running a test binary, retrying it if it fails, and calling 
> report-test.sh after each test run (success or failure). Importantly, 
> report-test.sh is invoked once per test binary (not individual test), and on 
> test success we don't wait for the script to finish, because we don't care as 
> much about collecting successes.
> # The report-test.sh script collects some basic information about the test 
> environment (such as the git hash used, whether ASAN or TSAN was enabled, 
> etc.), then uses curl to send the information to the test result server.
> # The test result server will store the test run result in a database, and 
> will query that database to produce a dashboard.
> There are several problems to solve if we're going to replicate this for Java:
> # There's no equivalent to run-test.sh. The entry point for running the Java 
> test suite is Gradle, but in dist-test, the test invocation is actually done 
> via 'java org.junit.runner.JUnitCore'. Note that C++ test reporting is 
> currently also incompatible with dist-test, so the Java tests aren't unique 
> in that respect.
> # It'd be some work to replace report-test.sh with a Java equivalent.
> My thinking is that we should move test reporting from run-test.sh into 
> build-and-test.sh:
> # It's a good separation of concerns. The test "runners" are responsible for 
> running and maybe retrying tests, while the test "aggregator" 
> (build-and-test.sh) is responsible for reporting.
> # It's more performant. You can imagine building a test_result_server.py 
> endpoint for reporting en masse, which would cut down on the number of round 
> trips. That's especially important if we start reporting individual test 
> results (as opposed to test _suite_ results).
> # It means the reporting logic need only be written once.
> # It was always a bit unexpected to find reporting logic buried in 
> run-test.sh; it made sense for rapid prototyping, but not as a long-term 
> design.
> The remaining problem is then ensuring that, after all tests have run, we have 
> the right JUnit XML and log files for every test that ran, including retries. 
> That is more tractable, and doable for dist-test environments too.
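The batched-reporting idea can be sketched as follows: the aggregator walks the JUnit XML the runners left behind and builds one payload for the result server, instead of one curl per suite. Illustrative Python using only the standard library (invented names):

```python
import xml.etree.ElementTree as ET

def aggregate_junit_results(xml_docs):
    """Collect per-test results from JUnit XML documents so they can be
    reported to the test result server in a single batched request."""
    results = []
    for doc in xml_docs:
        suite = ET.fromstring(doc)
        for case in suite.iter("testcase"):
            results.append({
                "suite": suite.get("name"),
                "test": case.get("name"),
                "failed": case.find("failure") is not None,
            })
    return results

sample = """<testsuite name="org.apache.kudu.TestFoo">
  <testcase name="testBar"/>
  <testcase name="testBaz"><failure message="boom"/></testcase>
</testsuite>"""
```

This also naturally supports individual test results rather than suite-level results, since every `<testcase>` becomes one record.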





[jira] [Assigned] (KUDU-2283) Improve KuduPartialRow::ToString() decimal output

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2283:
-

Assignee: (was: Grant Henke)

> Improve KuduPartialRow::ToString() decimal output
> -
>
> Key: KUDU-2283
> URL: https://issues.apache.org/jira/browse/KUDU-2283
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Grant Henke
>Priority: Minor
>
> Currently, KuduPartialRow::ToString() uses "AppendDebugStringForValue" to 
> print decimal values. However, we could use the ColumnTypeAttributes to better 
> "pretty print" the decimal values. 
>  
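A minimal sketch of what such pretty-printing could look like, rendering the unscaled integer value with the scale that ColumnTypeAttributes would supply (hypothetical helper, not Kudu's API):

```python
def pretty_print_decimal(unscaled, scale):
    """Render an unscaled integer as a decimal string,
    e.g. unscaled=12345 at scale=2 -> "123.45"."""
    sign = "-" if unscaled < 0 else ""
    digits = str(abs(unscaled)).rjust(scale + 1, "0")  # pad so a point fits
    if scale == 0:
        return sign + digits
    return f"{sign}{digits[:-scale]}.{digits[-scale:]}"
```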





[jira] [Assigned] (KUDU-1261) Support nested data types

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-1261:
-

Assignee: (was: Grant Henke)

> Support nested data types
> -
>
> Key: KUDU-1261
> URL: https://issues.apache.org/jira/browse/KUDU-1261
> Project: Kudu
>  Issue Type: New Feature
>Reporter: Jean-Daniel Cryans
>Priority: Major
>  Labels: limitations, roadmap-candidate
>
> AKA complex data types.
> This is a common ask. I'm creating this jira so that we can at least start 
> tracking how people want to use it.





[jira] [Assigned] (KUDU-2860) Sign docker images

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2860:
-

Assignee: (was: Grant Henke)

> Sign docker images
> --
>
> Key: KUDU-2860
> URL: https://issues.apache.org/jira/browse/KUDU-2860
> Project: Kudu
>  Issue Type: Improvement
>  Components: docker
>Reporter: Grant Henke
>Priority: Major
>  Labels: docker
>
> We should sign the Apache docker images following the instructions here: 
> [https://docs.docker.com/ee/dtr/user/manage-images/sign-images/]
>  
> Ideally this would be handled by the build script.





[jira] [Assigned] (KUDU-2524) scalafmt incompatible with jdk8 older than u25

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2524:
-

Assignee: (was: Grant Henke)

> scalafmt incompatible with jdk8 older than u25
> --
>
> Key: KUDU-2524
> URL: https://issues.apache.org/jira/browse/KUDU-2524
> Project: Kudu
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Adar Dembo
>Priority: Major
>
> We're seeing a fair number of Gradle build failures in scalafmt with the 
> following output:
> {noformat}
> 1: Task failed with an exception.
> ---
> * What went wrong:
> Execution failed for task ':kudu-spark:scalafmt'.
> > Uninitialized object exists on backward branch 209
>   Exception Details:
> Location:
>   
> scala/collection/immutable/HashMap$HashTrieMap.split()Lscala/collection/immutable/Seq;
>  @249: goto
> Reason:
>   Error exists in the bytecode
> Bytecode:
>   000: 2ab6 005b 04a0 001e b200 b3b2 00b8 04bd
>   010: 0002 5903 2a53 c000 bab6 00be b600 c2c0
>   020: 00c4 b02a b600 31b8 003b 3c1b 04a4 015e
>   030: 1b05 6c3d 2a1b 056c 2ab6 0031 b700 c63e
>   040: 2ab6 0031 021d 787e 3604 2ab6 0031 0210
>   050: 201d 647c 7e36 05bb 0014 59b2 00b8 2ab6
>   060: 0033 c000 bab6 00ca b700 cd1c b600 d13a
>   070: 0619 06c6 001a 1906 b600 d5c0 0081 3a07
>   080: 1906 b600 d8c0 0081 3a08 a700 0dbb 00da
>   090: 5919 06b7 00dd bf19 073a 0919 083a 0abb
>   0a0: 0002 5915 0419 09bb 0014 59b2 00b8 1909
>   0b0: c000 bab6 00ca b700 cd03 b800 e33a 0e3a
>   0c0: 0d03 190d b900 e701 0019 0e3a 1136 1036
>   0d0: 0f15 0f15 109f 0027 150f 0460 1510 190d
>   0e0: 150f b900 ea02 00c0 0005 3a17 1911 1917
>   0f0: b800 ee3a 1136 1036 0fa7 ffd8 1911 b800
>   100: f2b7 0060 3a0b bb00 0259 1505 190a bb00
>   110: 1459 b200 b819 0ac0 00ba b600 cab7 00cd
>   120: 03b8 00e3 3a13 3a12 0319 12b9 00e7 0100
>   130: 1913 3a16 3615 3614 1514 1515 9f00 2715
>   140: 1404 6015 1519 1215 14b9 00ea 0200 c000
>   150: 053a 1819 1619 18b8 00f5 3a16 3615 3614
>   160: a7ff d819 16b8 00f2 b700 603a 0cb2 00fa
>   170: b200 b805 bd00 0259 0319 0b53 5904 190c
>   180: 53c0 00ba b600 beb6 00fd b02a b600 3303
>   190: 32b6 00ff b0   
> Stackmap Table:
>   same_frame(@35)
>   
> full_frame(@141,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109]},{})
>   append_frame(@151,Object[#129],Object[#129])
>   
> full_frame(@209,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Top,Top,Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#159],Uninitialized[#159],Integer,Object[#129]})
>   
> full_frame(@252,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Top,Top,Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#159],Uninitialized[#159],Integer,Object[#129]})
>   
> full_frame(@312,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Object[#2],Top,Object[#20],Object[#55],Integer,Integer,Object[#107],Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#262],Uninitialized[#262],Integer,Object[#129]})
>   
> full_frame(@355,{Object[#2],Integer,Integer,Integer,Integer,Integer,Object[#109],Object[#129],Object[#129],Object[#129],Object[#129],Object[#2],Top,Object[#20],Object[#55],Integer,Integer,Object[#107],Object[#20],Object[#55],Integer,Integer,Object[#107]},{Uninitialized[#262],Uninitialized[#262],Integer,Object[#129]})
>   full_frame(@395,{Object[#2],Integer},{}){noformat}
> This appears to be due to [this JDK 
> issue|https://stackoverflow.com/questions/24061672/verifyerror-uninitialized-object-exists-on-backward-branch-jvm-spec-4-10-2-4],
>  which was fixed in JDK 8u25.
> And sure enough, here's the JDK version for failing builds:
> {noformat}
> -- Found Java: /opt/toolchain/sun-jdk-64bit-1.8.0.05/bin/java (found suitable 
> version "1.8.0.05", minimum required is "1.7")
> {noformat}
> And here it is for successful builds:
> {noformat}
> 19:06:12 -- Found Java: /usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/java (found 
> suitable version "1.8.0.111", minimum required is "1.7") 
> {noformat}
> We either need to blacklist JDK8 versions older than u25, or we need to 
> condition the scalafmt step on the JDK version.
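The proposed gate could look like the following sketch (invented helper names; the real check would live in the Gradle build). It handles both the usual `1.8.0_NN` form and the `1.8.0.NN` form seen in the build logs above:

```python
import re

def jdk8_update(version):
    """Extract the JDK 8 update number from "1.8.0_25" or "1.8.0.05"."""
    m = re.search(r"^1\.8\.0[._](\d+)", version)
    return int(m.group(1)) if m else 0

def run_scalafmt(version):
    # Only run scalafmt on JDK 8u25+ (where the bytecode-verifier bug
    # is fixed) or on any later JDK.
    return not version.startswith("1.8") or jdk8_update(version) >= 25
```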





[jira] [Assigned] (KUDU-3287) Threads can linger for some time after calling close on the Java KuduClient

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3287:
-

Assignee: (was: Grant Henke)

> Threads can linger for some time after calling close on the Java KuduClient
> ---
>
> Key: KUDU-3287
> URL: https://issues.apache.org/jira/browse/KUDU-3287
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.12.0, 1.13.0, 1.14.0
>Reporter: Grant Henke
>Priority: Major
>
> After the upgrade to Netty 4 in Kudu 1.12, the close/shutdown behavior of the 
> Java client changed such that threads and resources could linger for some time 
> after the call to close() returned. This looks to be because 
> `bootstrap.config().group().shutdownGracefully` is called with the default of 
> 15s and returns asynchronously. Additionally, the default ExecutorService was 
> not shut down on close. 
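The fix amounts to blocking until shutdown actually completes. A Python analogue of that synchronous-close pattern (not the Java client code; the analogue of awaiting Netty's `shutdownGracefully()` future is `shutdown(wait=True)`):

```python
from concurrent.futures import ThreadPoolExecutor

class Client:
    """Illustrates the desired behavior: close() blocks until worker
    threads are gone, instead of kicking off an asynchronous graceful
    shutdown and returning immediately."""
    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=2)

    def submit(self, fn):
        return self._executor.submit(fn)

    def close(self):
        # wait=True blocks until queued work finishes and threads exit.
        self._executor.shutdown(wait=True)

client = Client()
result = client.submit(lambda: 41 + 1).result()
client.close()
```

After close() returns, the pool accepts no further work and no pool threads remain running.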





[jira] [Assigned] (KUDU-3218) client_symbol-test fails on Centos 7 with devtoolset-8

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3218:
-

Assignee: (was: Grant Henke)

> client_symbol-test fails on Centos 7 with devtoolset-8
> --
>
> Key: KUDU-3218
> URL: https://issues.apache.org/jira/browse/KUDU-3218
> Project: Kudu
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.14.0
>Reporter: Grant Henke
>Priority: Major
>
> When running the client_symbol-test on CentOS 7 with devtoolset-8, the test 
> fails with the following bad symbols: 
> {code:java}
> Found bad symbol 'operator delete[](void*, unsigned long)'
> Found bad symbol 'operator delete(void*, unsigned long)'
> Found bad symbol 'transaction clone for std::logic_error::what() const'
> Found bad symbol 'transaction clone for std::runtime_error::what() const'
> Found bad symbol 'transaction clone for std::logic_error::logic_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::logic_error::logic_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::logic_error::logic_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::logic_error::logic_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::logic_error::~logic_error()'
> Found bad symbol 'transaction clone for std::logic_error::~logic_error()'
> Found bad symbol 'transaction clone for std::logic_error::~logic_error()'
> Found bad symbol 'transaction clone for std::range_error::range_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::range_error::range_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::range_error::range_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::range_error::range_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::range_error::~range_error()'
> Found bad symbol 'transaction clone for std::range_error::~range_error()'
> Found bad symbol 'transaction clone for std::range_error::~range_error()'
> Found bad symbol 'transaction clone for std::domain_error::domain_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::domain_error::domain_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::domain_error::domain_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::domain_error::domain_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::domain_error::~domain_error()'
> Found bad symbol 'transaction clone for std::domain_error::~domain_error()'
> Found bad symbol 'transaction clone for std::domain_error::~domain_error()'
> Found bad symbol 'transaction clone for std::length_error::length_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::length_error::length_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::length_error::length_error(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::length_error::length_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::length_error::~length_error()'
> Found bad symbol 'transaction clone for std::length_error::~length_error()'
> Found bad symbol 'transaction clone for std::length_error::~length_error()'
> Found bad symbol 'transaction clone for std::out_of_range::out_of_range(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::out_of_range::out_of_range(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::out_of_range::out_of_range(char 
> const*)'
> Found bad symbol 'transaction clone for 
> std::out_of_range::out_of_range(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for std::out_of_range::~out_of_range()'
> Found bad symbol 'transaction clone for std::out_of_range::~out_of_range()'
> Found bad symbol 'transaction clone for std::out_of_range::~out_of_range()'
> Found bad symbol 'transaction clone for 
> std::runtime_error::runtime_error(char const*)'
> Found bad symbol 'transaction clone for 
> std::runtime_error::runtime_error(std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> > const&)'
> Found bad symbol 'transaction clone for 
> std::runtime_error::runtime_error(char const*)'
> Found bad symbol 'transaction clone for 
> std::runtime_error::runtime_error(std::__cxx11

[jira] [Assigned] (KUDU-3132) Support RPC compression

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3132:
-

Assignee: (was: Grant Henke)

> Support RPC compression
> ---
>
> Key: KUDU-3132
> URL: https://issues.apache.org/jira/browse/KUDU-3132
> Project: Kudu
>  Issue Type: New Feature
>  Components: perf, rpc
>Reporter: Grant Henke
>Priority: Major
>  Labels: performance, roadmap-candidate
>
> I have seen more and more deployments of Kudu where the tablet servers are 
> not co-located with the compute resources such as Impala or Spark. In 
> deployments like this, there could be significant network savings by 
> compressing the RPC messages (especially those that write or scan data). 
> Adding simple LZ4 or Snappy compression support to the RPC messages when not 
> on a loopback/local connection should be a great improvement for 
> network-bound applications.
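A sketch of the proposed behavior, using zlib from the Python standard library in place of LZ4/Snappy (invented names; not Kudu's RPC layer). The two gates mirror the description: skip loopback peers, and skip messages too small to benefit:

```python
import zlib

MIN_COMPRESS_BYTES = 1024  # don't bother compressing tiny messages

def maybe_compress(payload, peer_is_loopback):
    """Compress an outbound RPC payload unless the peer is local or the
    message is small. Returns (wire_bytes, was_compressed)."""
    if peer_is_loopback or len(payload) < MIN_COMPRESS_BYTES:
        return payload, False
    return zlib.compress(payload), True

rows = b"col1,col2,col3\n" * 1000  # a scan-like, highly repetitive payload
wire, compressed = maybe_compress(rows, peer_is_loopback=False)
```

A one-bit "compressed" flag in the RPC header would let the receiver decide whether to decompress.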





[jira] [Assigned] (KUDU-2282) Support coercion of Decimal values

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2282:
-

Assignee: (was: Grant Henke)

> Support coercion of Decimal values 
> ---
>
> Key: KUDU-2282
> URL: https://issues.apache.org/jira/browse/KUDU-2282
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Grant Henke
>Priority: Major
>
> Currently, when decimal values are used in KuduValue.cc or PartialRow.cc, we 
> enforce that the scale matches the expected scale. Instead, we should support 
> basic coercion where no value rounding or truncation is required.
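The lossless-coercion rule can be sketched as follows, operating on the unscaled integer representation of a decimal (my own helper, not the proposed API). Scaling up is always safe; scaling down is allowed only when no digits would be dropped:

```python
def coerce_unscaled(unscaled, from_scale, to_scale):
    """Rescale a decimal's unscaled value without rounding or truncation;
    raise if the coercion would lose digits."""
    if to_scale >= from_scale:
        return unscaled * 10 ** (to_scale - from_scale)
    factor = 10 ** (from_scale - to_scale)
    if unscaled % factor != 0:
        raise ValueError("coercion would truncate digits")
    return unscaled // factor
```

For example, 1.23 (unscaled 123, scale 2) coerces to scale 4 as 12300, while 1.234 cannot be coerced to scale 1.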





[jira] [Assigned] (KUDU-3134) Adjust default value for --raft_heartbeat_interval

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3134:
-

Assignee: (was: Grant Henke)

> Adjust default value for --raft_heartbeat_interval
> --
>
> Key: KUDU-3134
> URL: https://issues.apache.org/jira/browse/KUDU-3134
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.12.0
>Reporter: Grant Henke
>Priority: Major
>
> Users often increase the `--raft_heartbeat_interval` on larger clusters or on 
> clusters with high replica counts. This helps avoid the servers flooding each 
> other with heartbeat RPCs, causing queue overflows and excessive idle CPU 
> usage. Users have adjusted the value to anywhere from 1.5 seconds to as high 
> as 10s, and we have never seen people complain about problems after doing so.
> Anecdotally, I recently saw a cluster with 4k tablets per tablet server using 
> ~150% CPU while idle. After increasing the `--raft_heartbeat_interval` 
> from 500ms to 1500ms, the CPU usage dropped to ~50%.
> Generally speaking, users often care about Kudu stability and scalability over 
> an extremely short MTTR. Additionally, our default client RPC timeout of 30s 
> also suggests that slightly longer failover/retry times are tolerable in the 
> default case. 
> We should consider adjusting the default value of `--raft_heartbeat_interval` 
> to a higher value to support larger and more efficient clusters by default. 
> Users who need a low MTTR can always adjust the value lower while also 
> adjusting other related timeouts. We may also want to consider adjusting the 
> default `--heartbeat_interval_ms` accordingly.
> Note: Batching the RPCs like mentioned in KUDU-1973 or providing a server to 
> server proxy for heartbeating may be a way to solve the issues without 
> adjusting the default configuration. However, adjusting the configuration is 
> easy and has proven effective in production deployments. Additionally 
> adjusting the defaults along with a KUDU-1973 like approach could lead to 
> even lower idle resource usage.
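The idle heartbeat load behind the anecdote above is easy to approximate. This is back-of-the-envelope arithmetic only (a toy formula that ignores leadership distribution and any batching):

```python
def heartbeat_rpcs_per_sec(leader_tablets, followers_per_tablet, interval_ms):
    """Each tablet leader heartbeats every follower once per interval."""
    return leader_tablets * followers_per_tablet * 1000.0 / interval_ms

# A server leading 4000 tablets with 2 followers each:
at_500ms = heartbeat_rpcs_per_sec(4000, 2, 500)    # default interval
at_1500ms = heartbeat_rpcs_per_sec(4000, 2, 1500)  # tripled interval
```

Tripling the interval cuts the outbound heartbeat rate to a third, consistent with the observed CPU drop.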





[jira] [Assigned] (KUDU-2530) Add kudu pbc replace tool

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2530:
-

Assignee: (was: Grant Henke)

> Add kudu pbc replace tool
> -
>
> Key: KUDU-2530
> URL: https://issues.apache.org/jira/browse/KUDU-2530
> Project: Kudu
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Grant Henke
>Priority: Minor
>
> We currently have a _kudu pbc dump_ and a _kudu pbc edit_ tool. However, it 
> would be convenient to edit the dumped file elsewhere and then load it back, 
> replacing the original pbc. Adding _kudu pbc replace_ would make this easier. 





[jira] [Assigned] (KUDU-2500) Kudu Spark InterfaceStability class not found

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2500:
-

Assignee: (was: Grant Henke)

> Kudu Spark InterfaceStability class not found
> -
>
> Key: KUDU-2500
> URL: https://issues.apache.org/jira/browse/KUDU-2500
> Project: Kudu
>  Issue Type: Bug
>  Components: spark
>Affects Versions: 1.7.0
>Reporter: Grant Henke
>Priority: Major
>
> We recently marked the Yetus annotation library as optional because the 
> annotations are not used at runtime and therefore should not be needed. Here 
> is a good summary of why the annotations are not required at runtime: 
> https://stackoverflow.com/questions/3567413/why-doesnt-a-missing-annotation-cause-a-classnotfoundexception-at-runtime/3568041#3568041
> However, for some reason Spark requires the annotations when performing 
> reflection. See the sample stack trace below:
> {code}
> Driver stacktrace:
>   at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
>   at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
>   at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
>   at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
>   at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
>   at scala.Option.foreach(Option.scala:257)
>   at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
>   at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
>   at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
>   at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>   at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:929)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:927)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>   at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
>   at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:927)
>   at 
> org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply$mcV$sp(Dataset.scala:2675)
>   at 
> org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2675)
>   at 
> org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2675)
>   at 
> org.apache.spark.sql.Dataset$$anonfun$withNewRDDExecutionId$1.apply(Dataset.scala:3239)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>   at 
> org.apache.spark.sql.Dataset.withNewRDDExecutionId(Dataset.scala:3235)
>   at org.apache.spark.sql.Dataset.foreachPartition(Dataset.scala:2674)
>   at 
> org.apache.kudu.spark.kudu.KuduContext.writeRows(KuduContext.scala:276)
>   at 
> org.apache.kudu.spark.kudu.KuduContext.insertRows(KuduContext.scala:206)
>   at 
> org.apache.kudu.backup.KuduRestore$$anonfun$run$1.apply(KuduRestore.scala:65)
>   at 
> org.apache.kudu.backup.KuduRestore$$anonfun$run$1.apply(KuduRestore.scala:44)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at org.apache.kudu.backup.KuduRestore$.run(KuduRestore.scala:44)
>   at 
> org.apache.kudu.backup.TestKuduBackup.backupAndRestore(TestKuduBackup.scala:310)
>   at 
> org.apache.kudu.backup.TestKuduBackup$$anonfun$2.apply$mcV$sp(TestKuduBackup.scala:83)
>   at 
> org.apache.kudu.backup.TestKuduBackup$$anonfun$2.apply(TestKuduBackup.scala:76)
>   at 
> org.apache.kudu.backup.TestKuduBackup$$anonfun$2.apply(TestKuduBackup.scala:76)
>   at org.scalatest.OutcomeOf$class.outcomeOf(Out

[jira] [Assigned] (KUDU-982) nullable columns should support DEFAULT NULL

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-982:


Assignee: (was: Grant Henke)

> nullable columns should support DEFAULT NULL
> 
>
> Key: KUDU-982
> URL: https://issues.apache.org/jira/browse/KUDU-982
> Project: Kudu
>  Issue Type: Improvement
>  Components: api, client, master
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> I don't think we have APIs which work for setting the default to NULL in 
> Alter/Create.
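
The core API wrinkle here is that "no default" and "DEFAULT NULL" are different states, so a plain `setDefault(value)` can't express both. A minimal hypothetical builder sketch (the class and method names below are illustrative, not the actual Kudu client API) showing the tri-state and the nullability check such an API would need:

```java
// Hypothetical column spec illustrating why DEFAULT NULL needs a tri-state:
// "no default set" must stay distinguishable from "default is NULL".
class ColumnSpec {
    final String name;
    final boolean nullable;
    private boolean hasDefault = false;
    private Object defaultValue = null;

    ColumnSpec(String name, boolean nullable) {
        this.name = name;
        this.nullable = nullable;
    }

    // DEFAULT NULL is only legal on a nullable column.
    ColumnSpec defaultValue(Object value) {
        if (value == null && !nullable) {
            throw new IllegalArgumentException(
                "DEFAULT NULL on non-nullable column " + name);
        }
        this.hasDefault = true;
        this.defaultValue = value;
        return this;
    }

    boolean hasDefault() { return hasDefault; }
    Object getDefault() { return defaultValue; }
}
```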



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-3172) Enable hybrid clock and built-in NTP client in Docker by default

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3172:
-

Assignee: (was: Grant Henke)

> Enable hybrid clock and built-in NTP client in Docker by default
> 
>
> Key: KUDU-3172
> URL: https://issues.apache.org/jira/browse/KUDU-3172
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.12.0
>Reporter: Grant Henke
>Priority: Minor
>
> Currently the docker entrypoint sets `--use_hybrid_clock=false` by default. 
> This can cause unusual issues when snapshot scans are needed. Now that the 
> built-in NTP client is available, we should switch to it by default in the 
> docker image by setting `--time_source=auto`.
> For the quickstart cluster we can use `--time_source=system_unsync` given we 
> expect all nodes will be on the same machine. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-2788) Validate metadata across backup and restore jobs

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2788:
-

Assignee: (was: Grant Henke)

> Validate metadata across backup and restore jobs
> 
>
> Key: KUDU-2788
> URL: https://issues.apache.org/jira/browse/KUDU-2788
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.9.0
>Reporter: Grant Henke
>Priority: Critical
>  Labels: backup
>
> Currently the backup and restore jobs assume the metadata hasn't changed or 
> has changed in a compatible way across runs. We should validate that this is 
> true when building the backup graph and handle as many metadata changes as 
> possible. 
> The metadata changes that can't be handled should be clearly documented and a 
> follow-up Jira filed. 
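
The validation step described above boils down to diffing the schema recorded with one backup against the next. A small self-contained sketch (the class name and the string-typed column map are assumptions for illustration, not the backup job's actual data model) of what "detect incompatible changes while building the backup graph" could look like:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Compares the column layout recorded in one backup against the next backup
// in the chain and reports changes a restore could not replay safely.
class MetadataValidator {
    static List<String> incompatibleChanges(Map<String, String> before,
                                            Map<String, String> after) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : before.entrySet()) {
            String newType = after.get(e.getKey());
            if (newType == null) {
                problems.add("column dropped: " + e.getKey());
            } else if (!newType.equals(e.getValue())) {
                problems.add("type changed for " + e.getKey() + ": "
                             + e.getValue() + " -> " + newType);
            }
        }
        return problems; // columns added in `after` are typically compatible
    }
}
```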



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-3211) Add a cluster supported features request

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3211:
-

Assignee: (was: Grant Henke)

> Add a cluster supported features request
> 
>
> Key: KUDU-3211
> URL: https://issues.apache.org/jira/browse/KUDU-3211
> Project: Kudu
>  Issue Type: Improvement
>  Components: master, supportability
>Affects Versions: 1.13.0
>Reporter: Grant Henke
>Priority: Major
>
> Recently we have come across a few scenarios where it would be useful to make 
> decisions in client integrations (Backup/Restore, Spark, NiFi, Impala) based 
> on the supported features of the target Kudu cluster. This can especially 
> helpful when we want to use new features by default if available but using 
> the new feature requires client/integration logic changes. 
> Some recent examples:
> - Push bloomfilter predicates only if supported
> - Use insert ignore operations (vs session based ignore) only if supported
> It is technically possible to be optimistic about the support of a feature 
> and try to handle errors in a clever way using the required feature 
> capabilities of the RPCs. However, that can be difficult to express and near 
> impossible if you want to make a decision for multiple requests or based on 
> what all tablet servers support instead of based on a single request to a 
> single tablet server.
> Additionally now that we support rolling restart, we can't assume that 
> because a single master or tablet server supports a feature that all servers 
> in the cluster support the feature.
> Some thoughts on the feature/implementation:
> - This should be a master request in order to prevent needing to talk to all 
> the tablet servers.
> - We could leverage server registration requests or heartbeats to aggregate 
> the current state on the leader master. 
> - We could represent these features as "cluster" level features and indicate 
> that some (union) or all (intersect) of the servers support a given feature.
> - If this request/response is not available in a cluster the response would 
> indicate that feature support is unknown and the user can decide how to 
> proceed.
> - If we want to support disabling features via runtime flags we will need to 
> ensure we update the master, maybe via heartbeat, with changed support for a 
> running server.
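
The union/intersection semantics proposed above can be sketched with plain sets; this is only an illustration of the aggregation the leader master would do, not a proposed API (class, method, and feature names are made up):

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Aggregates per-server feature sets into cluster-level support:
// "all" (intersection) = safe to rely on everywhere;
// "some" (union) = available on at least one server.
class ClusterFeatures {
    static Set<String> supportedByAll(Collection<Set<String>> perServer) {
        Iterator<Set<String>> it = perServer.iterator();
        if (!it.hasNext()) return Collections.emptySet();
        Set<String> result = new HashSet<>(it.next());
        while (it.hasNext()) result.retainAll(it.next());
        return result;
    }

    static Set<String> supportedBySome(Collection<Set<String>> perServer) {
        Set<String> result = new HashSet<>();
        for (Set<String> s : perServer) result.addAll(s);
        return result;
    }
}
```

During a rolling restart the intersection naturally shrinks to what the oldest server supports, which is exactly the conservative answer a client wants.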



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-2858) Update docker readme to be more user focused

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2858:
-

Assignee: (was: Grant Henke)

> Update docker readme to be more user focused
> 
>
> Key: KUDU-2858
> URL: https://issues.apache.org/jira/browse/KUDU-2858
> Project: Kudu
>  Issue Type: Improvement
>  Components: docker, documentation
>Reporter: Grant Henke
>Priority: Major
>  Labels: docker
>
> Now that the docker images are being published, we should update the readme 
> to focus less on building the images and more on using the already built 
> images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-3244) Build and publish kudu-binary via Gradle

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3244:
-

Assignee: (was: Grant Henke)

> Build and publish kudu-binary via Gradle
> 
>
> Key: KUDU-3244
> URL: https://issues.apache.org/jira/browse/KUDU-3244
> Project: Kudu
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 1.14.0
>Reporter: Grant Henke
>Priority: Major
>
> Now that the kudu-binary jar only uses the `kudu` binary 
> ([here|https://gerrit.cloudera.org/#/c/12523/]), we should be able to 
> simplify the build and release process of that jar, and build that jar inside 
> the Gradle build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-2696) libgmock is linked into the kudu cli binary

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2696:
-

Assignee: (was: Grant Henke)

> libgmock is linked into the kudu cli binary
> ---
>
> Key: KUDU-2696
> URL: https://issues.apache.org/jira/browse/KUDU-2696
> Project: Kudu
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Mike Percy
>Priority: Minor
>
> libgmock is linked into the kudu cli binary, even though we consider it a 
> test-only dependency. Possibly a configuration problem in our cmake files?
> {code:java}
> $ ldd build/dynclang/bin/kudu | grep mock
>  libgmock.so => 
> /home/mpercy/src/kudu/thirdparty/installed/uninstrumented/lib/libgmock.so 
> (0x7f01f1495000)
> {code}
> The gmock dependency does not appear in the server binaries, as expected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-2820) No JUnit XMLs when running Java dist-test makes for frustrating precommit experience

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-2820:
-

Assignee: (was: Grant Henke)

> No JUnit XMLs when running Java dist-test makes for frustrating precommit 
> experience
> 
>
> Key: KUDU-2820
> URL: https://issues.apache.org/jira/browse/KUDU-2820
> Project: Kudu
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 1.10.0
>Reporter: Adar Dembo
>Priority: Major
>
> When running Java tests in dist-test (as the precommit job does), JUnit XML 
> files aren't generated. That's because normally they're generated by Gradle, 
> but we don't run Gradle in the dist-test slaves; we run JUnit directly.
> As a result, test failures don't propagate back to the Jenkins job, and you 
> have to click through a few links (console output --> link to dist-test job 
> --> filter failures only --> download the artifacts) to figure out what went 
> wrong.
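
One possible fix direction is for a dist-test wrapper to emit the XML itself. A rough sketch of the minimal JUnit-style report shape Jenkins can parse (the class and method names are hypothetical; only the `testsuite`/`testcase`/`failure` element shape follows the common JUnit report convention):

```java
// Builds a minimal JUnit-style XML report for a single test result,
// of the shape Jenkins' JUnit plugin consumes.
class JUnitXmlSketch {
    static String testcaseXml(String className, String testName, String failureMsg) {
        StringBuilder sb = new StringBuilder();
        sb.append("<testsuite tests=\"1\" failures=\"")
          .append(failureMsg == null ? 0 : 1).append("\">\n");
        sb.append("  <testcase classname=\"").append(className)
          .append("\" name=\"").append(testName).append("\">\n");
        if (failureMsg != null) {
            // Real output would also XML-escape the message and include
            // the failure text as element content.
            sb.append("    <failure message=\"").append(failureMsg).append("\"/>\n");
        }
        sb.append("  </testcase>\n");
        sb.append("</testsuite>\n");
        return sb.toString();
    }
}
```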



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-3148) Add Java client metrics

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-3148:
-

Assignee: (was: Grant Henke)

> Add Java client metrics
> ---
>
> Key: KUDU-3148
> URL: https://issues.apache.org/jira/browse/KUDU-3148
> Project: Kudu
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 1.12.0
>Reporter: Grant Henke
>Priority: Major
>  Labels: roadmap-candidate, supportability
>
> This Jira is to track adding complete metrics to the Java client. There are 
> many cases where applications using the client have issues that are difficult 
> to debug. The primary reason is that it's hard to reason about what the 
> application is doing with the Kudu client without inspecting the code, and 
> even then it can be easy to miss an issue in the code as well. 
> For example we have seen many cases where an application creates a Kudu 
> client, sends a few messages, and then closes the client in a loop. Creating 
> many clients over and over not only impacts performance/stability of the 
> application but can also put unwelcome load on the servers. If we had request 
> metrics with a client id tag periodically logged, then it would be easy to 
> grep the application logs for unique client ids and spot the issue and the 
> offending application.
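
The "request metrics with a client id tag, periodically logged" idea can be sketched with a tiny self-contained counter registry (class, method, and metric names here are illustrative assumptions, not the Java client's actual metrics API):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Per-client request counters tagged with a unique client id. A periodic
// log line of report() makes patterns like "many short-lived clients"
// visible as many distinct client_ids in the application logs.
class ClientMetrics {
    private final String clientId = UUID.randomUUID().toString();
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    void increment(String metric) {
        counters.computeIfAbsent(metric, k -> new LongAdder()).increment();
    }

    // What a periodically logged metrics line could look like.
    String report() {
        StringBuilder sb = new StringBuilder("client_id=" + clientId);
        counters.forEach((k, v) -> sb.append(' ').append(k).append('=').append(v.sum()));
        return sb.toString();
    }
}
```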



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KUDU-1945) Support generation of surrogate primary keys (or tables with no PK)

2021-07-13 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-1945:
-

Assignee: (was: Grant Henke)

> Support generation of surrogate primary keys (or tables with no PK)
> ---
>
> Key: KUDU-1945
> URL: https://issues.apache.org/jira/browse/KUDU-1945
> Project: Kudu
>  Issue Type: New Feature
>  Components: client, master, tablet
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: roadmap-candidate
>
> Many use cases have data where there is no "natural" primary key. For 
> example, a web log use case mostly cares about partitioning and not about 
> precise sorting by timestamp, and timestamps themselves are not necessarily 
> unique. Rather than forcing users to come up with their own surrogate primary 
> keys, Kudu should support some kind of "auto_increment" equivalent which 
> generates primary keys on insertion. Alternatively, Kudu could support tables 
> which are partitioned but not internally sorted.
> The advantages would be:
> - Kudu can pick primary keys on insertion to guarantee that there is no 
> compaction required on the table (e.g. always assign a new key higher than any 
> existing key in the local tablet). This can improve write throughput 
> substantially, especially compared to naive PK generation schemes that a user 
> might pick such as UUID, which would generate a uniform random-insert 
> workload (worst case for performance)
> - Make Kudu easier to use for such use cases (no extra client code necessary)
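
The first advantage above rests on keys being assigned monotonically within a tablet, so every insert lands past the existing data. A self-contained sketch of that idea (class name and bit layout are illustrative assumptions, not a proposed wire format):

```java
import java.util.concurrent.atomic.AtomicLong;

// Per-tablet monotonic surrogate-key generator: every new key is higher than
// any existing key in the tablet, so inserts always append (no compaction
// churn), unlike uniformly random keys such as UUIDs.
class SurrogateKeyGenerator {
    private final long tabletPrefix;             // high bits identify the tablet
    private final AtomicLong sequence = new AtomicLong();

    SurrogateKeyGenerator(int tabletId) {
        this.tabletPrefix = ((long) tabletId) << 40;
    }

    long nextKey() {
        return tabletPrefix | sequence.incrementAndGet();
    }
}
```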



--
This message was sent by Atlassian Jira
(v8.3.4#803005)