[jira] [Created] (KUDU-1657) read-only FsManager::Open on active tablet can crash
Dan Burkert created KUDU-1657:

    Summary: read-only FsManager::Open on active tablet can crash
    Key: KUDU-1657
    URL: https://issues.apache.org/jira/browse/KUDU-1657
    Project: Kudu
    Issue Type: Bug
    Reporter: Dan Burkert
    Assignee: Dan Burkert

alter_table-randomized-test.cc is currently flaky due to a crash in the LogVerifier that happens because FsManager is not robust to running in read-only mode against an actively writing tablet. The root of the issue is a stale data container length that is used after reading new metadata. The failure results in log messages such as:

{code}
F0927 19:37:39.883033 22107 log_block_manager.cc:535] Found malformed block record in data file: /tmp/kudutest-4348/insert-verify-itest.InsertVerifyITest.TestInsertAndVerify.1475030222707874-17327/minicluster-data/ts-0/data/e4ade118175d48cabd2085014a6d762e.data
Record: block_id { id: 1525 } op_type: CREATE timestamp_us: 1475030259882913 offset: 5840896 length: 279030
Data file size: 6119892
*** Check failure stack trace: ***
    @ 0x7f86ce57bf5d google::LogMessage::Fail() at ??:0
    @ 0x7f86ce57de5d google::LogMessage::SendToLog() at ??:0
    @ 0x7f86ce57ba99 google::LogMessage::Flush() at ??:0
    @ 0x7f86ce57e8ff google::LogMessageFatal::~LogMessageFatal() at ??:0
    @ 0x7f86cfe4e32b kudu::fs::internal::LogBlockContainer::CheckBlockRecord() at ??:0
    @ 0x7f86cfe4dc8d kudu::fs::internal::LogBlockContainer::ReadContainerRecords() at ??:0
    @ 0x7f86cfe5731a kudu::fs::LogBlockManager::OpenRootPath() at ??:0
    @ 0x7f86cfe69023 kudu::internal::RunnableAdapter<>::Run() at ??:0
    @ 0x7f86cfe66959 kudu::internal::InvokeHelper<>::MakeItSo() at ??:0
    @ 0x7f86cfe63a77 kudu::internal::Invoker<>::Run() at ??:0
    @ 0x7f86d598b542 kudu::Callback<>::Run() at ??:0
    @ 0x7f86d598fe61 boost::_mfi::cmf0<>::operator()() at ??:0
    @ 0x7f86d598f93e boost::_bi::list1<>::operator()<>() at ??:0
    @ 0x7f86d598f05d boost::_bi::bind_t<>::operator()() at ??:0
    @ 0x7f86d598e860 boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
    @ 0x7f86d1296732 boost::function0<>::operator()() at ??:0
    @ 0x7f86cf402124 kudu::FunctionRunnable::Run() at ??:0
    @ 0x7f86cf401556 kudu::ThreadPool::DispatchThread() at ??:0
    @ 0x7f86cf405824 boost::_mfi::mf1<>::operator()() at ??:0
    @ 0x7f86cf40542b boost::_bi::list2<>::operator()<>() at ??:0
    @ 0x7f86cf404ecd boost::_bi::bind_t<>::operator()() at ??:0
    @ 0x7f86cf4047fe boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
    @ 0x7f86d1296732 boost::function0<>::operator()() at ??:0
    @ 0x7f86cf3f8717 kudu::Thread::SuperviseThread() at ??:0
    @ 0x3ae0e079d1 (unknown) at ??:0
    @ 0x3ae0ae88fd (unknown) at ??:0
    @ (nil) (unknown)
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
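The numbers in the KUDU-1657 log message can be checked by hand. The sketch below is only an illustration of the kind of bounds check the log implies, not Kudu's actual code: `BlockRecord` and `record_fits` are hypothetical names, and the only values taken from the report are the record's offset, its length, and the reported data file size.

```python
from dataclasses import dataclass


@dataclass
class BlockRecord:
    """Hypothetical stand-in for a block metadata record (names are assumptions)."""
    block_id: int
    offset: int
    length: int


def record_fits(record: BlockRecord, data_file_size: int) -> bool:
    """Return True if the block lies entirely within a data file of the given size."""
    return record.offset + record.length <= data_file_size


# Values copied from the log message in the report.
record = BlockRecord(block_id=1525, offset=5840896, length=279030)
stale_size = 6119892  # "Data file size" as reported, sampled before the new metadata was read

block_end = record.offset + record.length
overrun = block_end - stale_size

print(block_end)                        # end offset of the block
print(overrun)                          # bytes by which the block exceeds the stale size
print(record_fits(record, stale_size))  # the check fails against the stale size
print(record_fits(record, block_end))   # it would pass once the file is at least this long
```

The record ends 34 bytes past the sampled file size, which is consistent with the description: the length was read first, the writer appended more data and metadata, and the newer metadata was then validated against the older length.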
[jira] [Created] (KUDU-1656) Scanner timeouts aren't retried when waiting on a transaction
Jean-Daniel Cryans created KUDU-1656:

    Summary: Scanner timeouts aren't retried when waiting on a transaction
    Key: KUDU-1656
    URL: https://issues.apache.org/jira/browse/KUDU-1656
    Project: Kudu
    Issue Type: Bug
    Components: tserver
    Reporter: Jean-Daniel Cryans

I recently changed ITClient to use READ_AT_SNAPSHOT scanners and we've been seeing errors like this:

{noformat}
19:56:29.459 [WARN - New I/O worker #169] (AsyncKuduScanner.java:407) Can not open scanner
org.apache.kudu.client.NonRecoverableException: could not wait for desired snapshot timestamp to be consistent: Timed out waiting for all transactions with ts < P: 1475006188645381 usec, L: 0 to commit
    at org.apache.kudu.client.TabletClient.dispatchTSErrorOrReturnException(TabletClient.java:548)
    at org.apache.kudu.client.TabletClient.decode(TabletClient.java:482)
    at org.apache.kudu.client.TabletClient.decode(TabletClient.java:83)
{noformat}

Since this comes back as a TimedOut AppStatus, neither client retries the error, which doesn't seem to be the behavior expected on the server side:
https://github.com/cloudera/kudu/blob/be719edc3581802e094c3af6a88d67acba44ba71/src/kudu/tserver/tablet_service.cc#L1764

On one hand it seems weird to rely on the user to retry only certain timeouts; OTOH maybe it shouldn't be sent as a timeout? But I'm not sure what it should be.
[jira] [Created] (KUDU-1655) Update docs for ASF maven repository coordinates
Todd Lipcon created KUDU-1655: - Summary: Update docs for ASF maven repository coordinates Key: KUDU-1655 URL: https://issues.apache.org/jira/browse/KUDU-1655 Project: Kudu Issue Type: Bug Affects Versions: 1.0.0 Reporter: Todd Lipcon The docs are still pointing to Cloudera maven repos, but we now publish to the ASF repo. We should also update the "spark-shell" example to use the "--packages" argument since Kudu's available in Maven. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KUDU-1363) Add IN-list predicate type
[ https://issues.apache.org/jira/browse/KUDU-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Burkert updated KUDU-1363:

    Status: In Review (was: In Progress)

> Add IN-list predicate type
>
> Key: KUDU-1363
> URL: https://issues.apache.org/jira/browse/KUDU-1363
> Project: Kudu
> Issue Type: Sub-task
> Components: client, perf, tablet
> Reporter: Chris George
> Assignee: Dan Burkert
>
> Currently, adding multiple column range predicates for the same column
> essentially does an AND between the predicates, which for two different
> values causes no results to be returned.
> An IN-list predicate would greatly increase performance where I can complete
> in one scan what would otherwise take two.
> As an example using the Java API:
> ColumnRangePredicate columnRangePredicateColumnNameA = new
> ColumnRangePredicate(new ColumnSchema.ColumnSchemaBuilder("column_name",
> Type.STRING).build());
> columnRangePredicateColumnNameA.setLowerBound("A");
> columnRangePredicateColumnNameA.setUpperBound("A");
> ColumnRangePredicate columnRangePredicateColumnNameB = new
> ColumnRangePredicate(new ColumnSchema.ColumnSchemaBuilder("column_name",
> Type.STRING).build());
> columnRangePredicateColumnNameB.setLowerBound("B");
> columnRangePredicateColumnNameB.setUpperBound("B");
> which would be equivalent to:
> select * from some_table where column_name="A" or column_name="B"
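The difference between ANDing two range predicates and an IN-list can be sketched without any Kudu API. The snippet below is a plain-Python model under stated assumptions: rows are dictionaries, `in_range` is a hypothetical helper standing in for a column range predicate, and nothing here reflects how the Kudu client or tablet server evaluates predicates.

```python
# Toy table with one STRING column, mirroring the example in the issue.
rows = [{"column_name": "A"}, {"column_name": "B"}, {"column_name": "C"}]


def in_range(value, lower, upper):
    """Hypothetical inclusive range predicate: lower <= value <= upper."""
    return lower <= value <= upper


# Two range predicates on the same column are combined with AND,
# so a row would have to equal "A" and "B" at the same time.
and_result = [
    r for r in rows
    if in_range(r["column_name"], "A", "A") and in_range(r["column_name"], "B", "B")
]

# An IN-list predicate expresses the intended OR in a single scan.
in_list_result = [r for r in rows if r["column_name"] in {"A", "B"}]

print(and_result)
print(in_list_result)
```

With the AND semantics nothing matches, while the IN-list form returns the "A" and "B" rows, which is the behavior the SQL statement in the issue describes.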
[jira] [Comment Edited] (KUDU-1642) Add IS NULL predicate type
[ https://issues.apache.org/jira/browse/KUDU-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526919#comment-15526919 ]

Alexey Serbin edited comment on KUDU-1642 at 9/27/16 6:08 PM:

Yes, psql supports NULL in IN-list predicates. At least with PostgreSQL 9.3. Probably, that's done to support sub-selects like {{SELECT * FROM x WHERE field_x IN (SELECT field_y FROM y)}}; besides, they have stored procedures and pgsql, so it's crucial to provide syntax consistency there.

By itself, {{WHERE field_x IN (NULL)}} should result in an empty result set by definition, since it's the same as {{WHERE field_x = NULL}}.

I'm not sure whether supporting that brings any value to the Kudu project as is, since sub-selects are not supported in Kudu now, AFAIK. It's more about syntax consistency.

{noformat}
postgres@ubuntu-14:~$ psql
psql (9.3.13)
Type "help" for help.

postgres=# INSERT INTO x VALUES (0, 1);
INSERT 0 1
postgres=# INSERT INTO x VALUES (1, NULL);
INSERT 0 1
postgres=# SELECT * FROM x;
 a | b
---+---
 0 | 1
 1 |
(2 rows)

postgres=# SELECT * FROM x WHERE a IN (0, NULL);
 a | b
---+---
 0 | 1
(1 row)

postgres=# SELECT * FROM x WHERE b IN (1, NULL);
 a | b
---+---
 0 | 1
(1 row)

postgres=# SELECT * FROM x WHERE b IN (NULL);
 a | b
---+---
(0 rows)
{noformat}

was (Author: aserbin):
Yes, psql supports NULL in IN-list predicates. At least with PostgreSQL 9.3. Probably, that's done to support sub-selects like {{SELECT * FROM x WHERE field_x IN (SELECT field_y FROM y)}}.

By itself, {{WHERE field_x IN (NULL)}} should result in an empty result set by definition, since it's the same as {{WHERE field_x = NULL}}.

I'm not sure whether supporting that brings any value to the Kudu project as is, since sub-selects are not supported in Kudu now, AFAIK. It's more about syntax consistency.

{noformat}
postgres@ubuntu-14:~$ psql
psql (9.3.13)
Type "help" for help.

postgres=# INSERT INTO x VALUES (0, 1);
INSERT 0 1
postgres=# INSERT INTO x VALUES (1, NULL);
INSERT 0 1
postgres=# SELECT * FROM x;
 a | b
---+---
 0 | 1
 1 |
(2 rows)

postgres=# SELECT * FROM x WHERE a IN (0, NULL);
 a | b
---+---
 0 | 1
(1 row)

postgres=# SELECT * FROM x WHERE b IN (1, NULL);
 a | b
---+---
 0 | 1
(1 row)

postgres=# SELECT * FROM x WHERE b IN (NULL);
 a | b
---+---
(0 rows)
{noformat}

> Add IS NULL predicate type
>
> Key: KUDU-1642
> URL: https://issues.apache.org/jira/browse/KUDU-1642
> Project: Kudu
> Issue Type: Sub-task
> Components: client, tablet
> Reporter: Dan Burkert
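The NULL behavior shown in the psql session can be reproduced from Python. The sketch below is an assumption-laden stand-in: it uses SQLite through the standard-library `sqlite3` module rather than PostgreSQL 9.3, and the helper `query` is a hypothetical convenience, so it illustrates the SQL three-valued logic for these four predicates rather than PostgreSQL itself.

```python
import sqlite3

# In-memory database holding the same two rows as the psql session.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO x VALUES (?, ?)", [(0, 1), (1, None)])


def query(where):
    """Hypothetical helper: run SELECT with the given WHERE clause, ordered by a."""
    return conn.execute(f"SELECT a, b FROM x WHERE {where} ORDER BY a").fetchall()


in_zero_null = query("a IN (0, NULL)")  # a = 0 matches; a = 1 compares to NULL (unknown)
in_one_null = query("b IN (1, NULL)")   # b = 1 matches; the NULL row is unknown, not true
in_null = query("b IN (NULL)")          # comparison with NULL is never true
is_null = query("b IS NULL")            # the dedicated predicate finds the NULL row

print(in_zero_null, in_one_null, in_null, is_null)
```

The last query is the reason a separate IS NULL predicate type is useful: an IN-list containing NULL cannot select rows whose column is NULL, because `b = NULL` evaluates to unknown rather than true.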
[jira] [Resolved] (KUDU-1652) Partition pruning / scan optimization fails with IS NOT NULL predicate on PK column
[ https://issues.apache.org/jira/browse/KUDU-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Burkert resolved KUDU-1652.

    Resolution: Fixed
    Fix Version/s: 1.0.1
                   1.0.0

> Partition pruning / scan optimization fails with IS NOT NULL predicate on PK column
>
> Key: KUDU-1652
> URL: https://issues.apache.org/jira/browse/KUDU-1652
> Project: Kudu
> Issue Type: Sub-task
> Components: client, tablet
> Affects Versions: 1.0.0
> Reporter: Dan Burkert
> Assignee: Dan Burkert
> Priority: Blocker
> Fix For: 1.0.0, 1.0.1
>
> Both the Java client and C++ client/server currently have a bug where
> attempting a scan with an {{IS NOT NULL}} predicate on a primary key column
> can throw an exception (Java), or crash the C++ client or server. This is
> a rare situation currently since {{IS NOT NULL}} is not publicly accessible,
> so it has to come from a simplified predicate like {{my_int8_column <= 127}}.
> The fix is straightforward: stop encoding the lower/upper bound keys when an
> {{IS NOT NULL}} predicate is encountered.
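The simplification mentioned in KUDU-1652 can be illustrated with a toy model. This is a sketch under assumptions, not the Kudu implementation: predicates are represented as plain tuples, and `simplify_upper_bound` is a hypothetical function that only handles an inclusive upper bound on an int8 column.

```python
INT8_MIN, INT8_MAX = -128, 127


def simplify_upper_bound(column, inclusive_upper):
    """Simplify `column <= inclusive_upper` for an int8 column (toy representation)."""
    if inclusive_upper >= INT8_MAX:
        # Every non-NULL int8 value satisfies the bound, so only NULLs are excluded.
        return (column, "IS NOT NULL")
    if inclusive_upper < INT8_MIN:
        # No int8 value can satisfy the bound.
        return (column, "NONE")
    return (column, "<=", inclusive_upper)


print(simplify_upper_bound("my_int8_column", 127))
print(simplify_upper_bound("my_int8_column", 5))
print(simplify_upper_bound("my_int8_column", -200))
```

In this model the first call yields an IS NOT NULL predicate even though the user never wrote one, which matches the issue's point that such a predicate carries no lower or upper bound value to encode into a key.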
[jira] [Comment Edited] (KUDU-1642) Add IS NULL predicate type
[ https://issues.apache.org/jira/browse/KUDU-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526919#comment-15526919 ] Alexey Serbin edited comment on KUDU-1642 at 9/27/16 5:55 PM: -- Yes, psql supports NULL in IN-list predicates. At least with PostgreSQL 9.3. Probably, that's done to support sub-selects like {{SELECT * FROM x WHERE field_x IN (SELECT field_y FROM y}}. By itself, {{WHERE field_x IN (NULL)}} should result in empty resultset by definition, since it's the same as {{WHERE field_x = NULL}}. I'm not sure whether supporting that brings any value to the Kudu project as is since sub-selects are not supported in Kudu now, AFAIK. It's more about syntax consistency. {noformat} postgres@ubuntu-14:~$ psql psql (9.3.13) Type "help" for help. postgres=# INSERT INTO x VALUES (0, 1); INSERT 0 1 postgres=# INSERT INTO x VALUES (1, NULL); INSERT 0 1 postgres=# SELECT * FROM x; a | b ---+--- 0 | 1 1 | (2 rows) postgres=# SELECT * FROM x WHERE a IN (0, NULL); a | b ---+--- 0 | 1 (1 row) postgres=# SELECT * FROM x WHERE b IN (1, NULL); a | b ---+--- 0 | 1 (1 row) postgres=# SELECT * FROM x WHERE b IN (NULL); a | b ---+--- (0 rows) {noformat} was (Author: aserbin): Yes, psql supports NULL in IN-list predicates. At least with PostgreSQL 9.3. Probably, that's done to support sub-selects like 'SELECT * FROM x WHERE field_x IN (SELECT field_y FROM y), because {{WHERE field_x IN (NULL)}} should result in empty resultset by definition (it's the same as {{WHERE field_x = NULL}}). {noformat} postgres@ubuntu-14:~$ psql psql (9.3.13) Type "help" for help. 
postgres=# INSERT INTO x VALUES (0, 1); INSERT 0 1 postgres=# INSERT INTO x VALUES (1, NULL); INSERT 0 1 postgres=# SELECT * FROM x; a | b ---+--- 0 | 1 1 | (2 rows) postgres=# SELECT * FROM x WHERE a IN (0, NULL); a | b ---+--- 0 | 1 (1 row) postgres=# SELECT * FROM x WHERE b IN (1, NULL); a | b ---+--- 0 | 1 (1 row) postgres=# SELECT * FROM x WHERE b IN (NULL); a | b ---+--- (0 rows) {noformat} > Add IS NULL predicate type > -- > > Key: KUDU-1642 > URL: https://issues.apache.org/jira/browse/KUDU-1642 > Project: Kudu > Issue Type: Sub-task > Components: client, tablet >Reporter: Dan Burkert > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KUDU-1642) Add IS NULL predicate type
[ https://issues.apache.org/jira/browse/KUDU-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526919#comment-15526919 ] Alexey Serbin commented on KUDU-1642: - Yes, psql supports NULL in IN-list predicates. At least with PostgreSQL 9.3. Probably, that's done to support sub-selects like 'SELECT * FROM x WHERE field_x IN (SELECT field_y FROM y), because {{WHERE field_x IN (NULL)}} should result in empty resultset by definition (it's the same as {{WHERE field_x = NULL}}). {noformat} postgres@ubuntu-14:~$ psql psql (9.3.13) Type "help" for help. postgres=# INSERT INTO x VALUES (0, 1); INSERT 0 1 postgres=# INSERT INTO x VALUES (1, NULL); INSERT 0 1 postgres=# SELECT * FROM x; a | b ---+--- 0 | 1 1 | (2 rows) postgres=# SELECT * FROM x WHERE a IN (0, NULL); a | b ---+--- 0 | 1 (1 row) postgres=# SELECT * FROM x WHERE b IN (1, NULL); a | b ---+--- 0 | 1 (1 row) postgres=# SELECT * FROM x WHERE b IN (NULL); a | b ---+--- (0 rows) {noformat} > Add IS NULL predicate type > -- > > Key: KUDU-1642 > URL: https://issues.apache.org/jira/browse/KUDU-1642 > Project: Kudu > Issue Type: Sub-task > Components: client, tablet >Reporter: Dan Burkert > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KUDU-1619) Master Web UI "Tablet Servers" tab should separate live and suspected dead tablet servers
[ https://issues.apache.org/jira/browse/KUDU-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Berkeley resolved KUDU-1619. - Resolution: Fixed Fix Version/s: 1.1.0 Resolved by Ninad in 376f95b6dc19ceb13221f02851cf58366f71825d > Master Web UI "Tablet Servers" tab should separate live and suspected dead > tablet servers > - > > Key: KUDU-1619 > URL: https://issues.apache.org/jira/browse/KUDU-1619 > Project: Kudu > Issue Type: Improvement > Components: master >Reporter: Dan Burkert >Assignee: Ninad Shringarpure > Labels: newbie > Fix For: 1.1.0 > > > We already list the count of live and dead tablet servers, we should split > them into separate tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KUDU-352) Decide on and implement within-batch ordering for client API
[ https://issues.apache.org/jira/browse/KUDU-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526783#comment-15526783 ]

Jean-Daniel Cryans commented on KUDU-352:

The solution I had implemented in the Java client that you were referring to in your May 14th 2015 comment was completely refactored this summer and we still retain ordering. So at least on this side we're good.

> Decide on and implement within-batch ordering for client API
>
> Key: KUDU-352
> URL: https://issues.apache.org/jira/browse/KUDU-352
> Project: Kudu
> Issue Type: New Feature
> Components: client
> Affects Versions: M5
> Reporter: Vladimir Feinberg
>
> Currently, when the client applies a sequence of WriteOperations to a
> session without flushing (within a single batch), the batcher runs tablet
> location lookup asynchronously (see method Batcher::TabletLookupFinished).
> Thus, it is possible that within the same batch, even with manual flushing,
> the PerTSBuffer is flushed out of order (causing operations to arrive
> out-of-order on the server side).
> A contract needs to be designed (and applied to both C++ and Java APIs)
> regarding the strength of the ordering within the batches.
> Some options:
> 1. No order guaranteed (current). Client must manually flush between batches
> to ensure order.
> 2. Per-row order guarantee - operations are sent to the server where for a
> given key, the sequence of operations is preserved.
> 3. Strict ordering guarantee. Independent of keys, order of batch is matched.
> Things to consider:
> -> Is (2) different from (3)? With HybridTime, the client should only see
> changes atomically on a per-batch level with concurrent reads. Then
> between-row operations do not matter (until multi-row transactions are
> introduced).
> -> A flexible version of the API that could include BarrierWriteOperations
> which would allow the user to control order within batches themselves.
> -> Simplifying things entirely, removing all order (force the client to use a
> transaction or flushes to ensure order).
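The difference between options 2 and 3 above can be made concrete with a toy model. Everything below is hypothetical: keys are small integers, the "tablet server" for a key is chosen by `key % num_servers`, and buffers are delivered in reverse server order purely to simulate out-of-order arrival. It does not model the real Batcher or PerTSBuffer.

```python
from collections import defaultdict


def partition(ops, num_servers):
    """Group operations into per-server buffers, tagging each with its batch position."""
    buffers = defaultdict(list)
    for seq, (key, op) in enumerate(ops):
        buffers[key % num_servers].append((seq, key, op))
    return buffers


# One batch: two operations on key 1 interleaved with two on key 2.
ops = [(1, "insert"), (2, "insert"), (1, "update"), (2, "delete")]
buffers = partition(ops, num_servers=2)

# Simulate buffers reaching the servers in a different order than they were filled.
arrival = [entry for server in sorted(buffers, reverse=True) for entry in buffers[server]]
arrival_seqs = [seq for seq, _, _ in arrival]

# Per-key view of the arrival order.
per_key = defaultdict(list)
for seq, key, _ in arrival:
    per_key[key].append(seq)

print(arrival_seqs)   # batch positions in arrival order
print(dict(per_key))  # batch positions grouped by key
```

In this model the batch positions arrive interleaved (option 3 is violated), yet each key still sees its own operations in order (option 2 holds), because a key only ever lands in one buffer. Whether that property holds in the real clients depends on the buffering and lookup behavior the issue describes.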
[jira] [Resolved] (KUDU-1241) Add support for Kudu TIMESTAMPs to the Impala kudu scanners
[ https://issues.apache.org/jira/browse/KUDU-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved KUDU-1241. --- Resolution: Incomplete Fix Version/s: n/a Resolving since this is tracked on the Impala side (link above) > Add support for Kudu TIMESTAMPs to the Impala kudu scanners > --- > > Key: KUDU-1241 > URL: https://issues.apache.org/jira/browse/KUDU-1241 > Project: Kudu > Issue Type: New Feature > Components: impala >Affects Versions: Public beta >Reporter: David Alves > Fix For: n/a > > > We currently have the TIMESTAMP type on the Kudu side but still haven't added > support for it on the impala side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KUDU-1216) Add integration for Spark DStream map and foreach partition
[ https://issues.apache.org/jira/browse/KUDU-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-1216.

    Resolution: Duplicate
    Fix Version/s: n/a

Closing as duplicate since I think this is implemented by the current Spark integration. Feel free to re-open if I misunderstood.

> Add integration for Spark DStream map and foreach partition
>
> Key: KUDU-1216
> URL: https://issues.apache.org/jira/browse/KUDU-1216
> Project: Kudu
> Issue Type: New Feature
> Components: integration
> Reporter: Ted Malaska
> Fix For: n/a
>
> This jira will add two implicit methods to Spark DStream:
> 1. kuduForeachPartition
> 2. kuduMapPartitions
> These methods will act like the basic foreach/map partition, but they will
> provide the developer a live client to interact with Kudu.
> These methods will be accessible from two different call points:
> 1. Scala DStream
> 2. KuduContext (which will work for Java)
[jira] [Resolved] (KUDU-1215) Add integration for Spark map and foreach partition
[ https://issues.apache.org/jira/browse/KUDU-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-1215.

    Resolution: Duplicate
    Fix Version/s: n/a

Resolving this as duplicate since I think the API that is currently available solves these use cases. Feel free to reopen if I misunderstood the feature described here.

> Add integration for Spark map and foreach partition
>
> Key: KUDU-1215
> URL: https://issues.apache.org/jira/browse/KUDU-1215
> Project: Kudu
> Issue Type: New Feature
> Components: integration
> Reporter: Ted Malaska
> Fix For: n/a
>
> This jira will add two implicit methods to Spark RDD:
> 1. kuduForeachPartition
> 2. kuduMapPartitions
> These methods will act like the basic foreach/map partition, but they will
> provide the developer a live client to interact with Kudu.
> These methods will be accessible from two different call points:
> 1. Scala RDD
> 2. KuduContext (which will work for Java)
[jira] [Commented] (KUDU-352) Decide on and implement within-batch ordering for client API
[ https://issues.apache.org/jira/browse/KUDU-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526741#comment-15526741 ]

Todd Lipcon commented on KUDU-352:

[~aserbin] - do you think this issue is now fully implemented/resolved in both the Java and C++ clients? I think you're the latest to look at ordering semantics from the client API.

> Decide on and implement within-batch ordering for client API
>
> Key: KUDU-352
> URL: https://issues.apache.org/jira/browse/KUDU-352
> Project: Kudu
> Issue Type: New Feature
> Components: client
> Affects Versions: M5
> Reporter: Vladimir Feinberg
>
> Currently, when the client applies a sequence of WriteOperations to a
> session without flushing (within a single batch), the batcher runs tablet
> location lookup asynchronously (see method Batcher::TabletLookupFinished).
> Thus, it is possible that within the same batch, even with manual flushing,
> the PerTSBuffer is flushed out of order (causing operations to arrive
> out-of-order on the server side).
> A contract needs to be designed (and applied to both C++ and Java APIs)
> regarding the strength of the ordering within the batches.
> Some options:
> 1. No order guaranteed (current). Client must manually flush between batches
> to ensure order.
> 2. Per-row order guarantee - operations are sent to the server where for a
> given key, the sequence of operations is preserved.
> 3. Strict ordering guarantee. Independent of keys, order of batch is matched.
> Things to consider:
> -> Is (2) different from (3)? With HybridTime, the client should only see
> changes atomically on a per-batch level with concurrent reads. Then
> between-row operations do not matter (until multi-row transactions are
> introduced).
> -> A flexible version of the API that could include BarrierWriteOperations
> which would allow the user to control order within batches themselves.
> -> Simplifying things entirely, removing all order (force the client to use a
> transaction or flushes to ensure order).
[jira] [Resolved] (KUDU-99) Separate internal (e.g., storage related) column attributes from external ones and change APIs
[ https://issues.apache.org/jira/browse/KUDU-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-99.

    Resolution: Won't Fix
    Fix Version/s: n/a

Resolving this as won't fix, since the surgery required at this point would be pretty heavy and probably break wire compat. We can re-open if it becomes more urgent such that it's worth the compatibility issues.

> Separate internal (e.g., storage related) column attributes from external
> ones and change APIs
>
> Key: KUDU-99
> URL: https://issues.apache.org/jira/browse/KUDU-99
> Project: Kudu
> Issue Type: New Feature
> Components: client, master, tablet
> Affects Versions: Backlog
> Reporter: Alex Feinberg
> Fix For: n/a
>
> We currently use ColumnSchema (and its matching protobuf message,
> ColumnSchemaPB) to specify both logical (e.g., which data type, what
> constitutes a key, is it nullable) and physical (e.g., compression codec,
> encoding algorithm) attributes.
> We need an approach that allows the user to specify and alter these external
> attributes via our "DDL" APIs, without having to include them in every single
> message and data structure: e.g., when sending other data over the wire
> (which may as well be encoded in a different format or sent as plain text).
[jira] [Updated] (KUDU-81) rpc-test TestConnectionKeepalive failure
[ https://issues.apache.org/jira/browse/KUDU-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated KUDU-81: Issue Type: Bug (was: New Feature) > rpc-test TestConnectionKeepalive failure > > > Key: KUDU-81 > URL: https://issues.apache.org/jira/browse/KUDU-81 > Project: Kudu > Issue Type: Bug > Components: rpc, test >Affects Versions: M4 >Reporter: Todd Lipcon >Priority: Trivial > Labels: flaky > > Saw this fail once: > {code} > /var/lib/jenkins/workspace/kudu-test/BUILD_TYPE/LEAKCHECK/label/centos6-kudu/src/rpc/rpc-test.cc:155: > Failure > Value of: metrics.num_server_connections_ > Actual: 1 > Expected: 0 > Server should have 0 server connections > {code} > Probably just a timing issue in the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KUDU-1214) Add Integration points for Spark, Spark Streaming, and Spark SQL
[ https://issues.apache.org/jira/browse/KUDU-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved KUDU-1214. --- Resolution: Duplicate Fix Version/s: n/a Hearing nothing, I'm going to assume that what we've already built satisfies this original JIRA. Feel free to open a new JIRA if there are feature requests against the Spark integration. > Add Integration points for Spark, Spark Streaming, and Spark SQL > > > Key: KUDU-1214 > URL: https://issues.apache.org/jira/browse/KUDU-1214 > Project: Kudu > Issue Type: New Feature > Components: integration >Reporter: Ted Malaska > Fix For: n/a > > Attachments: KUDU-1214.1.patch > > > This Jira will be broken up into four main jira: > 1. Add Support for Spark RDD map and foreach integration with Kudu > 2. Add Support for Spark DStream map and foreach integration with Kudu > 3. Add Support for Spark SQL defaultSource and push down predicates > 4. Add documentation for all Spark Integrations -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KUDU-1654) Python 3 Client Test Failure: test_table_column
[ https://issues.apache.org/jira/browse/KUDU-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan Birdsell updated KUDU-1654:

    Status: In Review (was: Open)

> Python 3 Client Test Failure: test_table_column
>
> Key: KUDU-1654
> URL: https://issues.apache.org/jira/browse/KUDU-1654
> Project: Kudu
> Issue Type: Bug
> Components: python
> Affects Versions: Public beta
> Reporter: Jordan Birdsell
> Assignee: Jordan Birdsell
>
> Python 3 requires an explicit encoding to be specified when casting a string
> to bytes; in Python 2, bytes is synonymous with string, so this is a non-issue.
> The test should be updated to use the compat module, which accounts for this
> difference with its frombytes method.
> self =
>     def test_table_column(self):
>         table = self.client.table(self.ex_table)
>         cols = [(table['key'], 'key', 'int32'),
>                 (table[1], 'int_val', 'int32'),
>                 (table[-1], 'unixtime_micros_val', 'unixtime_micros')]
>
>         for col, name, type in cols:
> >           assert col.name == bytes(name)
> E           TypeError: string argument without an encoding
[jira] [Commented] (KUDU-1654) Python 3 Client Test Failure: test_table_column
[ https://issues.apache.org/jira/browse/KUDU-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15525987#comment-15525987 ]

Jordan Birdsell commented on KUDU-1654:

https://gerrit.cloudera.org/#/c/4543/

> Python 3 Client Test Failure: test_table_column
>
> Key: KUDU-1654
> URL: https://issues.apache.org/jira/browse/KUDU-1654
> Project: Kudu
> Issue Type: Bug
> Components: python
> Affects Versions: Public beta
> Reporter: Jordan Birdsell
> Assignee: Jordan Birdsell
>
> Python 3 requires an explicit encoding to be specified when casting a string
> to bytes; in Python 2, bytes is synonymous with string, so this is a non-issue.
> The test should be updated to use the compat module, which accounts for this
> difference with its frombytes method.
> self =
>     def test_table_column(self):
>         table = self.client.table(self.ex_table)
>         cols = [(table['key'], 'key', 'int32'),
>                 (table[1], 'int_val', 'int32'),
>                 (table[-1], 'unixtime_micros_val', 'unixtime_micros')]
>
>         for col, name, type in cols:
> >           assert col.name == bytes(name)
> E           TypeError: string argument without an encoding
[jira] [Created] (KUDU-1654) Python 3 Client Test Failure: test_table_column
Jordan Birdsell created KUDU-1654:

    Summary: Python 3 Client Test Failure: test_table_column
    Key: KUDU-1654
    URL: https://issues.apache.org/jira/browse/KUDU-1654
    Project: Kudu
    Issue Type: Bug
    Components: python
    Affects Versions: Public beta
    Reporter: Jordan Birdsell
    Assignee: Jordan Birdsell

Python 3 requires an explicit encoding to be specified when casting a string to bytes; in Python 2, bytes is synonymous with string, so this is a non-issue. The test should be updated to use the compat module, which accounts for this difference with its frombytes method.

self =

    def test_table_column(self):
        table = self.client.table(self.ex_table)
        cols = [(table['key'], 'key', 'int32'),
                (table[1], 'int_val', 'int32'),
                (table[-1], 'unixtime_micros_val', 'unixtime_micros')]

        for col, name, type in cols:
>           assert col.name == bytes(name)
E           TypeError: string argument without an encoding
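The failure in the traceback can be reproduced in isolation under Python 3. In the sketch below, `to_bytes` is a hypothetical helper written for illustration; it is not the Kudu Python client's compat module, whose exact API is not shown in the report.

```python
def to_bytes(name, encoding="utf-8"):
    """Hypothetical helper: return `name` as bytes, encoding str values explicitly."""
    if isinstance(name, bytes):
        return name
    return name.encode(encoding)


# Under Python 3, bytes() on a str without an encoding raises TypeError,
# which is the error shown in the test failure above.
try:
    bytes("key")
    raised = False
except TypeError:
    raised = True

print(raised)
print(to_bytes("key"))
print(to_bytes(b"key"))
```

Routing the comparison through a helper like this keeps the assertion independent of whether the column name is exposed as bytes or as text.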
[jira] [Commented] (KUDU-1653) Python 3 Failing to decode serialized token
[ https://issues.apache.org/jira/browse/KUDU-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15525935#comment-15525935 ] Jordan Birdsell commented on KUDU-1653: --- https://gerrit.cloudera.org/#/c/4542/ > Python 3 Failing to decode serialized token > --- > > Key: KUDU-1653 > URL: https://issues.apache.org/jira/browse/KUDU-1653 > Project: Kudu > Issue Type: Bug > Components: python >Affects Versions: 1.1.0 >Reporter: Jordan Birdsell >Assignee: Jordan Birdsell > > Python 3 attempts to deserialize token into utf-8 string and causes failure: > UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 141: > invalid start byte -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KUDU-1653) Python 3 Failing to decode serialized token
[ https://issues.apache.org/jira/browse/KUDU-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan Birdsell updated KUDU-1653: -- Status: In Review (was: Open) > Python 3 Failing to decode serialized token > --- > > Key: KUDU-1653 > URL: https://issues.apache.org/jira/browse/KUDU-1653 > Project: Kudu > Issue Type: Bug > Components: python >Affects Versions: 1.1.0 >Reporter: Jordan Birdsell >Assignee: Jordan Birdsell > > Python 3 attempts to deserialize token into utf-8 string and causes failure: > UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 141: > invalid start byte -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KUDU-1653) Python 3 Failing to decode serialized token
Jordan Birdsell created KUDU-1653: - Summary: Python 3 Failing to decode serialized token Key: KUDU-1653 URL: https://issues.apache.org/jira/browse/KUDU-1653 Project: Kudu Issue Type: Bug Components: python Affects Versions: 1.1.0 Reporter: Jordan Birdsell Assignee: Jordan Birdsell Python 3 attempts to deserialize token into utf-8 string and causes failure: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 141: invalid start byte -- This message was sent by Atlassian JIRA (v6.3.4#6332)
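The UnicodeDecodeError quoted in KUDU-1653 is what Python 3 raises whenever arbitrary binary data is decoded as UTF-8. The sketch below uses a made-up byte string (`token` is an assumption, not a real serialized scan token) to show the failure and two ways of handling binary payloads that avoid it.

```python
import base64

# Made-up binary payload containing 0x80, which cannot start a UTF-8 sequence.
token = b"\x0a\x03key\x80\x01"

try:
    token.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Option 1: keep the payload as bytes end to end.
round_tripped = bytes(token)

# Option 2: if a text form is needed, use a binary-safe encoding such as base64.
as_text = base64.b64encode(token).decode("ascii")
restored = base64.b64decode(as_text)

print(decode_failed)
print(round_tripped == token, restored == token)
```

Either approach treats the serialized token as opaque bytes, which is generally the safer assumption for serialized protobuf-style data.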
[jira] [Resolved] (KUDU-1650) Python - Python 3 GetUnixTimeMicros Symbol Not Recognized
[ https://issues.apache.org/jira/browse/KUDU-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan Birdsell resolved KUDU-1650. --- Resolution: Not A Problem Fix Version/s: n/a Issue was related to not having the LD_LIBRARY_PATH set > Python - Python 3 GetUnixTimeMicros Symbol Not Recognized > - > > Key: KUDU-1650 > URL: https://issues.apache.org/jira/browse/KUDU-1650 > Project: Kudu > Issue Type: Bug > Components: python >Affects Versions: 1.1.0 >Reporter: Jordan Birdsell >Assignee: Jordan Birdsell >Priority: Blocker > Fix For: n/a > > > Python 3 is raising symbol not recognized errors on the method > GetUnixTimeMicros. -- This message was sent by Atlassian JIRA (v6.3.4#6332)