template <typename T>
std::unique_ptr<T> own(T* raw_ptr)
{
    std::unique_ptr<T> p(raw_ptr);
    return p;
}
}

int main()
{
    sp::shared_ptr<KuduClient> client;
    check_ok(KuduClientBuilder()
                 .add_master_server_addr("localhost")
                 .Build(&client));
    KuduSchema schema;
    KuduSchemaBuilder b;
    b.AddColumn("date")->
Hi Paul,
I can't reproduce the behavior you are describing; I always get a single
unbounded range partition when creating the table without specifying range
bounds or splits (regardless of hash partitioning). I searched and couldn't
find a unit test for this behavior, so I wrote one - you might co
I can verify that dropping the unbounded range partition allows me to later
add bounded partitions.
If I only have range partitioning (by commenting out the call to
add_hash_partitions), adding a bounded partition succeeds, regardless of
whether I first drop the unbounded partition. This seems su
On my Impala Parquet table, each day's partition is about 500 MB - 1 GB.
Using range partitioning by day, query time went down from 123 sec to 35 sec.
The same query against the Impala table takes 2 seconds.
On Fri, Feb 24, 2017 at 1:34 PM, Dan Burkert wrote:
> Hi Tenny,
>
> 1000 partitions is on the uppe
On Fri, Feb 24, 2017 at 12:39 PM, Adar Dembo wrote:
> It's definitely safe to increase the ulimit for open files; we
> typically test with higher values (like 32K or 64K). We don't use
> select(2) directly; any fd polling in Kudu is done via libev which I
> believe uses epoll(2) under the hood. T
Hi Tenny,
1000 partitions is on the upper end of what I'd recommend - with 3x
replication that's 125 tablet replicas per tablet server (something more
like 20 or 30 would be ideal depending on hardware). How much data does
each day have? I would aim for tablet size on the order of 50GiB, so if
i
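Dan's figures can be checked with quick arithmetic; note the 24-tserver count is taken from Tenny's later message in this thread, not this one:

```python
# Back-of-the-envelope check of the partition-count advice above.
partitions = 1000   # one range partition per day, as proposed
replication = 3     # Kudu's default replication factor
tservers = 24       # from Tenny's reply elsewhere in the thread

replicas_per_tserver = partitions * replication // tservers
print(replicas_per_tserver)  # 125, matching Dan's estimate

# At ~0.5-1 GiB of data per day and a ~50 GiB tablet target, a
# single tablet could cover roughly 50-100 days rather than one day.
days_per_tablet = 50  # assuming the 1 GiB/day upper estimate
print(days_per_tablet)
```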
Hi Paul,
I think the issue you are running into is that if you don't add a range
partition explicitly during table creation (by calling add_range_partition
or inserting a split with add_range_partition_split), Kudu will default to
creating 1 unbounded range partition. So your two options are to a
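The constraint Dan describes can be illustrated with a toy model. This is not the Kudu API, just a sketch of the non-overlap rule: a table created without explicit bounds gets one partition covering everything, so any new bounded partition overlaps it until the unbounded one is dropped:

```python
# Toy model of Kudu range partitions as half-open [lower, upper)
# intervals, with None meaning unbounded. Not the real client API.
def overlaps(a, b):
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    a_below_b = a_hi is not None and b_lo is not None and a_hi <= b_lo
    a_above_b = b_hi is not None and a_lo is not None and b_hi <= a_lo
    return not (a_below_b or a_above_b)

def add_partition(partitions, new):
    if any(overlaps(p, new) for p in partitions):
        raise ValueError("range partition overlaps existing partition")
    partitions.append(new)

# Creating a table with no explicit bounds: one unbounded partition.
table = [(None, None)]

# Adding a bounded partition fails while the unbounded one exists...
try:
    add_partition(table, (0, 100))
except ValueError:
    print("overlap rejected")

# ...but succeeds after dropping the unbounded partition.
table.remove((None, None))
add_partition(table, (0, 100))
print(table)  # [(0, 100)]
```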
I have 24 tablet servers.
I added an id column because I needed a unique column to be the primary key,
as Kudu requires a primary key to be specified. My original table actually
has 20 columns with no single primary key column. I concatenated 5 of them
to build a unique id column which I made it as
I think range partitioning is a fine solution for your use case,
though you should know that we're not recommending more than 4 TB of
total data (post-encoding/compression) per tserver at the moment. I
don't expect anything to break outright if you exceed that, but
startup will get slower and slowe
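As a quick sanity check on that ceiling (the 24-tserver figure is from Tenny's reply, not this message):

```python
# Rough cluster-capacity sketch based on the 4 TB/tserver guidance.
tservers = 24        # from Tenny's reply in this thread
tb_per_tserver = 4   # recommended post-encoding/compression maximum
print(tservers * tb_per_tserver)  # 96 TB of post-compression data
```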
I'm trying to create a table with one-column range-partitioned and another
column hash-partitioned. Documentation for add_hash_partitions and
set_range_partition_columns suggest this should be possible ("Tables must
be created with either range, hash, or range and hash partitioning").
I have a sc
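The combination Paul describes can be sketched abstractly. This toy model (not the Kudu API; column roles, bucket count, and bounds are illustrative assumptions) shows how H hash buckets on one column and R range partitions on another yield H x R tablets, with each row landing in exactly one:

```python
# Toy model of combined hash + range partitioning; not Kudu's
# actual partitioning code, just the H x R tablet-layout idea.
HASH_BUCKETS = 4
RANGE_LOWER_BOUNDS = [0, 100, 200]  # ranges [0,100), [100,200), [200,+inf)

def tablet_for(row_id, date):
    """Map a row to a (hash bucket, range partition index) pair."""
    bucket = hash(row_id) % HASH_BUCKETS
    range_idx = sum(1 for lower in RANGE_LOWER_BOUNDS[1:] if date >= lower)
    return (bucket, range_idx)

print(HASH_BUCKETS * len(RANGE_LOWER_BOUNDS))  # 12 tablets in total
print(tablet_for(7, 150))  # (3, 1): bucket hash(7) % 4 = 3, second range
```

Integer ids are used deliberately: CPython's `hash()` is deterministic for ints but salted per-process for strings.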
I'm using the debs from the cloudera-kudu ppa with little change to the
default configuration, so one master and one tablet server. I set
num_replicas(1) when creating each table. I used range partitioning with
(if I understand correctly) one large open-ended range. So that should
have 334 table
I think this makes sense for isset_bitmap_ and owned_strings_bitmap_. I
have the source checked out from github and will try to build and run the
regression tests so I can play with this idea.
In general though it seems I only want to copy the primary keys (any given
RowPtr might have other colum