Re: impala with kudu write become very slow

2019-07-18 Thread Tim Armstrong
Also including the Kudu list in case someone there recognises the problem.

On Thu, Jul 18, 2019 at 8:05 AM lk_hadoop wrote:

> I0718 18:42:22.677520 51139 coordinator.cc:357] starting execution on 5
> backends for query_id=2e4a3fbec0d7d721:2ec73c1c
> I0718 18:42:22.679605 12873 impala-internal-service.cc:44]
> ExecQueryFInstances(): query_id=2e4a3fbec0d7d721:2ec73c1c
> I0718 18:42:22.679620 12873 query-exec-mgr.cc:46] StartQueryFInstances()
> query_id=2e4a3fbec0d7d721:2ec73c1c
> coord=realtimeanalysis-kudu-04-10-8-50-58:22000
> I0718 18:42:22.679625 12873 query-state.cc:178] Buffer pool limit for
> 2e4a3fbec0d7d721:2ec73c1c: 17179869184
> I0718 18:42:22.679675 12873 initial-reservations.cc:60] Successfully
> claimed initial reservations (4.00 MB) for query
> 2e4a3fbec0d7d721:2ec73c1c
> I0718 18:42:22.679769 51332 query-state.cc:309] StartFInstances():
> query_id=2e4a3fbec0d7d721:2ec73c1c #instances=2
> I0718 18:42:22.680196 51332 query-state.cc:322] descriptor table for
> query=2e4a3fbec0d7d721:2ec73c1c
> tuples:
> Tuple(id=2 size=567 slots=[Slot(id=52 type=INT col_path=[] offset=464
> null=(offset=563 mask=20) slot_idx=29 field_idx=-1), Slot(id=53 type=STRING
> col_path=[] offset=0 null=(offset=560 mask=1) slot_idx=0 field_idx=-1),
> Slot(id=54 type=STRING col_path=[] offset=16 null=(offset=560 mask=2)
> slot_idx=1 field_idx=-1), Slot(id=55 type=STRING col_path=[] offset=32
> null=(offset=560 mask=4) slot_idx=2 field_idx=-1), Slot(id=56 type=STRING
> col_path=[] offset=48 null=(offset=560 mask=8) slot_idx=3 field_idx=-1),
> Slot(id=57 type=STRING col_path=[] offset=64 null=(offset=560 mask=10)
> slot_idx=4 field_idx=-1), Slot(id=58 type=STRING col_path=[] offset=80
> null=(offset=560 mask=20) slot_idx=5 field_idx=-1), Slot(id=59 type=STRING
> col_path=[] offset=96 null=(offset=560 mask=40) slot_idx=6 field_idx=-1),
> Slot(id=60 type=STRING col_path=[] offset=112 null=(offset=560 mask=80)
> slot_idx=7 field_idx=-1), Slot(id=61 type=STRING col_path=[] offset=128
> null=(offset=561 mask=1) slot_idx=8 field_idx=-1), Slot(id=62 type=STRING
> col_path=[] offset=144 null=(offset=561 mask=2) slot_idx=9 field_idx=-1),
> Slot(id=63 type=STRING col_path=[] offset=160 null=(offset=561 mask=4)
> slot_idx=10 field_idx=-1), Slot(id=64 type=INT col_path=[] offset=468
> null=(offset=563 mask=40) slot_idx=30 field_idx=-1), Slot(id=65 type=INT
> col_path=[] offset=472 null=(offset=563 mask=80) slot_idx=31 field_idx=-1),
> Slot(id=66 type=INT col_path=[] offset=476 null=(offset=564 mask=1)
> slot_idx=32 field_idx=-1), Slot(id=67 type=INT col_path=[] offset=480
> null=(offset=564 mask=2) slot_idx=33 field_idx=-1), Slot(id=68 type=STRING
> col_path=[] offset=176 null=(offset=561 mask=8) slot_idx=11 field_idx=-1),
> Slot(id=69 type=STRING col_path=[] offset=192 null=(offset=561 mask=10)
> slot_idx=12 field_idx=-1), Slot(id=70 type=STRING col_path=[] offset=208
> null=(offset=561 mask=20) slot_idx=13 field_idx=-1), Slot(id=71 type=STRING
> col_path=[] offset=224 null=(offset=561 mask=40) slot_idx=14 field_idx=-1),
> Slot(id=72 type=STRING col_path=[] offset=240 null=(offset=561 mask=80)
> slot_idx=15 field_idx=-1), Slot(id=73 type=STRING col_path=[] offset=256
> null=(offset=562 mask=1) slot_idx=16 field_idx=-1), Slot(id=74 type=INT
> col_path=[] offset=484 null=(offset=564 mask=4) slot_idx=34 field_idx=-1),
> Slot(id=75 type=INT col_path=[] offset=488 null=(offset=564 mask=8)
> slot_idx=35 field_idx=-1), Slot(id=76 type=INT col_path=[] offset=492
> null=(offset=564 mask=10) slot_idx=36 field_idx=-1), Slot(id=77 type=INT
> col_path=[] offset=496 null=(offset=564 mask=20) slot_idx=37 field_idx=-1),
> Slot(id=78 type=INT col_path=[] offset=500 null=(offset=564 mask=40)
> slot_idx=38 field_idx=-1), Slot(id=79 type=INT col_path=[] offset=504
> null=(offset=564 mask=80) slot_idx=39 field_idx=-1), Slot(id=80 type=INT
> col_path=[] offset=508 null=(offset=565 mask=1) slot_idx=40 field_idx=-1),
> Slot(id=81 type=STRING col_path=[] offset=272 null=(offset=562 mask=2)
> slot_idx=17 field_idx=-1), Slot(id=82 type=STRING col_path=[] offset=288
> null=(offset=562 mask=4) slot_idx=18 field_idx=-1), Slot(id=83 type=INT
> col_path=[] offset=512 null=(offset=565 mask=2) slot_idx=41 field_idx=-1),
> Slot(id=84 type=STRING col_path=[] offset=304 null=(offset=562 mask=8)
> slot_idx=19 field_idx=-1), Slot(id=85 type=STRING col_path=[] offset=320
> null=(offset=562 mask=10) slot_idx=20 field_idx=-1), Slot(id=86 type=STRING
> col_path=[] offset=336 null=(offset=562 mask=20) slot_idx=21 field_idx=-1),
> Slot(id=87 type=STRING col_path=[] offset=352 null=(offset=562 mask=40)
> slot_idx=22 field_idx=-1), Slot(id=88 type=STRING col_path=[] offset=368
> null=(offset=562 mask=80) slot_idx=23 field_idx=-1), Slot(id=89 type=INT
> col_path=[] offset=516 null=(offset=565 mask=4) slot_idx=42 field_idx=-1),
> Slot(id=90 type=INT col_path=[] offset=520 null=(offset=565 mask=8)
> slot_idx=43 field_idx=-1), S

Re: Delete or Update by Query

2019-07-18 Thread John Mora
Hi Adar Lieber-Dembo and Grant Henke,

Thanks for your responses.
I think I will implement your suggestion: use a scanner to select the
primary keys of the matching rows, then apply the individual deletes for
those keys in batches.
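
A minimal sketch of that control flow, using only the Java standard library
so that it runs without a cluster. Everything named here (ScanThenDelete,
deleteScannedKeys, the batch size of 2) is hypothetical, and the TreeMap is
only a stand-in for a table. With kudu-client, the iterator would presumably
be fed by a scanner that projects just the key columns, and the callback
would apply one Delete per key through the session and then flush; those
calls are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Consumer;

public class ScanThenDelete {

    /**
     * Drains primary keys produced by a scan and passes them to deleteBatch
     * in groups of at most batchSize. Returns the number of keys handed over.
     */
    static <K> long deleteScannedKeys(Iterator<K> scannedKeys, int batchSize,
                                      Consumer<List<K>> deleteBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<K> batch = new ArrayList<>(batchSize);
        long total = 0;
        while (scannedKeys.hasNext()) {
            batch.add(scannedKeys.next());
            if (batch.size() == batchSize) {
                deleteBatch.accept(batch);      // one delete per key in the batch
                total += batch.size();
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {                 // final, partially filled batch
            deleteBatch.accept(batch);
            total += batch.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a table with a long primary key (assumption).
        TreeMap<Long, String> table = new TreeMap<>();
        for (long k = 1; k <= 10; k++) {
            table.put(k, "row-" + k);
        }

        // "Scan": snapshot the keys matching key >= 3 AND key <= 7, so the
        // deletes below do not modify the collection being iterated.
        List<Long> matching =
                new ArrayList<>(table.subMap(3L, true, 7L, true).keySet());

        // "Delete": remove each key individually, one batch at a time.
        long deleted = deleteScannedKeys(matching.iterator(), 2,
                batch -> batch.forEach(table::remove));

        // prints: deleted=5 remaining=[1, 2, 8, 9, 10]
        System.out.println("deleted=" + deleted + " remaining=" + table.keySet());
    }
}
```

Collecting the keys before deleting keeps the scan and the writes separate,
and the batch size bounds how many pending deletes are buffered at once.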

Thanks,
John Mora


On Tue, Jul 16, 2019 at 4:41 PM Grant Henke wrote:

> Ultimately you need to use a scanner to get all the rows that match that
> predicate and then delete them.
>
> There is an example of doing this via Spark in the Spark quickstart guide
> here:
>
> https://github.com/apache/kudu/tree/master/examples/quickstart/spark#read-and-modify-data
>
> On Tue, Jul 16, 2019 at 4:17 PM Adar Lieber-Dembo wrote:
>
>> Unfortunately there's no way to do that currently: if you want to
>> delete a row, you must provide its complete primary key.
>>
>> On Tue, Jul 16, 2019 at 2:01 PM John Mora wrote:
>> >
>> > Hi.
>> >
>> > I am trying to delete multiple rows at once, based on a condition,
>> > using kudu-client.
>> >
>> > Let's say:
>> > DELETE FROM table WHERE key>=xx AND key<=yy
>> >
>> > However, I could not find an equivalent in kudu-client for Java.
>> >
>> > I was looking at delete operations like the one below, but it deletes
>> > only one row at a time.
>> >
>> > Delete delete = table.newDelete();
>> > ...
>> > session.apply(delete);
>> >
>> > Is there some way to delete multiple rows using a condition, maybe via
>> > a Scanner or similar?
>> >
>> > Thanks in advance for your help.
>> >
>> > Cheers,
>> > John Mora
>>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>