Re: Transactions in Druid?

2018-05-30 Thread Edward Bortnikov
Hi Gian,

Thanks for the explanation.
So, the community does not envision traditional OLTP/RT analytics use cases
like "read A, compute, write B", cross-partition consistent scans, or atomic
updates of multiple indexes? The reason I'm asking is that we also work on the
Omid transaction processor for Hadoop, which might be adapted to Druid in case
of need.
Thanks,
Edward

On Tuesday, May 29, 2018, 7:09:27 PM PDT, Gian Merlino wrote:

Hi Edward,

There are a couple of ways to do transactional updates to Druid today:

1) When using streaming ingestion with the "parseBatch" feature
(see io.druid.data.input.impl.InputRowParser), rows in the batch are
inserted transactionally.
2) When using batch ingestion in overwrite mode (the default), all
operations are transactional. Of course, you must be willing to overwrite
an entire time interval in this mode.
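For (2), a minimal sketch of what an overwrite-mode batch indexing task spec might look like (my own illustration, not from the thread; field names abridged, and the exact spec format depends on the Druid version you run, so check the docs):

```json
{
  "type": "index",
  "spec": {
    "dataSchema": { "dataSource": "wikipedia" },
    "ioConfig": {
      "type": "index",
      "appendToExisting": false
    }
  }
}
```

With "appendToExisting": false (the default), the task replaces the segments for the covered time interval rather than adding to them.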

With (1), scans are not transactionally consistent.

With (2), scans of a particular interval _are_ transactionally consistent,
due to the way overwrite-style ingestion works: the new segment set has a
higher version number, and queries will use either the older or the newer
version, never a mix of the two.
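The version-swap behavior described above can be illustrated with a toy model (my own sketch, not Druid's actual timeline implementation): each interval maps to a versioned segment set, an overwrite only takes effect when it carries a higher version, and a scan always reads exactly one version's segment set.

```python
class Timeline:
    """Toy model of version-based atomic overwrite."""

    def __init__(self):
        self._published = {}  # interval -> (version, segment list)

    def overwrite(self, interval, version, segments):
        # Publishing a higher-versioned segment set atomically
        # replaces the older set for that interval; lower or equal
        # versions are ignored.
        old = self._published.get(interval)
        if old is None or version > old[0]:
            self._published[interval] = (version, list(segments))

    def scan(self, interval):
        # A scan sees exactly one version's segment set,
        # never a mix of old and new segments.
        version, segments = self._published.get(interval, (None, []))
        return version, list(segments)

t = Timeline()
t.overwrite("2018-05-01/2018-05-02", 1, ["seg_a", "seg_b"])
t.overwrite("2018-05-01/2018-05-02", 2, ["seg_c"])
v, segs = t.scan("2018-05-01/2018-05-02")
# → (2, ["seg_c"]): the newer set wins atomically
```

A query racing with the overwrite would see either version 1's full set or version 2's full set, which is the consistency property Gian describes.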

However, Druid never offers read-your-writes consistency: there is always
some delay (however small) between when you trigger an insert and when that
insert actually becomes readable.

On Tue, May 29, 2018 at 4:38 PM, Edward Bortnikov wrote:

> Hi, Community,
> Do we have any existing or perceived use cases of transactional
> (multi-row) updates to Druid? What about transactionally consistent scans?
> Thanks, Edward
  
