[GitHub] [incubator-druid] Eshcar commented on issue #8126: [Proposal] BufferAggregator support for growable sketches.

2019-08-05 Thread GitBox
Eshcar commented on issue #8126: [Proposal] BufferAggregator support for 
growable sketches.
URL: 
https://github.com/apache/incubator-druid/issues/8126#issuecomment-518356531
 
 
   Thanks, that was our impression: the off-heap incremental index is not 
operational (although it does exist in the code). So indeed there is no way to 
compare against it.
   
   I also agree that doing the Oak-sketches-Druid integration in one step might 
be too complicated.
   We already have an open issue #5698 and a PR #7676 for getting the Oak 
incremental index into Druid, and we hope to make progress there soon.
   Oak is not based on Memory yet :).
   The context of my suggestion is how to support growable sketches off-heap 
while ingesting data in Druid, namely in the context of Oak. So the order 
should be: first integrate Oak, then have Oak support growable sketches.
   
   If I now understand correctly, the purpose of the current proposal is to 
handle the query aggregation problem. Oak might also be a solution for this 
problem, but this is something we haven't looked at yet.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [incubator-druid] Eshcar commented on issue #8126: [Proposal] BufferAggregator support for growable sketches.

2019-08-04 Thread GitBox
Eshcar commented on issue #8126: [Proposal] BufferAggregator support for 
growable sketches.
URL: 
https://github.com/apache/incubator-druid/issues/8126#issuecomment-517983639
 
 
   > [off-heap incremental index] It's not used more widely, including during 
the data ingestion specifically because of the unsolved problem with growable 
complex aggregations - this is what this proposal is all about.
   
   Then presenting a solution to this problem using Oak to manage the off-heap 
index, handling growable sketches using Memory, and showing that it performs as 
well as or better than the existing implementation would be a win-win-win 
solution, correct? 
   But I understand that it may not cover the entire scope of this issue, 
namely query aggregation, so I think the best thing to do would be to open a 
new issue for it.
   
   BTW, if the current off-heap solution is operational, we can include it in 
the system (cluster) benchmarks that we are currently running and compare 
performance (without sketches at this point). From what I know, the last time 
we tried to evaluate the off-heap incremental index through a component-level 
test it crashed, and we were told it is not properly maintained.
   Nevertheless, we will try running it in cluster mode. Is there any 
documentation on how to configure the cluster to run in off-heap mode?





[GitHub] [incubator-druid] Eshcar commented on issue #8126: [Proposal] BufferAggregator support for growable sketches.

2019-08-01 Thread GitBox
Eshcar commented on issue #8126: [Proposal] BufferAggregator support for 
growable sketches.
URL: 
https://github.com/apache/incubator-druid/issues/8126#issuecomment-517273629
 
 
   @himanshug - What is the context of the current proposal? Does it refer to 
aggregators that are used only in the context of queries, when querying 
immutable segments that are cached off-heap? Does it also cover off-heap 
incremental index roll-up aggregation? If it is the first, then it is 
reasonable to define an API for an external memory allocator; if it is the 
second, then what I am suggesting is more relevant.
   Note that Oak is considered a core contrib and not an extension, and is 
proposed as an efficient alternative to the existing off-heap incremental 
index. Which brings me back to my previous question: is the current off-heap 
incremental index considered operational, with reasonable performance?
   
   I think that by introducing a new writable memory aggregator we avoid 
backward compatibility issues, do we not?
   






[GitHub] [incubator-druid] Eshcar commented on issue #8126: [Proposal] BufferAggregator support for growable sketches.

2019-07-31 Thread GitBox
Eshcar commented on issue #8126: [Proposal] BufferAggregator support for 
growable sketches.
URL: 
https://github.com/apache/incubator-druid/issues/8126#issuecomment-516737638
 
 
   Two naive questions:
   1) Are buffer aggregators used by the on-heap incremental index, or only by 
the off-heap incremental index?
   2) If the answer to (1) is only the off-heap incremental index, then is the 
off-heap incremental index being used in production anywhere? Does it perform 
well enough?





[GitHub] [incubator-druid] Eshcar commented on issue #8126: [Proposal] BufferAggregator support for growable sketches.

2019-07-31 Thread GitBox
Eshcar commented on issue #8126: [Proposal] BufferAggregator support for 
growable sketches.
URL: 
https://github.com/apache/incubator-druid/issues/8126#issuecomment-516736029
 
 
   Thanks Roman.
   #3892 is a very long issue that splits into multiple discussions covering 
many different things, so I am not sure what the bottom line is. Also, it has 
not been discussed over the last year.
   
   Can you summarize the progress made on the Memory aggregator, if any?
   If it is blocked, then what is the reason: community rejection due to 
backward compatibility, fear of performance degradation, or simply a lack of 
working hands?





[GitHub] [incubator-druid] Eshcar commented on issue #8126: [Proposal] BufferAggregator support for growable sketches.

2019-07-30 Thread GitBox
Eshcar commented on issue #8126: [Proposal] BufferAggregator support for 
growable sketches.
URL: 
https://github.com/apache/incubator-druid/issues/8126#issuecomment-516411997
 
 
   As part of the work we are doing towards integrating Oak (an off-heap 
incremental index) into Druid (#5698), we invested some thought in how to 
bridge the gap between Oak (with its internal memory management), off-heap 
sketches (based on WritableMemory), and Druid aggregators.
   I can share our thoughts. They aim to handle the same problems raised in 
this issue, but the solution is different, so it might be better to introduce 
it in a different issue.
   In a nutshell:
   (1) Oak manages its own memory and needs all buffer allocations to go 
through its internal memory manager and to be exposed only through Oak's API.
   (2) Off-heap sketches are based on WritableMemory and can work with an 
external memory manager that allocates new WritableMemory when the sketch 
needs to grow.
   (3) Druid aggregators access sketches through the aggregator API (init, 
aggregate, get) and a mapping from (bytebuffer, position) -> sketch.
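
   To illustrate point (3), here is a minimal sketch of aggregation state 
addressed by (bytebuffer, position). The class and method names are 
hypothetical and simplified; this is not Druid's actual interface. It assumes 
each aggregate owns a fixed-size slot, which is exactly the assumption that a 
growable sketch breaks.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified stand-in for a buffer-style aggregator.
// State lives in a fixed-size slot identified by (buffer, position).
public class FixedSlotLongSum {
    static final int SLOT_BYTES = Long.BYTES;

    // Zero the slot before first use.
    static void init(ByteBuffer buf, int position) {
        buf.putLong(position, 0L);
    }

    // Fold one value into the slot, in place.
    static void aggregate(ByteBuffer buf, int position, long value) {
        buf.putLong(position, buf.getLong(position) + value);
    }

    // Read the current aggregate back out of the slot.
    static long get(ByteBuffer buf, int position) {
        return buf.getLong(position);
    }

    public static void main(String[] args) {
        // Two adjacent slots in one off-heap buffer.
        ByteBuffer buf = ByteBuffer.allocateDirect(2 * SLOT_BYTES);
        init(buf, 0);
        init(buf, SLOT_BYTES);
        aggregate(buf, 0, 3);
        aggregate(buf, 0, 4);
        aggregate(buf, SLOT_BYTES, 10);
        System.out.println(get(buf, 0) + " " + get(buf, SLOT_BYTES)); // prints "7 10"
    }
}
```

   Because the slots are packed back to back, a slot cannot grow without 
overwriting its neighbour; some other component has to relocate the state.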
   
   What we suggest is to have Oak manage the memory, including re-allocation of 
space when needed. Oak will implement its own WritableMemory and 
MemoryRequestServer, which are needed for the sketches to behave correctly with 
respect to the Oak index. Finally, we suggest adding a new aggregator type, 
WritableMemoryAggregator, that maps WritableMemory to a sketch and works the 
same way buffer aggregators do, but does not need to worry about the growing 
size of sketches.
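
   A rough sketch of the division of labour suggested above, under loud 
assumptions: a plain ByteBuffer stands in for WritableMemory, an integer handle 
stands in for whatever reference Oak would expose, and RegionManager and 
AppendingAggregator are hypothetical names, not Oak, Druid, or DataSketches 
APIs. The "sketch" here is a toy that appends longs; what matters is that only 
the manager allocates and re-allocates, while the aggregator is addressed by 
handle and never holds a raw position.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class GrowableRegionSketch {

    // Hypothetical stand-in for the index's internal memory manager: it owns
    // every allocation and is the only component that may replace a region.
    static class RegionManager {
        private final Map<Integer, ByteBuffer> regions = new HashMap<>();
        private int nextHandle = 0;

        int allocate(int capacityBytes) {
            regions.put(nextHandle, ByteBuffer.allocateDirect(capacityBytes));
            return nextHandle++;
        }

        ByteBuffer region(int handle) {
            return regions.get(handle);
        }

        // Re-allocate: copy the old bytes into a larger region, same handle.
        ByteBuffer grow(int handle, int newCapacityBytes) {
            ByteBuffer src = regions.get(handle).duplicate();
            src.clear(); // position = 0, limit = capacity: copy everything
            ByteBuffer bigger = ByteBuffer.allocateDirect(newCapacityBytes);
            bigger.put(src);
            bigger.clear();
            regions.put(handle, bigger);
            return bigger;
        }
    }

    // Hypothetical aggregator keyed by handle. Toy state layout:
    // [int count][long v0][long v1]...
    static class AppendingAggregator {
        private final RegionManager manager;

        AppendingAggregator(RegionManager manager) {
            this.manager = manager;
        }

        void init(int handle) {
            manager.region(handle).putInt(0, 0);
        }

        void aggregate(int handle, long value) {
            ByteBuffer mem = manager.region(handle);
            int count = mem.getInt(0);
            int needed = Integer.BYTES + (count + 1) * Long.BYTES;
            if (needed > mem.capacity()) {
                // Ask the manager for more space instead of allocating here.
                mem = manager.grow(handle, Math.max(needed, 2 * mem.capacity()));
            }
            mem.putLong(Integer.BYTES + count * Long.BYTES, value);
            mem.putInt(0, count + 1);
        }

        int count(int handle) {
            return manager.region(handle).getInt(0);
        }

        long valueAt(int handle, int i) {
            return manager.region(handle).getLong(Integer.BYTES + i * Long.BYTES);
        }
    }

    public static void main(String[] args) {
        RegionManager manager = new RegionManager();
        AppendingAggregator agg = new AppendingAggregator(manager);
        int h = manager.allocate(Integer.BYTES + Long.BYTES); // room for one value
        agg.init(h);
        for (long v = 10; v <= 50; v += 10) {
            agg.aggregate(h, v); // grows through the manager as needed
        }
        System.out.println("count=" + agg.count(h) + " last=" + agg.valueAt(h, 4));
        // prints "count=5 last=50"
    }
}
```

   In the actual proposal the growth request would presumably travel through 
MemoryRequestServer rather than a direct call like grow(); the direct call is 
used here only to keep the example short.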
   
   There might be other alternatives for closing this loop; let's discuss them.
   
   Does all this make sense, @leventov @himanshug?

