Re: Apache Incubation task items
Hello Charles,

Going through the item list, I just had a quick question. Are all these tasks meant to be taken up only by the committers? If there are tasks which can be completed by contributors, I would be happy to help.

Thanks,
Atul

On Thu, May 31, 2018 at 12:42 PM, Charles Allen wrote:
> https://github.com/druid-io/druid/projects/3 is a list of all the items in
> http://incubator.apache.org/projects/druid.html
>
> We will need help getting these resourced and completed. For a thing to be
> completed and closed, the page at
> http://incubator.apache.org/projects/druid.html needs to be updated with any
> relevant information.
>
> I have also created a new label
> https://github.com/druid-io/druid/issues?q=is%3Aissue+is%3Aopen+label%3AApache
> for any issues related to being a part of ASF, not specifically related to the
> Druid code itself.
>
> The kanban board is in no specific order, so please do not take the
> relative order or issue number as any sort of indicator.
>
> Thank you all for your assistance as we go along this exciting path!
>
> Cheers,
> Charles Allen

--
Atul Mohan
Apache Incubation task items
https://github.com/druid-io/druid/projects/3 is a list of all the items in http://incubator.apache.org/projects/druid.html

We will need help getting these resourced and completed. For a thing to be completed and closed, the page at http://incubator.apache.org/projects/druid.html needs to be updated with any relevant information.

I have also created a new label https://github.com/druid-io/druid/issues?q=is%3Aissue+is%3Aopen+label%3AApache for any issues related to being a part of ASF, not specifically related to the Druid code itself.

The kanban board is in no specific order, so please do not take the relative order or issue number as any sort of indicator.

Thank you all for your assistance as we go along this exciting path!

Cheers,
Charles Allen
Re: Access to jira
We should probably have a label for it too.

On Thu, May 31, 2018 at 9:23 AM, Gian Merlino wrote:
> I don't see why not!
Re: Access to jira
I don't see why not!

On Thu, May 31, 2018 at 9:21 AM, Charles Allen wrote:
> Sounds good. I'd like to put some more formal tracking and responsibility
> to the remaining incubator items. Would github issues be the preferred
> place to do that?
Re: Access to jira
Sounds good. I'd like to put some more formal tracking and responsibility to the remaining incubator items. Would github issues be the preferred place to do that?

On Thu, May 31, 2018 at 9:20 AM Gian Merlino wrote:
> I think we are planning to keep using GitHub issues, based on the
> discussion in the migration logistics thread. And based on the fact that
> Apache seems to allow that now (https://github.com/apache/fluo was given
> as an example). So probably the right thing to do is update
> http://incubator.apache.org/projects/druid.html accordingly?
Re: Access to jira
I think we are planning to keep using GitHub issues, based on the discussion in the migration logistics thread. And based on the fact that Apache seems to allow that now (https://github.com/apache/fluo was given as an example). So probably the right thing to do is update http://incubator.apache.org/projects/druid.html accordingly?

On Thu, May 31, 2018 at 9:15 AM, Charles Allen wrote:
> Hi all
>
> http://incubator.apache.org/projects/druid.html says that
> https://issues.apache.org/jira/browse/DRUID is our issue tracker, but I
> don't seem to have access to it. Does anyone know how to apply for access
> using an existing Apache JIRA login?
>
> Thanks,
> Charles Allen
Access to jira
Hi all

http://incubator.apache.org/projects/druid.html says that https://issues.apache.org/jira/browse/DRUID is our issue tracker, but I don't seem to have access to it. Does anyone know how to apply for access using an existing Apache JIRA login?

Thanks,
Charles Allen
Re: A question about Druid design
Hi Gian,

Thanks for the explanations! I have one more question. You say that "...the RollupFactsHolder there will be a _single_ fact row per TimeAndDims... But with the PlainFactsHolder there may be more than one fact row per TimeAndDims...". In PlainFactsHolder we actually have more than one fact row per Timestamp, or am I missing something? I mean, in RollupFactsHolder could you scan only the TimeAndDims (leading to rows) with some Timestamp and get the same result? Is it true that TimeAndDims are ordered first according to time anyway?

I am most likely missing something, and would just like to understand what :)

Thanks,
Anastasia

On Wednesday, May 30, 2018, 10:56:26 AM GMT+3, Gian Merlino wrote:

Hi Anastasia,

1) At ingestion time the FactsHolder is sorted. The unsorted code path is used by groupBy v1, which hasn't been common since groupBy v2 was made the default a few releases ago. So I would only worry about the sorted case.

2) PlainFactsHolder is used when the user has disabled rollup at ingestion time. The idea is that with the RollupFactsHolder there will be a _single_ fact row per TimeAndDims (and Druid may combine multiple input rows into one indexed fact row). But with the PlainFactsHolder there may be more than one fact row per TimeAndDims (in particular: there will be one fact row per input row).

Hope this helps.

On Wed, May 30, 2018 at 12:14 AM, Anastasia Braginsky <anas...@oath.com.invalid> wrote:

> Hi,
>
> Recall our suggestion to use the new concurrent map named Oak as a base
> for Incremental Index. Oak stands for Off-heap Allocated Keys; for more
> details, please see issue #5698. We have made great progress with Oak
> integration and with stabilizing OakIndex performance. We have some
> questions regarding FactsHolder. As we explained in our design document and
> refactoring suggestion, we prefer to remove the FactsHolder usage in
> the OakIndex, because Oak maps the keys (Time&Dims) to the values
> (Aggregators) directly. Therefore the Oak mapping is always sorted and only
> from keys to values. From here we have two questions.
>
> 1. Unsorted FactsHolder: It is understandable that unsorted mapping via
> HashMap (O(1) access) might be faster than sorted mapping (O(logN) access).
> The question is whether the unsorted variant is used frequently. When is it
> used? And is it acceptable that in this case Oak will give slightly lower
> performance?
>
> 2. Regarding Plain- vs Rollup- FactsHolder: It can be seen that
> PlainFactsHolder is holding a queue of Key->Value (Time&Dims->Aggregator)
> per Timestamp, where the sorting is via Timestamp. Therefore, Oak
> implements mostly sorted RollupFactsHolder logic. Additionally, Timestamp
> is also a part of Time&Dims and the sorting is initially according to
> Timestamp, then other dimensions. The question is what are the use cases
> where the PlainFactsHolder and not Rollup is used? And is there any
> functionality that can be given by Plain but not by Rollup?
>
> Thanks,
> Anastasia
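
For readers following the thread, the rollup versus plain distinction discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Druid's actual code: the names FactsHolderSketch, Key, Row, rollup and plain are hypothetical, the dimensions are collapsed into one string, the only aggregation is a sum, and grouping the plain rows per timestamp follows the description in the quoted question rather than the real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Illustration only: Key, Row, rollup() and plain() are hypothetical names,
// not Druid's actual classes or methods.
class FactsHolderSketch {
    // Stand-in for TimeAndDims: ordered by timestamp first, then by dimensions.
    static final class Key implements Comparable<Key> {
        final long timestamp;
        final String dims;

        Key(long timestamp, String dims) {
            this.timestamp = timestamp;
            this.dims = dims;
        }

        @Override
        public int compareTo(Key other) {
            int byTime = Long.compare(timestamp, other.timestamp);
            return byTime != 0 ? byTime : dims.compareTo(other.dims);
        }
    }

    // One input row: a key plus a single numeric metric.
    static final class Row {
        final Key key;
        final long metric;

        Row(long timestamp, String dims, long metric) {
            this.key = new Key(timestamp, dims);
            this.metric = metric;
        }
    }

    // Rollup: rows with an equal key are combined (summed here) into one fact row.
    static TreeMap<Key, Long> rollup(List<Row> rows) {
        TreeMap<Key, Long> facts = new TreeMap<>();
        for (Row row : rows) {
            facts.merge(row.key, row.metric, Long::sum);
        }
        return facts;
    }

    // Plain: every input row stays a separate fact row, queued under its timestamp.
    static TreeMap<Long, List<Row>> plain(List<Row> rows) {
        TreeMap<Long, List<Row>> facts = new TreeMap<>();
        for (Row row : rows) {
            facts.computeIfAbsent(row.key.timestamp, t -> new ArrayList<>()).add(row);
        }
        return facts;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(1000L, "us", 1L),
            new Row(1000L, "us", 2L),
            new Row(1000L, "eu", 5L));

        System.out.println("rollup fact rows: " + rollup(rows).size());
        System.out.println("plain fact rows: " + plain(rows).get(1000L).size());
    }
}
```

In the sample data two of the three input rows share a key, so the rollup map ends up with two fact rows while the plain map keeps all three under the single timestamp. Both maps are sorted by time first, which is the ordering property the question asks about.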