Re: Contribution question

kamalesh palanisamy Mon, 03 Jan 2022 16:01:12 -0800

Okay, that sounds perfect. Thank you!

On Mon, Jan 3, 2022 at 4:52 AM Bertty Contreras <[email protected]> wrote:


> Great, I will arrange my schedule for it and come back to you with
> options. Meanwhile, I will collect all the designs and other elements
> that are already done and could help you with the implementation of the
> new feature ;)
>
> Best regards,
> Bertty
>
> On Mon 3. Jan 2022 at 04:41, kamalesh palanisamy <[email protected]>
> wrote:
>
>> Hi,
>> Thank you for the explanation. Yes, I feel it would be better if we could
>> discuss it so that everything is clear. I am free from Wednesday to
>> Saturday, anytime after 3 PM Germany time. You can select whichever day
>> suits your schedule best during this time.
>>
>> Thanks,
>> Kamalesh P
>>
>>
>> On Sun, Jan 2, 2022 at 6:28 PM Bertty Contreras <[email protected]>
>> wrote:
>>
>>> The main idea of Wayang is to provide a layer that picks the best
>>> combination of platforms to process a query; you can see the details in
>>> the RHEEMix paper [1].
>>>
>>> Providing a SQL API will then allow us to transform a query into different
>>> Wayang operators, which will enable optimization across platforms that
>>> only speak SQL, like Postgres, and platforms that don’t have a SQL
>>> language, like Giraph.
>>>
>>> The idea of using Calcite comes from the intermediate representation
>>> that Calcite generates, which will allow us to create the Wayang plan with
>>> “UDFs” that are translatable back to SQL or to executable code that can
>>> be run by Flink, for example.
>>>
>>> Imagine a query that says something like:
>>>
>>> Select A.a, A.b, A.c from A join X on A.a = X.a ….
>>>
>>> If X (10TB) is on HDFS and A (100MB) is on Postgres, then the plan to
>>> execute will be something like:
>>>
>>> Select A.a from A (1MB); this file is small, so you can broadcast it and
>>> filter using Flink.
>>>
>>> If the join result is then just 2 records, Wayang will perform the
>>> query on Postgres using the 2 records as the condition.
>>>
>>> But it could also happen that the join result is 1TB; in that case, the
>>> Postgres data will be moved to HDFS and all the rest of the processing
>>> will be done using Flink.
>>>
>>> Currently the optimizer decides which platform will be used depending on
>>> the amount of data to process and the data movement. The SQL API will then
>>> give more “freedom” to those decisions, because we will have the whole
>>> intermediate representation available to make changes.
>>>
>>> Once we have the SQL API, we will add platforms that support only
>>> SQL ;), as you said.
>>>
>>> The idea of using the intermediate representation may sound weird to
>>> you, but we can have a meeting to explain it better, so that you can
>>> understand the full concept and also give us your feedback. Let me know
>>> if you are available and when, and I will free up my schedule for it ;).
>>> I’m in Germany, just so you can figure out whether we have a timezone
>>> difference ;).
>>>
>>> Best regards,
>>> Bertty
>>>
>>> [1]
>>> https://wayang.apache.org/assets/pdf/paper/journal_vldb.pdf
>>>
>>>
>>> On Sun 2. Jan 2022 at 17:43, kamalesh palanisamy <[email protected]>
>>> wrote:
>>>
>>>> Hi Bertty,
>>>> Thank you for the information! I would love to work on adding the SQL
>>>> API for Wayang. Basically, I now need to add a new platform to
>>>> wayang-platforms that supports SQL through Apache Calcite, am I right?
>>>> Please do correct me if I am wrong.
>>>>
>>>> Thanks,
>>>> Kamalesh P
>>>>
>>>>
>>>> On Sun, Jan 2, 2022 at 3:36 AM Bertty Contreras <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Kamalesh,
>>>>>
>>>>> Currently, Apache Wayang (Incubating) has the issues listed in Jira
>>>>> [1]. One feature that the community hasn't had time to work on is the
>>>>> SQL API for Apache Wayang (Incubating) [2]; the main idea is to use
>>>>> Apache Calcite [3] as the SQL parser and then do something like
>>>>> Calcite's Spark adapter [4]. If you want to contribute to this feature,
>>>>> that would be awesome :D.
>>>>>
>>>>> If you find another issue interesting, let me know; and if you have an
>>>>> idea for a feature, that would be awesome too :D
>>>>>
>>>>> Best regards,
>>>>> Bertty
>>>>>
>>>>> [1] https://issues.apache.org/jira/projects/WAYANG
>>>>> [2]
>>>>> https://issues.apache.org/jira/projects/WAYANG/issues/WAYANG-25?filter=allopenissues
>>>>> [3] https://calcite.apache.org
>>>>> [4] https://github.com/apache/calcite/tree/master/spark
>>>>>
>>>>> On Sun, Jan 2, 2022 at 6:50 AM kamalesh palanisamy <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>> My name is Kamalesh and I am currently looking to contribute to the
>>>>>> project, but I couldn't find any suitable issues. Can you help me with
>>>>>> any features you would like me to contribute to? Thanks!
>>>>>> Thanks,
>>>>>> Kamalesh P
>>>>>>
>>>>> --
Thanks,
Kamalesh P
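
The platform choice Bertty describes in the quoted Jan 2 message (finish on Postgres when the join result is tiny, move the Postgres data to HDFS and finish on Flink when the result is huge) can be pictured with a toy rule that picks whichever option moves less data. This is only a sketch in Python under that assumption: `Plan`, `choose_plan`, and the byte counts are hypothetical names made up for illustration, and the rule is not Wayang's actual optimizer or cost model.

```python
from dataclasses import dataclass

MB = 10**6
TB = 10**12


@dataclass(frozen=True)
class Plan:
    platform: str     # hypothetical label: where the rest of the query runs
    moved_bytes: int  # data that has to be shipped between platforms


def choose_plan(join_result_bytes: int, postgres_table_bytes: int) -> Plan:
    """Toy rule: pick the option that moves less data (not Wayang's cost model)."""
    # Option 1: send the join result to Postgres and use it as a filter condition.
    to_postgres = Plan("postgres", join_result_bytes)
    # Option 2: export the Postgres table to HDFS and finish the query in Flink.
    to_flink = Plan("flink", postgres_table_bytes)
    return min(to_postgres, to_flink, key=lambda plan: plan.moved_bytes)


# Tiny join result (a couple of records) versus a 1TB join result,
# with the 100MB Postgres table from the example in the thread.
small = choose_plan(join_result_bytes=200, postgres_table_bytes=100 * MB)
large = choose_plan(join_result_bytes=1 * TB, postgres_table_bytes=100 * MB)
print(small.platform, large.platform)
```

With the sizes from the thread, the tiny join result keeps the work on Postgres, while the 1TB result makes it cheaper to move the 100MB table to HDFS and continue in Flink.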
