Re: Contribution question

Bertty Contreras Tue, 04 Jan 2022 01:56:08 -0800

Hi Kamalesh,

I sent the invitation for Friday at 3 pm :D.


If you have any questions let me know.

Best regards,
Bertty

On Tue, Jan 4, 2022 at 1:00 AM kamalesh palanisamy <[email protected]>
wrote:

> Okay that sounds perfect. Thank you!
>
> On Mon, Jan 3, 2022 at 4:52 AM Bertty Contreras <[email protected]>
> wrote:
>
>> Great, I will organize my schedule for it and come back to you
>> with options. Meanwhile, I will be collecting all the designs and other
>> elements that are already done and could help you with the implementation
>> of the new feature ;).
>>
>> Best regards,
>> Bertty
>>
>> On Mon 3. Jan 2022 at 04:41, kamalesh palanisamy <[email protected]>
>> wrote:
>>
>>> Hi,
>>> Thank you for the explanation. Yes, I feel it would be better if we
>>> could discuss it so that everything is clear. I am free from
>>> Wednesday to Saturday, any time after 3 PM Germany time. You can select
>>> whichever day suits your schedule best during this time.
>>>
>>> Thanks,
>>> Kamalesh P
>>>
>>>
>>> On Sun, Jan 2, 2022 at 6:28 PM Bertty Contreras <[email protected]>
>>> wrote:
>>>
>>>> The main idea of Wayang is to provide a layer that picks the best
>>>> combination of platforms to process a query; you can see the details in
>>>> the RHEEMix paper [1].
>>>>
>>>> Providing a SQL API will then allow us to transform a query into
>>>> different Wayang operators, which will allow optimization across
>>>> platforms that only speak SQL, like Postgres, and platforms that don’t
>>>> have a SQL language, like Giraph.
>>>>
>>>> The idea of using Calcite comes from the intermediate representation
>>>> that Calcite generates, which will allow us to create the Wayang plan
>>>> with “UDFs” that are translatable back to SQL or to executable code that
>>>> can be run by Flink, for example.
>>>>
>>>> Imagine a query that says something like:
>>>>
>>>> Select A.a, A.b, A.c from A join X on A.a = X.a ….
>>>>
>>>> Suppose X (10 TB) is on HDFS and A (100 MB) is on Postgres; then the
>>>> plan to execute will be something like:
>>>>
>>>> Select A.a from A (1 MB); this data is small, so you can broadcast it
>>>> and filter using Flink.
>>>>
>>>> If the join result is then just 2 records, Wayang will perform the
>>>> query on Postgres using the 2 records as the condition.
>>>>
>>>> But it could also happen that the join result is 1 TB; in that case,
>>>> the data in Postgres will be moved to HDFS and all the rest of the
>>>> processing will be done using Flink.
>>>>
>>>> Currently the optimizer decides which platform will be used depending
>>>> on the amount of data to process and the data movement. The SQL API will
>>>> then provide a way of “freeing” those decisions, because we will have the
>>>> whole intermediate representation available to make changes.
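
As an illustration of the platform choice described above, here is a minimal sketch in Python. It is not Wayang's actual optimizer: it is a toy cost model that weighs data movement against data processed and picks the cheaper of two plans. The names `Plan`, `cost`, `candidate_plans`, and `choose_plan`, the cost weights, and the 100-byte record size are all assumptions made for this example; only the table sizes come from the thread.

```python
from dataclasses import dataclass
from typing import List

MB = 10**6
TB = 10**12


@dataclass(frozen=True)
class Plan:
    """A hypothetical execution plan, described only by data volumes."""
    name: str
    bytes_moved: int      # data shipped between platforms
    bytes_processed: int  # data handled on the platform that finishes the query


def cost(plan: Plan, move_weight: int = 10, process_weight: int = 1) -> int:
    """Toy cost: weighted sum of movement and processing (weights are assumed)."""
    return move_weight * plan.bytes_moved + process_weight * plan.bytes_processed


def candidate_plans(join_result_bytes: int, postgres_table_bytes: int) -> List[Plan]:
    """The two alternatives from the example: finish on Postgres or on Flink."""
    return [
        # Ship the join result to Postgres and use it as a query condition.
        Plan("finish-on-postgres", join_result_bytes, postgres_table_bytes),
        # Ship the Postgres table to HDFS and keep processing in Flink.
        Plan("finish-on-flink", postgres_table_bytes,
             join_result_bytes + postgres_table_bytes),
    ]


def choose_plan(join_result_bytes: int, postgres_table_bytes: int) -> Plan:
    """Pick the candidate with the lowest toy cost."""
    return min(candidate_plans(join_result_bytes, postgres_table_bytes), key=cost)


# Two records of an assumed 100 bytes each versus a 1 TB join result.
small = choose_plan(join_result_bytes=2 * 100, postgres_table_bytes=100 * MB)
large = choose_plan(join_result_bytes=1 * TB, postgres_table_bytes=100 * MB)
print(small.name)
print(large.name)
```

With these assumed weights, the 2-record case finishes on Postgres and the 1 TB case finishes on Flink, matching the two outcomes described in the email. A real optimizer would estimate cardinalities and per-platform costs rather than take them as inputs.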
>>>>
>>>> Once we have the SQL API, we will be adding platforms that only support
>>>> SQL ;), as you said.
>>>>
>>>> The idea of using the intermediate representation may sound weird to
>>>> you, but we can have a meeting to explain it better; then you can
>>>> understand the full concept and also give us your feedback. Let me know
>>>> if and when you are available, and I will free up my schedule for it ;).
>>>> I’m in Germany, just so you can figure out whether we have a timezone
>>>> difference ;).
>>>>
>>>> Best regards,
>>>> Bertty
>>>>
>>>> [1]
>>>> https://wayang.apache.org/assets/pdf/paper/journal_vldb.pdf
>>>>
>>>>
>>>> On Sun 2. Jan 2022 at 17:43, kamalesh palanisamy <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Bertty,
>>>>> Thank you for the information! I would love to work on adding the SQL
>>>>> API for Wayang. Basically, I now need to add a new platform to
>>>>> wayang-platforms that supports SQL through Apache Calcite. Am I right?
>>>>> Please do correct me if I am wrong.
>>>>>
>>>>> Thanks,
>>>>> Kamalesh P
>>>>>
>>>>>
>>>>> On Sun, Jan 2, 2022 at 3:36 AM Bertty Contreras <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Kamalesh,
>>>>>>
>>>>>> Currently, Apache Wayang (Incubating) has the issues listed in Jira
>>>>>> [1]. One feature that the community hasn't had time to work on is the
>>>>>> SQL API for Apache Wayang (Incubating) [2]; the main idea is to use
>>>>>> Apache Calcite [3] as the SQL parser and then do something like the
>>>>>> Spark adapter of Calcite [4]. If you want to contribute to this
>>>>>> feature, that would be awesome :D.
>>>>>>
>>>>>> If you find another issue interesting, let me know; or if you have an
>>>>>> idea for a feature, that would be awesome too :D
>>>>>>
>>>>>> Best regards,
>>>>>> Bertty
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/projects/WAYANG
>>>>>> [2]
>>>>>> https://issues.apache.org/jira/projects/WAYANG/issues/WAYANG-25?filter=allopenissues
>>>>>> [3] https://calcite.apache.org
>>>>>> [4] https://github.com/apache/calcite/tree/master/spark
>>>>>>
>>>>>> On Sun, Jan 2, 2022 at 6:50 AM kamalesh palanisamy <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> My name is Kamalesh and I am currently looking to contribute to the
>>>>>>> project, but I couldn't find any proper issues. Can you help me with
>>>>>>> any features you would like me to contribute to? Thanks!
>>>>>>> Thanks,
>>>>>>> Kamalesh P
>>>>>>>
>>>>>> --
> Thanks,
> Kamalesh P
>
