Hi all,

First of all, thanks a lot for the warm welcome.
Let me try to address some of the concerns above; my answers follow each quoted 
point (marked in blue in the original message). Some questions I won’t be able 
to answer, and others will become clearer once I raise the requirement 
internally.


  *   It would be great if you can share some figures with the community, e.g. 
the user pool size of Huawei Cloud, intersection of Huawei Cloud users & 
Airflow users (as you mentioned, many of your users are requesting Airflow 
integration)
I can’t disclose the pool size, for obvious reasons. But more and more 
customers are using Airflow as their orchestrator. As a cloud provider, we 
offer data processing and analytics systems that customers are interested in 
using.
The services where the two intersect, at least for Big Data & Analytics, 
include (but are not limited to):

-   OBS: S3-compatible object storage. Users can build their data lake on 
    top of it. 
    https://support.huaweicloud.com/intl/en-us/productdesc-obs/en-us_topic_0045853681.html

-   MRS: Hadoop cluster. 
    https://support.huaweicloud.com/intl/en-us/productdesc-mrs/mrs_08_0001.html

-   Data Warehouse (DWS): Postgres-compatible, also reachable over ODBC (does 
    not really need a dedicated integration). 
    https://support.huaweicloud.com/intl/en-us/dws/index.html

-   Data Lake Insight (DLI): serverless computing for running Spark workloads 
    in the cloud (the MAIN one). 
    https://support.huaweicloud.com/intl/en-us/productdesc-dli/dli_07_0001.html

-   Many others (but not critical).
It is also worth mentioning that all the services above (and others) are 
accessible via API, so the integration, at least when using the already public 
APIs, should be pretty straightforward.

  *   You may also need to help clarify how to ensure your Provider can be well 
tested by people in the community, especially when you have a new version 
release later (we don't want to have a provider release without any user 
testing it)
The API tends to change very little (since the major functionality remains the 
same).

Would you or your team be willing to help maintain that integration going 
forward? By maintain -> keeping the APIs up-to-date with product APIs, keeping 
dependencies up-to-date, etc., and of course testing?

I am not part of R&D. I am a solution architect focused on Big Data. I am 
raising this requirement internally. My first step was to confirm that Airflow 
can integrate with and accept commits from Huawei Cloud (already confirmed). My 
next step is to raise the requirements with R&D and push for them to be 
accepted. As for development, resources, etc., I am not sure yet what the 
approach will be. I will post more information once I get more feedback.
Besides that, just to repeat: I expect that if the integration is built 
(mainly) on open public APIs from Huawei, there won’t be many changes (but I 
totally agree on the need to keep it up to date).


Not sure if you are aware of that, but you could - similarly for example to 
Great Expectations 
https://github.com/great-expectations/airflow-provider-great-expectations - 
release your own provider independently from the Airflow repository

Correct, this is my current approach for a project I am currently involved in. 
My first “dirty” solution, in order to get a fast PoC, is to use the Airflow 
HTTP provider / Python provider to call a simplified Python SDK. This is only a 
proof of concept and is far from acceptable by any software development 
standard, but, again, I need to make the case internally for supporting 
Airflow. We have a “competing” tool, basically a UI-based orchestrator.
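
To illustrate the PoC approach described above, here is a minimal sketch, 
assuming a thin Python wrapper around a public REST API. The names 
SimpleCloudClient, build_request, submit_job and submit_spark_job, the 
endpoint URL, the payload fields and the bearer-token header are all 
hypothetical; they do not correspond to a real Huawei Cloud SDK or API.

```python
"""PoC sketch: a task callable that hands a Spark job to a thin SDK wrapper.

All names, URLs, headers and payload fields here are hypothetical placeholders.
"""
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class SimpleCloudClient:
    """Hypothetical, simplified SDK wrapper around a public REST API."""

    endpoint: str  # placeholder base URL, not a real service endpoint
    token: str     # auth token obtained elsewhere (assumed bearer-style)

    def build_request(self, job_name: str, main_file: str) -> urllib.request.Request:
        """Build (but do not send) the HTTP request that would submit a job."""
        payload = json.dumps({"name": job_name, "file": main_file}).encode("utf-8")
        return urllib.request.Request(
            url=f"{self.endpoint}/jobs",
            data=payload,
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {self.token}",
            },
            method="POST",
        )

    def submit_job(self, job_name: str, main_file: str) -> dict:
        """Send the request and return the decoded JSON response."""
        request = self.build_request(job_name, main_file)
        with urllib.request.urlopen(request) as response:
            return json.load(response)


def submit_spark_job(client: SimpleCloudClient, job_name: str, main_file: str) -> dict:
    """Plain callable that an Airflow Python task could run."""
    return client.submit_job(job_name, main_file)
```

In a DAG, a callable like submit_spark_job could be passed to a PythonOperator 
through its python_callable and op_kwargs arguments. Splitting build_request 
from submit_job keeps the request construction checkable without network 
access.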


An additional question. I am not really sure whether this mailing list is the 
right place to ask, so my apologies if it isn’t.

-   Is there any group/list of Airflow developers open to cooperation? 
    Huawei will lack expertise on Airflow-related questions.


Thanks a lot, and happy to join the community!

Best regards
David Sanchez Plaza

Huawei Cloud Business Dept, International Cloud & AI
David Sanchez Plaza - 大卫
Huawei HCIE Cloud Service Solutions Architect 
(Link<https://www.youracclaim.com/badges/332b8d61-055a-45a1-a301-0c4094a77202/public_url>)
Mobile: +86 17722639223
D District, Huawei Bantian Base, Huawei, Shenzhen, China.




Also one comment to add. Not sure if you are aware of that, but you could - 
similarly for example to Great Expectations 
https://github.com/great-expectations/airflow-provider-great-expectations - 
release your own provider independently from the Airflow repository. There is 
absolutely no difference in capabilities between such a provider and a 
provider in the community. If you release your own provider, you get more 
freedom to release on your own schedule, and you can be more relaxed when it 
comes to testing and documentation (we have rather serious requirements for any 
community provider regarding documentation, testing and system testing), but 
you have to make sure to keep up with changes in Airflow (there are rarely 
changes that impact providers, though).
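
As a rough illustration of the independent-provider route: an externally 
released provider package typically exposes a function that returns provider 
metadata and registers it under the apache_airflow_provider entry point. This 
is only a sketch. The package name, description and version below are 
hypothetical placeholders, and the exact set of required fields should be 
checked against the Airflow provider documentation.

```python
"""Sketch of the metadata hook for an independently released provider.

The package name, description and version are hypothetical placeholders.
"""


def get_provider_info() -> dict:
    """Return the metadata Airflow reads when it discovers a provider package."""
    return {
        "package-name": "airflow-provider-huaweicloud",  # hypothetical name
        "name": "Huawei Cloud",
        "description": "Hypothetical provider for Huawei Cloud services.",
        "versions": ["0.0.1"],  # placeholder version
    }


# The function is then registered in the package metadata, for example in
# setup.cfg (the module path is an assumption for this sketch):
#
#   [options.entry_points]
#   apache_airflow_provider =
#       provider_info = huaweicloud_provider:get_provider_info
```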

J.




David,

Great to hear this interest regarding integration with Airflow and happy to 
help guide as well.

Similar to XD, I would very much like to understand both the user base sizing 
and areas of interest from an integration standpoint.

From your perspective, at the risk of repeating both XD's and Kaxil's points, 
the critical thing to understand and focus on is really the ongoing maintenance 
and support of the Provider package after initial development.

There is a cost to make sure the Provider package is up to date as changes are 
made to Huawei Cloud, as well as keeping it current with respect to underlying 
dependencies. The release process we have in place for Airflow Providers also 
relies on community testing, so that would be an important factor to consider 
in your release and update process. We track all issues for Providers and Core 
Airflow in the Airflow GitHub repo, so that is also important to stay on top of.

Happy to answer more questions and help as needed.

Best regards,
Vikram



From: Kaxil Naik [mailto:[email protected]]
Sent: Thursday, October 6, 2022 11:35 AM
To: [email protected]
Cc: David Sanchez Plaza <[email protected]>
Subject: Re: 【New provider】Inquire



On Thu, 6 Oct 2022 at 15:29, Kaxil Naik <[email protected]> wrote:
Very glad to see more companies and folks interested in integration with 
Airflow.

To add to XD's point - Would you or your team be willing to help maintain that 
integration going forward? By maintain -> keeping the APIs up-to-date with 
product APIs, keeping dependencies up-to-date, etc., and of course testing?

Regards,
Kaxil

On Thu, 6 Oct 2022 at 14:20, Xiaodong Deng <[email protected]> wrote:
Hi David,

Many thanks for the email.

Such contributions would definitely be welcomed & appreciated. As you already 
found out, there are certain criteria, like "generic enough, are well 
documented, fully covered by tests and with capabilities of being tested by 
people in the community".

I believe you and your team will handle the documentation, unit tests, etc., 
so no concern on that.

However, it would be helpful if you could provide information showing that the 
Provider you plan to add is "generic enough" AND has "capabilities of being 
tested by people in the community", hence:

  *   It would be great if you can share some figures with the community, e.g. 
the user pool size of Huawei Cloud, intersection of Huawei Cloud users & 
Airflow users (as you mentioned, many of your users are requesting Airflow 
integration)
  *   You may also need to help clarify how to ensure your Provider can be well 
tested by people in the community, especially when you have a new version 
release later (we don't want to have a provider release without any user 
testing it)
Look forward to hearing from you. Meanwhile, I would also like to hear other 
folks' thoughts on this.

Many thanks!


Regards,
XD


On Thu, Oct 6, 2022 at 3:09 PM David Sanchez Plaza <[email protected]> wrote:
Dear Airflow community,

I’m David Sanchez Plaza, currently working as a Cloud Solution Architect.

I am investigating the integration of Apache Airflow with Huawei Cloud, with a 
view to improving future cooperation. Many of our customers are requesting this 
integration, and we have several services that could be included.

Based on the website


“Can I contribute my own provider to Apache Airflow?

Of course, but it’s better to check at developer’s mailing list whether such 
contribution will be accepted by the Community, before investing time to make 
the provider compliant with community requirements. The Community only accepts 
providers that are generic enough, are well documented, fully covered by tests 
and with capabilities of being tested by people in the community. So we might 
not always be in the position to accept such contributions.”
I would like to ask the community, will this contribution, after complying with 
all Airflow community requirements, be accepted into Airflow?

Thanks in advance!

Best regards
David Sanchez Plaza

