Re: Guidelines on Contrib vs Non-contrib

2018-09-17 Thread George Leslie-Waksman
Given we have a plugin system, could we alternatively move toward keeping non-core supported code outside of the core project/repo? Getting most of the contrib code out to its own place would hugely decrease the surface area of the main repository and testing infrastructure. Further, it

Re: Guidelines on Contrib vs Non-contrib

2018-09-17 Thread Tim Swast
> Having individual operators and hooks live in separate repositories on GitHub (or possibly other Apache projects), distributed by pip and installed as libraries, seems like it would scale better. Pandas did this about a year ago, and it seems to have worked well. For example,

Re: Duplicate key unique constraint error

2018-09-17 Thread Abhishek Sinha
Pastebin: https://pastebin.com/K6BMTb5K
Regards,
Abhishek
> On 18-Sep-2018, at 12:31 AM, Stefan Seelmann wrote:
>
> On 9/17/18 8:19 PM, Abhishek Sinha wrote:
>> Any update on this?
>>
>>> Please find the scheduler error log attached.
>>>
>>> Can you share the full python stack trace?
>

Re: Duplicate key unique constraint error

2018-09-17 Thread Stefan Seelmann
On 9/17/18 8:19 PM, Abhishek Sinha wrote:
> Any update on this?
>
>> Please find the scheduler error log attached.
>>
>> Can you share the full python stack trace?
It seems the mailing list doesn't allow attachments. Either post the stack trace inline, or post it somewhere like Pastebin.

Database referral integrity

2018-09-17 Thread Stefan Seelmann
Hi, looking into the DB schema, there is almost no referential integrity enforced at the database level. Many foreign key constraints between dag, dag_run, task_instance, xcom, dag_pickle, log, etc. would make sense IMO. Is there a particular reason why that's not implemented? Introducing it now
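To illustrate what database-level enforcement would look like, here is a minimal sketch using Python's built-in sqlite3 module. The two-table schema is a simplified assumption for illustration and is not Airflow's actual schema; only the table names dag and dag_run come from the message above.

```python
import sqlite3

# In-memory database; the schema below is a simplified stand-in, not Airflow's real one.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE dag_run ("
    " id INTEGER PRIMARY KEY,"
    " dag_id TEXT NOT NULL REFERENCES dag (dag_id)"
    ")"
)

conn.execute("INSERT INTO dag (dag_id) VALUES ('example_dag')")
conn.execute("INSERT INTO dag_run (dag_id) VALUES ('example_dag')")


def insert_orphan_run(connection):
    """Try to insert a dag_run row whose dag does not exist; report whether it was accepted."""
    try:
        connection.execute("INSERT INTO dag_run (dag_id) VALUES ('missing_dag')")
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = insert_orphan_run(conn)
run_count = conn.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]
print(orphan_accepted, run_count)
```

With the constraint in place, the database itself rejects the orphaned row, so the check does not depend on application code.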

Re: Duplicate key unique constraint error

2018-09-17 Thread Abhishek Sinha
Any update on this?
Regards,
Abhishek
> On 14-Sep-2018, at 6:09 PM, Abhishek Sinha wrote:
>
> Maxime,
>
> Please find the scheduler error log attached.
>
> Regards,
> Abhishek
>
> > On Thu, Sep 13, 2018 at 10:07 AM Maxime Beauchemin >

Re: Sep Airflow Bay Area Meetup @ Google

2018-09-17 Thread Maxime Beauchemin
Lyft could probably host if we want to schedule something last minute while you and your crew are in town @Bolke. Maybe a one day get together + some hacking. Do you want to start another thread to assess interest? Max On Fri, Sep 14, 2018 at 11:42 PM Feng Lu wrote: > Not going to happen for

Re: --archives flag missing from SparkSubmitHook

2018-09-17 Thread Ben Laird
Looking at this again, I think one could just set `spark.yarn.dist.archives` in the conf passed to the job. If that works, please disregard :) On Mon, Sep 17, 2018 at 1:09 PM Ben Laird wrote: > The current SparkSubmitHook doesn't appear to support the --archives flag. > From the Spark docs:
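A rough sketch of that workaround: the helper below is hypothetical (it is not part of Airflow) and only shows how a conf dict containing `spark.yarn.dist.archives` would translate into `--conf key=value` arguments for spark-submit. The archive path and application name are placeholders, and whether this fully replaces `--archives` in a given setup is untested here.

```python
from typing import Dict, List


def build_spark_submit_command(application: str, conf: Dict[str, str]) -> List[str]:
    """Hypothetical helper: render a conf dict as spark-submit --conf arguments."""
    command = ["spark-submit"]
    for key, value in conf.items():
        command += ["--conf", f"{key}={value}"]
    command.append(application)
    return command


# Placeholder archive location; substitute a real path for an actual job.
conf = {"spark.yarn.dist.archives": "hdfs:///path/to/env.zip#env"}
command = build_spark_submit_command("job.py", conf)
print(" ".join(command))
```

The same dict is what one would presumably hand to the hook's conf argument, assuming the hook forwards each entry as a `--conf` pair.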

--archives flag missing from SparkSubmitHook

2018-09-17 Thread Ben Laird
The current SparkSubmitHook doesn't appear to support the --archives flag. From the Spark docs: "spark.yarn.dist.archives (none): Comma separated list of archives to be extracted into the working directory of each executor." https://spark.apache.org/docs/latest/running-on-yarn.html This is

Re: It's very hard to become a committer on the project

2018-09-17 Thread George Leslie-Waksman
Are there Apache rules preventing us from switching to GitHub Issues? That seems like it might better fit much of Airflow's user base. On Sun, Sep 16, 2018, 9:21 AM Jeff Payne wrote: > I agree that Jira could be better utilized. I read the original > conversation on the mailing list about how