[jira] [Created] (AIRFLOW-3108) MsSqlHook.run fails to commit if autocommit=False (Default config)

2018-09-24 Thread kkkkk (JIRA)
k created AIRFLOW-3108:
--

 Summary: MsSqlHook.run fails to commit if autocommit=False 
(Default config)
 Key: AIRFLOW-3108
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3108
 Project: Apache Airflow
  Issue Type: Bug
  Components: hooks
Affects Versions: 1.10.0
Reporter: k


The MsSqlHook.run method doesn't execute conn.commit() if autocommit is set to 
False.
 
It looks like this bug has existed for a very long time, but wasn't apparent in 
1.9 because the default value for autocommit was True. In 1.10 the default 
value was changed to False, and the MsSqlHook and operator have started failing 
silently.
 
The bug happens because MsSqlHook doesn't implement a custom 
get_autocommit(self, conn) method. The superclass' DbApiHook method always 
returns True even if autocommit wasn't enabled in pymssql. Therefore the hook 
doesn't call commit and pymssql doesn't autocommit.
 
The below patch fixes the issue. Please consider including this fix in airflow 
1.10.1, because it is a very frustrating issue to debug.
 
{code:java}
--- mssql_hook.py
+++ mssql_hook.py
@@ -50,3 +50,13 @@ class MsSqlHook(DbApiHook):
 
     def set_autocommit(self, conn, autocommit):
         conn.autocommit(autocommit)
+
+    def get_autocommit(self, conn):
+        """
+        MS SQL connection gets autocommit in a different way.
+        :param conn: connection to get autocommit setting from.
+        :type conn: connection object.
+        :return: connection autocommit setting
+        :rtype: bool
+        """
+        return conn.autocommit_state
{code}
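To illustrate the failure mode, here is a runnable sketch (hypothetical, simplified classes, not the real Airflow source) of how the base hook misreporting autocommit causes run() to skip the commit for pymssql-style connections:

```python
class FakeMssqlConn:
    """Stand-in for a pymssql connection: the autocommit setting is
    exposed via `autocommit_state`, not via an `autocommit` attribute."""
    def __init__(self):
        self.autocommit_state = False  # pymssql default: autocommit off
        self.committed = False

    def cursor(self):
        class Cursor:
            def execute(self, sql):
                pass  # pretend to run the statement
        return Cursor()

    def commit(self):
        self.committed = True


class DbApiHook:
    def get_autocommit(self, conn):
        # Simplified stand-in for the 1.10.0 behaviour described above:
        # autocommit is reported as enabled regardless of the connection.
        return True

    def run(self, conn, sql):
        conn.cursor().execute(sql)
        if not self.get_autocommit(conn):
            conn.commit()  # never reached while get_autocommit() misreports


class MsSqlHook(DbApiHook):
    def get_autocommit(self, conn):
        # The patch: read the real pymssql setting
        return conn.autocommit_state


conn = FakeMssqlConn()
DbApiHook().run(conn, "INSERT INTO t VALUES (1)")
print(conn.committed)  # False: the statement is silently lost

conn = FakeMssqlConn()
MsSqlHook().run(conn, "INSERT INTO t VALUES (1)")
print(conn.committed)  # True: the patched hook commits
```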



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2141) Cannot create airflow variables when there is a list of dictionary as a value

2018-09-19 Thread kkkkk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620543#comment-16620543
 ] 

k commented on AIRFLOW-2141:


I have run into the same issue. The problem is in 
[cli.py#L331|https://github.com/apache/incubator-airflow/blob/1.10.0/airflow/bin/cli.py#L331].

 
{code:java}
try:
    n = 0
    for k, v in d.items():
        if isinstance(v, dict):
            Variable.set(k, v, serialize_json=True)
        else:
            Variable.set(k, v)
        n += 1
except Exception:
    pass
finally:
    print("{} of {} variables successfully updated.".format(n, len(d)))
{code}
It only serializes the value if it is a dict. You can temporarily fix it by 
adjusting the line:
{code:java}
        if isinstance(v, (dict, list)):
            Variable.set(k, v, serialize_json=True){code}
But there are probably more value types that need serialization. For example, 
an exported .json file stores integers as integers rather than strings, and 
importing it fails too.

So perhaps something like this would work...
{code:java}
        if not isinstance(v, str):
            Variable.set(k, v, serialize_json=True){code}
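Putting the suggestion together, the import loop could look roughly like this (hypothetical signature with an injected setter so the sketch runs without Airflow; the real code calls airflow.models.Variable.set):

```python
import json

def import_variables(d, set_var):
    """Sketch of the suggested fix to cli.py's import loop: any value
    that is not a plain string is JSON-serialized, so lists, dicts,
    ints and booleans all survive an export/import round trip."""
    n = 0
    for k, v in d.items():
        if isinstance(v, str):
            set_var(k, v)
        else:
            set_var(k, v, serialize_json=True)
        n += 1
    print("{} of {} variables successfully updated.".format(n, len(d)))

# Demo with a stub standing in for Variable.set
store = {}
def fake_set(key, value, serialize_json=False):
    store[key] = json.dumps(value) if serialize_json else value

import_variables(
    {"path": "/tmp/in", "patterns": [{"id": "sale"}], "retries": 3},
    fake_set,
)
# store["patterns"] is now the JSON string '[{"id": "sale"}]'
```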
 

> Cannot create airflow variables when there is a list of dictionary as a value
> -
>
> Key: AIRFLOW-2141
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2141
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 1.8.0
>Reporter: Soundar
>Priority: Major
>  Labels: beginner, newbie
> Attachments: airflow_cli.png, airflow_cli2_crop.png
>
>
> I'm trying to create Airflow variables using a json file. I am trying to 
> import airflow variables using UI(webserver) when I upload the json file I 
> get this error "Missing file or syntax error" and when I try to upload using 
> airflow cli not all the variables gets uploaded properly. The catch is that I 
> have a list of dictionary in my json file, say
>  ex:
>  {code:java}
>  {
>    "demo_archivedir": "/home/ubuntu/folders/archive",
>    "demo_filepattern": [
>      { "id": "reference", "pattern": "Sample Data.xlsx" },
>      { "id": "sale", "pattern": "Sales.xlsx" }
>    ],
>    "demo_sourcepath": "/home/ubuntu/folders/input",
>    "demo_workdir": "/home/ubuntu/folders/working"
>  }
>  {code}
> I've attached two images:
> img1: Using the airflow variables cli command I was able to create partial 
> variables from my json file (airflow_cli.png)
> img2: After inserting logs in the "airflow/bin/cli.py" file, I got this 
> error (airflow_cli2_crop.png)
> The thing is I gave this value through the Admin UI one by one and it worked. 
> Then I exported those same variable using "airflow variables" cli command and 
> tried importing them, still it failed and the above mentioned error still 
> occurs.
> Note:
>    I am using Python 3.5 with Airflow version 1.8
> The stack trace is as follows
> .compute-1.amazonaws.com:22] out: 0 of 4 variables successfully updated.
> .compute-1.amazonaws.com:22] out: Traceback (most recent call last):
> .compute-1.amazonaws.com:22] out:   File "/home/ubuntu/Env/bin/airflow", line 
> 28, in <module>
> .compute-1.amazonaws.com:22] out: args.func(args)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/bin/cli.py", line 242, 
> in variables
> .compute-1.amazonaws.com:22] out: import_helper(imp)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/bin/cli.py", line 273, 
> in import_helper
> .compute-1.amazonaws.com:22] out: Variable.set(k, v)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/utils/db.py", line 53, 
> in wrapper
> .compute-1.amazonaws.com:22] out: result = func(*args, **kwargs)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/models.py", line 3615, 
> in set
> .compute-1.amazonaws.com:22] out: session.add(Variable(key=key, 
> val=stored_value))
> .compute-1.amazonaws.com:22] out:   File "<string>", line 4, in __init__
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/orm/state.py", line 
> 417, in _initialize_instance
> .compute-1.amazonaws.com:22] out: manager.dispatch.init_failure(self, 
> args, kwargs)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/util/langhelpers.py",
>  line 66, in __exit__
> .compute-1.amazonaws.com:22] out: compat.reraise(exc_type, exc_value, 
> exc_tb)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/util/compat.py", 
> line 187, in reraise
> .compute-1.amazonaws.com:22] out: raise value
> .compute-1.amazonaws.com:22] out:   File 
> 

[jira] [Commented] (AIRFLOW-2895) Prevent scheduler from spamming heartbeats/logs

2018-09-09 Thread kkkkk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608732#comment-16608732
 ] 

k commented on AIRFLOW-2895:


The spammed logs are not the only problem caused by this bug. It also causes 
very high CPU load, which is particularly problematic on lower-end servers.

Even on my local development machine it maxes out the entire core I have given 
it.

Perhaps it would be worth considering releasing another Airflow update that 
includes a fix for this sooner rather than later. 

> Prevent scheduler from spamming heartbeats/logs
> ---
>
> Key: AIRFLOW-2895
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2895
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Dan Davydov
>Assignee: Dan Davydov
>Priority: Major
>
> There seem to be a couple of problems with 
> [https://github.com/apache/incubator-airflow/pull/2986] that cause the sleep 
> to not trigger and scheduler heartbeats/logs to be spammed:
>  # If all of the files are being processed in the queue, there is no sleep 
> (can be fixed by sleeping for min_sleep even if there are no files).
>  # I have heard reports that some files can return a parsing time that 
> increases monotonically (e.g. a file actually parses in 1s each loop, but the 
> reported duration seems to use the time the file was first parsed as the 
> start time instead of the last parse). I haven't confirmed this, but it 
> sounds problematic.
> To unblock the release I'm reverting this PR for now. It should be re-added 
> with tests/mocking.
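
For reference, the minimum-sleep fix suggested in point 1 of the quoted issue could be sketched like this (hypothetical helper, not the actual scheduler loop):

```python
import time

def throttled_loop(loop_body, min_interval, iterations):
    """Run loop_body repeatedly, sleeping out the remainder of
    min_interval on every iteration, even when there is still work
    queued. The unconditional sleep caps the loop rate so the
    scheduler cannot spin at 100% CPU."""
    for _ in range(iterations):
        start = time.monotonic()
        loop_body()
        elapsed = time.monotonic() - start
        time.sleep(max(0.0, min_interval - elapsed))

# Each iteration takes at least min_interval seconds
start = time.monotonic()
throttled_loop(lambda: None, 0.05, 3)
print(time.monotonic() - start >= 0.14)  # True
```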


