[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 @nzw0301 LGTM. Merged. Well done! (Thank you for your review, @takuti.) ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 @nzw0301 I'll fix and merge it. No need to update this PR. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 Let me see. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 @myui Would you double-check this? I can merge whenever you are ready. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 It's a Hive v2.2.0 bug. Filed a ticket: https://issues.apache.org/jira/browse/HIVE-17406 ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user nzw0301 commented on the issue: https://github.com/apache/incubator-hivemall/pull/107

The issue above can be avoided by creating a table:

```
create table data as (
  select 1 as truth, 1 as predicted
);
```

```
-- ok
select fmeasure(array(truth), array(predicted)) from data;

-- ok
select f1score(array(truth), array(predicted)) from data;
```

---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user nzw0301 commented on the issue: https://github.com/apache/incubator-hivemall/pull/107

But I found another issue with f1score and fmeasure. Neither function works on `EMR v5.8.0`:

```sql
hive> select f1score(array(1), array(1));
FAILED: IllegalArgumentException Size requested for unknown type: org.apache.hadoop.hive.ql.exec.UDAFEvaluator
```

```sql
hive> select fmeasure(array(1), array(1));
FAILED: IllegalArgumentException Size requested for unknown type: java.lang.String
```

However, both work on `EMR v5.0.0`. I don't know why the failures occur on the newer EMR.

---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user nzw0301 commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 I tested @takuti's query. The buggy code (the previous code) returns `1.0`. The fixed code, on the other hand, returns the correct value `0.5`. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/107

Importantly, when the number of mappers is 1, fixing the bug in `merge()` does not change the output value; you might see the same result `0.42483920860540153` even if the bug exists. Alternatively, please test the following query (6 mappers will be shown for each `select` statement), and check whether its output is the same as `fmeasure(truth, predicted, '-average micro')`:

```sql
WITH data as (
  select 1 as truth, 0 as predicted
  union all
  select 0 as truth, 1 as predicted
  union all
  select 0 as truth, 0 as predicted
  union all
  select 1 as truth, 1 as predicted
  union all
  select 0 as truth, 1 as predicted
  union all
  select 0 as truth, 0 as predicted
)
select f1score(array(truth), array(predicted))
from data;
```

If the bug has been fixed correctly, the output should be the same as `fmeasure(truth, predicted, '-average micro') = 0.5`, while the buggy code returns `f1score(array(truth), array(predicted)) = 1.0`.

---
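To illustrate why a `merge()` bug stays hidden with a single mapper, here is a minimal Python sketch of a micro-averaged F1 aggregate built from partial counts. It is not Hivemall's implementation: the `PartialF1` class and the `f1_over_partitions` helper are hypothetical names, and the `iterate`/`merge`/`terminate` methods only loosely mirror the UDAF lifecycle. It assumes that a correct merge simply sums the counts of each partial result, so the final value should not depend on how rows are split across mappers.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple

# One row: (truth labels, predicted labels), as in f1score(array(truth), array(predicted)).
Row = Tuple[Sequence[int], Sequence[int]]


@dataclass
class PartialF1:
    """Running counts held by one hypothetical mapper (illustrative only)."""
    tp: int = 0
    total_actual: int = 0
    total_predicted: int = 0

    def iterate(self, truth: Sequence[int], predicted: Sequence[int]) -> None:
        # Accumulate counts for a single row.
        truth_set, predicted_set = set(truth), set(predicted)
        self.tp += len(truth_set & predicted_set)
        self.total_actual += len(truth_set)
        self.total_predicted += len(predicted_set)

    def merge(self, other: "PartialF1") -> None:
        # Assumed correct behavior: add the other partial's counts to ours.
        self.tp += other.tp
        self.total_actual += other.total_actual
        self.total_predicted += other.total_predicted

    def terminate(self) -> float:
        # Micro-averaged F1 from the accumulated counts.
        precision = self.tp / self.total_predicted if self.total_predicted else 0.0
        recall = self.tp / self.total_actual if self.total_actual else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2.0 * precision * recall / (precision + recall)


def f1_over_partitions(partitions: Iterable[List[Row]]) -> float:
    """Aggregate each partition separately, then merge the partial results."""
    final = PartialF1()
    for rows in partitions:
        partial = PartialF1()
        for truth, predicted in rows:
            partial.iterate(truth, predicted)
        final.merge(partial)
    return final.terminate()


# The six rows from the test query above.
rows: List[Row] = [
    ([1], [0]), ([0], [1]), ([0], [0]),
    ([1], [1]), ([0], [1]), ([0], [0]),
]

one_mapper = f1_over_partitions([rows])                 # merge() sees a single partial
six_mappers = f1_over_partitions([[r] for r in rows])   # merge() combines six partials
print(one_mapper, six_mappers)  # 0.5 0.5
```

With one partition, `merge()` receives only one non-empty partial result, so a faulty merge can still produce the right answer by accident. Splitting the same rows across several partitions exercises the merge path, which is why the multi-mapper query is the more useful test.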
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user nzw0301 commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 Tomorrow I will check whether the return value is the same. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user nzw0301 commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 @takuti @myui Thank you for your kind comments. I have completed the updates based on your reviews. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 @nzw0301 grep f1score in `resources/ddl`. It's not only for `define-all.spark`. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user nzw0301 commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 @takuti Thank you for your useful review. I will fix this PR based on your comment. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 Sure. @takuti Could you help review this PR? ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107 Also, some other DDLs need to be updated. Please grep `tree_export` to find out which DDLs to update. ---
[GitHub] incubator-hivemall issue #107: [HIVEMALL-132] Generalize f1score UDAF to sup...
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/107

@nzw Could you update the user guide to include `fmeasure` and `f1score` in `incubator-hivemall/docs/gitbook/eval/classification_measures.md`? Run `npm install gitbook-cli; gitbook install; gitbook serve` in `docs/gitbook`.

Also, could you revise the current Evaluation section of https://treasure-data.gyazo.com/5ec4b737dcedd55353f8126040ea5366 to

```
• Binary Classification metrics
• Area Under the ROC Curve
• Regression metrics
• Ranking metrics
```

Refer to the examples in http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics and https://turi.com/learn/userguide/evaluation/classification.html#f_scores

---