[ 
https://issues.apache.org/jira/browse/MADLIB-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618144#comment-16618144
 ] 

Frank McQuillan commented on MADLIB-1274:
-----------------------------------------

(2) is also not an error.
That parameter is the output `schema`, not the output `table`, so `'pivotalmarkets'` is looked up as a schema name.

```
SELECT * FROM madlib.assoc_rules(.10,                  -- Support
                                 .10,                  -- Confidence
                                 'orderid',            -- Transaction id col
                                 'productname',        -- Product col
                                 'order_items',        -- Input data
                                 NULL,                 -- Output schema
                                 TRUE,                 -- Verbose
                                 2                     -- Max itemset size
                                );
```
produces
```
INFO:  finished checking parameters
INFO:  finished removing duplicates
INFO:  finished encoding items
INFO:  finished encoding input table: 0.0259048938751
INFO:  Beginning iteration #1
INFO:  25 Frequent itemsets found in this iteration
INFO:  Completed iteration # 1. Time: 0.00668692588806
INFO:  Beginning iteration # 2
INFO:  time of preparing data: 0.00086498260498
INFO:  201 Frequent itemsets found in this iteration
INFO:  Completed iteration # 2. Time: 0.00649094581604
INFO:  Beginning iteration # 3
INFO:  time of preparing data: 0.011045217514
INFO:  914 Frequent itemsets found in this iteration
INFO:  Completed iteration # 3. Time: 0.0923340320587
INFO:  begin to generate the final rules
INFO:  402 Total association rules found. Time: 0.00862121582031
 output_schema | output_table | total_rules |   total_time   
---------------+--------------+-------------+----------------
 public        | assoc_rules  |         402 | 00:00:00.14029
(1 row)
```
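
If the intent in (2) was to keep the results under `pivotalmarkets`, one option is to create that schema first and pass it as the sixth argument. This is only a sketch: it assumes `pivotalmarkets` is an acceptable schema name in this database, and that the results table is named `assoc_rules` inside the chosen schema, as the `output_table` column above suggests.

```sql
-- Assumption: 'pivotalmarkets' should be a schema holding the results.
-- The error in (2) indicates the schema has to exist before the call.
CREATE SCHEMA pivotalmarkets;

SELECT * FROM madlib.assoc_rules(.10,                  -- Support
                                 .10,                  -- Confidence
                                 'orderid',            -- Transaction id col
                                 'productname',        -- Product col
                                 'order_items',        -- Input data
                                 'pivotalmarkets',     -- Output schema
                                 TRUE,                 -- Verbose
                                 2                     -- Max itemset size
                                );

-- Assumption: the rules land in <output schema>.assoc_rules.
SELECT * FROM pivotalmarkets.assoc_rules LIMIT 10;
```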

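The itemset counts in the log above also show why an uncapped run takes so long. The sample data has only two transactions, so any item combination that appears in a single order already has support 0.5 and passes the threshold. A rough way to see the scale, using only plain SQL on the `order_items` table from the issue description:

```sql
-- Distinct items per order, and the number of non-empty item subsets
-- (2^n - 1) that each order contributes as candidate frequent itemsets.
SELECT orderid,
       count(DISTINCT productname)         AS n_items,
       2 ^ count(DISTINCT productname) - 1 AS n_itemsets
FROM order_items
GROUP BY orderid
ORDER BY orderid;
```

On this data the two orders have 16 and 14 distinct items, with 5 items in common. That matches the logged counts: 16 + 14 - 5 = 25 single items, C(16,2) + C(14,2) - C(5,2) = 201 pairs, and C(16,3) + C(14,3) - C(5,3) = 914 triples. Without a cap the itemset count keeps growing toward roughly 2^16, and the number of rules derived from those itemsets is larger still.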


> Association rules error on output schema
> ----------------------------------------
>
>                 Key: MADLIB-1274
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1274
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Association Rules
>            Reporter: Frank McQuillan
>            Priority: Major
>             Fix For: v1.15.1
>
>
> Error observed on:
> * Postgres 9.6
> * Greenplum Database 5.9.0
> This is a small AWS single-node Greenplum instance: 4 segments on a machine with 8 vCPUs
> and plenty of available memory.
> [gpadmin@ip-172-21-0-246 RetailDemo]$ cat /proc/meminfo
> MemTotal:       62711428 kB
> MemFree:        59786076 kB
> MemAvailable:   60281836 kB
> Load data
> {code}
> DROP TABLE IF EXISTS order_items;
> CREATE TABLE order_items(  itemid INTEGER,
>                            orderid INTEGER,
>                            productid INTEGER,
>                            quantity INTEGER,
>                            productname TEXT);                        
> INSERT INTO order_items VALUES
> (      5 ,    1044 ,         9 ,        3 , 'Kirby cukes'),
> (     11 ,      37 ,         2 ,        3 , 'Ooopsi Cola'),
> (     12 ,      37 ,        21 ,        3 , 'black radish'),
> (     15 ,      37 ,        49 ,        3 , 'Leg of lamb'),
> (     18 ,      37 ,        37 ,        3 , 'Uggo Waffles'),
> (     20 ,      37 ,        76 ,        3 , 'Happy Valley White Peaches'),
> (     21 ,      37 ,        29 ,        3 , 'Breakstone Whole Milk Cottage Cheese'),
> (     22 ,      37 ,        25 ,        3 , 'ugli fruit'),
> (      4 ,    1044 ,        44 ,        3 , 'ground beef'),
> (      6 ,    1044 ,        17 ,        3 , 'napa'),
> (      9 ,    1044 ,        10 ,        3 , 'dill'),
> (     13 ,      37 ,        21 ,        3 , 'black radish'),
> (     24 ,      37 ,        47 ,        3 , 'Ball Park Franks'),
> (     25 ,      37 ,        69 ,        3 , 'Ball Park Mustard'),
> (     26 ,      37 ,        64 ,        3 , 'Ballpark Hot Dog Rolls'),
> (     27 ,    1044 ,        47 ,        3 , 'Ball Park Franks'),
> (     28 ,    1044 ,        69 ,        3 , 'Ball Park Mustard'),
> (     29 ,    1044 ,        64 ,        3 , 'Ballpark Hot Dog Rolls'),
> (     30 ,    1044 ,        70 ,        3 , 'Homer''s Strawberry Jam'),
> (     31 ,    1044 ,        71 ,        3 , 'Mr Peanut Peanut Butter'),
> (     32 ,      37 ,        71 ,        3 , 'Mr Peanut Peanut Butter'),
> (     33 ,      37 ,        70 ,        3 , 'Homer''s Strawberry Jam'),
> (      1 ,    1044 ,         1 ,        3 , 'Pivotal Apple Juice'),
> (      3 ,    1044 ,        77 ,        3 , 'Pivotal Baked Beans'),
> (     14 ,      37 ,        53 ,        3 , 'Old Zurich Swiss Cheese'),
> (     17 ,      37 ,        49 ,        3 , 'Leg of lamb'),
> (     19 ,      37 ,        18 ,        3 , 'california navels'),
> (      2 ,    1044 ,        41 ,        3 , '12" Dinner Plates'),
> (      7 ,    1044 ,        32 ,        3 , 'Vermot Extra Sharp Cheddar'),
> (      8 ,    1044 ,        71 ,        3 , 'Mr Peanut Peanut Butter'),
> (     10 ,    1044 ,        39 ,        3 , 'Pivotal Soft and Smooth 24 pack'),
> (     16 ,      37 ,        22 ,        3 , 'triple wahsed spinach'),
> (     23 ,      37 ,        61 ,        3 , 'Brooklyn Bagel 6 pack');
> {code}
> (1)
> XXX 
> This one is not an error. It just runs for a long time because a very large number of
> rules is generated when the itemset size is not capped by the `max_itemset_size` param.
> See later comment 9/17/18.
> XXX
> Run assoc rules:
> {code}
> SELECT * FROM madlib.assoc_rules( .25,
>                                   .5,
>                                   'orderid',
>                                   'productid',
>                                   'order_items',
>                                   NULL,
>                                   TRUE
>                                 );
> {code}
> does not return.
> (2)
> Run assoc rules with output table specified:
> {code}
> SELECT * FROM madlib.assoc_rules(.10,                  -- Support
>                                  .10,                  -- Confidence
>                                  'orderid',            -- Transaction id col
>                                  'productname',        -- Product col
>                                  'order_items',        -- Input data
>                                  'pivotalmarkets',     -- Output data
>                                  TRUE);                -- Verbose
> {code}
> results in error:
> {code}
> InternalError: (psycopg2.InternalError) plpy.Error: the output schema does not exist
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "assoc_rules", line 31, in <module>
>     'NULL'
>   PL/Python function "assoc_rules", line 107, in assoc_rules
>   PL/Python function "assoc_rules", line 21, in __assert
> PL/Python function "assoc_rules"
>  [SQL: "SELECT * FROM madlib.assoc_rules(.10,                  -- Support\n   
>                               .10,                  -- Confidence\n           
>                       'orderid',            -- Transaction id col\n           
>                       'productname',        -- Product col\n                  
>                'order_items',        -- Input data\n                          
>        'pivotalmarkets',     -- Output data\n                                 
> TRUE);                -- Verbose"]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
