[
https://issues.apache.org/jira/browse/MADLIB-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618129#comment-16618129
]
Frank McQuillan edited comment on MADLIB-1274 at 9/17/18 8:46 PM:
------------------------------------------------------------------
(1) This generates too many rules, which is why it does not return:
```
max_itemset_size | # rules | time (s)
2 | 402 | 0.1
3 | 5,886 | 2.4
4 | 45,310 | 22
5 | 236,380 | 154
```
So it is not a bug: the call is just slow because the number of rules grows quickly with `max_itemset_size`.
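The counts in the table can be reproduced with a short back-of-the-envelope calculation. This is only a sketch, not how MADlib computes rules. It assumes that with two orders every itemset contained in an order clears support .25, that every rule clears confidence .5 (the lowest possible value here is 0.5 / 1.0, assuming the threshold is inclusive), and that each k-itemset yields 2^k - 2 rules. The helper names are made up for illustration.

```python
from math import comb

# Distinct productid values per order, read off the INSERT statement in the issue.
order_37 = {2, 21, 49, 37, 76, 29, 25, 47, 69, 64, 71, 70, 53, 18, 22, 61}
order_1044 = {9, 44, 17, 10, 47, 69, 64, 70, 71, 1, 77, 41, 32, 39}
shared = order_37 & order_1044  # products that appear in both orders


def itemsets_of_size(k):
    """Number of distinct k-itemsets contained in at least one order."""
    # Inclusion-exclusion: itemsets drawn only from shared products are counted twice.
    return comb(len(order_37), k) + comb(len(order_1044), k) - comb(len(shared), k)


def rules_up_to(max_itemset_size):
    """Total rules, assuming each k-itemset yields every non-empty LHS/RHS split."""
    return sum(
        itemsets_of_size(k) * (2 ** k - 2)
        for k in range(2, max_itemset_size + 1)
    )


for m in (2, 3, 4, 5, 16):
    print(m, rules_up_to(m))
```

Under these assumptions the first four values match the table above, and leaving the itemset size uncapped (16 distinct products in the larger order) gives about 47.7 million rules, which is consistent with the call appearing to hang.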
> Association rules error on output schema
> ----------------------------------------
>
> Key: MADLIB-1274
> URL: https://issues.apache.org/jira/browse/MADLIB-1274
> Project: Apache MADlib
> Issue Type: Bug
> Components: Module: Association Rules
> Reporter: Frank McQuillan
> Priority: Major
> Fix For: v1.15.1
>
>
> Error observed on:
> * Postgres 9.6
> * Greenplum Database 5.9.0
> This is a small AWS single node GP, 4 segments on a machine with 8 VCPUs,
> and plenty of available memory
> [gpadmin@ip-172-21-0-246 RetailDemo]$ cat /proc/meminfo
> MemTotal: 62711428 kB
> MemFree: 59786076 kB
> MemAvailable: 60281836 kB
> Load data
> {code}
> DROP TABLE IF EXISTS order_items;
> CREATE TABLE order_items( itemid INTEGER,
> orderid INTEGER,
> productid INTEGER,
> quantity INTEGER,
> productname TEXT);
> INSERT INTO order_items VALUES
> ( 5 , 1044 , 9 , 3 , 'Kirby cukes'),
> ( 11 , 37 , 2 , 3 , 'Ooopsi Cola'),
> ( 12 , 37 , 21 , 3 , 'black radish'),
> ( 15 , 37 , 49 , 3 , 'Leg of lamb'),
> ( 18 , 37 , 37 , 3 , 'Uggo Waffles'),
> ( 20 , 37 , 76 , 3 , 'Happy Valley White Peaches'),
> ( 21 , 37 , 29 , 3 , 'Breakstone Whole Milk Cottage Cheese'),
> ( 22 , 37 , 25 , 3 , 'ugli fruit'),
> ( 4 , 1044 , 44 , 3 , 'ground beef'),
> ( 6 , 1044 , 17 , 3 , 'napa'),
> ( 9 , 1044 , 10 , 3 , 'dill'),
> ( 13 , 37 , 21 , 3 , 'black radish'),
> ( 24 , 37 , 47 , 3 , 'Ball Park Franks'),
> ( 25 , 37 , 69 , 3 , 'Ball Park Mustard'),
> ( 26 , 37 , 64 , 3 , 'Ballpark Hot Dog Rolls'),
> ( 27 , 1044 , 47 , 3 , 'Ball Park Franks'),
> ( 28 , 1044 , 69 , 3 , 'Ball Park Mustard'),
> ( 29 , 1044 , 64 , 3 , 'Ballpark Hot Dog Rolls'),
> ( 30 , 1044 , 70 , 3 , 'Homer''s Strawberry Jam'),
> ( 31 , 1044 , 71 , 3 , 'Mr Peanut Peanut Butter'),
> ( 32 , 37 , 71 , 3 , 'Mr Peanut Peanut Butter'),
> ( 33 , 37 , 70 , 3 , 'Homer''s Strawberry Jam'),
> ( 1 , 1044 , 1 , 3 , 'Pivotal Apple Juice'),
> ( 3 , 1044 , 77 , 3 , 'Pivotal Baked Beans'),
> ( 14 , 37 , 53 , 3 , 'Old Zurich Swiss Cheese'),
> ( 17 , 37 , 49 , 3 , 'Leg of lamb'),
> ( 19 , 37 , 18 , 3 , 'california navels'),
> ( 2 , 1044 , 41 , 3 , '12" Dinner Plates'),
> ( 7 , 1044 , 32 , 3 , 'Vermot Extra Sharp Cheddar'),
> ( 8 , 1044 , 71 , 3 , 'Mr Peanut Peanut Butter'),
> ( 10 , 1044 , 39 , 3 , 'Pivotal Soft and Smooth 24 pack'),
> ( 16 , 37 , 22 , 3 , 'triple wahsed spinach'),
> ( 23 , 37 , 61 , 3 , 'Brooklyn Bagel 6 pack');
> {code}
> (1)
> XXX
> This one is not an error. It just runs for a long time because a very large
> number of rules is generated when not capped by the `max_itemset_size` param.
> See later comment 9/17/18.
> XXX
> Run assoc rules:
> {code}
> SELECT * FROM madlib.assoc_rules( .25,
> .5,
> 'orderid',
> 'productid',
> 'order_items',
> NULL,
> TRUE
> );
> {code}
> does not return.
> (2)
> Run assoc rules with an output table specified:
> {code}
> SELECT * FROM madlib.assoc_rules(.10, -- Support
> .10, -- Confidence
> 'orderid', -- Transaction id col
> 'productname', -- Product col
> 'order_items', -- Input data
> 'pivotalmarkets', -- Output data
> TRUE); -- Verbose
> {code}
> results in error:
> {code}
> InternalError: (psycopg2.InternalError) plpy.Error: the output schema does not exist
> CONTEXT: Traceback (most recent call last):
> PL/Python function "assoc_rules", line 31, in <module>
> 'NULL'
> PL/Python function "assoc_rules", line 107, in assoc_rules
> PL/Python function "assoc_rules", line 21, in __assert
> PL/Python function "assoc_rules"
> [SQL: "SELECT * FROM madlib.assoc_rules(.10, -- Support\n
> .10, -- Confidence\n
> 'orderid', -- Transaction id col\n
> 'productname', -- Product col\n
> 'order_items', -- Input data\n
> 'pivotalmarkets', -- Output data\n
> TRUE); -- Verbose"]
> {code}