Please clarify the platform: do you mean GPDB 4.2.0? Would you be able to upgrade to MADlib 1.8? You would then be on the latest release, and we can see whether the problem persists.
Frank

On Tue, Apr 5, 2016 at 9:20 AM, Esther Vasiete <evasi...@pivotal.io> wrote:

> I am using MADlib 1.7.1 on HAWQ 4.2.0.
>
> Thanks.
>
> On Mon, Apr 4, 2016 at 8:04 PM, Frank McQuillan <fmcquil...@pivotal.io>
> wrote:
>
>> Thanks for the question, Esther. What version of MADlib are you using
>> and what database platform and version are you running on?
>>
>> It seems to be a MADlib version lower than 1.8 since the error message
>> you report is different in the 1.8 release. (There was a bug fix in 1.8 to
>> allow user-specified column names in PCA.)
>>
>> Frank
>>
>> On Mon, Apr 4, 2016 at 4:27 PM, Esther Vasiete <evasi...@pivotal.io>
>> wrote:
>>
>>> Hi,
>>>
>>> I am trying to use pca_train but I am running into this error:
>>>
>>> ERROR: plpy.SPIError: plpy.SPIError: plpy.SPIError: plpy.SPIError:
>>> Function "madlib.__matrix_densify_sfunc(double
>>> precision[],integer,integer,double precision)": invalid argument - col
>>> should be in the range of [0, col_dim) (seg35 awsaiuirl1178:40003
>>> pid=104068) (plpython.c:4648)
>>> SQL state: XX000
>>> Context: Traceback (most recent call last):
>>>   PL/Python function "pca_train", line 23, in <module>
>>>     return pca.pca(**globals())
>>>   PL/Python function "pca_train", line 404, in pca
>>> PL/Python function "pca_train"
>>>
>>> My input table has 15472 rows and two columns: a row_id and an array
>>> with 853 features. I am calling pca_train like this:
>>>
>>> DROP TABLE if exists ev.hci_subset_pca_output;
>>> SELECT madlib.pca_train( 'ev.hci_subset_pca_input',
>>>                          'ev.hci_subset_pca_output',
>>>                          'row_id',
>>>                          3);
>>>
>>> I unfortunately cannot share the data but this is how it looks in
>>> pgAdmin3. Note that pgAdmin3 won't show a feature_vector that is too
>>> large, and this is why it appears to be empty, but it isn't, as you can
>>> see in the second screenshot.
>>>
>>> [image: Inline image 1]
>>>
>>> [image: Inline image 3]
>>>
>>> I am not sure why I am running into this error. Please advise.
>>>
>>> Update: I have renamed feature_vector to "row_vec" and "row_id" starts
>>> with 1. Still getting the same error.
>>>
>>> Thanks,
>>>
>>> --
>>> *Esther Vasiete*
>>> *Data Scientist | Pivotal*
>>> evasi...@pivotal.io
>
> --
> *Esther Vasiete*
> *Data Scientist | Pivotal*
> evasi...@pivotal.io