Thanks for the update, Esther.

Frank

On Wed, Apr 6, 2016 at 3:53 PM, Esther Vasiete <evasi...@pivotal.io> wrote:

> Upgrading to MADlib 1.8 solved the problem!
>
> Thanks,
> Esther
>
> On Tue, Apr 5, 2016 at 10:27 AM, Esther Vasiete <evasi...@pivotal.io>
> wrote:
>
>> Oh sorry, it is HAWQ 1.3.1.
>>
>> And the data engineer will upgrade to MADlib 1.8 tonight.
>>
>> Thanks,
>> Esther
>>
>> On Tue, Apr 5, 2016 at 9:26 AM, Frank McQuillan <fmcquil...@pivotal.io>
>> wrote:
>>
>>> Please clarify the platform - do you mean GPDB 4.2.0?
>>>
>>> Would you be able to upgrade to MADlib 1.8?  Then you would be using the
>>> latest software and we can see if you still have a problem.
>>>
>>> Frank
>>>
>>> On Tue, Apr 5, 2016 at 9:20 AM, Esther Vasiete <evasi...@pivotal.io>
>>> wrote:
>>>
>>>> I am using MADlib 1.7.1 on HAWQ 4.2.0.
>>>>
>>>> Thanks.
>>>>
>>>> On Mon, Apr 4, 2016 at 8:04 PM, Frank McQuillan <fmcquil...@pivotal.io>
>>>> wrote:
>>>>
>>>>> Thanks for the question, Esther.  What version of MADlib are you using
>>>>> and what database platform and version are you running on?
>>>>>
>>>>> It seems to be a MADlib version lower than 1.8 since the error message
>>>>> you report is different in the 1.8 release.  (There was a bug fix in 1.8
>>>>> to allow user-specified column names in PCA.)
>>>>>
>>>>> Frank
>>>>>
>>>>> On Mon, Apr 4, 2016 at 4:27 PM, Esther Vasiete <evasi...@pivotal.io>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to use pca_train but I am running into this error:
>>>>>>
>>>>>> ERROR: plpy.SPIError: plpy.SPIError: plpy.SPIError: plpy.SPIError:
>>>>>> Function "madlib.__matrix_densify_sfunc(double
>>>>>> precision[],integer,integer,double precision)": invalid argument - col
>>>>>> should be in the range of [0, col_dim)  (seg35 awsaiuirl1178:40003
>>>>>> pid=104068) (plpython.c:4648)
>>>>>> SQL state: XX000
>>>>>> Context: Traceback (most recent call last):
>>>>>>   PL/Python function "pca_train", line 23, in <module>
>>>>>>     return pca.pca(**globals())
>>>>>>   PL/Python function "pca_train", line 404, in pca
>>>>>> PL/Python function "pca_train"
>>>>>>
>>>>>> My input table has 15472 rows and two columns; a row_id and an array
>>>>>> with 853 features. I am calling pca_train like this:
>>>>>>
>>>>>> DROP TABLE if exists ev.hci_subset_pca_output;
>>>>>> SELECT madlib.pca_train( 'ev.hci_subset_pca_input',
>>>>>>                                            'ev.hci_subset_pca_output',
>>>>>>                                            'row_id',
>>>>>>                                             3);
>>>>>>
>>>>>> I unfortunately cannot share the data, but this is how it looks in
>>>>>> pgAdmin3. Note that pgAdmin3 won't show a feature_vector that is too
>>>>>> large, which is why it appears to be empty, but it isn't, as you can
>>>>>> see in the second screenshot.
>>>>>>
>>>>>> [image: Inline image 1]
>>>>>>
>>>>>> [image: Inline image 3]
>>>>>>
>>>>>> I am not sure why I am running into this error. Please advise.
>>>>>>
>>>>>> Update: I have renamed feature_vector to "row_vec", and "row_id"
>>>>>> starts at 1. I am still getting the same error.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> --
>>>>>> *Esther Vasiete *
>>>>>> *Data Scientist | Pivotal*
>>>>>> evasi...@pivotal.io
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Esther Vasiete *
>>>> *Data Scientist | Pivotal*
>>>> evasi...@pivotal.io
>>>>
>>>
>>>
>>
>>
>> --
>> *Esther Vasiete *
>> *Data Scientist | Pivotal*
>> evasi...@pivotal.io
>>
>
>
>
> --
> *Esther Vasiete *
> *Data Scientist | Pivotal*
> evasi...@pivotal.io
>
