How about mapping each string to a number? Maybe you could do that with a
custom Transformer.
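
A minimal sketch of what I mean, in plain Scala rather than as an actual
FlinkML Transformer. The StringIndexer object, its fit/transform methods, and
the -1.0 code for unseen strings are all made up for illustration:

```scala
// Hypothetical helper, not part of FlinkML: maps each distinct string to a Double index.
object StringIndexer {
  // Build a vocabulary: each distinct token gets the position of its first appearance.
  def fit(tokens: Seq[String]): Map[String, Double] =
    tokens.distinct.zipWithIndex.map { case (t, i) => t -> i.toDouble }.toMap

  // Encode tokens with the vocabulary; unseen tokens map to -1.0
  // (an arbitrary choice made for this sketch).
  def transform(vocab: Map[String, Double], tokens: Seq[String]): Array[Double] =
    tokens.map(t => vocab.getOrElse(t, -1.0)).toArray
}
```

Fitting on Seq("the", "dog", "chased", "the", "cat") and then transforming
Seq("the", "cat", "ran") gives Array(0.0, 3.0, -1.0). The resulting
Array[Double] could then be wrapped in a DenseVector. Note that the indices
carry no numeric meaning, so algebraic operations on them are not meaningful.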

> On Jan 19, 2016, at 12:02 AM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
> 
> Ok. In this case I will use an Array instead.
> 
> On 18.01.2016 at 14:56, Theodore Vasiloudis wrote:
>> I agree with Till, the data types are different here so you need a custom
>> string vector.
>> 
>> The Vector abstraction in FlinkML is designed with numerical vectors in
>> mind.
>> 
>> On Mon, Jan 18, 2016 at 2:33 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>> 
>>> Hi Hilmi,
>>> 
>>> I think in your case it makes sense to define a custom vector of strings.
>>> The easiest implementation could be an Array[String] or List[String].
>>> 
>>> The reason it does not make much sense to make Vector and DenseVector
>>> generic is that these types are algebraic data types. How would you define
>>> algebraic operations such as scalar product, outer product, multiplication,
>>> etc. on a vector of strings? Then you would have to provide different
>>> implementations for the different type parameters.
>>> 
>>> Cheers,
>>> Till
>>>
>>> 
>>> On Mon, Jan 18, 2016 at 1:40 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de>
>>> wrote:
>>> 
>>>> Hi,
>>>> as I explained in a previous e-mail, I need a LabeledVector where the
>>>> label is also a vector. After we discussed this issue, I created a new
>>>> class named LabeledSequenceVector with the labels as a Vector. In my use
>>>> case, I want to train a POS tagger, so the "vector" is a vector of
>>>> strings and the "labels" are also a vector of strings. If I use the Flink
>>>> Vector/DenseVector implementation, the vector can only hold Double
>>>> values, but I need String values.
>>>> 
>>>> Best Regards,
>>>> Hilmi
>>>> 
>>>> 
>>>> On 18.01.2016 at 13:33, Chiwan Park wrote:
>>>> 
>>>>> Hi Hilmi,
>>>>> 
>>>>> In NLP, which types are used for vector values? I think we can cover
>>>>> the typical cases using double values.
>>>>> 
>>>>> On Jan 18, 2016, at 9:19 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> the Vector and DenseVector implementations of Flink ML only allow Double
>>>>>> values. But there are cases where the values are not Doubles, e.g. in NLP.
>>>>>> Does it make sense to make the implementations generic, i.e. Vector[T] and
>>>>>> DenseVector[T]?
>>>>>> 
>>>>>> Best Regards,
>>>>>> Hilmi
>>>>>> 
>>>>>> --
>>>>>> ==================================================================
>>>>>> Hilmi Yildirim, M.Sc.
>>>>>> Researcher
>>>>>> 
>>>>>> DFKI GmbH
>>>>>> Intelligente Analytik für Massendaten
>>>>>> DFKI Projektbüro Berlin
>>>>>> Alt-Moabit 91c
>>>>>> D-10559 Berlin
>>>>>> Phone: +49 30 23895 1814
>>>>>> 
>>>>>> E-Mail: hilmi.yildi...@dfki.de
>>>>>> 
>>>>>> -------------------------------------------------------------
>>>>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>>>>>> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>>>>>> 
>>>>>> Geschaeftsfuehrung:
>>>>>> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>>>>>> Dr. Walter Olthoff
>>>>>> 
>>>>>> Vorsitzender des Aufsichtsrats:
>>>>>> Prof. Dr. h.c. Hans A. Aukes
>>>>>> 
>>>>>> Amtsgericht Kaiserslautern, HRB 2313
>>>>>> -------------------------------------------------------------
>>>>>> 
>>>>> Regards,
>>>>> Chiwan Park
>>>>> 
>>>>> 
> 

Regards,
Chiwan Park
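
P.S. On Till's point in the quoted thread: a rough sketch of why a generic
Vector[T] is awkward, and what the Array[String] route could look like.
GenericVector and LabeledSequence are hypothetical names for illustration, not
FlinkML classes. The scalar product needs addition and multiplication on T,
which the sketch expresses by requiring a Numeric[T] instance; the standard
library provides none for String.

```scala
// Hypothetical generic vector, not the FlinkML API.
final case class GenericVector[T](data: IndexedSeq[T]) {
  // The scalar product only makes sense when T supports + and *,
  // so this method demands an implicit Numeric[T].
  def dot(other: GenericVector[T])(implicit num: Numeric[T]): T = {
    require(data.length == other.data.length, "vectors must have the same size")
    data.zip(other.data)
      .map { case (a, b) => num.times(a, b) }
      .foldLeft(num.zero)(num.plus)
  }
}

// Hypothetical container for the POS-tagging case: tokens and their tags
// are kept as plain string arrays, with one label per token.
final case class LabeledSequence(labels: Array[String], tokens: Array[String]) {
  require(labels.length == tokens.length, "one label per token is expected")
}
```

With Double elements, GenericVector(Vector(1.0, 2.0, 3.0)).dot(GenericVector(Vector(4.0, 5.0, 6.0)))
evaluates to 32.0. The same call on a GenericVector[String] is rejected by the
compiler because no implicit Numeric[String] is in scope, which is why string
sequences fit better in something like LabeledSequence.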

