That's interesting. It seems like you've indexed a matrix into a field. If that's the case, I think you'll need to access the arrays by index, as described here:
https://solr.apache.org/guide/8_8/vector-math.html#getting-values-by-index

Then you can create a matrix from the arrays. I guess we need to add a way
to materialize the matrix directly from a multidimensional array.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Apr 27, 2021 at 6:00 PM FAVORY , XAVIER <[email protected]> wrote:

> Hello everyone,
>
> I am currently trying to create a system for performing distance
> computation of different documents based on some pre-computed numerical
> feature vector.
>
> I set up Solr (cloud) 8.7 and I am using streaming expressions. I have
> documents as such, with the feature field being pfloat with multiValued set
> to true:
>
> {
>   "id":"1",
>   "feature":[
>     0.1,
>     0.5,
>     0.6,
>     1.7]
> },
> {
>   "id":"2",
>   "feature":[
>     0.5,
>     0.1,
>     0.7,
>     0.9]
> },
> {
>   "id":"3",
>   "feature":[
>     -0.5,
>     0.9,
>     1.5,
>     0.2]
> },
>
> I want to create a matrix so I can then use the distance() function to
> compute the distances for the columns of a matrix. The documentation
> provides an example of what I am interested in, by defining the vectors on
> the fly:
>
> let(a=array(20, 30, 40),
>     b=array(21, 29, 41),
>     c=array(31, 40, 50),
>     d=matrix(a, b, c),
>     c=distance(d))
>
> By transposing the matrix I can easily perform the distance between the
> rows, so I can get what I want.
>
> However, now I want to extract the numerical features from a feature field
> indexed in Solr.
> The documentation explains how to create a matrix from
> numerical values stored in some fields:
>
> let(
>     a=random(collection1, q="market:A", rows="5000", fl="price_f"),
>     b=random(collection1, q="market:B", rows="5000", fl="price_f"),
>     c=random(collection1, q="market:C", rows="5000", fl="price_f"),
>     d=random(collection1, q="market:D", rows="5000", fl="price_f"),
>     e=col(a, price_f),
>     f=col(b, price_f),
>     g=col(c, price_f),
>     h=col(d, price_f),
>     i=matrix(e, f, g, h),
>     j=sumRows(i))
>
> However, in my case, I already have an array of float values for each
> document. So I try to do it that way:
>
> let(
>     s1=search(test,q="id:1",fl="feature"), f1=col(s1, feature),
>     s2=search(test,q="id:2",fl="feature"), f2=col(s2, feature),
>     s3=search(test,q="id:3",fl="feature"), f3=col(s3, feature),
>     m=matrix(f1,f2,f3)
> )
>
> But I get this error:
>
> {
>   "result-set": {
>     "docs": [
>       {
>         "EXCEPTION": "Failed to evaluate expression matrix(f1,f2,f3) -
> Numeric value expected but found type java.util.ArrayList for value
> [0.1,0.5,0.6,1.7]",
>         "EOF": true,
>         "RESPONSE_TIME": 5
>       }
>     ]
>   }
> }
>
> When I inspect what I get as f3, I see that I have an array of arrays, which
> is why I think it is failing here to create the matrix. I've been searching
> a lot on how to create a matrix from float vectors stored in a field of my
> documents, and I still cannot find any solution. What I could do is extract
> the vectors, create them on the fly, and construct the vectors and matrix,
> but I would like to be able to do it in one request. Moreover, I find it
> really curious that I cannot directly create the matrix on the results of
> a normal search.
> For instance, I would prefer to do something like that:
>
> let(s=search(test,q="*",fl="feature,id"), m=col(s,feature))
>
> which returns:
>
> {
>   "result-set": {
>     "docs": [
>       {
>         "m": [
>           [
>             0.1,
>             0.5,
>             0.6,
>             1.7
>           ],
>           [
>             0.5,
>             0.1,
>             0.7,
>             0.9
>           ],
>           [
>             -0.5,
>             0.9,
>             1.5,
>             0.2]
>         ]
>       },
>       {
>         "EOF": true,
>         "RESPONSE_TIME": 3
>       }
>     ]
>   }
> }
>
> and be able to use the matrix I obtain here. But again, I was not able to
> perform matrix operations on "m".
>
> Does anyone know any elegant way to create a matrix from my numerical
> vectors stored in my feature field?
>
>
> Thank you.
> --
> Xavier Favory
> Music Technology Group
> Universitat Pompeu Fabra
>
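
For the three-document example quoted above, the index-based approach might
look roughly like the sketch below. It rests on two assumptions that have not
been verified against Solr 8.7: that col(s, feature) yields a list of lists
(as the quoted output suggests), and that valueAt, the function covered in the
linked guide section, returns the inner list at the given position when
applied to such a list. The sort parameter, the q="*:*" query, and the
variable names are illustrative choices, not taken from the original message.

```
let(s=search(test, q="*:*", fl="id,feature", sort="id asc"),
    f=col(s, feature),
    a=valueAt(f, 0),
    b=valueAt(f, 1),
    c=valueAt(f, 2),
    m=matrix(a, b, c),
    d=distance(transpose(m)))
```

Since distance() compares the columns of a matrix, the transpose is there so
that each document becomes a column. The obvious limitation is that every row
has to be pulled out with a hard-coded index, so this only suits a small,
known number of documents.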
