Github user keypointt commented on the issue:

    https://github.com/apache/spark/pull/17451
  
    ```
    >>> from pyspark.ml.feature import Word2Vec
    >>> sent = ("a b " * 100 + "a c " * 10).split(" ")
    >>> doc = spark.createDataFrame([(sent,), (sent,)], ["sentence"])
    >>> word2Vec = Word2Vec(vectorSize=5, seed=42, inputCol="sentence", outputCol="model")
    >>> model = word2Vec.fit(doc)
    ```
    Above is the setup. I then created the `vec` below, and it works fine with `model.findSynonyms`:
    ```
    >>> from pyspark.ml.linalg import Vectors
    >>> vec = Vectors.dense([0.267, -0.2691, 0.058, -0.0801, 0.1821, 0.4162, 0.0259, -0.2163, 0.1787, 0.0764])
    
    >>> model.findSynonyms(vec, 2)
    DataFrame[word: string, similarity: double]
    ```
    But `vec` is rejected by `model.findSynonymsArray`, even though its type is `<class 'pyspark.ml.linalg.DenseVector'>`:
    ```
    >>> model.findSynonymsArray(vec, 2)
    word:
    [0.267,-0.2691,0.058,-0.0801,0.1821,0.4162,0.0259,-0.2163,0.1787,0.0764]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/renxin/Documents/workspace/spark/python/pyspark/ml/feature.py", line 2951, in findSynonymsArray
        tuples = self._java_obj.findSynonymsArray(word, num)
      File "/Users/renxin/Documents/workspace/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1160, in __call__
      File "/Users/renxin/Documents/workspace/spark/python/pyspark/sql/utils.py", line 63, in deco
        return f(*a, **kw)
      File "/Users/renxin/Documents/workspace/spark/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py", line 324, in get_return_value
    py4j.protocol.Py4JError: An error occurred while calling o65.findSynonymsArray. Trace:
    py4j.Py4JException: Method findSynonymsArray([class java.util.ArrayList, class java.lang.Integer]) does not exist
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
        at py4j.Gateway.invoke(Gateway.java:274)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)
    
    
    >>> type(vec)
    <class 'pyspark.ml.linalg.DenseVector'>
    ```
    
    Here `vec` is passed to the JVM as a `java.util.ArrayList`.
    Does `self._java_obj.findSynonymsArray(word, num)` handle a Vector argument differently from `self._call_java("findSynonyms", word, num)`?
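
My guess at the cause, shown as a toy sketch rather than the real py4j or PySpark code: I believe `_call_java` converts each argument to its JVM type before invoking the method, while a direct call on `self._java_obj` relies only on the gateway's automatic collection conversion, which would turn any iterable (including a `DenseVector`) into a `java.util.ArrayList`. Every name below (`ToyVector`, `ToyJavaModel`, `auto_convert`, `direct_call`, `wrapped_call`) is hypothetical and only models that assumption.

```python
class ToyVector:
    """Hypothetical stand-in for DenseVector: an iterable of floats."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __iter__(self):
        return iter(self.values)


class ToyJavaModel:
    """Hypothetical stand-in for the JVM model.

    It only accepts (str, int) and (ToyVector, int), mirroring the
    assumed (String, int) and (Vector, int) overloads.
    """

    def findSynonymsArray(self, word, num):
        if not isinstance(word, (str, ToyVector)):
            raise TypeError(
                "Method findSynonymsArray([%s, int]) does not exist"
                % type(word).__name__
            )
        return ["b", "c"][:num]  # placeholder result, not real model output


def auto_convert(arg):
    """Mimic gateway auto-conversion: a non-string iterable becomes a list."""
    if not isinstance(arg, (str, bytes)) and hasattr(arg, "__iter__"):
        return list(arg)
    return arg


def direct_call(model, word, num):
    """Like calling the method on _java_obj: only auto-conversion applies."""
    return model.findSynonymsArray(auto_convert(word), num)


def wrapped_call(model, word, num):
    """Like _call_java: the argument arrives as a vector, not a list."""
    return model.findSynonymsArray(word, num)
```

In this toy model, `wrapped_call(ToyJavaModel(), ToyVector([0.267, -0.2691]), 2)` succeeds, while `direct_call` with the same arguments raises the "does not exist" error because the vector has already been flattened to a list. If the real code behaves the same way, sending `findSynonymsArray` through the same argument conversion that `findSynonyms` gets should avoid the error.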
    
    thank you Holden 😄 

