Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9358#discussion_r43841085
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -441,6 +537,17 @@ class Dataset[T] private(
       /** Collects the elements to an Array. */
       def collect(): Array[T] = rdd.collect()
     
    +  /**
    +   * (Java-specific)
    +   * Collects the elements to a Java list.
    +   *
    +   * Due to the incompatibility problem between Scala and Java, the return type of [[collect()]] at
    --- End diff --
    
    RDD holds a class tag of the element type that it uses to construct the
    correct type of array when you do a collect.
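
A minimal sketch of that idea, assuming a hypothetical `Holder` class as a stand-in for an RDD-like container (this is not Spark's actual implementation). It only illustrates how a `ClassTag` captured at construction time lets `collect()` allocate an array with the right runtime component type:

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for an RDD-like container; not Spark's actual code.
class Holder[T](items: Seq[T])(implicit tag: ClassTag[T]) {
  // tag.newArray allocates an array whose runtime component type matches T,
  // so the result is a String[] or int[] rather than a generic Object[].
  def collect(): Array[T] = {
    val out = tag.newArray(items.length)
    items.copyToArray(out)
    out
  }
}

object ClassTagDemo {
  def main(args: Array[String]): Unit = {
    val strings = new Holder(Seq("a", "b")).collect()
    val ints = new Holder(Seq(1, 2, 3)).collect()
    println(strings.getClass.getComponentType) // class java.lang.String
    println(ints.getClass.getComponentType)    // int
  }
}
```

The tag fixes the runtime array type only; it does not change the static return type that Java callers see, which is the separate question raised in the quoted message.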
    On Nov 4, 2015 4:57 AM, "Wenchen Fan" <notificati...@github.com> wrote:
    
    > In sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
    > <https://github.com/apache/spark/pull/9358#discussion_r43840774>:
    >
    > > @@ -441,6 +537,17 @@ class Dataset[T] private(
    > >    /** Collects the elements to an Array. */
    > >    def collect(): Array[T] = rdd.collect()
    > >
    > > +  /**
    > > +   * (Java-specific)
    > > +   * Collects the elements to a Java list.
    > > +   *
    > > +   * Due to the incompatibility problem between Scala and Java, the return type of [[collect()]] at
    >
    > Will the class tag do the trick? I tried to define a generic class with
    > ClassTag:
    >
    > class MyTest[T : ClassTag] {
    >   def t(): Array[T] = null
    > }
    >
    > object MyTest {
    >   def apply[T](cls: Class[T]): MyTest[T] = {
    >     new MyTest[T]()(ClassTag(cls))
    >   }
    > }
    >
    > The return type of MyTest.t() is still Object on the Java side.
    > I also tried using the Scala RDD from Java; the return type of
    > RDD.collect() is also Object there.
    >
    > One possible solution is to define T <: AnyRef, but I think that would
    > be hard to do for Dataset or RDD.
    >
    >
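
The Java-side types described in the quoted message can be checked with reflection. The sketch below uses two hypothetical classes, `Unbounded` and `Bounded` (the names are illustrative, not from the PR), and prints the erased return type of `t()` for an unbounded `T` and for `T <: AnyRef`:

```scala
import scala.reflect.ClassTag

// Hypothetical classes for illustration; the names are not from the PR.
class Unbounded[T: ClassTag] {
  // T may be a primitive, so Array[T] erases to Object in the bytecode.
  def t(): Array[T] = null
}

class Bounded[T <: AnyRef: ClassTag] {
  // T is always a reference type here, so Array[T] erases to Object[].
  def t(): Array[T] = null
}

object ErasureDemo {
  def main(args: Array[String]): Unit = {
    // Erased return types, as a Java caller would see them.
    println(classOf[Unbounded[_]].getMethod("t").getReturnType)
    // class java.lang.Object
    println(classOf[Bounded[_ <: AnyRef]].getMethod("t").getReturnType)
    // class [Ljava.lang.Object;
  }
}
```

This matches the observation that the context bound alone does not change the signature visible from Java, while the `AnyRef` bound does, at the cost of excluding primitive element types.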


