Nira Amit created SPARK-18945:
---------------------------------

             Summary: java.lang.ClassCastException when Tuple2 field is an array
                 Key: SPARK-18945
                 URL: https://issues.apache.org/jira/browse/SPARK-18945
             Project: Spark
          Issue Type: Bug
          Components: Java API
    Affects Versions: 2.0.2
            Reporter: Nira Amit


The following code results in an error:

{code}
    private static PairFunction<String, String, String> keyData =
            new PairFunction<String, String, String>() {
                public Tuple2<String, String> call(String x) {
                    return new Tuple2(x.split(" ")[0], x.split(" "));
                }
            };

    public void testPairRdd() throws Exception {
        JavaRDD<String> lines = sc.parallelize(Arrays.asList("This is one line",
                "And another line",
                "Why not one more line"));
        JavaPairRDD<String, String> pairs = lines.mapToPair(keyData);
        Tuple2<String, String> firstPair = pairs.first();
        System.out.println("Got object of type: " + firstPair.getClass());
        System.out.println("First element is of type: " + firstPair._1().getClass());
        System.out.println("Second element is of type: " + firstPair._2().getClass());
    }
{code}

The exception is thrown by the last print statement, which calls firstPair._2().getClass(). The output in the console is:

{code}
16/12/20 13:42:12 INFO DAGScheduler: ResultStage 0 (first at RetentionOutputFormatterTest.java:166) finished in 0.148 s
Got object of type: class scala.Tuple2
First element is of type: class java.lang.String

java.lang.ClassCastException: [Ljava.lang.String; cannot be cast to java.lang.String
{code}

If the Tuple2 holds <String, String> instead of <String, String[]>, the code works fine and there is no exception.
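
One possible reading of this behavior, offered as a sketch and not as a confirmed diagnosis: the raw constructor call new Tuple2(...) compiles with only an unchecked warning, so a String[] can end up in a field declared as String, and the cast the compiler inserts when the field is read then fails at runtime. The example below reproduces that pattern without Spark. Pair is a hypothetical stand-in for scala.Tuple2, and the class and method names are assumptions made for illustration only.

```java
public class RawTupleDemo {

    /** Hypothetical stand-in for scala.Tuple2 so the sketch runs without Spark. */
    public static final class Pair<A, B> {
        private final A first;
        private final B second;

        public Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        public A _1() { return first; }
        public B _2() { return second; }
    }

    /** Mirrors the reported code: the raw constructor hides the String[] from the compiler. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static Pair<String, String> rawPair(String line) {
        return new Pair(line.split(" ")[0], line.split(" "));
    }

    /** Declares the array type explicitly; the diamond operator keeps the call type-checked. */
    public static Pair<String, String[]> typedPair(String line) {
        String[] words = line.split(" ");
        return new Pair<>(words[0], words);
    }

    /** Returns true if reading the second field of the raw pair as a String fails. */
    public static boolean rawPairFails(String line) {
        try {
            // The compiler inserts a cast to String here because of the declared type.
            String second = rawPair(line)._2();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String line = "This is one line";
        System.out.println("Raw pair fails on read: " + rawPairFails(line));
        System.out.println("Typed pair second field: "
                + typedPair(line)._2().getClass().getSimpleName());
    }
}
```

If this reading applies to the reported code, declaring the function as PairFunction<String, String, String[]> and constructing the result with new Tuple2<>(...) would turn the mismatch into a compile-time error instead of a runtime exception.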

This is the relevant part of my pom file:

{code:xml}
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <jdk.version>1.8</jdk.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.0.1</version>
    </dependency>
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
      <version>1.9.21</version>
      <exclusions>
        <exclusion>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-annotations</artifactId>
        </exclusion>
        <exclusion>
          <groupId>net.java.dev.jets3t</groupId>
          <artifactId>jets3t</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
{code}




