[ 
https://issues.apache.org/jira/browse/SPARK-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764163#comment-15764163
 ] 

Nira Amit edited comment on SPARK-18945 at 12/20/16 1:03 PM:
-------------------------------------------------------------

There's a bug in my code. It should be:
{code}
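    private static PairFunction<String, String, String[]> keyData =
            new PairFunction<String, String, String[]>() {
                @Override
                public Tuple2<String, String[]> call(String x) {
                    // Split once: the first word is the key, the whole array is the value.
                    String[] words = x.split(" ");
                    // The diamond operator (not a raw Tuple2) lets the compiler check the types.
                    return new Tuple2<>(words[0], words);
                }
            };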

    public void testPairRdd() throws Exception {
        JavaRDD<String> lines = sc.parallelize(Arrays.asList("This is one line",
                "And another line",
                "Why not one more line"));
        JavaPairRDD<String, String[]> pairs = lines.mapToPair(keyData);
        Tuple2<String, String[]> firstPair = pairs.first();
        System.out.println("Got object of type: " + firstPair.getClass());
        System.out.println("First element is of type: " + firstPair._1().getClass());
        System.out.println("Second element is of type: " + firstPair._2().getClass());
    }
{code}

(String[] in the function's declaration, not String)
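
For context, the original version compiled despite the mismatch because new Tuple2(...) is a raw type, so the compiler does not check the String[] argument against the declared String parameter. The cast only fails later, when the value is read as a String. Below is a minimal, Spark-free sketch of that mechanism. The Pair class is a hypothetical stand-in for scala.Tuple2, and RawTypeDemo, keyDataBuggy, keyDataFixed and secondAsString are illustrative names, not part of any Spark API.

```java
class RawTypeDemo {
    // Hypothetical stand-in for scala.Tuple2, so the sketch runs without Spark.
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    // Mirrors the original report: the raw constructor call is unchecked,
    // so a String[] ends up in a slot declared as String.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static Pair<String, String> keyDataBuggy(String x) {
        return new Pair(x.split(" ")[0], x.split(" "));
    }

    // Mirrors the correction: the declared type matches the value, and the
    // diamond operator lets the compiler verify it.
    static Pair<String, String[]> keyDataFixed(String x) {
        String[] words = x.split(" ");
        return new Pair<>(words[0], words);
    }

    // Returning the field as a String makes the compiler insert a cast,
    // which is where the ClassCastException surfaces at run time.
    static String secondAsString(Pair<String, String> pair) {
        return pair.second;
    }

    public static void main(String[] args) {
        Pair<String, String> bad = keyDataBuggy("This is one line");
        try {
            secondAsString(bad);
            System.out.println("No exception");
        } catch (ClassCastException e) {
            System.out.println("Caught: " + e);
        }

        Pair<String, String[]> good = keyDataFixed("This is one line");
        System.out.println("Second element is of type: " + good.second.getClass());
    }
}
```

With the declaration corrected to String[], the same read succeeds because no cast to String is involved.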


was (Author: amitnira):
There's a bug in my code

> java.lang.ClassCastException when Tuple2 field is an array
> ----------------------------------------------------------
>
>                 Key: SPARK-18945
>                 URL: https://issues.apache.org/jira/browse/SPARK-18945
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 2.0.2
>            Reporter: Nira Amit
>
> The following code results in an error:
> {code}
>     private static PairFunction<String, String, String> keyData =
>             new PairFunction<String, String, String>() {
>                 public Tuple2<String, String> call(String x) {
>                     return new Tuple2(x.split(" ")[0], x.split(" "));
>                 }
>             };
>     public void testPairRdd() throws Exception {
>         JavaRDD<String> lines = sc.parallelize(Arrays.asList("This is one line",
>                 "And another line",
>                 "Why not one more line"));
>         JavaPairRDD<String, String> pairs = lines.mapToPair(keyData);
>         Tuple2<String, String> firstPair = pairs.first();
>         System.out.println("Got object of type: " + firstPair.getClass());
>         System.out.println("First element is of type: " + firstPair._1().getClass());
>         System.out.println("Second element is of type: " + firstPair._2().getClass());
>     }
> {code}
> The problematic expression is the last print. The output in the console is:
> {code}
> 16/12/20 13:42:12 INFO DAGScheduler: ResultStage 0 (first at RetentionOutputFormatterTest.java:166) finished in 0.148 s
> Got object of type: class scala.Tuple2
> First element is of type: class java.lang.String
> java.lang.ClassCastException: [Ljava.lang.String; cannot be cast to java.lang.String
> {code}
> If the Tuple2 is a <String, String> instead of a <String, String[]>, the
> code works fine and there is no exception.
> This is the relevant part of my pom file:
> {code:xml}
>   <properties>
>     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>     <jdk.version>1.8</jdk.version>
>   </properties>
>   <dependencies>
>     <dependency>
>       <groupId>log4j</groupId>
>       <artifactId>log4j</artifactId>
>       <version>1.2.17</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-core_2.11</artifactId>
>       <version>2.0.1</version>
>     </dependency>
>     <dependency>
>       <groupId>com.amazonaws</groupId>
>       <artifactId>aws-java-sdk</artifactId>
>       <version>1.9.21</version>
>       <exclusions>
>         <exclusion>
>           <groupId>com.fasterxml.jackson.core</groupId>
>           <artifactId>jackson-annotations</artifactId>
>         </exclusion>
>         <exclusion>
>           <groupId>net.java.dev.jets3t</groupId>
>           <artifactId>jets3t</artifactId>
>         </exclusion>
>       </exclusions>
>     </dependency>
>     <dependency>
>       <groupId>junit</groupId>
>       <artifactId>junit</artifactId>
>       <version>3.8.1</version>
>       <scope>test</scope>
>     </dependency>
>   </dependencies>
>   <build>
>     <plugins>
>       <plugin>
>         <groupId>org.apache.maven.plugins</groupId>
>         <artifactId>maven-compiler-plugin</artifactId>
>         <version>2.3.2</version>
>         <configuration>
>           <source>1.8</source>
>           <target>1.8</target>
>         </configuration>
>       </plugin>
>     </plugins>
>   </build>
> {code}



